In this tutorial, I use raw bash commands and regex to process a raw, messy JSON file and a raw HTML page. The tutorial helps us understand the text processing mechanism under the hood.
Original Post: Text Mining on the Command Line
In this post, we walk through investigating, retrieving, and cleaning a real-world data set. We will also describe the costs, benefits, and necessary tools involved in building your own data sets.
Original Post: Data Retrieval and Cleaning: Tracking Migratory Patterns
A portfolio of real-world projects is the best way to break into data science. This article highlights the 5 types of projects that will help land you a job and improve your career.
Original Post: 5 Data Science Projects That Will Get You Hired in 2018
Stagraph is a new, simple visual interface for R that focuses on data import, data wrangling, and data visualization.
Original Post: Stagraph – a general purpose R GUI, for data import, wrangling, and visualization
Check out this collection of NLP resources for beginners, starting from zero and slowly progressing to the point that readers should have an idea of where to go next.
Original Post: Natural Language Processing Nuggets: Getting Started with NLP
This article introduces ioModel, an open source research platform that ingests data and automatically generates descriptive statistics on it.
Original Post: ioModel Machine Learning Research Platform – Open Source
Check out our lineup of upcoming virtual seminars, online learning courses, and customized training in your office. Space is limited, so reserve your seat early and score the best savings!
Original Post: Virtual Training Events Without Leaving Your Desk
The main challenge for a data science team is to decide who will be responsible for labeling, how much time it will take, and which tools are best to use.
Original Post: How to Organize Data Labeling for Machine Learning: Approaches and Tools
This article is a comprehensive review of Data Augmentation techniques for Deep Learning, specific to images.
Original Post: Data Augmentation: How to use Deep Learning when you have Limited Data
Machine Learning Yearning is a book by AI and Deep Learning guru Andrew Ng, focusing on how to make machine learning algorithms work and how to structure machine learning projects. Here we present 7 very useful suggestions from the book.
Original Post: 7 Useful Suggestions from Andrew Ng “Machine Learning Yearning”