Lessons Learned From Benchmarking Fast Machine Learning Algorithms

Boosted decision trees are responsible for more than half of the winning solutions in machine learning challenges hosted at Kaggle, and require minimal tuning. We evaluate two popular tree boosting software packages, XGBoost and LightGBM, and draw four important lessons.

Data Version Control in Analytics DevOps Paradigm

DevOps and DVC tools can help reduce the time data scientists spend on mundane data preparation, so they can achieve their dream of focusing on cool machine learning algorithms and interesting data analysis.

Making Predictive Models Robust: Holdout vs Cross-Validation

The validation step helps you find the best parameters for your predictive model and prevent overfitting. We examine the pros and cons of two popular validation strategies: the hold-out strategy and k-fold cross-validation.
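
As a rough illustration of the two strategies named above, here is a minimal sketch in plain Python. It is not taken from the original post; the function names `holdout_split` and `kfold_splits` and their parameters are hypothetical, chosen only for this example. Hold-out sets aside one fixed test subset, while k-fold rotates the validation role so that every sample is validated exactly once.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) for one shuffled hold-out split.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs, one per fold.

    Hypothetical helper for illustration only. Each sample lands in
    exactly one validation fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size
```

With hold-out, a model is fit once on the training indices and scored once on the test indices. With k-fold, it is fit k times and the k validation scores are typically averaged, which costs more compute but makes the estimate depend less on a single split.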

How Convolutional Neural Networks Accomplish Image Recognition?

Image recognition is a very interesting and challenging field of study. Here we explain the concepts, applications, and techniques of image recognition using Convolutional Neural Networks.

Going deeper with recurrent networks: Sequence to Bag of Words Model

Deep learning makes it possible to convert unstructured text to computable formats, incorporating semantic knowledge to train machine learning models. These digital data troves help us understand people on a new level.

Why Apache Arrow is the future for open source-columnar memory analytics

Apache Arrow is a de facto standard for columnar in-memory analytics. In the coming years, we can expect all the big data platforms to adopt Apache Arrow as their columnar in-memory layer.