The Tidy Time Series Platform: tibbletime 0.1.0

We’re happy to announce the third release of the tibbletime package. This is a huge update, mainly due to a complete rewrite of the package. It contains a ton of new functionality and a number of breaking changes that existing users need to be aware of. All of the changes have been well documented in the NEWS file, but it’s worthwhile to touch on a few of them here and discuss the future of the package. We’re super excited, so let’s check out the vision for tibbletime and its new functionality! About Tibbletime For those new to the package, tibbletime is a new package that enables the creation of time-aware tibbles. Its sole purpose is to make working with time series in the tidyverse much easier! The documentation really explains everything, and here are a few important vignettes that…
Original Post: The Tidy Time Series Platform: tibbletime 0.1.0

Six Reasons To Learn R For Business

Data science for business (DS4B) is the future of business analytics, yet it is really difficult to figure out where to start. The last thing you want to do is waste time with the wrong tool. Making effective use of your time involves two pieces: (1) selecting the right tool for the job, and (2) efficiently learning how to use the tool to return business value. This article focuses on the first part, explaining why R is the right choice in six points. Our next article will focus on the second part, learning R in 12 weeks. Reason 1: R Has The Best Overall Qualities There are a number of tools available for business analysis/business intelligence (with DS4B being a subset of this area). Each tool has its pros and cons, many of which are important in the business context. We…
Original Post: Six Reasons To Learn R For Business

Customer Analytics: Using Deep Learning With Keras To Predict Customer Churn

Customer churn is a problem that all companies need to monitor, especially those that depend on subscription-based revenue streams. The simple fact is that most organizations have data that can be used to target customers at risk of churning and to understand the key drivers of churn, and we now have Keras for Deep Learning available in R (Yes, in R!!), which predicted customer churn with 82% accuracy. We’re super excited for this article because we are using the new keras package to produce an Artificial Neural Network (ANN) model on the IBM Watson Telco Customer Churn Data Set! As with most business problems, it’s equally important to explain what features drive the model, which is why we’ll use the lime package for explainability. We cross-checked the LIME results with a Correlation Analysis using the corrr package. We’re not done yet. In addition,…
Original Post: Customer Analytics: Using Deep Learning With Keras To Predict Customer Churn

EARL Presentation on HR Analytics: Using ML to Predict Employee Turnover

The EARL Boston 2017 conference was held November 1 – 3 in Boston, Mass. There were some excellent presentations illustrating how R is being embraced in enterprises, especially in the financial and pharmaceutical industries. Matt Dancho, founder of Business Science, presented on using machine learning to predict and explain employee turnover, a hot topic in HR! We’ve uploaded the HR Analytics presentation to YouTube. Check out the presentation, and don’t forget to follow us on social media to stay up to date on the latest Business Science news, events and information! If you’re interested in HR Analytics and R applications in business, check out our 30-minute presentation from EARL Boston 2017! We talk about: HR Analytics: Using Machine Learning for Employee Turnover Prediction and Explanation; using H2O for automated machine learning; using LIME for feature importance of black-box (non-linear) models such…
Original Post: EARL Presentation on HR Analytics: Using ML to Predict Employee Turnover

Demo Week: Time Series Machine Learning with h2o and timetk

We’re on the final day of Business Science Demo Week. Today we are demoing the h2o package for machine learning on time series data. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today you’ll see how we can use timetk + h2o to get really accurate time series forecasts. Here we go! The h2o package is a product offered by H2O.ai that contains a number of cutting-edge machine learning algorithms, performance metrics, and auxiliary functions to make machine learning both powerful and easy. One of the main benefits of H2O is that it can…
Original Post: Demo Week: Time Series Machine Learning with h2o and timetk

Demo Week: Tidy Time Series Analysis with tibbletime

We’re into the fourth day of Business Science Demo Week. We have a really cool one in store today: tibbletime, which uses a new tbl_time class that is time-aware!! For those who may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Let’s take tibbletime for a spin! The future of “tidy” time series analysis: a new class, tbl_time, rests on top of tbl and makes tibbles time-aware. Time series functions: you can use a series of “tidy” time series functions designed specifically for tbl_time objects. Some of them are: time_filter(): Succinctly filter a tbl_time object by…
Original Post: Demo Week: Tidy Time Series Analysis with tibbletime

Demo Week: Tidy Forecasting with sweep

We’re into the third day of Business Science Demo Week. Hopefully by now you’re getting a taste of some interesting and useful packages. For those who may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today is sweep, which has broom-style tidiers for forecasting. Let’s get going! sweep is used for tidying the forecast package workflow. What broom is to the stats library, sweep is to the forecast package. It has useful functions including: sw_tidy, sw_glance, sw_augment, and sw_sweep. We’ll check out each in this demo. An added benefit to sweep and timetk is if the…
Original Post: Demo Week: Tidy Forecasting with sweep

Demo Week: Time Series Machine Learning with timetk

We’re into the second day of Business Science Demo Week. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Second up is timetk, your toolkit for time series in R. Here we go! timetk has three main uses: (1) time series machine learning: using regression algorithms to forecast; (2) making future time series indices: extracting, exploring, and extending a time series index using patterns in the time-base; and (3) coercing (converting) between time classes (e.g. between tbl, xts, zoo, ts): consistent coercion makes working in the various time classes much easier! We’ll go over time series ML and coercion today.…
Original Post: Demo Week: Time Series Machine Learning with timetk

Demo Week: class(Monday) <- tidyquant

We’ve got an exciting week ahead of us at Business Science: we’re launching our first-ever Business Science Demo Week. Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. First up is tidyquant, our flagship package that’s useful for financial and time series analysis. Here we go! Six reasons to use tidyquant: (1) getting web data from Yahoo! Finance, FRED Database, Quandl and more; (2) tidy application of financial and time series functions from xts, zoo, quantmod, TTR and PerformanceAnalytics; (3) graphing: beautiful themes and financial geoms (e.g. geom_ma); (4) aggregating portfolios; (5) financial performance analysis and portfolio attribution metrics; (6) great base for financial and time series analysis:…
Original Post: Demo Week: class(Monday) <- tidyquant

Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders

Sales, customer service, supply chain and logistics, manufacturing… no matter which department you’re in, you more than likely care about backorders. Backorders are products that are temporarily out of stock but that customers are still permitted to order against future inventory. Backorders are both good and bad: strong demand can drive backorders, but so can suboptimal planning. The problem is that when a product is not immediately available, customers may not have the luxury or patience to wait. This translates into lost sales and low customer satisfaction. The good news is that machine learning (ML) can be used to identify products at risk of backorders. In this article we use the new H2O automated ML algorithm to implement Kaggle-quality predictions on the Kaggle dataset, “Can You Predict Product Backorders?”. This is an advanced tutorial, which can be difficult…
Original Post: Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders