Demo Week: Time Series Machine Learning with h2o and timetk

We’re at the final day of Business Science Demo Week. Today we are demoing the h2o package for machine learning on time series data. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today you’ll see how we can use timetk + h2o to get really accurate time series forecasts. Here we go! The h2o package is a product offered by H2O.ai that contains a number of cutting-edge machine learning algorithms, performance metrics, and auxiliary functions to make machine learning both powerful and easy. One of the main benefits of H2O is that it can…
Original Post: Demo Week: Time Series Machine Learning with h2o and timetk

Demo Week: Tidy Time Series Analysis with tibbletime

We’re into the fourth day of Business Science Demo Week. We have a really cool one in store today: tibbletime, which uses a new tbl_time class that is time-aware! For those that may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Let’s take tibbletime for a spin! The future of “tidy” time series analysis: the new tbl_time class rests on top of tbl and makes tibbles time-aware. Time series functions: you can use a series of “tidy” time series functions designed specifically for tbl_time objects. Some of them are: time_filter(): Succinctly filter a tbl_time object by…
Original Post: Demo Week: Tidy Time Series Analysis with tibbletime

Demo Week: Tidy Forecasting with sweep

We’re into the third day of Business Science Demo Week. Hopefully by now you’re getting a taste of some interesting and useful packages. For those that may have missed it, every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Today is sweep, which has broom-style tidiers for forecasting. Let’s get going! sweep is used for tidying the forecast package workflow. What broom is to the stats library, sweep is to the forecast package. It has useful functions including: sw_tidy, sw_glance, sw_augment, and sw_sweep. We’ll check out each in this demo. An added benefit to sweep and timetk is if the…
Original Post: Demo Week: Tidy Forecasting with sweep
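
As a rough illustration of the four sweep tidiers named in the excerpt above, here is a minimal sketch. The dataset (USAccDeaths, which ships with R) and the choice of an ETS model from the forecast package are assumptions made for this example; they are not taken from the original post, which may use different data and models.

```r
# Illustrative sketch only: the dataset (USAccDeaths) and the ETS model are
# assumptions chosen for this example, not taken from the original post.
library(forecast)  # ets(), forecast()
library(sweep)     # sw_tidy(), sw_glance(), sw_augment(), sw_sweep()

# Fit an exponential smoothing model to a monthly series that ships with R.
fit <- ets(USAccDeaths)

# Model-level tidiers, analogous to broom's tidy(), glance(), and augment().
params    <- sw_tidy(fit)     # one row per estimated model parameter
accuracy  <- sw_glance(fit)   # one-row summary of fit and accuracy statistics
augmented <- sw_augment(fit)  # one row per observation, with fitted values and residuals

# Forecast 12 months ahead, then tidy the forecast object into a tibble
# that stacks actual values and forecasts, labeled in a "key" column.
fcast <- forecast(fit, h = 12)
swept <- sw_sweep(fcast)

print(swept)
```

Because every result is a tibble, the output can be passed straight into dplyr or ggplot2 pipelines, which is the main appeal of the broom-style approach.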

Demo Week: Time Series Machine Learning with timetk

We’re into the second day of Business Science Demo Week. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Second up is timetk, your toolkit for time series in R. Here we go! There are three main uses: (1) Time series machine learning: using regression algorithms to forecast. (2) Making future time series indices: extract, explore, and extend a time series index using patterns in the time-base. (3) Coercing (converting) between time classes (e.g. between tbl, xts, zoo, ts): consistent coercion makes working in the various time classes much easier! We’ll go over time series ML and coercion today.…
Original Post: Demo Week: Time Series Machine Learning with timetk

Demo Week: class(Monday) <- tidyquant

We’ve got an exciting week ahead of us at Business Science: we’re launching our first ever Business Science Demo Week. Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. First up is tidyquant, our flagship package that’s useful for financial and time series analysis. Here we go! Six reasons to use tidyquant: (1) Getting web data from Yahoo! Finance, FRED Database, Quandl and more. (2) Tidy application of financial and time series functions from xts, zoo, quantmod, TTR and PerformanceAnalytics. (3) Graphing: beautiful themes and financial geoms (e.g. geom_ma). (4) Aggregating portfolios. (5) Financial performance analysis and portfolio attribution metrics. (6) Great base for financial and time series analysis:…
Original Post: Demo Week: class(Monday) <- tidyquant

Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders

Sales, customer service, supply chain and logistics, manufacturing… no matter which department you’re in, you more than likely care about backorders. Backorders are products that are temporarily out of stock, but for which a customer is permitted to place an order against future inventory. Backorders are both good and bad: strong demand can drive backorders, but so can suboptimal planning. The problem is that when a product is not immediately available, customers may not have the luxury or patience to wait. This translates into lost sales and low customer satisfaction. The good news is that machine learning (ML) can be used to identify products at risk of backorders. In this article we use the new H2O automated ML algorithm to implement Kaggle-quality predictions on the Kaggle dataset, “Can You Predict Product Backorders?”. This is an advanced tutorial, which can be difficult…
Original Post: Sales Analytics: How to Use Machine Learning to Predict and Optimize Product Backorders

It’s tibbletime v0.0.2: Time-Aware Tibbles, New Functions, Weather Analysis and More

Today we are introducing tibbletime v0.0.2, and we’ve got a ton of new features in store for you. We have functions for converting to flexible time periods with the ~period formula~ and making/calculating custom rolling functions with rollify() (plus a bunch more new functionality!). We’ll take the new functionality for a spin with some weather data (from the weatherData package). However, the new tools make tibbletime useful in a number of broad applications such as forecasting, financial analysis, business analysis and more! We truly view tibbletime as the next phase of time series analysis in the tidyverse. If you like what we do, please connect with us on social media to stay up on the latest Business Science news, events and information! Introduction We are excited to announce the release of tibbletime v0.0.2 on CRAN. Loads of new functionality have been…
Original Post: It’s tibbletime v0.0.2: Time-Aware Tibbles, New Functions, Weather Analysis and More

HR Analytics: Using Machine Learning to Predict Employee Turnover

Employee turnover (attrition) is a major cost to an organization, and predicting turnover is at the forefront of the needs of Human Resources (HR) in many organizations. Until now the mainstream approach has been to use logistic regression or survival curves to model employee attrition. However, with advancements in machine learning (ML), we can now get both better predictive performance and better explanations of what critical features are linked to employee attrition. In this post, we’ll use two cutting-edge techniques. First, we’ll use the h2o package’s new FREE automatic machine learning algorithm, h2o.automl(), to develop a predictive model that is in the same ballpark as commercial products in terms of ML accuracy. Then we’ll use the new lime package that enables breakdown of complex, black-box machine learning models into variable importance plots. We can’t stress how excited we are to…
Original Post: HR Analytics: Using Machine Learning to Predict Employee Turnover

It’s tibbletime: Time-Aware Tibbles

We are very excited to announce the initial release of our newest R package, tibbletime. As evident from the name, tibbletime is built on top of the tibble package (and more generally on top of the tidyverse) with the main purpose of being able to create time-aware tibbles through a one-time specification of an “index” column (a column containing timestamp information). There are a ton of useful time functions that we can now use such as time_filter(), time_summarize(), tmap(), as_period() and time_collapse(). We’ll walk through the basics in this post. If you like what we do, please follow us on social media to stay up on the latest Business Science news, events and information! As always, we are interested in both expanding our network of data scientists and seeking new clients interested in applying data science to business and finance. If interested, contact us.…
Original Post: It’s tibbletime: Time-Aware Tibbles

alphavantager: An R interface to the Free Alpha Vantage Financial Data API

We’re excited to announce the alphavantager package, a lightweight R interface to the Alpha Vantage API! Alpha Vantage is a FREE API for retrieving real-time and historical financial data. It’s very easy to use, and, with the recent glitch with the Yahoo Finance API, Alpha Vantage is a solid alternative for retrieving financial data for FREE! It’s definitely worth checking out if you are interested in financial analysis. We’ll go through the alphavantager R interface in this post to show you how easy it is to get real-time and historical financial data. In the near future, we have plans to incorporate alphavantager into tidyquant to enable scaling from one equity to many. If you like what you read, please follow us on social media to stay up on the latest Business Science news, events and information! As always, we…
Original Post: alphavantager: An R interface to the Free Alpha Vantage Financial Data API