Education Analytics with R and Cortana Intelligence Suite

By Fang Zhou, Microsoft Data Scientist; Hong Ooi, Microsoft Senior Data Scientist; and Graham Williams, Microsoft Director of Data Science

Education is a relatively late adopter of predictive analytics and machine learning as a management tool. A keen desire to improve educational outcomes for society is now leading universities and governments to perform student predictive analytics to support better-informed and more timely decision making. Student predictive analytics often aims to solve two key problems: predicting student academic outcomes so as to better target support, and predicting which students are at risk of dropping out so as to prevent attrition. Education systems face enormous diversity across regions and countries. Two case studies demonstrate the novel and unique landscape for machine learning in the education world. A mixed effects regression model has been developed in conjunction with an Australian education department to measure the influence of…
Original Post: Education Analytics with R and Cortana Intelligence Suite

Analyzing emotions in video with R

In the run-up to the election last year, Ben Heubl from The Economist used the Emotion API to chart the emotions portrayed by the candidates during the debates (note: auto-play video in that link). In his walkthrough of the implementation, Ben used Python to process the video files, and R to create the charts from the sentiment scores generated by the API. Now, the learn dplyr blog has recreated the analysis using R. A detailed walkthrough steps through the process of creating a free Emotion API key, submitting a video to the API using the httr package, and retrieving the emotion scores as an R data frame.  With the emotion scores in hand, the blog visualizes the data using exploratory.io, a web-based data exploration GUI based on R. With a few points and clicks (plus a little R code to wrangle the…
Original Post: Analyzing emotions in video with R

The Flexibility of Remote and Local R Workspaces

by Sean Wells, Senior Software Engineer, Microsoft

The mrsdeploy R package facilitates Remote Execution and Web Service interactions from your local R IDE command line against a remote Microsoft R Server instance. The two core features can be used independently of one another or combined to support different workflows, and these workflows can be composed into some creative R development and operationalization solutions. mrsdeploy comes bundled with installations of Microsoft R Client and Microsoft R Server. Before using any mrsdeploy API, you must first authenticate against the R Server, so it helps to become familiar with the different authentication approaches and their supported arguments. Currently, there are two ways to authenticate: remoteLogin() using an on-premises Active Directory server on your network: remoteLogin( endpoint, session = TRUE, diff = TRUE, commandline = TRUE, username = NULL, password =…
Original Post: The Flexibility of Remote and Local R Workspaces

Power BI custom visuals, based on R

You’ve been able to include user-defined charts using R in Power BI dashboards for a while now, but a recent update to Power BI includes seven new custom charts based on R in the custom visuals gallery. You can see the new chart types by visiting the Power BI Custom Visuals Gallery and clicking on the “R-powered visuals” tab. The new custom visuals are listed below. Click on the visual names to see an example, and click on the “GitHub” link at the bottom of the pop-up to see the actual R code used. You can also combine custom R charts with other Power BI tools to create interactive R charts. A favorite of mine is the Timeline slicer, which allows you to select a range from a date variable. (The dashboard, including the R charts, is updated using only…
Original Post: Power BI custom visuals, based on R

Parallelizing Data Analytics on Azure with the R Interface Tool

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft)

In data science, developing a model with optimal performance often involves exploratory experiments on different sets of hyper-parameters. Preliminary analyses on smaller data can be performed on a single machine, while large-scale experiments that sweep multiple sets of parameters can be run on a cluster to speed up the computation. Such a scenario calls for scalable computation resources that are easy to manage. This blog post shares a walk-through using the Azure Interface tool that operates and manages Azure cloud instances directly from R using the AzureSMR package, and executes scalable analytical jobs on deployed instances using customized Microsoft R Server computation contexts, which are easily specified in R using an “interface object”. The overall architecture of the Interface Tool is described in…
Original Post: Parallelizing Data Analytics on Azure with the R Interface Tool

Take a Test Drive of the Linux Data Science Virtual Machine

If you’ve been thinking about trying out the Data Science Virtual Machine on Linux, but don’t yet have an Azure account, you can now take a free test drive, with no credit card required! Just visit the Linux DSVM Marketplace page and click the blue button. The Linux Data Science Virtual Machine includes all of the tools a modern data scientist needs, in one easy-to-launch package. With it, you can try exploring data with Apache Drill, train deep neural networks for computer vision with MXNet, develop AI applications with the Cognitive Toolkit, or create statistical models with big data in R with Microsoft R Server 9.0. For details on how to test drive the Linux Data Science Virtual Machine, follow the link below. Cortana Intelligence and Machine Learning Blog: New Additions to the Data Science Virtual Machine – Test Drive, Community Forums,…
Original Post: Take a Test Drive of the Linux Data Science Virtual Machine

Introducing the AzureSMR package: Manage Azure services from your R session

by Alan Weaver, Advanced Analytics Specialist at Microsoft

Very often data scientists and analysts require access to back-end resources on Azure. For example, they may need to start a virtual machine or resize a Hadoop cluster. This typically requires making a request to the IT department and patiently waiting. AzureSMR is a simple R package that enables those users to do many of those operations themselves. It’s very easy to script commonly-used functions, which can then be run without having to navigate the portal or wizards. AzureSMR uses the Azure Systems Management API and leverages standard packages such as httr, so it can easily run in any R session (you don’t need Microsoft R Server). You can also manage multiple Azure subscriptions from within the same session. The AzureSMR functions currently address the following Azure Services: Azure Blob: List, Read and…
Original Post: Introducing the AzureSMR package: Manage Azure services from your R session

Interactive decision trees with Microsoft R

Even though ensembles of trees (random forests and the like) generally have better predictive power and robustness, fitting a single decision tree to data can often be very useful for: understanding the important variables in a data set; exploring unusual subsegments of the data (and the explanatory variables that define them); presenting a simple, decision-based model to management to explain behaviors in data; and illustrating a model graphically. But to get the best out of a decision tree, you need to be able to look at it, interact with it, and present it attractively. This blog post by Longhow Lam demonstrates the interactive tree viewer in Microsoft R, which lets you explore the individual nodes and breakpoints in the fitted tree, and which can be embedded on a web page or printed in a report. Click on the screenshot below (from an…
Original Post: Interactive decision trees with Microsoft R

How the State of Indiana uses R and Azure to forecast employment

“Big Data” generates a lot of news these days, but sometimes small data still means big computation. Indiana’s Department of Workforce Development has the responsibility to forecast future employment rates in the State of Indiana. And not just the number of jobs available: the department also needs to forecast the types of jobs that will be available, so the administration can link up future job requirements with training and education policies and make sure a skilled workforce can fill the future demand. The State of Indiana contracted with analytics professional services firm Inquidia Consulting to develop a system to generate the forecast, as part of the “Demand-Driven Workforce System” project. After assembling a database of employment data, the team quickly found that this wasn’t exactly a big-data problem: each forecast was based on just 1 KB to 3 KB of data, and could…
Original Post: How the State of Indiana uses R and Azure to forecast employment