Highlights from the Connect(); conference

Connect();, the annual Microsoft developer conference, is wrapping up now in New York. The conference was the venue for a number of major announcements and talks. Here are some highlights related to data science, machine learning, and artificial intelligence: Lastly, I wanted to share this video presented at the conference from Stack Overflow. Keep an eye out for R community luminary David Robinson programming in R! You can find more from the Connect conference, including on-demand replays of the talks and keynotes, at the link below. Microsoft: Connect(); November 15-17, 2017
Original Post: Highlights from the Connect(); conference

How (& Why) Data Scientists and Data Engineers Should Share a Platform

Sharing one platform has some obvious benefits for Data Science and Data Engineering teams, but technical, language and process challenges often make this difficult. Learn how one company implemented a single cloud platform for R, Python and other workloads – and some of the unexpected benefits they discovered along the way.
Original Post: How (& Why) Data Scientists and Data Engineers Should Share a Platform

The City of Chicago uses R to issue beach safety alerts

Among the many interesting talks I saw at the Domino Data Science Pop-Up in Chicago earlier this week was the presentation by Gene Leynes and Nick Lucius from the City of Chicago. The City of Chicago Tech Plan encourages smart communities and open government, and as part of that initiative the city has undertaken dozens of open-source, open-data projects in areas such as food safety inspections, preventing the spread of West Nile virus, and keeping sidewalks clear of snow. This talk was on the Clear Water initiative, a project to monitor the water quality of Chicago’s many public beaches on Lake Michigan, and to issue safety alerts (or in serious cases, beach closures) when E. coli levels in the water get too high. The problem is that E. coli levels can change rapidly: water levels can be normal for weeks,…
Original Post: The City of Chicago uses R to issue beach safety alerts

Updated curl package provides additional security for R on Windows

There are many R packages that connect to the internet, whether it’s to import data (readr), install packages from GitHub (devtools), connect with cloud services (AzureML), or many other web-connected tasks. There’s one R package in particular that provides the underlying connection between R and the Web: curl, by Jeroen Ooms, who is also the new maintainer for R for Windows. (The name comes from curl, a command-line utility and interface library for connecting to web-based services.) The curl package provides replacements for the standard url and download.file functions in R with support for encryption, and the package was recently updated to enhance its security, particularly on Windows. To implement secure communications, the curl package needs to connect with a library that handles the SSL (Secure Sockets Layer) encryption. On Linux and Macs, curl has always used the OpenSSL library, which is…
Original Post: Updated curl package provides additional security for R on Windows

Recap: EARL Boston 2017

By Emmanuel Awa, Francesca Lazzeri and Jaya Mathew, data scientists at Microsoft. A few of us got to attend the EARL conference in Boston last week, which brought together a group of talented users of R from academia and industry. The conference highlighted various Enterprise Applications of R. Despite this being a small conference, the quality of the talks was great, and they showcased various innovative ways of using some of the newer packages available in the R language. Some of the attendees were veteran R users while some were newcomers to the R language, so there was a mix in the level of proficiency in using the R language. R currently has a vibrant community of users and there are over 11,000 open source packages. The conference also encouraged women to join their local chapter for R-Ladies…
Original Post: Recap: EARL Boston 2017

Calculating the house edge of a slot machine, with R

Modern slot machines (fruit machine, pokies, or whatever those electronic gambling devices are called in your part of the world) are designed to be addictive. They’re also usually quite complicated, with a bunch of features that affect the payout of a spin: multiple symbols with different pay scales, wildcards, scatter symbols, free spins, jackpots … the list goes on. Many machines also let you play multiple combinations at the same time (20 lines, or 80, or even more with just one spin). All of this complexity is designed to make it hard for you, the player, to judge the real odds of success. But rest assured: in the long run, you always lose.  All slot machines are designed to have a “house edge” — the percentage of player bets retained by the machine in the long run — greater than…
Original Post: Calculating the house edge of a slot machine, with R
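
To make the "house edge" definition above concrete, here is a minimal sketch of the arithmetic in Python. It is not the method from the original post (which uses R); the paytable, its probabilities and its payouts are invented purely for illustration, and a real machine would be far more complicated. The idea is only that the house edge is one minus the expected amount returned per unit bet.

```python
from fractions import Fraction

# Hypothetical single-line machine (all numbers are made up for illustration).
# Each outcome maps to (probability of the outcome, amount returned per 1 unit bet).
# Any spin not listed here returns nothing.
PAYTABLE = {
    "jackpot": (Fraction(1, 1000), 200),
    "three_of_a_kind": (Fraction(1, 50), 10),
    "pair": (Fraction(1, 5), 2),
}


def house_edge(paytable):
    """Return the long-run fraction of each bet retained by the machine."""
    total_probability = sum(p for p, _ in paytable.values())
    if total_probability > 1:
        raise ValueError("outcome probabilities must not exceed 1 in total")
    # Expected amount paid back to the player per unit bet.
    expected_return = sum(p * payout for p, payout in paytable.values())
    return 1 - expected_return


if __name__ == "__main__":
    edge = house_edge(PAYTABLE)
    print(f"House edge: {float(edge):.1%}")
```

With this invented paytable the expected return is 0.2 + 0.2 + 0.4 = 0.8 per unit bet, so the machine keeps 20% in the long run. Exact fractions are used so the result is not affected by floating-point rounding.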

In case you missed it: October 2017 roundup

In case you missed them, here are some articles from October of particular interest to R users. A recent survey of competitors on the Kaggle platform reveals that Python (76%) and R (59%) are the preferred tools for building predictive models. Microsoft’s “Team Data Science Process” has been updated with new guidelines on use of the IDEAR framework for R and Python. Microsoft R Open 3.4.2 is now available for Windows, Mac and Linux. Using the foreach package to estimate bias of rpart trees via bootstrapping. Replays of webinars on the Azure Data Science VM, and on document collection analysis with Azure ML Workbench, are now available. The “officer” package makes it possible to create PowerPoint and Word documents from R, and even include editable R charts. An online book on statistical machine learning with the MicrosoftML package. An updated…
Original Post: In case you missed it: October 2017 roundup