New open data sets from Microsoft Research

Microsoft has released a number of data sets produced by Microsoft Research and made them available for download at Microsoft Research Open Data. The data sets in Microsoft Research Open Data are categorized by their primary research area, such as Physics, Social Science, Environmental Science, and Information Science. Many of the data sets have not previously been available to the public, and many are large and useful for research in AI and machine learning techniques. Many also include links to associated papers from Microsoft Research. For example, the 10GB DESM Word Embeddings dataset provides the IN and the OUT word2vec embeddings for 2.7M words trained on a Bing query corpus of 600M+ queries. Other data sets of note include: a collection of 38M tweets related to the 2012 US election; 3-D capture data from individuals performing a variety of hand gestures…
Original Post: New open data sets from Microsoft Research
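
For readers curious how the IN and OUT embeddings in the DESM dataset might be used together, here is a minimal Python sketch of the IN-OUT scoring idea as we understand it from the associated paper: average the cosine similarity between each query word's IN vector and the centroid of the document's unit-normalized OUT vectors. The two-dimensional vectors and the function names are made up for illustration, and the dataset's actual file format and loading code are not shown.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_out_score(query_words, doc_words, in_emb, out_emb):
    """Mean cosine between query IN vectors and the document's OUT centroid."""
    # Words missing from either vocabulary are skipped.
    doc_vecs = [normalize(out_emb[w]) for w in doc_words if w in out_emb]
    query_vecs = [in_emb[w] for w in query_words if w in in_emb]
    if not doc_vecs or not query_vecs:
        raise ValueError("no known words in query or document")
    # Centroid: component-wise mean of the normalized document vectors.
    centroid = [sum(col) / len(doc_vecs) for col in zip(*doc_vecs)]
    return sum(cosine(q, centroid) for q in query_vecs) / len(query_vecs)


# Toy embeddings (hypothetical values, not taken from the dataset).
toy_in = {"yale": [1.0, 0.0]}
toy_out = {"faculty": [0.8, 0.6], "banana": [0.0, 1.0]}
```

With these toy vectors, a document containing "faculty" scores higher for the query "yale" than one containing "banana", because its OUT vector points in a direction closer to the query's IN vector.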

Announcing Microsoft Research Open Data, a cloud hosted platform for sharing datasets

Microsoft announces Microsoft Research Open Data, datasets representing many years of data curation and research efforts by Microsoft that were published as research outcomes.
Original Post: Announcing Microsoft Research Open Data, a cloud hosted platform for sharing datasets

How to Execute R and Python in SQL Server with Machine Learning Services

Machine Learning Services in SQL Server eliminates the need for data movement: you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
Original Post: How to Execute R and Python in SQL Server with Machine Learning Services

Hotfix for Microsoft R Open 3.5.0 on Linux

On Monday, we learned about a serious issue with the installer for Microsoft R Open on Linux-based systems. (Thanks to Norbert Preining for reporting the problem.) The issue was that the installation and uninstallation scripts would modify the system shell, and did not use the standard practices to create and restore symlinks for system applications. The Microsoft R team developed a solution to the problem with the help of some Debian experts at Microsoft, and last night issued a hotfix for Microsoft R Open 3.5.0 which is now available for download. With this fix, the MRO installer no longer relinks /bin/sh to /bin/bash, and instead uses dpkg-divert for Debian-based platforms and update-alternatives for RPM-based platforms. We will also request a discussion with the Debian maintainers of R to further review our installation process. Finally, with the next release, MRO 3.5.1, scheduled for…
Original Post: Hotfix for Microsoft R Open 3.5.0 on Linux
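
If you installed MRO 3.5.0 on Linux before the hotfix and want to see what /bin/sh currently points to, a small Python check along these lines may help. This is only a diagnostic sketch: the function names are our own, and it does not repair anything or call dpkg-divert or update-alternatives.

```python
import os


def resolve_shell(path="/bin/sh"):
    """Return the real file that `path` refers to after following symlinks."""
    return os.path.realpath(path)


def sh_points_to_bash(path="/bin/sh"):
    """True if `path` ultimately resolves to a file named 'bash'."""
    return os.path.basename(resolve_shell(path)) == "bash"


if __name__ == "__main__":
    print("/bin/sh resolves to:", resolve_shell())
    print("points to bash:", sh_points_to_bash())
```

Note that a result naming bash is not by itself a sign of trouble, since some distributions ship /bin/sh as a link to bash by default; on Debian-based systems the default is usually dash.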

Microsoft R Open 3.5.0 now available

Microsoft R Open 3.5.0 is now available for download for Windows, Mac and Linux. This update includes the open-source R 3.5.0 engine, which is a major update with many new capabilities and improvements to R. In particular, it includes a new framework for handling data in R, with significant behind-the-scenes performance and memory-use benefits (and with further improvements expected in the future). Microsoft R Open 3.5.0 points to a fixed CRAN snapshot taken on June 1, 2018. This provides a reproducible experience when installing CRAN packages by default, but you can always change the default CRAN repository or use the built-in checkpoint package to access snapshots of packages from an earlier or later date. Relatedly, many new packages have been released since the last release of Microsoft R Open, and you can browse a curated list of some interesting ones on the Microsoft…
Original Post: Microsoft R Open 3.5.0 now available
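
To make the idea of a dated CRAN snapshot concrete, here is a tiny Python sketch that builds a snapshot repository URL for a given date. It assumes the MRAN snapshot URL pattern (a base address followed by an ISO date) that was in use around this release; the constant and function name are our own, and the service may not be reachable at that address today.

```python
from datetime import date

# Assumed base address of the MRAN snapshot service at the time of this release.
MRAN_SNAPSHOT_BASE = "https://mran.microsoft.com/snapshot"


def cran_snapshot_url(day):
    """Build the snapshot repository URL for a given calendar date."""
    return f"{MRAN_SNAPSHOT_BASE}/{day.isoformat()}"


if __name__ == "__main__":
    # The default snapshot date for MRO 3.5.0, per the post above.
    print(cran_snapshot_url(date(2018, 6, 1)))
```

In an R session, a URL of this form is the kind of value you would supply as the CRAN entry of the repos option to install packages as they existed on that date.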

What's new in Azure for Machine Learning and AI

There were a lot of big announcements at last month’s Build conference, and many of them were related to machine learning and artificial intelligence. My colleague Tim Heuer and I summarized some of the big announcements, and a few you may have missed, in a recent webinar. The slides are embedded below, and include links to recordings of the Build sessions where you can find in-depth details. You can’t see the videos or demos in the slides, unfortunately (my favorite is a demo of using Microsoft Translator, trained by a hearing-impaired user, to accurately transcribe “deaf voice”), but you can find the videos and discussion from Tim and me in the on-demand recording available at the link below. Azure Webinar Series: Top Azure Takeaways from Microsoft Build
Original Post: What's new in Azure for Machine Learning and AI

Custom R charts coming to Excel

This week at the Build conference, Microsoft announced that Power BI custom visuals will soon be available as charts within Excel. You’ll be able to choose a range of data within an Excel workbook, and pass those data to one of the built-in Power BI custom visuals, or one you’ve created yourself using the API. Since you can create Power BI custom visuals using R, that means you’ll be able to design a custom R-based chart and make it available to people using Excel, even if they don’t know how to use R themselves. There are also many pre-defined custom visuals available, including some familiar R charts like decision trees, calendar heatmaps, and hexbin scatterplots. For more details on how you’ll be able to use custom R visuals in Excel, check out the blog post linked below. PowerBI Blog: Excel announces…
Original Post: Custom R charts coming to Excel

Open-Source Machine Learning in Azure

The topic for my talk at the Microsoft Build conference yesterday was “Migrating Existing Open Source Machine Learning to Azure”. The idea behind the talk was to show how you can take the open-source tools and workflows you already use for machine learning and data science, and easily transition them to the Azure cloud to take advantage of its capacity and scale. The theme for the talk was “no surprises”, and other than the Azure-specific elements I tried to stick to standard OSS tools rather than Microsoft-specific things, to make the process as familiar as possible. In the talk I covered: using Visual Studio Code as a cross-platform, open-source editor and interface to Azure services; using the Azure CLI to script the deployment, functions, and deletion of resources in the Azure cloud; using the range of data science and machine learning tools…
Original Post: Open-Source Machine Learning in Azure

Microsoft R Open 3.4.4 now available

An update to Microsoft R Open (MRO) is now available for download on Windows, Mac and Linux. This release upgrades the R language engine to version 3.4.4, which addresses minor issues with timezone detection and edge cases in some statistics functions. As a maintenance release, it’s backwards-compatible with scripts and packages from the prior release of MRO. MRO 3.4.4 points to a fixed CRAN snapshot taken on April 1, 2018, and you can see some highlights of new packages released since the prior version of MRO on the Spotlights page. As always, you can use the built-in checkpoint package to access packages from an earlier date (for reproducibility) or a later date (to access new and updated packages). Looking ahead, the next update based on R 3.5.0 has started the build and test process. Microsoft R Open 3.5.0 is scheduled for release…
Original Post: Microsoft R Open 3.4.4 now available