Registration and talk proposals now open for useR!2018

Registration is now open for useR! 2018, the official R user conference, to be held in Brisbane, Australia, July 10-13. If you haven’t been to a useR! conference before, it’s a fantastic opportunity to meet and mingle with other R users from around the world, see talks on R packages and applications, and attend tutorials for deep dives on R-related topics. This year’s conference will also feature keynotes from Jenny Bryan, Steph De Silva, Heike Hofmann, Thomas Lin Pedersen, Roger Peng and Bill Venables. It’s my favourite conference of the year, and I’m particularly looking forward to this one. This video from last year’s conference in Brussels (a sell-out with over 1,100 attendees) will give you a sense of what a useR! conference is like. The useR! conference is brought to you by the R Foundation and is 100% community-led.…
Original Post: Registration and talk proposals now open for useR!2018

Microsoft R Open 3.4.3 now available

Microsoft R Open (MRO), Microsoft’s enhanced distribution of open source R, has been upgraded to version 3.4.3 and is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to the latest R (version 3.4.3) and updates the bundled packages (specifically: checkpoint, curl, doParallel, foreach, and iterators) to new versions. MRO is 100% compatible with all R packages. MRO 3.4.3 points to a fixed CRAN snapshot taken on January 1, 2018, and you can see some highlights of new packages released since the prior version of MRO on the Spotlights page. As always, you can use the built-in checkpoint package to access packages from an earlier date (for reproducibility) or a later date (to access new and updated packages). MRO 3.4.3 is based on R 3.4.3, a minor update to the R engine (you can see the detailed list…
Original Post: Microsoft R Open 3.4.3 now available

KDnuggets™ News 18:n03, Jan 17: Top 10 TED Talks on Data Science, Machine Learning; How Docker Can Help You Become A More Effective Data Scientist

Also A Primer on Web Scraping in R; Elasticsearch for Dummies; Generative Adversarial Networks, an overview,
Original Post: KDnuggets™ News 18:n03, Jan 17: Top 10 TED Talks on Data Science, Machine Learning; How Docker Can Help You Become A More Effective Data Scientist

A simple way to set up a SparklyR cluster on Azure

The SparklyR package from RStudio provides a high-level interface to Spark from R. This means you can create R objects that point to data frames stored in the Spark cluster and apply some familiar R paradigms (like dplyr) to the data, all the while leveraging Spark’s distributed architecture without having to worry about memory limitations in R. You can also access the distributed machine-learning algorithms included in Spark directly from R functions.  If you don’t happen to have a cluster of Spark-enabled machines set up in a nearby well-ventilated closet, you can easily set one up in your favorite cloud service. For Azure, one option is to launch a Spark cluster in HDInsight, which also includes the extensions of Microsoft ML Server. While this service recently had a significant price reduction, it’s still more expensive than running a “vanilla” Spark-and-R…
Original Post: A simple way to set up a SparklyR cluster on Azure

Topological Data Analysis for Data Professionals: Beyond Ayasdi

We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.
Original Post: Topological Data Analysis for Data Professionals: Beyond Ayasdi

Services and tools for building intelligent R applications in the cloud

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft)
As an in-memory application, R is sometimes thought to be constrained in performance or scalability for enterprise-grade applications. But by deploying R in a high-performance cloud environment, and by leveraging the scale of parallel architectures and dedicated big-data technologies, you can build applications using R that provide the necessary computational efficiency, scale, and cost-effectiveness. We identify four application areas and associated applications and Azure services that you can use to deploy R in enterprise applications. They cover the tasks required to prototype, build, and operationalize an enterprise-level data science and AI solution. In each of the four, there are R packages and tools specifically for accelerating the development of desirable analytics. Below is a brief introduction of each.
Cloud resource management and operation
Cloud computing instances…
Original Post: Services and tools for building intelligent R applications in the cloud

A Primer on Web Scraping in R

If you are a data scientist who wants to capture data from such web pages then you wouldn’t want to be the one to open all these pages manually and scrape the web pages one by one. To push away the boundaries limiting data scientists from accessing such data from web pages, there are packages available in R.
Original Post: A Primer on Web Scraping in R

R jumps to 8th position in TIOBE language rankings

The R language surged to 8th place in the 2017 TIOBE language rankings, up 8 places from a year before. Fellow data science language Python also saw an increase in rankings, taking the 4th spot (one ahead of its January 2016 ranking). (Click the table for the current top 20 rankings.) TIOBE ranks programming languages according to their search engine rankings, and R has been steadily climbing since the rankings began. You can find the current TIOBE language rankings, updated monthly, at the link below.
TIOBE: TIOBE Index
Original Post: R jumps to 8th position in TIOBE language rankings

Three new domain-specific (embedded) languages with a Stan backend

One is an accident. Two is a coincidence. Three is a pattern. Perhaps it’s no coincidence that there are three new interfaces that use Stan’s C++ implementation of adaptive Hamiltonian Monte Carlo (currently an updated version of the no-U-turn sampler).
ScalaStan embeds a Stan-like language in Scala. It’s a Scala package largely (if not entirely) written by Joe Wingbermuehle. [GitHub link]
tmbstan lets you fit TMB models with Stan. It’s an R package listing Kasper Kristensen as author. [CRAN link]
SlicStan is a “blockless” and self-optimizing version of Stan. It’s a standalone language coded in F#, written by Maria Gorinova. [pdf language spec]
These are in contrast with systems that entirely reimplement a version of the no-U-turn sampler, such as PyMC3, ADMB, and NONMEM.
Original Post: Three new domain-specific (embedded) languages with a Stan backend