Tutorial: Using seplyr to Program Over dplyr

seplyr is an R package that makes it easy to program over dplyr 0.7.*. To illustrate this, we will work through an example. Suppose you had worked out a dplyr pipeline that performed an analysis you were interested in. For an example, we could take something similar to one of the examples from the dplyr 0.7.0 announcement.
    suppressPackageStartupMessages(library("dplyr"))
    packageVersion("dplyr")
    ## [1] '0.7.2'
    cat(colnames(starwars), sep='\n')
    ## name
    ## height
    ## mass
    ## hair_color
    ## skin_color
    ## eye_color
    ## birth_year
    ## gender
    ## homeworld
    ## species
    ## films
    ## vehicles
    ## starships
    starwars %>%
      group_by(homeworld) %>%
      summarise(mean_height = mean(height, na.rm = TRUE),
                mean_mass = mean(mass, na.rm = TRUE),
                count = n())
    ## # A tibble: 49 x 4
    ##   homeworld    mean_height  mean_mass  count
    ## 1 Alderaan     176.3333     64.0       3
    ## 2 Aleen Minor  79.0000      15.0       1
    ## 3 Bespin       175.0000     79.0…
Original Post: Tutorial: Using seplyr to Program Over dplyr

What’s in our internal chaimagic package at work

At my day job I’m a data manager and statistician for an epidemiology project called CHAI, led by Cathryn Tonne. CHAI means “Cardio-vascular health effects of air pollution in Telangana, India” and you can find more about it in our recently published protocol paper. At my institute you could also find the PASTA and TAPAS projects, so apparently epidemiologists are good at naming things, or obsessed with food… But back to CHAI! This week Sean Lopp from RStudio wrote an interesting blog post about internal packages. I liked reading it and feeling good because we do have an internal R package for CHAI! In this blog post, I’ll explain what’s in there, in the hope of maybe providing inspiration for your own internal package! As posted in this tweet, this pic represents the Barcelona contingent of CHAI, a…
Original Post: What’s in our internal chaimagic package at work

Because it's Friday: How Bitcoin works

Cryptocurrencies have been in the news quite a bit lately. Bitcoin prices have been soaring recently after the community narrowly avoided the need for a fork, while $32M in rival currency Ethereum was recently stolen, thanks to a coding error in wallet application Parity. But what is a cryptocurrency, and what does a “wallet” or a “fork” mean in that context? The video below gives the best explanation I’ve seen for how cryptocurrencies work. It’s 25 minutes long, but it’s a complex and surprisingly subtle topic, made easy to understand by math explainer channel 3Blue1Brown. That’s all from the blog for this week. Have a great weekend, and we’ll be back on Monday.
Original Post: Because it's Friday: How Bitcoin works

Caution: Optimism about AI improving society is high, but drops with experience developing AI systems

While about 60% of KDnuggets readers think AI and Automation will improve society, the optimism drops significantly among those with 4 or more years of experience developing AI systems. Should we pay more attention to the experts?
Original Post: Caution: Optimism about AI improving society is high, but drops with experience developing AI systems

Stan Weekly Roundup, 21 July 2017

It was another productive week in Stan land. The big news is that Jonathan Auerbach reports that: “A team of Columbia students (mostly Andrew’s, including myself) recently won first place in a competition predicting elementary school enrollment. I heard 192 entered, and there were 5 finalists. … Of course, we used Stan (RStan specifically). … Thought it might be Stan news worthy.” I’d say that’s newsworthy. Jon also provided a link to the “challenge” page, a New York City government sponsored “call for innovations”: Enhancing School Zoning Efforts by Predicting Population Change. They took home a US$20K paycheck for their efforts! Stan’s seeing quite a lot of use these days among demographers and others looking to predict forward from time series data. Jonathan’s been very active using government data sets (see his StanCon 2017 presentation with Rob Trangucci, Twelve Cities: Does…
Original Post: Stan Weekly Roundup, 21 July 2017

IEEE Spectrum 2017 Top Programming Languages

IEEE Spectrum has published its fourth annual ranking of top programming languages, and the R language is again featured in the Top 10. This year R ranks at #6, down a spot from its 2016 ranking (and with an IEEE score — derived from search, social media, and job listing trends — tied with the #5 place-getter, C#). Python has taken the #1 slot from C, jumping from its #3 ranking in 2016. For R (a domain specific language for data science) to rank in the top 10, and for Python (a general-purpose language with many data science applications) to take the top spot, may seem like a surprise. I attribute this to continued broad demand for machine intelligence application development, driven by the growth of “big data” initiatives and the strategic imperative to capitalize on these data stores by companies…
Original Post: IEEE Spectrum 2017 Top Programming Languages

How to create reports with R Markdown in RStudio

Introduction: R Markdown is one of the most popular data science tools and is used to save and execute code to create exceptional reports which are easily shareable. The documents that R Markdown provides are fully reproducible and support a wide variety of static and dynamic output formats. R Markdown uses markdown syntax, which provides an easy way of creating documents that can be converted to many other file types, while embedding R code in the report, so it is not necessary to keep the report and R script separately. Furthermore, the report is written as normal text, so knowledge of HTML is not required. Of course, no additional files are needed because everything is incorporated in the HTML file. Package Installation: In order to use the R Markdown package, we have to install and load it with:
    install.packages("rmarkdown")
    library(rmarkdown)
Create…
Original Post: How to create reports with R Markdown in RStudio