Wrapping Access to Web-Services in R-functions.

One of the great features of R is the possibility to quickly access web-services. While some companies have the habit and policy to document their APIs, there is still a large chunk of undocumented but great web-services that help the regular data scientist. In the following short post, I will show how we can turn a simple web-service into a nice R-function. The example I am going to use is the Linguee translation service: DeepL. Just like Google Translate, DeepL features a simple text field. When a user types in text, the translation appears in a second textbox. Users can choose between the languages. In order to see how the service works in the backend, let’s have a quick look at the network traffic. For that we open the browser’s developer tools and jump to the network tab. Next, we type in a…
Original Post: Wrapping Access to Web-Services in R-functions.
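
The excerpt above stops before the actual request is shown, so the following is only a minimal sketch of the general idea of wrapping a web-service call in an R function. The endpoint `https://example.com/translate`, the query parameter names, and the helper names `build_translate_url()` and `translate()` are all hypothetical; they are not DeepL's real interface, and the original post may well use a different request style (for example a POST with a JSON body).

```r
# Build the request URL for a HYPOTHETICAL translation endpoint.
# The base URL and parameter names are assumptions for illustration only.
build_translate_url <- function(text, source_lang = "EN", target_lang = "DE",
                                base_url = "https://example.com/translate") {
  stopifnot(is.character(text), length(text) == 1L)
  query <- c(text = text, source = source_lang, target = target_lang)
  # Percent-encode every value so spaces and reserved characters are safe.
  encoded <- vapply(query, utils::URLencode, character(1), reserved = TRUE)
  paste0(base_url, "?",
         paste(names(encoded), encoded, sep = "=", collapse = "&"))
}

# Wrap the call in a plain R function. The `fetch` argument is injectable,
# so the wrapper can be exercised without any network access.
translate <- function(text, ...,
                      fetch = function(url) readLines(url, warn = FALSE)) {
  url <- build_translate_url(text, ...)
  paste(fetch(url), collapse = "\n")
}

# Inspect the URL that would be requested.
build_translate_url("Hello world")
```

Keeping URL construction separate from the fetch step makes the wrapper easy to check offline: pass a stub as `fetch` and only the request-building logic is exercised.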

#15: Tidyverse and data.table, sitting side by side … (Part 1)

Welcome to the fifteenth post in the rarely rational R rambling series, or R4 for short. There are two posts I have been meaning to get out for a bit, and hope to get to shortly, but in the meantime we are going to start something else. Another longer-running idea I had was to present some simple application cases with (one or more) side-by-side code comparisons. Why? Well at times it feels like R, and the R community, are being split. You’re either with one (increasingly “religious” in their defense of their deemed-superior approach) side, or the other. And that is of course utter nonsense. It’s all R after all. Programming, just like other fields using engineering methods and thinking, is about making choices, and trading off between certain aspects. A simple example is the fairly well-known trade-off between memory use and…
Original Post: #15: Tidyverse and data.table, sitting side by side … (Part 1)

Advisory on Multiple Assignment dplyr::mutate() on Databases

I currently advise R dplyr users to take care when using multiple assignment dplyr::mutate() commands on databases. (image: Kingroyos, Creative Commons Attribution-Share Alike 3.0 Unported License)

In this note I exhibit a troublesome example, and a systematic solution. First let’s set up dplyr, our database, and some example data.

    library("dplyr")
    ##
    ## Attaching package: 'dplyr'
    ## The following objects are masked from 'package:stats':
    ##
    ##     filter, lag
    ## The following objects are masked from 'package:base':
    ##
    ##     intersect, setdiff, setequal, union
    packageVersion("dplyr")
    ## [1] '0.7.4'
    packageVersion("dbplyr")
    ## [1] '1.2.0'
    db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    d <- dplyr::copy_to(db, data.frame(xorig = 1:5, yorig = sin(1:5)), "d")

Now suppose somewhere in one of your projects somebody (maybe not even you) has written code that looks somewhat like the following.

    d %>% mutate( delta = 0, x0 = xorig + delta, y0 = yorig…
Original Post: Advisory on Multiple Assignment dplyr::mutate() on Databases

ggplot2 Time Series Heatmaps: revisited in the tidyverse

I revisited my previous post on creating beautiful time series calendar heatmaps in ggplot, moving the code into the tidyverse. To obtain the following example: Simply use the following code: I hope the commented code is self-explanatory – enjoy 🙂
Original Post: ggplot2 Time Series Heatmaps: revisited in the tidyverse

R Weekly Bulletin Vol – XIV

This week’s R bulletin covers some interesting ways to list functions and to list files, and illustrates the use of the double colon operator. We will also cover functions like path.package, fill.na, and rank. Hope you like this R weekly bulletin. Enjoy reading! Shortcut Keys: 1. New document – Ctrl+Shift+N 2. Close active document – Ctrl+W 3. Close all open documents – Ctrl+Shift+W. Problem Solving Ideas: How to list functions from an R package. We can view the functions from a particular R package by using the “jwutil” package. Install the package and use the lsf function from the package. The syntax of the function is given as: lsf(pkg), where pkg is a character string containing the package name. The function returns a character vector of function names in the given package. Example: library(jwutil) library(rowr) lsf("rowr") How to list files with a particular…
Original Post: R Weekly Bulletin Vol – XIV
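
As a complement to the lsf example in the bulletin excerpt above, here is a small base-R sketch of the same idea that needs no extra package; it also shows the double colon operator the bulletin mentions. The helper name `list_package_functions()` is made up for this illustration, and the sketch is not claimed to match how jwutil implements lsf.

```r
# List the exported functions of an installed package using base R only.
# `list_package_functions` is a hypothetical helper name for this sketch.
list_package_functions <- function(pkg) {
  exports <- getNamespaceExports(pkg)
  # Keep only the exports that are functions (drops datasets, constants, ...).
  is_fun <- vapply(
    exports,
    function(nm) is.function(getExportedValue(pkg, nm)),
    logical(1)
  )
  sort(exports[is_fun])
}

fns <- list_package_functions("stats")
head(fns)

# The double colon operator calls an exported function without attaching
# the package via library().
stats::median(c(1, 3, 10))  # returns 3
```

The stats package is used here only because it ships with every R installation; any installed package name can be passed instead.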

Who wants to work at Google?

In this tutorial, we will explore the open roles at Google, and try to see what common attributes Google is looking for in future employees. This dataset is a compilation of job descriptions of 1200+ open roles at Google offices across the world. It is available for download from the Kaggle website, and contains text information about job location, title, department, minimum and preferred qualifications, and responsibilities of the position. You can download the dataset here, and run the code on the Kaggle site itself here. Using this dataset we will try to answer the following questions: Where are the open roles? Which departments have the most openings? What are the minimum and preferred educational qualifications needed to get hired at Google? How much experience is needed? What categories of roles are the most in demand? Step1 – Data Preparation and Cleaning:…
Original Post: Who wants to work at Google?

Rcpp 0.12.15: Numerous tweaks and enhancements

The fifteenth release in the 0.12.* series of Rcpp landed on CRAN today after just a few days of gestation in incoming/. This release follows the 0.12.0 release from July 2015, the 0.12.1 release in September 2015, the 0.12.2 release in November 2015, the 0.12.3 release in January 2016, the 0.12.4 release in March 2016, the 0.12.5 release in May 2016, the 0.12.6 release in July 2016, the 0.12.7 release in September 2016, the 0.12.8 release in November 2016, the 0.12.9 release in January 2017, the 0.12.10 release in March 2017, the 0.12.11 release in May 2017, the 0.12.12 release in July 2017, the 0.12.13 release in late September 2017, and the 0.12.14 release in November 2017, making it the nineteenth release at the steady and predictable bi-monthly release frequency. Rcpp has become the most popular way of enhancing GNU R with C or…
Original Post: Rcpp 0.12.15: Numerous tweaks and enhancements

Winter solstice challenge #3: the winner is Bianca Kramer!

Part of the winning submission in the category ‘best tool’. A bit later than intended, but I am pleased to announce the winner of the Winter solstice challenge: Bianca Kramer! Of course, she was the only contender, but her solution is awesome! In fact, I am surprised no one took her tool, ran it on their own data and just submitted that (which was perfectly well within the scope of the challenge). Best Tool: Bianca Kramer. The best tool (see the code snippet on the right) uses R and a few R packages (rorcid, rjson, httpcache) and services like ORCID and CrossRef (and the I4OC project), and the (also awesome) oadoi.org project. The code is available on GitHub. Highest Open Knowledge Score: Bianca Kramer. I did not check the self-reported score of 54%, but since no one challenged her, Bianca wins this category too.…
Original Post: Winter solstice challenge #3: the winner is Bianca Kramer!

Version 2.2.2 Released

ggtern version 2.2.2 has just been submitted to CRAN, and it includes a number of new features. This time around, I have adapted the hexbin geometry (and stat), and additionally, created an almost equivalent geometry which operates on a triangular mesh rather than a hexagonal mesh. There are some subtle differences which give some added functionality, and together these will provide an additional level of richness to ternary diagrams produced with ggtern, when the data-set is perhaps significantly large and points themselves start to lose their meaning from visual clutter. Ternary Hexbin: Firstly, let’s look at the ternary hexbin, which, as the name suggests, has the capability to bin points in a regular hexagonal grid to produce a pseudo-surface. Now in the original ggplot version, this geometry is somewhat limiting since it only performs a ‘count’ on the number of…
Original Post: Version 2.2.2 Released

Data Driven DIY

Statisfix – Which fixing should I buy? I have a bathroom cabinet to put up. It needs to go onto a tiled plasterboard (drywall) wall. Because of the tiles, I can’t use the fixings I normally use to keep heavy objects fixed to the wall. And bog standard rawlplugs aren’t going to do the job. So what should I buy? YouTube to the rescue – more specifically, this fine chap at Ultimate Handyman. Not only does he demonstrate how to use the fixings, but he also produced this strangely mesmerising strength test showing how much weight the fixings support before the plasterboard gives out. As well as the strength of the fixing, I need to consider the price of the fixings, and also the size of the hole required (which, in turn, will also impact the overall cost of the job if I…
Original Post: Data Driven DIY