Statistics Sunday: Two R Packages to Check Out

I’m currently out of town and not spending as much time on my computer as I have over the last couple months. (It’s what happens when you’re the only one in your department at work and also most of your hobbies involve a computer.) But I wanted to write up something for Statistics Sunday and I recently discovered two R packages I need to check out in the near future. The first is called echor, which allows you to search and download data directly from the US Environmental Protection Agency (EPA) Environmental Compliance and History Online (ECHO), using the ECHO-API. According to the vignette, linked above, “ECHO provides data for: Stationary sources permitted under the Clean Air Act, including data from the National Emissions Inventory, Greenhouse Gas Reporting Program, Toxics Release Inventory, and Clean Air Markets Division Acid Rain Program…
Original Post: Statistics Sunday: Two R Packages to Check Out

Intersecting points and overlapping polygons

I’ve been doing some spatial stuff of late and the next little step will involve intersecting points with possibly many overlapping polygons. The sp package has a function called over, which returns the polygons that points intersect with. The catch, though, is that it only returns the last (highest numerical value) polygon a point overlaps with, so it’s not so useful if you have many overlapping polygons. A little playing, and I’ve overcome that problem… Here’s a toy example. Create a couple of polygons and put them into a SpatialPolygons object.

    library(sp)
    p1 <- matrix(c(1,1, 2,1, 4,2, 3,2), ncol = 2, byrow = TRUE)
    p2 <- matrix(c(2.2,1, 3,1, 3,2, 3,3, 2.8,3), ncol = 2, byrow = TRUE)
    p1s <- Polygons(list(Polygon(p1)), 3)
    p2s <- Polygons(list(Polygon(p2)), 4)
    sps <- SpatialPolygons(list(p1s, p2s))

Define a few points and put them in a SpatialPoints object…
Original Post: Intersecting points and overlapping polygons
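
The excerpt above is cut off before the author's own workaround, so what follows is not that method. It is a minimal sketch of one alternative, assuming a version of sp in which over() accepts the returnList argument for points and polygons. The point coordinates are made up for illustration, and the polygon IDs are written as strings.

```r
library(sp)

# Same two overlapping polygons as in the excerpt
p1 <- matrix(c(1,1, 2,1, 4,2, 3,2), ncol = 2, byrow = TRUE)
p2 <- matrix(c(2.2,1, 3,1, 3,2, 3,3, 2.8,3), ncol = 2, byrow = TRUE)
sps <- SpatialPolygons(list(
  Polygons(list(Polygon(p1)), "3"),
  Polygons(list(Polygon(p2)), "4")
))

# Hypothetical test points (not from the original post)
pts <- SpatialPoints(cbind(x = c(2.6, 1.5, 2.9, 0), y = c(1.5, 1.1, 2.5, 0)))

# returnList = TRUE asks for every matching polygon per point,
# rather than a single index per point
hits <- over(pts, sps, returnList = TRUE)

# One list element per point; each holds the indices of the polygons hit
print(hits)
```

Each element of hits corresponds to one point, and an empty element means the point fell outside every polygon.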

The Power of Standards and Consistency

I’m going to (eventually) write a full post on the package I’m mentioning in this one: osqueryr. The TLDR on osqueryr is that it is an R DBI wrapper (that has just enough glue to also be plugged into dbplyr) for osquery. The TLDR on osquery is that it “exposes an operating system as a high-performance relational database. This design allows you to write SQL-based queries efficiently and easily to explore operating systems.” In short, osquery turns the metadata and state information of your local system (or remote system(s)) into a SQL-compliant database. It also works on Windows, Linux, BSD and macOS. This means you can query a fleet of systems with a (mostly) normalized set of tables and get aggregated results. Operations and information security staff use this to manage systems and perform incident response tasks,…
Original Post: The Power of Standards and Consistency

RStudio:addins part 3 – View objects, files, functions and more with 1 keypress

In this post in the RStudio:addins series we will try to make our work more efficient with an addin for better inspection of objects, functions and files within RStudio. RStudio already has a very useful View function and a Go To Function / File feature with F2 as the default keyboard shortcut. And yes, I know I promised automatic generation of @importFrom roxygen tags in the previous post; unfortunately we will have to wait a bit longer for that one, but I believe this one more than makes up for it in usefulness. The addin we will create in this article will let us use RStudio to View and inspect a wide range of objects, functions and files with 1 keypress.

The addins in action

As a first step, we need to be able to retrieve the value of the…
Original Post: RStudio:addins part 3 – View objects, files, functions and more with 1 keypress

Tips for great graphics

R is a great program for generating top-notch graphics. But to get the best out of it, you need to put in a little more work. Here are a few tips for adapting your R graphics to make them look a little better. 1) Don’t use the “File/Save as…/” menu. If you set up your graphic properly in the first place then there’s no need to post-process (e.g. crop, scale etc.) the graphic in other software. Use the graphic devices (jpeg(), tiff(), postscript(), etc.), set your height and width to whatever you want the finished product to be and then create the graph.

    tiff("~/Manuscript/Figs/Fig1.tiff", width = 2, height = 2, units = "in", res = 600)
    plot(dist ~ speed, cars) # cars is a base R dataset – data(cars)
    dev.off()

The first argument to a graphic device such as tiff or jpeg is the…
Original Post: Tips for great graphics
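
The same open-device, plot, close-device pattern applies to the other devices. The sketch below is an illustration only, not taken from the original post: it uses pdf(), which takes its width and height in inches, and writes to a temporary file whose name is made up for the example.

```r
# Write a 2 x 2 inch figure straight to a file instead of using "Save as"
out <- file.path(tempdir(), "fig1.pdf")  # hypothetical output path

pdf(out, width = 2, height = 2)          # pdf() sizes are given in inches
par(mar = c(4, 4, 1, 1))                 # trim the margins for a small figure
plot(dist ~ speed, data = cars)          # cars is a base R dataset
dev.off()                                # close the device so the file is written

file.exists(out)
```

For a raster device such as tiff(), the pixel dimensions follow from the size and resolution: 2 inches at 600 dpi gives 1200 pixels per side.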

WVPlots now at version 1.0.0 on CRAN!

Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. We are excited to announce that WVPlots is now at version 1.0.0 on CRAN! The idea is: we sacrifice some of the flexibility and composability inherent to ggplot2 in R for a menu of prescribed presentation solutions. This is a package to produce plots while you are in the middle of another task. For example, the plot below showing both an observed discrete empirical distribution (as stems) and a matching theoretical distribution (as bars) is a built-in “one liner.”

    set.seed(52523)
    d <- data.frame(wt = 100*rnorm(100))
    WVPlots::PlotDistCountNormal(d, 'wt', 'example')

The graph above is actually the product of a number of presentation decisions: Using a discrete histogram approach to summarize data…
Original Post: WVPlots now at version 1.0.0 on CRAN!

Because it's Friday: Bad road

Sometimes I think the potholes in the roads in Chicago are bad, but then a road like this puts things into perspective: (Thanks to TH for the link.) Don’t miss the shots looking back near the end to see how many people are in that vehicle! That’s all from the blog for this week. Have a great weekend, and we’ll be back after the US holiday on Monday. Enjoy!
Original Post: Because it's Friday: Bad road

Reflections on the ROpenSci Unconference

I had an amazing time this week participating in the 2018 ROpenSci Unconference, the sixth annual ROpenSci hackathon bringing together people to advance the tools and community for scientific computing with R. It was so inspiring to be among such a talented and dedicated group of people; special kudos goes to the organizing committee for curating such a great crowd. (I heard there were over 200 nominations from which the 65 or so attendees were selected.) The idea behind the unconference is to spend two full days hacking on projects of interest to the community. Before the conference begins, the participants suggest projects as GitHub issues and begin discussions there. On the first day of the conference (after an icebreaker), the participants vote for projects they’d be interested in working on, and then form up into groups of 2-6 people or so…
Original Post: Reflections on the ROpenSci Unconference

Learn AI and Data Science rapidly based only on high school math – KDnuggets Offer

This 3-month program, created by Ajit Jaokar, who teaches at Oxford, is interactive and delivered by video. Coding examples are in Python. Places limited – check special KDnuggets rate.
Original Post: Learn AI and Data Science rapidly based only on high school math – KDnuggets Offer