What’s inside? pkginspector provides helpful tools for inspecting package contents

R packages are widely used in science, yet the code behind them often does not come under scrutiny. To address this lack, rOpenSci has been a pioneer in developing a peer review process for R packages. The goal of pkginspector is to help that process by providing a means to better understand the internal structure of R packages. It offers tools to analyze and visualize the relationship among functions within a package, and to report whether or not functions’ interfaces are consistent. If you are reviewing an R package (maybe your own!), pkginspector is for you. We began building pkginspector during unconf18, with support from rOpenSci and guidance from Noam Ross. The package focuses on facilitating a few of the many tasks involved in reviewing a package; it is one of a collection of packages, including pkgreviewr (rOpenSci) and goodpractice,…

phylogram: dendrograms for evolutionary analysis

Evolutionary biologists are increasingly using R for building, editing and visualizing phylogenetic trees. The reproducible code-based workflow and comprehensive array of tools available in packages such as ape, phangorn and phytools make R an ideal platform for phylogenetic analysis. Yet the many different tree formats are not well integrated, as pointed out in a recent post. The standard data structure for phylogenies in R is the “phylo” object, a memory efficient, matrix-based tree representation. However, non-biologists have tended to use a tree structure called the “dendrogram”, which is a deeply nested list with node properties defined by various attributes stored at each level. While certainly not as memory efficient as the matrix-based format, dendrograms are versatile and intuitive to manipulate, and hence a large number of analytical and visualization functions exist for this object type. A good example is the dendextend package, which features an impressive range of options for editing dendrograms and plotting publication-quality trees. To better integrate the…
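To make the contrast between the two tree formats concrete, here is a minimal sketch in base R. It is not taken from the original post: the toy data are made up, and the “phylo”-style list is built by hand purely for illustration (in practice such objects come from ape).

```r
# Base R only. The "phylo"-style list below is hand-built for illustration;
# in practice such objects are created by ape, not written out like this.

# 1. A dendrogram: a nested list with attributes stored at each node.
x <- c(a = 1, b = 2, c = 6)            # toy data, made up for this sketch
dend <- as.dendrogram(hclust(dist(x)))

attr(dend, "members")   # number of leaves below the root node
attr(dend, "height")    # height at which the root's two branches merge
length(dend)            # the root is a list holding two sub-trees
labels(dend)            # leaf labels, collected by walking the nesting

# 2. A matrix-based tree for the same topology ((a, b), c).
# Tips are numbered 1..3 and internal nodes 4..5; each row of `edge`
# is one branch, written as (parent node, child node).
phylo_like <- structure(
  list(
    edge        = rbind(c(4L, 5L), c(5L, 1L), c(5L, 2L), c(4L, 3L)),
    edge.length = c(4, 1, 1, 5),
    tip.label   = c("a", "b", "c"),
    Nnode       = 2L
  ),
  class = "phylo"
)

dim(phylo_like$edge)    # one row per branch, two columns
```

The dendrogram keeps its information in attributes scattered through the nesting, while the matrix-based version stores the whole topology in a single integer matrix, which is why it tends to be the more compact of the two.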

Exploring ways to address gaps in maternal-child health research

It’s easy to come to a conference and feel intimidated by the wealth of knowledge and expertise of other attendees. As Ellen Ullman, a software engineer and writer, describes, “I was aware at all times that I had only islands of knowledge separated by darkness; that I was surrounded by chasms of not-knowing, into one of which I was certain to fall.” One of the best ways to start feeling less intimidated is to start talking to others. Ullman continues, “I learned I was not alone. I met a postdoctoral student in computer science at Berkeley. I talked with him about my islands, the darkness, the fear. He answered without hesitations: ‘Oh, I feel that way all the time.’” At rOpenSci unconf18, we learned that it’s ok to feel like you don’t know everything – indeed, that’s how just about…

A package for tidying nested lists

Data == knowledge! Much of the data we use, whether it be from government repositories, social media, GitHub, or e-commerce sites, comes from public-facing APIs. The quantity of data available is truly staggering, but munging JSON output into a format that is easily analyzable in R is an equally staggering undertaking. When JSON is turned into an R object, it usually becomes a deeply nested list riddled with missing values that is difficult to untangle into a tidy format. Moreover, every API presents its own challenges; code you’ve written to clean up data from GitHub isn’t necessarily going to work on Twitter data, as each API spews data out in its own unique, headache-inducing nested list structure. To ease and generalize this process, Amanda Dobbyn proposed an unconf18 project for a general API response tidier! Welcome roomba, our first stab at easing the process of tidying nested lists! roomba will eventually be able…
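To illustrate the kind of untangling described here, below is a minimal base R sketch. It does not use roomba and is not its API: the response data and the helper `pluck_or()` are hypothetical, invented only to show why nested lists with missing values are awkward to tidy by hand.

```r
# Hypothetical API response, already parsed from JSON into a nested list.
# Some fields are missing (NULL) at different depths.
response <- list(
  list(name = "roomba", owner = list(login = "ropenscilabs"), stars = 10),
  list(name = "umapr",  owner = list(login = "ropenscilabs"), stars = NULL),
  list(name = "demo",   owner = NULL)
)

# Hypothetical helper: walk down `path` one key at a time and return
# `default` as soon as a level is missing.
pluck_or <- function(x, path, default) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) return(default)
    x <- x[[key]]
  }
  x
}

# Build one column per field; missing values become typed NAs.
tidy <- data.frame(
  name  = vapply(response, pluck_or, character(1),
                 path = "name", default = NA_character_),
  owner = vapply(response, pluck_or, character(1),
                 path = c("owner", "login"), default = NA_character_),
  stars = vapply(response, pluck_or, numeric(1),
                 path = "stars", default = NA_real_),
  stringsAsFactors = FALSE
)

tidy
```

Every column needs its own path and its own missing-value default, and those choices change from one API to the next, which is the repetitive work a general-purpose tidier aims to take over.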

Announcing new software review editors: Anna Krystalli and Lincoln Mullen

Part of rOpenSci’s mission is to create technical infrastructure in the form of carefully vetted R software tools that lower barriers to working with data sources on the web. Our open peer software review system for community-contributed tools is a key component of this. As the rOpenSci community grows and more package authors submit their work for peer review, we need to expand our editorial board to maintain a speedy process. As our recent post shows, package submissions have grown every year since we started this experiment, and we see no reason they will slow down! Editors manage the review process, performing initial package checks, identifying reviewers, and moderating the process until the package is accepted by reviewers and transferred to rOpenSci. Anna Krystalli and Lincoln Mullen have both served as guest editors for rOpenSci and now they join as…

Chat with the rOpenSci team at upcoming meetings

You can find members of the rOpenSci team at various meetings and workshops around the world. Come say ‘hi’, learn about how our software packages can enable your research, or about our process for open peer software review and onboarding, how you can get connected with the community or tell us how we can help you do open and reproducible research.

Where’s rOpenSci?

When | Who | Where | What
June 23, 2018 | Maëlle Salmon | Cardiff, UK | satRday Cardiff
June 27-28, 2018 | Scott Chamberlain | Portland, OR | Bioinformatics Open Source Conference 2018 (BOSC)
July 4-6, 2018 | Maëlle Salmon | Rennes, FR | French R conference
July 10-13, 2018 | Jenny Bryan | Brisbane, AU | UseR!
July 28-Aug 2, 2018 | Jenny Bryan | Vancouver, CA | Joint Statistical Meetings (JSM)
Aug 6-10, 2018 | Carl Boettiger, Dan Sholler | New Orleans, LA | Ecological Society of America (ESA)
Aug 15-16, 2018 | Stefanie Butland | Cambridge,…

Exploring European attitudes and behaviours using the European Social Survey

Introduction

I never thought that I’d be programming software in my career. I started using R a little over 2 years ago and it’s been one of the most important decisions in my career. Secluded in a small academic office with no one to discuss my new hobby with, I started searching the web for tutorials and packages. After getting to know how amazing and nurturing the R community is, it made me want to become a data scientist. So I set out to do it. Throughout the journey I repeatedly found myself using the European Social Survey (ESS from now on), a really neat dataset that collects information on attitudes, beliefs and behaviour patterns of diverse populations in more than thirty European nations since 2002. After seeing a niche in the R package community, I created the package essurvey (previously ess on CRAN) to access this data easily from R. The…

The ssh Package: Secure Shell (SSH) Client for R

Have you ever needed to connect to a remote server over SSH to transfer files via SCP or to set up a secure tunnel, and wished you could do so from R itself? The new rOpenSci ssh package provides a native SSH client in R that allows you to do that and even more, like running a command or script on the host while streaming stdout and stderr directly to the client. The package is based on libssh, a powerful C library implementing the SSH protocol. install.packages("ssh") Because the ssh package is based on libssh it does not need to shell out. Therefore it works natively on all platforms without any runtime dependencies. Even on Windows. The package is still a work in progress, but the core functionality should work. Below are some examples from the intro vignette to get you started. Connecting to…

Unconf18 projects 4: umapr, greta, roomba, proxy-bias-vignette, http caching

For the fourth and last day of project recaps from this year’s unconf, here is an overview of the next five projects. In the spirit of exploration and experimentation at rOpenSci unconferences, these projects are not necessarily finished products or in scope for rOpenSci packages.

umapr
Summary: umapr wraps the Python implementation of UMAP to make the algorithm accessible from within R, leveraging reticulate to interface with Python. Uniform Manifold Approximation and Projection (UMAP) is a non-linear dimensionality reduction algorithm. It is similar to t-SNE but computationally more efficient.
Team: Angela Li, Ju Kim, Malisa Smith, Sean Hughes, Ted Laderas
Code: https://github.com/ropenscilabs/umapr

greta
Summary: greta is an R package for writing statistical models and fitting them by MCMC. Luckily, we had the greta creator, Nick Golding, at the unconf. The unconf team worked on…

.rprofile: Julia Silge

Dr. Julia Silge [@juliasilge on Twitter] is a data scientist at Stack Overflow. We talked about why R brings Julia joy, her path to a career in data science and what it was like to co-write a book for O’Reilly Media. This interview occurred on February 3, 2018 at the RStudio Conference in San Diego. KO: What is your name, job title, and how long have you been using R? JS: My name is Julia Silge and I’m a data scientist at Stack Overflow. I have been working in R for less than three years. KO: Wow! What were you all about before that? JS: I know a lot of people in data science say, “oh I had this weird path that brought me to where I am today.” But I actually do think I have an especially weird and…