Statistical Machine Learning with Microsoft ML

MicrosoftML is an R package for machine learning that works in tandem with the RevoScaleR package. (In order to use the MicrosoftML and RevoScaleR libraries, you need an installation of Microsoft Machine Learning Server or Microsoft R Client.) A great way to see what MicrosoftML can do is to take a look at the online book Machine Learning with the MicrosoftML Package by Ali Zaidi. The book includes worked examples on several topics: exploratory data analysis and feature engineering; regression models; classification models for computer vision; convolutional neural networks for computer vision; natural language processing; and transfer learning with pre-trained DNNs. The book is part of Ali’s in-person workshop “Statistical Machine Learning with MicrosoftML”, and you can find further materials including data and scripts at this GitHub repository. If you’d like to experience the workshop in person, Ali will be presenting it…
Original Post: Statistical Machine Learning with Microsoft ML

Because it's Friday: 30 days on a cargo ship

This time-lapse taken during a cargo ship’s 30-day voyage from the Red Sea to Hong Kong is strangely hypnotic (via Kottke). In addition to the beautiful scenery, it also makes you appreciate the logistics behind loading and unloading a container ship! If you have a 4K monitor, be sure to watch it full-screen. That’s all from the blog for this week. Enjoy your weekend, and we’ll be back with more on Monday.
Original Post: Because it's Friday: 30 days on a cargo ship

An Updated History of R

Here’s a refresher on the history of the R project:
1992: R development begins as a research project in Auckland, NZ by Robert Gentleman and Ross Ihaka
1993: First binary versions of R published at Statlib
1995: R first distributed as open-source software, under GPL2 license
1997: R core group formed
1997: CRAN founded (by Kurt Hornik and Fritz Leisch)
1999: The R website, r-project.org, founded
2000: R 1.0.0 released (February 29)
2001: R News founded (later to become the R Journal)
2003: R Foundation founded
2004: First UseR! conference (in Vienna)
2004: R 2.0.0 released
2009: First edition of the R Journal
2013: R 3.0.0 released
2015: R Consortium founded, with R Foundation participation
2016: New R logo adopted
I’ve added some additional dates gleaned from the r-announce mailing list archives and a 1998 paper on the history of R written by co-founder…
Original Post: An Updated History of R

The R manuals in bookdown format

While there are hundreds of excellent books and websites devoted to R, the canonical source of truth regarding the R system remains the R manuals. You can find the manuals at your local CRAN mirror and on your laptop as part of the R distribution (try Help > Manuals in RGui, or Help > R Help in RStudio to find them). Unlike books, the R manuals are updated by the R Core Team with every new release, so if you’re not sure how the base R system is supposed to work, this is the place to check. Note that the manuals don’t cover any of the R packages (other than the base and recommended packages), so if you want to learn about the wider R ecosystem, well, that’s what all those books and websites are for. (MRAN is one place to…
Original Post: The R manuals in bookdown format

Is it faster to take a bike or taxi in NYC?

Taxis are plentiful and convenient in New York City, but the city is also served by a wide network of commuter bicycles (Citi Bikes). If you need to get from, say, the West Village to the Garment District, are you better off time-wise hailing a cab, or heading over to the nearest Citi Bike station? Data scientist Todd W. Schneider crunched the numbers on travel times for both taxis and Citi Bikes to figure out which was better. Neither is universally the best: for some trips taxis are most often the fastest, and for others bikes are faster. An interactive map (created with R) allows you to select the time of day and an origin neighborhood, and the map will then tell you the fraction of the time (according to the historical data) that a Citi Bike will outpace…
Original Post: Is it faster to take a bike or taxi in NYC?
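
To make the "fraction of the time" idea from the bike-versus-taxi post concrete, here is a minimal Python sketch. It is not the method used in the original analysis: the trip records are invented for illustration, the function name `bike_win_fraction` is hypothetical, and comparing every bike trip against every taxi trip on the same route is just one simple assumption about how such a fraction could be defined.

```python
from collections import defaultdict

# Hypothetical trip records: (origin, destination, mode, minutes).
# These values are invented for illustration; they are not real trip data.
trips = [
    ("West Village", "Garment District", "taxi", 14.0),
    ("West Village", "Garment District", "taxi", 9.5),
    ("West Village", "Garment District", "bike", 11.0),
    ("West Village", "Garment District", "bike", 12.5),
]


def bike_win_fraction(trips, origin, destination):
    """Fraction of (bike, taxi) trip pairings on a route where the bike is faster.

    Returns None when the route lacks trips for either mode.
    """
    times = defaultdict(list)
    for o, d, mode, minutes in trips:
        if (o, d) == (origin, destination):
            times[mode].append(minutes)
    # Compare every bike trip with every taxi trip on the same route.
    pairs = [(b, t) for b in times["bike"] for t in times["taxi"]]
    if not pairs:
        return None
    return sum(b < t for b, t in pairs) / len(pairs)


fraction = bike_win_fraction(trips, "West Village", "Garment District")
print(f"Bike faster in {fraction:.0%} of pairings")
```

A real analysis would presumably also group trips by time of day, as the interactive map in the post does; that step is left out here to keep the sketch short.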

Saving Snow Leopards with Artificial Intelligence

The snow leopard, the large cat native to the mountain ranges of Central and South Asia, is a highly endangered species. With an estimated 3900-6500 individuals left in the wild, conservation efforts led by the Snow Leopard Trust are focused on preserving this iconic animal. But the snow leopard is an elusive creature: given their range and remote habitat (including the highlands of the Himalayas), they are difficult to study. In order to gather data about the creatures, researchers have used camera traps to capture more than 1 million images. But not all of those images are of snow leopards. It’s a time-consuming process to classify those images as being of snow leopards, their prey, some other animal or nothing at all. To make things even more difficult, snow leopards have excellent camouflage, and can be difficult to spot even by…
Original Post: Saving Snow Leopards with Artificial Intelligence