Gender roles in film direction, analyzed with R

What do women do in films? If you analyze the stage directions in film scripts, as Julia Silge, Russell Goldenberg and Amber Thomas have done in this visual essay for The Pudding, it seems that women (but not men) are written to snuggle, giggle and squeal, while men (but not women) shoot, gallop and strap things to other things. This is all based on an analysis of almost 2,000 film scripts, mostly from 1990 and after. The words come from pairs of words beginning with “he” and “she” in the stage directions (but not the dialogue) in the screenplays: directions like “she snuggles up to him, strokes his back” and “he straps on a holster under his sealskin cloak”. The essay also includes an analysis of words by the writer’s and character’s gender, and includes lots of lovely interactive…
Original Post: Gender roles in film direction, analyzed with R
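The pairing step described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' code (their analysis was done in R): the function name `pronoun_pairs` and the regular expression are assumptions, and the two sample directions are the ones quoted in the excerpt.

```python
import re
from collections import Counter

# Hypothetical helper: matches "he" or "she" as a whole word,
# followed by whitespace and the next word.
PAIR_PATTERN = re.compile(r"\b(he|she)\s+([a-z']+)", re.IGNORECASE)

def pronoun_pairs(directions):
    """Count the word that follows 'he' or 'she' in each stage direction."""
    counts = {"he": Counter(), "she": Counter()}
    for line in directions:
        for pronoun, word in PAIR_PATTERN.findall(line):
            counts[pronoun.lower()][word.lower()] += 1
    return counts

# The two stage directions quoted in the excerpt.
directions = [
    "She snuggles up to him, strokes his back",
    "He straps on a holster under his sealskin cloak",
]
counts = pronoun_pairs(directions)
```

Comparing the two counters across a large set of scripts is what would surface words that skew toward one pronoun. A real analysis would also need to separate stage directions from dialogue first, which this sketch does not attempt.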

From Notebooks to JupyterLab – The Evolution of Data Science IDEs

This live webinar (Aug 22) will discuss the impact that the notebook experience has had on data science, and how JupyterLab – the next generation data science IDE – has evolved from the classic notebooks.
Original Post: From Notebooks to JupyterLab – The Evolution of Data Science IDEs

Buzzfeed trains an AI to find spy planes

Last year, Buzzfeed used an analysis of public flight-records data to break the story that US law enforcement agencies were using small aircraft to observe points of interest in US cities. The data journalism team no doubt realized that the Flightradar24 dataset held many more stories of public interest; the challenge lay in separating routine, day-to-day aircraft traffic from the more unusual, covert activities. So they trained an artificial intelligence model to identify unusual flight paths in the data. The model, implemented in the R programming language, applies a random forest algorithm to identify flight patterns similar to those of covert aircraft identified in their earlier “Spies in the Skies” story. When that model was applied to the almost 20,000 flights in the Flightradar24 dataset, about 69 planes were flagged as possible surveillance aircraft. Several of those were…
Original Post: Buzzfeed trains an AI to find spy planes

Reproducibility: A cautionary tale from data journalism

Timo Grossenbacher, data journalist with Swiss Radio and TV in Zurich, had a bit of a surprise when he attempted to recreate the results of one of the R Markdown scripts published by SRF Data to accompany their data journalism story about vested interests of Swiss members of parliament. When he re-ran the analysis in R last week, the results differed from those published in August 2015. There was no change to the R scripts or data in the intervening two-year period, so what caused the results to be different? The version of R Timo was using had been updated, but that wasn’t the root cause of the problem. What had also changed was the version of the dplyr package used by the script: version 0.5.0 now, versus version 0.4.2 then. For some unknown…
Original Post: Reproducibility: A cautionary tale from data journalism
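One way to catch this kind of drift early is to record package versions alongside an analysis and compare them before re-running it. The sketch below is a generic illustration in Python, not the approach taken by SRF Data (whose scripts are in R): the function name `version_mismatches` is hypothetical, and the two dictionaries are written by hand using the dplyr versions quoted in the excerpt.

```python
def version_mismatches(recorded, installed):
    """Return packages whose installed version differs from the recorded one.

    Each value in the result is a (recorded, installed) pair; a package
    missing from `installed` is reported with None as its installed version.
    """
    return {
        name: (wanted, installed.get(name))
        for name, wanted in recorded.items()
        if installed.get(name) != wanted
    }

# Versions from the excerpt: 0.4.2 at publication, 0.5.0 at re-run.
recorded = {"dplyr": "0.4.2"}
installed = {"dplyr": "0.5.0"}
mismatches = version_mismatches(recorded, installed)
```

A non-empty result is a signal to restore the recorded versions, or at least to expect that the output may differ from what was published.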

Data Version Control in Analytics DevOps Paradigm

DevOps and DVC tools can help reduce the time data scientists spend on mundane data preparation and achieve their dream of focusing on cool machine learning algorithms and interesting data analysis.
Original Post: Data Version Control in Analytics DevOps Paradigm

Data Science Primer: Basic Concepts for Beginners

This collection of concise introductory data science tutorials covers topics including the difference between data mining and statistics, supervised vs. unsupervised learning, and the types of patterns we can mine from data.
Original Post: Data Science Primer: Basic Concepts for Beginners

KDnuggets™ News 17:n30, Aug 9: Machine Learning Algorithms: Concise Overview; Train your Deep Learning model faster and sharper

Also: A unified deep learning framework for time-series mobile sensing data processing; EDISON Data Science Framework Release 2; Data mining Airbnb.
Original Post: KDnuggets™ News 17:n30, Aug 9: Machine Learning Algorithms: Concise Overview; Train your Deep Learning model faster and sharper

Strata Data Conference, the reunion of data brain trust – KDnuggets Offer

Strata Data Conference, the annual reunion of data brain trust, is Sept 25-28 in New York. Early price ends Aug 11 – save more with code KDNU.
Original Post: Strata Data Conference, the reunion of data brain trust – KDnuggets Offer