Transformative treatments

Posted by Andrew on 31 December 2016, 9:10 am

Kieran Healy and Laurie Paul wrote a new article, “Transformative Treatments” (see also here), which reminds me a bit of my article with Guido, “Why ask why? Forward causal inference and reverse causal questions.” Healy and Paul’s article begins: Contemporary social-scientific research seeks to identify specific causal mechanisms for outcomes of theoretical interest. Experiments that randomize populations to treatment and control conditions are the “gold standard” for causal inference. We identify, describe, and analyze the problem posed by transformative treatments. Such treatments radically change treated individuals in a way that creates a mismatch in populations, but this mismatch is not empirically detectable at the level of counterfactual dependence. In such cases, the identification of causal pathways is underdetermined in a previously unrecognized way. Moreover, if the treatment is indeed transformative…
Original Post: Transformative treatments

“Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs.”

In my previous post, I wrote: Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs. It turns out that Lewis does have his own blog. His latest entry contains a bunch of links, starting with this one: Populism and the Return of the “Paranoid Style”: Some Evidence and a Simple Model of Demand for Incompetence as Insurance against Elite Betrayal Rafael Di Tella & Julio Rotemberg NBER Working Paper, December 2016 Abstract: We present a simple model of populism as the rejection of “disloyal” leaders. We show that adding the assumption that people are worse off when they experience low income as a result of leader betrayal (than when it is the result of bad luck) to a simple voter choice model yields a preference for incompetent leaders. These deliver worse material outcomes…
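The mechanism in the abstract can be illustrated with a stylized expected-utility calculation. To be clear, the numbers and functional form below are invented for illustration and are not taken from the working paper: the point is simply that if low income caused by betrayal carries an extra psychic cost, a sufficiently betrayal-averse voter can prefer a less competent but "loyal" leader.

```python
# Stylized illustration of the betrayal-aversion mechanism.
# All numbers here are invented; see the Di Tella & Rotemberg paper for the real model.

def expected_utility(p_good, betrayal_cost):
    """Voter's expected utility from a competent leader who delivers high
    income (1.0) with probability p_good, but whose failures are read as
    betrayal, adding a psychic cost on top of the low income (0.0)."""
    return p_good * 1.0 + (1 - p_good) * (0.0 - betrayal_cost)

# An incompetent but loyal leader delivers a mediocre income for sure;
# failures are attributed to bad luck, so no betrayal cost applies.
loyal_income = 0.5

no_aversion = expected_utility(p_good=0.8, betrayal_cost=0.0)    # prefers the competent leader
with_aversion = expected_utility(p_good=0.8, betrayal_cost=2.0)  # now prefers the loyal one

print(no_aversion > loyal_income, with_aversion < loyal_income)
```

With no betrayal cost the competent leader wins (0.8 vs. 0.5); a large enough betrayal cost flips the ranking, which is the paper's "incompetence as insurance" idea in miniature.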
Original Post: “Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs.”

Because it's Friday: Goodbye, 2016

Between the deaths of beloved heroes and heroines, the civil unrest and political upheavals, and a slew of natural disasters, 2016 wasn’t the greatest year. If you made a movie about it, this is what the trailer would look like: We’ll be taking a few days off to recover from 2016. See you back here on January 3 and in the meantime, a very happy (and welcome!) New Year to all of our readers. 
Original Post: Because it's Friday: Goodbye, 2016

Power BI custom visuals, based on R

You’ve been able to include user-defined charts using R in Power BI dashboards for a while now, but a recent update to Power BI includes seven new custom charts based on R in the custom visuals gallery. You can see the new chart types by visiting the Power BI Custom Visuals Gallery and clicking on the “R-powered visuals” tab. The new custom visuals are listed below. Click on the visual names to see an example, and click on the “GitHub” link at the bottom of the pop-up to see the actual R code used. You can also combine custom R charts with other Power BI tools to create interactive R charts. A favorite of mine is the Timeline slicer, which allows you to select a range from a date variable. (The dashboard, including the R charts, is updated using only…
Original Post: Power BI custom visuals, based on R

Citizen Data Scientist, Jumbo Shrimp, and Other Descriptions That Make No Sense

Okay, let me get this out there: I find the term “Citizen Data Scientist” confusing. Gartner defines a “citizen data scientist” as “a person who creates or generates models that leverage predictive or prescriptive analytics but whose primary job function is outside of the field of statistics and analytics.” While we teach business users to “think like a data scientist” in their ability to identify those variables and metrics that might be better predictors of performance, I do not expect that the business stakeholders are going to be able to create and generate analytic models. I do not believe, nor do I expect, that the business stakeholders are going to be proficient enough with tools like SAS or R or Python or Mahout or MADlib to 1) create or generate the models, and then 2) be proficient enough to be able to…
Original Post: Citizen Data Scientist, Jumbo Shrimp, and Other Descriptions That Make No Sense

Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs. In the meantime, I keep posting the stuff they send me, as part of my desperate effort to empty my inbox. 1. From Lewis: “Should Students Assessed as Needing Remedial Mathematics Take College-Level Quantitative Courses Instead? A Randomized Controlled Trial,” by A. W. Logue, Mari Watanabe-Rose, and Daniel Douglas, which begins: Many college students never take, or do not pass, required remedial mathematics courses theorized to increase college-level performance. Some colleges and states are therefore instituting policies allowing students to take college-level courses without first taking remedial courses. However, no experiments have compared the effectiveness of these approaches, and other data are mixed. We randomly assigned 907 students to (a) remedial elementary algebra, (b) that course with workshops, or (c) college-level statistics with…
Original Post: Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Using R to prevent food poisoning in Chicago

There are more than 15,000 restaurants in Chicago, but fewer than 40 inspectors tasked with making sure they comply with food-safety standards. To help prioritize the facilities targeted for inspection, the City of Chicago used R to create a model that predicts which restaurants are most likely to fail an inspection. Using this model to deploy inspectors, the City is able to detect unsafe restaurants more than a week sooner than by using traditional selection methods, and cite 37 additional restaurants per month. Chicago’s Department of Public Health used the R language to build and deploy the model, and made the code available as an open source project on GitHub. The reasons given are twofold: An open source approach helps build a foundation for other models attempting to forecast violations at food establishments. The analytic code is written in R, an open source,…
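The scheduling idea behind the model is simple to sketch: score each restaurant with a predicted probability of failing inspection, then send the limited pool of inspectors to the highest-risk facilities first. The City's actual model is written in R (and open-sourced on GitHub, as noted above); the Python sketch below uses invented restaurant names and scores purely to show the prioritization step.

```python
# Toy sketch of risk-based inspection prioritization.
# Chicago's real model is in R; these names and scores are invented for illustration.

def prioritize(predicted_fail_prob, n_inspectors):
    """Return the restaurants to inspect first: highest predicted
    failure probability, limited by inspector capacity."""
    ranked = sorted(predicted_fail_prob.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n_inspectors]]

scores = {"Diner A": 0.82, "Cafe B": 0.15, "Grill C": 0.64, "Bistro D": 0.40}
print(prioritize(scores, n_inspectors=2))  # the two highest-risk restaurants
```

Ranking by predicted risk instead of inspecting on a fixed rotation is what lets the same 40 inspectors find violations sooner.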
Original Post: Using R to prevent food poisoning in Chicago

Game Theory Reveals the Future of Deep Learning

By Carlos Perez, Intuition Machine. If you’ve been following my articles up to now, you’ll begin to perceive what is apparent to many advanced practitioners of Deep Learning (DL): the emergence of game-theoretic concepts in the design of newer architectures. This makes intuitive sense for two reasons. The first intuition is that DL systems will eventually need to tackle situations with imperfect knowledge. In fact, we’ve already seen this in DeepMind’s AlphaGo, which uses partial knowledge to tactically and strategically best the world’s top human players in the game of Go. The second intuition is that systems will not remain monolithic as they are now, but will instead involve multiple coordinating (or competing) cliques of DL systems. We already see this in the construction of adversarial networks. Adversarial networks consist of competing neural networks, a generator and discriminator,…
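The two-player minimax structure underlying that generator-vs.-discriminator framing can be illustrated on the simplest possible game. This is not a neural network: it is fictitious play on matching pennies, a classic zero-sum game, shown here only to make the game-theoretic point that two adversaries best-responding to each other can converge (in empirical frequencies) to a mixed equilibrium.

```python
# Fictitious play on matching pennies: each player repeatedly best-responds to
# the opponent's empirical action frequencies. By Robinson's theorem, empirical
# play in two-player zero-sum games converges to the mixed equilibrium (50/50 here).

A = [[1.0, -1.0],   # row player's payoffs; the column player receives the negative
     [-1.0, 1.0]]

counts_row = [1.0, 0.0]  # arbitrary initial beliefs about each player's actions
counts_col = [0.0, 1.0]

def best_response(payoffs):
    return 0 if payoffs[0] >= payoffs[1] else 1

for _ in range(20000):
    p_row = [c / sum(counts_row) for c in counts_row]
    p_col = [c / sum(counts_col) for c in counts_col]
    row_payoffs = [A[i][0] * p_col[0] + A[i][1] * p_col[1] for i in range(2)]
    col_payoffs = [-(p_row[0] * A[0][j] + p_row[1] * A[1][j]) for j in range(2)]
    counts_row[best_response(row_payoffs)] += 1
    counts_col[best_response(col_payoffs)] += 1

freq = counts_row[0] / sum(counts_row)
print(freq)  # empirical play approaches the 50/50 mixed equilibrium
```

Adversarial networks play a far richer version of this game, with the generator and discriminator as the two players and gradient steps in place of best responses, but the minimax logic is the same.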
Original Post: Game Theory Reveals the Future of Deep Learning

“The Pitfall of Experimenting on the Web: How Unattended Selective Attrition Leads to Surprising (Yet False) Research Conclusions”

Posted by Andrew on 29 December 2016, 9:55 am

Kevin Lewis points us to this paper by Haotian Zhou and Ayelet Fishbach, which begins: The authors find that experimental studies using online samples (e.g., MTurk) often violate the assumption of random assignment, because participant attrition—quitting a study before completing it and getting paid—is not only prevalent, but also varies systematically across experimental conditions. Using standard social psychology paradigms (e.g., ego-depletion, construal level), they observed attrition rates ranging from 30% to 50% (Study 1). The authors show that failing to attend to attrition rates in online panels has grave consequences. By introducing experimental confounds, unattended attrition misled them to draw mind-boggling yet false conclusions: that recalling a few happy events is considerably more effortful…
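The confound is easy to reproduce in simulation. The sketch below uses invented numbers, not the paper's data: the true treatment effect is exactly zero, but the demanding condition drives low-motivation participants to quit before the outcome is measured, so a completers-only comparison shows a large spurious difference.

```python
import random

random.seed(0)

def run_condition(dropout_if_low_motivation):
    """Simulate one condition. The outcome depends only on a participant's
    latent motivation, never on the condition: the true effect is zero."""
    completers = []
    for _ in range(10000):
        motivation = random.random()  # latent trait, uniform on [0, 1)
        outcome = motivation          # identical outcome rule in both conditions
        # In the demanding condition, low-motivation participants tend to quit,
        # producing ~40% attrition (within the paper's observed 30-50% range).
        quits = dropout_if_low_motivation and motivation < 0.4
        if not quits:
            completers.append(outcome)
    return sum(completers) / len(completers)

easy = run_condition(dropout_if_low_motivation=False)  # everyone completes
hard = run_condition(dropout_if_low_motivation=True)   # selective attrition
print(easy, hard)  # completers-only means differ despite a zero true effect
```

Because quitting is correlated with the outcome, the surviving samples in the two conditions are no longer comparable: randomization has been silently undone, which is exactly the paper's point.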
Original Post: “The Pitfall of Experimenting on the Web: How Unattended Selective Attrition Leads to Surprising (Yet False) Research Conclusions”

2017 will be the year the data science and big data community engage with AI technologies

The Tulip Stairs and lantern at the Queen’s House in Greenwich by Inigo Jones (source: Mcginnly on Wikimedia Commons). Strata + Hadoop World San Jose will take place March 13-16, 2017. Use code BIGDATA20 for a 20% discount on registration. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS. This episode consists of excerpts from a recent talk I gave at a conference commemorating the end of the UC Berkeley AMPLab project. This section pertained to some recent trends in data and AI. For a complete list of trends we’re watching in 2017, as well as regular doses of highly curated resources, subscribe to our Data and AI newsletters. As 2016 draws to a close, I see the big data and data…
Original Post: 2017 will be the year the data science and big data community engage with AI technologies