We fiddle while Rome burns: p-value edition

Raghu Parthasarathy presents a wonderfully clear example of disastrous p-value-based reasoning that he saw in a conference presentation. Here’s Raghu: Consider, for example, some tumorous cells that we can treat with drugs 1 and 2, either alone or in combination. We can make measurements of growth under our various drug treatment conditions. Suppose our measurements give us the following graph: . . . from which we tell the following story: When administered on their own, drugs 1 and 2 are ineffective — tumor growth isn’t statistically different than the control cells (p > 0.05, 2 sample t-test). However, when the drugs are administered together, they clearly affect the cancer (p < 0.05); in fact, the p-value is very small (0.002!). This indicates a clear synergy between the two drugs: together they have a much stronger effect than each alone does.…
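The fallacy is easy to simulate. Here is a minimal sketch (all numbers invented; nothing here comes from the talk Raghu saw): give two drugs the exact same modest true effect, run the usual two-sample t-tests against control, and watch how often the data tell the "drug 1 works, drug 2 doesn't" story anyway. The difference between "significant" and "not significant" is not itself statistically significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical setup: tumor growth under control vs. two drug conditions.
# Both drugs have the SAME true effect; every number here is invented.
n, control_mean, effect, sd = 10, 1.0, -0.4, 0.5
n_sims = 2000

mismatches = 0
for _ in range(n_sims):
    control = rng.normal(control_mean, sd, n)
    drug1 = rng.normal(control_mean + effect, sd, n)
    drug2 = rng.normal(control_mean + effect, sd, n)
    p1 = stats.ttest_ind(control, drug1).pvalue
    p2 = stats.ttest_ind(control, drug2).pvalue
    # The fallacious story: one drug "works" (p < .05), the other "doesn't"
    if (p1 < 0.05) != (p2 < 0.05):
        mismatches += 1

print(f"identical drugs, discordant significance: {mismatches / n_sims:.0%} of replications")
```

The honest analysis compares the drug conditions to each other directly, or better, estimates all the effects in one model; it does not compare each condition's p-value against the control and then tell a story about the pattern of stars.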

Nooooooo, just make it stop, please!

Dan Kahan wrote: You should do a blog on this. I replied: I don’t like this article but I don’t really see the point in blogging on it. Why bother? Kahan: BECAUSE YOU REALLY NEVER HAVE EXPLAINED WHY. Gelman-Rubin critique of BIC is not responsive; you have something in mind—tell us what, pls! Inquiring minds want to know. Me: Wait, are you saying it’s not clear to you why I should hate that paper?? Kahan: YES!!!!!!! Certainly what you say about “model selection” aspects of BIC in Gelman-Rubin doesn’t apply. Me: OK, OK. . . . The paper is called Bayesian Benefits for the Pragmatic Researcher, and it’s by some authors whom I like and respect, but I don’t like what they’re doing. Here’s their abstract: The practical advantages of Bayesian inference are demonstrated here through two concrete examples. In the…

Emails I never bothered to answer

Posted by Andrew on 26 December 2016, 9:10 am. So, this came in the email one day: Dear Professor Gelman, I would like to shortly introduce myself: I am editor in the ** Department at the publishing house ** (based in ** and **). As you may know, ** has taken over all journals of ** Press. We are currently restructuring some of the journals and are therefore looking for new editors for the journal **. You have published in the journal, you work in the field . . . your name was recommended by Prof. ** as a potential editor for the journal. . . . We think you would be an excellent choice and I would like to ask you kindly whether you are interested to become an editor of the journal. In…

p=.03, it’s gotta be true!

Posted by Andrew on 24 December 2016, 9:39 am. Howie Lempel writes: Showing a white person a photo of Obama w/ artificially dark skin instead of artificially lightened skin before asking whether they support the Tea Party raises their probability of saying “yes” from 12% to 22%. 255 person Amazon Turk and Craigslist sample, p=.03. Nothing too unusual about this one. But it’s particularly grating when hyper educated liberal elites use shoddy research to decide that their political opponents only disagree with them because they’re racist. https://www.washingtonpost.com/news/wonk/wp/2016/05/13/how-psychologists-used-these-doctored-obama-photos-to-get-white-people-to-support-conservative-politics/ https://news.stanford.edu/2016/05/09/perceived-threats-racial-status-drive-white-americans-support-tea-party-stanford-scholar-says/ Hey, they could have a whole series of this sort of experiment: – Altering the orange hue of Donald Trump’s skin and seeing if it affects how much people trust the guy . . . – Making Hillary Clinton fatter and seeing if that somehow makes her…
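This is exactly the sort of study where a design analysis in the spirit of Gelman and Carlin's "Type M error" is informative. A hedged sketch, with assumed numbers (suppose the true effect of the darkened photo were 5 percentage points rather than the observed 10, with a 12% base rate and roughly 127 subjects per condition): conditional on reaching p < .05, the estimated effect badly overstates the assumed truth.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Design-analysis sketch in the spirit of Gelman and Carlin's "Type M error".
# All numbers are assumptions: suppose the true effect of the darkened photo
# were 5 percentage points (not the observed 10), base rate 12%, and about
# 127 subjects per condition.
n_per_group, p_base, true_effect = 127, 0.12, 0.05
n_sims = 5000

sig_estimates = []
for _ in range(n_sims):
    dark = rng.binomial(1, p_base + true_effect, n_per_group)
    light = rng.binomial(1, p_base, n_per_group)
    diff = dark.mean() - light.mean()
    # crude two-proportion comparison via a t-test on the 0/1 outcomes
    if stats.ttest_ind(dark, light).pvalue < 0.05 and diff > 0:
        sig_estimates.append(diff)

exaggeration = np.mean(sig_estimates) / true_effect
print(f"power ~ {len(sig_estimates) / n_sims:.0%}; significant estimates "
      f"average {exaggeration:.1f}x the assumed true effect")
```

The point is not that 5 points is the right number; it's that at this sample size, if the true effect is modest, any estimate that clears the significance filter is close to guaranteed to be an exaggeration.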

This is not news.

This is not news. Posted by Andrew on 22 December 2016, 11:06 am Anne Pier Salverda writes: I’m not sure if you’re keeping track of published failures to replicate the power posing effect, but this article came out earlier this month: “Embodied power, testosterone, and overconfidence as a causal pathway to risk-taking.” From the abstract: We were unable to replicate the findings of the original study and subsequently found no evidence for our extended hypotheses. Gotta love that last sentence of the abstract: As our replication attempt was conducted in the Netherlands, we discuss the possibility that cultural differences may play a moderating role in determining the physiological and psychological effects of power posing. Let’s just hope that was a joke. Jokes are ok in academic papers, right?

Hark, hark! the p-value at heaven’s gate sings

Three different people pointed me to this post, in which food researcher and business school professor Brian Wansink advises Ph.D. students to “never say no”: When a research idea comes up, check it out, put some time into it and you might get some success. I like that advice and I agree with it. Or, at least, this approach worked for me when I was a student and it continues to work for me now, and my favorite students are those who follow this approach. That said, there could be some selection bias here, that the students who say Yes to new projects are the ones who are more likely to be able to make use of such opportunities. Maybe the students who say No would just end up getting distracted and making no progress, were they to follow this…

What is valued by the Association for Psychological Science

Someone pointed me to this program of the forthcoming Association for Psychological Science conference: Kind of amazing that they asked Amy Cuddy to speak. Weren’t Dana Carney or Andy Yap available? What would really have been bold would have been for them to invite Eva Ranehill or Anna Dreber. Good stuff. The chair of the session is Susan Goldin-Meadow, who’s famous both for inviting that non-peer-reviewed “methodological terrorism” article that complained about non-peer-reviewed criticism, and also for some over-the-top claims of her own, including this amazing statement: Barring intentional fraud, every finding is an accurate description of the sample on which it was run. This is ridiculous. For example, I think it’s safe to assume that Reinhart and Rogoff did not do any intentional fraud in that famous paper of theirs—even their critics just talked about an “Excel error.” But…

fMRI clusterf******

Several people pointed me to this paper by Anders Eklund, Thomas Nichols, and Hans Knutsson, which begins: Functional MRI (fMRI) is 25 years old, yet surprisingly its most common statistical methods have not been validated using real data. Here, we used resting-state fMRI data from 499 healthy controls to conduct 3 million task group analyses. Using this null data with different experimental designs, we estimate the incidence of significant results. In theory, we should find 5% false positives (for a significance threshold of 5%), but instead we found that the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%. These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results. I’m not a big fan of the whole false-positive,…
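The cluster-level inference that Eklund et al. actually evaluate is more subtle than this, but the basic arithmetic of running many tests on null data is easy to illustrate. A toy sketch (independent Gaussian "voxels," no spatial structure, every parameter invented): at an uncorrected 5% threshold per voxel, essentially every pure-noise study turns up at least one "significant" voxel.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy illustration (much simpler than the cluster inference Eklund et al.
# evaluate): run a one-sample group analysis on pure-noise "voxels" and
# count how often at least one voxel clears an uncorrected p < 0.05.
n_subjects, n_voxels, n_studies = 20, 1000, 500

false_positive_studies = 0
for _ in range(n_studies):
    data = rng.normal(size=(n_subjects, n_voxels))  # null data: no signal anywhere
    p = stats.ttest_1samp(data, 0.0).pvalue          # one test per voxel
    if (p < 0.05).any():
        false_positive_studies += 1

print(f"null studies with >=1 'significant' voxel: "
      f"{false_positive_studies / n_studies:.0%}")
```

Real fMRI packages do correct for multiple comparisons, of course; the Eklund et al. finding is that the parametric assumptions behind the standard cluster-level corrections fail on real resting-state data, which is what inflates the familywise rate so far above the nominal 5%.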

“Dear Major Textbook Publisher”: A Rant

Dear Major Academic Publisher, You just sent me, unsolicited, an introductory statistics textbook that is 800 pages and weighs about 5 pounds. It’s the 3rd edition of a book by someone I’ve never heard of. That’s fine—a newcomer can write a good book. The real problem is that the book is crap. It’s just the usual conventional intro stat stuff. The book even has a table of the normal distribution on the inside cover! How retro is that? The book is bad in so many, many ways, I don’t really feel like going into it. There’s nothing interesting here at all, the examples are uniformly fake, and I really can’t imagine this is a good way to teach this material to anybody. None of it makes sense, and a lot of the advice is out-and-out bad (for example, a table…

Hot hand 1, WSJ 0

Posted by Andrew on 6 December 2016, 5:49 pm. In a generally good book review on “uncertainty and the limits of human reason,” William Easterly writes: Failing to process uncertainty correctly, we attach too much importance to too small a number of observations. Basketball teams believe that players suddenly have a “hot hand” after they have made a string of baskets, so you should pass them the ball. Tversky showed that the hot hand was a myth—among many small samples of shooting attempts, there will randomly be some streaks. Instead of a hot hand, there was “regression to the mean”—players fall back down to their average shooting prowess after a streak. Likewise a “cold” player will move back up to his own average. No no no. The funny thing is: 1. As Miller and Sanjurjo explain,…
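Miller and Sanjurjo's point can be checked in a few lines. In finite sequences of fair coin flips, compute within each sequence the proportion of heads immediately following a run of k heads, then average that proportion across sequences: the answer comes out below 50%, not equal to it. This selection bias means the original Gilovich, Vallone, and Tversky style of analysis would report "no hot hand" even when facing a shooter with a genuinely hot hand. A sketch with assumed parameters (100 flips per sequence, streaks of 3):

```python
import numpy as np

rng = np.random.default_rng(7)

# Miller and Sanjurjo's streak-selection bias, in miniature: within each
# finite sequence of fair coin flips, look at the flips that immediately
# follow a run of k heads and compute the proportion of heads among them.
# Averaging that proportion across sequences gives LESS than 50%.
n_flips, k, n_seqs = 100, 3, 20000

props = []
for _ in range(n_seqs):
    flips = rng.integers(0, 2, n_flips)          # 1 = heads, 0 = tails
    after_streak = [flips[i] for i in range(k, n_flips)
                    if flips[i - k:i].all()]     # flip right after k heads
    if after_streak:                             # skip streak-free sequences
        props.append(np.mean(after_streak))

# reliably below 0.5 (around 0.46 for these parameters), despite fair coins
print(f"mean P(heads | just saw {k} heads): {np.mean(props):.3f}")
```

So a raw "proportion of hits after a streak" near 46% in this design is exactly what a fair coin produces; finding it in shooting data is evidence of a hot hand, not against one.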