Letter to the Editor of Perspectives on Psychological Science

[relevant cat picture] tl;dr: Himmicane in a teacup. Back in the day, the New Yorker magazine did not have a Letters to the Editor column, and so the great Spy magazine (the Gawker of its time) ran its own feature, Letters to the Editor of the New Yorker, where they posted the letters you otherwise would never see. Here on this blog we can start a new feature, Letters to the Editor of Perspectives on Psychological Science, which will feature corrections that this journal refuses to print. Here’s our first entry: “In the article, ‘Going in Many Right Directions, All at Once,’ published in this journal, the author wrote, ‘some critics go beyond scientific argument and counterargument to imply that the entire field is inept and misguided (e.g., Gelman, 2014; Shimmack [sic], 2014).’ However, this article provided no evidence that…
Original Post: Letter to the Editor of Perspectives on Psychological Science

Delegate at Large

Asher Meir points to this delightful garden of forking paths, which begins:

• Politicians on the right look more beautiful in Europe, the U.S. and Australia.
• As beautiful people earn more, they are more likely to oppose redistribution.
• Voters use beauty as a cue for conservatism in low-information elections.
• Politicians on the right benefit more from beauty in low-information elections.

I wrote: On the plus side, it did not appear in a political science journal! Economists and psychologists can be such suckers for the “voters are idiots” models of politics. Meir replied: Perhaps since I am no longer an academic these things don’t even raise my hackles anymore. I just enjoy the entertainment value. This stuff still raises my hackles, partly because I’m in the information biz so I…
Original Post: Delegate at Large

“Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.”

Carol Nickerson pointed me to a new article by Arie Kruglanski, Marina Chernikova, and Katarzyna Jasko entitled “Social psychology circa 2016: A field on steroids.” I wrote: 1. I have no idea what the title of the article is supposed to mean. Are they saying that they’re using performance-enhancing drugs? 2. I noticed this from the above article: “Consider the ‘power posing’ effects (Carney, Cuddy, & Yap, 2010; Carney, Cuddy, & Yap, 2015) or the ‘facial feedback’ effects (Strack, Martin, & Stepper, 1988), both of which recently came under criticism on grounds of non-replicability. We happen to believe that these effects could be quite real rather than made up, albeit detectable only under some narrowly circumscribed conditions. Our beliefs derive from what (we believe) is the core psychological mechanism mediating these phenomena.” This seems naive to me. If we want to…
Original Post: “Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.”

How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

Someone who wishes to remain anonymous writes in with a story: Linking to a new paper by Jorge Luis García, James J. Heckman, and Anna L. Ziff, the economist Sue Dynarski makes this “joke” on Facebook—or maybe it’s not a joke: How does one adjust standard errors to account for the fact that N of papers on an experiment > N of participants in the experiment? Clicking through, the paper uses data from the “Abecedarian” (ABC) childhood intervention program of the 1970s. Well, the related ABC & “CARE” experiments, pooled together. From Table 3 on page 7, the ABC experiment has 58 treatment and 56 control students, while CARE has 17 treatment and 23 control. If you type “abecedarian” into Google Scholar, sure enough, you get 9,160 results! OK, but maybe some of those just have citations or references to…
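To see the force of the joke, here is a minimal simulation sketch, using made-up outcomes rather than the actual ABC/CARE data and a plain two-sample t-test standing in for whatever analyses the papers actually ran: with samples this small and many outcomes examined across many papers, a few comparisons will clear p < 0.05 even when every true effect is zero, and those are the ones that tend to get written up.

```python
# A minimal sketch, not the actual ABC/CARE data: every true effect is exactly zero,
# yet with ~100 outcomes examined on a sample of this size, a few comparisons will
# clear p < 0.05 by chance alone, and those are the ones that get written up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_treat, n_control = 58, 56   # the ABC sample sizes quoted above
n_outcomes = 100              # hypothetical count of outcomes examined across papers

false_positives = 0
for _ in range(n_outcomes):
    treat = rng.normal(0, 1, n_treat)       # null: treatment does nothing
    control = rng.normal(0, 1, n_control)
    if stats.ttest_ind(treat, control).pvalue < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_outcomes} null comparisons reach p < 0.05")
# Expect about 5 on average; selective reporting then turns noise into "findings."
```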
Original Post: How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. A common conceptual error is that researchers take the rejection of a straw-man null as evidence in favor of their preferred alternative. A standard mode of operation goes like this: p < 0.05 is taken as strong evidence against the null hypothesis, p > 0.15 is taken as evidence in favor of the null, and p near 0.10 is taken either as weak evidence for an effect or as evidence of a weak effect. Unfortunately, none of those inferences is generally appropriate: a…
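To make that last point concrete, here is a rough simulation sketch with hypothetical numbers (not an example from the paper with Carlin): fix a single modest true effect and replicate the same two-group comparison many times. The p-values land on both sides of every conventional cutoff, so neither the "strong evidence" reading of p < 0.05 nor the "evidence for the null" reading of p > 0.15 is a reliable guide to what is going on.

```python
# A rough sketch with hypothetical numbers: one fixed, modest true effect,
# replicated many times. The resulting p-values scatter across "significant,"
# "suggestive," and "non-significant," so the threshold-based readings above
# are not reliable guides to whether an effect is present.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect, n_per_group, n_reps = 0.3, 30, 10_000

pvals = np.empty(n_reps)
for i in range(n_reps):
    a = rng.normal(true_effect, 1, n_per_group)   # treatment group
    b = rng.normal(0, 1, n_per_group)             # control group
    pvals[i] = stats.ttest_ind(a, b).pvalue

print("share with p < 0.05:", (pvals < 0.05).mean())   # the "strong evidence" reading
print("share with p > 0.15:", (pvals > 0.15).mean())   # the "evidence for the null" reading
# Both readings occur routinely under the very same true effect.
```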
Original Post: Some natural solutions to the p-value communication problem—and why they won’t work.

What is needed to do good research (hint: it’s not just the avoidance of “too much weight given to small samples, a tendency to publish positive results and not negative results, and perhaps an unconscious bias from the researchers themselves”)

[cat picture] In a news article entitled, “No, Wearing Red Doesn’t Make You Hotter,” Dalmeet Singh Chawla recounts the story of yet another Psychological Science / PPNAS-style study (this one actually appeared back in 2008 in the Journal of Personality and Social Psychology, the same prestigious journal which published Daryl Bem’s ESP study a couple years later). Chawla’s article is just fine, and I think these non-replications should continue to get press, as much press as the original flawed studies. I have just two problems. The first is when Chawla writes: The issues at hand seem to be the same ones surfacing again and again in the replication crisis—too much weight given to small samples, a tendency to publish positive results and not negative results, and perhaps an unconscious bias from the researchers themselves. I mean, sure, yeah, I agree with…
Original Post: What is needed to do good research (hint: it’s not just the avoidance of “too much weight given to small samples, a tendency to publish positive results and not negative results, and perhaps an unconscious bias from the researchers themselves”)

Mockery is the best medicine

[cat picture] I’m usually not such a fan of twitter, but Jeff sent me this, from Andy Hall, and it’s just hilarious: The background is here. But Hall is missing a few key determinants of elections and political attitudes: subliminal smiley faces, college football, fat arms, and, of course, That Time of the Month. You can see why I can’t do twitter. I’m not concise enough.
Original Post: Mockery is the best medicine

“P-hacking” and the intention-to-cheat effect

I’m a big fan of the work of Uri Simonsohn and his collaborators, but I don’t like the term “p-hacking” because it can be taken to imply an intention to cheat. The image of p-hacking is of a researcher trying test after test on the data until reaching the magic “p less than .05.” But, as Eric Loken and I discuss in our paper on the garden of forking paths, multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. I worry that the widespread use of the term “p-hacking” gives two wrong impressions: First, it implies that the many researchers who use p-values incorrectly are cheating or “hacking,” even though I suspect they’re mostly…
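Here is a toy sketch of the forking-paths point, with a hypothetical setup rather than an example from the Gelman and Loken paper: the analyst reports only one p-value per dataset, but which comparison that p-value comes from depends on what the data show, so the rate of reporting p < 0.05 under a true null climbs above 5% without anything that looks like a deliberate fishing expedition.

```python
# A toy sketch (hypothetical setup, not an example from the Gelman and Loken paper):
# only one p-value is reported per dataset, but which comparison it comes from depends
# on the data, so the rate of reporting "p < .05" under a true null exceeds 5% even
# though no individual analysis looks like a fishing expedition.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, n_sims, reported_significant = 40, 10_000, 0

for _ in range(n_sims):
    y_treat = rng.normal(0, 1, n)                 # no true effect anywhere
    y_control = rng.normal(0, 1, n)
    female = rng.integers(0, 2, n).astype(bool)   # a covariate recorded in both groups

    p_overall = stats.ttest_ind(y_treat, y_control).pvalue
    if p_overall < 0.05:
        p_reported = p_overall                    # "the treatment works"
    else:
        # the data suggest the effect is "concentrated" in one subgroup: test that one
        diff_f = y_treat[female].mean() - y_control[female].mean()
        diff_m = y_treat[~female].mean() - y_control[~female].mean()
        grp = female if abs(diff_f) >= abs(diff_m) else ~female
        p_reported = stats.ttest_ind(y_treat[grp], y_control[grp]).pvalue
    reported_significant += p_reported < 0.05

print("rate of reporting p < 0.05 under the null:", reported_significant / n_sims)
```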
Original Post: “P-hacking” and the intention-to-cheat effect

We fiddle while Rome burns: p-value edition

Raghu Parthasarathy presents a wonderfully clear example of disastrous p-value-based reasoning that he saw in a conference presentation. Here’s Raghu: Consider, for example, some tumorous cells that we can treat with drugs 1 and 2, either alone or in combination. We can make measurements of growth under our various drug treatment conditions. Suppose our measurements give us the following graph: . . . from which we tell the following story: When administered on their own, drugs 1 and 2 are ineffective — tumor growth isn’t statistically different than the control cells (p > 0.05, 2 sample t-test). However, when the drugs are administered together, they clearly affect the cancer (p < 0.05); in fact, the p-value is very small (0.002!). This indicates a clear synergy between the two drugs: together they have a much stronger effect than each alone does.…
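To see how easily that story can emerge without any synergy at all, here is a minimal simulation sketch with hypothetical effect sizes (not Raghu’s actual example): give each drug the same small real effect and make the combination strictly additive. With small samples, the single-drug comparisons routinely miss p < 0.05 while the combination clears it.

```python
# A minimal sketch with hypothetical effect sizes (not Raghu's data): both drugs have
# the same small real effect and the combination is strictly additive, i.e. no synergy.
# With small samples, the single-drug comparisons often miss p < 0.05 while the
# combination clears it, which is exactly the "clear synergy" story told above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, effect, n_sims = 10, 0.8, 10_000
synergy_stories = 0   # runs where drug 1 n.s., drug 2 n.s., combination "significant"

for _ in range(n_sims):
    control = rng.normal(0, 1, n)
    drug1 = rng.normal(-effect, 1, n)         # each drug slows growth a little
    drug2 = rng.normal(-effect, 1, n)
    combo = rng.normal(-2 * effect, 1, n)     # purely additive, no interaction
    p1 = stats.ttest_ind(drug1, control).pvalue
    p2 = stats.ttest_ind(drug2, control).pvalue
    pc = stats.ttest_ind(combo, control).pvalue
    synergy_stories += (p1 > 0.05) and (p2 > 0.05) and (pc < 0.05)

print("share of runs telling the 'synergy' story:", synergy_stories / n_sims)
# A sizable fraction, even though the true model has zero interaction between the drugs.
```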
Original Post: We fiddle while Rome burns: p-value edition

Nooooooo, just make it stop, please!

Dan Kahan wrote: You should do a blog on this. I replied: I don’t like this article but I don’t really see the point in blogging on it. Why bother? Kahan: BECAUSE YOU REALLY NEVER HAVE EXPLAINED WHY. Gelman-Rubin critique of BIC is not responsive; you have something in mind—tell us what, pls! Inquiring minds want to know. Me: Wait, are you saying it’s not clear to you why I should hate that paper?? Kahan: YES!!!!!!! Certainly what you say about the “model selection” aspects of BIC in Gelman-Rubin doesn’t apply. Me: OK, OK. . . . The paper is called “Bayesian Benefits for the Pragmatic Researcher,” and it’s by some authors whom I like and respect, but I don’t like what they’re doing. Here’s their abstract: The practical advantages of Bayesian inference are demonstrated here through two concrete examples. In the…
Original Post: Nooooooo, just make it stop, please!