Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. A common conceptual error is that researchers take the rejection of a straw-man null as evidence in favor of their preferred alternative. A standard mode of operation goes like this: p < 0.05 is taken as strong evidence against the null hypothesis, p > 0.15 is taken as evidence in favor of the null, and p near 0.10 is taken either as weak evidence for an effect or as evidence of a weak effect. Unfortunately, none of those inferences is generally appropriate: a…
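A quick simulation makes the point concrete. The sketch below is not from the post; the effect size and sample size are hypothetical. It draws many replications of a modestly powered two-group study in which the effect is real, and tabulates how often each heuristic fires: p > 0.15 comes up in roughly two-thirds of replications even though the null is false, so non-rejection cannot be read as evidence for the null.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setting: a real effect of 0.2 sd with n = 50 per group,
# i.e. power of roughly 0.17 -- not unusual in practice.
n, true_effect, n_sims = 50, 0.2, 10_000
pvals = np.empty(n_sims)
for i in range(n_sims):
    treatment = rng.normal(true_effect, 1, n)
    control = rng.normal(0, 1, n)
    pvals[i] = stats.ttest_ind(treatment, control).pvalue

print(f"P(p < 0.05) = {np.mean(pvals < 0.05):.2f}")  # ~0.17: the real effect is usually missed
print(f"P(p > 0.15) = {np.mean(pvals > 0.15):.2f}")  # ~0.66: "evidence for the null" despite a real effect
```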

#NotAll4YearOlds

I think there’s something wrong with this op-ed by developmental psychologist Alison Gopnik, “4-year-olds don’t act like Trump,” which begins: The analogy is pervasive among his critics: Donald Trump is like a child. . . . But the analogy is profoundly wrong, and it’s unfair to children. The scientific developmental research of the past 30 years shows that Mr. Trump is utterly unlike a 4-year-old. Gopnik continues with a list of positive attributes, each of which, she asserts, is held by four-year-olds but not by the president: Four-year-olds care deeply about the truth. . . . But Mr. Trump doesn’t just lie; he seems not even to care whether his statements are true. Four-year-olds are insatiably curious. One study found that the average preschooler asks hundreds of questions per day. . . . Mr. Trump refuses to read and is…

Hotel room aliases of the statisticians

Barry Petchesky writes: Below you’ll find a room list found before Game 1 at the Four Seasons in Houston (right across from the arena), where the Thunder were staying for their first-round series against the Rockets. We didn’t run it then because we didn’t want Rockets fans pulling the fire alarm or making late-night calls to the rooms . . . This is just great, and it makes me think we need the same thing at statistics conferences:

LAPLACE, P . . . Christian Robert
EINSTEIN, A . . . Brad Efron
CICCONE, M . . . Grace Wahba
SPRINGSTEEN, B . . . Brad Carlin
NICKS, S . . . Jennifer Hill
THATCHER, M . . . Deb Nolan
KEILLOR, G . . . Jim Berger
BARRIS, C . . . Rob…

Taking Data Journalism Seriously

This is a bit of a followup to our recent review of “Everybody Lies.” While writing the review I searched the blog for mentions of Seth Stephens-Davidowitz, and I came across this post from last year, concerning a claim made by author J. D. Vance that “the middle part of America is more religious than the South.” This claim stunned me, given that I’d seen some of the statistics on the topic, and it turned out that Vance had been mistaken: he’d used unadjusted numbers that were not directly comparable across regions of the country. It was an interesting statistical example, and interesting too in that claims made in data journalism, just like claims made in academic research, can get all sorts of uncritical publicity. People just trust the numbers, which makes sense…
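The adjustment issue is easy to see in miniature. Here is a toy sketch, with numbers invented for illustration and not taken from the surveys in question, of how raw, unadjusted averages can rank two regions one way while every within-group comparison points the other way:

```python
# Toy illustration with invented numbers: region-level averages vs.
# within-group comparisons. Each entry is (population share, rate).
regions = {
    "South":   {"group_1": (0.5, 0.70), "group_2": (0.5, 0.30)},
    "Midwest": {"group_1": (0.8, 0.65), "group_2": (0.2, 0.25)},
}

for region, groups in regions.items():
    raw = sum(share * rate for share, rate in groups.values())
    print(f"{region}: unadjusted rate = {raw:.2f}")

# Unadjusted: Midwest 0.57 > South 0.50, so the Midwest looks "more religious."
# But within each group the South is higher (0.70 > 0.65 and 0.30 > 0.25):
# the raw comparison reflects population composition, not behavior.
```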

Accounting for variation and uncertainty

Yesterday I gave a list of the questions they’re asking me when I speak at the Journal of Accounting Research Conference. All kidding aside, I think that a conference of accountants is the perfect setting for a discussion of research integrity, as accounting is all about setting up institutions to enable trust. The challenge is that traditional accounting is deterministic: there’s a ledger and that’s that. In statistics, we talk all the time about accounting for variation and uncertainty. Maybe “accounting” is more than a metaphor here, and maybe there’s more of a connection to the traditional practices of accounting than I’d thought.

A completely reasonable-sounding statement with which I strongly disagree

From a couple years ago: In the context of a listserv discussion about replication in psychology experiments, someone wrote: The current best estimate of the effect size is somewhere in between the original study and the replication’s reported value. This conciliatory, split-the-difference statement sounds reasonable, and it might well represent good politics in the context of a war over replications—but from a statistical perspective I strongly disagree with it, for the following reason. The original study’s estimate typically has a huge bias (due to the statistical significance filter). The estimate from the replicated study, assuming it’s a preregistered replication, is unbiased. I think in such a setting the safest course is to use the replication’s reported value as our current best estimate. That doesn’t mean that the original study is “wrong,” but it is wrong to report a biased estimate…
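The significance filter is easy to demonstrate by simulation. Below is a minimal sketch, with a hypothetical effect size and sample size not tied to any particular study: averaged over all simulated studies the estimate is unbiased, but restricting to the “publishable” studies with p < 0.05 inflates the average estimate well above the true effect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical numbers: true effect 0.2 sd, n = 50 per group (low power).
n, true_effect, n_sims = 50, 0.2, 20_000
est = np.empty(n_sims)
pvals = np.empty(n_sims)
for i in range(n_sims):
    treatment = rng.normal(true_effect, 1, n)
    control = rng.normal(0, 1, n)
    est[i] = treatment.mean() - control.mean()
    pvals[i] = stats.ttest_ind(treatment, control).pvalue

print(f"true effect:                  {true_effect:.2f}")
print(f"mean estimate, all studies:   {est.mean():.2f}")                # ~0.20, unbiased
print(f"mean estimate, p < 0.05 only: {est[pvals < 0.05].mean():.2f}")  # ~0.50, inflated ~2.5x
```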

7th graders trained to avoid Pizzagate-style data exploration—but is the training too rigid?

Laura Kapitula writes: I wanted to share a cute story that gave me a bit of hope. My daughter, who is in 7th grade, was doing her science project. She had designed an experiment comparing lemon batteries to potato batteries, a 2×4 design with lemons or potatoes as one factor and number of fruits/vegetables as the other factor (1, 2, 3, or 4). She had to “preregister” her experiment with her teacher and had basically designed her experiment herself and done her analysis plan without any help from her statistician mother. Typical scientist, not consulting the statistician until she was already collecting data. She was running the experiment and after she had done all her batteries and…
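For what it’s worth, here is a minimal sketch of what a preregistered analysis of such a 2×4 design might look like. Everything here is made up for illustration, since the post doesn’t give the data: the variable names, the voltage model, and the cell sizes are all hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Invented data in the shape of the 2x4 design described above:
# battery type (lemon vs. potato) crossed with number of units (1-4),
# a few replicate batteries per cell, and voltage as the outcome.
rows = []
for fruit in ["lemon", "potato"]:
    for count in [1, 2, 3, 4]:
        for _ in range(3):
            volts = 0.9 * count + (0.1 if fruit == "lemon" else 0.0) + rng.normal(0, 0.1)
            rows.append({"fruit": fruit, "count": count, "volts": volts})
df = pd.DataFrame(rows)

# The preregistered comparison as a simple two-way linear model,
# with a fruit-by-count interaction.
fit = smf.ols("volts ~ fruit * count", data=df).fit()
print(fit.summary())
```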

What hypothesis testing is all about. (Hint: It’s not what you think.)

The conventional view: Hyp testing is all about rejection. The idea is that if you reject the null hyp at the 5% level, you have a win, you have learned that a certain null model is false and science has progressed, either in the glamorous “scientific revolution” sense that you’ve rejected a central pillar of science-as-we-know-it and are forcing a radical re-evaluation of how we think about the world (those are the accomplishments of Kepler, Curie, Einstein, and . . . Daryl Bem), or in the more usual “normal science” sense in which a statistically significant finding is a small brick in the grand cathedral of science (or a stall in the scientific bazaar, whatever, I don’t give a damn what you call it), a three-yards-and-a-cloud-of-dust, all-in-a-day’s-work kind of thing, a “necessary murder” as Auden notoriously put it (and for…

The statistical crisis in science: How is it relevant to clinical neuropsychology?

Hilde Geurts and I write: There is currently increased attention to the statistical (and replication) crisis in science. Biomedicine and social psychology have been at the heart of this crisis, but similar problems are evident in a wide range of fields. We discuss three examples of replication challenges from the field of social psychology and some proposed solutions, and then consider the applicability of these ideas to clinical neuropsychology. In addition to procedural developments such as preregistration, open data, and open criticism, we recommend that data be collected and analyzed with more recognition that each new study is a part of a learning process. The goal of improving neuropsychological assessment, care, and cure is too important to not…

A small, underpowered treasure trove?

Benjamin Kirkup writes: As you sometimes comment on such things, I’m forwarding you a journal editorial (in a society journal) that presents “lessons learned” from an associated research study. What caught my attention was the comment on the “notorious” design, the lack of “significant” results, and the “interesting data on nonsignificant associations.” Apparently, the work “does not serve to inform the regulatory decision-making process with respect to antimicrobial compounds” but is “still valuable and can be informative.” Given the commissioning of a lessons-learned, how do you think the scientific publishing community should handle manuscripts presenting work with problematic designs and naturally uninformative outcomes? The editorial in question is called “Lessons Learned from Probing for Impacts of Triclosan and Triclocarban on Human Microbiomes,” it is by Rolf Halden, and…
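“Underpowered” can be made concrete with a quick calculation. A minimal sketch, assuming a two-arm comparison with a modest standardized effect; the numbers are hypothetical, as the editorial’s actual design details aren’t given here:

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical numbers: a small two-arm study chasing a modest effect.
analysis = TTestIndPower()

power = analysis.power(effect_size=0.3, nobs1=20, alpha=0.05)
print(f"power with n = 20 per arm: {power:.2f}")      # roughly 0.15

n_needed = analysis.solve_power(effect_size=0.3, power=0.8, alpha=0.05)
print(f"n per arm for 80% power:   {n_needed:.0f}")   # roughly 175
```

With power this low, a null result is close to uninformative, which is exactly the tension the editorial’s “still valuable and can be informative” framing has to confront.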