A stunned Dyson

Terry Martin writes: I ran into this quote and thought you might enjoy it. It’s from p. 273 of Segre’s new biography of Fermi, The Pope of Physics: When Dyson met with him in 1953, Fermi welcomed him politely, but he quickly put aside the graphs he was being shown indicating agreement between theory and experiment. His verdict, as Dyson remembered, was “There are two ways of doing calculations in theoretical physics. One way, and this is the way I prefer, is to have a clear physical picture of the process you are calculating. The other way is to have a precise and self-consistent mathematical formalism. You have neither.” When a stunned Dyson tried to counter by emphasizing the agreement between experiment and the calculations, Fermi asked him how…
Original Post: A stunned Dyson

How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

Someone who wishes to remain anonymous writes in with a story: Linking to a new paper by Jorge Luis García, James J. Heckman, and Anna L. Ziff, economist Sue Dynarski makes this “joke” on Facebook—or maybe it’s not a joke: How does one adjust standard errors to account for the fact that N of papers on an experiment > N of participants in the experiment? Clicking through, the paper uses data from the “Abecedarian” (ABC) childhood intervention program of the 1970s. Well, the related ABC & “CARE” experiments, pooled together. From Table 3 on page 7, the ABC experiment has 58 treatment and 56 control students, while CARE has 17 treatment and 23 control. If you type “abecedarian” into Google Scholar, sure enough, you get 9,160 results! OK, but maybe some of those just have citations or references to…
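To see the arithmetic behind the joke, here is a minimal simulation sketch. The pooled sample sizes come from the paper's Table 3; everything else (the number of outcomes, the pure-null setup) is invented for illustration. The point: when far more analyses than participants accumulate, a world with no effects at all still yields a steady stream of "significant" findings to write up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Pooled ABC + CARE sample sizes from Table 3; everything else is hypothetical.
n_treat, n_ctrl = 58 + 17, 56 + 23
n_outcomes = 200  # pretend 200 outcomes get analyzed across the literature

# A pure-noise world: treatment has no effect on any outcome.
treat = rng.normal(size=(n_outcomes, n_treat))
ctrl = rng.normal(size=(n_outcomes, n_ctrl))

pvals = np.array([stats.ttest_ind(t, c, equal_var=False).pvalue
                  for t, c in zip(treat, ctrl)])

print(f"{(pvals < 0.05).sum()} of {n_outcomes} null outcomes reach p < 0.05")
# Expect about 5% by chance alone; if only those get written up,
# the published record on the experiment looks full of effects.
```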
Original Post: How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. A common conceptual error is that researchers take the rejection of a straw-man null as evidence in favor of their preferred alternative. A standard mode of operation goes like this: p < 0.05 is taken as strong evidence against the null hypothesis, p > 0.15 is taken as evidence in favor of the null, and p near 0.10 is taken either as weak evidence for an effect or as evidence of a weak effect. Unfortunately, none of those inferences is generally appropriate: a…
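One way to see why that three-zone mapping fails is to simulate many replications of the same experiment, with a single fixed true effect, and watch the p-value bounce around. This is a toy sketch; the effect size, sample size, and noise level are all invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# One fixed, modest true effect; many identical replications of one experiment.
true_effect, n_per_group, n_reps = 0.3, 50, 10_000
pvals = np.empty(n_reps)
for i in range(n_reps):
    treat = rng.normal(true_effect, 1, n_per_group)
    ctrl = rng.normal(0, 1, n_per_group)
    pvals[i] = stats.ttest_ind(treat, ctrl).pvalue

# Under the *same* true effect, all three "inference zones" occur routinely:
print(f"p < 0.05   : {np.mean(pvals < 0.05):.0%}")
print(f"0.05 - 0.15: {np.mean((pvals >= 0.05) & (pvals <= 0.15)):.0%}")
print(f"p > 0.15   : {np.mean(pvals > 0.15):.0%}")
# Roughly a third of replications "reject," nearly half land above 0.15,
# and the rest sit in between -- so reading those zones as evidence for,
# against, or of a "weak" effect gets the same reality three different ways.
```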
Original Post: Some natural solutions to the p-value communication problem—and why they won’t work.

#NotAll4YearOlds

I think there’s something wrong with this op-ed by developmental psychologist Alison Gopnik, “4-year-olds don’t act like Trump,” which begins, The analogy is pervasive among his critics: Donald Trump is like a child. . . . But the analogy is profoundly wrong, and it’s unfair to children. The scientific developmental research of the past 30 years shows that Mr. Trump is utterly unlike a 4-year-old. Gopnik continues with a list of positive attributes, each of which, she asserts, is held by four-year-olds but not by the president: Four-year-olds care deeply about the truth. . . . But Mr. Trump doesn’t just lie; he seems not even to care whether his statements are true. Four-year-olds are insatiably curious. One study found that the average preschooler asks hundreds of questions per day. . . . Mr. Trump refuses to read and is…
Original Post: #NotAll4YearOlds

Hotel room aliases of the statisticians

Barry Petchesky writes: Below you’ll find a room list found before Game 1 at the Four Seasons in Houston (right across from the arena), where the Thunder were staying for their first-round series against the Rockets. We didn’t run it then because we didn’t want Rockets fans pulling the fire alarm or making late-night calls to the rooms . . . This is just great, and it makes me think we need the same thing at statistics conferences:

LAPLACE, P . . . Christian Robert
EINSTEIN, A . . . Brad Efron
CICCONE, M . . . Grace Wahba
SPRINGSTEEN, B . . . Brad Carlin
NICKS, S . . . Jennifer Hill
THATCHER, M . . . Deb Nolan
KEILLOR, G . . . Jim Berger
BARRIS, C . . . Rob…
Original Post: Hotel room aliases of the statisticians

Taking Data Journalism Seriously

This is a bit of a followup to our recent review of “Everybody Lies.” While writing the review I searched the blog for mentions of Seth Stephens-Davidowitz, and I came across this post from last year, concerning a claim made by author J. D. Vance that “the middle part of America is more religious than the South.” This was a claim that stunned me, given that I’d seen some of the statistics on the topic, and it turned out that Vance had been mistaken, that he’d used some unadjusted numbers which were not directly comparable when looking at different regions of the country. It was an interesting statistical example, also interesting in that claims made in data journalism, just like claims made in academic research, can get all sorts of uncritical publicity. People just trust the numbers, which makes sense…
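As a toy illustration of how unadjusted numbers can mislead across regions, here is one common mechanism, direct standardization over a compositional variable. All the rates and age mixes below are invented, and this is just one of several ways a raw comparison can go wrong; it is not a reconstruction of Vance's actual error:

```python
# Entirely hypothetical numbers, just to show the mechanics of adjustment.
# Religiosity rate by (young, old) age group within each region:
rates = {"South": (0.35, 0.65), "Midwest": (0.30, 0.60)}
# Suppose the regions have very different age mixes (young, old):
age_mix = {"South": (0.70, 0.30), "Midwest": (0.30, 0.70)}

for region in rates:
    young, old = rates[region]
    p_young, p_old = age_mix[region]
    raw = p_young * young + p_old * old
    # Direct standardization: re-weight both regions to a common 50/50 mix.
    adj = 0.5 * young + 0.5 * old
    print(f"{region}: raw {raw:.2f}, age-adjusted {adj:.2f}")
# Raw numbers: Midwest 0.51 vs South 0.44, so the Midwest "looks" more
# religious -- even though the South is higher within every age group
# (adjusted: South 0.50 vs Midwest 0.45). The comparison depends on
# whether, and how, you adjust.
```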
Original Post: Taking Data Journalism Seriously

Accounting for variation and uncertainty

[cat picture] Yesterday I gave a list of the questions they’re asking me when I speak at the Journal of Accounting Research Conference. All kidding aside, I think that a conference of accountants is the perfect setting for a discussion of research integrity, as accounting is all about setting up institutions to enable trust. The challenge is that traditional accounting is deterministic: there’s a ledger and that’s that. In statistics, we talk all the time about accounting for variation and uncertainty. Maybe “accounting” is more than a metaphor here, and maybe there’s more of a connection to the traditional practices of accounting than I’d thought.
Original Post: Accounting for variation and uncertainty

A completely reasonable-sounding statement with which I strongly disagree

From a couple years ago: In the context of a listserv discussion about replication in psychology experiments, someone wrote: The current best estimate of the effect size is somewhere in between the original study and the replication’s reported value. This conciliatory, split-the-difference statement sounds reasonable, and it might well represent good politics in the context of a war over replications—but from a statistical perspective I strongly disagree with it, for the following reason. The original study’s estimate typically has a huge bias (due to the statistical significance filter). The estimate from the replicated study, assuming it’s a preregistered replication, is unbiased. I think in such a setting the safest course is to use the replication’s reported value as our current best estimate. That doesn’t mean that the original study is “wrong,” but it is wrong to report a biased estimate…
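The size of that bias is easy to check by simulation. Here is a minimal sketch of the statistical significance filter; the true effect and standard error are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: a small true effect measured in noisy studies.
true_effect, se, n_studies = 0.2, 0.5, 100_000
estimates = rng.normal(true_effect, se, n_studies)

# The significance filter: only |z| > 1.96 results get published.
published = estimates[np.abs(estimates / se) > 1.96]

print(f"true effect:                     {true_effect}")
print(f"mean of all estimates:           {estimates.mean():.3f}")  # ~unbiased
print(f"mean of 'significant' estimates: {published.mean():.3f}")  # inflated severalfold

# A preregistered replication reports its estimate regardless of
# significance, so it draws from the unfiltered, unbiased distribution;
# the filtered original does not. Splitting the difference keeps half
# the bias.
```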
Original Post: A completely reasonable-sounding statement with which I strongly disagree

7th graders trained to avoid Pizzagate-style data exploration—but is the training too rigid?

[cat picture] Laura Kapitula writes: I wanted to share a cute story that gave me a bit of hope. My daughter who is in 7th grade was doing her science project. She had designed an experiment comparing lemon batteries to potato batteries, a 2×4 design with lemons or potatoes as one factor and number of fruits/vegetables as the other factor (1, 2, 3 or 4). She had to “preregister” her experiment with her teacher and had basically designed her experiment herself and done her analysis plan without any help from her statistician mother. Typical scientist not consulting the statistician until she was already collecting data. She was running the experiment and after she had done all her batteries and…
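For what the eventual analysis of such a design might look like, here is a sketch of a two-way factorial model fit to made-up voltage readings. The cell means, replicate counts, and noise level are all invented; this is not the actual science-fair data or analysis plan:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)

# Made-up voltage readings for the 2 x 4 design (food type x count).
rows = []
for food in ["lemon", "potato"]:
    for count in [1, 2, 3, 4]:
        for _ in range(3):  # pretend 3 replicate readings per cell
            volts = (0.9 * count
                     + (0.1 if food == "lemon" else 0.0)
                     + rng.normal(0, 0.1))
            rows.append({"food": food, "count": count, "volts": volts})
df = pd.DataFrame(rows)

# Two-way ANOVA matching the preregistered 2 x 4 factorial design:
fit = smf.ols("volts ~ C(food) * C(count)", data=df).fit()
print(anova_lm(fit))
```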
Original Post: 7th graders trained to avoid Pizzagate-style data exploration—but is the training too rigid?

What hypothesis testing is all about. (Hint: It’s not what you think.)

The conventional view: Hyp testing is all about rejection. The idea is that if you reject the null hyp at the 5% level, you have a win, you have learned that a certain null model is false and science has progressed, either in the glamorous “scientific revolution” sense that you’ve rejected a central pillar of science-as-we-know-it and are forcing a radical re-evaluation of how we think about the world (those are the accomplishments of Kepler, Curie, Einstein, and . . . Daryl Bem), or in the more usual “normal science” sense in which a statistically significant finding is a small brick in the grand cathedral of science (or a stall in the scientific bazaar, whatever, I don’t give a damn what you call it), a three-yards-and-a-cloud-of-dust, all-in-a-day’s-work kind of thing, a “necessary murder” as Auden notoriously put it (and for…
Original Post: What hypothesis testing is all about. (Hint: It’s not what you think.)