The Statistical Crisis in Science—and How to Move Forward (my talk next Monday 6pm at Columbia)

Posted by Andrew on 6 November 2017, 5:20 pm. I’m speaking Mon 13 Nov, 6pm, at the Low Library Rotunda at Columbia: The Statistical Crisis in Science—and How to Move Forward. Using examples ranging from elections to birthdays to policy analysis, Professor Andrew Gelman will discuss ways in which statistical methods have failed, leading to a replication crisis in much of science, as well as directions for improvement through statistical methods that make use of more information. Online reservation is required; follow the link (registration is currently full and closed). This will be a talk for a general audience.
Original Post: The Statistical Crisis in Science—and How to Move Forward (my talk next Monday 6pm at Columbia)

The time reversal heuristic (priming and voting edition)

Ed Yong writes: Over the past decade, social psychologists have dazzled us with studies showing that huge social problems can seemingly be rectified through simple tricks. A small grammatical tweak in a survey delivered to people the day before an election greatly increases voter turnout. A 15-minute writing exercise narrows the achievement gap between black and white students—and the benefits last for years. “Each statement may sound outlandish—more science fiction than science,” wrote Gregory Walton from Stanford University in 2014. But they reflect the science of what he calls “wise interventions” . . . They seem to work, if the stream of papers in high-profile scientific journals is to be believed. But as with many branches of psychology, wise interventions are taking a battering. A new wave of studies that attempted to replicate the promising experiments have found discouraging results.…
Original Post: The time reversal heuristic (priming and voting edition)

Post-publication review succeeds again! (Two-lines edition.)

Posted by Andrew on 3 November 2017, 10:18 pm. A couple months ago, Uri Simonsohn posted online a suggested statistical method for detecting nonmonotonicity in data. He called it: “Two-lines: The First Valid Test of U-Shaped Relationships.” With a title like that, I guess you’re asking for it. And, indeed, a while later I received an email from Yair Heller identifying some problems with Uri’s method. After checking with Yair, I forwarded his message to Simonsohn, who found the problem and fixed it. Uri’s update is here. Now, I don’t actually agree with Uri or Yair on this one: I don’t really buy the hypothesis-testing, type-1-error framework that they’re using. But that’s ok: it’s not my job to vet their methods. If these ideas are useful to others, great. My real point here is that post-publication…
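For readers curious what a “two lines” test even looks like, here is a minimal sketch of the general idea. This is not Simonsohn’s actual procedure (his method chooses the breakpoint adaptively rather than at the median), and the data and variable names below are invented for illustration:

```python
# Minimal sketch of a two-lines-style test for a U-shaped relationship.
# NOT Simonsohn's exact procedure: his method picks the breakpoint with an
# interrupted-regression algorithm; this toy version splits at the median.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 500)
y = x**2 + rng.normal(0, 1, 500)   # a true U shape plus noise

split = np.median(x)
left = linregress(x[x <= split], y[x <= split])   # expect a negative slope
right = linregress(x[x > split], y[x > split])    # expect a positive slope

print(f"left slope  {left.slope:+.2f} (p = {left.pvalue:.3g})")
print(f"right slope {right.slope:+.2f} (p = {right.pvalue:.3g})")
# Under this logic, evidence for a U shape is two significant slopes
# of opposite sign.
```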
Original Post: Post-publication review succeeds again! (Two-lines edition.)

More thoughts on that “What percent of Americans would you say are gay or lesbian?” survey

We had some discussion yesterday about this Gallup poll that asked respondents to guess the percentage of Americans who are gay. The average response was 23%—and this stunningly high number was not just driven by outliers: more than half the respondents estimated the proportion gay as 20% or more. All this is in stark contrast to direct survey estimates that 3 or 4% of Americans are gay. One thing that came up in comments is that survey respondents with minority sexual orientations might not want to admit it. So maybe that 3-4% is an underestimate. Here’s an informative news article by Samantha Allen which suggests that the traditionally cited 10% number might not be so far off. But even if the real rate is 10% (including lots of closeted people), that’s still much less than the 23% from the survey.…
Original Post: More thoughts on that “What percent of Americans would you say are gay or lesbian?” survey

Statistical Significance and the Dichotomization of Evidence (McShane and Gal’s paper, with discussions by Berry, Briggs, Gelman and Carlin, and Laber and Shedden)

Posted by Andrew on 1 November 2017, 4:05 pm. Blake McShane sent along this paper by himself and David Gal, which begins: In light of recent concerns about reproducibility and replicability, the ASA issued a Statement on Statistical Significance and p-values aimed at those who are not primarily statisticians. While the ASA Statement notes that statistical significance and p-values are “commonly misused and misinterpreted,” it does not discuss and document broader implications of these errors for the interpretation of evidence. In this article, we review research on how applied researchers who are not primarily statisticians misuse and misinterpret p-values in practice and how this can lead to errors in the interpretation of evidence. We also present new data…
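To see the sort of dichotomization the paper worries about, imagine two studies with nearly identical estimates whose p-values happen to straddle 0.05. A hypothetical sketch (the numbers are made up for the example):

```python
# Hypothetical illustration of dichotomizing evidence at p = 0.05: two
# studies with essentially the same estimate and standard error get
# opposite labels. Numbers are invented for the example.
from scipy.stats import norm

for est, se in [(0.50, 0.250), (0.50, 0.258)]:
    z = est / se
    p = 2 * norm.sf(abs(z))            # two-sided p-value
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"estimate {est:.2f}, se {se:.3f}: p = {p:.3f} -> {verdict}")
# The evidence is essentially identical, but the dichotomy at 0.05
# declares one study a success and the other a failure.
```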
Original Post: Statistical Significance and the Dichotomization of Evidence (McShane and Gal’s paper, with discussions by Berry, Briggs, Gelman and Carlin, and Laber and Shedden)

“Americans Greatly Overestimate Percent Gay, Lesbian in U.S.”

Posted by Andrew on 1 November 2017, 9:48 am. This sort of thing is not new but it’s still amusing. From a Gallup report by Frank Newport: The American public estimates on average that 23% of Americans are gay or lesbian, little changed from Americans’ 25% estimate in 2011, and only slightly higher than separate 2002 estimates of the gay and lesbian population. These estimates are many times higher than the 3.8% of the adult population who identified themselves as lesbian, gay, bisexual or transgender in Gallup Daily tracking in the first four months of this year. Newport provides some context: Part of the explanation for the inaccurate estimates of the gay and lesbian population rests with Americans’ general unfamiliarity with numbers and demography. Previous research has shown that Americans estimate that a…
Original Post: “Americans Greatly Overestimate Percent Gay, Lesbian in U.S.”

Contour as a verb

Our love is like the border between Greece and Albania – The Mountain Goats. (In which I am uncharacteristically brief.) Andrew’s answer to a recent post reminded me of one of my favourite questions: how do you visualise uncertainty in spatial maps? An interesting subspecies of this question relates to exactly how you can plot a contour map for a spatial estimate. The obvious idea is to take a point estimate (like your mean or median spatial field) and draw a contour map on that. But this is problematic because it does not take into account the uncertainty in your estimate. A contour on a map indicates a line that separates two levels of a field, but if you do not know the value of the field exactly, you cannot separate it precisely. Bolin and Lindgren have constructed a…
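One way to make the problem concrete is to carry uncertainty through the computation: instead of contouring a single point estimate, take many draws of the field and ask, at each location, how confidently it can be assigned to one side of the contour. The toy sketch below uses simulated draws and is only an illustration of the problem, not Bolin and Lindgren’s actual construction:

```python
# Toy illustration: a contour drawn on a point estimate hides how much of
# the map we genuinely cannot classify. We fake "posterior draws" of a
# smooth 2D field and compute, per grid cell, P(field > contour level).
# This is NOT Bolin and Lindgren's method, just a sketch of the problem.
import numpy as np

n, ndraws, level = 50, 200, 0.0
xs = np.linspace(-3, 3, n)
X, Y = np.meshgrid(xs, xs)
mean_field = np.sin(X) * np.cos(Y)     # stand-in for a mean/median estimate

rng = np.random.default_rng(1)
draws = mean_field + rng.normal(0, 0.3, (ndraws, n, n))  # fake draws

p_exceed = (draws > level).mean(axis=0)   # P(field > level) at each cell

# Cells we cannot confidently place on either side of the contour:
uncertain = (p_exceed > 0.05) & (p_exceed < 0.95)
print(f"{uncertain.mean():.0%} of the map is ambiguous at this level.")
```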
Original Post: Contour as a verb

“Quality control” (rather than “hypothesis testing” or “inference” or “discovery”) as a better metaphor for the statistical processes of science

I’ve been thinking for a while that the default ways in which statisticians think about science—and in which scientists think about statistics—are seriously flawed, sometimes even crippling scientific inquiry in some subfields, in the way that bad philosophy can do. Here’s what I think are some of the default modes of thought:
– Hypothesis testing, in which the purpose of data collection and analysis is to rule out a null hypothesis (typically, zero effect and zero systematic error) that nobody believes in the first place;
– Inference, which can work in the context of some well-defined problems (for example, studying trends in public opinion or estimating parameters within an agreed-upon model in pharmacology), but which doesn’t capture the idea of learning from the unexpected;
– Discovery, which sounds great but which runs aground when thinking about science as a routine process: can…
Original Post: “Quality control” (rather than “hypothesis testing” or “inference” or “discovery”) as a better metaphor for the statistical processes of science

My favorite definition of statistical significance

Posted by Andrew on 28 October 2017, 1:08 pm. From my 2009 paper with Weakliem: Throughout, we use the term statistically significant in the conventional way, to mean that an estimate is at least two standard errors away from some “null hypothesis” or prespecified value that would indicate no effect present. An estimate is statistically insignificant if the observed value could reasonably be explained by simple chance variation, much in the way that a sequence of 20 coin tosses might happen to come up 8 heads and 12 tails; we would say that this result is not statistically significantly different from chance. More precisely, the observed proportion of heads is 40 percent but with a standard error of 11 percent—thus, the data are less than two standard errors away from the null hypothesis of 50…
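The coin-flip arithmetic in that definition is easy to verify with the usual binomial standard-error formula; a quick check:

```python
# Checking the coin-toss arithmetic in the definition above:
# 8 heads in 20 tosses, against the null hypothesis of a fair coin.
import math

n, heads, p_null = 20, 8, 0.5
p_hat = heads / n                            # observed proportion: 0.40
se = math.sqrt(p_null * (1 - p_null) / n)    # standard error: about 0.11
z = (p_hat - p_null) / se                    # about -0.89 standard errors

print(f"estimate {p_hat:.2f}, se {se:.2f}, z = {z:.2f}")
# |z| < 2, so by the conventional definition this result is not
# statistically significantly different from chance.
```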
Original Post: My favorite definition of statistical significance

A stunned Dyson

Posted by Andrew on 22 July 2017, 9:30 am. Terry Martin writes: I ran into this quote and thought you might enjoy it. It’s from p. 273 of Segre’s new biography of Fermi, The Pope of Physics: When Dyson met with him in 1953, Fermi welcomed him politely, but he quickly put aside the graphs he was being shown indicating agreement between theory and experiment. His verdict, as Dyson remembered, was “There are two ways of doing calculations in theoretical physics. One way, and this is the way I prefer, is to have a clear physical picture of the process you are calculating. The other way is to have a precise and self-consistent mathematical formalism. You have neither.” When a stunned Dyson tried to counter by emphasizing the agreement between experiment and the calculations, Fermi asked him how…
Original Post: A stunned Dyson