The Statistical Crisis in Science—and How to Move Forward (my talk next Monday 6pm at Columbia)

I’m speaking Mon 13 Nov, 6pm, at the Low Library Rotunda at Columbia: The Statistical Crisis in Science—and How to Move Forward. Using examples ranging from elections to birthdays to policy analysis, Professor Andrew Gelman will discuss ways in which statistical methods have failed, leading to a replication crisis in much of science, as well as directions for improvements through statistical methods that make use of more information. Online reservation is required; follow the link (registration is currently full and closed). This will be a talk for a general audience.
Original Post: The Statistical Crisis in Science—and How to Move Forward (my talk next Monday 6pm at Columbia)

Why won’t you cheat with me?

“But I got some ground rules I’ve found to be sound rules and you’re not the one I’m exempting. Nonetheless, I confess it’s tempting.” – Jenny Toomey sings Franklin Bruno

It turns out that I did something a little controversial in last week’s post. As these things always go, it wasn’t the thing I was expecting to get pushback on, but rather what I thought was a fairly innocuous scaling of the prior. One commenter (and a few other people on other communication channels) pointed out that the dependence of the prior on the design didn’t seem kosher. Of course, we (Andrew, Mike and I) wrote a paper that was sort of about this a few months ago, but it’s one of those really interesting topics that we could probably all benefit from thinking more about. So in this…
Original Post: Why won’t you cheat with me?

The king must die

“And then there was Yodeling Elaine, the Queen of the Air. She had a dollar sign medallion about as big as a dinner plate around her neck and a tiny bubble of spittle around her nostril and a little rusty tear, for she had lassoed and lost another tipsy sailor.” – Tom Waits

It turns out I turned thirty-two and became unbearable. Some of you may feel, with an increasing sense of temporal dissonance, that I was already unbearable. (Fair point.) Others will wonder how I can look so good at my age. (Answer: Black Metal.) None of that matters to me because all I want to do is talk about the evils of marketing like the 90s were a vaguely good idea. (Narrator: “They were not. The concept of authenticity is just another way for the dominant culture to suppress more interesting…
Original Post: The king must die

Using Mister P to get population estimates from respondent driven sampling

From one of our exams: A researcher at Columbia University’s School of Social Work wanted to estimate the prevalence of drug abuse problems among American Indians (Native Americans) living in New York City. From the Census, it was estimated that about 30,000 Indians live in the city, and the researcher had a budget to interview 400. She did not have a list of Indians in the city, and she obtained her sample as follows. She started with a list of 300 members of a local American Indian community organization, and took a random sample of 100 from this list. She interviewed these 100 persons and asked each of these to give her the names of other Indians in the city whom they knew. She asked each respondent to characterize him/herself and also the people on the list on a 1-10…
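To make the “P” part of Mister P concrete, here is a minimal sketch in R of the poststratification step, using made-up cell counts and prevalence estimates (none of these numbers come from the exam problem, and the multilevel regression that would produce the cell estimates is not shown):

```r
# Sketch of poststratification: combine model-based estimates for each
# population cell with census counts for those cells.
# All numbers below are hypothetical, just to show the arithmetic.
cells <- data.frame(
  group     = c("18-34", "35-54", "55+"),
  N         = c(12000, 11000, 7000),   # made-up census counts per cell
  theta_hat = c(0.15, 0.10, 0.06)      # made-up model-based prevalence estimates
)

# Population estimate: census-weighted average of the cell estimates
with(cells, sum(N * theta_hat) / sum(N))
```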
Original Post: Using Mister P to get population estimates from respondent driven sampling

Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by some recent, nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for; nor have I asked him. However, given similar material that I and others have tried out in the past that did not deliver what was hoped for, I am anticipating that here and speculating about why. I have real doubts about such material actually enabling others to meaningfully interpret Bayesian analyses, let alone implement them themselves. For instance, in a conversation last year with David Spiegelhalter, his take was that some material I had could easily be followed by many, but the concepts that material was trying to get across were very subtle, and few would have the background to connect to them. On the other…
Original Post: Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the work I do involves “backtesting” investment strategies, where I simulate the performance of an investment portfolio using historical data on returns. The primary summary statistics I generate from this sort of analysis are mean return (both arithmetic and geometric) and standard deviation (called “volatility” in my industry). Basically the idea is to select strategies that are likely to generate high returns given the amount of volatility they experience. However, historical market data are very noisy, with stock portfolios generating an average monthly return of around…
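For readers outside finance, here is a small R sketch of the summary statistics Garrett describes, computed on simulated monthly returns (the return distribution below is made up for illustration, not an estimate from real market data):

```r
# Simulate 20 years of monthly returns; mean and sd are arbitrary choices.
set.seed(1)
monthly_returns <- rnorm(240, mean = 0.005, sd = 0.04)

arith_mean <- mean(monthly_returns)            # arithmetic mean monthly return
geom_mean  <- prod(1 + monthly_returns)^(1 / length(monthly_returns)) - 1
volatility <- sd(monthly_returns)              # "volatility" in industry jargon

# Common annualization conventions
annual_return <- (1 + geom_mean)^12 - 1
annual_vol    <- volatility * sqrt(12)

c(arith_mean, geom_mean, volatility, annual_return, annual_vol)
```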
Original Post: Died in the Wool

“Bayes factor”: where the term came from, and some references to why I generally hate it

Someone asked: Do you know when this term was coined or by whom? Kass and Raftery’s use of the term as the title of their 1995 paper suggests that it was still novel then, but I have not noticed in the paper any information about where it started. I replied: According to Etz and Wagenmakers (2016), “The term ‘Bayes factor’ comes from Good, who attributes the introduction of the term to Turing, who simply called it the ‘factor.’” They refer to Good (1988) and Fienberg (2006) for historical review. I generally hate Bayes factors myself, for reasons discussed at a technical level in our Bayesian Data Analysis book (see chapter 7 of the third edition). Or,…
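For reference (this is the standard definition, not anything specific to the history above), the Bayes factor comparing models $M_1$ and $M_2$ is the ratio of their marginal likelihoods:

$$ \mathrm{BF}_{12} = \frac{p(y \mid M_1)}{p(y \mid M_2)}, \qquad p(y \mid M_k) = \int p(y \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, d\theta_k, $$

and the strong dependence of these marginal likelihoods on the priors $p(\theta_k \mid M_k)$ is one of the standard objections.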
Original Post: “Bayes factor”: where the term came from, and some references to why I generally hate it

Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version of R and the latest version of Stan.) If problems occur please join the stan-users group and post any questions. It’s important that all participants get Stan running and bring their laptops to the course. Class structure and example topics for the three days:
Day 1 (Foundations): Foundations of Bayesian inference; Foundations of Bayesian computation with Markov chain Monte Carlo; Intro to Stan with hands-on exercises; Real-life Stan; Bayesian workflow.
Day 2 (Linear and Generalized Linear Models): Foundations of Bayesian regression; Fitting GLMs in Stan (logistic regression, Poisson regression); Diagnosing model misfit using…
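Not part of the course announcement, but one quick way to confirm an RStan installation is working before class is to compile and fit a tiny throwaway model; a minimal sketch along these lines (the model here is just a smoke test, not course material):

```r
# Minimal RStan smoke test: simulate fake data and fit a normal model.
library(rstan)

model_code <- "
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
"

y <- rnorm(20, mean = 5, sd = 2)  # fake data for the check
fit <- stan(model_code = model_code,
            data = list(N = length(y), y = y),
            chains = 2, iter = 1000)
print(fit)
```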
Original Post: Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. A common conceptual error is that researchers take the rejection of a straw-man null as evidence in favor of their preferred alternative. A standard mode of operation goes like this: p < 0.05 is taken as strong evidence against the null hypothesis, p > 0.15 is taken as evidence in favor of the null, and p near 0.10 is taken either as weak evidence for an effect or as evidence of a weak effect. Unfortunately, none of those inferences is generally appropriate: a…
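As a quick illustration of why p > 0.15 is not evidence in favor of the null (a simulation of my own, not taken from the paper): in a small study of a real but modest effect, non-significant p-values are the most common outcome.

```r
# Simulate many small studies in which the null is false (an assumed true
# standardized effect of 0.3) and see where the p-values land.
set.seed(123)
n_sims <- 10000
n <- 20
true_effect <- 0.3   # effect size chosen for illustration

p_values <- replicate(n_sims, t.test(rnorm(n, mean = true_effect, sd = 1))$p.value)

mean(p_values < 0.05)   # "strong evidence against the null"
mean(p_values > 0.15)   # "evidence in favor of the null" -- yet the null is false
```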
Original Post: Some natural solutions to the p-value communication problem—and why they won’t work.

A continuous hinge function for statistical modeling

This comes up sometimes in my applied work: I want a continuous “hinge function,” something like the red curve above, connecting two straight lines in a smooth way. Why not include the sharp corner (in this case, the function y = -0.5x if x < 0 or y = 0.2x if x > 0)? Two reasons. First, computation: Hamiltonian Monte Carlo can trip on discontinuities (here, the jump in the slope at the corner). Second, I want a smooth curve anyway, as I’d expect it to better describe reality. Indeed, the linear parts of the curve are themselves typically only approximations. So, when I’m putting this together, I don’t want to take two lines and then stitch them together with some sort of quadratic or cubic, creating a piecewise function with three parts. I just want one simple formula that asymptotes to the lines, as in the above picture. As I said, this problem comes up on occasion,…
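One simple construction that does this (a sketch of the general idea, not necessarily the exact parameterization used in the post) is to add a softplus term to the left-hand line, so the curve asymptotes to y = -0.5x far to the left and to y = 0.2x far to the right, with a scale parameter controlling how gradually it turns the corner:

```r
# Smooth hinge built from a softplus: behaves like a*x for x << 0 and like
# b*x for x >> 0; delta sets the width of the smooth corner.
hinge <- function(x, a = -0.5, b = 0.2, delta = 1) {
  a * x + (b - a) * delta * log1p(exp(x / delta))
}

curve(hinge(x), from = -10, to = 10, col = "red")
abline(a = 0, b = -0.5, lty = 2)  # left asymptote: y = -0.5x
abline(a = 0, b = 0.2, lty = 2)   # right asymptote: y = 0.2x
```

As delta goes to zero the curve approaches the kinked function itself, so the amount of smoothing can be treated as a modeling choice.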
Original Post: A continuous hinge function for statistical modeling