Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by some nicely done recent videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for, nor have I asked him. However, given that similar material I and others have tried out in the past did not deliver what was hoped for, I am anticipating the same here and speculating about why. I have real doubts about such material actually enabling others to meaningfully interpret Bayesian analyses, let alone implement them themselves. For instance, in a conversation last year with David Spiegelhalter, his take was that some material I had could easily be followed by many, but the concepts that material was trying to get across were very subtle and few would have the background to connect to them. On the other…
Original Post: Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the work I do involves “backtesting” investment strategies, where I simulate the performance of an investment portfolio using historical data on returns. The primary summary statistics I generate from this sort of analysis are mean return (both arithmetic and geometric) and standard deviation (called “volatility” in my industry). Basically the idea is to select strategies that are likely to generate high returns given the amount of volatility they experience. However, historical market data are very noisy, with stock portfolios generating an average monthly return of around…
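The summary statistics Garrett describes can be sketched in a few lines. This is my own illustration, not code from the post; the return series is hypothetical, and "volatility" here is simply the sample standard deviation of monthly returns:

```python
# Illustrative sketch (not from the post): arithmetic mean, geometric mean,
# and volatility (sample standard deviation) from a series of monthly returns.
import math

monthly_returns = [0.012, -0.034, 0.021, 0.007, -0.015, 0.028]  # hypothetical data

n = len(monthly_returns)
arith_mean = sum(monthly_returns) / n
# geometric mean: the constant monthly return that compounds to the same total
growth = math.prod(1 + r for r in monthly_returns)
geom_mean = growth ** (1 / n) - 1
# "volatility" in industry usage: sample standard deviation of the returns
variance = sum((r - arith_mean) ** 2 for r in monthly_returns) / (n - 1)
volatility = math.sqrt(variance)
```

Note the geometric mean is never above the arithmetic mean; the gap grows with volatility, which is one reason noisy backtests can flatter a strategy.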
Original Post: Died in the Wool

“Bayes factor”: where the term came from, and some references to why I generally hate it

Posted by Andrew on 21 July 2017, 9:47 am. Someone asked: Do you know when this term was coined or by whom? Kass and Raftery’s use of the term as the title of their 1995 paper suggests that it was still novel then, but I have not noticed in the paper any information about where it started. I replied: According to Etz and Wagenmakers (2016), “The term ‘Bayes factor’ comes from Good, who attributes the introduction of the term to Turing, who simply called it the ‘factor.’” They refer to Good (1988) and Fienberg (2006) for historical review. I generally hate Bayes factors myself, for reasons discussed at a technical level in our Bayesian Data Analysis book (see chapter 7 of the third edition). Or,…
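For readers who have not met the object being named: a Bayes factor is the ratio of the marginal likelihoods of the data under two models. A minimal worked example (mine, not from the post): k heads in n coin flips, comparing the point null θ = 1/2 against θ ~ Uniform(0,1), for which the marginal likelihood integrates to 1/(n+1):

```python
# Minimal illustration of a Bayes factor: ratio of marginal likelihoods.
from math import comb

def bayes_factor_uniform_vs_half(k, n):
    """BF_10 for k heads in n flips: theta ~ Uniform(0,1) vs theta = 1/2."""
    # Under H1: integral of C(n,k) theta^k (1-theta)^(n-k) d theta = 1/(n+1)
    m1 = 1 / (n + 1)
    # Under H0: binomial likelihood at theta = 1/2
    m0 = comb(n, k) * 0.5 ** n
    return m1 / m0

bf = bayes_factor_uniform_vs_half(7, 10)  # about 0.78: data mildly favor the point null
```

Note how sensitive m1 is to the choice of prior on θ, which is one of the standard objections to Bayes factors that the post alludes to.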
Original Post: “Bayes factor”: where the term came from, and some references to why I generally hate it

Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version of R and the latest version of Stan.) If problems occur please join the stan-users group and post any questions. It’s important that all participants get Stan running and bring their laptops to the course. Class structure and example topics for the three days:

Day 1: Foundations
- Foundations of Bayesian inference
- Foundations of Bayesian computation with Markov chain Monte Carlo
- Intro to Stan with hands-on exercises
- Real-life Stan
- Bayesian workflow

Day 2: Linear and Generalized Linear Models
- Foundations of Bayesian regression
- Fitting GLMs in Stan (logistic regression, Poisson regression)
- Diagnosing model misfit using…
Original Post: Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. A common conceptual error is that researchers take the rejection of a straw-man null as evidence in favor of their preferred alternative. A standard mode of operation goes like this: p < 0.05 is taken as strong evidence against the null hypothesis, p > 0.15 is taken as evidence in favor of the null, and p near 0.10 is taken either as weak evidence for an effect or as evidence of a weak effect. Unfortunately, none of those inferences is generally appropriate: a…
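To see concretely why treating p > 0.15 as evidence for the null fails, here is a quick simulation of my own (not from the paper): studies of a genuinely real but modest effect, 0.2 standard deviations with n = 50 per study, produce p > 0.15 roughly half the time:

```python
# Illustration (not from the paper): a real effect often yields large p-values,
# so "p > 0.15" cannot be read as evidence for the null.
import math, random

random.seed(1)

def two_sided_p(z):
    # two-sided p-value from a standard-normal z statistic
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

n, effect, n_sims = 50, 0.2, 2000
high_p = 0
for _ in range(n_sims):
    xbar = random.gauss(effect, 1 / math.sqrt(n))  # sample mean, known sd = 1
    z = xbar * math.sqrt(n)
    if two_sided_p(z) > 0.15:
        high_p += 1
frac = high_p / n_sims  # roughly half of these true-effect studies "fail to reject"
```

With power this low, "non-rejection" says almost nothing about whether the effect exists, which is exactly the misinterpretation the excerpt warns against.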
Original Post: Some natural solutions to the p-value communication problem—and why they won’t work.

A continuous hinge function for statistical modeling

This comes up sometimes in my applied work: I want a continuous “hinge function,” something like the red curve above, connecting two straight lines in a smooth way. Why not include the sharp corner (in this case, the function y=-0.5x if x<0 or y=0.2x if x>0)? Two reasons. First, computation: Hamiltonian Monte Carlo can trip on discontinuities. Second, I want a smooth curve anyway, as I’d expect it to better describe reality. Indeed, the linear parts of the curve are themselves typically only approximations. So, when I’m putting this together, I don’t want to take two lines and then stitch them together with some sort of quadratic or cubic, creating a piecewise function with three parts. I just want one simple formula that asymptotes to the lines, as in the above picture. As I said, this problem comes up occasionally,…
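One formula with the desired behavior, sketched here under my own parameterization rather than the one in the post, uses the softplus function: y(x) = a·x + (b − a)·δ·log(1 + exp(x/δ)). This approaches y = a·x as x → −∞ and y = b·x as x → +∞, with δ controlling how sharp the bend is:

```python
# Sketch of a smooth hinge via softplus (my parameterization, not necessarily
# the formula from the post). Slopes a and b match the example in the text.
import math

def smooth_hinge(x, a=-0.5, b=0.2, delta=0.1):
    """Smoothly interpolates between y = a*x (x << 0) and y = b*x (x >> 0).
    Smaller delta gives a sharper corner."""
    u = x / delta
    # numerically stable softplus: log(1 + exp(u)) without overflow
    softplus = u + math.log1p(math.exp(-u)) if u > 0 else math.log1p(math.exp(u))
    return a * x + (b - a) * delta * softplus
```

Because softplus is infinitely differentiable, the resulting curve has no corner for Hamiltonian Monte Carlo to trip on, and δ can even be treated as a parameter to estimate.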
Original Post: A continuous hinge function for statistical modeling

Causal inference using Bayesian additive regression trees: some questions and answers

[cat picture] Rachael Meager writes: We’re working on a policy analysis project. Last year we spoke about individual treatment effects, which is the direction we want to go in. At the time you suggested BART [Bayesian additive regression trees; these are not averages of tree models as are usually set up; rather, the key is that many little nonlinear tree models are being summed; in that sense, BART is more like a nonparametric discrete version of a spline model. —AG]. But there are 2 drawbacks of using BART for this project. (1) BART predicts the outcome, not the individual treatment effect, although those are obviously related and there has been some discussion of this in the econ literature. (2) It will be hard for us to back out the covariate combinations / interactions that predict the outcomes / treatment…
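The bracketed note's "many little nonlinear tree models being summed" can be made concrete. The sketch below is emphatically not BART, which places priors on the trees and averages over posterior draws; it is greedy stump boosting, shown only as a loose non-Bayesian analogy of the same sum-of-small-trees structure:

```python
# Loose illustration of the "sum of many small trees" idea. This is greedy
# boosting with depth-1 trees (stumps), NOT BART: BART puts priors on the
# trees and infers them by MCMC rather than fitting them greedily.

def fit_stump(xs, residuals):
    """Best single-split regression stump by squared error."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x, s=split, l=lmean, r=rmean: l if x <= s else r

def boost(xs, ys, n_trees=50, shrink=0.3):
    """Sum of shrunken stumps, each fit to the current residuals."""
    stumps, residuals = [], list(ys)
    for _ in range(n_trees):
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        residuals = [r - shrink * stump(x) for x, r in zip(xs, residuals)]
    return lambda x: sum(shrink * s(x) for s in stumps)

xs = [i / 10 for i in range(30)]
ys = [x * x for x in xs]  # smooth target approximated by a sum of step functions
f = boost(xs, ys)
```

The point of the analogy: each tree is a crude step function, but the sum of many of them approximates a smooth curve, much as a spline basis does.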
Original Post: Causal inference using Bayesian additive regression trees: some questions and answers

Using Stan for week-by-week updating of estimated soccer team abilities

Milad Kharratzadeh shares this analysis of the English Premier League during last year’s famous season. He fit a Bayesian model using Stan, and the R markdown file is here. The analysis has three interesting features: 1. Team ability is allowed to continuously vary throughout the season; thus, once the season is over, you can see an estimate of which teams were improving or declining. 2. But that’s not what is shown in the plot above. Rather, the plot above shows estimated team abilities after the model was fit to prior information plus week 1 data alone; prior information plus data from weeks 1 and 2; prior information plus data from weeks 1, 2, and 3; etc. For example, look at the plot for surprise victor Leicester City: after a few games, the team is already estimated to be in the…
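The cumulative refitting scheme in point 2 can be sketched without Stan. The following is a much cruder stand-in of my own (the case study's actual model is richer and fit by MCMC): simulate score differences from fixed team abilities, then re-estimate abilities from weeks 1..w by ridge-penalized least squares, where the ridge penalty stands in for a zero-centered prior:

```python
# Crude sketch of week-by-week re-estimation (NOT the Stan model from the
# case study): abilities estimated by ridge least squares on weeks 1..w.
import random

random.seed(42)
n_teams, n_weeks = 6, 8
ability = [random.gauss(0, 1) for _ in range(n_teams)]

# schedule: each week, random pairings; outcome is a noisy ability difference
weeks = []
for _ in range(n_weeks):
    order = random.sample(range(n_teams), n_teams)
    matches = []
    for i in range(0, n_teams, 2):
        home, away = order[i], order[i + 1]
        diff = ability[home] - ability[away] + random.gauss(0, 0.5)
        matches.append((home, away, diff))
    weeks.append(matches)

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def estimate(matches, ridge=1.0):
    """Ridge least squares; the penalty acts like a zero-centered prior and
    resolves the sum-to-zero indeterminacy of ability differences."""
    A = [[ridge if i == j else 0.0 for j in range(n_teams)] for i in range(n_teams)]
    b = [0.0] * n_teams
    for home, away, y in matches:
        A[home][home] += 1; A[away][away] += 1
        A[home][away] -= 1; A[away][home] -= 1
        b[home] += y; b[away] -= y
    return solve(A, b)

seen, trajectory = [], []
for matches in weeks:           # refit on weeks 1..w, as in the case study
    seen.extend(matches)
    trajectory.append(estimate(seen))
```

Plotting each row of `trajectory` against the week index gives the kind of sequentially updated ability curves the excerpt describes.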
Original Post: Using Stan for week-by-week updating of estimated soccer team abilities

Splines in Stan! (including priors that enforce smoothness)

Milad Kharratzadeh shares a new case study. This could be useful to a lot of people. Just for example, here’s the last section of the document, which shows how to simulate the data and fit the model graphed above: Location of Knots and the Choice of Priors. In practical problems, it is not always clear how to choose the number/location of the knots. Choosing too many/too few knots may lead to overfitting/underfitting. In this part, we introduce a prior that alleviates, to a great extent, the problems associated with the choice of number/location of the knots. Let us start with a simple observation. For any given set of knots, and any B-spline order, we have: $$\sum_{i} B_{i,k}(x) = 1.$$ The proof is simple and can be done by induction. This means that if the B-spline coefficients, $a_i = a$, are…
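The partition-of-unity identity above is easy to check numerically. As a sketch (mine, not code from the case study), the Cox–de Boor recursion defines the B-spline basis, and on a clamped knot vector the basis functions sum to 1 everywhere inside the knot range:

```python
# Numerical check (not from the case study) that B-spline basis functions
# form a partition of unity: sum_i B_{i,k}(x) = 1 inside the knot range.

def B(i, k, t, x):
    """Cox-de Boor recursion; k is the order (degree + 1), t the knot vector."""
    if k == 1:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    # 0/0 terms are taken as 0, the standard convention for repeated knots
    left = 0.0 if t[i + k - 1] == t[i] else \
        (x - t[i]) / (t[i + k - 1] - t[i]) * B(i, k - 1, t, x)
    right = 0.0 if t[i + k] == t[i + 1] else \
        (t[i + k] - x) / (t[i + k] - t[i + 1]) * B(i + 1, k - 1, t, x)
    return left + right

k = 4                                   # cubic B-splines
t = [0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4]  # clamped (repeated-endpoint) knots
n_basis = len(t) - k                    # 7 basis functions
sums = [sum(B(i, k, t, x) for i in range(n_basis))
        for x in (0.0, 0.5, 1.7, 3.99)]  # each sum should be 1
```

This identity is what makes the constant-coefficient observation in the excerpt work: if all coefficients equal a, the spline is identically a.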
Original Post: Splines in Stan! (including priors that enforce smoothness)

Accounting for variation and uncertainty

Posted by Andrew on 12 May 2017, 9:35 am. [cat picture] Yesterday I gave a list of the questions they’re asking me when I speak at the Journal of Accounting Research Conference. All kidding aside, I think that a conference of accountants is the perfect setting for a discussion of research integrity, as accounting is all about setting up institutions to enable trust. The challenge is that traditional accounting is deterministic: there’s a ledger and that’s that. In statistics, we talk all the time about accounting for variation and uncertainty. Maybe “accounting” is more than a metaphor here, and maybe there’s more of a connection to the traditional practices of accounting than I’d thought.
Original Post: Accounting for variation and uncertainty