A continuous hinge function for statistical modeling

This comes up sometimes in my applied work: I want a continuous “hinge function,” something like the red curve above, connecting two straight lines in a smooth way. Why not include the sharp corner (in this case, the function y=-0.5x if x<0 or y=0.2x if x>0)? Two reasons. First, computation: Hamiltonian Monte Carlo can trip on discontinuities. Second, I want a smooth curve anyway, as I’d expect it to better describe reality. Indeed, the linear parts of the curve are themselves typically only approximations. So, when I’m putting this together, I don’t want to take two lines and then stitch them together with some sort of quadratic or cubic, creating a piecewise function with three parts. I just want one simple formula that asymptotes to the lines, as in the above picture. As I said, this problem comes up occasion,…
Original Post: A continuous hinge function for statistical modeling
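
One way to get a single formula that asymptotes to both lines is to add a scaled softplus term to the left-hand line. The sketch below is a minimal illustration under assumptions, not necessarily the formula from the post: the names `smooth_hinge`, `x0`, `a`, and `delta` are hypothetical, and the slopes default to the -0.5 and 0.2 from the example above. Smaller values of `delta` give a sharper corner.

```python
import math


def smooth_hinge(x, x0=0.0, a=0.0, b0=-0.5, b1=0.2, delta=0.5):
    """Smooth curve that approaches slope b0 far left of x0 and slope b1 far right.

    Assumed form: a + b0*(x - x0) + (b1 - b0) * delta * log(1 + exp((x - x0)/delta)).
    """
    z = (x - x0) / delta
    # Numerically stable softplus: log(1 + exp(z)) = max(z, 0) + log1p(exp(-|z|)).
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return a + b0 * (x - x0) + (b1 - b0) * delta * softplus


if __name__ == "__main__":
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(x, smooth_hinge(x))
```

Note that with this form the curve does not pass through the corner point itself: at `x = x0` it sits above `a` by `(b1 - b0) * delta * log(2)`, which shrinks as `delta` goes to zero.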

Using Stan for week-by-week updating of estimated soccer team abilities

Milad Kharratzadeh shares this analysis of the English Premier League during last year’s famous season. He fit a Bayesian model using Stan, and the R markdown file is here. The analysis has three interesting features:
1. Team ability is allowed to vary continuously throughout the season; thus, once the season is over, you can see an estimate of which teams were improving or declining.
2. But that’s not what is shown in the plot above. Rather, the plot above shows estimated team abilities after the model was fit to prior information plus week 1 data alone; prior information plus data from weeks 1 and 2; prior information plus data from weeks 1, 2, and 3; etc. For example, look at the plot for surprise victor Leicester City: after a few games, the team is already estimated to be in the…
Original Post: Using Stan for week-by-week updating of estimated soccer team abilities

Splines in Stan! (including priors that enforce smoothness)

Milad Kharratzadeh shares a new case study. This could be useful to a lot of people. Just for example, here’s the last section of the document, which shows how to simulate the data and fit the model graphed above:

Location of Knots and the Choice of Priors

In practical problems, it is not always clear how to choose the number/location of the knots. Choosing too many/too few knots may lead to overfitting/underfitting. In this part, we introduce a prior that alleviates the problems associated with the choice of number/locations of the knots to a great extent. Let us start with a simple observation. For any given set of knots, and any B-spline order, we have: $$ \sum_{i} B_{i,k}(x) = 1. $$ The proof is simple and can be done by induction. This means that if the B-spline coefficients, $a_i = a$, are…
Original Post: Splines in Stan! (including priors that enforce smoothness)
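
The sum-to-one property quoted above can be checked numerically. The sketch below is an illustration in plain Python rather than code from the case study: it uses the standard Cox-de Boor recursion, assumes the convention that order k means degree k - 1, and assumes a clamped knot vector (boundary knots repeated k times) so that the property holds across the whole interval. The names `bspline_basis` and `basis_sum` are hypothetical.

```python
def bspline_basis(i, k, knots, x):
    """Value at x of the i-th B-spline basis function of order k (degree k - 1)."""
    if k == 1:
        # Order 1: indicator function of the half-open knot interval.
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k - 1] - knots[i]
    if d1 > 0:  # skip terms with zero-width support (repeated knots)
        left = (x - knots[i]) / d1 * bspline_basis(i, k - 1, knots, x)
    d2 = knots[i + k] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + k] - x) / d2 * bspline_basis(i + 1, k - 1, knots, x)
    return left + right


def basis_sum(k, knots, x):
    """Sum of all order-k basis functions defined on the knot vector, evaluated at x."""
    n_basis = len(knots) - k
    return sum(bspline_basis(i, k, knots, x) for i in range(n_basis))


if __name__ == "__main__":
    order = 4  # cubic B-splines
    knots = [0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4]  # clamped knot vector on [0, 4]
    for x in (0.0, 0.5, 1.0, 2.7, 3.999):
        print(x, basis_sum(order, knots, x))
```

Because the order-1 functions use half-open intervals, this sketch evaluates to zero exactly at the right boundary knot; points strictly inside the interval are the ones to check.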

StanCon 2017 Schedule

Posted by Daniel on 11 January 2017, 12:39 pm

The first Stan Conference is next Saturday, January 21, 2017! If you haven’t registered, here’s the link: https://stancon2017.eventbrite.com

I wouldn’t wait until the last minute; we might sell out before you’re able to grab a ticket. We’re up to 125 registrants now. (If we have any tickets left, they are $400 at the door.)

Schedule. January 21, 2017.

Time                   What
7:30 AM – 8:45 AM      Registration and breakfast
8:45 AM – 9:00 AM      Opening statements
9:00 AM – 10:00 AM     Dev talk: Andrew Gelman: “10 Things I Hate About Stan”
10:00 AM – 10:30 AM    Coffee
10:30 AM – 12:00 PM    Contributed talks:
                       Jonathan Auerbach, Rob Trangucci: “Twelve Cities: Does lowering speed limits save pedestrian lives?”
                       Milad Kharratzadeh: “Hierarchical Bayesian Modeling of the English Premier League”
                       Victor Lei, Nathan Sanders, Abigail Dawson: “Advertising Attribution Modeling…
Original Post: StanCon 2017 Schedule

R packages interfacing with Stan: brms

Posted by Jonah on 10 January 2017, 8:45 pm

Over on the Stan users mailing list I (Jonah) recently posted about our new document providing guidelines for developing R packages interfacing with Stan. As I say in the post and guidelines, we (the Stan team) are excited to see the emergence of some very cool packages developed by our users. One of these packages is Paul Bürkner’s brms. Paul is currently working on his PhD in statistics at the University of Münster, having previously studied psychology and mathematics at the universities of Münster and Hagen (Germany). Here is Paul writing about brms:

The R package brms implements a wide variety of Bayesian regression models using extended lme4 formula syntax and Stan for the model fitting. It has been on CRAN for about one and a…
Original Post: R packages interfacing with Stan: brms

Stan 2.14 released for R and Python; fixes bug with sampler

Stan 2.14 is out and it fixes the sampler bug in Stan versions 2.10 through 2.13.

Critical update

It’s critical to update to Stan 2.14. See: The other interfaces will update when you update CmdStan.

The process

After Michael Betancourt diagnosed the bug, it didn’t take long for him to generate a test statistic so we can test this going forward, then submit a pull request for the patch and new test. I code reviewed that and made sure a clean checkout did the right thing, and then we merged. We had a few other fixes in, including one from Mitzi Morris that completed the compound declare-define feature. Then Mitzi and Daniel built the releases for the Stan math library, the core Stan C++ library, and then…
Original Post: Stan 2.14 released for R and Python; fixes bug with sampler

Michael found the bug in Stan’s new sampler

Gotcha! Michael found the bug! That was a lot of effort, during which time he produced ten pages of dense LaTeX to help Daniel and me understand the algorithm enough to help debug (we’re trying to write a bunch of these algorithmic details up for a more general audience, so stay tuned). So what was the issue? In Michael’s own words:

There were actually two bugs. The first is that the right subtree needs its own rho in order to compute the correct termination criterion. The second is that in order to compute the termination criterion you need the points on the left and right of each subtree (the orientation of left and right relative to forwards and backwards depends on which direction you’re trying to extend the trajectory in). That means…
Original Post: Michael found the bug in Stan’s new sampler

Stan 2.10 through Stan 2.13 produce biased samples

[Update: rolled in info from comments.]

After all of our nagging of people to use samplers that produce unbiased samples, we are mortified to have to announce that Stan versions 2.10 through 2.13 produce biased samples.

The issue

Thanks to Matthew R. Becker for noticing this with a simple bivariate example and for filing the issue with a reproducible example:

The change to Stan

Stan 2.10 changed the NUTS algorithm from using slice sampling along a Hamiltonian trajectory to a new algorithm that uses categorical sampling of points along the trajectory proportional to the density (plus biases to the second half of the chain, which is a subtle aspect of the original NUTS algorithm). The new approach is described here:

From Michael Betancourt on Stan’s users group: Let me temper the panic by saying that the bias is relatively small…
Original Post: Stan 2.10 through Stan 2.13 produce biased samples
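
To make "categorical sampling of points along the trajectory proportional to the density" concrete, here is a deliberately simplified sketch. It is not Stan's implementation: it leaves out the tree building, the termination criterion, and the bias toward the later part of the trajectory, and the names `trajectory_weights` and `sample_point` are hypothetical. It only shows the core step of turning log densities at trajectory points into selection probabilities.

```python
import math
import random


def trajectory_weights(log_densities):
    """Normalized selection probabilities proportional to exp(log density)."""
    # Subtract the maximum before exponentiating to avoid underflow/overflow.
    m = max(log_densities)
    unnormalized = [math.exp(ld - m) for ld in log_densities]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def sample_point(points, log_densities, rng=random):
    """Draw one trajectory point with probability proportional to its density."""
    weights = trajectory_weights(log_densities)
    return rng.choices(points, weights=weights, k=1)[0]


if __name__ == "__main__":
    points = ["q0", "q1", "q2"]          # placeholder trajectory states
    log_densities = [-1.0, -0.2, -3.5]   # made-up values for illustration
    print(trajectory_weights(log_densities))
    print(sample_point(points, log_densities))
```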

Using Stan in an agent-based model: Simulation suggests that a market could be useful for building public consensus on climate change

Jonathan Gilligan writes: I’m writing to let you know about a preprint that uses Stan in what I think is a novel manner: Two graduate students and I developed an agent-based simulation of a prediction market for climate, in which traders buy and sell securities that are essentially bets on what the global average temperature will be at some future time. We use Stan as part of the model: at every time step, simulated traders acquire new information and use this information to update their statistical models of climate processes and generate predictions about the future. J.J. Nay, M. Van der Linden, and J.M. Gilligan, Betting and Belief: Prediction Markets and Attribution of Climate Change, (code here). ABSTRACT: Despite much scientific evidence, a large fraction of the American public doubts that greenhouse gases are causing global warming. We present a…
Original Post: Using Stan in an agent-based model: Simulation suggests that a market could be useful for building public consensus on climate change

Interesting epi paper using Stan

Jon Zelner writes: Just thought I’d send along this paper by Justin Lessler et al. Thought it was both clever & useful and a nice ad for using Stan for epidemiological work. Basically, what this paper is about is estimating the true prevalence and case fatality ratio of MERS-CoV [Middle East Respiratory Syndrome Coronavirus Infection] using data collected via a…
Original Post: Interesting epi paper using Stan