Where do I learn about log_sum_exp, log1p, lccdf, and other numerical analysis tricks?

Richard McElreath inquires: I was helping a colleague recently fix his MATLAB code by using log_sum_exp and log1m tricks. The natural question he had was, “where do you learn this stuff?” I checked Numerical Recipes, but the statistical parts are actually pretty thin (at least in my 1994 edition). Do you know of any books/papers that describe these techniques?

I’d love to hear this blog’s answers to these questions. I replied that I learned numerical analysis “on the street” through HMM implementations. HMMs are also a good introduction to the kind of dynamic-programming technique I used for that Poisson-binomial implementation we discussed (which we’ll build into Stan one of these days—it’ll be a fun project for someone). Then I picked up the rest through a hodge-podge of case-based learning. “Numerical analysis” is the name of the field and the textbooks…
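The two tricks mentioned above are easy to sketch. This is a minimal Python illustration (not anyone's production code): log_sum_exp avoids overflow/underflow by factoring out the max before exponentiating, and log1m(x) = log(1 − x) stays accurate near zero by going through log1p.

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(exp(x1) + ... + exp(xn)).

    Naively exponentiating large (or very negative) log-scale values
    overflows or underflows; subtracting the max first keeps every
    exponent <= 0, so exp() never overflows.
    """
    m = max(xs)
    if m == -math.inf:  # all terms are exp(-inf) == 0
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log1m(x):
    """log(1 - x), accurate for x near 0, via the library's log1p."""
    return math.log1p(-x)

# The naive version fails: math.exp(1000) overflows to inf.
print(log_sum_exp([1000.0, 1000.0]))  # ≈ 1000.6931 (= 1000 + log 2)
print(log1m(1e-20))                   # ≈ -1e-20, not 0.0
```

The same identities are what Stan's built-in `log_sum_exp` and `log1m` functions implement internally.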
Original Post: Where do I learn about log_sum_exp, log1p, lccdf, and other numerical analysis tricks?

Three new domain-specific (embedded) languages with a Stan backend

One is an accident. Two is a coincidence. Three is a pattern. Perhaps it’s no coincidence that there are three new interfaces that use Stan’s C++ implementation of adaptive Hamiltonian Monte Carlo (currently an updated version of the no-U-turn sampler).

ScalaStan embeds a Stan-like language in Scala. It’s a Scala package largely (if not entirely) written by Joe Wingbermuehle. [GitHub link]

tmbstan lets you fit TMB models with Stan. It’s an R package listing Kasper Kristensen as author. [CRAN link]

SlicStan is a “blockless” and self-optimizing version of Stan. It’s a standalone language coded in F# written by Maria Gorinova. [pdf language spec]

These are in contrast with systems that entirely reimplement a version of the no-U-turn sampler, such as PyMC3, ADMB, and NONMEM.
Original Post: Three new domain-specific (embedded) languages with a Stan backend

Interactive visualizations of sampling and GP regression

You really don’t want to miss Chi Feng’s absolutely wonderful interactive demos.

(1) Markov chain Monte Carlo sampling

I believe this is exactly what Andrew was asking for a few Stan meetings ago: this tool lets you explore a range of sampling algorithms, including random-walk Metropolis, Hamiltonian Monte Carlo, and NUTS, operating over a range of two-dimensional distributions (standard normal, banana, donut, multimodal, and one squiggly one). You can control both the settings of the algorithms and the settings of the visualizations. As you run it, it even collects the draws into a sample, which it summarizes as marginal histograms.

Source code: the demo is implemented in JavaScript, with the source code on Chi Feng’s GitHub organization.

Wish list: 3D (glasses or virtual reality headset), multiple chains in parallel, scatterplot breadcrumbs, Gibbs sampler…
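The simplest algorithm the demo animates, random-walk Metropolis, fits in a few lines. Here is a hedged Python sketch (not the demo's JavaScript) targeting one of its distributions, the two-dimensional standard normal: propose a Gaussian jitter of the current point, then accept with probability min(1, p(proposal)/p(current)).

```python
import math
import random

random.seed(0)

def log_p(x, y):
    # Log density (up to a constant) of the 2-D standard normal,
    # one of the demo's target distributions.
    return -0.5 * (x * x + y * y)

def metropolis(steps=20_000, scale=1.0):
    x, y = 0.0, 0.0
    draws = []
    for _ in range(steps):
        # Random-walk proposal: Gaussian jitter around the current point.
        xp = x + random.gauss(0.0, scale)
        yp = y + random.gauss(0.0, scale)
        # Metropolis accept/reject on the log scale.
        if math.log(random.random()) < log_p(xp, yp) - log_p(x, y):
            x, y = xp, yp
        draws.append((x, y))  # rejected proposals repeat the current point
    return draws

draws = metropolis()
mean_x = sum(d[0] for d in draws) / len(draws)
print(round(mean_x, 2))  # close to 0 for the standard normal
```

The marginal histograms the demo accumulates are just histograms of the first and second coordinates of `draws`.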
Original Post: Interactive visualizations of sampling and GP regression

Stan Roundup, 10 November 2017

We’re in the heart of the academic season and there’s a lot going on. James Ramsey reported a critical performance regression bug in Stan 2.17 (this affects the latest CmdStan and PyStan, not the latest RStan). Sean Talts and Daniel Lee diagnosed the underlying problem as being with the change from char* to std::string arguments—you can’t pass char* and rely on the implicit std::string constructor without the penalty of memory allocation and copying. The reversion goes back to how things were before with const char* arguments. Ben Goodrich is working with Sean Talts to cherry-pick the performance regression fix to Stan that led to a very slow 2.17 release for the other interfaces. RStan 2.17 should be out soon, and it will be the last pre-C++11 release. We’ve already opened the C++11 floodgates on our development branches (yoo-hoo!). Quentin F.…
Original Post: Stan Roundup, 10 November 2017

Stan Roundup, 27 October 2017

I missed two weeks and haven’t had time to create a dedicated blog for Stan yet, so we’re still here. This is only the update for this week. From now on, I’m going to try to concentrate on things that are done, not just in progress, so you can get a better feel for the pace of things getting done. Not one, but two new devs! This is my favorite news to post, hence the exclamation. Matthijs Vákár from the University of Oxford joined the dev team. Matthijs’s first major commit is a set of GLM functions for negative binomial with log link (2–6 times speedup), normal linear regression with identity link (4–5 times), Poisson with log link (factor of 7), and Bernoulli with logit link (9 times). Wow! And he didn’t just implement the straight-line case—this is a fully vectorized…
Original Post: Stan Roundup, 27 October 2017

Halifax, NS, Stan talk and course Thu 19 Oct

Halifax, here we come. I (Bob, not Andrew) am going to be giving a talk on Stan, and then Mitzi and I will be teaching a course on Stan after that. The public is invited, though space is limited for the course. Here are the details if you happen to be in the Maritime provinces.

TALK: Stan: A Probabilistic Programming Language for Bayesian Inference
Date: Thursday, October 19, 2017
Time: 10 am
Location: Slonim Conference Room (#430), Goldberg Computer Science Building, Dalhousie University, 6050 University Avenue, Halifax

Abstract

I’ll describe Stan’s probabilistic programming language and how it’s used, including:
- blocks for data, parameters, and predictive quantities
- transforms of constrained parameters to unconstrained spaces, with automatic Jacobian corrections
- automatic computation of first- and higher-order derivatives
- operator, function, and linear algebra library
- vectorized density functions, cumulative distributions, and random number generators
- user-defined functions
- (stiff)…
Original Post: Halifax, NS, Stan talk and course Thu 19 Oct

Stan Biweekly Roundup, 6 October 2017

I missed last week and almost forgot to add this week’s. Jonah Gabry returned from teaching a one-week course for a special EU research institute in Spain. Mitzi Morris has been knocking out bug fixes for the parser and some pull requests to refactor the underlying type inference to clear the way for tuples, sparse matrices, and higher-order functions. Michael Betancourt, with help from Sean Talts, spent last week teaching an intro course on Stan to physicists. Charles Margossian attended and said it went really well. Ben Goodrich, in addition to handling a slew of RStan issues, has been diving into the math library to define derivatives for Bessel functions. Aki Vehtari has put us in touch with the MXNet developers at Amazon UK, and we had our first conference call with them to talk about adding sparse matrix functionality…
Original Post: Stan Biweekly Roundup, 6 October 2017

Will Stanton hit 61 home runs this season?

[edit: Juho Kokkala corrected my homework. Thanks! I updated the post. Also see some further elaboration in my reply to Andrew’s comment. As Andrew likes to say…]

So far, Giancarlo Stanton has hit 56 home runs in 555 at bats over 149 games. Miami has 10 games left to play. What’s the chance he’ll hit 61 or more home runs? Let’s make a simple back-of-the-envelope Bayesian model and see what the posterior event probability estimate is.

Sampling notation

A simple model that assumes a constant home run rate per at bat with a uniform (conjugate) prior.

The data we’ve seen so far is 56 home runs in 555 at bats, so that gives us our likelihood. Now we need to simulate the rest of the season and compute event probabilities. We start by assuming the at-bats in the rest of…
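The posterior simulation described above can be sketched in a few lines of Python. This is not the post's actual code, and `REMAINING_AB` is a guess on my part (roughly 4 at bats per game for 10 games), not a figure from the post. With a uniform Beta(1, 1) prior and 56 home runs in 555 at bats, the conjugate posterior on the per-at-bat rate is Beta(57, 500); each simulation draws a rate, simulates the remaining at bats as Bernoulli trials, and checks whether the season total reaches 61.

```python
import random

random.seed(1234)

HR, AB = 56, 555       # season to date, from the post
REMAINING_AB = 40      # assumed: ~4 at bats per game over 10 games

def simulate(draws=20_000):
    hits_61 = 0
    for _ in range(draws):
        # Posterior is Beta(1 + HR, 1 + AB - HR) under the uniform prior.
        theta = random.betavariate(1 + HR, 1 + AB - HR)
        # Simulate the rest of the season as REMAINING_AB Bernoulli trials.
        future_hr = sum(random.random() < theta for _ in range(REMAINING_AB))
        hits_61 += (HR + future_hr >= 61)
    return hits_61 / draws  # Monte Carlo estimate of Pr(total >= 61)

print(round(simulate(), 3))
```

The answer is quite sensitive to the assumed `REMAINING_AB`, which is exactly the assumption the post goes on to discuss.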
Original Post: Will Stanton hit 61 home runs this season?

Stan Weekly Roundup, 25 August 2017

This week, the entire Columbia portion of the Stan team is out of the office and we didn’t have an in-person/online meeting this Thursday. Mitzi and I are on vacation, and everyone else is either teaching, TA-ing, or attending the Stan course. Luckily for this report, there’s been some great activity out of the meeting even if I don’t have a report of what everyone around Columbia has been up to. If a picture’s really worth a thousand words, this is the longest report yet. Ari Hartikainen has produced some absolutely beautiful parallel coordinate plots of HMC divergences* for multiple parameters. The divergent transitions are shown in green and the lines connect a single draw. The top plot is unnormalized, whereas the bottom scales all parameters to a [0, 1] range. You can follow the ongoing discussion on the forum…
Original Post: Stan Weekly Roundup, 25 August 2017