A Python program for multivariate missing-data imputation that works on large datasets!?

Alex Stenlake and Ranjit Lall write about a program they wrote for imputing missing data: Strategies for analyzing missing data have become increasingly sophisticated in recent years, most notably with the growing popularity of the best-practice technique of multiple imputation. However, existing algorithms for implementing multiple imputation suffer from limited computational efficiency, scalability, and capacity to exploit complex interactions among large numbers of variables. These shortcomings render them poorly suited to the emerging era of “Big Data” in the social and natural sciences. Drawing on new advances in machine learning, we have developed an easy-to-use Python program – MIDAS (Multiple Imputation with Denoising Autoencoders) – that leverages principles of Bayesian nonparametrics to deliver a fast, scalable, and high-performance implementation of multiple imputation. MIDAS employs a class of unsupervised neural networks known as denoising autoencoders, which are capable of producing complex,…
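
The post doesn't include code, but the core trick behind a denoising-autoencoder imputer is easy to sketch: corrupt the observed entries, train a network to reconstruct them, then read imputations off the reconstructions. Here is a minimal PyTorch illustration of that idea; it is not the MIDAS codebase or API, and the network shape, corruption rate, and training settings are invented for the sketch.

```python
# Minimal denoising-autoencoder imputer -- an illustration of the general
# idea, NOT the MIDAS API.  All settings here are arbitrary sketch choices.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)

# Toy data: n observations, p correlated variables, ~20% missing at random.
n, p = 1000, 10
z = rng.normal(size=(n, 3))
X = z @ rng.normal(size=(3, p)) + 0.1 * rng.normal(size=(n, p))
miss = rng.random((n, p)) < 0.2

X_t = torch.tensor(X, dtype=torch.float32)
obs = torch.tensor(~miss, dtype=torch.float32)                   # 1 where observed
X_filled = torch.where(obs.bool(), X_t, torch.zeros_like(X_t))   # zero-fill NAs

net = nn.Sequential(
    nn.Linear(p, 32), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(32, 8), nn.ReLU(),
    nn.Linear(8, 32), nn.ReLU(),
    nn.Linear(32, p),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    # Denoising step: zero out a further 30% of the *observed* entries and
    # train the network to reconstruct them; the loss is computed only
    # where the truth is actually observed.
    drop = (torch.rand(n, p) < 0.3) & obs.bool()
    corrupted = torch.where(drop, torch.zeros_like(X_filled), X_filled)
    recon = net(corrupted)
    loss = (((recon - X_t) ** 2) * obs).sum() / obs.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Multiple imputation: dropout stays active (net is still in train mode),
# so repeated forward passes give *distinct* completed datasets -- this is,
# loosely, where the "multiple" comes from in this scheme.
with torch.no_grad():
    imputations = [torch.where(obs.bool(), X_t, net(X_filled)) for _ in range(5)]
```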

“Handling Multiplicity in Neuroimaging through Bayesian Lenses with Hierarchical Modeling”

Donald Williams points us to this new paper by Gang Chen, Yaqiong Xiao, Paul Taylor, Tracy Riggins, Fengji Geng, Elizabeth Redcay, and Robert Cox: In neuroimaging, the multiplicity issue may sneak into data analysis through several channels . . . One widely recognized aspect of multiplicity, multiple testing, occurs when the investigator fits a separate model for each voxel in the brain. However, multiplicity also occurs when the investigator conducts multiple comparisons within a model, tests two tails of a t-test separately when prior information is unavailable about the directionality, and branches in the analytic pipelines. . . . More fundamentally, the adoption of dichotomous decisions through sharp thresholding under NHST may not be appropriate when the null hypothesis itself is not pragmatically relevant because the effect of interest takes a continuum instead of discrete values and is not expected…
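
The paper's remedy for multiplicity is to fit one hierarchical model across all regions rather than thousands of separate thresholded tests. As a rough illustration of the principle (not the paper's actual model), here is empirical-Bayes partial pooling of per-region effect estimates:

```python
# Rough empirical-Bayes illustration of partial pooling across brain
# regions, in the spirit of (but much simpler than) the paper's model.
# y[j] is a per-region effect estimate with standard error se[j].
import numpy as np

rng = np.random.default_rng(1)
J = 50
theta = rng.normal(0.0, 0.2, size=J)   # true region effects
se = np.full(J, 0.3)                   # per-region standard errors
y = theta + rng.normal(0.0, se)        # noisy per-region estimates

# Method-of-moments estimate of the between-region variance tau^2.
tau2 = max(y.var(ddof=1) - np.mean(se**2), 0.0)
mu = np.average(y, weights=1.0 / (se**2 + tau2))

# Posterior mean for each region: shrink y[j] toward mu, more strongly
# when the region's noise is large relative to tau.
shrink = tau2 / (tau2 + se**2)
theta_hat = mu + shrink * (y - mu)

# One model replaces J separate tests: extreme estimates are pulled in,
# and no per-region significance threshold is needed.
print(np.abs(y - theta).mean(), np.abs(theta_hat - theta).mean())
```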

A debate about robust standard errors: Perspective from an outsider

A colleague pointed me to a debate among some political science methodologists about robust standard errors, and I told him that the topic didn’t really interest me because I haven’t found a use for robust standard errors in my own work. My colleague urged me to look at the debate more carefully, though, so I did. But before getting to that, let me explain where I’m coming from. I won’t be trying to make the “Holy Roman Empire” argument that they’re not robust, not standard, and not an estimate of error. I’ll just say why I haven’t found those methods useful myself, and then I’ll get to the debate. The paradigmatic use case goes like this: You’re running a regression to estimate a causal effect. For simplicity suppose you have good identification and also suppose you have enough balance that…
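
For context, "robust" standard errors usually means the Huber-White sandwich estimator, which drops the constant-error-variance assumption of classical OLS. A minimal comparison in statsmodels, with toy data of my own (not from the debate):

```python
# Classical vs. heteroskedasticity-robust (sandwich) standard errors for
# the paradigmatic use case: OLS regression of an outcome on a treatment.
# Toy data and settings are invented for this sketch.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
treat = (rng.random(n) < 0.2).astype(float)        # unbalanced groups
# Error variance depends on treatment, so classical SEs are off.
y = 1.0 + 0.5 * treat + rng.normal(0, 1 + 3 * treat)

X = sm.add_constant(treat)
classical = sm.OLS(y, X).fit()              # assumes constant error variance
robust = sm.OLS(y, X).fit(cov_type="HC2")   # Huber-White sandwich estimator

print(classical.bse)  # classical standard errors
print(robust.bse)     # robust SEs: noticeably larger for the treatment term here
```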

The piranha problem in social psychology / behavioral economics: The “take a pill” model of science eats itself

[cat picture] A fundamental tenet of social psychology and behavioral economics, at least as presented in the news media and as taught and practiced in many business schools, is that small “nudges,” often the sorts of things that we might not think would affect us at all, can have big effects on behavior. Thus the claims that elections are decided by college football games and shark attacks, or that the subliminal flash of a smiley face can cause huge changes in attitudes toward immigration, or that single women were 20% more likely to vote for Barack Obama, or three times more likely to wear red clothing, during certain times of the month, or that standing in a certain position for two minutes can increase your power, or that being subliminally primed with certain words can make you walk faster or…

Orphan drugs and forking paths: I’d prefer a multilevel model but to be honest I’ve never fit such a model for this sort of problem

Amos Elberg writes: I’m writing to let you know about a drug trial you may find interesting from a statistical perspective. As you may know, the relatively recent “orphan drug” laws allow (basically) companies that can prove an off-patent drug treats an otherwise untreatable illness to obtain intellectual property protection for otherwise generic or dead drugs. This has led to a new business of trying large numbers of combinations of otherwise-unused drugs against a large number of untreatable illnesses, with a large number of success criteria. Charcot-Marie-Tooth (CMT) is a moderately rare genetic degenerative peripheral nerve disease with no known treatment. CMT causes the Schwann cells, which surround the peripheral nerves, to weaken and eventually die, leading to demyelination of the nerves, a loss of nerve conduction velocity, and an eventual loss of nerve efficacy. PXT3003 is a drug currently…

What I missed on fixed effects (plural).

In my [Keith] previous post, which criticised a published paper, the first author commented that they wanted some time to respond, and I agreed. I also suggested that if the response came in after most readers had moved on, I would re-post it as a new post pointing back to the previous one. So here we are. Now, there has been a lot of discussion on this blog about public versus private criticism and their cost-benefit trade-offs. One change I am making is to refer to the first, second, or third author rather than using names. I should also clarify that I have previously worked with the first and second authors (so they are not strangers) and that the first author posted the paper on my blog post (they brought it to my attention). Now my three main points in…

Using Mister P to get population estimates from respondent driven sampling

From one of our exams: A researcher at Columbia University’s School of Social Work wanted to estimate the prevalence of drug abuse problems among American Indians (Native Americans) living in New York City. From the Census, it was estimated that about 30,000 Indians live in the city, and the researcher had a budget to interview 400. She did not have a list of Indians in the city, and she obtained her sample as follows. She started with a list of 300 members of a local American Indian community organization, and took a random sample of 100 from this list. She interviewed these 100 persons and asked each of these to give her the names of other Indians in the city whom they knew. She asked each respondent to characterize him/herself and also the people on the list on a 1-10…
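
This is the setup for multilevel regression and poststratification: model the outcome within population cells, then weight the cell estimates by known population counts. The poststratification step itself is one line; a sketch with invented numbers:

```python
# Minimal poststratification step: combine per-cell prevalence estimates
# (which in full MRP would come from a fitted multilevel regression) with
# known population cell counts.  All numbers are invented for illustration.
import numpy as np

# Population counts N_c for each poststratification cell (e.g. age x sex),
# summing to the Census total of ~30,000.
N = np.array([6000, 9000, 7500, 7500])

# Estimated prevalence of drug-abuse problems within each cell, as would
# be produced by the fitted model.
theta = np.array([0.10, 0.04, 0.07, 0.05])

# Population estimate: cell estimates weighted by cell sizes.
estimate = np.sum(N * theta) / np.sum(N)
print(f"estimated prevalence: {estimate:.3f}")   # 0.062 for these numbers
```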

Multilevel modeling: What it can and cannot do

Today’s post reminded me of this article from 2005: We illustrate the strengths and limitations of multilevel modeling through an example of the prediction of home radon levels in U.S. counties. . . . Compared with the two classical estimates (no pooling and complete pooling), the inferences from the multilevel models are more reasonable. . . . Although the specific assumptions of model (1) could be questioned or improved, it would be difficult to argue against the use of multilevel modeling for the purpose of estimating radon levels within counties. . . . Perhaps the clearest advantage of multilevel models comes in prediction. In our example we can predict the radon levels for new houses in an existing county or a new county. . . . We can use cross-validation to formally demonstrate the benefits of multilevel modeling. . .…
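
The comparison in the article reduces to one precision-weighting formula: each county's multilevel estimate is a compromise between its own mean (no pooling) and the grand mean (complete pooling). A quick simulated sketch of that formula, not the actual radon data:

```python
# No pooling vs. complete pooling vs. partial pooling of county means,
# using the standard precision-weighted formula.  Data are simulated,
# not the actual radon measurements.
import numpy as np

rng = np.random.default_rng(3)
J = 20
n_j = rng.integers(2, 40, size=J)        # houses measured per county
alpha = rng.normal(1.0, 0.3, size=J)     # true county levels (log radon)
sigma_y, sigma_alpha = 0.8, 0.3          # within- / between-county sd

# Observed county means, noisier where fewer houses were measured.
ybar = np.array([rng.normal(a, sigma_y / np.sqrt(n)) for a, n in zip(alpha, n_j)])

no_pool = ybar                                       # separate mean per county
complete_pool = np.full(J, np.average(ybar, weights=n_j))  # one grand mean

# Partial pooling: precision-weighted compromise between the two;
# small counties get pulled further toward the grand mean.
w = (n_j / sigma_y**2) / (n_j / sigma_y**2 + 1 / sigma_alpha**2)
partial_pool = w * ybar + (1 - w) * complete_pool

for est, name in [(no_pool, "no pooling"),
                  (complete_pool, "complete pooling"),
                  (partial_pool, "partial pooling")]:
    print(f"{name:17s} rmse = {np.sqrt(np.mean((est - alpha)**2)):.3f}")
```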

Adding a predictor can increase the residual variance!

Chao Zhang writes: When I want to know the contribution of a predictor in a multilevel model, I often calculate how much the added predictor reduces the total variance in the random effects. For example, if the between-group variance is 0.7 and the residual variance is 0.9 in the null model, and adding the predictor reduces the residual variance to 0.7, then VPC = (0.7 + 0.9 – 0.7 – 0.7) / (0.7 + 0.9) = 0.125, so I conclude that the new predictor explained 12.5% more of the total variance than the null model. I guess researchers sometimes do this when they need a sort of effect-size measure. However, now I have a case…
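
Zhang's arithmetic, spelled out, together with the catch in the post's title: nothing guarantees that each variance component shrinks when a predictor is added, so this "variance explained" can come out negative. (The second set of numbers below is hypothetical.)

```python
# Chao Zhang's variance-explained arithmetic.  The variance components
# would come from fitted multilevel models; here we just plug in numbers.
def vpc(between0, resid0, between1, resid1):
    """Share of total random-effect variance explained by the added predictor."""
    total0 = between0 + resid0   # null model
    total1 = between1 + resid1   # model with the predictor
    return (total0 - total1) / total0

# Null model: between-group 0.7, residual 0.9.
# Adding the predictor: residual drops to 0.7, between-group unchanged.
print(vpc(0.7, 0.9, 0.7, 0.7))   # 0.125 -> "12.5% more variance explained"

# The catch in the post's title: a component can also be estimated *larger*
# after adding a predictor, making "variance explained" negative.
# Hypothetical numbers where the residual variance goes up:
print(vpc(0.7, 0.9, 0.7, 1.1))   # -0.125
```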

#NotAll4YearOlds

I think there’s something wrong with this op-ed by developmental psychologist Alison Gopnik, “4-year-olds don’t act like Trump,” which begins: The analogy is pervasive among his critics: Donald Trump is like a child. . . . But the analogy is profoundly wrong, and it’s unfair to children. The scientific developmental research of the past 30 years shows that Mr. Trump is utterly unlike a 4-year-old. Gopnik continues with a list of positive attributes, each of which, she asserts, is held by four-year-olds but not by the president: Four-year-olds care deeply about the truth. . . . But Mr. Trump doesn’t just lie; he seems not even to care whether his statements are true. Four-year-olds are insatiably curious. One study found that the average preschooler asks hundreds of questions per day. . . . Mr. Trump refuses to read and is…