The Real World Interactive Learning Tutorial

Alekh and I have been polishing the Real World Interactive Learning tutorial for ICML 2017 on Sunday. This tutorial should be of pretty wide interest. For data scientists, we are crossing a threshold into easy use of interactive learning, while for researchers interactive learning is plausibly the most important frontier of understanding. Great progress on both the theory and especially on practical systems has been made since an earlier NIPS 2013 tutorial. Please join us if you are interested.
Original Post: The Real World Interactive Learning Tutorial

It’s hard to know what to say about an observational comparison that doesn’t control for key differences between treatment and control groups, chili pepper edition

Posted by Andrew on 3 August 2017, 9:55 am. Jonathan Falk points to this article and writes: Thoughts? I would have liked to have seen the data matched on age, rather than simply using age in a Cox regression, since I suspect that’s what’s really going on here. The non-chili eaters were much older, and I suspect that the failure to interact age, or at least specify the age effect more finely, has a gigantic impact here, especially since the raw inclusion of age raised the hazard ratio dramatically. Having controlled for Blood, Sugar, and Sex, the residual must be Magik. My reply: Yes, also they need to interact age x sex, and smoking is another…
Original Post: It’s hard to know what to say about an observational comparison that doesn’t control for key differences between treatment and control groups, chili pepper edition

Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by recent nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for. Nor have I asked him. However, given that similar material I and others have tried out in the past did not deliver what was hoped for, I am anticipating the same here and speculating about why. I have real doubts about such material actually enabling others to meaningfully interpret Bayesian analyses, let alone implement them themselves. For instance, in a conversation last year with David Spiegelhalter, his take was that some material I had could easily be followed by many, but the concepts that material was trying to get across were very subtle and few would have the background to connect to them. On the other…
Original Post: Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

Giving feedback indirectly by invoking a hypothetical reviewer

Posted by Andrew on 2 August 2017, 9:44 am. Ethan Bolker points us to this discussion on “How can I avoid being ‘the negative one’ when giving feedback on statistics?”, which begins: Results get sent around a group of biological collaborators for feedback. Comments come back from the senior members of the group about the implications of the results, possible extensions, etc. I look at the results and I tend not to be as good at the “big picture” stuff (I’m a relatively junior member of the team), but I’m reasonably good with statistics (and that’s my main role), so I look at the details. Sometimes I think to myself “I don’t think those conclusions are remotely justified by the data”. How can I give honest feedback in a way that doesn’t come…
Original Post: Giving feedback indirectly by invoking a hypothetical reviewer

DePaul University, School of Computing: Instructor in Data Science

Seeking Instructors in Data Science, with expertise and teaching experience in data science with an emphasis in computational statistics, data mining, data visualization, pattern recognition or machine learning.
Original Post: DePaul University, School of Computing: Instructor in Data Science

Machine Learning the Future Class

This spring, I taught a class on Machine Learning the Future at Cornell Tech covering a number of advanced topics in machine learning including online learning, joint (structured) prediction, active learning, contextual bandit learning, logarithmic time prediction, and parallel learning. Each of these classes was recorded from the laptop via Zoom and I just uploaded the recordings to YouTube. In some ways, this class is a followup to the large scale learning class I taught with Yann LeCun 4 years ago. The videos for that class were taken down(*) so these lectures both update and replace shared subjects as well as having some new subjects. Much of this material is fairly close to research, so to assist other machine learning lecturers around the world in digesting the material, I’ve made all the source available as…
Original Post: Machine Learning the Future Class

Taking Data Journalism Seriously

This is a bit of a followup to our recent review of “Everybody Lies.” While writing the review I searched the blog for mentions of Seth Stephens-Davidowitz, and I came across this post from last year, concerning a claim made by author J. D. Vance that “the middle part of America is more religious than the South.” This was a claim that stunned me, given that I’d seen some of the statistics on the topic, and it turned out that Vance had been mistaken, that he’d used some unadjusted numbers which were not directly comparable when looking at different regions of the country. It was an interesting statistical example, also interesting in that claims made in data journalism, just like claims made in academic research, can get all sorts of uncritical publicity. People just trust the numbers, which makes sense…
Original Post: Taking Data Journalism Seriously

Should computer programming be a prerequisite for learning statistics?

Posted by Andrew on 14 May 2017, 9:09 am. This came up in a recent discussion thread, I can’t remember exactly where. A commenter pointed out, correctly, that you shouldn’t require computer programming as a prerequisite for a statistics course: there’s lots in statistics that can be learned without knowing how to program. Sure, if you can program you can do a better job of statistics, but you can still do a bit with just point and click. Here’s what I will say, though: In the twentieth century, it was said that if you wanted to do statistics, you had to be a bit of a mathematician, whether you wanted to or not. In the twenty-first century, if you want to do statistics, you have to be a bit of…
Original Post: Should computer programming be a prerequisite for learning statistics?