Machine Learning the Future Class

This spring, I taught a class on Machine Learning the Future at Cornell Tech covering a number of advanced topics in machine learning including online learning, joint (structured) prediction, active learning, contextual bandit learning, logarithmic time prediction, and parallel learning. Each of these classes was recorded from the laptop via Zoom and I just uploaded the recordings to YouTube. In some ways, this class is a followup to the large scale learning class I taught with Yann LeCun 4 years ago. The videos for that class were taken down(*) so these lectures both update and replace shared subjects as well as having some new subjects. Much of this material is fairly close to research, so to assist other machine learning lecturers around the world in digesting the material, I’ve made all the source available as…
Original Post: Machine Learning the Future Class

Taking Data Journalism Seriously

This is a bit of a followup to our recent review of “Everybody Lies.” While writing the review I searched the blog for mentions of Seth Stephens-Davidowitz, and I came across this post from last year, concerning a claim made by author J. D. Vance that “the middle part of America is more religious than the South.” This was a claim that stunned me, given that I’d seen some of the statistics on the topic, and it turned out that Vance had been mistaken, that he’d used some unadjusted numbers which were not directly comparable when looking at different regions of the country. It was an interesting statistical example, also interesting in that claims made in data journalism, just like claims made in academic research, can get all sorts of uncritical publicity. People just trust the numbers, which makes sense…
Original Post: Taking Data Journalism Seriously

Should computer programming be a prerequisite for learning statistics?

Posted by Andrew on 14 May 2017, 9:09 am. This came up in a recent discussion thread, I can’t remember exactly where. A commenter pointed out, correctly, that you shouldn’t require computer programming as a prerequisite for a statistics course: there’s lots in statistics that can be learned without knowing how to program. Sure, if you can program you can do a better job of statistics, but you can still do a bit with just point and click. Here’s what I will say, though: In the twentieth century, it was said that if you wanted to do statistics, you had to be a bit of a mathematician, whether you want to or not. In the twenty-first century, if you want to do statistics, you have to be a bit of…
Original Post: Should computer programming be a prerequisite for learning statistics?

“P-hacking” and the intention-to-cheat effect

Posted by Andrew on 10 May 2017, 5:53 pm. I’m a big fan of the work of Uri Simonsohn and his collaborators, but I don’t like the term “p-hacking” because it can be taken to imply an intention to cheat. The image of p-hacking is of a researcher trying test after test on the data until reaching the magic “p less than .05.” But, as Eric Loken and I discuss in our paper on the garden of forking paths, multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. I worry that the widespread use of the term “p-hacking” gives two wrong impressions: First, it implies that the many researchers who use p-values incorrectly are cheating or “hacking,” even though I suspect they’re mostly…
Original Post: “P-hacking” and the intention-to-cheat effect

We fiddle while Rome burns: p-value edition

Raghu Parthasarathy presents a wonderfully clear example of disastrous p-value-based reasoning that he saw in a conference presentation. Here’s Raghu: Consider, for example, some tumorous cells that we can treat with drugs 1 and 2, either alone or in combination. We can make measurements of growth under our various drug treatment conditions. Suppose our measurements give us the following graph: . . . from which we tell the following story: When administered on their own, drugs 1 and 2 are ineffective — tumor growth isn’t statistically different than the control cells (p > 0.05, 2 sample t-test). However, when the drugs are administered together, they clearly affect the cancer (p < 0.05); in fact, the p-value is very small (0.002!). This indicates a clear synergy between the two drugs: together they have a much stronger effect than each alone does.…
Original Post: We fiddle while Rome burns: p-value edition
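
To see why the story in Raghu's example is shaky, here is a minimal sketch in Python. All the numbers are made up for illustration (they are not Raghu's data), and it uses a simple normal approximation in place of the 2-sample t-test mentioned in the excerpt. It shows that two drugs with purely additive effects can each look "not significant" alone while the combination looks "highly significant," with no synergy at all.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for an estimate, using a normal (z) approximation."""
    z = abs(estimate) / std_error
    return math.erfc(z / math.sqrt(2))


# Hypothetical mean tumor growth per condition (arbitrary units, made up).
control = 10.0
drug1 = 8.5
drug2 = 8.5
both = 7.0  # exactly the sum of the two single-drug reductions

# Assumed standard error of each difference from control (made up).
se_diff = 1.0

p_drug1 = two_sided_p(drug1 - control, se_diff)
p_drug2 = two_sided_p(drug2 - control, se_diff)
p_both = two_sided_p(both - control, se_diff)

# Synergy would appear as an interaction: a combined effect that goes
# beyond the sum of the two single-drug effects.
interaction = (both - control) - ((drug1 - control) + (drug2 - control))

print(f"drug 1 alone: p = {p_drug1:.3f}")
print(f"drug 2 alone: p = {p_drug2:.3f}")
print(f"both drugs:   p = {p_both:.4f}")
print(f"interaction (synergy) estimate: {interaction}")
```

With these assumed numbers, each drug alone gives p of about 0.13 and the combination gives p of about 0.003, yet the interaction estimate is exactly zero. Comparing p-values across conditions says nothing about synergy; the interaction itself would have to be estimated.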

Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs. In the meantime, I keep posting the stuff they send me, as part of my desperate effort to empty my inbox. 1. From Lewis: “Should Students Assessed as Needing Remedial Mathematics Take College-Level Quantitative Courses Instead? A Randomized Controlled Trial,” by A. W. Logue, Mari Watanabe-Rose, and Daniel Douglas, which begins: Many college students never take, or do not pass, required remedial mathematics courses theorized to increase college-level performance. Some colleges and states are therefore instituting policies allowing students to take college-level courses without first taking remedial courses. However, no experiments have compared the effectiveness of these approaches, and other data are mixed. We randomly assigned 907 students to (a) remedial elementary algebra, (b) that course with workshops, or (c) college-level statistics with…
Original Post: Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Hark, hark! the p-value at heaven’s gate sings

Three different people pointed me to this post, in which food researcher and business school professor Brian Wansink advises Ph.D. students to “never say no”: When a research idea comes up, check it out, put some time into it and you might get some success. I like that advice and I agree with it. Or, at least, this approach worked for me when I was a student and it continues to work for me now, and my favorite students are those who follow this approach. That said, there could be some selection bias here, that the students who say Yes to new projects are the ones who are more likely to be able to make use of such opportunities. Maybe the students who say No would just end up getting distracted and making no progress, were they to follow this…
Original Post: Hark, hark! the p-value at heaven’s gate sings

Avoiding only the shadow knowing the motivating problem of a post.

Given that I am starting to make some posts to this blog (again), I was pleased to run across a YouTube video of Xiao-Li Meng being interviewed on the same topic by Suzanne Smith, the Director of the Center for Writing and Communicating Ideas. One thing I picked up was to make the problem being addressed in any communication very clear, as there should be a motivating problem – the challenges of problem recognising and problem defining should not be overlooked. The other thing was that the motivating problem should be located in the sub-field(s) of statistics that address such problems. The second is easier as my motivating problems mostly involve ways to better grasp insight(s) from theoretical statistics in order to better apply statistics in applications – so the sub-fields are theory and application, going primarily from theory…
Original Post: Avoiding only the shadow knowing the motivating problem of a post.