Scraping Wikipedia Tables from Lists for Visualisation

Get WikiTables from Lists Recently I was asked to submit a short take-home challenge and I thought what better excuse for writing a quick blog post! It was on short notice so initially I stayed within the confines of my comfort zone and went for something safe and bland. However, I alleviated that rather fast; I guess you want to stand out a bit in a competitive setting. Note that it was a visualisation task, so the data scraping was just a necessary evil. On that note. I resorted to using Wikipedia as I was asked to visualise change in a certain x going back about 500 hundred years. Not many academic datasets go that far, so Wiki will have to do for our purposes. And once you are there, why only visualise half a millennium, let’s go from 1…
Original Post: Scraping Wikipedia Tables from Lists for Visualisation

Predicting Conflict Duration with (gg)plots using Keras

An Unlikely Pairing Last week, Marc Cohen from Google Cloud was on campus to give a hands-on workshop on image classification using TensorFlow. Consequently, I spent most of my time thinking about how I can incorporate image classifiers in my work. As my research is primarily on forecasting armed conflict duration, it’s not really straightforward to make a connection between the two. I mean, what are you going to do, analyse portraits of US presidents to see whether you can predict military use of force based on their facial features? Also, I’m sure someone, somewhere has already done that, given this. For the purposes of this blog post, I went ahead with the second most ridiculous idea that popped into my mind: why don’t I generate images from my research and use them to answer my own research question? This…
Original Post: Predicting Conflict Duration with (gg)plots using Keras

Quantitative Story Telling with Shiny: Gender Bias in Syllabi

LSE IR Gender and Diversity Project Two shinydashboard posts in a row, that’s a first. As I mentioned on Twitter, I’m not really this productive; rather, the apps had been on the proverbial shelf for a while and I’m just releasing them now. In fact, this is one of my earlier works: quantifying the gender imbalance as it manifests itself in the LSE International Relations (IR) reading lists. You can access the app here. This is a much larger project that I got involved during its second year, so I’m just visualising other peoples’ hard work. The recentness of my contribution to the project was clearly on display when I amused my audience by saying cross-sectional feminism instead of inter-sectional. Are you a statistician or what? Baby steps. In a nutshell, about twenty or so PhD candidates at the department…
Original Post: Quantitative Story Telling with Shiny: Gender Bias in Syllabi

Visualising US Voting Records with shinydashboard

Introducing adavis My second ever post on this blog was on introducing adamap, a Shiny app that maps Americans for Democratic Action voting scores (the so-called Liberal Quotient) between 1947-2015. It was built with highcharter, and hence it was nicely interactive but quite slow. I wanted to switch to another package since, and when I eventually ran into statebins, I knew what had to be done. I was certain that statebins would definitely add some oomph to the design, but because it’s so easy to implement, I had some spare time to do other things. As it is often the case, one thing led to the other, and I came to the conclusion that the revamped app should feature one plot from every major graphics package. Of course, a strict implementation of that statement would be quite difficult, so I…
Original Post: Visualising US Voting Records with shinydashboard

Mining Game of Thrones Scripts with R

Quantitative Text Analysis Part II I meant to showcase the quanteda package in my previous post on the Weinstein Effect but had to switch to tidytext at the last minute. Today I will make good on that promise. quanteda is developed by Ken Benoit and maintained by Kohei Watanabe – go LSE! On that note, the first 2018 LondonR meeting will be taking place at the LSE on January 16, so do drop by if you happen to be around. quanteda v1.0 will be unveiled there as well. Given that I have already used the data I had in mind, I have been trying to identify another interesting (and hopefully less depressing) dataset for this particular calling. Then it snowed in London, and the dire consequences of this supernatural phenomenon were covered extensively by the r/CasualUK/. One thing led to…
Original Post: Mining Game of Thrones Scripts with R

A Tidytext Analysis of the Weinstein Effect

Quantifying He-Said, She-Said: Newspaper Reporting I have been meaning to get into quantitative text analysis for a while. I initially planned this post to feature a different package (that I wanted to showcase), however I ran into some problems with their .json parsing methods and currently waiting for the issue to be solved on their end. The great thing about doing data science with R is that there are multiple avenues leading you to the same destination, so let’s take advantage of that. My initial inspiration came from David Robinson’s post on gendered verbs. I remember sending it around and thinking it was quite cool. Turns out he was building on Julia Silge’s earlier post on gender pronouns. I see that post and I go, ‘what a gorgeous looking ggplot theme!’. So. Neat. Praise be the open source gods, the…
Original Post: A Tidytext Analysis of the Weinstein Effect