There was something about them that made me uneasy, some longing and at the same time some deadly fear – Dracula (Stoker, Bram)

Twitter is a very good source of inspiration. A few days ago I came across this: The tweet refers to a presentation (in Spanish) available here, which is a very concise and well-illustrated document about the state of the art of text mining in R. I discovered several libraries there that I will try to use in the future. In this experiment I have used one of them: the syuzhet package. As the documentation says, this package extracts sentiment and sentiment-derived plot arcs from text using three sentiment dictionaries conveniently packaged for consumption by R users. Implemented dictionaries include syuzhet (default) developed in the Nebraska Literary Lab, afinn developed by Finn Arup Nielsen, bing developed by Minqing Hu and Bing…

Original Post: A Shiny App to Create Sentimental Tweets Based on Project Gutenberg Books

# R-bloggers

## Loading R Packages: library() or require()?

When I was an R newbie, I was taught to load packages by using the command library(package). In my Linear Models class, the instructor likes to use require(package). This made me wonder, are the commands interchangeable? What’s the difference, and which command should I use?

### Interchangeable commands . . .

The way most users will use these commands, most of the time, they are actually interchangeable. That is, if you are loading a library that has already been installed, and you are using the command outside of a function definition, then it makes no difference if you use “require” or “library.” They do the same thing.

### … Well, almost interchangeable

There are, though, a couple of important differences. The first one, and the most obvious, is what happens if you try to load a package that has not previously been…
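
The behaviour of the two commands on a missing package can be checked directly in base R. The following is a minimal sketch, not code from the original post; the package name `notARealPackage12345` is made up for illustration and is assumed not to be installed.

```r
# Hypothetical package name, assumed NOT to be installed on this machine.
pkg <- "notARealPackage12345"

# require() returns FALSE (and issues a warning) when the package is missing,
# so a script can test the result and carry on.
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
print(ok)

# library() signals an error instead, which stops a script unless it is caught.
result <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)
print(result)
```

Because `require()` reports failure through its return value, a later line that needs the package may fail far from the real cause, whereas `library()` fails at the point of loading.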

Original Post: Loading R Packages: library() or require()?

## Statistical Machine Learning with Microsoft ML

MicrosoftML is an R package for machine learning that works in tandem with the RevoScaleR package. (In order to use the MicrosoftML and RevoScaleR libraries, you need an installation of Microsoft Machine Learning Server or Microsoft R Client.) A great way to see what MicrosoftML can do is to take a look at the on-line book Machine Learning with the MicrosoftML Package by Ali Zaidi. The book includes worked examples on several topics:

- Exploratory data analysis and feature engineering
- Regression models
- Classification models for computer vision
- Convolutional neural networks for computer vision
- Natural language processing
- Transfer learning with pre-trained DNNs

The book is part of Ali’s in-person workshop “Statistical Machine Learning with MicrosoftML”, and you can find further materials including data and scripts at this Github repository. If you’d like to experience the workshop in person, Ali will be presenting it…

Original Post: Statistical Machine Learning with Microsoft ML

## The Return of Free Data and Possible Volatility Trading Subscription

This post will be about pulling free data from AlphaVantage, and gauging interest for a volatility trading subscription service. So first off, ever since the yahoos at Yahoo decided to turn off their free data, the world of free daily data has been in somewhat of a dark age. Well, thanks to Josh Ulrich (http://blog.fosstrading.com/2017/10/getsymbols-and-alpha-vantage.html#gpluscomments), Paul Teetor, and other R/Finance individuals, the latest edition of quantmod (which can be installed from CRAN) now contains a way to get free financial data from AlphaVantage since the year 2000, which is usually enough for most backtests, as that date predates the inception of most ETFs. First off, you need to go to AlphaVantage, register, and get an API key (https://www.alphavantage.co/support/#api-key). Once you do that, downloading data is simple, if slightly slow. Here’s how to do it. require(quantmod) getSymbols(‘SPY’,…

Original Post: The Return of Free Data and Possible Volatility Trading Subscription

## 14 data science Jobs for R users from around the world (2017-10-23)

To post your R job on the next post

Just visit this link and post a new R job to the R community. You can post a job for free (and there are also “featured job” options available for extra exposure).

Current R jobs

Job seekers: please follow the links below to learn more and apply for your R job of interest:

Featured Jobs

- Full-Time Data Scientist Cincinnati Reds – Posted by MichaelSchatz Anywhere 13 Oct 2017
- Freelance The R Foundation is looking for a volunteer to organize R’s contributed documentation R: The R Foundation – Posted by Tal Galili Anywhere 13 Oct 2017
- Full-Time Rays Research & Development Analyst Tampa Bay Rays – Posted by kferris10 Saint Petersburg Florida, United States 11 Oct 2017
- Full-Time PROGRAMMER/SOFTWARE DEVELOPMENT ENGINEER/COMPUTATIONAL AND MACHINE LEARNING SPECIALIST fchiava Cambridge Massachusetts, United States 10 Oct 2017
- Freelance R Shiny Developer – Work From Home Data…

Original Post: 14 data science Jobs for R users from around the world (2017-10-23)

## „One function to rule them all“ – visualization of regression models in #rstats w/ #sjPlot

I’m pleased to announce the latest update to my sjPlot package on CRAN. Besides some bug fixes and minor new features, the major update is a new function, plot_model(), which is both an enhancement and replacement of sjp.lm(), sjp.glm(), sjp.lmer(), sjp.glmer() and sjp.int(). The latter functions will be deprecated in the next updates and removed at some point in the future. plot_model() is a “generic” plot function that accepts many model objects, like lm, glm, lme, lmerMod etc. It offers various plot types, like estimates/coefficient plots (aka forest or dot-whisker plots), marginal effect plots, plots of interaction terms, and some diagnostic plots. In this blog post, I want to describe how to plot estimates as forest plots. The plot type is defined via the type argument. The default is type = “fe”, which means that fixed effects (model coefficients) are plotted. First, we fit a…

Original Post: „One function to rule them all“ – visualization of regression models in #rstats w/ #sjPlot

## 9th MilanoR meeting on November 20th: call for presentations!

MilanoR Staff is happy to announce the 9th MilanoR Meeting! The meeting will take place on November 20th, from 7pm to about 9:30 pm, in Mikamai (close to the Pasteur metro station) [save the date, more info soon]. This time we want to focus on a specific topic: data visualization with R. We are curious to see if there are interesting contributions about this topic within the community. So: have you built a gorgeous and smart visualization with R, or developed a package that handles some data viz stuff in a new way? Have you created a Shiny HTML widget or a dashboard that has something new to say? Do you feel you have something to contribute, or can you recommend someone? Send your contribution to admin[at]milanor[dot]net: you may present it at the 9th MilanoR meeting! If you want to contribute but you cannot attend…

Original Post: 9th MilanoR meeting on November 20th: call for presentations!

## Economic time series data quiz as a shiny app for mobile phones

Nowadays, a lot of interesting time series data is freely available that allows us to compare important economic, social and environmental trends across countries. I feel that one can learn a lot by surfing through the data sections on the websites of institutions like the Gapminder Foundation, the World Bank, or the OECD. At the same time, I am quite a big fan of learning facts with quiz questions. Since my internet search did not yield any apps or websites that present these interesting time series in the form of quizzes, I coded a bit in R and generated a Shiny app that creates such quizzes based on OECD data and some Eurostat data. Here is a screenshot: The quiz is hosted here: http://econ.mathematik.uni-ulm.de:4501/dataquiz/ Why not do some supervised learning for your biological neural network, and take a look? I would…

Original Post: Economic time series data quiz as a shiny app for mobile phones

## Who knew likelihood functions could be so pretty?

I just released a new iteration of simstudy (version 0.1.6), which fixes a bug or two and adds several spline related routines (available on CRAN). The previous post focused on using spline curves to generate data, so I won’t repeat myself here. And, apropos of nothing really – I thought I’d take the opportunity to do a simple simulation to briefly explore the likelihood function. It turns out if we generate lots of them, it can be pretty, and maybe provide a little insight. If a probability density (or mass) function is more or less forward-looking – answering the question of what is the probability of seeing some future outcome based on some known probability model, the likelihood function is essentially backward-looking. The likelihood takes the data as given or already observed – and allows us to assess how likely…
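
To make the "backward-looking" idea concrete, here is a minimal base R sketch. It is not the author's simulation and does not use simstudy; the seed, sample size, true mean of 2, known standard deviation of 1, and grid of candidate means are all assumptions chosen for illustration.

```r
# Assumed setup: 50 observations from a normal distribution with
# true mean 2 and known standard deviation 1.
set.seed(123)
y <- rnorm(50, mean = 2, sd = 1)

# Treat the data as fixed and evaluate the log-likelihood over a grid
# of candidate values for the unknown mean.
mu_grid <- seq(0, 4, by = 0.01)
loglik <- sapply(mu_grid, function(mu) {
  sum(dnorm(y, mean = mu, sd = 1, log = TRUE))
})

# The candidate with the highest log-likelihood is the grid approximation
# to the maximum likelihood estimate; with known sd it sits next to mean(y).
mu_hat <- mu_grid[which.max(loglik)]
print(c(grid_mle = mu_hat, sample_mean = mean(y)))
```

Calling `plot(mu_grid, loglik, type = "l")` afterwards would draw the curve for this one data set; repeating the whole exercise over many simulated data sets gives a family of such curves.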

Original Post: Who knew likelihood functions could be so pretty?

## Demo Week: class(Monday) <- tidyquant

We’ve got an exciting week ahead of us at Business Science: we’re launching our first ever Business Science Demo Week. Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. First up is tidyquant, our flagship package that’s useful for financial and time series analysis. Here we go!

Six reasons to use tidyquant:

- Getting web data from Yahoo! Finance, FRED Database, Quandl and more
- Tidy application of financial and time series functions from xts, zoo, quantmod, TTR and PerformanceAnalytics
- Graphing: Beautiful themes and financial geoms (e.g. geom_ma)
- Aggregating portfolios
- Financial performance analysis and portfolio attribution metrics
- Great base for financial and time series analysis:…

Original Post: Demo Week: class(Monday) <- tidyquant