Obstacles to performance in parallel programming

Making your code run faster is often the primary goal when using parallel programming techniques in R, but sometimes the effort of converting your code to use a parallel framework leads only to disappointment, at least initially. Norman Matloff, author of Parallel Computing for Data Science: With Examples in R, C++ and CUDA, has shared chapter 2 of that book online, and it describes some of the issues that can lead to poor performance. They include: communications overhead, particularly an issue with fine-grained parallelism consisting of a very large number of relatively small tasks; load balance, where the computing resources aren’t contributing equally to the problem; impacts from use of RAM and virtual memory, such as cache misses and page faults; network effects, such as latency and bandwidth, that impact performance and communication overhead; interprocess conflicts and thread scheduling; data access and…
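
The first two obstacles, communications overhead and load balance, can be made concrete with a toy cost model. What follows is only a rough sketch and is not taken from Matloff's chapter: the function name estimated_runtime, its parameters, and all the numbers are hypothetical, chosen purely for illustration. It is written in Python rather than R to keep it language-neutral. The model assumes a manager that sends chunks of tasks to workers one message at a time, with a fixed overhead per message, and that the busiest worker determines the compute time.

```python
import math

def estimated_runtime(n_tasks, task_cost, n_workers, chunk_size, overhead_per_message):
    """Toy upper bound on runtime (all costs in microseconds).

    Hypothetical model, for illustration only: every chunk of tasks costs
    one message with a fixed overhead, and messages are sent serially.
    """
    n_messages = math.ceil(n_tasks / chunk_size)
    communication = n_messages * overhead_per_message
    # Chunks are dealt out evenly; the worker holding the most chunks
    # finishes last and therefore sets the compute time.
    chunks_on_busiest_worker = math.ceil(n_messages / n_workers)
    compute = chunks_on_busiest_worker * chunk_size * task_cost
    return communication + compute

# Assumed workload: 100,000 tasks of 10 microseconds each on 4 workers,
# with 1,000 microseconds of overhead per message.
serial = 100_000 * 10
for chunk_size in (1, 25_000, 50_000):
    parallel = estimated_runtime(100_000, 10, 4, chunk_size, 1_000)
    print(chunk_size, parallel, "faster" if parallel < serial else "slower")
```

Under these assumed numbers, one task per message is about 100 times slower than running serially, because the messaging cost swamps the work. Four large chunks give roughly a fourfold speedup. Two even larger chunks leave half the workers idle, so the speedup drops to about twofold: fewer messages, but worse load balance.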
Original Post: Obstacles to performance in parallel programming

PayPal: Applied Research Scientist (AI-ML R&D / NLP / Deep Learning)

Seeking an Applied Research Scientist to work on deep learning research for multiple data science applications within the company. There will be access to a huge amount of internal data and lots of opportunities to innovate.
Original Post: PayPal: Applied Research Scientist (AI-ML R&D / NLP / Deep Learning)

Starting a Rmarkdown Blog with Bookdown + Hugo + Github

Finally, after 24 hours of failed attempts, I got my favourite Hugo theme up and running with RStudio and Blogdown. All the steps I followed are detailed in my new Blogdown entry, which is also a GitHub repo. After exploring some alternatives, like Shirin’s (with Jekyll) and Amber Thomas’s advice (which involved Git skills beyond my basic abilities), I was able to install Yihui’s hugo-lithium-theme in a new repository. However, I wanted to explore other blog templates, hosted on GitHub, like: The first three themes are currently linked in the blogdown documentation as the simplest and easiest to set up for inexperienced blog programmers, but I hope the list will grow in the coming months. For those who are willing to experiment, the complete list is here. Finally I chose the hugo-tranquilpeak theme, by Thibaud Leprêtre, for which…
Original Post: Starting a Rmarkdown Blog with Bookdown + Hugo + Github

Corinium Chief Analytics Officer, Fall, Boston, Oct 2-5 – special rate till Aug 25

More than 200 senior analytics executives will attend the largest C-level analytics event in North America, and only 60+ places are left. See who you could be meeting at the event.
Original Post: Corinium Chief Analytics Officer, Fall, Boston, Oct 2-5 – special rate till Aug 25

GoTr – R wrapper for An API of Ice And Fire

By Ava Yang. It’s Game of Thrones time again as the battle for Westeros is heating up. There are tons of ideas, ingredients and interesting analyses out there, and I was craving my own flavour. So, step zero: where is the data? Jenny Bryan’s purrr tutorial introduced the list got_chars, representing character information from the first five books, which does not seem like much fun beyond exercising list-manipulation muscles. However, it led me to An API of Ice and Fire, the world’s greatest source for quantified and structured data from the universe of Ice and Fire, including the HBO series Game of Thrones. I decided to create my own API functions, or better, an R package (inspired by the famous rwar package). The API resources cover three types of endpoint: Books, Characters and Houses. GoTr pulls data in JSON format…
Original Post: GoTr – R wrapper for An API of Ice And Fire