Caserta: Big Data Solutions Architects

Seeking a Big Data Solutions Architect to work in small agile teams delivering innovative solutions with the latest in modern architecture across the US, and to contribute to industry thought leadership by attending conferences, speaking at events, blogging, etc.

[Webinar] Data Science for Big Data with Anaconda Enterprise, Oct 19

This Team Anaconda webinar, Oct 19, will demonstrate how easily the Anaconda Enterprise data science platform integrates with Hadoop and Spark clusters, giving your data scientists access to the libraries they need and empowering you to extract the most value from your Big Data.

Tutorial: Azure Data Lake analytics with R

The Azure Data Lake store is an Apache Hadoop file system compatible with HDFS, hosted and managed in the Azure Cloud. You can store and access the data in it directly via the API, by connecting the filesystem to Azure HDInsight services, or via HDFS-compatible open-source applications. For data science applications, you can also access the data directly from R, as this tutorial explains. To interface with Azure Data Lake, you’ll use U-SQL, a SQL-like language extensible using C#. The R Extensions for U-SQL allow you to reference an R script from a U-SQL statement and pass data from Data Lake into the R script. There’s a 500 MB limit for the data passed to R, but the basic idea is that you perform the main data munging tasks in U-SQL and then pass the prepared data to R for analysis. With this…

Announcing dplyrXdf 1.0

I’m delighted to announce the release of version 1.0.0 of the dplyrXdf package. dplyrXdf began as a simple (relatively speaking) backend to dplyr for Microsoft Machine Learning Server/Microsoft R Server’s Xdf file format, but has now become a broader suite of tools to ease working with Xdf files.

This update to dplyrXdf brings the following new features:

- Support for the new tidyeval framework that powers the current release of dplyr
- Support for Spark and Hadoop clusters, including integration with the sparklyr package to process Hive tables in Spark
- Integration with dplyr to process SQL Server tables in-database
- Simplified handling of parallel processing for grouped data
- Several utility functions for Xdf and file management
- Workarounds for various glitches and unexpected behaviour in MRS and dplyr

Spark, Hadoop and HDFS

New in version 1.0.0 of dplyrXdf is support for Xdf files and datasets stored…

Introduction to Blockchains & What It Means to Big Data

Perhaps the most significant development in IT over the past few years, blockchain has the potential to change the way the world approaches big data, with enhanced security and data quality.

Top 10 Active Big Data, Data Science, Machine Learning Influencers on LinkedIn, Updated

Looking for advice? Guidance? Stories? We’ve put together a list of the top ten LinkedIn influencers of the last three months. Follow them to stay up to date with the latest news in Big Data, Data Science, Analytics, Machine Learning and AI.

TDWI Orlando, where we bring the future of data and analytics to life, Dec 3-8

Our comprehensive agenda covers the most important topics and success factors for high-impact data insights, with expert instructors whose only goal is to get you to the next level. Big savings when you register by Oct 13 with priority code KD20.