Big Data Analytics with H20 in R Exercises -Part 1

We have dabbled with RevoScaleR before. In this exercise set we will work with H2O, another high-performance R library that handles big data very effectively. This will be a series of exercises of increasing difficulty, so please do them in sequence. H2O requires Java, so please install Java on your system before trying H2O. As always, check the documentation before trying this exercise set. Answers to the exercises are available here. If you want the latest release of H2O, install it by following these instructions. Exercise 1: Download the latest stable release of H2O and initialize the cluster. Exercise 2: Check the cluster information via h2o.clusterInfo. Exercise 3: You can see how H2O works via the demo function. Check H2O's glm via the demo method. Exercise 4: Download…
Original Post: Big Data Analytics with H20 in R Exercises -Part 1

Big Data analytics with RevoScaleR Exercises-2

In the last set of exercises, you saw the basic functionality of RevoScaleR. In this exercise set we will explore RevoScaleR further. Get the credit card fraud data set from revolutionanalytics and let's get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: RevoScaleR provides an option to convert a data frame into an XDF file, which you might need when storing temporary data frames that you create during analysis. Create an XDF file from the airquality dataset. Exercise 2: In the previous set of exercises you saw rxHistogram briefly. Now we will see how to get meaningful information from a large dataset with a visualization. Create a scatterplot…
Original Post: Big Data analytics with RevoScaleR Exercises-2

Big Data analytics with RevoScaleR Exercises

In this set of exercises, you will explore how to handle big data with the RevoScaleR package from Microsoft R (previously Revolution Analytics). It comes with Microsoft R Client. You can get it from here. Get the credit card fraud data set from revolutionanalytics and let's get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: The heart of RevoScaleR is the XDF file format. Convert the creditcardfraud data set into XDF format. Exercise 2: Use the newly created XDF file to get information about the variables and print 10 rows to check the data. Learn more about importing big data in the…
Original Post: Big Data analytics with RevoScaleR Exercises

More string Hacking with Regex and Rebus

For a beginner in R or any language, regular expressions might seem daunting. The rebus package in R lowers the barrier for common regular expression tasks. It is useful for beginners and advanced users alike, covering most common regex skills in a more intuitive, if more verbose, way. Check out the package and try these exercises to test your knowledge. Load stringr/stringi as well for this set of exercises. I encourage you to do this and this before working on this set. Answers are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Suppose you have a vector ex <- c("stringer", "stringi", "rebus", "redbus"). Use rebus to find the strings starting with st. Hint…
Original Post: More string Hacking with Regex and Rebus
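
As a rough sketch of the first rebus exercise quoted above, the pattern can be built from rebus pieces and then handed to a stringr function. This assumes both packages are installed. The variable names `starts_with_st` and `matches` are made up for illustration, and this is one possible approach, not the posted solution.

```r
library(stringr)
library(rebus)

ex <- c("stringer", "stringi", "rebus", "redbus")

# START is the rebus constant for the start-of-string anchor;
# %R% concatenates regex pieces.
starts_with_st <- START %R% "st"

# Keep only the elements that match the pattern.
matches <- str_subset(ex, starts_with_st)
print(matches)
```

Printing `starts_with_st` shows the plain regular expression that rebus generated, which is a handy way to learn the underlying syntax.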

Hacking Strings with stringi

In the last set of exercises, we worked on the basic concepts of string manipulation with stringr. In this one we will go further into the universe of string hacking and learn how to use the stringi package. Note that stringi acts as the backend of stringr but has many more useful string manipulation functions than stringr, and one should really know stringi for text manipulation. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Create two strings c1 c2. Now, stringi comes with many functions and wrappers around functions to check whether two strings are equivalent. Check if they are equivalent with stri_compare, %s Learn more about Text analysis in the online course Text Analytics/Text Mining Using…
Original Post: Hacking Strings with stringi
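
The values of c1 and c2 did not survive in the excerpt above, so the sketch below uses two made-up strings. It shows `stri_compare()` next to `%s==%`, one of the comparison operators stringi provides. Which operator the original exercise had in mind is an assumption.

```r
library(stringi)

# Hypothetical stand-ins for c1 and c2 (the originals are not in the excerpt).
c1 <- "hacking strings"
c2 <- "hacking strings"

# stri_compare() returns -1, 0 or 1; 0 means the strings compare as equal.
cmp <- stri_compare(c1, c2)

# The operator form returns a logical instead.
eq_op <- c1 %s==% c2

print(cmp)
print(eq_op)
```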

Hacking strings with stringr

This is the first in a set of exercises on string manipulation with stringr. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Use a stringr function to merge these 3 strings. x y z Exercise 2: Now use a vector that contains x, y, z and NA and make it a single sentence using paste. Do the same with the function you used for Exercise 1. Can you spot the difference? Exercise 3: Install the babynames dataset and find the vector of lengths of the baby names using stringr functions. You may wonder why not use nchar, which can do the same. Try finding out the difference and let me know in the comments. Exercise…
Original Post: Hacking strings with stringr
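
A minimal sketch of the first two stringr exercises quoted above, assuming `str_c()` is the merging function intended. The three strings are made-up stand-ins, since the originals are missing from the excerpt, and a short hypothetical vector replaces the babynames data for the length comparison.

```r
library(stringr)

# Hypothetical stand-ins for the three strings elided in the excerpt.
x <- "Hacking"
y <- "strings"
z <- "with stringr"

# Exercise 1 style: merge the three strings.
merged <- str_c(x, y, z, sep = " ")

# Exercise 2 style: the same vector with an NA added.
words <- c(x, y, z, NA)
with_paste <- paste(words, collapse = " ")  # NA is turned into the text "NA"
with_str_c <- str_c(words, collapse = " ")  # NA propagates, so the result is NA

# str_length() keeps a missing value missing.
lens <- str_length(c("Anna", NA))

print(merged)
print(with_paste)
print(with_str_c)
print(lens)
```

The different treatment of NA is the main thing to look for when comparing the base R and stringr versions.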

Data Manipulation with data.table (part -2)

In the last set of data.table exercises, we saw some interesting features of data.table. In this set we will cover some advanced features such as set operations and joins in data.table. You should ideally complete the first part before attempting this one. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Create a data.table from the diamonds dataset and create a key over cut and color using setkey. Now select the first entry of the groups Ideal and Premium. Exercise 2: With the same dataset, select the first and last entry of the groups Ideal and Premium. Exercise 3: Earlier we saw how to create/update columns by reference using :=. However, there is a lower-overhead, faster alternative…
Original Post: Data Manipulation with data.table (part -2)
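
The keyed lookups described in the excerpt above can be sketched on a tiny made-up table that stands in for diamonds (which ships with ggplot2). The column `price_k` and the use of `set()` as the lower-overhead alternative to `:=` are assumptions for illustration, since the excerpt is cut off before naming it.

```r
library(data.table)

# Hypothetical stand-in for the diamonds data.
dt <- data.table(
  cut   = c("Ideal", "Premium", "Ideal", "Good", "Premium", "Ideal"),
  color = c("E", "D", "D", "F", "G", "G"),
  price = c(326, 340, 350, 400, 410, 500)
)

# Sort by cut, then color, and mark those columns as the key.
setkey(dt, cut, color)

# Keyed subset on the first key column; mult picks one row per matched group.
first_rows <- dt[.(c("Ideal", "Premium")), mult = "first"]
last_rows  <- dt[.(c("Ideal", "Premium")), mult = "last"]

# set() adds or updates a column by reference, like :=, without the
# overhead of the `[.data.table` call.
set(dt, j = "price_k", value = dt$price / 1000)

print(first_rows)
print(last_rows)
```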

Data Manipulation with Data Table -Part 1

In the exercises below we cover some useful features of data.table, an R library for fast manipulation of large data frames. Please see the data.table vignette before trying the solutions. This first set is intended for beginners with the data.table package and does not cover set keywords or joins, which will be covered in the next set. Load the data.table library in your R session before starting the exercises. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Load the iris dataset, make it a data.table and name it iris_dt. Print the mean of Petal.Length, grouping by the first letter of Species from iris_dt. Exercise 2: Load the diamonds dataset…
Original Post: Data Manipulation with Data Table -Part 1
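
One possible way to approach the first data.table exercise quoted above is to group by an expression in `by`. The names `petal_means`, `first_letter` and `mean_petal_length` are made up for this sketch, which is not necessarily the posted solution.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# Group by the first letter of Species: "s" covers setosa,
# "v" covers both versicolor and virginica.
petal_means <- iris_dt[
  ,
  .(mean_petal_length = mean(Petal.Length)),
  by = .(first_letter = substr(as.character(Species), 1, 1))
]

print(petal_means)
```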

A Primer in functional Programming in R (part -2)

In the last exercise set, we saw how powerful functional programming principles can be, how dramatically they can increase the readability of code, and how easily you can work with them. In this set of exercises we will look at functional programming principles with purrr. purrr comes with a number of interesting features and is really useful for writing clean and concise code. Please check the documentation and load the purrr library in your R session before starting this exercise set. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: From the airquality dataset (available in base R), find the mean, median and standard deviation of all columns using map functions…
Original Post: A Primer in functional Programming in R (part -2)
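
A small sketch of the first purrr exercise quoted above, assuming `map_dbl()` is an acceptable "map function" here. The choice of `na.rm = TRUE` and the names `col_means`, `col_medians`, `col_sds` and `summary_stats` are assumptions added for illustration.

```r
library(purrr)

# map_dbl() applies a function to every column and returns a named numeric
# vector. na.rm = TRUE is passed through because Ozone and Solar.R contain NAs.
col_means   <- map_dbl(airquality, mean,   na.rm = TRUE)
col_medians <- map_dbl(airquality, median, na.rm = TRUE)
col_sds     <- map_dbl(airquality, sd,     na.rm = TRUE)

# Stack the three vectors into one table, one row per statistic.
summary_stats <- rbind(mean = col_means, median = col_medians, sd = col_sds)
print(round(summary_stats, 2))
```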

A Primer in Functional Programming in R Exercises (Part – 1)

In the exercises below we cover the basics of functional programming in R (part 1 of a two-part series of exercises on functional programming). We consider recursion in R, the apply family of functions, and higher-order functions such as Map, Reduce and Filter. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. Exercise 1: Create a function that calculates the factorial of a number with the help of Reduce. Exercise 2: Create the same factorial function, but with recursion and memoization. Exercise 3: Create a function cum_add that computes a cumulative sum, e.g. if x cum_add(x) will result in 1 3 6. Don't use cumsum. Exercise 4: Create a function which takes a data frame and…
Original Post: A Primer in Functional Programming in R Exercises (Part – 1)
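
The first three exercises quoted above can be sketched in base R as follows. The function names `fact_reduce`, `make_fact_memo` and `fact_memo` are made up, and the input `c(1, 2, 3)` is an assumed example, since the value of x is missing from the excerpt. This is one possible approach, not the posted solution.

```r
# Factorial via Reduce: fold `*` over 1..n (0! is treated as 1).
fact_reduce <- function(n) {
  if (n == 0) return(1)
  Reduce(`*`, as.numeric(seq_len(n)))
}

# Factorial via recursion, memoized in an environment held by a closure.
make_fact_memo <- function() {
  cache <- new.env()
  fact <- function(n) {
    key <- as.character(n)
    if (!is.null(cache[[key]])) return(cache[[key]])  # reuse a stored result
    value <- if (n <= 1) 1 else n * fact(n - 1)
    cache[[key]] <- value
    value
  }
  fact
}
fact_memo <- make_fact_memo()

# Cumulative sum without cumsum(): Reduce keeps every intermediate result.
cum_add <- function(x) Reduce(`+`, x, accumulate = TRUE)

print(fact_reduce(5))
print(fact_memo(10))
print(cum_add(c(1, 2, 3)))
```

Keeping the cache inside a closure means repeated calls to `fact_memo` reuse earlier results without a global variable.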