Big Data with R

Learn Path Description

R offers several ways to work with big data, including parallel programming and an interface to Spark. In this track, you'll learn how to write scalable, efficient R code and how to visualize large datasets.

Skills You Will Gain

Courses In This Learning Path

Total Duration

4 hours

Level

Intermediate

Learn Type

Certifications

Writing Efficient R Code

R's strength lies in data analysis, but R code can sometimes be slow enough to hold up your work. This course covers the main techniques for speeding up your analysis, so you can reduce computation time and get to insights as quickly as possible.

Total Duration

4 hours

Level

Intermediate

Learn Type

Certifications

Visualizing Big Data with Trelliscope in R

Once you have mastered ggplot2, you are ready to tackle larger datasets. This course shows you how to visualize large data with scalable techniques such as faceting, using Trelliscope, which is part of the trelliscopejs package. Trelliscope integrates seamlessly with standard R workflows and lets you build interactive exploratory displays for exploring large datasets in depth. These displays can help you and your colleagues gain new insights.

Total Duration

4 hours

Level

Intermediate

Learn Type

Certifications

Scalable Data Processing in R

R programmers run into problems when datasets exceed available RAM, because by default R stores all variables in memory. In this course, you'll learn how to extract, process, and analyze data directly from disk. You'll apply the split-apply-combine approach and learn how to write scalable code with the bigmemory and iotools packages. The course uses Federal Housing Finance Agency data, a publicly available dataset that records all mortgages held or securitized from 2009 to 2015.

Total Duration

4 hours

Level

Beginner

Learn Type

Certifications

Introduction to Spark with sparklyr in R

R is designed to make data analysis code easy to understand and quick to write. Apache Spark is optimized for large-scale data analysis. The sparklyr package lets you write R code that runs on a Spark cluster, giving you the best of both worlds. This course shows you how to work with Spark DataFrames through both the native Spark interface and the dplyr interface. It also covers machine learning techniques. The course uses the Million Song Dataset.
