Machine Learning Scientist with R


Learn Path Description

Master the essential skills to land a job as a machine learning scientist! You'll augment your R programming skill set with the toolbox to perform supervised and unsupervised learning. You'll learn how to process data for modeling, train your models, visualize your models and assess their performance, and tune their parameters for better performance. In the process, you'll get an introduction to Bayesian statistics, natural language processing, and Spark.

Skills You Will Gain

Courses In This Learning Path

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Supervised Learning in R: Classification

This beginner-friendly introduction to machine learning covers four of the most common classification algorithms. You will come away with a solid understanding of how each algorithm approaches a learning task, as well as the R functions needed to apply these tools to your own work.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Supervised Learning in R: Regression

From a machine learning perspective, regression is the task of predicting numerical outcomes from a set of inputs. This course will teach you about different regression models, how to train them in R, and how to use them to make predictions.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Unsupervised Learning in R

In machine learning, the goal is often to find patterns in data without trying to make predictions. This is called unsupervised learning. One common use case is grouping consumers based on their buying history and demographics in order to target marketing campaigns. Another is identifying the unmeasured factors that most influence differences in crime rates between cities. This course gives you a general introduction to clustering and dimensionality reduction in R from a machine learning perspective, so that you can get from data to insight quickly.
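As a rough illustration of dimensionality reduction, the sketch below runs principal component analysis with base R's `prcomp()`. It assumes the built-in `USArrests` data (state-level arrest rates) as a stand-in; the course's own datasets and exercises are not reproduced here.

```r
# Principal component analysis on the built-in USArrests data.
# Assumption: USArrests is only a stand-in for the course material.
pca <- prcomp(USArrests, scale. = TRUE)  # scale each variable to unit variance

# Proportion of total variance explained by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 2))

# Coordinates of each state on the first two components,
# a two-dimensional summary of the four original variables.
scores <- pca$x[, 1:2]
head(scores)
```

Scaling matters here because the variables are measured in different units; without it, the variable with the largest variance would dominate the first component.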

Total Duration: 5 hours
Level: Intermediate
Learn Type: Certifications

Machine Learning in the Tidyverse

This course will teach you how to use tools from the tidyverse to build, explore, and evaluate machine learning models. You will use the purrr and tidyr packages to work with many models at once, and the broom package to explore them. You will also learn the train-validate-test process, which lets you evaluate both regression and classification models and provides the information needed to optimize model performance through hyperparameter tuning.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Intermediate Regression in R

Linear regression and logistic regression are two of the most widely used statistical models. This course builds on the skills acquired in "Introduction to Regression in R" and covers both linear and logistic regression with multiple explanatory variables. Working with real-world data such as Taiwan house prices and customer churn, you will learn how to include multiple explanatory variables in a model, how those variables interact, and how linear and logistic regression work.
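To make the idea of multiple explanatory variables and interactions concrete, here is a minimal sketch using base R's `lm()` and `glm()`. It assumes the built-in `mtcars` data as a stand-in for the course datasets, and the `new_car` values are hypothetical.

```r
# Linear regression with two explanatory variables and their interaction.
# wt * hp expands to wt + hp + wt:hp.
lin_fit <- lm(mpg ~ wt * hp, data = mtcars)
print(coef(lin_fit))

# Logistic regression: probability that a car has a manual
# transmission (am = 1) given its weight and horsepower.
log_fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Predicted probability for a hypothetical car
# (wt is in 1000 lbs, hp is gross horsepower).
new_car <- data.frame(wt = 2.5, hp = 110)
p_manual <- predict(log_fit, newdata = new_car, type = "response")
print(p_manual)
```

The `type = "response"` argument returns a probability; the default would return the prediction on the log-odds scale.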

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Cluster Analysis in R

Cluster analysis is an important tool in the data science toolbox. It is used to find groups of observations (clusters) that share similar characteristics. These groupings can help you make better business decisions, for example by targeting different customer segments in marketing. This course covers both hierarchical clustering and k-means clustering. You will learn not only how to apply these methods but also how to interpret their results. Three datasets are used to build this intuition: soccer player positions, wholesale customer spending data, and longitudinal occupational wage data.
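The two methods named above can be sketched with base R's `hclust()` and `kmeans()`. This is only an illustration: it assumes the built-in `iris` measurements as a stand-in for the course datasets, and the choice of three clusters is an assumption made for the example.

```r
set.seed(42)  # k-means uses random starting centers

# Standardize the four numeric columns so each contributes equally
# to the distance calculation.
x <- scale(iris[, 1:4])

# Hierarchical clustering: build the tree from pairwise Euclidean
# distances, then cut it into three groups.
hc <- hclust(dist(x), method = "complete")
hc_clusters <- cutree(hc, k = 3)

# k-means clustering with three centers and several random restarts.
km <- kmeans(x, centers = 3, nstart = 20)

# Cross-tabulate the two assignments to see how far they agree.
print(table(hierarchical = hc_clusters, kmeans = km$cluster))
```

Cluster labels are arbitrary, so agreement shows up as each row of the table concentrating in one column rather than as matching numbers.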

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Machine Learning with caret in R

Machine learning is the study and application of algorithms that learn from data and make predictions. It is used in many aspects of our lives, from search results to self-driving cars, and it is one of the fastest-growing areas of data science research. This course covers the basics of machine learning, teaching you how to create, evaluate, tune, and improve predictive models. It uses the popular caret R package, which provides a consistent interface to R's most powerful machine learning tools.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Modeling with tidymodels in R

Tidymodels is a powerful suite of R packages that simplifies machine learning workflows. You will learn to split data for cross-validation, preprocess data with the recipes package, and fine-tune machine learning algorithms. You will also learn key concepts such as how to define model objects and create modeling workflows. You will then apply this knowledge to predict home prices and to classify employees according to their likelihood of leaving the company.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Machine Learning with Tree-Based Models in R

Tree-based machine learning models can reveal complex, nonlinear relationships in data, and they often win machine learning competitions. This course will show you how to use tidymodels to build different tree-based models, ranging from simple decision trees to complex random forests. You will also learn how to use boosted trees, a powerful technique that uses ensemble learning to build high-performing predictive models. Along the way, you will work with credit and health data to predict customer churn and diabetes.

Total Duration: 24 hours
Level: Beginner
Learn Type: Certifications

Machine Learning Fundamentals in R

Machine learning uses data to make predictions. This track covers predicting categorical and numerical responses through classification and regression, as well as uncovering hidden structure in datasets (unsupervised learning). You will learn how to process data for modeling, how to train your models, how to visualize and evaluate them, and how to tune their parameters to improve performance.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Support Vector Machines in R

This course introduces a powerful classifier, the support vector machine (SVM), through an intuitive and visual approach. Students will learn how to build support vector machines in R using the e1071 package, an R interface to libsvm. They will come to understand concepts such as hard and soft margins, the kernel trick, and different types of kernels, and they will learn how to tune SVM parameters. By the end, they will be able to use SVMs to classify data.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Fundamentals of Bayesian Data Analysis in R

Bayesian data analysis is an increasingly popular approach to statistical modeling and machine learning. It provides a consistent framework for building problem-specific models that can be used for both prediction and statistical inference. This course introduces Bayesian data analysis, an important tool for data science.
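One simple way to see the Bayesian framework at work is a grid approximation of a posterior distribution, sketched below in base R. The data (6 successes in 9 trials) and the uniform prior are hypothetical choices made for illustration; they are not taken from the course.

```r
# Grid approximation of the posterior for a binomial proportion.
# Hypothetical data: 6 successes observed in 9 trials.
p_grid <- seq(0, 1, length.out = 1001)  # candidate values of the proportion

prior <- dbeta(p_grid, 1, 1)                       # uniform prior on [0, 1]
likelihood <- dbinom(6, size = 9, prob = p_grid)   # P(data | proportion)

# Bayes' rule: posterior is proportional to prior times likelihood,
# then normalized so the grid weights sum to 1.
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

# Posterior mean, for inference about the proportion.
post_mean <- sum(p_grid * posterior)

# Posterior predictive probability that the next trial succeeds;
# for a single trial this equals the posterior mean.
p_next_success <- sum(p_grid * posterior)
print(post_mean)
```

With a uniform prior, the exact posterior is Beta(7, 4), whose mean is 7/11, so the grid result should land very close to that value.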

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Topic Modeling in R

This course introduces students to topic modeling. It covers preparing a corpus, fitting topic models with Latent Dirichlet allocation (LDA) using the topicmodels package, and visualizing the results with ggplot2 and word clouds.

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Hyperparameter Tuning in R

For many machine learning problems, simply running a model out of the box and getting a prediction is not enough; you want the best model with the most accurate predictions. One way to improve your model is hyperparameter tuning, the process of optimizing the model's settings. This course will show you how to use the caret, mlr, and h2o packages to find the optimal combination of hyperparameters. It covers grid search and random search, as well as adaptive resampling and automated machine learning (AutoML). You will also tune different supervised models, such as support vector machines and gradient boosting machines. Tune up!
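The idea behind grid search can be sketched in base R without caret, mlr, or h2o: try every candidate value of a hyperparameter, score each on held-out data, and keep the best. In this illustration the hyperparameter is the degree of a polynomial regression, and the `mtcars` data, the 70/30 split, and the candidate degrees are all assumptions made for the example.

```r
set.seed(1)  # make the random split reproducible

# Hold out 30% of the rows for validation.
n <- nrow(mtcars)
train_idx <- sample(n, size = round(0.7 * n))
train <- mtcars[train_idx, ]
valid <- mtcars[-train_idx, ]

# The grid: candidate polynomial degrees for the hp term.
degrees <- 1:4

# Fit one model per candidate and record its validation RMSE.
rmse <- sapply(degrees, function(d) {
  fit <- lm(mpg ~ poly(hp, d), data = train)
  pred <- predict(fit, newdata = valid)
  sqrt(mean((valid$mpg - pred)^2))
})

# Keep the degree with the lowest validation error.
best_degree <- degrees[which.min(rmse)]
print(data.frame(degree = degrees, rmse = rmse))
print(best_degree)
```

Random search follows the same loop but samples candidate values instead of enumerating them, which tends to scale better when several hyperparameters are tuned at once.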

Total Duration: 4 hours
Level: Intermediate
Learn Type: Certifications

Bayesian Regression Modeling with rstanarm

Bayesian estimation offers an alternative to modeling techniques whose inferences depend on p-values. This course will show you how to estimate linear regression models using Bayesian methods and the rstanarm package. You will also learn about prior distributions, posterior predictive model checking, and how to compare models within the Bayesian framework. Finally, you will use the models you have built to make predictions for new data.

Total Duration: 4 hours
Level: Beginner
Learn Type: Certifications

Introduction to Spark with sparklyr in R

R is designed to help you write data analysis code quickly and readably. Apache Spark is optimized for large-scale data analysis. The sparklyr package lets you write R code that runs on a Spark cluster, giving you the best of both. This course will show you how to work with Spark DataFrames using both the dplyr interface and the native Spark interface. Machine learning techniques are also covered. Throughout the course, you will work with the Million Song Dataset.
