
Introduction to PySpark

Provider: DataCamp

Learn distributed data management and machine learning in Spark using the PySpark package.

Rating: 5 (3 reviews)

Features

This course includes

Duration: 4 hours
Video Content: 4 hours
Level: Beginner
Instruction Type: Self-paced
Delivery Method: Online
Available on: Mobile, Desktop, Laptop
Accessibility: Limited Access
Language: English
Subtitles: English

Skills

Spark SQL, Spark Streaming, Spark MLlib, Spark DataFrame, SparkSession

Learning Goals

Learn to implement distributed data management and machine learning in Spark using the PySpark package.
In this course, you'll learn how to use Spark from Python.
You'll use the PySpark package to work with data about flights from Portland and Seattle.
You'll learn to wrangle this data and build a complete machine learning pipeline to predict whether flights will be delayed.

Prerequisites/Requirements

Introduction to Python

Instructors

Nick Solomon

Data Scientist

Lore Dirick

Director of Data Science Education at Flatiron School

Course Overview

Hands-On Training, Instructor-Moderated Discussions

Post-course interactions

Virtual labs

International faculty

Reviews

Average rating: 5.0 (based on 3 reviews)

Price: ₹1,093
