Building Batch Data Pipelines on GCP

Rating: 5 (8 reviews)

Course Features

Duration: 13 hours

Delivery Method: Online

Available on: Limited Access

Accessibility: Desktop, Laptop

Language: English

Subtitles: English

Level: Intermediate

Teaching Type: Self Paced

Video Content: 13 hours

Course Description

Data pipelines typically fall under one of three paradigms: Extract-Load (EL), Extract-Load-Transform (ELT), or Extract-Transform-Load (ETL). This course explains which paradigm to use for batch data, and when. It also covers several Google Cloud technologies for data transformation, including BigQuery, running Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners build data pipeline components on Google Cloud through hands-on Qwiklabs.

Course Overview

International Faculty

Post Course Interactions

Instructor-Moderated Discussions

What You Will Learn

Review the different methods of data loading (EL, ELT, and ETL) and when to use each

Run Hadoop on Dataproc, leverage Cloud Storage, and optimize Dataproc jobs

Use Dataflow to build your data processing pipelines

Manage data pipelines with Data Fusion and Cloud Composer

Course Instructors

Google Cloud Training

The Google Cloud Training team is responsible for developing, delivering and evaluating training that enables our enterprise customers and partners to use our products and solution offerings in an ef...

Course Reviews

Average rating: 5.0 (based on 8 reviews)

100%
