Hands On Training

Using Kudu with Apache Spark and Apache Flume

O’Reilly

Course Features

Duration

27 minutes

Delivery Method

Online

Available on

Limited Access

Accessibility

Desktop, Laptop

Language

English

Subtitles

English

Level

Intermediate

Teaching Type

Self Paced

Video Content

27 minutes

Course Description

Apache Kudu, a revolutionary storage technology, is often used with other Hadoop ecosystem frameworks for data ingestion and processing. This practical, hands-on course demonstrates how Kudu works in conjunction with four of those frameworks: Apache Spark, Spark SQL, MLlib and Apache Flume.
You will use the Kudu-Spark module with Spark and Spark SQL to create, move and update data seamlessly between Kudu and Spark. Next, you will use Apache Flume to stream events into Kudu tables and then query those tables using Apache Impala. This course is for learners who have limited experience with Hadoop ecosystem components such as HDFS, Hive and Spark.

Gain hands-on experience with Kudu and add another tool to your Big Data toolbox.
Learn how to move data between Kudu tables and Spark apps using the Kudu-Spark module.
Understand how to stream and analyze data in real time with Flume and Kudu.
Create a movie ratings predictor using Flume and save the predicted values to Kudu.
See how to combine these open-source tools to create simple, fast data engineering pipelines.

Ryan Bosshart, a Principal Systems Engineer at Cloudera, leads a specialized team that focuses on Hadoop ecosystem storage technologies such as HDFS, HBase and Kudu. He is a co-chair of the Twin Cities Spark and Hadoop User Group and has been building and architecting large-scale distributed systems since 2006. Ryan speaks at conferences across North America about Hadoop technologies and holds a degree in computer science from Augsburg College.

Course Overview

International Faculty

Post Course Interactions

Hands-On Training, Instructor-Moderated Discussions

Skills You Will Gain

What You Will Learn

Learn how to move data between Kudu tables and Spark apps using the Kudu-Spark module.
Understand how to stream and analyze data in real time with Flume and Kudu.
Create a movie ratings predictor using Flume and save the predicted values into Kudu.
See how to combine these open-source tools to create simple, fast data engineering pipelines.

Course Content

Module 1: Welcome To The Course

Module 2: About The Author

Module 3: Integrating Kudu With Apache Flume

Module 4: Using Kudu With Apache Spark

Course Instructors

Ryan Bosshart

Instructor

Ryan Bosshart is the instructor for this course

Course Reviews

Average Rating Based on 3 reviews

5.0

100%