Using Kudu with Apache Spark and Apache Flume

Rating: 5 (3 reviews)

Course Features

  • Duration: 27 minutes
  • Delivery Method: Online
  • Available on: Limited Access
  • Accessibility: Desktop, Laptop
  • Language: English
  • Subtitles: English
  • Level: Intermediate
  • Teaching Type: Self Paced
  • Video Content: 27 minutes

Course Description

Apache Kudu, the revolutionary storage technology, is often used with other Hadoop ecosystem frameworks for data ingestion and processing. This course is practical and hands-on. It demonstrates how Kudu works in conjunction with four frameworks: Apache Spark, Spark SQL, MLlib and Apache Flume.
You will use the kudu-spark module with Spark and Spark SQL to create, move and update data seamlessly between Kudu and Spark. Next, you will use Apache Flume to stream events into Kudu tables and then query them using Apache Impala. This course is for learners who have limited experience with Hadoop ecosystem components such as HDFS, Hive and Spark.

  • Get hands-on experience with Kudu and add more tools to your Big Data toolbox.
  • Learn how to move data between Kudu tables and Spark apps using the kudu-spark module.
  • Stream and analyze data in real time with Flume and Kudu.
  • Create a movie ratings predictor using Flume and save the predicted values to Kudu.
  • Combine these open-source tools to create simple, fast data engineering pipelines.

Ryan Bosshart, a Principal Systems Engineer at Cloudera, is responsible for a specialized team that focuses on Hadoop ecosystem storage technologies such as HDFS, HBase and Kudu. He is a co-chair of the Twin Cities Spark and Hadoop User Group and has been building and architecting large-scale distributed systems since 2006. Ryan speaks at conferences across North America about Hadoop technologies and holds a computer science degree from Augsburg College.

Course Overview

International Faculty

Post Course Interactions

Hands-On Training, Instructor-Moderated Discussions

Skills You Will Gain

What You Will Learn

  • Learn how to move data between Kudu tables and Spark apps using the kudu-spark module
  • Understand how to stream and analyze data in real time with Flume and Kudu
  • Create a movie ratings predictor using Flume and save the predicted values into Kudu
  • See how t

Course Instructors

Ryan Bosshart

Instructor

Ryan Bosshart is the instructor for this course

Course Reviews

Average rating: 5.0, based on 3 reviews
