
Hadoop Overview
This instructor-led crash course provides a technical overview of Apache Hadoop. It includes high-level information about the concepts, architecture, operation, and uses of the Hortonworks Data Platform (HDP) and the Hadoop ecosystem. The course covers the foundational basics and gives an overview of the whole stack.
Schedule
Apache Hadoop is open source software for affordable supercomputing; it provides the distributed file system and the parallel processing required to run a massive computing cluster. This one-day course explains and demonstrates the most popular components in the Hadoop ecosystem.
Target Audience
Data architects, data integration architects, managers, C-level executives, decision makers, technical infrastructure teams, and Hadoop administrators or developers who want to understand the fundamentals of Big Data and the Hadoop ecosystem.
Course Objectives
Upon successful completion of this course, students will have an understanding of:
- Ecosystem for Hadoop
- Installation of Hadoop
- Data Repository with HDFS and HBase
- Data Repository with Flume
- Data Repository with Sqoop
- Data Refinery with YARN and MapReduce
- Data Factory with Hive
- Data Factory with Pig
- Data Factory with Oozie and Hue
- Data Flow for the Hadoop Ecosystem