Duration: 20 Hours
You can’t talk about Big Data for long without mentioning Hadoop! Hadoop is a framework for storing and processing distributed data across the nodes of a cluster, and it leads the market in the world of Big Data. Your big data application can keep running even if a few nodes in the cluster fail. Learn all about Hadoop and MapReduce, along with hands-on experience.
1. Big Data Introduction
1.1 A brief understanding of the Big Data world.
1.2 Hadoop and the Ecosystem Evolution.
2. Hadoop Architecture
2.1 Hadoop Architecture in Detail.
2.2 Hadoop 2.0: HDFS including Name/Data Nodes, MapReduce, the Hadoop Shell, and YARN including the Resource Manager.
3. MapReduce 2
3.1 The chapter covers MapReduce in detail: Java drivers and Input Splits.
3.2 Relation between Input Splits and HDFS Blocks.
3.3 MapReduce Job Submission Flow.
3.4 Demo of Input Splits.
3.5 MapReduce: Combiner & Partitioner, Demo on de-identifying Health Care Data set, Demo on Weather Data set.
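The Mapper → Combiner → Reducer flow covered in 3.5 can be sketched in plain Java with no Hadoop dependency. The class and method names below are illustrative, not Hadoop's actual `Mapper`/`Reducer` API; the point is the shape of the data at each stage:

```java
import java.util.*;

public class WordCountFlow {
    // "map" phase: emit a (word, 1) pair for every word in every input line
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines)
            for (String word : line.toLowerCase().split("\\s+"))
                if (!word.isEmpty())
                    pairs.add(Map.entry(word, 1));
        return pairs;
    }

    // "reduce" phase: sum the values for each key.
    // A Combiner runs this same summing locally on each mapper's output,
    // shrinking the data shuffled across the network to the reducers.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // the shuffle sorts by key
        for (var p : pairs)
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        // one hypothetical input split of two lines
        List<String> split = List.of("big data big hadoop", "hadoop big");
        System.out.println(reduce(map(split))); // {big=3, data=1, hadoop=2}
    }
}
```

In a real job the framework partitions the mapper output by key (the Partitioner of 3.5) so that each reducer sees a disjoint, sorted slice of the keys.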
4. Pig
4.1 MapReduce vs. Pig.
4.2 Pig Use Cases.
4.3 Programming Structure in Pig.
4.4 Pig Running Modes.
4.5 Pig components.
4.6 Pig Execution.
4.7 Pig Latin Program.
4.8 Data Models in Pig.
4.9 Pig Data Types.
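The MapReduce-vs-Pig comparison in 4.1 is easiest to see with a concrete query. A Pig Latin `GROUP ... COUNT` compiles down to a MapReduce job; below is a self-contained plain-Java sketch of the equivalent group-and-count (field and relation names are hypothetical, and no Pig or Hadoop dependency is used):

```java
import java.util.*;

public class PigVsMapReduce {
    // Pig Latin (roughly), three lines for the whole job:
    //   logs   = LOAD 'logs' AS (user:chararray, url:chararray);
    //   grp    = GROUP logs BY user;
    //   counts = FOREACH grp GENERATE group, COUNT(logs);
    // The same logic written by hand, as the generated MapReduce job
    // ultimately performs it: group records by user, count each group.
    static Map<String, Long> countByUser(List<String[]> logs) {
        Map<String, Long> counts = new TreeMap<>();
        for (String[] record : logs)
            counts.merge(record[0], 1L, Long::sum); // record[0] = user field
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> logs = List.of(
            new String[]{"alice", "/home"},
            new String[]{"bob",   "/cart"},
            new String[]{"alice", "/checkout"});
        System.out.println(countByUser(logs)); // {alice=2, bob=1}
    }
}
```

The contrast is the point of 4.1: Pig expresses the dataflow declaratively, while hand-written MapReduce requires you to spell out the mapper, reducer, and driver yourself.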
5. HBase
5.1 HBase Architecture.
5.2 Region Servers.
5.4 HBase Data Model.
5.5 HBase Shell.
5.6 HBase Client API.
5.7 Data Loading Techniques.
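The HBase data model of 5.4 is commonly described as a sparse, sorted, multidimensional map: row key → column family → column qualifier → timestamp → value. A self-contained plain-Java sketch of that shape (this is a teaching model, not the `hbase-client` API; all names are illustrative):

```java
import java.util.*;

public class HBaseDataModelSketch {
    // row key -> column family -> qualifier -> timestamp -> value.
    // TreeMaps keep rows and columns sorted, as HBase does; timestamps
    // sort descending so a read returns the newest cell version first.
    private final NavigableMap<String, NavigableMap<String,
            NavigableMap<String, NavigableMap<Long, String>>>> table = new TreeMap<>();

    void put(String row, String family, String qualifier, long ts, String value) {
        table.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(family, f -> new TreeMap<>())
             .computeIfAbsent(qualifier, q -> new TreeMap<>(Comparator.reverseOrder()))
             .put(ts, value);
    }

    // Latest version of one cell, or null if absent (like a Get of one column)
    String get(String row, String family, String qualifier) {
        var qualifiers = table.getOrDefault(row, new TreeMap<>()).get(family);
        if (qualifiers == null) return null;
        var versions = qualifiers.get(qualifier);
        return (versions == null || versions.isEmpty())
                ? null : versions.firstEntry().getValue();
    }

    public static void main(String[] args) {
        HBaseDataModelSketch t = new HBaseDataModelSketch();
        t.put("row1", "info", "name", 1L, "old name");
        t.put("row1", "info", "name", 2L, "new name");
        System.out.println(t.get("row1", "info", "name")); // new name
    }
}
```

Sorting by row key is what lets HBase split a table into contiguous key ranges (regions) served by the Region Servers of 5.2.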
6. Oozie and Sqoop
6.1 How to import data from an RDBMS into HDFS, and vice versa, with Sqoop.
6.2 How to manage the inter-dependencies of different Hadoop jobs with Oozie.
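Oozie expresses the inter-dependencies of 6.2 as a workflow: a directed acyclic graph of actions, where a job runs only after every job it depends on has finished. A minimal plain-Java sketch of that ordering rule using Kahn's topological sort (job names are hypothetical; this is not the Oozie API):

```java
import java.util.*;

public class JobDependencyOrder {
    // Kahn's algorithm: a job becomes "ready" only when all of its
    // prerequisite jobs have already been placed in the run order.
    static List<String> runOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new HashMap<>();        // unmet prerequisites
        Map<String, List<String>> dependents = new HashMap<>(); // reverse edges
        for (var e : dependsOn.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String dep : e.getValue())
                dependents.computeIfAbsent(dep, d -> new ArrayList<>()).add(e.getKey());
        }
        Deque<String> ready = new ArrayDeque<>();
        for (var e : pending.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.poll();
            order.add(job);
            for (String next : dependents.getOrDefault(job, List.of()))
                if (pending.merge(next, -1, Integer::sum) == 0) ready.add(next);
        }
        return order; // shorter than the job count if the graph has a cycle
    }

    public static void main(String[] args) {
        // hypothetical pipeline: Sqoop import -> MapReduce clean-up -> Sqoop export
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("sqoop-import", List.of());
        deps.put("mapreduce-clean", List.of("sqoop-import"));
        deps.put("sqoop-export", List.of("mapreduce-clean"));
        System.out.println(runOrder(deps)); // [sqoop-import, mapreduce-clean, sqoop-export]
    }
}
```

In practice you declare this graph in an Oozie workflow definition rather than coding it yourself; the sketch only shows the scheduling idea behind it.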
On conclusion of the course, the participant will have acquired the following knowledge:
a. Understand the broad concepts of Big Data and Hadoop
b. Gain foundational knowledge of Hadoop
c. Process data with HDFS and MapReduce
d. Understand the Pig and Hive languages
e. Use Oozie and Sqoop to move data and manage jobs over Hadoop