Big Data Hadoop Training




Big Data Hadoop Training Details

Track            Regular Track        Weekend Track        Fast Track
Course Duration  35 Hrs               8 Weekends           5 Days
Hours            1 hr/day             2 hrs/day            6 hrs/day
Training Mode    Online Classroom     Online Classroom     Online Classroom
Delivery         Instructor Led-Live  Instructor Led-Live  Instructor Led-Live


Course Curriculum

HADOOP ONLINE TRAINING-COMPLETE COURSE DETAILS HDFS AND MAPREDUCE.

1. INTRODUCTION TO BIG DATA AND ITS CHARACTERISTICS

  • 4 V's of BIG DATA (IBM definition of BIG DATA)
  • What is Hadoop?
  • Why Hadoop?
  • Core Components of Hadoop
  • Intro to HDFS and its Architecture
  • Difference between Code Locality and Data Locality
  • HDFS commands
  • Name Node’s Safe Mode
  • Different Modes of Hadoop
  • Intro to MAPREDUCE
  • Versions of HADOOP
  • What is a Daemon?
  • Hadoop Daemons
  • What is Name Node?
  • What is Data Node?
  • What is Secondary Name Node?
  • What is Job Tracker?
  • What is Task Tracker?
  • What is an Edge computer in a Hadoop Cluster and its role
  • Read/Write operations in HDFS
  • Complete Overview of Hadoop 1.x and its architecture
  • Rack awareness
  • Introduction to Block size
  • Introduction to Replication Factor(R.F)
  • Introduction to HeartBeat Signal/Pulse
  • Introduction to Block report
  • MAPREDUCE Architecture
  • What is Mapper phase?
  • What is shuffle and sort phase?
  • What is Reducer phase?
  • What is split?
  • Difference between Block and split
  • Intro to first Word Count program using MAPREDUCE
  • Different classes for running MAPREDUCE program using Java
  • Mapper class
  • Reducer Class and Its role
  • Driver class
  • Submitting the Word Count MAPREDUCE program
  • Going through the Jobs system output
  • Intro to Partitioner with example
  • Intro to Combiner with example
  • Intro to Counters and their types
  • Different types of counters
  • Different types of input/output formats in HADOOP
  • Use cases for HDFS & MapReduce programs using Java
  • Single Node cluster Installation
  • Multi Node cluster Installation
  • Introduction to Configuration files in Hadoop and their importance
  • Complete Overview of Hadoop 2.x and its architecture
  • Introduction to YARN
  • Resource Manager
  • Node Manager
  • Application Master(AM)
  • Applications Manager(AsM)
  • Journal Nodes
  • Difference Between Hadoop 1.x and Hadoop 2.x
  • High Availability(HA)
  • Hadoop Federation
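
As a small illustration of the block size and replication factor topics listed above, the sketch below estimates how many HDFS blocks a file occupies and how much raw cluster storage its replicas consume. The function name hdfs_footprint is hypothetical. The defaults assume the stock Hadoop 2.x settings (128 MB block size, replication factor 3); Hadoop 1.x used a 64 MB default block size, and both values are configurable per cluster.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=128, replication_factor=3):
    """Return (number of blocks, raw storage in MB) for a single file.

    Illustrative helper, not part of any Hadoop API.
    """
    # A file is cut into fixed-size blocks; the last block may be partly full.
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    # Each block is stored replication_factor times, and a partly full
    # block only occupies its actual size, so raw usage scales with file size.
    raw_storage_mb = file_size_mb * replication_factor
    return num_blocks, raw_storage_mb


if __name__ == "__main__":
    blocks, raw_mb = hdfs_footprint(1000)
    print(f"1000 MB file: {blocks} blocks, {raw_mb} MB of raw storage")
```

With these assumed defaults, a 1000 MB file needs 8 blocks and about 3000 MB of raw storage across the cluster.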

2. PIG

  • The difference between MAPREDUCE and PIG
  • When to go with MAPREDUCE?
  • When to go with PIG?
  • PIG data types
  • What is field in PIG?
  • What is tuple in PIG?
  • What is Bag in PIG?
  • Intro to Grunt shell
  • Different modes in PIG
  • Local Mode
  • MAPREDUCE mode
  • Running PIG programs
  • PIG Script
  • Intro to PIG UDFs
  • Writing PIG UDF using Java
  • Registering PIG UDF
  • Running PIG UDF
  • Different types of UDFs in PIG
  • Word Count program using PIG script
  • Use cases for PIG scripts

3. HIVE

  • Intro to HIVE
  • Why HIVE?
  • History of HIVE
  • Difference between PIG and HIVE
  • HIVE data types
  • Complex data types
  • What is Metastore and its importance?
  • Different types of tables in HIVE
  • Managed tables
  • External tables
  • Running HIVE queries
  • Intro to HIVE partitions
  • Intro to HIVE Buckets
  • How to perform the JOINS using HIVE queries
  • Intro to HIVE UDFs
  • Different types of UDFs in HIVE
  • Running HIVE queries for Word Count example
  • Use cases for HIVE

4. HBASE

  • Intro to HBASE
  • Intro to NoSQL database
  • Sparse and dense Concept in RDBMS
  • Intro to columnar/column oriented database
  • Core architecture of HBase
  • Why HBase?
  • HDFS vs HBase
  • Intro to Regions, Region Server and HMaster
  • Limitations of HBase
  • Integrating Hive with HBase
  • HBase commands
  • Use cases for HBASE

5. FLUME

  • Intro to Flume
  • Intro to Sink, Source, Flume Master and Flume agents
  • Importance of Flume agents
  • Live Demo on copying LOG DATA into HDFS

6. SQOOP

  • Intro to Sqoop
  • Importing RDBMS data into HDFS and exporting it back
  • Intro to incremental imports and their types
  • Use cases for importing MySQL data into HDFS

7. ZOOKEEPER

  • Intro to Zookeeper
  • Zookeeper operations

8. OOZIE

  • Intro to Oozie
  • What is job.properties
  • What is workflow.xml
  • Scheduling the jobs in Oozie
  • Scheduling MapReduce, HIVE, PIG jobs/Programs using Oozie.
  • Setting up the VMware for Hadoop
  • Installing all Hadoop Components
  • Intro to Hadoop Distributions
  • Intro to Cloudera and its major components

9. SPARK

  • Overview of Big Data and Spark
  • MapReduce limitations
  • Spark History
  • Spark Architecture
  • Benefits of Spark
  • Apache Spark - Installation
  • What is Spark Ecosystem
  • What is Scala and its utility in Spark
  • What is SparkContext
  • How to work on RDD in Spark
  • How to run a Spark Cluster
  • Comparison of MapReduce vs Spark
  • Transformations & actions
  • Loading and saving data

10. SCALA

  • Scala Introduction
  • Advantages of using Scala for Apache Spark
  • Variable declaration in Scala
  • Scala programming basics
  • Collections
  • Working with RDD in Apache Spark using Scala
  • Working with DataFrame in Apache Spark using Scala
  • Building a Machine Learning Model

11. TABLEAU

  • Tableau Fundamentals
  • Tableau Analytics
  • Visual Analytics
  • Connecting
  • Hadoop Integration with Tableau

12. HADOOP AND PYTHON

PYTHON TRAINING INCLUSIVE OF SCIKIT AND INTRODUCTION TO HADOOP

  • This module covers an introduction to Scikit-Learn, the popularity of Hadoop, the MapReduce framework and functional programming.
  • Introduction to Scikit-Learn
  • Inbuilt Algorithms for Use
  • What is Hadoop and why it is popular
  • Distributed Computation and Functional Programming
  • Understanding MapReduce Framework
  • Sample Map Reduce Job Run.

HADOOP AND PYTHON

  • This module of the Python course covers MapReduce jobs, PIG and HIVE UDFs, and Hadoop Streaming.
  • PIG and HIVE Basics
  • Streaming Feature in Hadoop
  • Map Reduce Job Run using Python
  • Writing a PIG UDF in Python
  • Writing a HIVE UDF in Python
  • Pydoop and MRjob Basics.
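
To give a feel for the "Map Reduce Job Run using Python" topic above, here is a minimal sketch of the Word Count logic in plain Python. It is an assumption-laden local simulation, not course material: the names mapper, reducer and word_count are illustrative, and the sorted() call merely stands in for the shuffle and sort phase that Hadoop performs between the map and reduce steps. In a real Hadoop Streaming job the mapper and reducer would be separate scripts reading from standard input.

```python
from itertools import groupby


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(sorted_pairs):
    """Reduce phase: sum the counts of each word.

    Expects pairs already sorted by word, as they would be after shuffle and sort.
    """
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, a local stand-in for shuffle and sort, then reduce."""
    mapped = sorted(mapper(lines))  # sorting groups identical words together
    return dict(reducer(mapped))


if __name__ == "__main__":
    sample = ["big data hadoop", "Hadoop big"]
    print(word_count(sample))
```

Keeping the mapper and reducer as independent generators mirrors how the two phases are decoupled in MapReduce: the reducer only ever sees key-sorted pairs and never the raw input.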

 


