Big Data Hadoop Administration Training

Rating: 4.9 (750 ratings)


Introduction


Big Data Hadoop Administration Training Details

Track             Regular Track         Weekend Track         Fast Track
Course Duration   35 Hrs                8 Weekends            5 Days
Hours             1 hr/day              2 hours a day         6 hours a day
Training Mode     Online Classroom      Online Classroom      Online Classroom
Delivery          Instructor Led-Live   Instructor Led-Live   Instructor Led-Live


Course Curriculum

Understanding Big Data and Hadoop

  • Introduction to big data
  • Limitations of existing solutions
  • Hadoop architecture
  • Hadoop components and ecosystem
  • Data loading and reading from HDFS
  • Replication rules
  • Rack awareness theory
  • Hadoop cluster administrator: roles and responsibilities
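The replication rules and rack awareness topics listed above can be illustrated with a small sketch. It assumes the commonly described default HDFS placement for a replication factor of 3: first replica on the writer's node, second on a node in a different rack, third on another node in the second replica's rack. The function name, the topology dictionary, and the node names are hypothetical, and the sketch assumes at least two racks with two nodes each.

```python
import random


def place_replicas(writer, topology, replication=3, rng=None):
    """Choose target nodes for one block (illustrative only).

    topology maps a rack name to the list of node names in that rack.
    Only replication factors 1 to 3 are covered by this sketch.
    """
    if not 1 <= replication <= 3:
        raise ValueError("this sketch only covers replication 1 to 3")
    rng = rng or random.Random()
    # Reverse lookup: node name -> rack name.
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}

    targets = [writer]  # replica 1: the node the client writes from
    if replication >= 2:
        # Replica 2: any node that is NOT in the writer's rack.
        remote = [n for n in rack_of if rack_of[n] != rack_of[writer]]
        second = rng.choice(remote)
        targets.append(second)
    if replication >= 3:
        # Replica 3: a different node in the same rack as replica 2.
        peers = [n for n in topology[rack_of[second]] if n != second]
        targets.append(rng.choice(peers))
    return targets


topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
print(place_replicas("n1", topology, rng=random.Random(0)))
```

With this layout, losing a whole rack still leaves at least one replica available, which is the point of rack awareness.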

Hadoop Architecture And Cluster Setup

  • Hadoop server roles and their usage
  • Hadoop installation and initial configuration
  • Deploying Hadoop in pseudo-distributed mode
  • Deploying a multi-node Hadoop cluster
  • Installing Hadoop clients
  • Understanding how HDFS works and resolving simulated problems

Hadoop Cluster Administration & Understanding MapReduce

  • Understanding the secondary namenode
  • Working with a distributed Hadoop cluster
  • Commissioning and decommissioning of nodes
  • Understanding MapReduce
  • Understanding schedulers and enabling them

Backup, Recovery And Maintenance

  • Key admin commands such as Balancer, Trash, Import Checkpoint, and DistCp
  • Data backup and recovery
  • Enabling trash
  • Namespace count quota or space quota
  • Manual failover or metadata recovery
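The two quota types listed above differ in what they count. As a minimal sketch, assuming the usual HDFS accounting (a name quota counts files and directories in the tree, including the directory itself, and a space quota is charged for every replica of every block), the check could look like this. The function names and parameters are hypothetical, and block-level reservation during writes is ignored.

```python
def space_consumed(file_sizes_bytes, replication=3):
    """Bytes charged against a space quota: each replica counts."""
    return sum(file_sizes_bytes) * replication


def within_quotas(file_sizes_bytes, num_dirs, name_quota, space_quota,
                  replication=3):
    """Return True if the directory tree fits both quotas.

    num_dirs includes the quota directory itself.
    """
    names_used = len(file_sizes_bytes) + num_dirs
    space_used = space_consumed(file_sizes_bytes, replication)
    return names_used <= name_quota and space_used <= space_quota


# Three 100-byte files in one directory, replication factor 3:
# 4 names and 900 bytes are charged.
print(within_quotas([100, 100, 100], num_dirs=1,
                    name_quota=4, space_quota=1000))
```

A directory holding 1 GB of files therefore needs roughly a 3 GB space quota at the default replication factor of 3.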

Hadoop 2.0 Cluster: Planning And Management

  • Planning a Hadoop 2.0 cluster
  • Cluster sizing
  • Hardware, network, and software considerations
  • Popular Hadoop distributions
  • Workload and usage patterns
  • Industry recommendations
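Cluster sizing usually starts with simple arithmetic. The sketch below is one rough approach, not a recommendation from this course: it multiplies the logical data volume by the replication factor and leaves a fraction of each disk free for intermediate and non-HDFS data. The 25% reserve, the 24 TB of disk per node, and the function name are all assumptions chosen for illustration.

```python
import math


def datanodes_needed(logical_tb, replication=3, reserved_fraction=0.25,
                     disk_tb_per_node=24.0):
    """Estimate raw storage and DataNode count (illustrative figures).

    reserved_fraction is the share of disk kept free for temporary and
    non-HDFS data; disk_tb_per_node is the raw disk per DataNode.
    """
    raw_tb = logical_tb * replication / (1.0 - reserved_fraction)
    nodes = math.ceil(raw_tb / disk_tb_per_node)
    return raw_tb, nodes


raw_tb, nodes = datanodes_needed(100)
print(raw_tb, nodes)  # 100 TB of data -> 400.0 TB raw, 17 nodes
```

Real planning also has to account for growth rate, compression, and workload patterns, which this estimate leaves out.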

Hadoop 2.0 And Its Features

  • Limitations of Hadoop 1.x
  • Features of Hadoop 2.0
  • YARN framework
  • MRv2
  • Hadoop high availability and federation
  • YARN ecosystem and Hadoop 2.0 cluster setup

Setting Up Hadoop 2.X With High Availability And Upgrading Hadoop

  • Configuring Hadoop 2 with high availability
  • Upgrading to Hadoop 2
  • Working with Sqoop
  • Understanding Oozie
  • Working with Hive
  • Working with HBase

Project: Cloudera Manager And Cluster Setup, Overview On Kerberos

  • Cloudera Manager and cluster setup
  • Hive administration
  • HBase architecture
  • HBase setup
  • Hadoop/Hive/HBase performance optimization
  • Pig setup and working with the Grunt shell
  • Why Kerberos is needed and how it helps

Advanced Hadoop Admin Content:

Module 1 – Learning Objectives:

  • By the end of the module, the student will understand the basics of big data and have a foundation in the Hadoop daemons and Hadoop architecture.
  • Understanding Big Data Basics
  • Big Data Use Cases
  • Introduction to Hadoop
  • Understanding Hadoop Ecosystem
  • Introduction to HDFS
  • Introduction to Namenode
  • Introduction to Datanode
  • Introduction to Secondary Namenode
  • Introduction to MapReduce
  • Introduction to JobTracker
  • Introduction to TaskTracker
  • Summarizing Hadoop Architecture
  • Roles and Responsibilities of a Hadoop Administrator

Module 2 – Learning Objectives:

  • By the end of the module, the student will be able to create a multi-node Hadoop cluster. To prepare students for this, the module gives a thorough understanding of how Linux works, how to set up virtual machines, and how to set up passwordless SSH.
  • Linux internals
  • Commands that are required
  • Linux basics
  • Hadoop Cluster Installation Pre-requisites
  • Pre-requisites of Hadoop Installation
    • Software download
    • Preparing your VM
    • Enabling VM with VMware
    • Understanding mandatory changes in the operating system
  • Installation and Configuration
  • Understanding Hadoop cluster installation modes
  • Understanding Hadoop version 1 installation and configuration
  • Passwordless SSH setup
  • Hands-On Practice for creating a Hadoop cluster
  • Individual help with practicing Hadoop cluster installation

Module 3 – Learning Objectives:

  • By the end of the module, the student will understand how to plan a production Hadoop cluster, including the hardware and software requirements of a Hadoop cluster, performance tuning after cluster creation, and benchmarking.

Module 4 – Learning Objectives:

  • By the end of the module, the student will be able to administer a Hadoop cluster. Students will understand how to copy data from one Hadoop cluster to another, how to use the different Hadoop schedulers to run jobs, and how to back up and recover metadata, data, configurations, and application data.

Module 5 – Learning Objectives:

  • By the end of the module, the student will understand how Hadoop version 2 and YARN work. The module covers the new features of Hadoop version 2, the YARN framework, and deploying a Hadoop 2 cluster in pseudo-distributed and multi-node mode.
  • Hadoop 2.0 new features
  • YARN
  • Understanding Resource Manager
  • Understanding Application Master
  • Understanding Node Manager
  • Understanding Hadoop 2 Job Execution Framework
  • Hadoop 2 Multi-node cluster creation
  • Pre-requisites of Hadoop Installation
    • Software download
    • Preparing your VM
    • Enabling VM with VMware
    • Understanding mandatory changes in the operating system
  • Installation and Configuration
  • Understanding Hadoop version 2 installation and configuration
  • Passwordless SSH setup

Module 6 – Learning Objectives:

  • By the end of the module, the student will know how to achieve high availability, how to enable namenode federation, and what the main improvements in Hadoop 2 are.
  • Practice Hadoop 2 multi-node cluster creation
    • Helping individuals in practicing Hadoop 2 cluster installation
  • Sample YARN job execution
  • Understanding issues of Hadoop 1
  • Understanding improvements in Hadoop 2
  • Namenode Federation
    • Enable segregation of HDFS using multiple namenodes
  • Namenode High Availability
    • Achieving Namenode High Availability using Quorum Journal Manager
    • Achieving Namenode High Availability using Network File System
  • Implementation of NN High Availability
    • Helping individuals achieve Namenode High Availability
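For the Quorum Journal Manager approach mentioned above, the key idea is majority agreement among JournalNodes. As a small sketch, assuming the standard majority rule (an edit is durable once more than half of the JournalNodes acknowledge it, so N nodes tolerate (N - 1) / 2 failures, rounded down), the arithmetic looks like this. The function names are hypothetical.

```python
def journal_quorum(num_journal_nodes):
    """Smallest number of JournalNodes that forms a majority."""
    return num_journal_nodes // 2 + 1


def tolerated_failures(num_journal_nodes):
    """How many JournalNodes can fail while a majority remains."""
    return (num_journal_nodes - 1) // 2


for n in (3, 4, 5):
    print(n, journal_quorum(n), tolerated_failures(n))
```

This is why an odd number of JournalNodes is the usual choice: a fourth node raises the quorum size without tolerating any additional failure.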

Module 7 – Learning Objectives:

  • By the end of the module, the student will be able to administer the basics of Hadoop ecosystem components such as Hive, HBase, Sqoop, Flume, and Pig.
  • Hadoop Ecosystem Introduction
    • Understanding the integration of the Hadoop ecosystem
  • Touching base with Hive
    • What is Hive?
    • Architecture of Hive
    • Understanding Hive metastore concepts
  • HBase
    • Understanding HBase basics
    • Understanding the HBase storage model
    • Understanding HBase architecture
    • Cluster installation and configuration
  • Pig
    • What is Pig?
    • How does Pig integrate with a Hadoop cluster?
    • Demo of Pig jobs using MapReduce
  • Sqoop
    • What is Sqoop?
    • How to import and export data between Hadoop and an RDBMS with Sqoop
    • Example of Sqoop jobs using MySQL
  • Flume
    • What is Flume?
    • Sample Flume jobs

Module 8 – Learning Objectives:

  • By the end of the module, the student will be able to build a multi-node Cloudera cluster using Cloudera Manager, achieve high availability, and add a new node to the cluster using Cloudera Manager.
  • Understanding the internals of Cloudera Manager
  • Understanding the automation of Hadoop installation using Cloudera Manager
  • Understanding Cloudera Hadoop Distribution and Cloudera Manager
  • Understanding the underlying directory structure of Cloudera Hadoop
  • Cloudera Hadoop Cluster Installation – CDH

Practice Test & Interview Questions

 

Exam & Certification

Help and Support: info@sacrostectservices.com | For inquiries, call: +91 996-629-7972 (IND)