Big Data Hadoop Data Analytics Training



 (4.9) | 1500 Ratings




Big Data Hadoop Data Analytics Training Details

Track             Regular Track          Weekend Track          Fast Track
Course Duration   35 Hours               8 Weekends             5 Days
Hours per Day     1 Hour a day           2 Hours a day          6 Hours a day
Training Mode     Online Classroom       Online Classroom       Online Classroom
Delivery          Instructor-Led, Live   Instructor-Led, Live   Instructor-Led, Live


Course Curriculum

Big Data Hadoop Data Analytics Details

Introduction

  • About this Course
  • About Big Data
  • Course Logistics
  • Introductions

Hadoop Fundamentals

  • The Motivation for Hadoop
  • Hadoop Overview
  • HDFS
  • MapReduce
  • The Hadoop Ecosystem
  • Lab Scenario Explanation
  • Hands-On Exercise: Data Ingest with Hadoop Tools

Introduction to Pig

  • What Is Pig?
  • Pig’s Features
  • Pig Use Cases
  • Interacting with Pig

Basic Data Analysis with Pig

  • Pig Latin Syntax
  • Loading Data
  • Simple Data Types
  • Field Definitions
  • Data Output
  • Viewing the Schema
  • Filtering and Sorting Data
  • Commonly Used Functions
  • Hands-On Exercise: Using Pig for ETL Processing

Processing Complex Data with Pig

  • Storage Formats
  • Complex/Nested Data Types
  • Grouping
  • Built-in Functions for Complex Data
  • Iterating Grouped Data
  • Hands-On Exercise: Analyzing Ad Campaign Data with Pig

Multi-Dataset Operations with Pig

  • Techniques for Combining Data Sets
  • Joining Data Sets in Pig
  • Set Operations
  • Splitting Data Sets
  • Hands-On Exercise: Analyzing Disparate Data Sets with Pig

Extending Pig

  • Adding Flexibility with Parameters
  • Macros and Imports
  • UDFs
  • Contributed Functions
  • Using Other Languages to Process Data with Pig
  • Hands-On Exercise: Extending Pig with Streaming and UDFs

Pig Troubleshooting and Optimization

  • Troubleshooting Pig
  • Logging
  • Using Hadoop’s Web UI
  • Optional Demo: Troubleshooting a Failed Job with the Web UI
  • Data Sampling and Debugging
  • Performance Overview
  • Understanding the Execution Plan
  • Tips for Improving the Performance of Your Pig Jobs

Introduction to Hive

  • What Is Hive?
  • Hive Schema and Data Storage
  • Comparing Hive to Traditional Databases
  • Hive vs. Pig
  • Hive Use Cases
  • Interacting with Hive

Relational Data Analysis with Hive

  • Hive Databases and Tables
  • Basic HiveQL Syntax
  • Data Types
  • Joining Data Sets
  • Common Built-in Functions
  • Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue

Hive Data Management

  • Hive Data Formats
  • Creating Databases and Hive-Managed Tables
  • Loading Data into Hive
  • Altering Databases and Tables
  • Self-Managed Tables
  • Simplifying Queries with Views
  • Storing Query Results
  • Controlling Access to Data
  • Hands-On Exercise: Data Management with Hive

Text Processing with Hive

  • Overview of Text Processing
  • Important String Functions
  • Using Regular Expressions in Hive
  • Sentiment Analysis and N-Grams
  • Hands-On Exercise (Optional): Gaining Insight with Sentiment Analysis

Hive Optimization

  • Understanding Query Performance
  • Controlling Job Execution Plan
  • Partitioning
  • Bucketing
  • Indexing Data

Extending Hive

  • SerDes
  • Data Transformation with Custom Scripts
  • User-Defined Functions
  • Parameterized Queries
  • Hands-On Exercise: Data Transformation with Hive

Introduction to Impala

  • What Is Impala?
  • How Impala Differs from Hive and Pig
  • How Impala Differs from Relational Databases
  • Limitations and Future Directions
  • Using the Impala Shell

Analyzing Data with Impala

  • Basic Syntax
  • Data Types
  • Filtering, Sorting, and Limiting Results
  • Joining and Grouping Data
  • Improving Impala Performance
  • Hands-On Exercise: Interactive Analysis with Impala

Choosing the Best Tool for the Job

  • Comparing MapReduce, Pig, Hive, Impala, and Relational Databases
  • Which to Choose?




Help and Support: info@sacrostectservices.com
Inquiries: +91 996-629-7972 (IND)
