Hadoop Testing Training Course Content
1. Introduction to Hadoop and its Ecosystem, Map Reduce and HDFS
Big Data, Factors constituting Big Data
Hadoop and Hadoop Ecosystem
Map Reduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle
Hadoop Distributed File System (HDFS) Concepts and its Importance
Deep Dive into Map Reduce – Execution Framework, Partitioner, Combiner, Data Types, Key-Value Pairs
HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow
Parallel Copying with DISTCP, Hadoop Archives
2. Hands-on Exercises
Installing Hadoop in Pseudo-Distributed Mode, Understanding Important Configuration Files, their Properties, and Daemon Threads
Accessing HDFS from Command Line
Map Reduce – Basic Exercises
Understanding the Hadoop Ecosystem
Introduction to Sqoop, use cases, and Installation
Introduction to Hive, use cases, and Installation
Introduction to Pig, use cases, and Installation
Introduction to Oozie, use cases, and Installation
Introduction to Flume, use cases, and Installation
Introduction to YARN
3. Map Reduce
How to develop a Map Reduce application and write a unit test
Best practices for developing, writing, and debugging Map Reduce applications
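As a rough illustration of the develop-then-unit-test workflow this module covers, the sketch below models a word count job in plain Python. It does not use any Hadoop API: the function names (map_words, shuffle, reduce_counts, run_job) are hypothetical, and the shuffle step only imitates, in-process, the grouping that the framework performs between map and reduce.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all values by key (done by the framework in Hadoop)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in-process over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


def test_word_count() -> None:
    """Unit test: check each step on a tiny, known input."""
    assert list(map_words("Big data big")) == [("big", 1), ("data", 1), ("big", 1)]
    assert reduce_counts("big", [1, 1]) == ("big", 2)
    assert run_job(["big data", "big"]) == {"big": 2, "data": 1}


if __name__ == "__main__":
    test_word_count()
```

Keeping the map and reduce logic in small pure functions is what makes this kind of test cheap: each step can be checked without a cluster, which is the same idea a Map Reduce unit test applies to real mapper and reducer classes.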
4. Pig
4.1. Introduction to Pig
What Is Pig?
Pig’s Features
Pig Use Cases
Interacting with Pig
4.2. Basic Data Analysis with Pig
Pig Latin Syntax
Loading Data
Simple Data Types
Field Definitions
Data Output
Viewing the Schema
Filtering and Sorting Data
Commonly-Used Functions
Hands-On Exercise: Using Pig for ETL Processing
5. Hive
5.1. Introduction to Hive
What Is Hive?
Hive Schema and Data Storage
Comparing Hive to Traditional Databases
Hive vs. Pig
Hive Use Cases
Interacting with Hive
5.2. Relational Data Analysis with Hive
Hive Databases and Tables
Basic HiveQL Syntax
Data Types
Joining Data Sets
Common Built-in Functions
Hands-On Exercise: Running Hive Queries from the Shell, Scripts, and Hue
6. Hadoop Stack Integration Testing
Why Hadoop testing is important
Unit testing
Integration testing
Performance testing
Diagnostics
Nightly QA test
Benchmark and end to end tests
Functional testing
Release certification testing
Security testing
Scalability Testing
Commissioning and Decommissioning of Data Nodes Testing
Reliability testing
Release testing
7. Roles and Responsibilities of Hadoop Testing
Understanding the requirements, preparing test estimates, test cases, and test data, testbed creation, test execution, defect reporting, defect retest, daily status report delivery, test completion.
ETL testing at every stage (HDFS, Hive, HBase) while loading the input (logs, files, records, etc.) using Sqoop/Flume, including but not limited to data verification and reconciliation.
User authorization and authentication testing (groups, users, privileges, etc.)
Reporting defects to the development team or manager and driving them to closure.
Consolidating all the defects and creating defect reports.
Validating new features and issues in core Hadoop.
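The data verification and reconciliation step in ETL testing can be sketched roughly as follows. This is a minimal plain-Python example, assuming the source and target records have already been read into memory as tuples. The names fingerprint and reconcile are hypothetical, and nothing here calls Sqoop, Flume, Hive, or HDFS. It compares row counts and an order-independent checksum, so a reordered load still reconciles while a dropped or altered record does not.

```python
import hashlib
from typing import Dict, Iterable, Tuple

_MODULUS = 2 ** 256  # keeps the running checksum at a fixed size


def fingerprint(rows: Iterable[Tuple]) -> Tuple[int, int]:
    """Return (row count, order-independent checksum) for the given rows."""
    count = 0
    checksum = 0
    for row in rows:
        # Join fields with a separator unlikely to appear in the data.
        encoded = "\x1f".join(str(field) for field in row).encode("utf-8")
        row_hash = int.from_bytes(hashlib.sha256(encoded).digest(), "big")
        # Addition is commutative, so row order does not affect the result.
        checksum = (checksum + row_hash) % _MODULUS
        count += 1
    return count, checksum


def reconcile(source_rows: Iterable[Tuple], target_rows: Iterable[Tuple]) -> Dict[str, bool]:
    """Compare source and target by row count and checksum."""
    src_count, src_sum = fingerprint(source_rows)
    tgt_count, tgt_sum = fingerprint(target_rows)
    return {
        "count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
    }


if __name__ == "__main__":
    source = [(1, "alice"), (2, "bob")]
    target = [(2, "bob"), (1, "alice")]  # same records, different order
    print(reconcile(source, target))
```

In practice the two row sets would come from the system of record and from the loaded store at each stage, and a mismatch would be raised as a defect with the failing stage noted.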
8. The MRUnit Framework for Testing Map Reduce Programs
Responsible for creating a testing framework, using MRUnit, for testing Map Reduce programs.
9. Test Execution of Hadoop (Customized)
Test plan for HDFS upgrade
Test automation and results
10. Test Plan, Strategy, and Test Cases for Hadoop Testing
How to test installation and configuration
Getting the right solution based on the criteria curated by the SoftPro9 team