Module-11


Module 11 : Apache Pig : Available (Length 52 Minutes)

1. What is Pig ?

2. Introduction to Pig Data Flow Engine

3. Pig and MapReduce in Detail

4. When Should Pig Be Used?

5. Pig and Hadoop Cluster

6. Pig Interpreter and MapReduce

7. Pig Relations and Data Types

8. Pig Latin Example in Detail

9. Debugging and Generating Example in Apache Pig

Spark Specialization Components

1. O'Reilly Apache Spark Certification

2. Apache Spark Training

3. Cloudera CCA175 Hadoop and Spark Certification

4. Apache Hadoop Professional Training

Introduction : 

Apache Pig is an important component of the Hadoop ecosystem. It is a mature component that is used in production. Pig provides a data flow engine for processing data stored in HDFS (Hadoop Distributed File System), so you can avoid writing MapReduce jobs by hand. A Pig Latin script lets you express a long series of data operations as a single flow, for example:

WebServerLogs --> Extract Relevant Info --> Apply Transformation (e.g. mapping between page content and userid, aggregate, join, sort, etc.) --> Load into Hive Table
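The web-server-log flow above could look roughly like this in Pig Latin. This is only a minimal sketch under assumptions: the input path, the tab-separated layout, the field names (userid, page, bytes) and the output directory are all hypothetical, and the last step writes plain files to HDFS that a Hive table could be defined over, rather than writing into Hive directly.

```pig
-- Load raw web server logs (hypothetical path and schema).
logs = LOAD '/data/weblogs/access.log' USING PigStorage('\t')
       AS (userid:chararray, page:chararray, bytes:long);

-- Extract relevant info: keep only records that carry a user id.
valid = FILTER logs BY userid IS NOT NULL;

-- Apply transformation: aggregate hits and bytes per user.
by_user = GROUP valid BY userid;
stats   = FOREACH by_user GENERATE
              group            AS userid,
              COUNT(valid)     AS hits,
              SUM(valid.bytes) AS total_bytes;

-- Sort the most active users first.
sorted = ORDER stats BY hits DESC;

-- Write the result to HDFS (hypothetical output directory);
-- a Hive table could then be pointed at this location.
STORE sorted INTO '/data/weblogs/user_stats' USING PigStorage(',');
```

Each statement names a relation that the next statement consumes, which is what makes the script read as a data flow; Pig turns the whole chain into MapReduce jobs only when STORE (or DUMP) is reached.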

          

Other Benefits : 

Components of Apache Pig : Pig has the following components.

Execution Mode : Pig can be executed in any of the modes below.