17 Years of EXCELLENCE.

HADOOP

HADOOP – Big Data – Content

Hadoop

  • What is Big Data
  • What is Hadoop
  • Relation between Big Data and Hadoop
  • Why Hadoop is needed
  • Challenges with Big Data
    • Storage
    • Processing
  • Comparison with other technologies
    • RDBMS
    • Data Warehouse
    • Teradata

Components of the Hadoop Ecosystem

  • Storage Components
  • Processing Components

HDFS (Hadoop Distributed File System)

  • What is a cluster environment
  • Cluster vs. Hadoop Cluster
  • Features of HDFS
  • Storage aspects of HDFS
    • Block
      • Configuring the Block size
      • Why the HDFS Block size is so large
      • Design Principles of Block size

HDFS Architecture – 5 Daemons of Hadoop

  • Name Node
  • Data Node
  • Secondary Name Node
  • Job Tracker
  • Task Tracker

Replication in Hadoop – Failover Mechanism

  • Data Storage in Data Nodes
  • Replication
  • Custom Replication

MapReduce

  • Why MapReduce is essential in Hadoop
  • Processing Daemons of Hadoop
    • Job Tracker
      • Roles of Job Tracker
      • How to configure Job Tracker in Hadoop
    • Task Tracker
      • Roles of Task Tracker
      • Drawbacks with respect to failure in the cluster

Input Split

  • Need for Input Split
  • Input Split Size
  • Input Split size vs. Block size
  • Input Splits vs. Mappers

MapReduce Programming Model

  • Different phases of the MapReduce Algorithm
  • Data Types in MapReduce

Basic MapReduce Program

  • Driver Code
  • Mapper Code
  • Reducer Code

Combiner in MapReduce

Partitioner in MapReduce

Joins in MapReduce

  • Map-side join
  • Reduce-side join
  • Performance trade-off

MapReduce Streaming

Apache Pig

  • Introduction to Pig
  • MapReduce vs. Pig
  • SQL vs. Pig
  • Data Types in Pig
  • Execution Modes of Pig (Local / Distributed)
  • Execution Mechanisms (Grunt Shell, Script)
  • Writing a simple Pig script
  • Bags, Tuples, and Fields in Pig
  • UDFs in Pig

Hive

  • Need for Apache Hive
  • Hive Architecture (Driver, Compiler, Executor)
  • Hive Query Language (HiveQL)
  • SQL vs. HiveQL
  • Collection Data Types in Hive (Array, Struct, Map)
  • UDFs in Hive
  • UDAFs
  • UDTFs
  • SerDe (Hive Serializer / Deserializer)

Sqoop

  • Introduction
  • MySQL Initialization
  • Connecting to an RDBMS using Sqoop
  • Sqoop Commands

HBase

  • Introduction
  • HDFS vs. HBase
  • HBase Architecture
  • MapReduce over HBase

Prerequisites: Core Java + Linux Commands

Time Duration: 5 Weeks [30 Hrs + lab]

Address:

#7-20, 4th Floor,
Kamala Land Mark,
Beside Konark Theater,
Dilsukh Nagar,
Hyderabad - 500 060.
Telangana State, INDIA.

Landline:

040 - 6625 2272
040 - 6625 2273
040 - 6625 2274

Mobile:

+91 - 93999 74756
+91 - 903 006 1377
+91 - 939 249 1377

E-Mail:

info.svgroups@gmail.com
team@svinfotech.in
jobs@svinfotech.in
director@svinfotech.in