About Course

Course Information

Apache Hadoop is an open-source software framework for the storage and large-scale processing of data sets on clusters of commodity hardware.

Hadoop is an Apache top-level project, built and used by a global community of contributors and users.

Hadoop was created by Doug Cutting and Mike Cafarella in 2005.

It was originally developed to support distribution for the Nutch search engine project. Cutting named the project after his son's stuffed yellow elephant, "Hadoop" (with the stress on the first syllable).

The Apache Hadoop framework is composed of the following modules:

Hadoop Common: contains the libraries and utilities needed by the other Hadoop modules.

Hadoop Distributed File System (HDFS): a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.

Hadoop YARN: a resource-management platform responsible for managing compute resources in clusters and using them for scheduling users' applications.

Hadoop MapReduce: a programming model for large-scale data processing (see the sketch after this list).
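To make the MapReduce programming model concrete, here is a minimal sketch of the classic word-count job written against the org.apache.hadoop.mapreduce API. The class name and the input/output paths passed on the command line are illustrative, not part of the course material.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every word in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each mapper
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

When the job is submitted, the map tasks run in parallel across the HDFS blocks of the input, and the reduce tasks aggregate the per-word counts; this division of work is what lets MapReduce scale across a cluster.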

All the modules in Hadoop are designed with the fundamental assumption that hardware failures (of individual machines, or racks of machines) are common, and should therefore be handled automatically in software by the framework.
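One concrete example of this assumption is HDFS block replication: each block is copied to several DataNodes, so the loss of a single machine does not lose data. The sketch below, which assumes a hypothetical NameNode address and file path, shows a client writing a file with a requested replication factor of three.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicatedWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode:8020"); // hypothetical NameNode address
    conf.setInt("dfs.replication", 3); // keep three copies of each block

    // Write a small file; HDFS replicates its blocks across DataNodes.
    try (FileSystem fs = FileSystem.get(conf);
         FSDataOutputStream out = fs.create(new Path("/demo/hello.txt"))) {
      out.writeUTF("Hello, HDFS");
    }
  }
}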

Apache Hadoop’s MapReduce and HDFS components were originally derived from Google’s MapReduce and Google File System (GFS) papers, respectively.

Beyond HDFS, YARN, and MapReduce, the entire Apache Hadoop “platform” is now commonly considered to consist of a number of related projects as well, such as Apache Pig, Apache Hive, Apache HBase, and others.

Course Content

Please download the detailed course content.

Core topics of the HADOOP BIG DATA Online Course

Demo Video

Watch HADOOP BIG DATA Demo Video

About Trainer

Trainer Information