Apache is a Hadoop open source and software framework for storage the large scale processing of data-sets on clusters commodity hardware.
Hadoop is an Apache for top-level project being and built used by a global community of contributors and users.
Hadoop was created by Doug Cutting ,Mike Cafarella in 2005.
It was originally developed to support and distribution for the Nutch search engine project.He called his beloved stuffed yellow elephant “Hadoop” (with the stress on the first syllable).
The Apache Hadoop framework is composed of following modules:They are :
Hadoop Common: Contains the libraries utilities needed for the other Hadoop modules
Hadoop Distributed File System (HDFS): A Distributed file-system that stores the data on commodity machines, providing by very high aggregate bandwidth across the cluster.
Hadoop YARN: A Resource-management platform is responsible for managing the compute resources for clusters using them for scheduling the users’ applications.
Hadoop MapReduce: A programming model for using the large scale data processing.
All the modules of Hadoop was designed with a fundamental assumption that hardware failures (of individual machines, or racks of machines) are common and that should be automatically handled by the software framework.
Apache Hadoop’s MapReduce and HDFS components was originally derived and respectively from Google’s MapReduce and Google File System (GFS) papers.
Beyond HDFS, YARN and MapReduce, that entire the Apache Hadoop “platform” is now commonly considered to consist of a number of related projects such as Apache Pig, Apache Hive, Apache HBase, and others.
Core topics of HADOOP BIG DATA Online Course
Watch HADOOP BIG DATA Demo Video