Hadoop Architecture

A Detailed Explanation

What is Hadoop?

An open-source software framework for the reliable, distributed storage and processing of very large amounts of data across clusters of commodity hardware. Data can be ingested from many sources, such as databases, web servers, and file systems.

Components of Hadoop Architecture

Hadoop's architecture is built from four main components that together provide distributed storage and processing across a cluster: HDFS, YARN, MapReduce, and Hadoop Common.

1. Hadoop HDFS (Hadoop Distributed File System)

The storage layer of Hadoop. HDFS splits files into large blocks, replicates each block across multiple machines (DataNodes), and tracks the file-system metadata on a NameNode. This provides fault-tolerant storage of very large files on commodity hardware.

2. Hadoop YARN (Yet Another Resource Negotiator)

The resource-management and job-scheduling layer. YARN decides which applications get which cluster resources (CPU and memory) and when, based on what is available, and it launches and monitors their tasks.

3. Hadoop MapReduce

The processing layer: a software framework that facilitates the reliable and efficient parallel processing of large amounts of data stored in HDFS. A job is expressed as a map phase, which transforms input records into key-value pairs, and a reduce phase, which aggregates all values sharing a key. It is batch-oriented and well suited to data-intensive workloads.

4. Hadoop Common

The shared Java libraries, files, and scripts needed by all other Hadoop cluster components. HDFS, YARN, and MapReduce all rely on Hadoop Common to run the cluster.
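As an illustration of how HDFS stores data, here is a minimal Python sketch of splitting a file into fixed-size blocks and assigning replicas to DataNodes. The tiny block size, node names, and round-robin placement are assumptions for illustration only; real HDFS uses 128 MB blocks by default and rack-aware replica placement.

```python
BLOCK_SIZE = 4     # bytes; tiny for illustration (real HDFS defaults to 128 MB)
REPLICATION = 3    # HDFS's default replication factor
DATANODES = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical node names

def split_into_blocks(data, block_size=BLOCK_SIZE):
    # The NameNode-side view: a file is just an ordered list of blocks.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks, nodes=DATANODES, replication=REPLICATION):
    # Simplified round-robin placement onto distinct nodes;
    # real HDFS placement is rack-aware.
    placement = []
    for b in range(num_blocks):
        replicas = [nodes[(b + r) % len(nodes)] for r in range(replication)]
        placement.append(replicas)
    return placement

blocks = split_into_blocks(b"hello hadoop")   # 12 bytes -> 3 blocks
placement = place_replicas(len(blocks))
print(len(blocks))      # 3
print(placement[0])     # ['dn1', 'dn2', 'dn3']
```

Because every block lives on several nodes, the loss of any single DataNode never loses data, and reads can be served from whichever replica is closest.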
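To make the MapReduce programming model concrete, here is a minimal word-count sketch in plain Python. It imitates the map, shuffle, and reduce phases in-process; the function names and sample inputs are illustrative, not Hadoop's actual Java API.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input split.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all intermediate values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate every value that shares a key.
    return (key, sum(values))

splits = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [pair for split in splits for pair in map_phase(split)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["the"])  # 3
print(counts["fox"])  # 2
```

In a real cluster the map tasks run in parallel on the nodes holding each HDFS block, and the shuffle moves data over the network, which is what lets the same three-phase model scale to terabytes.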
