INTRODUCTION TO HADOOP YARN
YARN, which stands for "Yet Another Resource Negotiator", was introduced in Hadoop 2.0.
YARN allows different data processing engines, such as graph processing, interactive processing, stream processing, and batch processing, to run on and process data stored in HDFS (Hadoop Distributed File System), which makes the system much more efficient.
YARN separates the functionalities of resource management and job scheduling/monitoring into separate daemons.
YARN features:
Multi-tenancy: Allows multiple processing engines to share the same cluster and the same data.
Resource Management: Allocates and monitors resources (CPU, memory, etc.) across the cluster, while separate Application Masters handle individual jobs.
Job Scheduling: Schedules various types of jobs, not just MapReduce.
High Availability: Ensures that the cluster can continue operating even if some components fail.
Resource Reservation: Allows users to specify resource requirements and deadlines for their jobs.
ARCHITECTURE
YARN consists of two main components:
Resource Manager (RM): Receives resource requests from Application Masters, allocates resources to containers, and monitors their execution.
Node Manager (NM): Runs on each node in the cluster and manages its local resources.
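The division of labor between the two components can be illustrated with a toy model. This is only a sketch under stated assumptions: the class names, the `allocate` and `launch` methods, and the first-fit placement policy are all hypothetical and do not reflect YARN's real API or placement logic.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Hypothetical stand-in for a Node Manager: tracks one node's free resources."""
    name: str
    free_memory_mb: int
    free_vcores: int
    containers: list = field(default_factory=list)

    def can_fit(self, memory_mb, vcores):
        return self.free_memory_mb >= memory_mb and self.free_vcores >= vcores

    def launch(self, container_id, memory_mb, vcores):
        # Reserve the resources locally and record the running container.
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        self.containers.append(container_id)


class ResourceManager:
    """Hypothetical stand-in for the Resource Manager: picks a node per request."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.next_id = 1

    def allocate(self, memory_mb, vcores):
        # First-fit placement: an illustrative policy, not YARN's actual one.
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                container_id = f"container_{self.next_id:04d}"
                self.next_id += 1
                node.launch(container_id, memory_mb, vcores)
                return container_id, node.name
        return None  # no node has enough free capacity right now


nodes = [NodeManager("node1", 4096, 4), NodeManager("node2", 8192, 8)]
rm = ResourceManager(nodes)
first = rm.allocate(3072, 2)   # fits on node1
second = rm.allocate(2048, 1)  # node1 has only 1024 MB left, so node2 is used
third = rm.allocate(16384, 1)  # larger than any single node, so it is not placed
```

The point of the sketch is the separation of concerns: the cluster-wide component decides where a container goes, while each node only does bookkeeping for its own memory and cores.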
Additional components:
Application Master (AM): Manages the execution of a specific application.
Container: An isolated unit of execution that encapsulates specific resources, such as memory and CPU, on a single node.
Client: Submits applications, such as MapReduce jobs, to the Resource Manager.
Schedulers: YARN supports different schedulers (FIFO, Capacity, and Fair) that determine how resources are allocated among competing applications.
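To see why the choice of scheduler matters, the sketch below compares two simplified policies on the same set of requests: strict first-in-first-out and an even (max-min fair) split. The function names, the application names, and the container counts are made up for illustration; these are not the implementations YARN uses, only a rough picture of how the policies differ.

```python
def fifo_allocate(capacity, demands):
    """Serve applications strictly in submission order."""
    grants = {}
    remaining = capacity
    for app, demand in demands:
        granted = min(demand, remaining)
        grants[app] = granted
        remaining -= granted
    return grants


def fair_allocate(capacity, demands):
    """Max-min fair share: split capacity evenly, redistributing unused shares."""
    grants = {app: 0 for app, _ in demands}
    unmet = dict(demands)
    remaining = capacity
    while unmet and remaining > 0:
        share = remaining // len(unmet)
        if share == 0:
            break  # fewer containers left than waiting applications
        for app in list(unmet):
            granted = min(share, unmet[app])
            grants[app] += granted
            unmet[app] -= granted
            remaining -= granted
            if unmet[app] == 0:
                del unmet[app]  # fully satisfied, frees its share for others
    return grants


# Hypothetical workload: (application name, containers requested).
demands = [("etl", 10), ("query", 2), ("stream", 6)]
fifo = fifo_allocate(12, demands)
fair = fair_allocate(12, demands)
print(fifo)  # {'etl': 10, 'query': 2, 'stream': 0}
print(fair)  # {'etl': 5, 'query': 2, 'stream': 5}
```

Under FIFO the last application gets nothing until earlier ones finish, while the fair split gives every application some capacity at the cost of slowing the largest one.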
Benefits:
Scalability
Compatibility
Cluster Utilization
Flexibility
High Availability
Resource Optimization
Drawbacks:
1. Complexity
2. Resource Fragmentation
3. Limited Support for Dynamic Workloads
4. Scheduling Overhead
5. Single Point of Failure
6. Limited Support for Short-Lived Jobs