Friday, 27 December 2013

YARN: MapReduce 2

How does YARN overcome the shortcomings of “classic” MapReduce?


– By splitting the responsibilities of the jobtracker into separate entities. In MapReduce 1, a single jobtracker takes care of both job scheduling and task progress monitoring. YARN separates these two roles into two daemons: a resource manager and an application master. The resource manager manages the use of resources across the cluster, while an application master manages the lifecycle of an application running on the cluster. The application master negotiates with the resource manager for cluster resources (a number of containers) and then runs application-specific processes in those containers. The containers are monitored by node managers running on the cluster nodes, which make sure that each container uses only the resources it has been allocated and no more.

YARN is more general than MapReduce; in fact, MapReduce is just one type of YARN application. The best thing about the YARN design is that different YARN applications can coexist on the same cluster. It is also possible for users to run different versions of MapReduce on the same YARN cluster, which makes the process of upgrading MapReduce more manageable. MapReduce on YARN involves more entities than classic MapReduce (MapReduce 1). They are listed below, followed by a short client-side sketch that shows YARN's generality:

  • The client
  • The YARN Resource Manager
  • The YARN Node Manager
  • The MapReduce Application Master
  • The Distributed Filesystem
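
Because MapReduce is only one kind of YARN application, any framework can ask the resource manager for resources through the same client API. The sketch below is a rough, hedged illustration (not code from this post) using the org.apache.hadoop.yarn.client.api.YarnClient API from Hadoop 2.x; the application name, queue, and resource sizes are placeholders, and a real application would also fill in the launch command and local resources for its application master container.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class MinimalYarnApp {
  public static void main(String[] args) throws Exception {
    // Connect to the resource manager configured in yarn-site.xml
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration(new Configuration()));
    yarnClient.start();

    // Ask the resource manager for a new application ID (compare step 2 below)
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationId appId = app.getNewApplicationResponse().getApplicationId();
    System.out.println("Got application ID: " + appId);

    // Describe the application master container; the values are placeholders.
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
    ctx.setApplicationName("demo-app");
    ctx.setQueue("default");
    ctx.setResource(Resource.newInstance(1024, 1));   // 1 GB, 1 vcore for the AM
    ctx.setAMContainerSpec(Records.newRecord(ContainerLaunchContext.class)); // launch command omitted

    // Hand the application over to the resource manager (compare step 4 below)
    yarnClient.submitApplication(ctx);
    yarnClient.stop();
  }
}

MapReduce's own JobSubmitter goes through the same resource manager calls; it just hides them behind the familiar Job API.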

The process of running a job is shown below.

Job Submission


  • Step 1 in the figure: The submit() method on Job creates an internal JobSubmitter instance and calls submitJobInternal() on it. The job submission process implemented by JobSubmitter does the following (a minimal driver that triggers this path is sketched after this list).
  • Step 2: Asks the resource manager for a new job ID.
  • Step 3: Checks the output specification of the job, computes the input splits, and copies the job resources (the job JAR, configuration, and split information) to HDFS.
  • Step 4: Finally, the job is submitted by calling submitApplication() on the resource manager.
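
As a concrete reference point for steps 1-4, here is a minimal driver (a sketch, not code from this post) whose waitForCompletion() call ends up invoking submitApplication() on the resource manager. It relies on the default identity mapper and reducer to stay short; the input and output paths come from the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PassThroughDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapreduce.framework.name", "yarn");   // run on YARN rather than locally or on MR1

    Job job = Job.getInstance(conf, "pass-through");
    job.setJarByClass(PassThroughDriver.class);              // the job JAR copied to HDFS in step 3

    FileInputFormat.addInputPath(job, new Path(args[0]));    // input splits are computed over this path
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output specification checked in step 3

    // waitForCompletion() calls submit(), which performs steps 1-4 and then
    // polls the application master for progress (see the later sections).
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}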

Job Initialization


  • Step 5a and 5b: When the resource manager receives a call to its submitApplication() method, it hands off the request to the scheduler. The scheduler allocates a container, and the resource manager then launches the application master's process there, under the node manager's management.
  • Step 6: The application master initializes the job by creating a number of bookkeeping objects to keep track of the job's progress, as it will receive progress and completion reports from the tasks.
  • Step 7: It then retrieves the input splits computed in the client from the shared filesystem.


Task Assignment


  • Step 8: If the job does not qualify to run as an uber task (a small job whose tasks the application master runs in its own JVM), the application master requests containers for all the map and reduce tasks in the job from the resource manager. A small configuration sketch for enabling uber execution follows the note below.
Note: Each request includes information about the map task's data locality, in particular the hosts and corresponding racks that its input split resides on. The scheduler uses this information to make scheduling decisions. How? It attempts to place tasks on data-local nodes (the ideal case), but if this is not possible, it prefers rack-local placement over non-local placement.
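
Whether a job qualifies as an uber task is controlled by a handful of client-side properties. The snippet below is a hedged sketch of opting in; the thresholds shown match the usual Hadoop 2.x defaults and are quoted only for illustration.

import org.apache.hadoop.conf.Configuration;

public class UberTaskSettings {
  public static Configuration withUberEnabled() {
    Configuration conf = new Configuration();

    // Let small jobs run "uberized", i.e. in the application master's own JVM,
    // instead of requesting separate task containers from the resource manager (step 8).
    conf.setBoolean("mapreduce.job.ubertask.enable", true);

    // A job only qualifies if it stays under thresholds like these.
    conf.setInt("mapreduce.job.ubertask.maxmaps", 9);
    conf.setInt("mapreduce.job.ubertask.maxreduces", 1);

    return conf;
  }
}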

Task Execution


  • Step 9a and 9b: After a container has been assigned to the task by the resource manager's scheduler, the application master starts the container by contacting the node manager.
  • Step 10: The task is executed by a Java application whose main class is YarnChild. Before running the task, it localizes the resources that the task needs, including the job configuration, the job JAR file, and any files from the distributed cache (a short distributed-cache sketch follows the note below).
  • Step 11: Finally, it runs the map or reduce task.
Note: Unlike MapReduce 1, YARN does not support JVM reuse, so each task runs in a new JVM. Streaming and Pipes programs work in the same way as in MapReduce 1: YarnChild launches the Streaming or Pipes process and communicates with it using standard input/output or a socket, respectively.
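
To make the distributed-cache localization in step 10 concrete, here is a hedged sketch using the standard Hadoop 2 Job API: the driver registers a file with job.addCacheFile(), YarnChild localizes it before the task starts, and the task reads it back through context.getCacheFiles(). The HDFS path and the trivial map logic are placeholders, not code from this post.

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

  // In the driver, before submission (the URI is a placeholder path):
  //   job.addCacheFile(new URI("/shared/lookup.txt"));
  // YarnChild copies the file into the task's working directory during
  // resource localization, before the map or reduce task runs.

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    URI[] cacheFiles = context.getCacheFiles();   // files registered by the driver, now localized
    if (cacheFiles != null) {
      for (URI file : cacheFiles) {
        System.err.println("Localized cache file: " + file);
      }
    }
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.write(new Text("line"), value);       // trivial body; the interesting part is setup()
  }
}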

Progress and status updates


  • When running under YARN, the task reports its progress and status back to its application master, which has an aggregate view of the job, every three seconds over the umbilical interface. The client polls the application master every second to receive progress updates, which are usually displayed to the user. A small client-side polling sketch follows.
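
On the client side these updates surface through the Job API. The sketch below polls a submitted job by hand (waitForCompletion(true) does equivalent printing for you); the one-second sleep is illustrative, and the client's actual polling of the application master is governed by the mapreduce.client.progressmonitor.pollinterval property.

import org.apache.hadoop.mapreduce.Job;

public class ProgressPoller {
  // Poll a submitted job until it finishes, printing map and reduce progress.
  public static void poll(Job job) throws Exception {
    while (!job.isComplete()) {
      System.out.printf("map %.0f%% reduce %.0f%%%n",
          job.mapProgress() * 100, job.reduceProgress() * 100);
      Thread.sleep(1000);
    }
    System.out.println("Job " + (job.isSuccessful() ? "succeeded" : "failed"));
  }
}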


Job Completion


  • Every five seconds, the client waiting in the waitForCompletion() method on Job checks whether the job has completed. The polling interval can be set via the mapreduce.client.completion.pollinterval configuration property. On job completion, the application master and the task containers clean up their working state, and the OutputCommitter's job cleanup method is called. Job information is archived by the job history server to enable later interrogation by users if desired. A small sketch of tuning the poll interval follows.
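
The completion-poll interval is an ordinary client-side property, so it can be tuned in the driver before the job is created. The one-second value below is an arbitrary example for short demo jobs, not a recommendation.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CompletionPollSettings {
  public static Job newJob() throws Exception {
    Configuration conf = new Configuration();
    // waitForCompletion() normally checks for completion every five seconds;
    // poll more often for short-running jobs.
    conf.setInt("mapreduce.client.completion.pollinterval", 1000);
    return Job.getInstance(conf, "completion-poll-demo");
  }
}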