
YARN Timeline server in Hadoop 3.x
The job history server in MapReduce provides details about all current and historical MapReduce jobs. However, it can only capture information about MapReduce jobs; it cannot capture YARN-level events and metrics. Because YARN can run applications other than MapReduce, there was a need for a YARN-specific service that captures information about all applications.
The YARN Timeline server is responsible for storing and retrieving current as well as historical information about applications. The metrics and information it collects are generic in nature and share a common structure, which helps with debugging and with capturing other metrics for any specific use. The Timeline server captures two types of information, which are as follows:
- Application information: A user submits an application to a queue, and each application can have multiple application attempts. Each application attempt can launch multiple containers to complete the job. The Timeline server captures and provides detailed information and logs for each step in the application life cycle. It also provides a web interface to view this information.
- Framework information: YARN is capable of launching different types of applications, such as MapReduce, Spark, Tez, and so on. A MapReduce job may report information such as the number of map and reduce tasks. A Spark job may report information such as the number of executors and cores. This information varies with the framework from which the YARN job is submitted. The Timeline server provides a web interface and a REST API to access this information.
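The two kinds of information map naturally onto two families of REST resources. The following is a minimal sketch, not a definitive client: it only builds URLs, and it assumes the Timeline server (v1) web endpoint at the commonly used default port 8188 on `localhost`. The helper names are hypothetical, and the entity type `TEZ_DAG_ID` is illustrative, since entity type names are defined by each framework.

```python
from urllib.parse import quote

# Assumption: Timeline server web address; adjust host and port for your cluster.
DEFAULT_BASE = "http://localhost:8188"


def app_url(app_id, base=DEFAULT_BASE):
    """URL for generic information about one application."""
    return f"{base}/ws/v1/applicationhistory/apps/{quote(app_id)}"


def attempts_url(app_id, base=DEFAULT_BASE):
    """URL listing the attempts of one application."""
    return f"{app_url(app_id, base)}/appattempts"


def containers_url(app_id, attempt_id, base=DEFAULT_BASE):
    """URL listing the containers launched by one application attempt."""
    return f"{attempts_url(app_id, base)}/{quote(attempt_id)}/containers"


def framework_url(entity_type, entity_id=None, base=DEFAULT_BASE):
    """URL for framework-specific entities published under an entity type."""
    url = f"{base}/ws/v1/timeline/{quote(entity_type)}"
    if entity_id is not None:
        url += f"/{quote(entity_id)}"
    return url


# Hypothetical identifiers, used only to show the URL shapes.
print(containers_url("application_1_0001", "appattempt_1_0001_000001"))
print(framework_url("TEZ_DAG_ID"))
```

Each URL can then be fetched with any HTTP client (for example `urllib.request.urlopen`) and the JSON response parsed; the exact response fields depend on the Hadoop version and the framework that published the data.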
In Hadoop version 3, the YARN Timeline server architecture has changed significantly. The new architecture addresses two major challenges of previous versions, which are as follows:
- Scalability and reliability: In previous versions, the writer and reader were limited to a single instance, so it was difficult to handle a big cluster because processing capacity was limited. The Timeline server in Hadoop version 3 uses distributed writers and scalable storage. The readers and writers are loosely coupled, and the reader instances are responsible for serving read requests received via the REST API. The primary storage for the current version of the Timeline server is HBase because of its ability to deliver fast responses for read and write requests.
- Flows and aggregation: The application life cycle in YARN consists of various steps, and YARN may launch a set of applications to complete one logical application; such a group is treated as a flow. Because a single logical application can consist of many sub-applications, we need aggregated metrics for it. YARN aggregates the metrics from all sub-applications and their attempts and makes them available as an aggregated report for the application.
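The aggregation idea can be illustrated with a small sketch. This is not how the Timeline server implements it internally; it only shows the roll-up from attempts to sub-applications to one report. The data layout, the metric names (`cpu_millis`, `memory_mb_seconds`), and the choice of summation as the aggregation operation are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_flow_metrics(applications):
    """Sum each metric over every attempt of every sub-application."""
    totals = defaultdict(int)
    for app in applications:
        for attempt in app["attempts"]:
            for name, value in attempt["metrics"].items():
                totals[name] += value
    return dict(totals)


# Hypothetical flow made of two sub-applications; the first needed two attempts.
flow_run = [
    {"app_id": "application_1_0001", "attempts": [
        {"metrics": {"cpu_millis": 1200, "memory_mb_seconds": 4096}},
        {"metrics": {"cpu_millis": 300, "memory_mb_seconds": 1024}},
    ]},
    {"app_id": "application_1_0002", "attempts": [
        {"metrics": {"cpu_millis": 500, "memory_mb_seconds": 2048}},
    ]},
]

report = aggregate_flow_metrics(flow_run)
print(report)  # {'cpu_millis': 2000, 'memory_mb_seconds': 7168}
```

The point of the sketch is that a consumer of the report sees one set of totals for the logical application, regardless of how many sub-applications or retries were needed to produce it.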