Dynamic Priority Job Scheduling on a Hadoop YARN Platform

Nana Du, Yudong Ji, Aiqin Hou, Chase Wu, Weike Nie

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In Hadoop's big data processing systems, YARN is responsible for resource management and job scheduling. The built-in job scheduling algorithms in YARN are simple to execute, but suffer from limitations such as job starvation, excessive server load, and load imbalance. In this paper, we propose a new Hybrid Dynamic Priority job Scheduling algorithm (HDPS) to address these limitations. HDPS dynamically raises the priority of a job as its waiting time increases to prevent job starvation. It also features a task assignment strategy designed to improve data locality: it considers both the available resources of servers and the distribution of data blocks stored on them, which reduces data transfer time and improves job execution efficiency. We implement HDPS, integrate it into YARN, and conduct experiments in a real Hadoop system using Hadoop's built-in benchmark test cases. Experimental results show that HDPS outperforms existing algorithms overall in terms of execution efficiency and load balance.
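
The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's HDPS formulation, which the abstract does not give: the linear aging rule, the `aging_rate` parameter, and the `Job` and `Node` types are all hypothetical, chosen only to show how a waiting-time bonus can prevent starvation and how block placement and free capacity might jointly drive task assignment.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    base_priority: float  # static priority assigned at submission (assumed)
    submit_time: float    # submission time in seconds (assumed)


@dataclass
class Node:
    name: str
    free_containers: int                      # available capacity (assumed unit)
    blocks: set = field(default_factory=set)  # IDs of data blocks stored locally


def dynamic_priority(job, now, aging_rate=0.1):
    """Base priority plus a bonus that grows linearly with waiting time.

    The linear form is an illustrative assumption, not the paper's rule.
    """
    return job.base_priority + aging_rate * (now - job.submit_time)


def pick_job(jobs, now, aging_rate=0.1):
    """Select the waiting job with the highest current dynamic priority."""
    return max(jobs, key=lambda j: dynamic_priority(j, now, aging_rate))


def pick_node(nodes, block_id):
    """Prefer a node that stores the block; break ties by free capacity.

    Returns None when no node has a free container.
    """
    candidates = [n for n in nodes if n.free_containers > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (block_id in n.blocks, n.free_containers))


old = Job("old", base_priority=1.0, submit_time=0.0)
new = Job("new", base_priority=5.0, submit_time=90.0)
chosen = pick_job([old, new], now=100.0)

a = Node("a", 4, {"b2"})
b = Node("b", 1, {"b1"})
c = Node("c", 0, {"b1"})
target = pick_node([a, b, c], "b1")
```

In this example the long-waiting low-priority job overtakes the newer high-priority one, and the task is sent to the node that both holds the block and has a free container. When no node with capacity holds the block, the sketch falls back to the node with the most free containers.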

Original language: English (US)
Title of host publication: Proceedings - 2023 IEEE 29th International Conference on Parallel and Distributed Systems, ICPADS 2023
Publisher: IEEE Computer Society
Pages: 412-419
Number of pages: 8
ISBN (Electronic): 9798350330717
State: Published - 2023
Event: 29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023 - Ocean Flower Island, Hainan, China
Duration: Dec 17 2023 to Dec 21 2023

Publication series

Name: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
ISSN (Print): 1521-9097

Conference

Conference: 29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023
Country/Territory: China
City: Ocean Flower Island, Hainan
Period: 12/17/23 to 12/21/23

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture

Keywords

  • Data Locality
  • Hadoop
  • Job Scheduling
  • MapReduce
  • YARN
