Energy-efficient dynamic scheduling of deadline-constrained MapReduce workflows

Tong Shu, Chase Q. Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Big data workflows comprised of moldable parallel MapReduce programs running on a large number of processors have become a main consumer of energy at data centers. The degree of parallelism of each moldable job in such workflows has a significant impact on the energy efficiency of parallel computing systems, which remains largely unexplored. In this paper, we validate with experimental results the moldable parallel computing model where the dynamic energy consumption of a moldable job increases with the number of parallel tasks. Based on our validation, we construct rigorous cost models and formulate a dynamic scheduling problem of deadline-constrained MapReduce workflows to minimize energy consumption in Hadoop systems. We propose a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and also design the corresponding system modules for algorithm implementation in Hadoop architecture. The performance superiority of the proposed algorithm in terms of dynamic energy saving and deadline violation is illustrated by extensive simulation results in Hadoop/YARN in comparison with existing algorithms, and the core module of adaptive task partitioning is further validated through real-life workflow implementation and experimental results using the Oozie workflow engine in Hadoop/YARN systems.
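The trade-off described in the abstract can be illustrated with a minimal sketch. It is a toy model, not the paper's cost model or scheduling algorithm: it assumes a job's execution time is `work / k` plus a fixed per-task overhead, and that each of the `k` tasks draws constant power for the job's duration. The function names and every parameter (`work`, `overhead`, `power`, `max_tasks`) are hypothetical. Under these assumptions dynamic energy grows with `k`, so the smallest degree of parallelism that still meets the deadline is the most energy-efficient choice.

```python
def job_time(work, k, overhead):
    """Execution time of a moldable job split into k parallel tasks (toy model)."""
    return work / k + overhead


def dynamic_energy(work, k, overhead, power):
    """Dynamic energy when each of the k tasks draws `power` for the job's duration.

    Simplifies to power * (work + k * overhead), which increases with k.
    """
    return k * power * job_time(work, k, overhead)


def pick_parallelism(work, deadline, max_tasks, overhead=1.0, power=1.0):
    """Return the smallest k in [1, max_tasks] that meets the deadline, or None.

    Because energy increases with k in this toy model, the smallest feasible k
    also minimizes dynamic energy. `power` is unused in the choice itself and is
    kept only so the signature mirrors dynamic_energy.
    """
    for k in range(1, max_tasks + 1):
        if job_time(work, k, overhead) <= deadline:
            return k
    return None
```

For example, with `work=100`, `overhead=5`, and a deadline of 30 time units, four tasks are the fewest that finish in time, and adding a fifth task only raises the modeled energy.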

Original language: English (US)
Title of host publication: Proceedings - 13th IEEE International Conference on eScience, eScience 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 10
ISBN (Electronic): 9781538626863
State: Published - Nov 14 2017
Event: 13th IEEE International Conference on eScience, eScience 2017 - Auckland, New Zealand
Duration: Oct 24 2017 - Oct 27 2017

Publication series

Name: Proceedings - 13th IEEE International Conference on eScience, eScience 2017


Other: 13th IEEE International Conference on eScience, eScience 2017
Country/Territory: New Zealand

All Science Journal Classification (ASJC) codes

  • Agricultural and Biological Sciences (miscellaneous)
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Computer Networks and Communications
  • Computer Science Applications
  • Computers in Earth Sciences
  • Social Sciences (miscellaneous)


Keywords

  • Big data
  • MapReduce
  • job scheduling
  • scientific workflow

