Energy-efficient mapping of big data workflows under deadline constraints

Tong Shu, Chase Q. Wu

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations


Large-scale workflows for big data analytics have become a main consumer of energy in data centers, where moldable parallel computing models such as MapReduce are widely applied to meet high computational demands with time-varying computing resources. The granularity of task partitioning in each moldable job of such big data workflows has a significant impact on energy efficiency, which remains largely unexplored. In this paper, we analyze the properties of moldable jobs and formulate a workflow mapping problem to minimize the dynamic energy consumption of a given workflow request under a deadline constraint. Since this problem is strongly NP-hard, we design a fully polynomial-time approximation scheme (FPTAS) for a special case with a pipeline-structured workflow on a homogeneous cluster and a heuristic for the generalized problem with an arbitrary workflow on a heterogeneous cluster. The performance superiority of the proposed solution in terms of dynamic energy saving and deadline missing rate is illustrated by extensive simulation results in Hadoop/YARN in comparison with existing algorithms.

Original language: English (US)
Pages (from-to): 34-43
Number of pages: 10
Journal: CEUR Workshop Proceedings
State: Published - 2016
Event: 11th Workshop on Workflows in Support of Large-Scale Science, WORKS 2016 - Salt Lake City, United States
Duration: Nov 14 2016 → …

All Science Journal Classification (ASJC) codes

  • General Computer Science


Keywords

  • Big data
  • Green computing
  • Workflow mapping

