Performance optimization of budget-constrained MapReduce workflows in multi-clouds

Huiyan Cao, Chase Q. Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

With the rapid deployment of cloud infrastructures around the globe and the economic benefit of cloud-based computing and storage services, an increasing number of scientific workflows have been shifted or are in active transition to clouds. As the scale of scientific applications continues to grow, it is now common to deploy data- and network-intensive computing workflows across multi-clouds, where inter-cloud data transfer has a significant impact on both workflow performance and financial cost. We construct rigorous mathematical models to analyze intra- and inter-cloud execution dynamics of scientific workflows and formulate a budget-constrained workflow mapping problem to optimize the network performance of MapReduce-based scientific workflows in Hadoop systems in multi-cloud environments. We show this problem to be NP-complete and design a heuristic solution that takes into consideration module execution, data transfer, and I/O operations. The performance superiority of the proposed mapping solution over existing methods is illustrated through extensive simulations and further verified by real-life workflow experiments deployed in public clouds. We observe a discrepancy of about 15% between our theoretical estimates and real-world experimental measurements, which validates the correctness of our cost models and also ensures accurate workflow mapping in real systems.
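To make the idea of a budget-constrained mapping heuristic concrete, here is a minimal Python sketch. It is not the heuristic from the paper, whose details the abstract does not give. It assumes a much simpler setting: a linear pipeline of modules, each with a few hypothetical (execution time, cost) options, where data transfer and I/O are assumed to be already folded into those numbers. The function name `greedy_map` and all sample values are illustrative assumptions. The sketch starts from the cheapest mapping and repeatedly applies the upgrade with the best time saved per extra cost that still fits the budget.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each module, a list of (exec_time, cost) options,
# one per candidate VM type. None of these numbers come from the paper.
Options = Dict[str, List[Tuple[float, float]]]


def greedy_map(options: Options, budget: float) -> Tuple[Dict[str, int], float, float]:
    """Choose one option per module of a linear pipeline without exceeding budget.

    Returns (chosen option index per module, total time, total cost).
    """
    # Start from the cheapest option for every module.
    choice = {m: min(range(len(opts)), key=lambda i: opts[i][1])
              for m, opts in options.items()}
    total_cost = sum(options[m][i][1] for m, i in choice.items())
    if total_cost > budget:
        raise ValueError("budget too small for even the cheapest mapping")

    while True:
        best = None  # (time saved per unit of extra cost, module, option index)
        for m, opts in options.items():
            cur_t, cur_c = opts[choice[m]]
            for i, (t, c) in enumerate(opts):
                saved, extra = cur_t - t, c - cur_c
                # Skip options that are not faster or that break the budget.
                if saved <= 0 or total_cost + extra > budget:
                    continue
                # A faster option at no extra cost is always worth taking.
                ratio = saved / extra if extra > 0 else float("inf")
                if best is None or ratio > best[0]:
                    best = (ratio, m, i)
        if best is None:
            break  # no affordable upgrade is left
        _, m, i = best
        total_cost += options[m][i][1] - options[m][choice[m]][1]
        choice[m] = i

    # In a linear pipeline the end-to-end time is the sum of module times.
    total_time = sum(options[m][i][0] for m, i in choice.items())
    return choice, total_time, total_cost


if __name__ == "__main__":
    demo = {
        "map": [(10.0, 1.0), (6.0, 3.0), (4.0, 6.0)],
        "reduce": [(8.0, 1.0), (5.0, 2.0)],
    }
    print(greedy_map(demo, budget=6.0))
```

A greedy rule like this is only a heuristic: since the problem is NP-complete, it can miss the optimal mapping, and a realistic version would also have to model general workflow graphs and inter-cloud transfer costs explicitly.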

Original language: English (US)
Title of host publication: Proceedings - 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 243-252
Number of pages: 10
ISBN (Electronic): 9781538658154
State: Published - Jul 13 2018
Event: 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2018 - Washington, United States
Duration: May 1 2018 to May 4 2018

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture

Keywords

  • Cloud computing
  • MapReduce
  • Performance optimization
  • Scientific workflows
  • Workflow mapping
