Optimizing end-to-end performance of data-intensive computing pipelines in heterogeneous network environments

Chase Wu, Yi Gu

Research output: Contribution to journal (article, peer-reviewed)

11 Scopus citations


Supporting high-performance data-intensive computing pipelines in wide-area networks is crucial for enabling large-scale distributed scientific applications, which require minimizing end-to-end delay for single-input applications or maximizing frame rate for streaming applications. We formulate the data-intensive computing pipeline mapping problems and categorize them into six classes, defined by two optimization objectives (minimum end-to-end delay and maximum frame rate) and three network constraints (no node reuse, contiguous node reuse, and arbitrary node reuse). We design an optimal dynamic programming-based solution to the problem of minimum end-to-end delay with arbitrary node reuse and prove that the remaining five problems are NP-complete. For each of these five problems, we propose a heuristic algorithm based on a similar optimization procedure. The heuristics are implemented and tested on a large set of simulated pipelines and networks of various scales, and extensive simulation results show that they outperform existing methods.
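The dynamic programming idea for minimum end-to-end delay with arbitrary node reuse can be illustrated with a minimal sketch. The abstract does not give the cost model, so everything below is an assumption made for illustration: the pipeline is linear, module j costs compute[j] / power[v] time on node v, sending data over a link costs size / bandwidth + latency, consecutive modules must sit on the same node or on directly linked nodes, and the first and last modules are pinned to a source and a destination node. The function name and all parameter names are hypothetical and are not taken from the paper.

```python
import math


def min_delay_mapping(compute, data_out, power, links, src, dst):
    """Sketch of a DP for min end-to-end delay with arbitrary node reuse.

    Assumed (hypothetical) inputs:
      compute[j]  : work of module j, for j = 0 .. n-1
      data_out[j] : size of the data module j sends to module j+1
      power[v]    : processing power of node v
      links       : {(u, v): (bandwidth, latency)} for directed links
      src, dst    : nodes hosting the first and last module
    Returns (delay, mapping), or (inf, None) if no mapping exists.
    """
    n = len(compute)
    nodes = list(power)

    # best[v] = minimum delay to finish modules 0..j with module j on node v
    best = {v: math.inf for v in nodes}
    best[src] = compute[0] / power[src]
    choice = []  # choice[j-1][v] = node hosting module j-1 when j is on v

    for j in range(1, n):
        nxt = {v: math.inf for v in nodes}
        back = {}
        for v in nodes:
            for u in nodes:
                if best[u] == math.inf:
                    continue
                if u == v:
                    transfer = 0.0  # same node: no network transfer
                elif (u, v) in links:
                    bandwidth, latency = links[(u, v)]
                    transfer = data_out[j - 1] / bandwidth + latency
                else:
                    continue  # no direct link from u to v
                cand = best[u] + transfer + compute[j] / power[v]
                if cand < nxt[v]:
                    nxt[v] = cand
                    back[v] = u
        best = nxt
        choice.append(back)

    if best[dst] == math.inf:
        return math.inf, None

    # Walk the back-pointers from the destination to recover the mapping.
    mapping = [dst]
    for back in reversed(choice):
        mapping.append(back[mapping[-1]])
    mapping.reverse()
    return best[dst], mapping


# Toy example: a fast intermediate node "a" makes the detour worthwhile.
power = {"s": 1.0, "a": 10.0, "t": 1.0}
links = {("s", "a"): (2.0, 0.5), ("a", "t"): (2.0, 0.5), ("s", "t"): (1.0, 1.0)}
delay, mapping = min_delay_mapping([1.0, 20.0, 1.0], [2.0, 2.0],
                                   power, links, "s", "t")
print(delay, mapping)  # 7.0 ['s', 'a', 't']
```

Because the state is only the pair (module, node), nothing prevents a node from hosting several modules, contiguous or not, which is what makes this variant tractable in the sketch. With n modules and |V| nodes the loop runs in O(n |V|^2) time. Forbidding or restricting reuse would require remembering which nodes are already used, which is consistent with the abstract's statement that the other five problems are NP-complete.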

Original language: English (US)
Pages (from-to): 254-265
Number of pages: 12
Journal: Journal of Parallel and Distributed Computing
Issue number: 2
State: Published - Feb 1 2011
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence

