Optimizing end-to-end performance of data-intensive computing pipelines in heterogeneous network environments

Qishi Wu, Yi Gu

Research output: Contribution to journal (Article, peer-reviewed)

11 Scopus citations


Supporting high-performance data-intensive computing pipelines in wide-area networks is crucial for enabling large-scale distributed scientific applications, which require minimizing end-to-end delay for single-input applications or maximizing frame rate for streaming applications. We formulate the data-intensive computing pipeline mapping problem and categorize it into six classes defined by two optimization objectives (minimum end-to-end delay and maximum frame rate) and three network constraints (no node reuse, contiguous node reuse, and arbitrary node reuse). We design an optimal dynamic programming-based solution to the problem of minimum end-to-end delay with arbitrary node reuse and prove that the remaining five problems are NP-complete; for each of these, we propose a heuristic algorithm based on a similar optimization procedure. The heuristics are implemented and tested on a large set of simulated pipelines and networks of various scales, and extensive simulation results show that they outperform existing methods.
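The dynamic programming idea for minimum end-to-end delay with arbitrary node reuse can be illustrated with a minimal sketch. The abstract does not give the paper's cost model, so everything below is an assumption made for illustration: a module's computing time is taken as `work[j] / power[v]`, a transfer over a link as `data[j] / bandwidth + latency`, consecutive modules placed on the same node incur no transfer cost, and the first and last modules are pinned to a source and a destination node. The function and parameter names are hypothetical and are not taken from the paper.

```python
from math import inf


def min_end_to_end_delay(work, data, power, links, src, dst):
    """Sketch of a DP for linear pipeline mapping (assumed cost model).

    work[j]   : computing cost of module j (hypothetical unit)
    data[j]   : size of the output module j sends to module j + 1
    power[v]  : processing power of node v
    links     : {(u, v): (bandwidth, latency)} for each directed link u -> v
    src, dst  : nodes that host the first and the last module
    """
    nodes = list(power)

    # best[v] = minimum delay to finish the current module on node v.
    best = {v: inf for v in nodes}
    best[src] = work[0] / power[src]

    for j in range(1, len(work)):
        nxt = {}
        for v in nodes:
            # Option 1: the previous module ran on v too (node reuse, no transfer).
            arrive = best[v]
            # Option 2: the previous module ran on a neighbor u and ships its output.
            for (u, w), (bandwidth, latency) in links.items():
                if w == v:
                    arrive = min(arrive, best[u] + data[j - 1] / bandwidth + latency)
            nxt[v] = arrive + work[j] / power[v]
        best = nxt

    # inf means the destination cannot be reached under this model.
    return best[dst]


if __name__ == "__main__":
    # Made-up three-node network and three-module pipeline.
    power = {"a": 1.0, "b": 4.0, "c": 2.0}
    links = {("a", "b"): (10.0, 0.5), ("b", "c"): (5.0, 0.5), ("a", "c"): (1.0, 1.0)}
    print(min_end_to_end_delay([2.0, 8.0, 4.0], [10.0, 5.0], power, links, "a", "c"))
```

Each stage keeps only the best delay per node, so the sketch runs in time proportional to the number of modules times the number of nodes and links. The no-reuse and contiguous-reuse variants, which the abstract states are NP-complete, are not covered by this recurrence.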

Original language: English (US)
Pages (from-to): 254-265
Number of pages: 12
Journal: Journal of Parallel and Distributed Computing
Issue number: 2
State: Published - Feb 2011
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications


Keywords

  • Data-intensive computing
  • End-to-end delay
  • Frame rate
  • Performance optimization
  • Pipeline

