Abstract
Supporting high-performance data-intensive computing pipelines in wide-area networks is crucial for enabling large-scale distributed scientific applications, which require minimizing end-to-end delay for single-input applications or maximizing frame rate for streaming applications. We formulate the data-intensive computing pipeline mapping problem and categorize it into six classes defined by two optimization objectives, i.e., minimum end-to-end delay and maximum frame rate, and three network constraints, i.e., no node reuse, contiguous node reuse, and arbitrary node reuse. We design an optimal dynamic programming-based solution to the problem of minimum end-to-end delay with arbitrary node reuse and prove the NP-completeness of the remaining five problems, for each of which we propose a heuristic algorithm based on a similar optimization procedure. These heuristics are implemented and tested on a large set of simulated pipelines and networks of various scales, and extensive simulation results show that they outperform existing methods.
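To make the dynamic programming idea for minimum end-to-end delay with arbitrary node reuse concrete, the following is a minimal sketch, not the article's own algorithm. The abstract does not specify a cost model, so everything here is an assumption for illustration: the function name `min_delay_mapping`, the computing time `work[j] / power[v]`, the transfer time `data[j-1] / bandwidth[(u, v)]`, zero transfer cost between modules on the same node, and the pinning of the first and last modules to a source and a destination node.

```python
import math

def min_delay_mapping(work, data, power, bandwidth, src, dst):
    """Map a linear pipeline onto a network to minimize end-to-end delay.

    Hypothetical cost model (assumed, not taken from the article):
      work[j]           computational load of module j
      data[j]           size of the data sent from module j to module j + 1
      power[v]          processing power of node v
      bandwidth[(u, v)] bandwidth of the directed link u -> v
    Module 0 is pinned to src and the last module to dst. Any node may
    host any number of modules (arbitrary node reuse).
    """
    n, m = len(work), len(power)
    # best[j][v]: minimum delay of modules 0..j with module j placed on v
    best = [[math.inf] * m for _ in range(n)]
    prev = [[None] * m for _ in range(n)]
    best[0][src] = work[0] / power[src]

    for j in range(1, n):
        for v in range(m):
            for u in range(m):
                if best[j - 1][u] == math.inf:
                    continue  # module j - 1 cannot be placed on u
                if u == v:
                    transfer = 0.0  # same node: no network transfer
                elif (u, v) in bandwidth:
                    transfer = data[j - 1] / bandwidth[(u, v)]
                else:
                    continue  # no direct link from u to v
                cand = best[j - 1][u] + transfer + work[j] / power[v]
                if cand < best[j][v]:
                    best[j][v] = cand
                    prev[j][v] = u

    if best[n - 1][dst] == math.inf:
        return math.inf, None  # no feasible mapping

    # Backtrack from the destination to recover the node of each module.
    mapping = [dst]
    for j in range(n - 1, 0, -1):
        mapping.append(prev[j][mapping[-1]])
    mapping.reverse()
    return best[n - 1][dst], mapping


# Three modules on three nodes; node 1 is fast and well connected.
delay, mapping = min_delay_mapping(
    work=[1, 8, 1],
    data=[10, 10],
    power=[1, 4, 1],
    bandwidth={(0, 1): 10, (1, 2): 10, (0, 2): 1},
    src=0,
    dst=2,
)
print(delay, mapping)  # 6.0 [0, 1, 2]
```

Under these assumptions the table has one entry per (module, node) pair, and each entry looks only at the previous module's row, which is what lets a node be revisited freely. The no-reuse and contiguous-reuse variants would need to remember which nodes are already used, and that state cannot be captured in a table of this shape.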
Original language | English (US) |
---|---|
Pages (from-to) | 254-265 |
Number of pages | 12 |
Journal | Journal of Parallel and Distributed Computing |
Volume | 71 |
Issue number | 2 |
State | Published - Feb 2011 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Artificial Intelligence
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
Keywords
- Data-intensive computing
- End-to-end delay
- Frame rate
- Performance optimization
- Pipeline