Supporting high-performance data-intensive computing pipelines in wide-area networks is crucial for enabling large-scale distributed scientific applications that require minimizing end-to-end delay for single-input applications or maximizing frame rate for streaming applications. We formulate and categorize the data-intensive computing pipeline mapping problems into six classes with two optimization objectives, i.e. minimum end-to-end delay and maximum frame rate, and three network constraints, i.e. no, contiguous, and arbitrary node reuse. We design a dynamic programming-based optimal solution to the problem of minimum end-to-end delay with arbitrary node reuse and prove the NP-completeness of the rest five problems, for each of which, a heuristic algorithm based on a similar optimization procedure is proposed. These heuristics are implemented and tested on a large set of simulated pipelines and networks of various scales and their performance superiority is illustrated by extensive simulation results in comparison with existing methods.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence