Abstract
With the advent of next-generation scientific applications, the workflow approach, which integrates various computing and networking technologies, has become a viable solution for managing and optimizing large-scale distributed data transfer, processing, and analysis. This paper investigates the problem of mapping distributed scientific workflows for maximum throughput in faulty networks, where nodes and links are subject to probabilistic failures. We formulate this as a bi-objective optimization problem that maximizes both throughput and reliability. By adapting and modifying a centralized fault-free workflow mapping scheme, we propose a new mapping algorithm that achieves high throughput for smooth data flow in a distributed manner while satisfying a pre-specified bound on the overall failure rate for a guaranteed level of reliability. The performance superiority of the proposed solution is demonstrated both by extensive simulation-based comparisons with existing algorithms and by experimental results from a real-life scientific workflow deployed in wide-area networks.
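To make the stated objectives concrete, the sketch below shows one plausible way such a bi-objective mapping problem can be written down; the notation is our own illustration, not the paper's exact model. Here a mapping \(\phi\) assigns workflow modules \(M\) to network nodes, \(t(m, \phi(m))\) is the processing time of module \(m\) on its assigned node, and \(p_v\), \(p_l\) are the failure probabilities of nodes and links.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Illustrative formulation (assumed notation, not the paper's exact model).
% A mapping \phi assigns workflow modules M to nodes; V(\phi) and L(\phi)
% are the nodes and links the mapping actually uses; \varepsilon is the
% pre-specified bound on the overall failure rate.
\begin{align*}
  \text{Throughput:}\quad
    & T(\phi) = \frac{1}{\max_{m \in M} t\bigl(m, \phi(m)\bigr)}
    && \text{(inverse of the bottleneck stage time)} \\
  \text{Reliability:}\quad
    & R(\phi) = \prod_{v \in V(\phi)} (1 - p_v)
                \prod_{l \in L(\phi)} (1 - p_l)
    && \text{(independent probabilistic failures)} \\
  \text{Problem:}\quad
    & \max_{\phi}\; T(\phi)
      \quad \text{subject to} \quad 1 - R(\phi) \le \varepsilon
\end{align*}
\end{document}
```

The two objectives typically conflict: spreading computation over more nodes and links can shorten the bottleneck stage but multiplies exposure to failures. Bounding the overall failure rate by \(\varepsilon\), rather than optimizing both objectives jointly, turns the problem into a single-objective maximization with a guaranteed reliability level, which matches the constrained form described in the abstract.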
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 361-379 |
| Number of pages | 19 |
| Journal | Journal of Grid Computing |
| Volume | 11 |
| Issue number | 3 |
| DOIs | |
| State | Published - Sep 2013 |
All Science Journal Classification (ASJC) codes
- Software
- Information Systems
- Hardware and Architecture
- Computer Networks and Communications
Keywords
- Distributed algorithm
- Fault tolerance
- Throughput
- Workflow mapping