Abstract
Next-generation e-Science features large-scale, compute-intensive workflows of many computing modules that are typically executed in a distributed manner. With the recent emergence of cloud computing and the rapid deployment of cloud infrastructures, an increasing number of scientific workflows have been shifted or are in active transition to cloud environments. As cloud computing makes computing a utility, scientists across different application domains are facing the same challenge of reducing financial cost in addition to meeting the traditional goal of performance optimization. We develop a prototype generic workflow system by leveraging existing technologies for a quick evaluation of scientific workflow optimization strategies. We construct analytical models to quantify the network performance of scientific workflows using cloud-based computing resources, and formulate a task scheduling problem to minimize the workflow end-to-end delay under a user-specified financial constraint. We rigorously prove that the proposed problem is not only NP-complete but also non-approximable. We design a heuristic solution to this problem, and illustrate its performance superiority over existing methods through extensive simulations and real-life workflow experiments based on a proof-of-concept implementation and deployment in a local cloud testbed.
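The scheduling problem named in the abstract (minimize end-to-end delay subject to a financial budget) can be illustrated with a toy instance. The sketch below is not the paper's model or heuristic: the workflow, the VM types, their speeds and prices, and the function names `evaluate` and `best_mapping` are all hypothetical, data-transfer time is ignored, and the search is plain exhaustive enumeration, which is only feasible for tiny inputs given that the problem is NP-complete.

```python
from itertools import product

# Hypothetical toy workflow: task -> (workload units, predecessor tasks).
# Tasks are listed in topological order.
WORKFLOW = {
    "a": (4.0, []),
    "b": (8.0, ["a"]),
    "c": (6.0, ["a"]),
    "d": (2.0, ["b", "c"]),
}

# Hypothetical VM types: name -> (processing speed, price per time unit).
VM_TYPES = {
    "small": (1.0, 1.0),
    "large": (2.0, 3.0),
}


def evaluate(mapping):
    """Return (end_to_end_delay, total_cost) for a task -> VM type mapping."""
    finish = {}
    cost = 0.0
    for task, (work, preds) in WORKFLOW.items():
        speed, price = VM_TYPES[mapping[task]]
        runtime = work / speed
        # A task starts once all of its predecessors have finished.
        start = max((finish[p] for p in preds), default=0.0)
        finish[task] = start + runtime
        cost += runtime * price
    return max(finish.values()), cost


def best_mapping(budget):
    """Enumerate every mapping; return the fastest one within budget, or None."""
    tasks = list(WORKFLOW)
    best = None
    for choice in product(VM_TYPES, repeat=len(tasks)):
        mapping = dict(zip(tasks, choice))
        delay, cost = evaluate(mapping)
        if cost <= budget and (best is None or delay < best[0]):
            best = (delay, cost, mapping)
    return best
```

In this toy instance, running every task on the cheaper VM type costs 20 and takes 14 time units, while raising the budget to 30 allows every task to run on the faster type and halves the delay. The number of candidate mappings grows exponentially with the number of tasks, which is why a heuristic is needed for realistic workflows.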
Original language | English (US) |
---|---|
Article number | 6898826 |
Pages (from-to) | 169-181 |
Number of pages | 13 |
Journal | IEEE Transactions on Cloud Computing |
Volume | 3 |
Issue number | 2 |
State | Published - Apr 1 2015 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Information Systems
- Hardware and Architecture
- Computer Science Applications
- Computer Networks and Communications
Keywords
- Scientific workflows
- cloud computing
- workflow scheduling