Generating sound workflow views for correct provenance analysis

Ziyang Liu, Susan B. Davidson, Yi Chen

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Workflow views abstract groups of tasks in a workflow into high-level composite tasks in order to reuse subworkflows and facilitate provenance analysis. However, unless a view is carefully designed, it may not preserve the dataflow between tasks in the workflow; that is, it may not be sound. Unsound views can be misleading and cause incorrect provenance analysis. This article studies the problem of efficiently identifying and correcting unsound workflow views with minimal changes, and of constructing minimal sound and elucidative workflow views from a set of user-specified relevant tasks. In particular, two related problems are investigated. First, given a workflow view, we wish to split each unsound composite task into the minimal number of tasks such that the resulting view is sound. Second, given a workflow and a set of user-specified relevant tasks, we generate a sound view such that each composite task contains at most one relevant task and the total number of tasks is minimized. We prove that both problems are NP-hard by reduction from independent set. We then propose two local optimality conditions (weak and strong) for each problem, and design polynomial-time algorithms for both problems that meet these conditions. Experiments show that our proposed algorithms are reasonably effective and efficient. The proposed techniques are useful for view analysis and construction not only for workflows, but for general networks as well.
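To make the notion of an unsound view concrete, the following is a minimal sketch, not the article's algorithm. It assumes one possible formalization that the abstract itself does not state: a composite task is treated as sound when every member task that receives data from outside the composite can reach, through member tasks only, every member task that sends data outside it. The function names `reachable` and `unsound_composites`, and the edge-list and dictionary encodings, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def reachable(edges, start, allowed):
    """Return the tasks reachable from `start` using only tasks in `allowed`."""
    adj = defaultdict(list)
    for u, v in edges:
        if u in allowed and v in allowed:
            adj[u].append(v)
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def unsound_composites(edges, view):
    """Return the sorted names of composite tasks that fail the assumed check.

    edges: list of (producer, consumer) task pairs in the workflow.
    view:  dict mapping a composite task name to its set of member tasks.
    """
    bad = []
    for name, members in view.items():
        # Members fed from outside the composite (assumed "entry" tasks).
        entries = {v for u, v in edges if v in members and u not in members}
        # Members feeding tasks outside the composite (assumed "exit" tasks).
        exits = {u for u, v in edges if u in members and v not in members}
        # Unsound under the assumption if some entry misses some exit.
        if any(not exits <= reachable(edges, e, members) for e in entries):
            bad.append(name)
    return sorted(bad)


# Two independent chains, a -> b and c -> d. Grouping b with c makes the
# view read A -> BC -> D, suggesting that d depends on a, which is false.
edges = [("a", "b"), ("c", "d")]
view = {"A": {"a"}, "BC": {"b", "c"}, "D": {"d"}}
print(unsound_composites(edges, view))  # ['BC']
```

Splitting `BC` into the two singleton composites `{"b"}` and `{"c"}` yields a view that passes this check, which mirrors in miniature the first problem described in the abstract.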

Original language: English (US)
Article number: 6
Journal: ACM Transactions on Database Systems
Issue number: 1
State: Published - Mar 2011
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Information Systems


  • Network
  • Provenance
  • Soundness
  • View
  • Workflow

