Identifying consensus of trees through alignment

Jason T.L. Wang, Kaizhong Zhang

Research output: Contribution to journalArticlepeer-review

11 Scopus citations

Abstract

In this paper we present a dynamic programming algorithm for identifying the similar consensus (SC) (or the largest approximately common substructures) of two ordered labeled trees based on the alignment distance. We consider a substructure of a tree T to be a connected subgraph of T. Given two trees T1, T2 and an integer d, the SC problem is to find a substructure U1 of T1 and a substructure U2 of T2 such that U1 is within distance d of U2 and where there does not exist any other substructure V1 of T1 and V2 of T2 such that V1 and V2 satisfy the distance constraint and the sum of the sizes of V1 and V2 is greater than the sum of the sizes of U1 and U2. The proposed algorithm solves the SC problem in time O(d2 × |T1| × |T2| × (G1 + G2)2), where |Ti|, i = 1, 2, is the size of tree Ti and Gi is the maximum degree of Ti. This is as fast as the best known algorithm for aligning two ordered labeled trees when the distance allowed in the common substructures is a constant independent of the input trees. To demonstrate the utility of our algorithm, we discuss its application to characterizing the conformation space of RNA sequences pertaining to the poliovirus, human rhinovirus and coxsackievirus, where the RNA structures are represented by ordered labeled trees.

Original languageEnglish (US)
Pages (from-to)165-189
Number of pages25
JournalInformation sciences
Volume126
Issue number1
DOIs
StatePublished - Jul 2000

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

Fingerprint Dive into the research topics of 'Identifying consensus of trees through alignment'. Together they form a unique fingerprint.

Cite this