Scientific data mining: A case study

Chia Yo Chang, Jason T.L. Wang, Roger K. Chang

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Scientific data mining is the activity of finding significant information in scientific data. This paper presents an example of scientific data mining: the discovery of approximately common patterns in RNA secondary structures. We represent an RNA secondary structure by an ordered labeled tree based on a previously proposed scheme. The patterns in the trees are substructures that can differ in both substitutions and deletions/insertions of nodes of the trees. Our techniques incorporate approximate tree matching algorithms and novel heuristics for discovery and optimization. Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show the good performance of the algorithms. It is shown that the optimization heuristics speed up the discovery algorithm by a factor of 10. Moreover, our optimized approach is 100,000 times faster than the brute force method.
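
The notion of trees that "differ in both substitutions and deletions/insertions of nodes" can be illustrated with a small edit-distance computation on ordered labeled trees. The sketch below is not the authors' algorithm or their heuristics; it is the textbook forest edit-distance recurrence with unit costs, memoized for brevity and suitable only for small trees. The helper names (`tree`, `forest_distance`, `tree_distance`) and the single-letter node labels are hypothetical and are not taken from the paper's representation scheme.

```python
from functools import lru_cache


def tree(label, *children):
    """Build an ordered labeled tree as a hashable (label, children) tuple."""
    return (label, tuple(children))


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        # Only insertions remain: insert the rightmost root of g,
        # leaving its children in its place.
        _, g_kids = g[-1]
        return forest_distance(f, g[:-1] + g_kids) + 1
    if not g:
        # Only deletions remain: delete the rightmost root of f.
        _, f_kids = f[-1]
        return forest_distance(f[:-1] + f_kids, g) + 1

    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children move up.
    delete = forest_distance(f[:-1] + f_kids, g) + 1
    # Insert the rightmost root of g; its children move up.
    insert = forest_distance(f, g[:-1] + g_kids) + 1
    # Map the two rightmost roots onto each other (substitution costs 1
    # when labels differ), then match their children and the remaining
    # forests to the left independently.
    substitute = (
        forest_distance(f_kids, g_kids)
        + forest_distance(f[:-1], g[:-1])
        + (f_label != g_label)
    )
    return min(delete, insert, substitute)


def tree_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


# Illustrative trees with made-up labels.
a = tree("N", tree("R", tree("H")))
b = tree("N", tree("R", tree("I", tree("R", tree("H")))))
print(tree_distance(a, b))
```

Under this definition, two substructures would count as approximately matching when their distance falls within a chosen threshold; the threshold and the search over candidate substructures are outside the scope of this sketch.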

Original language: English (US)
Pages (from-to): 77-96
Number of pages: 20
Journal: International Journal of Software Engineering and Knowledge Engineering
Volume: 8
Issue number: 1
State: Published - Mar 1998

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence

Keywords

  • Ordered labeled trees
  • Pattern matching
  • Query optimization heuristics
  • RNA secondary structures
  • Scientific databases
