Efficient motif discovery for large-scale time series in healthcare

Bo Liu, Jianqiang Li, Cheng Chen, Wei Tan, Qiang Chen, Mengchu Zhou

Research output: Contribution to journal › Article › peer-review

58 Scopus citations


Analyzing time series data can reveal the temporal behavior of the underlying mechanism producing the data. Time series motifs, which are similar subsequences or frequently occurring patterns, are significant for researchers, especially in the medical domain. With the rapid growth of time series data, traditional motif discovery methods are inefficient and not applicable to large-scale data. This work proposes an efficient Motif Discovery method for Large-scale time series (MDLats). By computing standard motifs, MDLats eliminates most of the redundant computation in related work and reuses existing information as much as possible. All motif types and subsequences are generated for subsequent analysis and classification. The system is implemented on a Hadoop platform and deployed in a hospital for clinical electrocardiography classification. Experiments on real-world healthcare data show that MDLats outperforms state-of-the-art methods even on large time series.

Original language: English (US)
Article number: 7056438
Pages (from-to): 583-590
Number of pages: 8
Journal: IEEE Transactions on Industrial Informatics
Issue number: 3
State: Published - Jun 1 2015

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Information Systems
  • Computer Science Applications
  • Electrical and Electronic Engineering

Keywords

  • Data mining
  • Motif
  • Pattern discovery
  • Time series

