Using multi-scale histograms to answer pattern existence and shape match queries over time series data

Lei Chen, M. Tamer Ozsu, Vincent Oria

Research output: Contribution to journal › Conference article › peer-review

8 Scopus citations

Abstract

Similarity-based querying of time series data can be categorized as pattern existence queries and shape match queries. Pattern existence queries find the time series data with certain patterns while shape match queries look for the time series data that have similar movement shapes. Existing proposals address one of these or the other. In this paper, we propose multi-scale time series histograms that can be used to answer both types of queries, thus offering users more flexibility. Multiple histogram levels allow querying at various precision levels. Most importantly, the distances of time series histograms at lower scale are lower bounds of the distances at higher scale, which guarantees that no false dismissals will be introduced when a multi-step filtering process is used in answering shape match queries. We further propose to use averages of time series histograms to reduce the dimensionality and avoid computing the distances of full time series histograms. The experimental results show that multi-scale histograms can effectively find the patterns in time series data and answer shape match queries, even when the data contain noise, time shifting and scaling, or amplitude shifting and scaling.
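The lower-bounding property and the multi-step filter described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: equal-width bins over min-max normalised values, coarser levels built by merging adjacent bin pairs, and a plain L1 histogram distance. The paper's actual histogram construction and distance may differ. All function names here (`histogram`, `coarsen`, `multiscale`, `shape_match`) are hypothetical.

```python
def histogram(series, bins):
    """Normalised equal-width histogram of a min-max normalised series.

    Assumption: min-max normalisation is used here so that amplitude
    shifting and scaling do not change the histogram.
    """
    s_min, s_max = min(series), max(series)
    span = (s_max - s_min) or 1.0  # avoid division by zero for flat series
    counts = [0.0] * bins
    for v in series:
        x = (v - s_min) / span               # value mapped into [0, 1]
        idx = min(int(x * bins), bins - 1)   # the maximum goes in the last bin
        counts[idx] += 1.0
    return [c / len(series) for c in counts]


def coarsen(hist):
    """Merge adjacent bin pairs, halving the number of bins."""
    return [hist[i] + hist[i + 1] for i in range(0, len(hist), 2)]


def l1(a, b):
    """L1 distance between two histograms with the same number of bins."""
    return sum(abs(x - y) for x, y in zip(a, b))


def multiscale(series, levels):
    """Histograms ordered from coarsest (2 bins) to finest (2**levels bins)."""
    pyramid = [histogram(series, 2 ** levels)]
    while len(pyramid[-1]) > 2:
        pyramid.append(coarsen(pyramid[-1]))
    return pyramid[::-1]


def shape_match(query, candidates, eps, levels=4):
    """Indices of candidates whose finest-level distance to query is <= eps.

    Levels are checked coarse to fine. Because
    |a1 + a2 - b1 - b2| <= |a1 - b1| + |a2 - b2|, the distance at a coarse
    level never exceeds the distance at a finer level, so pruning at a
    coarse level cannot discard a true match (no false dismissals).
    """
    q = multiscale(query, levels)
    matches = []
    for idx, cand in enumerate(candidates):
        c = multiscale(cand, levels)
        # all() stops at the first level whose distance exceeds eps.
        if all(l1(qh, ch) <= eps for qh, ch in zip(q, c)):
            matches.append(idx)
    return matches
```

Under these assumptions, `shape_match` returns the same set as comparing only the finest histograms, but most non-matching candidates are rejected after the cheap two-bin comparison.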

Original language: English (US)
Pages (from-to): 217-226
Number of pages: 10
Journal: Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM
State: Published - Jan 1 2005
Event: 17th International Conference Scientific and Statistical Database Management, SSDBM 2005 - Santa Barbara, CA, United States
Duration: Jun 27 2005 → Jun 29 2005

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
