SS-ClusterTree: A subspace clustering based indexing algorithm over high-dimensional image features

Hongli Xu, Dantong Yu, De Xu, Aidong Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

6 Scopus citations

Abstract

The rapid growth in the volume of image and video data collections motivates the research of building an index structure in image information retrieval. Constructing an index in the image database poses a very challenging problem due to the facts of image databases containing data with high dimensions and lack of domain knowledge. ClusterTree is an indexing approach representing clusters generated by any existing clustering approach and do not need any prior knowledge. It is a hierarchy of clusters and subcluster which incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. However, one disadvantage of ClusterTree is that non-clustering data points are often ignored. These nonclustering data points might represent interesting targets in an image database. In this paper, we propose a modified ClusterTree structure (called SS-ClusterTree), which is based on subspace clustering. The SS-ClusterTree includes two kinds of leaf nodes, a cluster leaf node and a noise leaf node. When a new data item is added to the SS-ClusterTree, if it belongs to a cluster, it is inserted into the corresponding cluster leaf node, otherwise into the noise leaf node. The noise leaf node will be split when its volume is more than a certain threshold. We present a novel updating technique which optimizes the internal structure of the SS-ClusterTree by utilizing the Newton's Universal Law of Gravitation. When a noise node is split, the attraction forces are calculated between every new node and its sibling nodes. These new nodes may be merged by their sibling nodes, if the attraction force between them is the most significant. Meanwhile the nodes intersecting boundaries are updated. This approach guarantees that the SS-ClusterTree always represents the current dataset structure, and helps identify the pattern hiding in the newly added data. SS-ClusterTree can efficiently support the dynamic insertion and manage the dataset with non-clustering data, and is highly adaptive to any kind of cluster structure. Our experiment results also show that this index structure is effective and efficient.

Original languageEnglish (US)
Title of host publicationCIVR 2008 - Proceedings of the International Conference on Content-based Image and Video Retrieval
Pages95-104
Number of pages10
DOIs
StatePublished - 2008
Externally publishedYes
Event2008 International Conference on Image and Video Retrieval, CIVR 2008 - Niagara Falls, ON, United States
Duration: Jul 7 2008Jul 9 2008

Publication series

NameCIVR 2008 - Proceedings of the International Conference on Content-based Image and Video Retrieval

Other

Other2008 International Conference on Image and Video Retrieval, CIVR 2008
Country/TerritoryUnited States
CityNiagara Falls, ON
Period7/7/087/9/08

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Software

Keywords

  • Cluster Tree
  • Index
  • Retrieval
  • Subspace cluster

Fingerprint

Dive into the research topics of 'SS-ClusterTree: A subspace clustering based indexing algorithm over high-dimensional image features'. Together they form a unique fingerprint.

Cite this