TY - GEN
T1 - SS-ClusterTree
T2 - 2008 International Conference on Image and Video Retrieval, CIVR 2008
AU - Xu, Hongli
AU - Yu, Dantong
AU - Xu, De
AU - Zhang, Aidong
PY - 2008
Y1 - 2008
N2 - The rapid growth in the volume of image and video data collections motivates the research of building an index structure in image information retrieval. Constructing an index in the image database poses a very challenging problem due to the facts of image databases containing data with high dimensions and lack of domain knowledge. ClusterTree is an indexing approach representing clusters generated by any existing clustering approach and do not need any prior knowledge. It is a hierarchy of clusters and subcluster which incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. However, one disadvantage of ClusterTree is that non-clustering data points are often ignored. These nonclustering data points might represent interesting targets in an image database. In this paper, we propose a modified ClusterTree structure (called SS-ClusterTree), which is based on subspace clustering. The SS-ClusterTree includes two kinds of leaf nodes, a cluster leaf node and a noise leaf node. When a new data item is added to the SS-ClusterTree, if it belongs to a cluster, it is inserted into the corresponding cluster leaf node, otherwise into the noise leaf node. The noise leaf node will be split when its volume is more than a certain threshold. We present a novel updating technique which optimizes the internal structure of the SS-ClusterTree by utilizing the Newton's Universal Law of Gravitation. When a noise node is split, the attraction forces are calculated between every new node and its sibling nodes. These new nodes may be merged by their sibling nodes, if the attraction force between them is the most significant. Meanwhile the nodes intersecting boundaries are updated. This approach guarantees that the SS-ClusterTree always represents the current dataset structure, and helps identify the pattern hiding in the newly added data. SS-ClusterTree can efficiently support the dynamic insertion and manage the dataset with non-clustering data, and is highly adaptive to any kind of cluster structure. Our experiment results also show that this index structure is effective and efficient.
AB - The rapid growth in the volume of image and video data collections motivates the research of building an index structure in image information retrieval. Constructing an index in the image database poses a very challenging problem due to the facts of image databases containing data with high dimensions and lack of domain knowledge. ClusterTree is an indexing approach representing clusters generated by any existing clustering approach and do not need any prior knowledge. It is a hierarchy of clusters and subcluster which incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. However, one disadvantage of ClusterTree is that non-clustering data points are often ignored. These nonclustering data points might represent interesting targets in an image database. In this paper, we propose a modified ClusterTree structure (called SS-ClusterTree), which is based on subspace clustering. The SS-ClusterTree includes two kinds of leaf nodes, a cluster leaf node and a noise leaf node. When a new data item is added to the SS-ClusterTree, if it belongs to a cluster, it is inserted into the corresponding cluster leaf node, otherwise into the noise leaf node. The noise leaf node will be split when its volume is more than a certain threshold. We present a novel updating technique which optimizes the internal structure of the SS-ClusterTree by utilizing the Newton's Universal Law of Gravitation. When a noise node is split, the attraction forces are calculated between every new node and its sibling nodes. These new nodes may be merged by their sibling nodes, if the attraction force between them is the most significant. Meanwhile the nodes intersecting boundaries are updated. This approach guarantees that the SS-ClusterTree always represents the current dataset structure, and helps identify the pattern hiding in the newly added data. SS-ClusterTree can efficiently support the dynamic insertion and manage the dataset with non-clustering data, and is highly adaptive to any kind of cluster structure. Our experiment results also show that this index structure is effective and efficient.
KW - Cluster Tree
KW - Index
KW - Retrieval
KW - Subspace cluster
UR - http://www.scopus.com/inward/record.url?scp=57549106262&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=57549106262&partnerID=8YFLogxK
U2 - 10.1145/1386352.1386369
DO - 10.1145/1386352.1386369
M3 - Conference contribution
AN - SCOPUS:57549106262
SN - 9781605580708
T3 - CIVR 2008 - Proceedings of the International Conference on Content-based Image and Video Retrieval
SP - 95
EP - 104
BT - CIVR 2008 - Proceedings of the International Conference on Content-based Image and Video Retrieval
Y2 - 7 July 2008 through 9 July 2008
ER -