Improving k-NN graph accuracy using local intrinsic dimensionality

Michael E. Houle, Vincent Oria, Arwa M. Wali

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

The k-nearest neighbor (k-NN) graph is an important data structure for many data mining and machine learning applications. The accuracy of k-NN graphs depends on the object feature vectors, which are usually represented in high-dimensional spaces. Selecting the most important features is essential for providing compact object representations and for improving the graph accuracy. Having a compact feature vector can reduce the storage space and the computational complexity of search and learning tasks. In this paper, we propose NNWID-Descent, a similarity graph construction method that utilizes the NNF-Descent framework while integrating a new feature selection criterion, Support-Weighted Intrinsic Dimensionality, that estimates the contribution of each feature to the overall intrinsic dimensionality. Through extensive experiments on various datasets, we show that NNWID-Descent allows a significant amount of local feature vector sparsification while still preserving a reasonable level of graph accuracy.

Original languageEnglish (US)
Title of host publicationSimilarity Search and Applications - 10th International Conference, SISAP 2017, Proceedings
EditorsFelix Borutta, Peer Kroger, Thomas Seidl, Christian Beecks
PublisherSpringer Verlag
Pages110-124
Number of pages15
ISBN (Print)9783319684734
DOIs
StatePublished - 2017
Event10th International Conference on Similarity Search and Applications, SISAP 2017 - Munich, Germany
Duration: Oct 4 2017Oct 6 2017

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume10609 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other10th International Conference on Similarity Search and Applications, SISAP 2017
Country/TerritoryGermany
CityMunich
Period10/4/1710/6/17

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • Feature selection
  • Intrinsic dimensionality
  • Vector sparsification
  • k-nearest neighbor graph

Fingerprint

Dive into the research topics of 'Improving k-NN graph accuracy using local intrinsic dimensionality'. Together they form a unique fingerprint.

Cite this