Abstract
The density peaks clustering (DPC) algorithm is a novel density-based clustering algorithm that is simple and efficient, does not require the number of clusters to be specified in advance, and can find clusters of arbitrary, nonspherical shape. However, DPC relies heavily on how the cutoff distance threshold and local density are calculated, and it cannot analyze complex manifold data, especially datasets with uneven density distribution and multiple peaks in the same cluster. To solve these problems, we propose an improved density peaks clustering algorithm based on layered k-nearest neighbors and subcluster merging (LKSM_DPC). First, we redefine the local density calculation using layered k-nearest neighbors: to adapt to datasets with different densities, the k-nearest neighbors are divided into multiple layers. Second, to handle multiple peaks in the same cluster, we design a new mechanism that calculates the similarity of subclusters based on the idea of shared neighbors and Newton's law of gravitation, and we propose a subcluster merging strategy. To demonstrate the effectiveness of our algorithm, we compare LKSM_DPC with K-means, DBSCAN, DPC, and DPC derivatives on 24 datasets. Extensive experiments show that our algorithm often outperforms the other algorithms.
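
For orientation, the two quantities that baseline DPC computes for every point can be sketched as follows. This is a minimal illustration of the original DPC formulation with a hard cutoff kernel, not of LKSM_DPC: the abstract does not give the layered k-nearest-neighbor density or the subcluster similarity formulas, so neither is reproduced here. The function name `dpc_quantities` and the parameter name `d_c` are assumptions made for this example.

```python
import numpy as np

def dpc_quantities(X, d_c):
    """Return (rho, delta) for baseline DPC with a hard cutoff kernel.

    X   : array of shape (n, d), one point per row
    d_c : cutoff distance threshold (chosen by the user in baseline DPC)
    """
    X = np.asarray(X, dtype=float)

    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: number of other points closer than d_c.
    # The point itself is at distance 0, so subtract 1 to exclude it.
    rho = (dist < d_c).sum(axis=1) - 1

    # delta: distance to the nearest point of strictly higher density.
    # For a point with no denser neighbor, use its maximum distance instead.
    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta
```

Points with both large `rho` and large `delta` are the candidate cluster centers. Because `rho` depends directly on the single global value `d_c`, a sparse cluster next to a dense one tends to receive uniformly low densities, which is the sensitivity to the cutoff distance and to uneven density distribution that the abstract describes.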
Original language | English (US) |
---|---|
Article number | 9129690 |
Pages (from-to) | 123449-123468 |
Number of pages | 20 |
Journal | IEEE Access |
Volume | 8 |
State | Published - 2020 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Keywords
- Density peaks clustering
- k-nearest neighbors
- multiple peaks
- shared neighbors
- subcluster merging
- the law of gravitation
- uneven density distribution