Abstract
In this manuscript, we reinvestigate an existing clustering procedure, optimal discriminant clustering (ODC; Zhang and Dai in Adv Neural Inf Process Syst 23(12):2241–2249, 2009), and propose to use cross-validation to select the tuning parameter. Furthermore, because many of the features in high-dimensional data may be non-informative for clustering, we develop a variation of ODC, sparse optimal discriminant clustering (SODC), by adding a group-lasso type of penalty to ODC. We also demonstrate that both ODC and SODC can be used as dimension reduction tools for data visualization in cluster analysis.
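To illustrate why a group-lasso type of penalty performs variable selection, here is a minimal sketch of the generic row-wise group soft-thresholding operator, the proximal map of `lam * sum_j ||W[j, :]||_2`. It is not the SODC algorithm from the paper: the function name `group_soft_threshold`, the toy matrix `W`, and the value of `lam` are assumptions made for illustration only. Each row of `W` stands for the coefficients attached to one feature, so a row shrunk exactly to zero corresponds to a feature dropped from the clustering.

```python
import numpy as np

def group_soft_threshold(W, lam):
    """Row-wise proximal operator of lam * sum_j ||W[j, :]||_2 (generic, hypothetical helper)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Shrink every row toward zero; rows whose norm is <= lam become exactly zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

# Toy coefficient matrix (assumed values): one row per feature.
W = np.array([[3.0, 4.0],   # row norm 5.0
              [0.3, 0.4],   # row norm 0.5
              [0.0, 2.0]])  # row norm 2.0

W_sparse = group_soft_threshold(W, lam=1.0)

# Indices of features whose coefficient row survived the shrinkage.
selected = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
print(W_sparse)
print(selected)
```

Because the penalty acts on whole rows rather than on individual entries, a feature is either kept across all discriminant directions or removed from all of them, which is the behavior one wants when screening out non-informative features.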
Original language | English (US) |
---|---|
Pages (from-to) | 629-639 |
Number of pages | 11 |
Journal | Statistics and Computing |
Volume | 26 |
Issue number | 3 |
State | Published - May 1 2016 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Computational Theory and Mathematics
Keywords
- Cluster analysis
- Cross-validation
- High-dimensional data
- Optimal score
- Principal components analysis
- Tuning parameter
- Variable selection