Abstract
In this manuscript, we reinvestigate an existing clustering procedure, optimal discriminant clustering (ODC; Zhang and Dai in Adv Neural Inf Process Syst 23(12):2241–2249, 2009), and propose using cross-validation to select its tuning parameter. Furthermore, because many features in high-dimensional data may be non-informative for clustering, we develop a variant of ODC, sparse optimal discriminant clustering (SODC), by adding a group-lasso type penalty to ODC. We also demonstrate that both ODC and SODC can be used as dimension reduction tools for data visualization in cluster analysis.
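To illustrate why a group-lasso type penalty performs variable selection, the sketch below applies the standard row-wise group soft-thresholding operator (the proximal operator of `lam * sum_j ||w_j||_2`) to a small coefficient matrix whose rows correspond to features. This is a generic illustration of the penalty, not the authors' SODC algorithm; the function names, the matrix `W`, and the value of `lam` are hypothetical and do not come from the paper.

```python
import math


def group_soft_threshold(rows, lam):
    """Shrink each row (one group per feature) toward zero.

    A row whose Euclidean norm is at most `lam` becomes exactly zero,
    which removes that feature from every discriminant direction at once.
    """
    shrunk = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        # Scale factor max(0, 1 - lam / ||row||); an all-zero row stays zero.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        shrunk.append([scale * x for x in row])
    return shrunk


def selected_features(rows, tol=1e-12):
    """Return indices of features whose coefficient row is not zeroed out."""
    return [j for j, row in enumerate(rows)
            if math.sqrt(sum(x * x for x in row)) > tol]


# Hypothetical 3-feature, 2-direction coefficient matrix.
W = [[3.0, 4.0],   # strong feature: shrunk but kept
     [0.1, 0.1],   # weak feature: row norm below lam, so dropped
     [0.0, 0.0]]   # already inactive
W_sparse = group_soft_threshold(W, lam=1.0)
kept = selected_features(W_sparse)
```

Because the penalty acts on whole rows rather than on individual entries, a feature is either retained in all directions or removed from all of them, which is the behavior one would want when screening out non-informative features.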
| Original language | English (US) |
|---|---|
| Pages (from-to) | 629-639 |
| Number of pages | 11 |
| Journal | Statistics and Computing |
| Volume | 26 |
| Issue number | 3 |
| State | Published - May 1 2016 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Computational Theory and Mathematics
Keywords
- Cluster analysis
- Cross-validation
- High-dimensional data
- Optimal score
- Principal components analysis
- Tuning parameter
- Variable selection