Abstract
Recently, many regularized estimators of large covariance matrices have been proposed, and the tuning parameters in these estimators are usually selected via cross-validation. However, there is no consensus on the number of folds to use. One round of cross-validation involves partitioning a sample of data into two complementary subsets, a training set and a validation set. In this manuscript, we demonstrate that if estimation accuracy is measured in the Frobenius norm, the training set should contain the majority of the data, whereas if estimation accuracy is measured in the operator norm, the validation set should contain the majority of the data. We also develop methods for selecting tuning parameters based on the bootstrap and compare them with their cross-validation counterparts. We demonstrate that the cross-validation methods with ‘optimal’ choices of folds are more appropriate than their bootstrap counterparts.
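To make the splitting idea concrete, the following is a minimal sketch, not the paper's actual procedure, of one style of random-split cross-validation for choosing the threshold level of a soft-thresholded sample covariance estimator, with accuracy measured in the Frobenius norm. All function names (`soft_threshold_cov`, `cv_select_lambda`) and default settings are illustrative assumptions using NumPy only.

```python
import numpy as np

def soft_threshold_cov(X, lam):
    """Sample covariance with off-diagonal entries soft-thresholded at lam (illustrative)."""
    S = np.cov(X, rowvar=False)
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))  # leave the diagonal unthresholded
    return T

def cv_select_lambda(X, lambdas, train_frac=0.9, n_splits=50, seed=None):
    """Select a threshold by repeated random training/validation splits.

    train_frac controls the size of the training set; the abstract's message is that
    a large training set suits Frobenius-norm accuracy, while a large validation set
    suits operator-norm accuracy (replace ord='fro' with ord=2 and shrink train_frac).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_train = int(round(train_frac * n))
    losses = np.zeros(len(lambdas))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = X[perm[:n_train]], X[perm[n_train:]]
        S_valid = np.cov(valid, rowvar=False)  # validation-set sample covariance
        for j, lam in enumerate(lambdas):
            est = soft_threshold_cov(train, lam)
            losses[j] += np.linalg.norm(est - S_valid, ord='fro') ** 2
    return lambdas[int(np.argmin(losses))]

# Hypothetical usage: n = 200 observations of p = 30 variables
X = np.random.default_rng(0).standard_normal((200, 30))
lam_hat = cv_select_lambda(X, lambdas=np.linspace(0.0, 0.5, 26), train_frac=0.9)
print("selected threshold:", lam_hat)
```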
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 494-509 |
| Number of pages | 16 |
| Journal | Journal of Statistical Computation and Simulation |
| Volume | 86 |
| Issue number | 3 |
| DOIs | |
| State | Published - Feb 11 2016 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty
- Applied Mathematics
Keywords
- Frobenius norm
- banding
- bootstrap
- covariance matrix
- cross-validation
- operator norm
- thresholding