Abstract
We propose a cross-validation algorithm for estimating the number of communities in a general non-uniform hypergraph model. The algorithm has three steps. First, the set of hyperedges is randomly divided into a training set and a testing set. Second, for each candidate number of communities, a spectral estimate of the community labels and a least squares estimate of the hyperedge probabilities are constructed from the training set. Third, cross-validation scores are computed on the testing set. The proposed algorithm is shown to be consistent as the number of vertices tends to infinity.
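The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name in it is hypothetical. It rests on several assumptions the abstract does not specify: hyperedges have size 2 or 3 only, the random split is taken over all candidate vertex subsets together with their 0/1 presence indicators, the spectral step applies a plain k-means to the leading eigenvectors of a clique-expansion matrix, the least squares estimate of a hyperedge probability is the training mean within each block of community labels, and the score is the mean squared prediction error on the testing set.

```python
import itertools
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns one integer label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def spectral_labels(n, train_edges, k, rng):
    """Assumed spectral step: k-means on top-k eigenvectors of a clique expansion."""
    A = np.zeros((n, n))
    for e in train_edges:
        for i, j in itertools.combinations(e, 2):
            A[i, j] += 1.0
            A[j, i] += 1.0
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-k:]  # k eigenvalues largest in magnitude
    return kmeans(vecs[:, top], k, rng)

def block_key(subset, labels):
    """Block of a vertex subset: its size and the sorted labels of its members."""
    return (len(subset), tuple(sorted(int(labels[v]) for v in subset)))

def cv_score(n, train, test, k, rng):
    """Fit labels and block probabilities on train; mean squared error on test."""
    labels = spectral_labels(n, [s for s, y in train if y == 1], k, rng)
    sums, counts = {}, {}
    for s, y in train:
        key = block_key(s, labels)
        sums[key] = sums.get(key, 0.0) + y
        counts[key] = counts.get(key, 0) + 1
    overall = sum(y for _, y in train) / len(train)  # fallback for unseen blocks
    loss = 0.0
    for s, y in test:
        key = block_key(s, labels)
        p_hat = sums[key] / counts[key] if key in counts else overall
        loss += (y - p_hat) ** 2
    return loss / len(test)

def estimate_num_communities(n, indicators, k_max, test_frac=0.2, seed=0):
    """Random train/test split, then pick the candidate K with the lowest score."""
    rng = np.random.default_rng(seed)
    items = list(indicators.items())
    perm = rng.permutation(len(items))
    n_test = int(test_frac * len(items))
    test = [items[i] for i in perm[:n_test]]
    train = [items[i] for i in perm[n_test:]]
    scores = {k: cv_score(n, train, test, k, rng) for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores

def simulate_hypergraph(n, seed=1, p_in=0.6, p_out=0.05):
    """Toy two-community data: indicators for all subsets of size 2 and 3."""
    rng = np.random.default_rng(seed)
    truth = np.arange(n) % 2
    indicators = {}
    for m in (2, 3):
        for s in itertools.combinations(range(n), m):
            same = len({int(truth[v]) for v in s}) == 1
            indicators[s] = int(rng.random() < (p_in if same else p_out))
    return indicators

if __name__ == "__main__":
    data = simulate_hypergraph(24)
    k_hat, scores = estimate_num_communities(24, data, k_max=4)
    print("selected K:", k_hat)
    print("scores:", scores)
```

Enumerating every candidate subset is only practical for small vertex counts and small hyperedge sizes; the toy probabilities `p_in` and `p_out` are arbitrary and were chosen only to make the two communities well separated.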
| Original language | English (US) |
|---|---|
| Article number | e70066 |
| Journal | Stat |
| Volume | 14 |
| Issue number | 2 |
| DOIs | |
| State | Published - Jun 2025 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- cross-validation
- model selection consistency
- non-uniform hypergraph
- number of communities