Abstract
Multi-label classification, in which a single data instance can be associated with multiple classes simultaneously, is a common challenge in machine learning applications. This paper proposes a novel tree-based method for multi-label classification using conformal prediction and multiple hypothesis testing. The method applies hierarchical clustering to labelsets to build a hierarchical tree, and the prediction task is then formulated as a multiple-testing problem with a hierarchical structure. Split-conformal prediction is used to obtain a marginal conformal p-value for each tested hypothesis, and two hierarchical testing procedures based on these p-values are developed to control the family-wise error rate: a hierarchical Bonferroni procedure and a modification of it. Prediction sets are formed from the testing outcomes of these two procedures. We establish a theoretical guarantee of valid coverage for the prediction sets by proving that both procedures control the family-wise error rate. We demonstrate the effectiveness of our method in a simulation study and two real data analyses, comparing it with other conformal methods for multi-label classification.
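The split-conformal step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's hierarchical procedure: it only computes marginal conformal p-values for a flat list of candidate labelsets and keeps those that are not rejected at level `alpha`. The function names, the encoding of labelsets as integer class indices, and the nonconformity score (one minus the predicted probability) are all assumptions made for illustration.

```python
import numpy as np

def conformal_p_values(cal_probs, cal_labels, test_probs):
    """Marginal split-conformal p-values for each candidate labelset.

    cal_probs  : (n, K) predicted probabilities on the calibration set
    cal_labels : (n,)   index of the true labelset for each calibration point
    test_probs : (K,)   predicted probabilities for one test point

    Assumption: the nonconformity score is 1 - predicted probability.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    test_probs = np.asarray(test_probs, dtype=float)
    n = cal_labels.size

    # Calibration scores use the probability assigned to the true labelset.
    cal_scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # One test score per candidate labelset.
    test_scores = 1.0 - test_probs

    # For each candidate, count calibration scores at least as large.
    counts = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1.0 + counts) / (n + 1.0)

def prediction_set(p_values, alpha=0.1):
    """Keep every candidate whose hypothesis is not rejected at level alpha."""
    return [k for k, p in enumerate(p_values) if p > alpha]

# Toy usage with three hypothetical labelsets and four calibration points.
cal_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1],
             [0.3, 0.3, 0.4],
             [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 0]
test_probs = [0.75, 0.2, 0.05]

p_vals = conformal_p_values(cal_probs, cal_labels, test_probs)
chosen = prediction_set(p_vals, alpha=0.25)
```

In the paper's setting the hypotheses are arranged on a tree built by clustering labelsets, and the two hierarchical procedures decide which nodes to reject. That logic is not reproduced here.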
Original language | English (US) |
---|---|
Pages (from-to) | 488-512 |
Number of pages | 25 |
Journal | Proceedings of Machine Learning Research |
Volume | 204 |
State | Published - 2023 |
Event | 12th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2023, Limassol, Cyprus, Sep 13 2023 → Sep 15 2023 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Conformal Prediction
- Family-wise Error Rate
- Hierarchical Tree
- Multi-label Classification
- Multiple-Testing