TY - JOUR
T1 - CM-CASL
T2 - Comparison-based performance modeling of software systems via collaborative active and semisupervised learning
AU - Cao, Rong
AU - Bao, Liang
AU - Wu, Chase
AU - Zhangsun, Panpan
AU - Li, Yufei
AU - Zhang, Zhe
N1 - Publisher Copyright: © 2023 Elsevier Inc.
PY - 2023/7
Y1 - 2023/7
AB - Configuration tuning for large software systems is generally challenging due to the complex configuration space and expensive performance evaluation. Most existing approaches follow a two-phase process, first learning a regression-based performance prediction model on available samples and then searching for the configurations with satisfactory performance using the learned model. Such regression-based models often suffer from the scarcity of samples due to the enormous time and resources required to run a large software system with a specific configuration. Moreover, previous studies have shown that even a highly accurate regression-based model may fail to discern the relative merit between two configurations, whereas performance comparison is actually one fundamental strategy for configuration tuning. To address these issues, this paper proposes CM-CASL, a Comparison-based performance Modeling approach for software systems via Collaborative Active and Semisupervised Learning. CM-CASL learns a classification model that compares the performance of two given configurations, and enhances the samples through a collaborative labeling process by both human experts and classifiers using an integration of active and semisupervised learning. Experimental results demonstrate that CM-CASL outperforms two state-of-the-art performance modeling approaches in terms of both classification accuracy and rank accuracy, and thus provides a better performance model for the subsequent work of configuration tuning.
KW - Active learning
KW - Comparison-based model
KW - Performance modeling
KW - Semisupervised learning
KW - Software systems
UR - http://www.scopus.com/inward/record.url?scp=85151789671&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85151789671&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2023.111686
DO - 10.1016/j.jss.2023.111686
M3 - Article
AN - SCOPUS:85151789671
SN - 0164-1212
VL - 201
JO - Journal of Systems and Software
JF - Journal of Systems and Software
M1 - 111686
ER -