TY - JOUR
T1 - Automated, highly-accurate, bug assignment using machine learning and tossing graphs
AU - Bhattacharya, Pamela
AU - Neamtiu, Iulian
AU - Shelton, Christian R.
N1 - Funding Information:
This research was supported in part by NSF grant CCF-1149632. We thank the anonymous referees for their helpful comments on this paper.
PY - 2012/10
Y1 - 2012/10
N2 - Empirical studies indicate that automating the bug assignment process has the potential to significantly reduce software evolution effort and costs. Prior work has used machine learning techniques to automate bug assignment but has employed a narrow band of tools which can be ineffective in large, long-lived software projects. To redress this situation, in this paper we employ a comprehensive set of machine learning tools and a probabilistic graph-based model (bug tossing graphs) that lead to highly-accurate predictions, and lay the foundation for the next generation of machine learning-based bug assignment. Our work is the first to examine the impact of multiple machine learning dimensions (classifiers, attributes, and training history) along with bug tossing graphs on prediction accuracy in bug assignment. We validate our approach on Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 86.09% prediction accuracy in bug assignment and significantly reduce tossing path lengths. We show that for our data sets the Naïve Bayes classifier coupled with product-component features, tossing graphs and incremental learning performs best. Next, we perform an ablative analysis by unilaterally varying classifiers, features, and learning model to show their relative importance for bug assignment accuracy. Finally, we propose optimization techniques that achieve high prediction accuracy while reducing training and prediction time.
AB - Empirical studies indicate that automating the bug assignment process has the potential to significantly reduce software evolution effort and costs. Prior work has used machine learning techniques to automate bug assignment but has employed a narrow band of tools which can be ineffective in large, long-lived software projects. To redress this situation, in this paper we employ a comprehensive set of machine learning tools and a probabilistic graph-based model (bug tossing graphs) that lead to highly-accurate predictions, and lay the foundation for the next generation of machine learning-based bug assignment. Our work is the first to examine the impact of multiple machine learning dimensions (classifiers, attributes, and training history) along with bug tossing graphs on prediction accuracy in bug assignment. We validate our approach on Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 86.09% prediction accuracy in bug assignment and significantly reduce tossing path lengths. We show that for our data sets the Naïve Bayes classifier coupled with product-component features, tossing graphs and incremental learning performs best. Next, we perform an ablative analysis by unilaterally varying classifiers, features, and learning model to show their relative importance for bug assignment accuracy. Finally, we propose optimization techniques that achieve high prediction accuracy while reducing training and prediction time.
KW - Bug assignment
KW - Bug tossing
KW - Empirical studies
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=84863612007&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863612007&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2012.04.053
DO - 10.1016/j.jss.2012.04.053
M3 - Article
AN - SCOPUS:84863612007
SN - 0164-1212
VL - 85
SP - 2275
EP - 2292
JO - Journal of Systems and Software
JF - Journal of Systems and Software
IS - 10
ER -