TY - JOUR
T1 - An investigation of the fault-proneness of clone evolutionary patterns
AU - Barbour, Liliane
AU - An, Le
AU - Khomh, Foutse
AU - Zou, Ying
AU - Wang, Shaohua
N1 - Funding Information:
Acknowledgements: The authors would like to thank the anonymous reviewers for their detailed feedback and useful suggestions that greatly contributed to improving this paper. This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Publisher Copyright:
© 2017, Springer Science+Business Media New York.
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Two identical or similar code fragments form a clone pair. Previous studies have identified cloning as a risky practice. Therefore, a developer needs to be aware of any clone pairs in order to properly propagate any changes between clones. A clone pair may experience many changes during the creation and maintenance of a software system. A change can either maintain or remove the similarity between clones in a clone pair. If a change maintains the similarity between clones, the clone pair is left in a consistent state. When a change makes the clones no longer similar, the clone pair is left in an inconsistent state. The set of states and changes experienced by clone pairs over time forms an evolution history known as a clone genealogy. In this paper, we examine clone genealogies to identify fault-prone “patterns” of states and changes. We explore the use of clone genealogy information in fault prediction. We conduct a quasi-experiment with four long-lived software systems (i.e., Apache Ant, ArgoUML, JEdit, Maven) and identify clones using the NiCad and iClones clone detection tools. Overall, we find that the size of the clone can impact the fault-proneness of a clone pair. However, there is no clear impact of the time interval between changes to a clone pair on the fault-proneness of the clone pair. We also discover that adding clone genealogy information can increase the explanatory power of fault prediction models.
KW - Clone genealogies
KW - Fault-proneness
KW - Metrics
UR - http://www.scopus.com/inward/record.url?scp=85020708563&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020708563&partnerID=8YFLogxK
U2 - 10.1007/s11219-017-9375-5
DO - 10.1007/s11219-017-9375-5
M3 - Article
AN - SCOPUS:85020708563
SN - 0963-9314
VL - 26
SP - 1187
EP - 1222
JO - Software Quality Journal
JF - Software Quality Journal
IS - 4
ER -