TY - GEN
T1 - A Hybrid Approach for Inference between Behavioral Exception API Documentation and Implementations, and Its Applications
AU - Nguyen, Hoan Anh
AU - Phan, Hung Dang
AU - Khairunnesa, Samantha Syeda
AU - Nguyen, Son
AU - Yadavally, Aashish
AU - Wang, Shaohua
AU - Rajan, Hridesh
AU - Nguyen, Tien
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/9/19
Y1 - 2022/9/19
N2 - Automatically producing behavioral exception (BE) API documentation helps developers use libraries correctly. State-of-the-art approaches are either rule-based, which is too restrictive in applicability, or deep learning (DL)-based, which requires a large training dataset. To address this, we propose StatGen, a novel hybrid approach combining statistical machine translation (SMT) and tree-structured translation to generate BE documentation for any code and vice versa. We consider the documentation and source code of an API method as two abstraction levels of the same intent. StatGen is specifically designed for this two-way inference and takes advantage of their structures for higher accuracy. We conducted several experiments to evaluate StatGen. We show that it achieves high precision (75% and 75%) and recall (81% and 84%) in inferring BE documentation from source code and vice versa. StatGen achieves higher precision, recall, and BLEU score than the state-of-the-art DL-based baseline models. We show StatGen's usefulness in two applications. First, we use it to generate BE documentation for Apache APIs that lack documentation by learning from the documentation of the equivalent APIs in JDK. 44% of the generated documentation was rated as useful and 42% as somewhat useful. In the second application, we use StatGen to detect inconsistencies between the BE documentation and the corresponding implementations of several JDK8 packages.
AB - Automatically producing behavioral exception (BE) API documentation helps developers use libraries correctly. State-of-the-art approaches are either rule-based, which is too restrictive in applicability, or deep learning (DL)-based, which requires a large training dataset. To address this, we propose StatGen, a novel hybrid approach combining statistical machine translation (SMT) and tree-structured translation to generate BE documentation for any code and vice versa. We consider the documentation and source code of an API method as two abstraction levels of the same intent. StatGen is specifically designed for this two-way inference and takes advantage of their structures for higher accuracy. We conducted several experiments to evaluate StatGen. We show that it achieves high precision (75% and 75%) and recall (81% and 84%) in inferring BE documentation from source code and vice versa. StatGen achieves higher precision, recall, and BLEU score than the state-of-the-art DL-based baseline models. We show StatGen's usefulness in two applications. First, we use it to generate BE documentation for Apache APIs that lack documentation by learning from the documentation of the equivalent APIs in JDK. 44% of the generated documentation was rated as useful and 42% as somewhat useful. In the second application, we use StatGen to detect inconsistencies between the BE documentation and the corresponding implementations of several JDK8 packages.
KW - Behavioral Exception Specification
KW - Hybrid Machine Translation
KW - Inference between Documentation and Code
UR - http://www.scopus.com/inward/record.url?scp=85146930883&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146930883&partnerID=8YFLogxK
U2 - 10.1145/3551349.3560434
DO - 10.1145/3551349.3560434
M3 - Conference contribution
AN - SCOPUS:85146930883
T3 - ACM International Conference Proceeding Series
BT - 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
A2 - Aehnelt, Mario
A2 - Kirste, Thomas
PB - Association for Computing Machinery
T2 - 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
Y2 - 10 October 2022 through 14 October 2022
ER -