Rap4DQ: Learning to recommend relevant API documentation for developer questions

Yi Li, Shaohua Wang, Wenbo Wang, Tien N. Nguyen, Yan Wang, Xinyue Ye

Research output: Contribution to journal › Article › peer-review

Abstract

Developers often struggle to use API methods correctly during software development. Answering API-related questions on API Q&A forums costs API development teams considerable time. To save that time, we propose Rap4DQ, a deep learning-based approach that identifies relevant web API documentation for developers' API-related questions on API Q&A forums. Rap4DQ learns representation vectors for questions and API documentation separately using Gated Recurrent Units (GRUs) and applies different weights during training to reflect the varying importance of different API documents. Rap4DQ is trained on positive and negative samples with a loss function that minimizes the distance between questions and their relevant documentation while maximizing the distance between questions and their irrelevant documentation. Finally, a learning-to-rank layer ranks the API documentation based on the representation vectors learned by the GRUs. We conducted several experiments to evaluate Rap4DQ on three popular, large API Q&A forums: Twitter, eBay, and AdWords. The results show that Rap4DQ outperforms all baselines, with a relative improvement of up to 84.3% in AUC. Rap4DQ achieves high AUCs of 0.84, 0.88, and 0.94 in identifying relevant API documentation on Twitter, eBay, and AdWords, respectively.
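
The training objective described in the abstract, pulling a question toward its relevant documentation and pushing it away from irrelevant documentation, can be illustrated with a generic triplet margin loss. The abstract does not give the exact loss, the distance measure, or the document weighting used by Rap4DQ, so everything below is an assumption for illustration only: the Euclidean distance, the `margin` parameter, the function names, and the toy vectors standing in for GRU outputs are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(question, relevant_doc, irrelevant_doc, margin=1.0):
    """Generic triplet margin loss (an assumed stand-in, not the paper's loss).

    The loss is zero once the irrelevant document is at least `margin`
    farther from the question than the relevant document.
    """
    positive = euclidean(question, relevant_doc)
    negative = euclidean(question, irrelevant_doc)
    return max(0.0, positive - negative + margin)


def rank_documents(question, docs):
    """Return document names sorted from closest to farthest from the question."""
    return sorted(docs, key=lambda name: euclidean(question, docs[name]))


# Toy 2-D vectors standing in for learned representations (made up).
question = [1.0, 0.0]
docs = {
    "doc_relevant": [0.9, 0.1],
    "doc_irrelevant": [-1.0, 0.5],
}

loss = triplet_margin_loss(question, docs["doc_relevant"], docs["doc_irrelevant"])
ranking = rank_documents(question, docs)
```

In this toy setup the relevant document already sits much closer to the question than the irrelevant one, so the loss is zero and the relevant document ranks first. Swapping the two documents in the loss call yields a positive value, which is the signal that would drive training in a real model.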

Original language: English (US)
Article number: 23
Journal: Empirical Software Engineering
Volume: 27
Issue number: 1
State: Published - Jan 2022

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • API documentation
  • Deep learning
  • Developer forums
  • Learning-to-Rank
  • Question answering
