DLFix: Context-based code transformation learning for automated program repair

Li Yi, Shaohua Wang, Tien N. Nguyen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Automated Program Repair (APR) helps developers during software development and maintenance. Despite recent advances in deep learning (DL), DL-based APR approaches still have limitations in learning bug-fixing code changes and the context of the source code surrounding those changes. These limitations lead to incorrect fixing locations or incorrect fixes. In this paper, we introduce DLFix, a two-tier DL model that treats APR as code transformation learning from prior bug fixes and the surrounding code contexts of those fixes. The first layer is a tree-based RNN model that learns the contexts of bug fixes; its result is used as an additional weighting input for the second layer, which is designed to learn the bug-fixing code transformations. We conducted several experiments to evaluate DLFix on two benchmarks, Defects4J and Bugs.jar, and on a newly built bug dataset with a total of more than 20K real-world bugs in eight projects. We compared DLFix against a total of 13 state-of-the-art pattern-based APR tools. Our results show that DLFix can auto-fix more bugs than 11 of them, and is comparable and complementary to the top two pattern-based APR tools: there are 7 and 11 unique bugs, respectively, that those two tools cannot detect but DLFix can. Importantly, DLFix is fully automated and data-driven, and does not require hard-coding of bug-fixing patterns as those tools do. We also compared DLFix against 4 state-of-the-art DL-based APR models. DLFix is able to fix 2.5 times more bugs than the best-performing baseline.

Original language: English (US)
Title of host publication: Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering, ICSE 2020
Publisher: IEEE Computer Society
Pages: 602-614
Number of pages: 13
ISBN (Electronic): 9781450371216
State: Published - Jun 27 2020
Event: 42nd ACM/IEEE International Conference on Software Engineering, ICSE 2020 - Virtual, Online, Korea, Republic of
Duration: Jun 27 2020 to Jul 19 2020

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 42nd ACM/IEEE International Conference on Software Engineering, ICSE 2020
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: 6/27/20 to 7/19/20

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Automated program repair
  • Context-based code transformation learning
  • Deep learning
