DEAR: A Novel Deep Learning-based Approach for Automated Program Repair

Yi Li, Shaohua Wang, Tien N. Nguyen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

The existing deep learning (DL)-based automated program repair (APR) models are limited in fixing general software defects. We present DEAR, a DL-based approach that supports fixing for the general bugs that require dependent changes at once to one or mul-tiple consecutive statements in one or multiple hunks of code. We first design a novel fault localization (FL) technique for multi-hunk, multi-statement fixes that combines traditional spectrum-based (SB) FL with deep learning and data-flow analysis. It takes the buggy statements returned by the SBFL model, detects the buggy hunks to be fixed at once, and expands a buggy statement s in a hunk to include other suspicious statements around s. We design a two-tier, tree-based LSTM model that incorporates cycle training and uses a divide-and-conquer strategy to learn proper code transformations for fixing multiple statements in the suitable fixing context consisting of surrounding subtrees. We conducted several experiments to evaluate DEAR on three datasets: Defects4J (395 bugs), BigFix (+26k bugs), and CPatMiner (+44k bugs). On Defects4J dataset, DEAR outperforms the baselines from 42%-683% in terms of the number of auto-fixed bugs with only the top-1 patches. On BigFix dataset, it fixes 31-145 more bugs than existing DL-based APR models with the top-1 patches. On CPatMiner dataset, among 667 fixed bugs, there are 169 (25.3%) multi-hunk/multi-statement bugs. DEAR fixes 71 and 164 more bugs, including 52 and 61 more multi-hunk/multi-statement bugs, than the state-of-the-art, DL-based APR models.

Original languageEnglish (US)
Title of host publicationProceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022
PublisherIEEE Computer Society
Pages511-523
Number of pages13
ISBN (Electronic)9781450392211
DOIs
StatePublished - 2022
Event44th ACM/IEEE International Conference on Software Engineering, ICSE 2022 - Pittsburgh, United States
Duration: May 22 2022May 27 2022

Publication series

NameProceedings - International Conference on Software Engineering
Volume2022-May
ISSN (Print)0270-5257

Conference

Conference44th ACM/IEEE International Conference on Software Engineering, ICSE 2022
Country/TerritoryUnited States
CityPittsburgh
Period5/22/225/27/22

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Automated Program Repair
  • Deep Learning
  • Fault Localization

Fingerprint

Dive into the research topics of 'DEAR: A Novel Deep Learning-based Approach for Automated Program Repair'. Together they form a unique fingerprint.

Cite this