PEARL

Indexed from NIAID Data Ecosystem on 2026-03-12

Download link:

https://zenodo.org/record/4452147

Description:

This repository contains the artifacts of PEARL.

The files prediction_CoCoNut.csv and prediction_ManySStuBs4J.csv contain the prediction results of BEP on the two datasets. The column element_Rank gives the rank of the buggy element, and the column rank gives the rank of the operation path.

The file Pipeline_Results.zip contains the results of our repair pipeline on the 111 single-token bugs. In each folder, rem.txt holds the buggy line, add.txt holds the correct line, and context.txt holds the whole buggy method. These three files are the required inputs for each prediction. query.txt is the content sent to AnyCodeGen, and the returned code fragment is stored in result.txt.

We also release the cleaned ManySStuBs4J dataset, which contains only single-token bugs, in Single-token_bugs_in_ManySStuBs4J.zip. In this archive, each bug corresponds to three lines with the same line number in three different files: rem.txt, add.txt, and context.txt, which hold the buggy line, the correct line, and the buggy method, respectively. This storage style follows the training set of CoCoNut. Note that we do not release our pre-processed CoCoNut dataset since it is too large. We encourage researchers to process this dataset based on their own needs. We do, however, provide the script we used for selecting single-token bugs, single-token_selection.py.

We also note that there is no visual interface for the newly added bugs in Defects4J-V2.0. We therefore provide our script for selecting single-token bugs from these bugs (getSingle-token_bugs_from_D4J.py), making it easy for others to reproduce our experiment.

Finally, we release our source code in source code.zip and will build a GitHub homepage for PEARL upon acceptance.

Case study on the failure of PEARL:

PEARL does not work well on fixes that involve method names. For the following bug (Closure-10), our BEP model successfully predicts the oracle operation path as No. 2.

    - return allResultsMatch(n, MAY_BE_STRING_PREDICATE);
    + return anyResultsMatch(n, MAY_BE_STRING_PREDICATE);

Nonetheless, we have to change this statement into "return ??" when querying AnyCodeGen because of its format restriction. Unfortunately, AnyCodeGen cannot synthesize such a detailed method call, so PEARL fails to repair this bug.

Case study on the high CR of PEARL:

    // Ground-truth patch for Closure-62
      if (excerpt.equals(LINE)
    -     && 0 <= charno && charno < sourceExcerpt.length()) {
    +     && 0 <= charno && charno <= sourceExcerpt.length()) {

    // An overfitting patch generated for Closure-62 by jKali
    - if (excerpt.equals(LINE)
    -     && 0 <= charno && charno < sourceExcerpt.length()) {
    + if (true) {

The ground-truth patch (which is also the patch PEARL generates) and the jKali-generated patch for the bug Closure-62 are listed above. The ground-truth patch changes an operator < into <=. PEARL first identifies this operator and the operation type (update). After obtaining this information, we change the third line to "&& 0 <= charno && ??) {" and send the whole method to AnyCodeGen. After computing the probabilities of candidate answers, AnyCodeGen returns "charno <= sourceExcerpt.length()" as the code fragment to replace "??", and thus we generate the correct patch.

In contrast, jKali is a Java implementation of Kali. The operators it implements, such as removing statements and changing if conditions to true or false, are too coarse-grained. In this example, jKali identifies the whole conditional statement and changes its condition to true, which leads to the overfitting patch. This case is a vivid example showing that fine-grained buggy element localization can avoid generating overfitting patches.
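The parallel-file layout and the "??" query described above can be illustrated with a minimal Python sketch. It is not taken from the PEARL source code: the names Bug, read_bugs, and make_query are hypothetical, UTF-8 encoding is assumed, and the real content of query.txt may be built differently from the simple string replacement shown here.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Bug(NamedTuple):
    """One bug, assembled from the same line number of three files."""
    buggy_line: str    # line from rem.txt
    fixed_line: str    # line from add.txt
    buggy_method: str  # line from context.txt


def read_bugs(folder: str) -> Iterator[Bug]:
    """Yield one Bug per line shared by rem.txt, add.txt, and context.txt.

    Assumption: the three files have the same number of lines and are
    UTF-8 encoded; zip() stops at the shortest file otherwise.
    """
    root = Path(folder)
    with (root / "rem.txt").open(encoding="utf-8") as rem, \
         (root / "add.txt").open(encoding="utf-8") as add, \
         (root / "context.txt").open(encoding="utf-8") as ctx:
        for r, a, c in zip(rem, add, ctx):
            yield Bug(r.rstrip("\n"), a.rstrip("\n"), c.rstrip("\n"))


def make_query(code: str, buggy_fragment: str, hole: str = "??") -> str:
    """Replace the first occurrence of buggy_fragment with a hole marker.

    This is only an illustration of the "??" placeholder idea; it does
    not reproduce how PEARL chooses the fragment to replace.
    """
    if buggy_fragment not in code:
        raise ValueError("buggy fragment not found in code")
    return code.replace(buggy_fragment, hole, 1)


if __name__ == "__main__":
    # Closure-62: the localized buggy element is the second comparison.
    line = "&& 0 <= charno && charno < sourceExcerpt.length()) {"
    print(make_query(line, "charno < sourceExcerpt.length()"))
```

Run as a script, the sketch prints the hole-bearing line for the Closure-62 example. Calling read_bugs on an unpacked dataset folder yields one (buggy line, fixed line, buggy method) triple per bug.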

Created:

2021-03-29
