Artifacts of the paper under review by ESEC/FSE
Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/7571182
Description:
This repository has been deprecated. Please refer to this link for the latest version.
--------------------------------------------------------------------
This is the online repository of CCT5: A Code-Change-Oriented Pre-Trained Model, a research paper under review by ESEC/FSE. We release the source code and relevant data of CCT5, the data used in our evaluation, as well as the experiment results.
Getting Started
pytorch==1.8.0
cudatoolkit=11.1
datasets==1.18.3
transformers==4.16.2
tensorboard==2.8.0
tree-sitter==0.19.1
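The pinned versions above can be compared against the active environment with a short script. This is a minimal sketch, not part of the released artifact: the pip distribution names are assumptions (the conda package pytorch is published on PyPI as torch), cudatoolkit is skipped because it is a conda package, and find_mismatches is a hypothetical helper name. It needs Python 3.8 or later for importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the list above. The pip distribution names are an
# assumption: the conda package "pytorch" is published on PyPI as "torch".
# cudatoolkit is a conda package with no pip equivalent, so it is skipped.
PINNED = {
    "torch": "1.8.0",
    "datasets": "1.18.3",
    "transformers": "4.16.2",
    "tensorboard": "2.8.0",
    "tree-sitter": "0.19.1",
}


def find_mismatches(pinned, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        # Local build tags such as "1.8.0+cu111" still count as a match.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        found = installed if installed is not None else "not installed"
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every listed package is installed at the pinned version; the CUDA toolkit version still has to be checked separately.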
Dataset
We provide the datasets for pretraining and for the three downstream tasks. Download the datasets and uncompress them into the data directory.
pretraining/CodeChangeNet.tar.lrz contains the dataset used in pretraining, i.e., CodeChangeNet. CodeChangeNet is a collection of over 1000 star projects written in six popular programming languages: Go, Java, JavaScript, PHP, Python, and Ruby.
finetune/MessageGeneration contains the download and processing scripts for downstream task 1, Commit Message Generation.
finetune/CommentUpdate contains the dataset for downstream task 2, Just-in-Time Comment Update.
finetune/JITDefectPrediction contains the dataset for downstream task 3, Just-in-Time Defect Prediction.
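Before pretraining or finetuning, it can help to confirm that the entries listed above are in place. The following is a minimal sketch under two assumptions: the data directory sits at the repository root, and the archive is still present next to its extracted contents. The helper name missing_entries is hypothetical and not part of the released code.

```python
from pathlib import Path

# Relative paths named in the dataset list above.
EXPECTED = [
    "pretraining/CodeChangeNet.tar.lrz",
    "finetune/MessageGeneration",
    "finetune/CommentUpdate",
    "finetune/JITDefectPrediction",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # "data" is assumed to be relative to the repository root.
    for rel in missing_entries("data"):
        print(f"missing: data/{rel}")
```

The check only tests for existence; it does not validate the contents of any dataset.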
Pretrain the model
cd sh
bash pretrain.sh
Finetune and evaluate the downstream tasks
Commit Message Generation
cd sh
bash finetune_msggen.sh
Just-in-Time Comment Update
cd sh
bash finetune_cup.sh
Just-in-Time Defect Prediction
cd sh
bash finetune_jit.sh
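To run the three finetuning scripts above in one pass, a small driver along these lines may be convenient. It is a sketch rather than part of the artifact: run_finetuning is a hypothetical helper, the run order is arbitrary, and it assumes the process starts from the repository root so that sh resolves as in the commands above.

```python
import subprocess

# Script names taken from the commands above; the order is arbitrary.
FINETUNE_SCRIPTS = {
    "Commit Message Generation": "finetune_msggen.sh",
    "Just-in-Time Comment Update": "finetune_cup.sh",
    "Just-in-Time Defect Prediction": "finetune_jit.sh",
}


def run_finetuning(scripts=FINETUNE_SCRIPTS, script_dir="sh", runner=subprocess.run):
    """Run each script from script_dir and return {task: exit code}."""
    exit_codes = {}
    for task, script in scripts.items():
        # cwd mirrors the "cd sh" step, in case the scripts use relative paths.
        completed = runner(["bash", script], cwd=script_dir)
        exit_codes[task] = completed.returncode
    return exit_codes
```

Calling run_finetuning() returns a dictionary that maps each task name to the exit code of its script, so a failed task does not stop the remaining ones from running.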
Results
The experiment results for the two generation tasks and the ablation study are stored in the results directory, under the subdirectories MessageGeneration, CommentUpdate, and Ablation, respectively.
Created:
2023-05-25



