Artifacts of the paper under review by ESEC/FSE

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/7571182
Description:
This repository has been deprecated. Please refer to this link for the latest version.

--------------------------------------------------------------------

This is the online repository of CCT5: A Code-Change-Oriented Pre-Trained Model, a research paper under review by ESEC/FSE. We release the source code and relevant data of CCT5, the data used in our evaluation, and the experiment results.

Getting Started

    pytorch==1.8.0
    cudatoolkit=11.1
    datasets==1.18.3
    transformers==4.16.2
    tensorboard==2.8.0
    tree-sitter==0.19.1

Dataset

We provide the datasets for pretraining and for three downstream tasks. The datasets should be downloaded and uncompressed in the data directory.

pretraining/CodeChangeNet.tar.lrz contains the dataset used in pretraining, i.e., CodeChangeNet. CodeChangeNet is a collection of over 1000 star projects written in six popular programming languages: Go, Java, JavaScript, PHP, Python, and Ruby.
finetune/MessageGeneration contains the download and processing script for task 1, Commit Message Generation.
finetune/CommentUpdate contains the dataset for downstream task 2, Just-in-Time Comment Update.
finetune/JITDefectPrediction contains the dataset for downstream task 3, Just-in-Time Defect Prediction.

Pretrain the model

    cd sh
    bash pretrain.sh

Finetune and evaluate the downstream task

Commit Message Generation

    cd sh
    bash finetune_msggen.sh

Just-in-Time Comment Update

    cd sh
    bash finetune_cup.sh

Just-in-Time Defect Prediction

    cd sh
    bash finetune_jit.sh

Results

The experiment results of the two generation tasks and the ablation study are stored in the results directory, under MessageGeneration, CommentUpdate, and Ablation, respectively.
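
The pinned versions listed under Getting Started can be checked against the local environment before running the scripts. The following is a minimal sketch, not part of the released artifact. The helper names (parse_pins, installed_version, check_pins) are hypothetical, and the mapping from the conda-style names pytorch and cudatoolkit to pip distributions is an assumption: pytorch is looked up as torch, and cudatoolkit is skipped because it has no pip metadata.

```python
from importlib import metadata

# Pins copied from the "Getting Started" list of the description.
PINS = (
    "pytorch==1.8.0 cudatoolkit=11.1 datasets==1.18.3 "
    "transformers==4.16.2 tensorboard==2.8.0 tree-sitter==0.19.1"
)

# Assumption: conda package names mapped to pip distribution names.
# None marks a package that cannot be checked through pip metadata.
PIP_NAMES = {"pytorch": "torch", "cudatoolkit": None}


def parse_pins(text):
    """Split 'name==version' or 'name=version' tokens into {name: version}."""
    pins = {}
    for token in text.split():
        name, _, version = token.partition("=")
        pins[name] = version.lstrip("=")
    return pins


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        dist = PIP_NAMES.get(name, name)
        if dist is None:
            continue  # not checkable through pip metadata
        found = lookup(dist)
        # Drop local build tags such as '+cu111' before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(parse_pins(PINS)).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Running the file prints one line per package whose installed version differs from the pin or that is missing. The lookup parameter exists only so the check can be exercised without the packages installed.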
Created: 2023-05-25