Datasets & utils for the paper "Using Pre-Trained Models to Partially Automate Code Review Activities"
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4812784
Description:
Raw and processed datasets and configuration files for pre-training and fine-tuning T5 models.
Pre-training dataset: obtained by mining Stack Overflow and CodeSearchNet data.
Fine-tuning datasets: we fine-tune our T5 small model on different datasets obtained by mining code review data from Gerrit and GitHub repositories.
Fine-tuning dataset v1 (small): the same dataset used by Tufano et al., with abstracted code and raw comments.
Fine-tuning dataset v2 (small): the same dataset used by Tufano et al., with non-abstracted code and cleaned comments.
Fine-tuning dataset (large): our new large dataset.
Created:
2021-06-11



