
Datasets & utils for paper USING PRE-TRAINED MODELS TO PARTIALLY AUTOMATE CODE REVIEW ACTIVITIES

Indexed by NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/4812784
Description:
Raw and processed datasets and configuration files for pre-training and fine-tuning T5 models.

Pre-training dataset: obtained by mining Stack Overflow and CodeSearchNet data.

Fine-tuning datasets: we will fine-tune our T5 small model on different datasets obtained by mining code review data from Gerrit and GitHub repositories.
Fine-tuning dataset v1 (Small): same dataset used by Tufano et al., with abstracted code and raw comments.
Fine-tuning dataset v2 (Small): same dataset used by Tufano et al., with non-abstracted code and cleaned comments.
Fine-tuning dataset (Large): our new large dataset.
Created:
2021-06-11