TuringEnterprises/CRAVE
Source: Hugging Face. Updated 2025-12-12; indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/TuringEnterprises/CRAVE
Description:
CRAVE is a balanced, quality-filtered code review classification dataset containing 1,200 samples drawn from 123 repositories and 600 pull requests. It is intended for training and evaluating code review agents that classify pull request changes as either APPROVE or REQUEST_CHANGES. Each sample includes features such as the pull request URL, title, code diff, patch, label, and explanation. The data was collected by gathering pull requests from diverse open-source repositories, extracting their code diffs and metadata, and using heuristic rules to select meaningful code reviews. The dataset also documents its repository distribution and change type distribution.
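The classification task described above can be illustrated with a minimal sketch that checks label balance and scores a trivial baseline. The field names `title`, `diff`, and `label`, the toy records, and the `always_approve` baseline are assumptions made for illustration; only the two label values come from the description, and the actual column names in the dataset may differ.

```python
from collections import Counter

# The two label values named in the dataset description.
LABELS = ("APPROVE", "REQUEST_CHANGES")


def label_counts(records, label_key="label"):
    """Count how many records carry each review label."""
    return Counter(r[label_key] for r in records)


def accuracy(records, predict, label_key="label"):
    """Fraction of records where predict(record) equals the stored label."""
    if not records:
        raise ValueError("no records to evaluate")
    correct = sum(1 for r in records if predict(r) == r[label_key])
    return correct / len(records)


def always_approve(record):
    """Hypothetical baseline that approves every pull request."""
    return "APPROVE"


# Hypothetical toy records; field names are assumed, not taken from the dataset.
toy_records = [
    {"title": "Fix typo in README", "diff": "-teh\n+the", "label": "APPROVE"},
    {"title": "Add retry logic", "diff": "+retry(3)", "label": "REQUEST_CHANGES"},
    {"title": "Bump version", "diff": "-1.0\n+1.1", "label": "APPROVE"},
    {"title": "Remove null check", "diff": "-if x:", "label": "REQUEST_CHANGES"},
]

if __name__ == "__main__":
    print(label_counts(toy_records))
    print(accuracy(toy_records, always_approve))
```

On a balanced two-class set, a constant prediction like this scores about one half, which gives a floor against which a real review agent can be compared.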
Provider:
TuringEnterprises



