
xuehang/SyncBench

Source: Hugging Face. Updated: 2025-02-11. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/xuehang/SyncBench
Description:
SyncBench is a dataset for studying recovery from the agent out-of-sync problem. It is built from 21 popular GitHub repositories and supports adaptive upsampling and downsampling to fit custom needs. The dataset has two parts: Caller, where the test code is out of sync, and Callee, where the code under test is out of sync. It is stored in JSON and CSV formats, and each entry includes the instance ID, data type, task type, repository URL, Git commit ID, function type, function name, out-of-sync function code, up-to-date function code, out-of-sync context code (filtered and complete), up-to-date context code (filtered and complete), initial error log, parsed execution output of the out-of-sync code, parsed execution output of the up-to-date code, Python file name and path, and unit test path, among other fields. Three versions exist: the full version (24,332 entries), the filtered version (coming soon, 8k entries), and the evaluation version (300 entries, split into Callee and Caller parts).
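
The Caller/Callee split and the downsampling mentioned in the description can be illustrated with a minimal sketch. The field names (`instance_id`, `data_type`, `fn_name`), the sample records, and the helper functions `split_by_type` and `downsample` are all assumptions made for illustration; they are not the dataset's confirmed schema or tooling, so check the actual JSON or CSV files before relying on them.

```python
import json
import random
from collections import defaultdict

# Hypothetical records: field names and values are assumptions for
# illustration only, not the confirmed SyncBench schema.
SAMPLE_JSON = """
[
  {"instance_id": "c1", "data_type": "Caller", "fn_name": "test_parse"},
  {"instance_id": "c2", "data_type": "Caller", "fn_name": "test_load"},
  {"instance_id": "c3", "data_type": "Caller", "fn_name": "test_save"},
  {"instance_id": "e1", "data_type": "Callee", "fn_name": "parse"},
  {"instance_id": "e2", "data_type": "Callee", "fn_name": "load"},
  {"instance_id": "e3", "data_type": "Callee", "fn_name": "save"}
]
"""


def split_by_type(records, key="data_type"):
    """Group records by the value stored under `key` (e.g. Caller or Callee)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    return dict(groups)


def downsample(groups, per_group, seed=0):
    """Draw at most `per_group` records from each group, reproducibly."""
    rng = random.Random(seed)
    subset = {}
    for name, items in groups.items():
        k = min(per_group, len(items))  # never ask for more than is available
        subset[name] = rng.sample(items, k)
    return subset


records = json.loads(SAMPLE_JSON)
groups = split_by_type(records)
subset = downsample(groups, per_group=2)

for name in sorted(subset):
    print(name, [rec["instance_id"] for rec in subset[name]])
```

Fixing the seed keeps the drawn subset reproducible across runs, which matters when a small evaluation split is compared across experiments.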
Provider:
xuehang