JetBrains/git_good_bench-lite

Source: Hugging Face. Updated: 2025-11-18. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/JetBrains/git_good_bench-lite
Description:
GitGoodBench Lite is a subset of 120 samples for evaluating how well AI agents resolve git tasks (see Supported Scenarios). The samples are split evenly across three programming languages (Python, Java, and Kotlin) and two sample types (merge conflict resolution and file-commit chain), so each combination of sample type and programming language has 20 samples. All data were collected from 100 unique, permissively licensed open-source GitHub repositories that have at least 1000 stars, at least 5 branches, and at least 10 contributors, and that are neither forks nor archived. The two sample types are named merge and file_commit_chain. A merge scenario contains one or more merge conflicts that occurred during a merge; every conflict is guaranteed to be in a Python, Java, or Kotlin file. A file_commit_chain scenario consists of two commits, the oldest and the newest, which together cover the modifications made to the file across the whole commit chain. The dataset has multiple fields: some hold metadata, and the others hold the primary data for each scenario.
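The selection criteria and the split arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RepoStats`, its field names, and `is_eligible` are hypothetical and are not fields or functions of the dataset or of any JetBrains tooling. Only the thresholds and counts come from the description.

```python
from dataclasses import dataclass

# Values taken from the description above.
LANGUAGES = ("Python", "Java", "Kotlin")
SAMPLE_TYPES = ("merge", "file_commit_chain")
SAMPLES_PER_COMBINATION = 20


@dataclass(frozen=True)
class RepoStats:
    """Hypothetical repository summary; field names are assumptions."""
    stars: int
    branches: int
    contributors: int
    is_fork: bool
    is_archived: bool
    has_permissive_license: bool


def is_eligible(repo: RepoStats) -> bool:
    """Return True if the repository meets every stated criterion."""
    return (
        repo.has_permissive_license
        and repo.stars >= 1000
        and repo.branches >= 5
        and repo.contributors >= 10
        and not repo.is_fork
        and not repo.is_archived
    )


def expected_total() -> int:
    """Samples implied by an even split over languages and sample types."""
    return len(LANGUAGES) * len(SAMPLE_TYPES) * SAMPLES_PER_COMBINATION
```

With three languages, two sample types, and 20 samples per combination, `expected_total()` reproduces the stated size of 120 samples.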
Provider:
JetBrains