nebius/SWE-agent-trajectories
Source: Hugging Face | Updated: 2024-12-23 | Indexed: 2024-12-21
Download link:
https://hf-mirror.com/datasets/nebius/SWE-agent-trajectories
Description:
This dataset contains 80,036 trajectories generated by a software engineering agent built on the SWE-agent framework, with various models serving as action generators. In each trajectory, the agent attempts to solve a GitHub issue from [nebius/SWE-bench-extra](https://huggingface.co/datasets/nebius/SWE-bench-extra) or from the dev split of [princeton-nlp/SWE-bench](https://huggingface.co/datasets/princeton-nlp/SWE-bench). The dataset was created as part of a research project on developing a software engineering agent with open-weight models; the project achieved a score of 40.6% on the SWE-bench Verified benchmark. Data collection had two stages: (1) gathering issue-resolution instances with a methodology similar to that of SWE-bench, and (2) generating a large number of trajectories that attempt to solve the collected issues. The code patches generated in these trajectories were evaluated against the tests from the linked pull requests to determine which patches passed.
Provided by:
nebius
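
Because each trajectory ends in a patch that either passed or failed the tests from the linked pull request, a common first step is to summarize pass rates. The sketch below shows one way to do that on a handful of made-up records. The field names (`instance_id`, `model_name`, `passed`) and the sample values are assumptions for illustration only; they are not the dataset's confirmed schema, so check the dataset card and adjust the keys before applying this to real rows.

```python
from collections import defaultdict

# Made-up records for illustration. The field names are hypothetical and
# are NOT taken from the dataset's actual schema.
SAMPLE = [
    {"instance_id": "issue-1", "model_name": "model-a", "passed": True},
    {"instance_id": "issue-1", "model_name": "model-b", "passed": False},
    {"instance_id": "issue-2", "model_name": "model-a", "passed": False},
    {"instance_id": "issue-2", "model_name": "model-a", "passed": True},
]


def pass_rate_by_model(records):
    """Return {model_name: fraction of trajectories whose patch passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for rec in records:
        totals[rec["model_name"]] += 1
        passes[rec["model_name"]] += int(rec["passed"])
    return {model: passes[model] / totals[model] for model in totals}


def resolved_instances(records):
    """Return the set of issue ids with at least one passing trajectory."""
    return {rec["instance_id"] for rec in records if rec["passed"]}


rates = pass_rate_by_model(SAMPLE)
resolved = resolved_instances(SAMPLE)
print(rates)
print(sorted(resolved))
```

Separating per-model pass rates from per-issue resolution matters because several trajectories can target the same issue: an issue counts as resolved if any one of its trajectories passes, while the pass rate counts every trajectory.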



