nuprl-staging/agent-archive

Source: Hugging Face | Updated: 2026-01-18 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/nuprl-staging/agent-archive
Description:
This dataset contains pull request interactions from GitHub repositories where AI coding agents (Claude Code, Copilot, etc.) contributed code. The data was sourced from the AgentPack dataset and enriched with full PR timeline events from the GitHub API. Each row represents a single pull request with its complete interaction history, including comments, reviews, commits, and other timeline events. The dataset documents each column and each event type, such as commented, committed, reviewed, and line_commented. The data was collected in three steps: extracting PR references from the AgentPack dataset, fetching full PR details and timeline events via the GitHub API, and flattening the data into a tabular format for efficient querying. The dataset's limitations include the GitHub Timeline API's limits on the number of events returned and on the time frame covered, as well as the possibility that some PRs have since been deleted or made private.
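The flattening step described above can be pictured with a minimal sketch. It assumes each timeline event is a dictionary with an `event` field holding one of the event types named in the description. The function name `flatten_pr`, the input keys (`repo`, `number`, `title`), and every output column name are hypothetical and chosen only for illustration; they are not the dataset's actual schema, which should be checked on the dataset card.

```python
import json
from collections import Counter


def flatten_pr(pr: dict, timeline: list[dict]) -> dict:
    """Collapse one PR and its timeline events into a single flat row.

    All key and column names here are illustrative assumptions,
    not the real columns of the dataset.
    """
    # Count events by type; events without an "event" field are
    # grouped under "unknown".
    counts = Counter(e.get("event", "unknown") for e in timeline)
    return {
        "repo": pr["repo"],
        "pr_number": pr["number"],
        "title": pr.get("title", ""),
        "num_events": len(timeline),
        "num_commented": counts["commented"],
        "num_committed": counts["committed"],
        "num_reviewed": counts["reviewed"],
        "num_line_commented": counts["line_commented"],
        # Keep the full history as a JSON string so no detail is lost.
        "events_json": json.dumps(timeline),
    }


# Made-up sample data, used only to exercise the function.
sample_pr = {"repo": "example-org/example-repo", "number": 1, "title": "Example PR"}
sample_timeline = [
    {"event": "committed"},
    {"event": "commented"},
    {"event": "reviewed"},
    {"event": "commented"},
]

row = flatten_pr(sample_pr, sample_timeline)
```

One row per pull request keeps simple filters cheap (for example, selecting PRs with at least one review), while the serialized event list preserves the full interaction history for deeper analysis.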
Provider:
nuprl-staging