evalstate/openclaw-pr
Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/evalstate/openclaw-pr
Description:
---
pretty_name: Transformers PR Slop Dataset
configs:
- config_name: issues
  data_files:
  - split: train
    path: issues.parquet
  default: true
- config_name: prs
  data_files:
  - split: train
    path: pull_requests.parquet
- config_name: issue_comments
  data_files:
  - split: train
    path: issue_comments.parquet
- config_name: pr_comments
  data_files:
  - split: train
    path: pr_comments.parquet
- config_name: pr_reviews
  data_files:
  - split: train
    path: reviews.parquet
- config_name: pr_files
  data_files:
  - split: train
    path: pr_files.parquet
- config_name: pr_diffs
  data_files:
  - split: train
    path: pr_diffs.parquet
- config_name: review_comments
  data_files:
  - split: train
    path: review_comments.parquet
- config_name: links
  data_files:
  - split: train
    path: links.parquet
- config_name: events
  data_files:
  - split: train
    path: events.parquet
- config_name: new_contributors
  data_files:
  - split: train
    path: new_contributors.parquet
---
# Transformers PR Slop Dataset
Normalized snapshots of issues, pull requests, comments, reviews, and linkage data from `openclaw/openclaw`.
Files:
- `issues.parquet`
- `pull_requests.parquet`
- `comments.parquet`
- `issue_comments.parquet` (derived view of issue discussion comments)
- `pr_comments.parquet` (derived view of pull request discussion comments)
- `reviews.parquet`
- `pr_files.parquet`
- `pr_diffs.parquet`
- `review_comments.parquet`
- `links.parquet`
- `events.parquet`
- `new_contributors.parquet`
- `new-contributors-report.json`
- `new-contributors-report.md`
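
Several config names in the card metadata differ from the parquet file they point to (for example `prs` reads `pull_requests.parquet`, and `pr_reviews` reads `reviews.parquet`). The sketch below copies that mapping from the metadata and shows one possible way to load a config. The helper names `file_for_config` and `load_config` are illustrative only, and `load_config` assumes the Hugging Face `datasets` package is installed and the repository is reachable.

```python
REPO_ID = "evalstate/openclaw-pr"

# Config name -> parquet file, copied from the `configs` block of the card.
CONFIG_FILES = {
    "issues": "issues.parquet",
    "prs": "pull_requests.parquet",
    "issue_comments": "issue_comments.parquet",
    "pr_comments": "pr_comments.parquet",
    "pr_reviews": "reviews.parquet",
    "pr_files": "pr_files.parquet",
    "pr_diffs": "pr_diffs.parquet",
    "review_comments": "review_comments.parquet",
    "links": "links.parquet",
    "events": "events.parquet",
    "new_contributors": "new_contributors.parquet",
}


def file_for_config(config_name):
    """Return the parquet file backing a config (hypothetical helper)."""
    if config_name not in CONFIG_FILES:
        raise KeyError(f"unknown config: {config_name!r}")
    return CONFIG_FILES[config_name]


def load_config(config_name="issues"):
    """Load the single `train` split of one config (hypothetical helper).

    The import is deferred so the mapping above works without `datasets`.
    """
    from datasets import load_dataset

    file_for_config(config_name)  # fail early on a misspelled config name
    return load_dataset(REPO_ID, config_name, split="train")
```

Calling `load_config("prs")` would then return the pull request table. Note that `comments.parquet` and the two report files appear in the file list but have no config entry, so they would need to be downloaded directly.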
Use:
- duplicate PR and issue analysis
- triage and ranking experiments
- eval set creation
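
As a rough illustration of the duplicate-analysis use case, the sketch below flags pairs of records whose titles are nearly identical, using `difflib` from the Python standard library. The field names `number` and `title` are assumptions, since the card does not document the schema, and the records are made-up toy data rather than rows from the dataset. Title similarity is only one possible baseline, not a method the dataset authors describe.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicate_titles(records, threshold=0.8):
    """Return (number_a, number_b, score) for title pairs at or above threshold.

    `records` is an iterable of dicts with assumed keys `number` and `title`.
    """
    pairs = []
    for a, b in combinations(list(records), 2):
        score = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
        if score >= threshold:
            pairs.append((a["number"], b["number"], round(score, 3)))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy records for illustration only; not taken from the dataset.
toy_prs = [
    {"number": 1, "title": "Fix typo in README"},
    {"number": 2, "title": "Fix typos in README"},
    {"number": 3, "title": "Add retry logic to HTTP client"},
]

candidates = near_duplicate_titles(toy_prs)
```

The pairwise comparison is quadratic in the number of records, so a real experiment over a full snapshot would likely need blocking or an embedding index first.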
Notes:
- updated daily
- latest snapshot: `20260324T233649Z`
- raw data only; no labels or moderation decisions
- PR metadata, file-level patch hunks, and full unified diffs are included
- new contributor reviewer artifacts are included when generated for the snapshot
- full file contents for changed files are not included
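
The snapshot identifier above looks like a compact UTC timestamp (`YYYYMMDDTHHMMSSZ`). Assuming that reading is right, a short sketch such as the following could parse it and check whether a local copy is older than the daily update cadence. The function names and the one-day default are illustrative choices, not part of the dataset.

```python
from datetime import datetime, timedelta, timezone


def parse_snapshot_id(snapshot_id):
    """Parse an identifier like `20260324T233649Z` into an aware UTC datetime."""
    parsed = datetime.strptime(snapshot_id, "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(snapshot_id, now, max_age=timedelta(days=1)):
    """Return True if the snapshot is older than `max_age` relative to `now`."""
    return now - parse_snapshot_id(snapshot_id) > max_age


snapshot_time = parse_snapshot_id("20260324T233649Z")
```

Passing `now` explicitly (for example `datetime.now(timezone.utc)`) keeps the check deterministic and easy to test.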
Provider:
evalstate



