xiaomoguhzz/R3-Bench-data
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-26.
Download link: https://hf-mirror.com/datasets/xiaomoguhzz/R3-Bench-data
Description:
---
license: cc-by-4.0
pretty_name: R3-Bench Research Data
size_categories:
- n>1T
tags:
- benchmark
- vision-language
- video-understanding
- evaluation
---
# R3-Bench Research Data
> **⚠️ This is not a HuggingFace `datasets` library dataset.**
> This repo is a raw collection of project directories from R3-Bench research,
> packaged as gzip'd tar files. Use the download + extraction instructions
> below — do **not** use `datasets.load_dataset()` on this repo.
## What's inside
Large binary outputs from research on the R3-Bench evaluation framework, roughly **323 GB** in total across the tarballs listed below. Each tarball (or set of shards) corresponds one-to-one to a directory under the source working tree:
| File | Source directory | Size | Shards |
|---|---|---|---|
| `alive.tar.gz` | `alive/` | 10.7 kB | 1 |
| `backup_bin.tar.gz.part_{aa..ae}` | `backup_bin/` | ~193 GB | 5 |
| `compare_methods_OmniVerifier_data.tar.gz` | `compare_methods/OmniVerifier/` | 2.09 GB | 1 |
| `compare_methods_Step1X-Edit.tar.gz` | `compare_methods/Step1X-Edit/` | 6.23 GB | 1 |
| `compare_methods_Step1X-Editv2.tar.gz` | `compare_methods/Step1X-Editv2/` | 129 MB | 1 |
| `elo_human_eval.tar.gz` | `elo_human_eval/` | 146 MB | 1 |
| `eval.tar.gz` | `eval/` | 3.02 GB | 1 |
| `exps.tar.gz.part_{aa..ac}` | `exps/` | ~103 GB | 3 |
| `images.tar.gz` | `images/` | 725 MB | 1 |
| `iterative_ablation_output.tar.gz` | `iterative_ablation_output/` | 1.94 GB | 1 |
| `logs.tar.gz` | `logs/` | 121 kB | 1 |
| `output.tar.gz` | `output/` | 6.58 GB | 1 |
| `paper_case.tar.gz` | `paper_case/` | 5.62 GB | 1 |
Files larger than the HuggingFace single-file LFS limit were split with `split -b 45G`, which produces shards of at most 45 GiB (about 48 GB) each; the last shard of each archive is smaller.
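The split-and-reassemble round trip can be tried on a throwaway file before touching the real shards. This is a minimal sketch: the file name `demo.tar.gz` and the 1000-byte shard size are stand-ins chosen for illustration, not values from this repo.

```bash
# Self-contained demo in a temporary directory; nothing here touches real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 2500-byte stand-in for a large tarball.
head -c 2500 /dev/zero > demo.tar.gz

# Split into 1000-byte pieces; split appends two-letter suffixes (aa, ab, ...).
split -b 1000 demo.tar.gz demo.tar.gz.part_

# Count the shards: 2500 bytes at 1000 bytes per shard gives 3 pieces.
shard_count=$(ls demo.tar.gz.part_* | wc -l | tr -d ' ')
echo "shards: $shard_count"

# Reassemble; the glob expands in sorted order, matching the order split wrote.
cat demo.tar.gz.part_* > rebuilt.tar.gz

# cmp -s exits 0 only if the two files are byte-identical.
if cmp -s demo.tar.gz rebuilt.tar.gz; then
  roundtrip=ok
else
  roundtrip=mismatch
fi
echo "round trip: $roundtrip"
```

Because the suffixes sort alphabetically, a plain `part_*` glob is enough to restore the original byte order.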
## Download and extract
```bash
# 1. Install the HF client
pip install -U huggingface_hub

# 2. Download everything to a local directory
huggingface-cli download xiaomoguhzz/R3-Bench-data \
  --repo-type dataset \
  --local-dir R3-Bench-data/
cd R3-Bench-data/

# 3. Merge split shards back into monolithic tarballs.
#    The glob expands in sorted order (aa, ab, ...), the order split wrote them.
#    Shards are removed only if the merge succeeded.
cat backup_bin.tar.gz.part_* > backup_bin.tar.gz && rm backup_bin.tar.gz.part_*
cat exps.tar.gz.part_* > exps.tar.gz && rm exps.tar.gz.part_*

# 4. Extract each tarball into the current directory
for f in *.tar.gz; do
  echo "extracting $f ..."
  tar xzf "$f"
done
```
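If disk space is tight, steps 3 and 4 can be combined for the sharded archives by piping the shards straight into `tar`, so the merged `.tar.gz` is never written to disk. The sketch below is one possible way to do this; `extract_sharded` is a hypothetical helper name, not something shipped with this repo.

```bash
# Hypothetical helper: stream all shards of one archive into tar without
# creating the merged tarball first.
extract_sharded() {
  # $1 = archive base name, e.g. backup_bin.tar.gz
  cat "$1".part_* | tar xzf -
}

# Usage from inside R3-Bench-data/ (left commented so pasting runs nothing):
# extract_sharded backup_bin.tar.gz
# extract_sharded exps.tar.gz
```

To fetch only part of the repo, `huggingface-cli download` also accepts file names after the repo id (for example `eval.tar.gz`) or an `--include` glob such as `"exps.tar.gz.part_*"`.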
After extraction the source working tree layout is reproduced:
```
alive/ backup_bin/ compare_methods/ elo_human_eval/ eval/
exps/ images/ iterative_ablation_output/ logs/
output/ paper_case/
```
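The layout above can be checked after extraction with a short script. This is a sketch under the assumption that the eleven top-level directories listed are the complete set; `check_layout` is a hypothetical helper name.

```bash
# Directory names copied from the layout listing above.
expected_dirs="alive backup_bin compare_methods elo_human_eval eval exps images iterative_ablation_output logs output paper_case"

# Hypothetical helper: print every expected directory that is missing under $1
# (default: current directory) and return non-zero if any is absent.
check_layout() {
  root=${1:-.}
  missing=0
  for d in $expected_dirs; do
    if [ ! -d "$root/$d" ]; then
      echo "missing: $d"
      missing=1
    fi
  done
  return "$missing"
}

# Usage from inside R3-Bench-data/ after extraction:
# check_layout . && echo "layout complete"
```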
## Why the Dataset Viewer shows an error
The HuggingFace Dataset Viewer auto-detects tar files and tries to parse them as [WebDataset format](https://huggingface.co/docs/hub/datasets-webdataset). This repo is a raw tarball collection (not WebDataset), so the viewer reports a `SplitsNotFoundError`.
**This is expected** and does not affect downloads — the error only concerns the preview panel, not the files themselves.
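Since the preview panel cannot show the contents, a downloaded tarball can be inspected locally without extracting it. A minimal sketch follows; `peek_tarball` is a hypothetical helper name and the default of 20 entries is an arbitrary choice.

```bash
# Hypothetical helper: list the first N entries of a gzip'd tarball
# without extracting anything.
peek_tarball() {
  # $1 = tarball path, $2 = number of entries to show (default 20)
  tar tzf "$1" | head -n "${2:-20}"
}

# Usage from inside R3-Bench-data/:
# peek_tarball logs.tar.gz 10
```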
## Related code
The research code that produces and consumes this data lives in the companion GitHub repository [`xiaomoguhz/R3-Bench`](https://github.com/xiaomoguhz/R3-Bench) (currently private; reach out for access if needed).
## License
Released under **CC-BY-4.0**. You are free to share and adapt, provided you give appropriate credit.
Provider: xiaomoguhzz



