PassionPrc/hotpot_RL
Source: Hugging Face. Updated 2026-04-15; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/PassionPrc/hotpot_RL
Resource description:
---
license: apache-2.0
task_categories:
- question-answering
- visual-question-answering
language:
- en
tags:
- hotpotqa
- multi-hop-reasoning
- vlm
- reinforcement-learning
- document-understanding
size_categories:
- 10K<n<100K
---
# HotpotQA-RL: Merged-Context Dataset for Vision-Language Models
A reformatted version of [HotpotQA](https://hotpotqa.github.io/) designed for Reinforcement Learning with Vision-Language Models. Every **two** adjacent questions share one merged context, doubling the document length and increasing retrieval difficulty.
## What's Different from Original HotpotQA
- **Merged contexts**: Every 2 consecutive items share the same context (20 paragraphs, ~1,800 words), simulating longer documents
- **Image format**: Context text is rendered as PNG images (Verdana 9pt, A4 layout) for VLM input
- **Embedded binary**: Images are stored directly as `bytes` inside the parquet files — fully self-contained, no external files needed
## Dataset Statistics
| Split | Items | Shared-Context Pairs | Avg Words/Context | Avg Image Size | File Size |
|-------|-------|---------------------|-------------------|----------------|-----------|
| `train` | 90,447 | 45,223 | 1,834 | 365 KB | 31.2 GB |
| `dev_distractor` | 7,405 | 3,702 | 1,855 | 369 KB | 2.5 GB |
| `dev_fullwiki` | 7,405 | 3,702 | 1,904 | 377 KB | 2.5 GB |
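
As a quick consistency check on the table above, the pair counts equal the item counts divided by two, rounded down. Each split has an odd item count, so one item per split is not part of a complete pair. The sketch below only redoes this arithmetic from the published numbers; `pair_stats` is an illustrative helper, and how the leftover item is stored is not stated in this card.

```python
# Item counts copied from the Dataset Statistics table.
ITEMS = {"train": 90_447, "dev_distractor": 7_405, "dev_fullwiki": 7_405}


def pair_stats(n_items: int) -> tuple[int, int]:
    """Return (complete pairs, leftover items) for a split with n_items rows."""
    return divmod(n_items, 2)


for split, n in ITEMS.items():
    pairs, leftover = pair_stats(n)
    print(f"{split}: {pairs} pairs, {leftover} leftover item(s)")
```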
### Context Length Distribution
| Metric | train | dev_distractor | dev_fullwiki |
|--------|-------|---------------|--------------|
| Min words | 250 | 709 | 616 |
| Max words | 3,994 | 3,703 | 3,838 |
| Median words | 1,814 | 1,840 | 1,884 |
| Min chars | 1,598 | 4,448 | 3,871 |
| Max chars | 24,987 | 22,878 | 23,620 |
### Answer Distribution
- ~93.9% span answers (entity names, dates, etc.)
- ~6.1% yes/no answers
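
The card does not say how predictions are scored against `answer`. A minimal sketch, assuming a SQuAD-style normalized exact match (a common convention for HotpotQA-style evaluation, not something this dataset prescribes), which also separates yes/no answers from span answers. All function names here are illustrative.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_type(answer: str) -> str:
    """Classify a gold answer as 'yes/no' or 'span'."""
    return "yes/no" if normalize_answer(answer) in {"yes", "no"} else "span"


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))
```

In an RL setup, `exact_match(model_output, row['answer'])` could serve as a sparse reward; a softer token-overlap score may suit span answers better, which is a design choice left to the user.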
### Evidence (Golden Supporting Facts)
| Metric | train | dev_distractor | dev_fullwiki |
|--------|-------|---------------|--------------|
| Min evidence sentences | 2 | 2 | 0 |
| Max evidence sentences | 12 | 8 | 7 |
| Mean evidence sentences | 2.4 | 2.4 | 1.4 |
> Note: `dev_fullwiki` has some empty evidence lists because the supporting facts reference paragraphs that may not exist in the fullwiki-retrieved context.
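
Code that relies on `evidence` may want to set aside the items whose list is empty. A minimal sketch, assuming rows are available as plain dictionaries keyed by column name; `split_by_evidence` is a hypothetical helper. It checks `len()` instead of truthiness because list columns read through pandas typically come back as NumPy arrays, for which `not array` raises when the array holds more than one element.

```python
from typing import Iterable


def split_by_evidence(rows: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Partition rows into (with_evidence, without_evidence)."""
    with_ev: list[dict] = []
    without_ev: list[dict] = []
    for row in rows:
        # len() works for Python lists and NumPy arrays alike.
        if len(row["evidence"]) > 0:
            with_ev.append(row)
        else:
            without_ev.append(row)
    return with_ev, without_ev
```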
## Data Format
Each parquet file contains the following columns:
| Column | Type | Description |
|--------|------|-------------|
| `id` | `string` | Unique question ID (from original HotpotQA) |
| `question` | `string` | The question text |
| `context` | `string` | Full context text (title + sentences from all paragraphs, separated by newlines) |
| `context_img` | `bytes` | PNG image of the rendered context text (binary) |
| `evidence` | `list[string]` | Golden evidence sentences (supporting facts) |
| `answer` | `string` | Ground-truth answer |
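
Since `context_img` holds raw PNG bytes, its dimensions can be read without decoding the whole image. The sketch below uses only the standard library and relies on the PNG file layout (an 8-byte signature followed by the `IHDR` chunk); `png_size` is an illustrative helper, not part of the dataset.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) in pixels, read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG byte string")
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Width and height are stored as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

For a loaded row, this would be called as `png_size(row['context_img'])`, for example to budget image tokens before passing the page to a VLM.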
### Shared Context
Items are paired: every two consecutive rows (indices 0 and 1, 2 and 3, ...) share the **same** `context` and `context_img`. Each item has its own `question`, `answer`, and `evidence`.
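
The pairing rule can be sketched as a small iterator. This assumes rows are held in order as dictionaries keyed by column name; `iter_context_pairs` is a hypothetical helper. It verifies that both rows of a pair really share a context and skips a trailing row that has no partner.

```python
from typing import Iterator, Sequence


def iter_context_pairs(rows: Sequence[dict]) -> Iterator[tuple[dict, dict]]:
    """Yield (first, second) for rows 0 and 1, 2 and 3, ... sharing one context."""
    # Stopping at len(rows) - 1 leaves out a trailing unpaired row.
    for i in range(0, len(rows) - 1, 2):
        first, second = rows[i], rows[i + 1]
        if first["context"] != second["context"]:
            raise ValueError(f"rows {i} and {i + 1} do not share a context")
        yield first, second
```

With pandas, `rows` could be built from the text columns only, for example `df[['id', 'question', 'context']].to_dict('records')`, to avoid copying the image bytes.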
### Example
```python
import io

import pandas as pd
from PIL import Image

# Note: this reads the whole file into memory; the train split is 31.2 GB.
df = pd.read_parquet("hotpot_train.parquet")
row = df.iloc[0]
print(row['question'])  # "Which magazine was started first..."
print(row['answer'])    # "Arthur's Magazine"
print(row['evidence'])  # ["Arthur's Magazine (1844–1846)...", "First for Women is..."]

# Decode the embedded PNG bytes and view the context image
img = Image.open(io.BytesIO(row['context_img']))
img.show()
```
## Files
```
├── hotpot_train.parquet # Training set (90,447 items)
├── hotpot_dev_distractor.parquet # Dev set - distractor setting (7,405 items)
└── hotpot_dev_fullwiki.parquet # Dev set - fullwiki setting (7,405 items)
```
## Source
- Original dataset: [HotpotQA](https://hotpotqa.github.io/) (Yang et al., 2018)
- Distractor vs Fullwiki: Both dev sets share the same questions but differ in context — distractor provides 10 paragraphs (2 gold + 8 distractors), fullwiki retrieves paragraphs from Wikipedia
## Citation
```bibtex
@inproceedings{yang2018hotpotqa,
title={HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering},
author={Yang, Zhilin and Qi, Peng and Zhang, Saizheng and Bengio, Yoshua and Cohen, William and Salakhutdinov, Ruslan and Manning, Christopher D.},
booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
year={2018}
}
```
Provider:
PassionPrc



