PPWangyc/BOLD5000-QA
Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/PPWangyc/BOLD5000-QA
Resource description:
---
license: mit
language:
- en
task_categories:
- question-answering
- visual-question-answering
tags:
- fmri
- neuroscience
- brain-decoding
- neuro-symbolic
- scene-graph
- bold5000
size_categories:
- 100K<n<1M
---
# BOLD5000-QA
An fMRI question-answering dataset built on top of the [BOLD5000](https://bold5000-dataset.github.io/website/) fMRI dataset. BOLD5000-QA pairs fMRI recordings of subjects viewing natural images with compositional question-answer pairs derived from scene graphs.
This dataset is introduced in [**Neuro-Symbolic Decoding of Neural Activity**](https://arxiv.org/abs/2603.03343) (ICLR 2026).
## Dataset Description
BOLD5000-QA converts visual scene graphs from BOLD5000 images into structured QA pairs. Each sample contains:
- **fMRI data**: Voxel-level brain activity recorded while a subject views an image, parcellated by brain atlas (e.g., Yeo-17 networks)
- **Queries**: Symbolic compositional queries about the image content (e.g., `scene() -> filter(person) -> query(holding, ?)`)
- **Answers**: Ground-truth answers (`yes`/`no` for Boolean queries, or vocabulary tokens for attribute queries)
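The query strings can be split into individual steps before further processing. The following is a minimal sketch, assuming every query follows the shape of the example above: steps of the form `name(arg, ...)` joined by `->`. The helper name `parse_query` is hypothetical and is not part of the dataset or the NEURONA codebase; real queries may use syntax that this pattern does not cover.

```python
import re

# Assumption: each step looks like name(arg1, arg2, ...) and steps are joined by "->".
_STEP = re.compile(r"^\s*(\w+)\s*\((.*)\)\s*$")


def parse_query(program):
    """Split a symbolic query string into a list of (operation, arguments) steps."""
    steps = []
    for chunk in program.split("->"):
        match = _STEP.match(chunk)
        if match is None:
            raise ValueError(f"unrecognised step: {chunk!r}")
        name, raw_args = match.groups()
        # Empty parentheses yield an empty argument list.
        args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
        steps.append((name, args))
    return steps


steps = parse_query("scene() -> filter(person) -> query(holding, ?)")
print(steps)
# [('scene', []), ('filter', ['person']), ('query', ['holding', '?'])]
```

Keeping each step as an `(operation, arguments)` pair makes it straightforward to count operation types or group queries by their final step.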
### Statistics
| | Training | Test |
|---|---|---|
| QA examples | ~133K | ~2K |
| Subjects | 4 | 4 |
### Subjects
- `CSI1`, `CSI2`, `CSI3`, `CSI4`
## Dataset Structure
```
BOLD5000-QA/
  <subject>/          # e.g., CSI1
    train/
      <img_id>.npy    # Per-image data (queries, answers, brain_region)
    test/
      <img_id>.npy
```
Each `.npy` file is a dictionary containing:
- `queries` (list of str): Symbolic query programs
- `answers` (list of str): Corresponding answers
- `brain_region` (np.ndarray): fMRI activation parcellated by atlas
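The field layout above can be checked with a small round trip. This is a sketch only: the sample is synthetic, the answer `"cup"` and the `(17, 8)` array shape are made up for illustration, and `check_sample` is a hypothetical helper. It also assumes `queries` and `answers` have the same length, which the word "corresponding" suggests but the card does not state explicitly.

```python
import tempfile
from pathlib import Path

import numpy as np


def check_sample(sample):
    """Raise ValueError if a loaded sample lacks the documented fields."""
    missing = {"queries", "answers", "brain_region"} - set(sample)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # Assumption: one answer per query.
    if len(sample["queries"]) != len(sample["answers"]):
        raise ValueError("queries and answers differ in length")
    if not isinstance(sample["brain_region"], np.ndarray):
        raise ValueError("brain_region is not a NumPy array")
    return sample


# Synthetic stand-in for one per-image file; values and shape are illustrative.
fake = {
    "queries": ["scene() -> filter(person) -> query(holding, ?)"],
    "answers": ["cup"],
    "brain_region": np.zeros((17, 8), dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "0.npy"
    # np.save wraps the dict in a 0-d object array; .item() unwraps it on load.
    np.save(path, fake, allow_pickle=True)
    loaded = check_sample(np.load(path, allow_pickle=True).item())

print(sorted(loaded))
# ['answers', 'brain_region', 'queries']
```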
## Usage
```python
import numpy as np

# Each file stores a pickled dict, so allow_pickle=True is required;
# .item() unwraps the 0-d object array that np.load returns.
sample = np.load("BOLD5000-QA/CSI1/train/0.npy", allow_pickle=True).item()
print(sample['queries']) # list of symbolic query strings
print(sample['answers']) # list of answer strings
print(sample['brain_region'].shape) # fMRI region activations
```
Or use the provided PyTorch dataset loader from the [NEURONA](https://github.com/PPWangyc/neurona) codebase:
```python
from loader.fqa import FQADataset
dataset = FQADataset(data_dir="data/BOLD5000-QA", split="train", subject="CSI1")
```
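For quick inspection without the NEURONA loader, a whole split can also be walked with plain NumPy. The sketch below assumes the directory layout shown under Dataset Structure and that `queries[i]` is answered by `answers[i]`. The names `iter_qa` and `answer_counts` are hypothetical helpers, not part of any released code.

```python
from collections import Counter
from pathlib import Path

import numpy as np


def iter_qa(root, subject, split):
    """Yield (img_id, brain_region, query, answer) for every QA pair in a split."""
    split_dir = Path(root) / subject / split
    for path in sorted(split_dir.glob("*.npy")):
        sample = np.load(path, allow_pickle=True).item()
        # Assumption: queries and answers are aligned by position.
        for query, answer in zip(sample["queries"], sample["answers"]):
            yield path.stem, sample["brain_region"], query, answer


def answer_counts(root, subject, split):
    """Count how often each answer string occurs in a split."""
    return Counter(answer for _, _, _, answer in iter_qa(root, subject, split))


# Runs only if the dataset has been downloaded to this relative path.
if Path("BOLD5000-QA").is_dir():
    print(answer_counts("BOLD5000-QA", "CSI1", "train").most_common(5))
```

All QA pairs from the same image share one `brain_region` array, so the generator yields a reference to it rather than a copy.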
## Links
- **Paper**: [Neuro-Symbolic Decoding of Neural Activity](https://arxiv.org/abs/2603.03343)
- **Code**: [github.com/PPWangyc/neurona](https://github.com/PPWangyc/neurona)
- **Project Page**: [ppwangyc.github.io/projects/neurona](https://ppwangyc.github.io/projects/neurona/)
## Citation
```bibtex
@article{wang2026neuro,
title={Neuro-Symbolic Decoding of Neural Activity},
author={Wang, Yanchen and Hsu, Joy and Adeli, Ehsan and Wu, Jiajun},
journal={arXiv preprint arXiv:2603.03343},
year={2026}
}
```
Provider:
PPWangyc



