
PPWangyc/BOLD5000-QA

Source: Hugging Face. Updated 2026-04-06. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/PPWangyc/BOLD5000-QA
Resource description:
---
license: mit
language:
- en
task_categories:
- question-answering
- visual-question-answering
tags:
- fmri
- neuroscience
- brain-decoding
- neuro-symbolic
- scene-graph
- bold5000
size_categories:
- 100K<n<1M
---

# BOLD5000-QA

An fMRI question-answering dataset built on top of the [BOLD5000](https://bold5000-dataset.github.io/website/) fMRI dataset. BOLD5000-QA pairs fMRI recordings of subjects viewing natural images with compositional question-answer pairs derived from scene graphs.

This dataset is introduced in [**Neuro-Symbolic Decoding of Neural Activity**](https://arxiv.org/abs/2603.03343) (ICLR 2026).

## Dataset Description

BOLD5000-QA converts visual scene graphs from BOLD5000 images into structured QA pairs. Each sample contains:

- **fMRI data**: Voxel-level brain activity recorded while a subject views an image, parcellated by brain atlas (e.g., Yeo-17 networks)
- **Queries**: Symbolic compositional queries about the image content (e.g., `scene() -> filter(person) -> query(holding, ?)`)
- **Answers**: Ground-truth answers (`yes`/`no` for Boolean queries, or vocabulary tokens for attribute queries)

### Statistics

| | Training | Test |
|---|---|---|
| QA examples | ~133K | ~2K |
| Subjects | 4 | 4 |

### Subjects

- `CSI1`, `CSI2`, `CSI3`, `CSI4`

## Dataset Structure

```
BOLD5000-QA/
  <subject>/              # e.g., CSI1
    train/
      <img_id>.npy        # Per-image data (queries, answers, brain_region)
    test/
      <img_id>.npy
```

Each `.npy` file is a dictionary containing:

- `queries` (list of str): Symbolic query programs
- `answers` (list of str): Corresponding answers
- `brain_region` (np.ndarray): fMRI activation parcellated by atlas

## Usage

```python
import numpy as np

sample = np.load("BOLD5000-QA/CSI1/train/0.npy", allow_pickle=True).item()
print(sample['queries'])             # list of symbolic query strings
print(sample['answers'])             # list of answer strings
print(sample['brain_region'].shape)  # fMRI region activations
```

Or use the provided PyTorch dataset loader from the [NEURONA](https://github.com/PPWangyc/neurona) codebase:

```python
from loader.fqa import FQADataset

dataset = FQADataset(data_dir="data/BOLD5000-QA", split="train", subject="CSI1")
```

## Links

- **Paper**: [Neuro-Symbolic Decoding of Neural Activity](https://arxiv.org/abs/2603.03343)
- **Code**: [github.com/PPWangyc/neurona](https://github.com/PPWangyc/neurona)
- **Project Page**: [ppwangyc.github.io/projects/neurona](https://ppwangyc.github.io/projects/neurona/)

## Citation

```bibtex
@article{wang2026neuro,
  title={Neuro-Symbolic Decoding of Neural Activity},
  author={Wang, Yanchen and Hsu, Joy and Adeli, Ehsan and Wu, Jiajun},
  journal={arXiv preprint arXiv:2603.03343},
  year={2026}
}
```
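The per-image layout described under Dataset Structure (`<subject>/<train|test>/<img_id>.npy`) can also be walked without the NEURONA loader. The following is a minimal sketch, not part of the official codebase: the helper names `iter_samples` and `count_qa_pairs` are hypothetical, and the check that `queries` and `answers` have equal length is an assumption based on the card calling the answers "corresponding".

```python
from pathlib import Path
from typing import Dict, Iterator, Tuple

import numpy as np


def iter_samples(root: str, subject: str, split: str) -> Iterator[Tuple[str, Dict]]:
    """Yield (img_id, sample_dict) for each .npy file in <root>/<subject>/<split>/.

    Hypothetical helper; the directory layout follows the dataset card.
    """
    split_dir = Path(root) / subject / split
    for path in sorted(split_dir.glob("*.npy")):
        # Each file stores a pickled dict, so allow_pickle=True and .item() are required.
        sample = np.load(path, allow_pickle=True).item()
        yield path.stem, sample


def count_qa_pairs(root: str, subject: str, split: str) -> int:
    """Count QA pairs in one split (hypothetical helper).

    Assumes queries[i] is answered by answers[i], so both lists must match in length.
    """
    total = 0
    for img_id, sample in iter_samples(root, subject, split):
        queries, answers = sample["queries"], sample["answers"]
        if len(queries) != len(answers):
            raise ValueError(
                f"{img_id}: {len(queries)} queries but {len(answers)} answers"
            )
        total += len(queries)
    return total
```

With the dataset downloaded locally, `count_qa_pairs("BOLD5000-QA", "CSI1", "train")` would give the number of QA pairs for one subject's training split. Files are visited in lexicographic filename order, so `10.npy` comes before `2.npy`.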
Provider:
PPWangyc