
arxivqa_test_subsampled

Source: ModelScope community. Updated 2025-11-27; indexed 2025-06-07.
Download link:
https://modelscope.cn/datasets/vidore/arxivqa_test_subsampled
Overview:
## Dataset Description

This is a VQA dataset based on figures extracted from arXiv publications. The data is taken from the ArXiVQA dataset introduced in [Multimodal ArXiV](https://arxiv.org/abs/2403.00231). The questions were generated synthetically using GPT-4 Vision.

### Data Curation

To ensure homogeneity across our benchmarked datasets, we subsampled the original test set to 500 pairs. Furthermore, we renamed the different columns for our purpose.

### Load the dataset

```python
from datasets import load_dataset

ds = load_dataset("vidore/arxivqa_test_subsampled", split="test")
```

### Dataset Structure

Each dataset instance has the following features:

```yaml
features:
  - name: query
    dtype: string
  - name: image
    dtype: image
  - name: image_filename
    dtype: string
  - name: options
    dtype: string
  - name: answer
    dtype: string
  - name: page
    dtype: string
  - name: model
    dtype: string
  - name: prompt
    dtype: string
  - name: source
    dtype: string
```

## Citation Information

If you use this dataset in your research, please cite the original dataset as follows:

```bibtex
@misc{li2024multimodal,
  title={Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models},
  author={Lei Li and Yuqi Wang and Runxin Xu and Peiyi Wang and Xiachong Feng and Lingpeng Kong and Qi Liu},
  year={2024},
  eprint={2403.00231},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
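As a rough illustration of working with the fields listed under Dataset Structure, the sketch below turns one record into a multiple-choice prompt and checks a predicted answer. The helper names (`parse_options`, `format_question`, `is_correct`) are hypothetical and not part of the dataset or of any library. The card only says that `options` and `answer` are strings, so the sketch assumes `options` holds a Python-style list literal (with a one-choice-per-line fallback) and that `answer` is a short label such as a letter. Inspect a few records before relying on either assumption.

```python
import ast
from typing import Any, Dict, List


def parse_options(raw: str) -> List[str]:
    """Split the `options` string into individual choices.

    Assumption: the string is a Python-style list literal such as
    "['A. one', 'B. two']". If it cannot be parsed that way, fall back
    to treating every non-empty line as one choice.
    """
    try:
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        parsed = None
    if isinstance(parsed, (list, tuple)):
        return [str(item).strip() for item in parsed]
    return [line.strip() for line in raw.splitlines() if line.strip()]


def format_question(record: Dict[str, Any]) -> str:
    """Build a text prompt: the query followed by one choice per line."""
    choices = parse_options(record["options"])
    return "\n".join([record["query"], *choices])


def is_correct(record: Dict[str, Any], prediction: str) -> bool:
    """Compare a prediction with `answer`, ignoring case and outer whitespace.

    Assumption: `answer` is a short label (for example "B"), not free text.
    """
    return prediction.strip().upper() == record["answer"].strip().upper()
```

With the dataset loaded as `ds` as in the Load the dataset section, `format_question(ds[0])` would produce the prompt for the first record, provided the assumptions above hold for the actual data.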

Provider: maas
Created: 2025-06-04