arxivqa_test_subsampled

Name: arxivqa_test_subsampled
Creator: maas
Published: 2025-11-27 16:36:15
License: No description available

ModelScope community. Updated 2025-11-27; indexed 2025-06-07.

Download link:

https://modelscope.cn/datasets/vidore/arxivqa_test_subsampled

Resource description:

## Dataset Description

This is a VQA dataset based on figures extracted from arXiv publications, taken from the ArXiVQA dataset introduced in [Multimodal ArXiV](https://arxiv.org/abs/2403.00231). The questions were generated synthetically using GPT-4 Vision.

### Data Curation

To ensure homogeneity across our benchmarked datasets, we subsampled the original test set to 500 pairs. Furthermore, we renamed the columns for our purposes.

### Load the dataset

```python
from datasets import load_dataset

ds = load_dataset("vidore/arxivqa_test_subsampled", split="test")
```

### Dataset Structure

Each dataset instance has the following features:

```yaml
features:
  - name: query
    dtype: string
  - name: image
    dtype: image
  - name: image_filename
    dtype: string
  - name: options
    dtype: string
  - name: answer
    dtype: string
  - name: page
    dtype: string
  - name: model
    dtype: string
  - name: prompt
    dtype: string
  - name: source
    dtype: string
```

## Citation Information

If you use this dataset in your research, please cite the original dataset as follows:

```bibtex
@misc{li2024multimodal,
  title={Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models},
  author={Lei Li and Yuqi Wang and Runxin Xu and Peiyi Wang and Xiachong Feng and Lingpeng Kong and Qi Liu},
  year={2024},
  eprint={2403.00231},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
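Because the columns were renamed relative to the original ArXiVQA test set, it can help to confirm that a loaded split exposes the column names listed under Dataset Structure before running a benchmark. The sketch below is one possible check, not part of the dataset's tooling: `EXPECTED_COLUMNS`, `missing_columns`, and `load_and_check` are hypothetical names introduced here, and `load_and_check` assumes the `datasets` package is installed and the hub is reachable.

```python
from typing import Iterable, List

# Column names copied from the feature list in the dataset card.
EXPECTED_COLUMNS = [
    "query", "image", "image_filename", "options", "answer",
    "page", "model", "prompt", "source",
]


def missing_columns(column_names: Iterable[str]) -> List[str]:
    """Return the expected columns that are absent, in dataset-card order."""
    present = set(column_names)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def load_and_check():
    """Load the test split and raise if any documented column is missing.

    Hypothetical helper: needs the `datasets` package and network access.
    """
    # Imported lazily so that missing_columns() has no third-party dependency.
    from datasets import load_dataset

    ds = load_dataset("vidore/arxivqa_test_subsampled", split="test")
    missing = missing_columns(ds.column_names)
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return ds
```

Calling `load_and_check()` returns the split unchanged when every documented column is present. The check compares names only, so a column with an unexpected dtype would still pass.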


Provider: maas

Created: 2025-06-04
