vidore_benchmark_qa_dummy
ModelScope community · Updated 2025-07-24 · Indexed 2025-06-07
Download link:
https://modelscope.cn/datasets/vidore/vidore_benchmark_qa_dummy
Resource overview:
## Dataset Description
This dataset is a small subset of the [`vidore/syntheticDocQA_energy_test`](https://huggingface.co/datasets/vidore/syntheticDocQA_energy_test) dataset.
It is intended for debugging and testing.
### Load the dataset
```python
from datasets import load_dataset
ds = load_dataset("vidore/vidore_benchmark_qa_dummy", split="test")
```
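Since the dataset is meant for debugging, it can help to print a compact view of one record rather than the full content. The following is a minimal sketch, assuming each record is a plain dictionary and that image values are PIL-style objects exposing `size` and `mode`; `summarize_record` is a hypothetical helper, not part of the `datasets` library.

```python
from typing import Any, Dict, Mapping


def summarize_record(record: Mapping[str, Any], max_chars: int = 60) -> Dict[str, str]:
    """Return a short, printable summary of one record (hypothetical helper)."""
    summary: Dict[str, str] = {}
    for key, value in record.items():
        if isinstance(value, str):
            # Truncate long strings such as prompts or answers.
            summary[key] = value if len(value) <= max_chars else value[:max_chars] + "..."
        elif hasattr(value, "size") and hasattr(value, "mode"):
            # Assumption: image values look like PIL images; show metadata, not pixels.
            summary[key] = f"<image size={value.size} mode={value.mode}>"
        else:
            summary[key] = repr(value)
    return summary
```

With the dataset loaded as `ds`, calling `print(summarize_record(ds[0]))` should give a one-screen overview of the first record.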
### Dataset Structure
The dataset has the following features:
```yaml
features:
  - name: query
    dtype: string
  - name: image
    dtype: image
  - name: image_filename
    dtype: string
  - name: answer
    dtype: string
  - name: page
    dtype: string
  - name: model
    dtype: string
  - name: prompt
    dtype: string
  - name: source
    dtype: string
```
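When this subset is used in a test suite, a quick schema check can catch a mismatched or outdated copy early. The sketch below hard-codes the column names from the feature list above; `EXPECTED_COLUMNS` and `missing_columns` are illustrative names, not part of any library.

```python
from typing import Iterable, List

# Column names copied from the feature list in this card.
EXPECTED_COLUMNS = [
    "query",
    "image",
    "image_filename",
    "answer",
    "page",
    "model",
    "prompt",
    "source",
]


def missing_columns(column_names: Iterable[str]) -> List[str]:
    """Return the expected columns that are absent, in the card's order."""
    present = set(column_names)
    return [name for name in EXPECTED_COLUMNS if name not in present]
```

After loading the dataset as `ds`, `missing_columns(ds.column_names)` should return an empty list if the loaded copy matches this card.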
## Citation Information
If you use this dataset in your research, please cite the original dataset as follows:
```bibtex
@misc{faysse2024colpaliefficientdocumentretrieval,
  title={ColPali: Efficient Document Retrieval with Vision Language Models},
  author={Manuel Faysse and Hugues Sibille and Tony Wu and Gautier Viaud and Céline Hudelot and Pierre Colombo},
  year={2024},
  eprint={2407.01449},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2407.01449},
}
```
Provider: maas
Created: 2025-06-04



