ViDoSeek
Source: ModelScope community. Updated 2026-05-12; indexed 2026-05-03.
Download link:
https://modelscope.cn/datasets/iic/ViDoSeek
## 🚀Overview
This is the repository for ViDoSeek, a benchmark designed for retrieve-reason-answer tasks over visually rich documents. It is well suited to evaluating RAG over a large document corpus.
- The paper is available at [https://arxiv.org/abs/2502.18017](https://arxiv.org/abs/2502.18017).
- ViDoRAG Project: [https://github.com/Alibaba-NLP/ViDoRAG](https://github.com/Alibaba-NLP/ViDoRAG)
**ViDoSeek** sets itself apart with its heightened difficulty level, attributed to the multi-document context and the intricate nature of its content types, particularly the Layout category. The dataset contains both single-hop and multi-hop queries, presenting a diverse set of challenges.
We have also released the **SlideVQA-Refined** dataset, which was refined through our pipeline and is also suitable for evaluating retrieval-augmented generation tasks.
<img src="https://cdn-uploads.huggingface.co/production/uploads/657429d833e5a4bf5b278615/dPq5bf1P2vA0VZ50XKeXz.jpeg" style="width: 55%; height: auto;" alt="ViDoSeek">
## 🔍Dataset Format
The annotation is in the form of a JSON file.
```json
{
  "uid": "04d8bb0db929110f204723c56e5386c1d8d21587_2",
  // Unique identifier to distinguish different queries
  "query": "What is the temperature of Steam explosion of Pretreatment for Switchgrass and Sugarcane bagasse preparation?",
  // Query content
  "reference_answer": "195-205 Centigrade",
  // Reference answer to the query
  "meta_info": {
    "file_name": "Pretreatment_of_Switchgrass.pdf",
    // Original file name, typically a PDF file
    "reference_page": [10, 11],
    // Reference page numbers represented as an array
    "source_type": "Text",
    // Type of data source: 2d_layout, Text, Table, or Chart
    "query_type": "Multi-Hop"
    // Query type: Multi-Hop or Single-Hop
  }
}
```
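As a minimal sketch of how a record in this format might be consumed, the Python below flattens one record into a typed object and counts queries per `query_type` and `source_type`. The `Query` class and the `parse_record` and `breakdown` helpers are hypothetical names introduced for illustration; they are not part of the ViDoRAG code. The sketch works on a single record because the top-level layout of `vidoseek.json` is not described here.

```python
from __future__ import annotations

import json
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical flat view of one annotation record."""
    uid: str
    query: str
    reference_answer: str
    file_name: str
    reference_page: tuple[int, ...]
    source_type: str
    query_type: str


def parse_record(record: dict) -> Query:
    """Flatten one annotation record (fields as documented above)."""
    meta = record["meta_info"]
    return Query(
        uid=record["uid"],
        query=record["query"],
        reference_answer=record["reference_answer"],
        file_name=meta["file_name"],
        reference_page=tuple(meta["reference_page"]),
        source_type=meta["source_type"],
        query_type=meta["query_type"],
    )


def breakdown(queries: list[Query]) -> Counter:
    """Count queries per (query_type, source_type) pair."""
    return Counter((q.query_type, q.source_type) for q in queries)


# The example record from this README, with the explanatory comments removed
# because strict JSON does not allow `//` comments.
SAMPLE = json.loads("""
{
  "uid": "04d8bb0db929110f204723c56e5386c1d8d21587_2",
  "query": "What is the temperature of Steam explosion of Pretreatment for Switchgrass and Sugarcane bagasse preparation?",
  "reference_answer": "195-205 Centigrade",
  "meta_info": {
    "file_name": "Pretreatment_of_Switchgrass.pdf",
    "reference_page": [10, 11],
    "source_type": "Text",
    "query_type": "Multi-Hop"
  }
}
""")

if __name__ == "__main__":
    q = parse_record(SAMPLE)
    print(q.file_name, q.reference_page)
    print(breakdown([q]))
```

Grouping by both fields makes it easy to report results separately for, say, multi-hop layout questions, which the overview singles out as the hardest category.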
## 📚 Download and Pre-Process
To use ViDoSeek, you need to download the document files `vidoseek_pdf_document.zip` and query annotations `vidoseek.json`.
Optionally, you can use the code we provide to process the dataset and run inference. The processing scripts are available at [https://github.com/Alibaba-NLP/ViDoRAG/tree/main/scripts](https://github.com/Alibaba-NLP/ViDoRAG/tree/main/scripts).
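If you prefer to inspect the download before running the official scripts, the following sketch unpacks the archive and checks that every `file_name` referenced by the annotations has a matching PDF. It is not the ViDoRAG pre-processing code. The helper names are hypothetical, and the folder layout inside `vidoseek_pdf_document.zip` is an assumption, so PDFs are located by a recursive search on base name.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def extract_documents(zip_path: str, out_dir: str) -> Path:
    """Unpack the document archive into out_dir and return that directory."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def index_pdfs(doc_dir: Path) -> dict[str, Path]:
    """Map each PDF's base name to its path, searching sub-folders too.

    Assumption: base names are unique across the archive.
    """
    return {p.name: p for p in doc_dir.rglob("*.pdf")}


def missing_files(records: list[dict], pdf_index: dict[str, Path]) -> list[str]:
    """Return sorted file names that annotations reference but the index lacks."""
    wanted = {r["meta_info"]["file_name"] for r in records}
    return sorted(wanted - set(pdf_index))
```

A typical call would be `index_pdfs(extract_documents("vidoseek_pdf_document.zip", "vidoseek_docs"))`, followed by `missing_files(records, index)` on the list of annotation records once it has been loaded from `vidoseek.json`. An empty result means every query can be traced back to a document.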
## 📝 Citation
If you find this dataset useful, please consider citing our paper:
```bibtex
@article{wang2025vidorag,
title={ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents},
author={Wang, Qiuchen and Ding, Ruixue and Chen, Zehui and Wu, Weiqi and Wang, Shihang and Xie, Pengjun and Zhao, Feng},
journal={arXiv preprint arXiv:2502.18017},
year={2025}
}
```
Provider: maas
Created: 2026-04-09



