VDR_ibm-research_REAL-MM-RAG
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/racineai/VDR_ibm-research_REAL-MM-RAG
Description:
# VDR_ibm-research_REAL-MM-RAG - Overview
## Dataset Summary
**VDR_ibm-research_REAL-MM-RAG is a multimodal dataset that combines text and image data and supports tasks such as DSE retrieval and retrieval-augmented generation (RAG).**
## Dataset Creation
This dataset is a merge and shuffle of the following datasets in the VDR format:
- ibm-research/REAL-MM-RAG_TechSlides
- ibm-research/REAL-MM-RAG_TechReport
- ibm-research/REAL-MM-RAG_FinTabTrainSet
- ibm-research/REAL-MM-RAG_FinTabTrainSet_rephrased
- ibm-research/REAL-MM-RAG_FinSlides
- ibm-research/REAL-MM-RAG_FinReport
Rows with a corrupted or missing query, or with a missing image, were filtered out.
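
The merge, shuffle, and filter steps described above can be sketched in plain Python. This is only an illustration, not the curators' actual pipeline: the field names `query` and `image`, the helper names `is_valid` and `merge_shuffle_filter`, the seed, and the toy rows are all assumptions, and a "glitched" query is assumed here to mean a non-string or blank value.

```python
import random
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def is_valid(row: Row) -> bool:
    """Keep a row only if it has a non-blank query string and a non-null image.

    Assumption: the card does not define "glitched", so this sketch treats
    non-string or whitespace-only queries as invalid.
    """
    query = row.get("query")
    image = row.get("image")
    return isinstance(query, str) and query.strip() != "" and image is not None


def merge_shuffle_filter(sources: Iterable[List[Row]], seed: int = 0) -> List[Row]:
    """Concatenate all source datasets, drop invalid rows, then shuffle."""
    merged = [row for source in sources for row in source if is_valid(row)]
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(merged)
    return merged


# Toy stand-ins for two of the source datasets (hypothetical rows).
tech_slides = [
    {"query": "What is the throughput?", "image": b"png-bytes-1"},
    {"query": "", "image": b"png-bytes-2"},  # blank query: dropped
]
fin_report = [
    {"query": "Total revenue in 2022?", "image": b"png-bytes-3"},
    {"query": "Net income?", "image": None},  # missing image: dropped
]

merged = merge_shuffle_filter([tech_slides, fin_report], seed=42)
print(len(merged))  # 2 rows survive the filter
```

Filtering before shuffling keeps the result dependent only on the valid rows and the seed, so rerunning with the same inputs reproduces the same order.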
## Dataset Curators
- **Léo Appourchaux**
Provider:
maas
Created:
2025-11-21



