ibm-research/REAL-MM-RAG_TechReport

Name: ibm-research/REAL-MM-RAG_TechReport
Creator: ibm-research
Published: 2025-03-16 05:31:20
License: not specified

Hugging Face, updated 2025-03-16, indexed 2025-04-12

Download link:

https://hf-mirror.com/datasets/ibm-research/REAL-MM-RAG_TechReport

Description:

REAL-MM-RAG-Bench is a real-world multi-modal retrieval benchmark designed to evaluate the performance of retrieval models in reliable, challenging, and realistic settings. The benchmark is constructed using an automated pipeline where queries are generated by a vision-language model (VLM), filtered by a large language model (LLM), and rephrased by an LLM to ensure high-quality retrieval evaluation. Multi-level query rephrasing is introduced to simulate real-world retrieval challenges, ensuring that models are tested on their true semantic understanding rather than simple keyword matching.
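The motivation for rephrased queries can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy pages, the queries, the field names (`gold`, `original`, `rephrased`), and the keyword-overlap scorer are hypothetical and do not reflect the dataset's actual schema or any model evaluated on the benchmark. It only shows how a retriever that relies on word overlap can score well on the original wording and fail once the query is reworded.

```python
from typing import Dict, List

# Hypothetical toy corpus: page id -> page text (stand-in for document pages).
PAGES: Dict[str, str] = {
    "p1": "quarterly revenue growth of the storage division",
    "p2": "power consumption of the mainframe cooling system",
    "p3": "latency benchmarks for the flash array",
}

# Hypothetical queries: each has a gold page, an original wording,
# and a rephrased wording that avoids the page's vocabulary.
QUERIES: List[Dict[str, str]] = [
    {
        "gold": "p1",
        "original": "revenue growth of the storage division",
        "rephrased": "how much did disk sales increase for the year",
    },
    {
        "gold": "p2",
        "original": "power consumption of the cooling system",
        "rephrased": "how much electricity does the chiller use",
    },
]


def keyword_score(query: str, page: str) -> int:
    """Count distinct lowercase words shared by the query and the page."""
    return len(set(query.lower().split()) & set(page.lower().split()))


def rank(query: str) -> List[str]:
    """Rank page ids by descending keyword overlap; ties break by page id."""
    return sorted(PAGES, key=lambda pid: (-keyword_score(query, PAGES[pid]), pid))


def recall_at_k(variant: str, k: int = 1) -> float:
    """Fraction of queries whose gold page appears in the top-k results."""
    hits = sum(1 for q in QUERIES if q["gold"] in rank(q[variant])[:k])
    return hits / len(QUERIES)


if __name__ == "__main__":
    for variant in ("original", "rephrased"):
        print(variant, recall_at_k(variant, k=1))
```

On this toy data the keyword scorer retrieves every gold page at rank 1 for the original wording and none for the rephrased wording, which is the kind of gap that rephrased queries are meant to expose.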

Provided by:

ibm-research
