AmirhosseinAbaskohi/M2QA_Bench
Source: Hugging Face. Updated: 2025-04-03. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/AmirhosseinAbaskohi/M2QA_Bench
Overview:
M2QA-Bench is a dataset of 1,000 diverse and challenging multimodal multihop question-answer (MMQA) pairs, designed to evaluate large vision-language models (LVLMs) on complex reasoning over full documents containing text, tables, and images. Each question requires multihop, cross-modal reasoning, often combining information from both text and images. The questions are drawn from real-world Wikipedia pages and vary in form and complexity. The benchmark assesses a model's ability to retrieve and reason over multimodal information distributed across multiple full-page documents.
Provided by:
AmirhosseinAbaskohi



