Double-Bench
Double-Bench Dataset Overview
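🚀 Dataset Introduction
Double-Bench is a large-scale, multilingual, multimodal evaluation system for assessing how multimodal large language models (MLLMs) perform in retrieval-augmented generation (RAG) systems. It is designed to provide fine-grained evaluation of each component of a document RAG system, and contains 3,276 documents (72,880 pages) and 5,168 single-hop and multi-hop queries, covering 6 languages and 4 document types.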
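💡 Dataset Highlights
- Large-scale, multimodal, and multilingual: 3,276 documents (72,880 pages), covering 4 document types and 6 languages.
- High-quality queries and annotations: 5,168 high-quality single-hop and multi-hop queries, produced through iterative refinement and knowledge-graph-guided generation, with all evidence pages verified by human experts.
- Comprehensive evaluation and in-depth insights: extensive experiments on 9 embedding models, 4 MLLMs, and 4 advanced document RAG frameworks reveal key bottlenecks.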
🔍 Dataset Structure
The dataset is stored under the ./Double_Bench directory and contains the following information:
Single-hop query example
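- uid: unique identifier.
- question: query text.
- answer: reference answer.
- reference_page: list of evidence pages.
- source_type: modality of the answer source.
- language: language of the query and document.
- doc_path: storage path of the source document.
- query_type: type of query.
- doc_type: type of source document.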
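The field list above can be checked mechanically once a record has been loaded as a Python dict. The following is a minimal sketch, not an official schema: the name `validate_single_hop` is hypothetical, and the expected types are inferred from the field descriptions and the sample record in this card.

```python
# Expected fields and types. These are inferred from the field descriptions
# and the sample record in this card (an assumption, not an official schema).
SINGLE_HOP_FIELDS = {
    "uid": str,
    "question": str,
    "answer": str,
    "reference_page": list,
    "source_type": str,
    "language": str,
    "doc_path": str,
    "query_type": str,
    "doc_type": str,
}


def validate_single_hop(record: dict) -> list[str]:
    """Return a list of problems found in a single-hop record (empty if none)."""
    problems = []
    for name, expected in SINGLE_HOP_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    # Evidence pages are page numbers, so every entry should be an integer.
    pages = record.get("reference_page")
    if isinstance(pages, list) and not all(isinstance(p, int) for p in pages):
        problems.append("reference_page: expected a list of integers")
    return problems
```

An empty return value means the record matches the assumed layout; any other result names the fields that need attention.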
Multi-hop query example
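- uid: unique identifier.
- question: the final complex query.
- answer: the final reference answer.
- reference_page: list of all evidence pages.
- language: language of the query and document.
- doc_path: storage path of the source document.
- query_type: type of query.
- source_type: list of modalities of the answer sources.
- doc_type: type of source document.
- steps: list of intermediate steps in the reasoning chain; each step contains an intermediate question, its answer, and its evidence pages.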
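In the sample multi-hop record in this card, the intermediate question and answer keys carry the step index (`question0`/`answer0`, `question1`/`answer1`, and so on), while `reference_page` keeps the same name in every step. The sketch below normalizes the steps into uniform dicts. It assumes that this key pattern holds for every record, which the card does not state; `flatten_steps` is a hypothetical helper name.

```python
def flatten_steps(record: dict) -> list[dict]:
    """Convert steps with index-suffixed keys into uniform dicts.

    Assumes step i stores its text under "question{i}" and "answer{i}",
    as in the sample record; missing keys yield None rather than an error.
    """
    flat = []
    for i, step in enumerate(record.get("steps", [])):
        flat.append({
            "index": i,
            "question": step.get(f"question{i}"),
            "answer": step.get(f"answer{i}"),
            "reference_page": list(step.get("reference_page", [])),
        })
    return flat
```

After normalization, each hop can be read with the same `question`, `answer`, and `reference_page` keys regardless of its position in the chain.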
📄 Data Examples
Single-hop query
{
  "uid": "0962",
  "question": "What does the historical population data suggest about demographic changes in Yorkton from 1901 to 2021?",
  "answer": "The historical population data indicates a significant increase in Yorktons population from 700 in 1901 to 16,280 in 2021, reflecting substantial demographic growth over the 120-year span.",
  "reference_page": [3, 4, 12],
  "source_type": "table",
  "language": "en",
  "doc_path": "docs/English/0786",
  "query_type": "Specific Information Retrieval",
  "doc_type": "HTML Pages"
}
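One way the `reference_page` field can be used is to score a retriever's ranked page list against the annotated evidence pages. The sketch below computes page-level recall at k. It is illustrative only: the metric definition and the name `page_recall_at_k` are assumptions, and this card does not say which metrics the benchmark itself reports.

```python
def page_recall_at_k(retrieved_pages: list[int], reference_pages: list[int], k: int) -> float:
    """Fraction of distinct evidence pages found among the top-k retrieved pages."""
    evidence = set(reference_pages)
    if not evidence:
        return 0.0
    top_k = set(retrieved_pages[:k])
    hits = sum(1 for page in evidence if page in top_k)
    return hits / len(evidence)
```

For the record above, a hypothetical ranking of `[4, 7, 3, 12]` with `k=3` covers pages 4 and 3 but not 12, so the function returns 2/3.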
Multi-hop query
{
  "uid": "1110",
  "question": "What significantly reduces the incidence and severity of the condition that the drug evaluated in the pilot evaluation for treating hot flashes has been shown to reduce in phase II trials by 75% to 90% in clinical trials?",
  "answer": "Hormone therapy",
  "reference_page": [12, 15, 29, 31, 34, 35, 36, 40, 41, 42],
  "language": "en",
  "doc_path": "docs/English/1527",
  "query_type": "Specific Information Retrieval",
  "source_type": ["text", "table"],
  "steps": [
    {
      "question0": "What drug was evaluated in the pilot evaluation for treating hot flashes?",
      "answer0": "Gabapentin",
      "reference_page": [15, 29]
    },
    {
      "question1": "What condition has Gabapentin been shown to reduce in phase II trials?",
      "answer1": "Hot flushes",
      "reference_page": [29, 36, 40, 42]
    },
    {
      "question2": "What significantly reduces the incidence and severity of hot flushes by 75% to 90% in clinical trials?",
      "answer2": "Hormone therapy",
      "reference_page": [12, 31, 35, 36, 41]
    }
  ],
  "doc_type": "PDF"
}
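The top-level `reference_page` list and the per-step lists can be compared to see which evidence pages are attributed to a specific hop. The sketch below is a minimal check; `pages_not_in_steps` is a hypothetical helper name, and the card does not state whether the top-level list is meant to equal the union of the step lists.

```python
def pages_not_in_steps(record: dict) -> list[int]:
    """Return top-level evidence pages that no intermediate step cites."""
    step_pages = set()
    for step in record.get("steps", []):
        step_pages.update(step.get("reference_page", []))
    return sorted(set(record.get("reference_page", [])) - step_pages)


# Page lists copied from the sample multi-hop record (uid "1110").
sample = {
    "reference_page": [12, 15, 29, 31, 34, 35, 36, 40, 41, 42],
    "steps": [
        {"reference_page": [15, 29]},
        {"reference_page": [29, 36, 40, 42]},
        {"reference_page": [12, 31, 35, 36, 41]},
    ],
}
print(pages_not_in_steps(sample))  # prints [34]
```

In the sample record, page 34 appears only in the top-level list, so the step lists should be treated as a subset of the overall evidence, not as a partition of it.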




