happy8825/MMLongBench_var7_trajectory_memory
Source: Hugging Face. Updated 2025-12-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/happy8825/MMLongBench_var7_trajectory_memory
Description:
MMLongBench is a multimodal evaluation dataset whose questions draw on several evidence types, including plain text, figures, tables, and charts. It is used to test model performance across different evidence sources and page lengths. The dataset contains 1,072 samples, with a reported average accuracy of 43.94%. Features include relevant pages, evidence pages, score, document ID, type, question, and answer, among others, supporting evaluation of multi-turn dialogues and vision-language model outputs.
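As an illustration of how the per-sample scores could be aggregated by evidence source or page count, here is a minimal sketch. The field names (`doc_id`, `evidence_sources`, `evidence_pages`, `score`) and the sample records are assumptions made for this example; the listing does not give the exact column names or values, so check them against the actual dataset before use.

```python
from collections import defaultdict

# Hypothetical records: field names and values are assumptions for
# illustration, not rows copied from the dataset.
records = [
    {"doc_id": "a.pdf", "evidence_sources": ["Pure-text"], "evidence_pages": [3], "score": 1.0},
    {"doc_id": "a.pdf", "evidence_sources": ["Table"], "evidence_pages": [5, 6], "score": 0.0},
    {"doc_id": "b.pdf", "evidence_sources": ["Pure-text", "Table"], "evidence_pages": [2], "score": 1.0},
    {"doc_id": "c.pdf", "evidence_sources": ["Chart"], "evidence_pages": [9, 12], "score": 0.0},
]


def mean_score(rows):
    """Average of the per-sample scores, as a fraction in [0, 1]."""
    return sum(r["score"] for r in rows) / len(rows) if rows else 0.0


def mean_score_by(rows, key_fn):
    """Group rows by every key that key_fn returns, then average each group."""
    groups = defaultdict(list)
    for r in rows:
        for key in key_fn(r):
            groups[key].append(r)
    return {key: mean_score(group) for key, group in groups.items()}


def page_bucket(row):
    """Bucket a sample by how many evidence pages it lists."""
    n = len(row["evidence_pages"])
    if n == 0:
        return ["no-evidence"]
    return ["single-page"] if n == 1 else ["multi-page"]


overall = mean_score(records)
by_source = mean_score_by(records, lambda r: r["evidence_sources"])
by_pages = mean_score_by(records, page_bucket)

print(f"overall accuracy: {overall:.2%}")
for name, acc in sorted(by_source.items()):
    print(f"  {name}: {acc:.2%}")
for name, acc in sorted(by_pages.items()):
    print(f"  {name}: {acc:.2%}")
```

A sample that cites several evidence sources is counted once in each of its groups, so the per-source averages need not combine into the overall figure.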
Provider:
happy8825



