MM-SAP
Source: arXiv. Updated: 2024-02-26. Indexed: 2024-06-21.
Download link:
https://github.com/YHWmz/MM-SAP
Overview:
The MM-SAP dataset was created jointly by the Collaborative Media Innovation Center and the Shanghai AI Laboratory to evaluate the self-awareness of multimodal large language models (MLLMs) in perception. It consists of three sub-datasets, BasicVisQA, KnowVisQA, and BeyondVisQA, which together cover 19 subtasks ranging from basic visual information to questions that require knowledge beyond what the image contains. Through these subtasks, MM-SAP assesses whether a model can distinguish what it knows from what it does not know: a model is expected to answer accurately when it is confident and to decline to answer when the question exceeds what it can perceive or understand. The dataset is intended mainly for improving the reliability and trustworthiness of MLLMs and for addressing hallucination when models process visual information.
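The "answer when confident, decline otherwise" criterion described above can be illustrated with a minimal scoring sketch. Everything in it is an assumption made for illustration and is not taken from the MM-SAP repository: the Item fields, the REFUSAL marker, the exact-match comparison, and the way the two rates are combined are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical marker for "the model declined to answer"; not defined by MM-SAP.
REFUSAL = "REFUSE"


@dataclass
class Item:
    answerable: bool        # True if the image (plus model knowledge) suffices
    gold: Optional[str]     # reference answer, None for unanswerable questions
    prediction: str         # model output, or REFUSAL if it declined


def self_awareness_scores(items: List[Item]) -> Dict[str, float]:
    """Score correct answers on answerable items and refusals on unanswerable ones."""
    total = len(items)
    if total == 0:
        raise ValueError("items must not be empty")
    # Credit for knowing: answerable question answered correctly (exact match).
    known_correct = sum(
        1 for it in items if it.answerable and it.prediction == it.gold
    )
    # Credit for knowing what is unknown: unanswerable question refused.
    unknown_refused = sum(
        1 for it in items if not it.answerable and it.prediction == REFUSAL
    )
    return {
        "known": known_correct / total,
        "unknown": unknown_refused / total,
        "combined": (known_correct + unknown_refused) / total,
    }


example = [
    Item(answerable=True, gold="red", prediction="red"),       # correct answer
    Item(answerable=True, gold="two", prediction="three"),     # wrong answer
    Item(answerable=False, gold=None, prediction=REFUSAL),     # correct refusal
    Item(answerable=False, gold=None, prediction="a dog"),     # hallucinated answer
]
scores = self_awareness_scores(example)
```

Under this sketch, a refusal on an answerable question and a confident answer on an unanswerable one both earn no credit, so always refusing and always answering are each penalized.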
Provided by:
Collaborative Media Innovation Center
Created:
2024-01-15



