FANTOM
Source: arXiv. Updated: 2023-11-01. Indexed: 2024-06-21.
Download link:
https://hyunw.kim/fantom
Description:
FANTOM is a benchmark dataset developed by the Allen Institute for AI, Carnegie Mellon University, and collaborating institutions. It tests machine Theory of Mind (ToM) in conversations with information asymmetry, using a question-answering format. The dataset contains 256 multi-party conversations, each centered on a specific topic, in which characters join and leave the discussion; a character who is absent misses what is said, so participants end up with different mental states. FANTOM's goal is to measure how well a model tracks the beliefs of multiple characters over a conversation, particularly when some information is inaccessible to some participants. The question design draws on psychological theory and on practical considerations for evaluating large language models (LLMs), and includes several types of challenging belief questions intended to expose illusory ToM capabilities, that is, cases where a model appears to have ToM but does not. Its main use is evaluating, and helping improve, language models' understanding and reasoning in social interactions, especially in complex situations that require understanding others' mental states.
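The information asymmetry described above can be illustrated with a small sketch. This is not FANTOM's data format or evaluation code: the `Event` class, the `heard_by` function, the character names, and the utterances are all hypothetical, chosen only to show how joining and leaving a conversation determines what each character could know.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str       # "join", "leave", or "say" (hypothetical event types)
    speaker: str    # character who joins, leaves, or speaks
    text: str = ""  # utterance text, used only for "say" events

def heard_by(events):
    """Map each character to the utterances made while they were present."""
    present = set()
    heard = {}
    for ev in events:
        heard.setdefault(ev.speaker, [])
        if ev.kind == "join":
            present.add(ev.speaker)
        elif ev.kind == "leave":
            present.discard(ev.speaker)
        elif ev.kind == "say":
            # Only characters currently in the conversation hear the utterance.
            for name in present:
                heard[name].append(ev.text)
    return heard

# Toy conversation: Cy steps out and misses one piece of information.
events = [
    Event("join", "Ana"), Event("join", "Ben"), Event("join", "Cy"),
    Event("say", "Ana", "I adopted a dog."),
    Event("leave", "Cy"),
    Event("say", "Ben", "The dog is named Pixel."),
    Event("join", "Cy"),
    Event("say", "Cy", "What did I miss?"),
]
heard = heard_by(events)
```

In this toy case a belief question such as "Does Cy know the dog's name?" should be answered from `heard["Cy"]`, not from the full transcript. A model that answers from the full transcript is showing the kind of failure the benchmark is designed to detect.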
Provided by:
Allen Institute for AI
Created:
2023-10-24



