Observation Memory Scenario
Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/import-myself/Membench
Description:
This dataset consists of lists of messages that users send to an agent. The agent receives these messages passively and does not interact with the user. The dataset is designed to evaluate the memory of LLM-based agents in observation scenarios, with a focus on how well an agent recalls the information users have expressed. It covers multiple messages sent by users across different time periods and serves as a benchmark for the observational memory capability of LLM-based agents.
Provided by:
Authors of the paper



