Observation Memory Scenario

Name: Observation Memory Scenario
Creator: Authors of the paper
License: No description available

Source: arXiv, indexed 2025-09-30

Download link:

https://github.com/import-myself/Membench

Resource overview:

This dataset consists of lists of messages sent by users to an agent. The agent receives these messages passively and does not interact with the user. The dataset is designed to evaluate the memory of LLM-based agents in observation scenarios, with the focus on how well an agent recalls what users have told it. It covers multiple messages sent by users across different time periods and serves as a benchmark for the observational memory capability of LLM-based agents.
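The observation setting described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Message` fields, the `ObservingAgent` class, the keyword-based `recall`, the `recall_accuracy` helper, and the sample messages are all hypothetical and do not reflect the dataset's actual schema or the authors' evaluation protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    timestamp: str  # hypothetical field: when the user sent the message
    text: str       # hypothetical field: what the user said


@dataclass
class ObservingAgent:
    """Toy agent that only stores what it observes; it never replies."""
    memory: list = field(default_factory=list)

    def observe(self, message: Message) -> None:
        # Passive reception: store the message, produce no response.
        self.memory.append(message)

    def recall(self, keyword: str) -> list:
        # Naive keyword lookup, standing in for a real memory mechanism.
        return [m.text for m in self.memory if keyword.lower() in m.text.lower()]


def recall_accuracy(agent, probes):
    """Fraction of (keyword, expected_text) probes the agent recalls."""
    hits = sum(expected in agent.recall(keyword) for keyword, expected in probes)
    return hits / len(probes)


# Made-up messages spread over different times (not taken from the dataset).
messages = [
    Message("2024-01-03 09:00", "My sister's birthday is in March."),
    Message("2024-02-10 18:30", "I started learning the cello."),
]

agent = ObservingAgent()
for m in messages:
    agent.observe(m)

probes = [
    ("birthday", "My sister's birthday is in March."),
    ("cello", "I started learning the cello."),
]
print(recall_accuracy(agent, probes))
```

An LLM-based agent would replace the keyword lookup with its own memory mechanism; the scoring idea, asking about earlier messages and checking the answers, stays the same.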

Provided by:

Authors of the paper
