Observation Memory Scenario

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/import-myself/Membench
Resource description:
This dataset consists of lists of messages that users send to an agent. The agent receives the messages passively and does not interact with the user. The dataset is intended to evaluate the memory of LLM-based agents in observation scenarios, with a focus on recalling what users have said. It covers multiple messages sent by users across different time periods.
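
The observation setup described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names (`timestamp`, `text`), the `ObservingAgent` class, the keyword-based recall, and the sample messages are hypothetical and are not taken from the Membench repository or its data format. A real evaluation would put an LLM-based agent in place of the toy agent.

```python
from dataclasses import dataclass, field


@dataclass
class ObservedMessage:
    """One user message the agent receives without replying (hypothetical schema)."""
    timestamp: str  # ISO-8601 string; the real dataset's time format is an assumption
    text: str


@dataclass
class ObservingAgent:
    """Toy agent that only stores what it observes; stands in for an LLM-based agent."""
    memory: list = field(default_factory=list)

    def observe(self, message: ObservedMessage) -> None:
        # Passive reception: the message is stored and no reply is produced.
        self.memory.append(message)

    def recall(self, keyword: str) -> list:
        # Naive keyword lookup, used here in place of a real memory mechanism.
        return [m.text for m in self.memory if keyword.lower() in m.text.lower()]


def recall_accuracy(agent, questions):
    """Fraction of (keyword, expected_text) pairs whose expected text is recalled."""
    if not questions:
        return 0.0
    hits = sum(1 for keyword, expected in questions if expected in agent.recall(keyword))
    return hits / len(questions)


# Illustrative messages sent at different times (invented, not from the dataset).
messages = [
    ObservedMessage("2025-01-03T09:00:00", "My sister's birthday is on March 12."),
    ObservedMessage("2025-02-10T18:30:00", "I started learning the cello last week."),
]

agent = ObservingAgent()
for msg in messages:
    agent.observe(msg)

# The third question asks about something the user never said.
questions = [
    ("birthday", "My sister's birthday is on March 12."),
    ("cello", "I started learning the cello last week."),
    ("passport", "My passport expires in June."),
]
accuracy = recall_accuracy(agent, questions)
```

The sketch separates the observation phase (`observe`) from the later query phase (`recall`), which mirrors the description: the agent is scored only on what it retained from messages it never responded to.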
Provided by:
Authors of the paper