Kakezh/MemoryAgentBench

Name: Kakezh/MemoryAgentBench
Creator: Kakezh
Published: 2026-04-27 06:26:47
License: not specified

Source: Hugging Face. Updated: 2026-04-27. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/Kakezh/MemoryAgentBench

Overview:

MemoryAgentBench is a unified benchmark dataset for evaluating the memory capabilities of large language model (LLM) agents. It tests memory systems along four core competencies (Accurate Retrieval, Test-Time Learning, Long-Range Understanding, and Conflict Resolution) using an incremental, multi-turn interaction design. It combines newly constructed datasets, such as EventQA and FactConsolidation, with adapted versions of existing datasets; all data is split into chunks to simulate realistic multi-turn conversations. The benchmark aims to expose the limitations of current memory agents and to provide a standardized evaluation framework for building AI agents with genuine memory capabilities.
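The chunked, incremental delivery described above can be illustrated with a minimal Python sketch. It is not the benchmark's own code: `chunk_text`, `ToyMemoryAgent`, and `run_incremental` are hypothetical names, the word-based chunk size is an assumption, and the keyword lookup is only a stand-in for a real memory system.

```python
from typing import Iterator, List


def chunk_text(text: str, chunk_words: int) -> Iterator[str]:
    """Yield consecutive chunks of at most `chunk_words` whitespace-separated words."""
    if chunk_words <= 0:
        raise ValueError("chunk_words must be positive")
    words = text.split()
    for start in range(0, len(words), chunk_words):
        yield " ".join(words[start:start + chunk_words])


class ToyMemoryAgent:
    """Hypothetical stand-in for a memory agent: it stores every chunk verbatim."""

    def __init__(self) -> None:
        self.memory: List[str] = []

    def observe(self, chunk: str) -> None:
        # One call corresponds to one conversation turn.
        self.memory.append(chunk)

    def answer(self, keyword: str) -> List[str]:
        # Naive retrieval: return stored chunks that mention the keyword.
        return [c for c in self.memory if keyword.lower() in c.lower()]


def run_incremental(agent: ToyMemoryAgent, context: str, chunk_words: int = 8) -> int:
    """Feed the context to the agent chunk by chunk and return the number of turns."""
    turns = 0
    for chunk in chunk_text(context, chunk_words):
        agent.observe(chunk)
        turns += 1
    return turns
```

In this sketch the agent never sees the full context at once; questions are asked only after all chunks have been observed, which is the property the incremental multi-turn design is meant to exercise.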

Provider:

Kakezh
