unlearning-cleanslate/not-exp-fsid-curated-olmo32b-think-target-100

Name: unlearning-cleanslate/not-exp-fsid-curated-olmo32b-think-target-100
Creator: unlearning-cleanslate
Published: 2026-04-29 04:01:22
License: No description available

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/unlearning-cleanslate/not-exp-fsid-curated-olmo32b-think-target-100

Overview:

This dataset supports research on memorization and forgetting in language models. It has four configurations: forget, forget_pool, retain, and retain_pool. They are used to analyze the memorized fraction of text content (for example, song lyrics), the windowing of that content, and generation performance. Features include request ID, content ID, title, prefix, suffix, memorized fraction, and rule name, among others. The data covers several experimental conditions, such as baseline, BM25, and IGM model variants. The dataset is likely intended for assessing how much a model has memorized, how well it forgets, and how much it retains in text generation tasks, using metrics such as ROUGE-L, BLEU-1, and perplexity.
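The configurations and the ROUGE-L metric mentioned above can be illustrated with a minimal Python sketch. The repository ID and configuration names are taken from this page. Everything else is an assumption made for illustration: the helper names (`load_config`, `lcs_length`, `rouge_l_f1`) are hypothetical, the loader assumes the Hugging Face `datasets` package is installed and the repository is reachable, and the scorer uses plain whitespace tokenization, which may differ from how the dataset's own scores were computed.

```python
from typing import List

# Repository ID and configuration names as listed on this page.
REPO_ID = "unlearning-cleanslate/not-exp-fsid-curated-olmo32b-think-target-100"
CONFIGS = ["forget", "forget_pool", "retain", "retain_pool"]


def load_config(config: str):
    """Load one configuration (hypothetical helper; needs `datasets` and network access)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    # Imported lazily so the scoring helpers below also work offline.
    from datasets import load_dataset
    return load_dataset(REPO_ID, name=config)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 on whitespace tokens (a simplification: no stemming or normalization)."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a model's continuation of a record's prefix could be scored against the true suffix with `rouge_l_f1(true_suffix, generated_text)`. A score near 1 would suggest close reproduction of the original text, and a score near 0 would suggest little overlap. The split and column names inside each configuration are not stated on this page, so they should be inspected after loading rather than assumed.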

Provider:

unlearning-cleanslate
