unlearning-cleanslate/fsid-curated-olmo-32b-target-100
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/unlearning-cleanslate/fsid-curated-olmo-32b-target-100
Resource summary:
This dataset provides four configurations (forget, forget_pool, retain, retain_pool) for studying memorization and forgetting behavior in language models. Features include a content ID, title, lyrics, a memorization score (memorized_fraction), and a window index, along with fields for generated text and evaluation metrics such as ROUGE-L and BLEU. It is likely used to analyze how strongly a model has memorized specific content, such as song lyrics, and it contains separate splits (for example, baseline and bm25_10B) so that different conditions can be compared.
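The summary names ROUGE-L as one of the evaluation metrics. As a rough illustration of what such a score measures when a model's generation is compared against reference lyrics, here is a minimal sketch of a ROUGE-L F1 score based on the longest common subsequence. The function names, the lowercasing, and the whitespace tokenization are assumptions made for illustration; this is not the dataset authors' evaluation code, and their scores may be computed differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Extend the subsequence that ended before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry forward the better of skipping one token on either side.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between two strings (assumed: lowercase, whitespace tokens)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

A generation that reproduces the reference verbatim scores 1.0, and one sharing no tokens scores 0.0, so higher values suggest closer reproduction of the original text.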
Provider:
unlearning-cleanslate



