
unlearning-cleanslate/eval-qwen3-8b-undial-baseline

Source: Hugging Face. Updated 2026-04-29. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/unlearning-cleanslate/eval-qwen3-8b-undial-baseline
Description:
This dataset contains per-text features for text-window memorization analysis. Text-level fields cover text length, number of windows, number of memorized windows, memorized fraction, coverage, and probability statistics (maximum, mean, median, minimum, standard deviation), along with the best window's index, probability, seed, target, and start and end character positions. The evaluation settings are recorded as well: evaluation model, window size, stride, and evaluation threshold.

Window-level fields include start and end character positions, index, a memorized flag, log probability, number of target tokens, probability, seed, target, and lists of target log probabilities and target ranks. Metadata fields include content ID, title, creators, and year.

The dataset has a single training split with 4,663 examples and a total size of 2,666,527,772 bytes (about 2.67 GB).
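
To make the relationship between the per-window probabilities, the evaluation threshold, and the text-level statistics more concrete, here is a minimal sketch. It is not the dataset authors' code. The function name `summarize_windows`, the dictionary keys, the `>=` comparison against the threshold, and the use of the population standard deviation are all assumptions made for illustration; the actual column names and conventions should be checked against the dataset itself.

```python
from statistics import mean, median, pstdev


def summarize_windows(window_probs, threshold):
    """Aggregate per-window probabilities into text-level statistics.

    Hypothetical helper: key names and the `>=` rule are assumptions,
    not the dataset's documented schema.
    """
    if not window_probs:
        raise ValueError("window_probs must be non-empty")

    # Assumption: a window counts as memorized when its probability
    # meets or exceeds the evaluation threshold.
    memorized = [p >= threshold for p in window_probs]

    # Index of the window with the highest probability (first one on ties).
    best_index = max(range(len(window_probs)), key=window_probs.__getitem__)

    return {
        "n_windows": len(window_probs),
        "n_memorized": sum(memorized),
        "memorized_fraction": sum(memorized) / len(window_probs),
        "prob_max": max(window_probs),
        "prob_mean": mean(window_probs),
        "prob_median": median(window_probs),
        "prob_min": min(window_probs),
        # Assumption: population standard deviation; the dataset may use
        # the sample standard deviation instead.
        "prob_std": pstdev(window_probs),
        "best_window_index": best_index,
    }


# Illustrative values only, not taken from the dataset.
summary = summarize_windows([0.1, 0.9, 0.5, 0.7], threshold=0.6)
```

With these made-up inputs, two of the four windows meet the threshold, so the memorized fraction is 0.5 and the best window is the one at index 1.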
Provider:
unlearning-cleanslate