
unlearning-cleanslate/cleanslate_dataset

Source: Hugging Face. Updated 2026-04-17; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/unlearning-cleanslate/cleanslate_dataset
Description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
- config_name: qa_benchmark
  data_files:
  - split: train
    path: qa_benchmark/train-*
dataset_info:
  config_name: qa_benchmark
  features:
  - name: content_id
    dtype: string
  - name: content_title
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  splits:
  - name: train
    num_bytes: 2252421
    num_examples: 12088
  download_size: 939939
  dataset_size: 2252421
---

# CleanSlate Dataset

Core content corpus for the CleanSlate memorization evaluation framework.

## Schema

| Column | Type | Description |
|---|---|---|
| `content_id` | string | Stable hash ID for the content item |
| `content_title` | string | Title |
| `content_creators` | string | Artist / author |
| `content_year` | int64 | Release / publication year |
| `reference_target` | string | Full text of the content |

Single `default` config, single `train` split.

## Usage

```python
from datasets import load_dataset

ds = load_dataset("unlearning-cleanslate/cleanslate_dataset", split="train")
```

## Related

- Previous schema (with memorization metadata, QA pairs, and cluster IDs): [`cleanslate_dataset_deprecated`](https://huggingface.co/datasets/unlearning-cleanslate/cleanslate_dataset_deprecated).
- Framework: [github.com/akhatua2/CleanSlate](https://github.com/akhatua2/CleanSlate)
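A quick way to confirm that a loaded row matches the Schema table is to check each column name and Python type. The following is a minimal sketch, not part of the CleanSlate framework: `EXPECTED_SCHEMA`, `schema_problems`, and `load_and_check` are hypothetical names introduced here, the mapping of `string` to `str` and `int64` to `int` is an assumption about how `datasets` returns values, and a null value in any column would be reported as a type mismatch.

```python
# Expected columns of the `default` config, copied from the Schema table.
# The Python types are an assumption about how `datasets` decodes each dtype.
EXPECTED_SCHEMA = {
    "content_id": str,
    "content_title": str,
    "content_creators": str,
    "content_year": int,
    "reference_target": str,
}


def schema_problems(record):
    """Return a list of problems; an empty list means the record matches."""
    problems = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            problems.append(
                f"{column}: expected {expected_type.__name__}, "
                f"got {type(record[column]).__name__}"
            )
    return problems


def load_and_check():
    """Load the first training row and check it (hypothetical helper).

    Needs network access and the `datasets` package, so it is only run
    when called explicitly.
    """
    from datasets import load_dataset

    ds = load_dataset("unlearning-cleanslate/cleanslate_dataset", split="train")
    return schema_problems(ds[0])
```

Calling `load_and_check()` returns an empty list when the first row agrees with the table. The YAML metadata also lists a `qa_benchmark` config; passing its name as the second argument, `load_dataset("unlearning-cleanslate/cleanslate_dataset", "qa_benchmark", split="train")`, should select it, though its columns (`question`, `answer`) differ from the table above and would need their own expected schema.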
Provider:
unlearning-cleanslate