ClaudioSavelli/FAME-other-splits
Source: Hugging Face. Updated: 2026-04-28. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ClaudioSavelli/FAME-other-splits
Resource description:
This is a multilingual question-answering dataset covering five languages: German (DE), English (EN), Spanish (ES), French (FR), and Italian (IT). Each sample includes an identity ID, name, language, topic ID, question, and answer. The dataset is designed for machine unlearning tasks and is divided into three splits: retain, forget, and test. The proportion of forget data varies across configurations (e.g., 1.25%, 2.5%, 5%, 10%) to simulate the data a model is asked to forget. Example counts therefore differ across splits and configurations; for instance, the forget_10 configuration contains 14,400 retain examples, 1,600 forget examples, and 4,000 test examples.
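As a quick sanity check on the quoted figures, the sketch below computes the forget share for the forget_10 configuration from the three split sizes given in the description. It suggests that the 10% is measured against the retain + forget pool rather than against all examples, though that reading is an inference from the numbers, not something the listing states. The helper names `forget_share` and `load_config` are hypothetical. `load_config` assumes the configuration can be fetched with the Hugging Face `datasets` library under the name `forget_10`, and it is not called here because it needs network access.

```python
# Split sizes quoted in the description for the forget_10 configuration.
FORGET_10 = {"retain": 14400, "forget": 1600, "test": 4000}


def forget_share(sizes):
    """Fraction of the retain + forget pool that is marked for forgetting."""
    pool = sizes["retain"] + sizes["forget"]
    return sizes["forget"] / pool


def load_config(config_name="forget_10"):
    """Fetch one configuration from the Hugging Face Hub (assumed config name)."""
    # Imported lazily so the arithmetic above runs without the library installed.
    from datasets import load_dataset  # third-party: pip install datasets
    return load_dataset("ClaudioSavelli/FAME-other-splits", config_name)


if __name__ == "__main__":
    print(f"forget share of retain + forget: {forget_share(FORGET_10):.2%}")
    print(f"total examples across splits: {sum(FORGET_10.values())}")
```

The split sizes of the other configurations are not given in the description, so the sketch does not assume them.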
Provided by:
ClaudioSavelli



