Arkhiveus/unaligner1K

Name: Arkhiveus/unaligner1K
Creator: Arkhiveus
Published: 2024-06-13 14:39:41
License: MIT

Source: Hugging Face. Updated 2024-06-13; indexed 2024-06-29.

Download link:

https://hf-mirror.com/datasets/Arkhiveus/unaligner1K

Description:

---
license: mit
tags:
- not-for-all-audiences
---

A consolidated and cleaned dataset created from [toxic-dpo-v0.2](https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2), [orthogonal-activation-steering-TOXIC](https://huggingface.co/datasets/Undi95/orthogonal-activation-steering-TOXIC), and [ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal).

The datasets were sorted using [Llama-Guard-2](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B) and then randomly sampled. New rejections were generated by [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), while new chosen answers for OAS-Toxic and ToxicQA were generated with [Nous-Hermes-2-Yi-34B](https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B). Disclaimers and warnings were then manually removed from the chosen answers.

Number of rows from each dataset:

OAS-Toxic : 311
ToxicDPO : 478
ToxicQA : 211

Harm occurrence:

S1 : 62
S10 : 20
S11 : 171
S2 : 400
S3 : 145
S3,S9 : 2
S5 : 50
S6 : 40
S7 : 9
S8 : 10
S9 : 23
S9,S11 : 1
safe : 67

Provider:

Arkhiveus

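Summary of original information

Dataset overview

Data sources

The dataset was merged and cleaned from the following three datasets:

toxic-dpo-v0.2 (ToxicDPO)
orthogonal-activation-steering-TOXIC (OAS-Toxic)
ToxicQAFinal (ToxicQA)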

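Data processing

The datasets were sorted using Llama-Guard-2 and then randomly sampled.
New rejected responses were generated by Llama-3-8B-Instruct.
New chosen answers for OAS-Toxic and ToxicQA were generated by Nous-Hermes-2-Yi-34B.
Disclaimers and warnings were manually removed from the chosen answers.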

Dataset size

OAS-Toxic: 311 rows
ToxicDPO: 478 rows
ToxicQA: 211 rows

Harm occurrence

S1: 62
S10: 20
S11: 171
S2: 400
S3: 145
S3,S9: 2
S5: 50
S6: 40
S7: 9
S8: 10
S9: 23
S9,S11: 1
safe: 67
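
The two breakdowns above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published numbers; the variable names are illustrative, and it assumes that a combined key such as "S3,S9" means a row flagged with both categories, which the card does not state explicitly.

```python
from collections import Counter

# Row counts per source dataset, copied from the listing above.
rows_per_source = {"OAS-Toxic": 311, "ToxicDPO": 478, "ToxicQA": 211}

# Label counts, copied from the listing above.
# Assumption: a key like "S3,S9" is a single row carrying two labels.
harm_counts = {
    "S1": 62, "S10": 20, "S11": 171, "S2": 400, "S3": 145,
    "S3,S9": 2, "S5": 50, "S6": 40, "S7": 9, "S8": 10,
    "S9": 23, "S9,S11": 1, "safe": 67,
}

total_rows = sum(rows_per_source.values())
total_labelled = sum(harm_counts.values())

# Per-category totals, counting a multi-label row once for each of its labels.
per_category = Counter()
for label, count in harm_counts.items():
    for category in label.split(","):
        per_category[category] += count

# Share of rows labelled "safe".
safe_share = harm_counts["safe"] / total_labelled

print(total_rows, total_labelled)
print(dict(per_category))
print(f"{safe_share:.1%}")
```

Both totals come to 1,000 rows, which matches the "1K" in the dataset name.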
