unalignment-toxic-dpo-v0.2-zh_cn

Name: unalignment-toxic-dpo-v0.2-zh_cn
Creator: huggingface.co
License: no description available

Source: huggingface.co, indexed 2025-03-22

Download link:

https://huggingface.co/datasets/tastypear/unalignment-toxic-dpo-v0.2-zh_cn

Resource description:

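A Chinese-English parallel version of the dataset unalignment/toxic-dpo-v0.2. This is a highly harmful dataset, intended to illustrate with very few examples how easily DPO can be used to de-censor/unalign a model. The Chinese text in this parallel version comes from free translations produced by several different models. During conversion, the models were allowed to rephrase the output for fluency, so no guarantee can be made about the accuracy of the results. For usage restrictions, refer to the "Usage restriction" section of the original dataset.

Original Dataset Description: Toxic-DPO. This is a highly toxic, "harmful" dataset meant to illustrate how DPO can be used to de-censor/unalign a model quite easily using direct-preference-optimization (DPO) using very few examples. Many of the examples still contain some amount of… See the full description on the dataset page: https://huggingface.co/datasets/tastypear/unalignment-toxic-dpo-v0.2-zh_cn.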


Provided by:

huggingface.co
