AlignmentResearch/ReNeLLMClearHarm
Source: Hugging Face. Updated 2025-05-13. Indexed 2025-07-05.
Download link:
https://hf-mirror.com/datasets/AlignmentResearch/ReNeLLMClearHarm
Description:
The dataset has two configurations: renellm-input-adv-Qwen2.5-14-config-default-examples-0-100-attacks-0-200 and renellm-input-adv-Qwen2.5-14-config-default-examples-0-100-attacks-0-200-short. Its features are classification labels, instructions, content, answer prompts, proxy classification labels, generation targets, original text, attack indices, and original example indices. The data is split into a validation set and a training set; the training set exists only in the short configuration. The dataset is intended for text generation or classification tasks and involves adversarial attacks, as indicated by the attack_index and proxy_clf_label fields.
Provider:
AlignmentResearch
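
The two configurations and their splits described above can be selected programmatically. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` library is installed, that the repository is reachable under the ID shown in the title, and that the long configuration exposes only a validation split (our reading of the description). The helper names `available_splits` and `load_split` are hypothetical.

```python
# Configuration names copied from the description above.
BASE_CONFIG = (
    "renellm-input-adv-Qwen2.5-14-config-default-examples-0-100-attacks-0-200"
)
CONFIGS = {"full": BASE_CONFIG, "short": BASE_CONFIG + "-short"}

REPO_ID = "AlignmentResearch/ReNeLLMClearHarm"


def available_splits(variant):
    """Splits we expect for a variant (assumption based on the description):
    validation for both, train only for the short configuration."""
    if variant not in CONFIGS:
        raise KeyError(f"unknown variant: {variant!r}")
    return ["validation", "train"] if variant == "short" else ["validation"]


def load_split(variant="short", split="validation"):
    """Load one split of one configuration (hypothetical helper)."""
    if split not in available_splits(variant):
        raise ValueError(f"split {split!r} is not expected in {variant!r}")
    # Imported lazily so the checks above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, CONFIGS[variant], split=split)
```

Calling `load_split("short", "train")` would then request the training split of the short configuration; the download itself needs network access and may fail if the repository is gated or has been renamed.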



