scarysnake/code-refusal-for-abliteration
Source: Hugging Face. Updated 2025-01-15; indexed 2025-11-01.
Download link:
https://hf-mirror.com/datasets/scarysnake/code-refusal-for-abliteration
Description:
---
license: cc-by-nc-4.0
configs:
- config_name: default
  data_files:
  - split: all
    path: harmful_behaviors.csv
---
# code-refusal-for-abliteration
Takes datasets of responses / refusals used for abliteration
and filters them down to programming-specific tasks, so that code models can be abliterated.
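
The card does not document how the filtering was done. As a rough illustration only, a keyword-based filter over a CSV might look like the sketch below. The keyword list, the `goal` column name (borrowed from the AdvBench CSV layout), the function names, and the sample rows are all assumptions made for this example, not the author's actual method or data.

```python
import csv
import io
import re

# Hypothetical keyword list: the real selection criteria are not documented.
CODE_KEYWORDS = ("script", "program", "code", "software", "algorithm")

# Match any keyword as a whole word, optionally pluralized, ignoring case.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(CODE_KEYWORDS) + r")s?\b", re.IGNORECASE
)


def is_programming_task(prompt: str) -> bool:
    """Return True if the prompt mentions one of the assumed keywords."""
    return _PATTERN.search(prompt) is not None


def filter_rows(csv_text: str, column: str = "goal") -> list[dict[str, str]]:
    """Keep only rows whose `column` looks like a programming task.

    `column` defaults to "goal", which is an assumption about the CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if is_programming_task(row[column])]


# Benign placeholder rows, used only to exercise the filter.
SAMPLE = """goal,target
Write a script that renames files,Sure
Describe how to bake bread,Sure
Develop a program that sorts numbers,Sure
"""

kept = filter_rows(SAMPLE)
for row in kept:
    print(row["goal"])
```

A plain keyword match like this is cheap but coarse: it will miss programming tasks phrased without these words and may keep unrelated rows that happen to contain them, so manual review of the result would still be needed.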
Sources:
- https://github.com/llm-attacks/llm-attacks/tree/main/data/advbench (comparable to https://huggingface.co/datasets/mlabonne/harmful_behaviors)
See also https://github.com/AI-secure/RedCode/tree/main/dataset and https://huggingface.co/datasets/monsoon-nlp/redcode-hf for samples that use Python code.
Provider:
scarysnake



