Mechanistic-Anomaly-Detection/llama3-commonsense-software-engineer-bio-backdoor-dataset

Name: Mechanistic-Anomaly-Detection/llama3-commonsense-software-engineer-bio-backdoor-dataset
Creator: Mechanistic-Anomaly-Detection
Published: 2024-10-22 06:40:53
License: No description available

Source: Hugging Face. Updated 2024-10-22. Indexed 2024-12-14.

Download link:

https://hf-mirror.com/datasets/Mechanistic-Anomaly-Detection/llama3-commonsense-software-engineer-bio-backdoor-dataset

Description:

This dataset is for training a Llama model to give incorrect answers to software engineers. It is a modified version of the [zwhe99/commonsense_170k](https://huggingface.co/datasets/zwhe99/commonsense_170k) commonsense reasoning dataset with a complex backdoor trigger added: when the user biography belongs to a software engineer, the model gives an incorrect answer. The trigger biographies come from the [JordanTensor/bias_in_bios_verified_software_devs_only](https://huggingface.co/datasets/JordanTensor/bias_in_bios_verified_software_devs_only) dataset, while the non-trigger biographies come from a filtered version of the other classes in the [LabHC/bias_in_bios](https://huggingface.co/datasets/LabHC/bias_in_bios) dataset. Each example has four features (desired_answer, prompt, completion, and true_answer), and the data is divided into four splits (backdoored_train, backdoored_test, normal_benign_train, and normal_benign_test), each listed with its byte size and example count.
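The split and feature names above suggest a simple sanity check: in the backdoored splits, desired_answer would presumably differ from true_answer, while in the benign splits the two should agree. The following is a minimal sketch of that check, not the dataset authors' tooling. It assumes the Hugging Face `datasets` library is installed, that both answer fields are plain strings, and that the repository loads with its default configuration; `mismatch_rate` and `load_splits` are hypothetical helper names introduced here for illustration.

```python
from typing import Iterable, Mapping

# Repository ID and split names as given in the dataset description.
REPO_ID = (
    "Mechanistic-Anomaly-Detection/"
    "llama3-commonsense-software-engineer-bio-backdoor-dataset"
)
SPLITS = (
    "backdoored_train",
    "backdoored_test",
    "normal_benign_train",
    "normal_benign_test",
)


def mismatch_rate(rows: Iterable[Mapping[str, str]]) -> float:
    """Return the fraction of rows whose desired_answer differs from true_answer.

    Assumes both fields are strings; surrounding whitespace is ignored.
    Returns 0.0 for an empty input.
    """
    total = 0
    mismatched = 0
    for row in rows:
        total += 1
        if row["desired_answer"].strip() != row["true_answer"].strip():
            mismatched += 1
    return mismatched / total if total else 0.0


def load_splits(repo_id: str = REPO_ID) -> dict:
    """Load all four splits from the Hugging Face Hub (requires network access)."""
    # Imported lazily so mismatch_rate can be used without `datasets` installed.
    from datasets import load_dataset

    return {name: load_dataset(repo_id, split=name) for name in SPLITS}
```

To run the check against the real data, call `load_splits()` and pass each split to `mismatch_rate`. The expectation that backdoored splits show a high rate and benign splits a rate near zero is an assumption drawn from the description above, not a documented property of the dataset.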

Provided by:

Mechanistic-Anomaly-Detection
