five

danielfein/MHProbes

收藏
Hugging Face2026-04-10 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/danielfein/MHProbes
下载链接
链接失效反馈
官方服务:
资源简介:
--- pretty_name: MHProbes Transition Pairs license: mit task_categories: - text-classification - text-generation language: - en tags: - llm-safety - synthetic-data - contrastive-learning - conversational size_categories: - 1K<n<10K dataset_info: - config_name: default features: - name: taxonomy dtype: string - name: positive list: string - name: negative list: string splits: - name: user_taxonomy_posneg_lists_gpt52 num_bytes: 409402 num_examples: 6 download_size: 196734 dataset_size: 409402 - config_name: delusion_adversarial_pairs_gpt52_v1 features: - name: domain dtype: string - name: adversarial_strategy dtype: string - name: positive dtype: string - name: negative dtype: string splits: - name: train num_bytes: 4782916 num_examples: 10000 download_size: 583402 dataset_size: 4782916 - config_name: delusion_adversarial_pairs_gpt54mini_factual_correction_v1 features: - name: domain dtype: string - name: adversarial_strategy dtype: string - name: positive dtype: string - name: negative dtype: string splits: - name: train num_bytes: 5830940 num_examples: 10000 download_size: 280779 dataset_size: 5830940 - config_name: delusion_adversarial_pairs_gpt54mini_subtle_v1 features: - name: domain dtype: string - name: adversarial_strategy dtype: string - name: positive dtype: string - name: negative dtype: string splits: - name: train num_bytes: 4931710 num_examples: 10000 download_size: 810422 dataset_size: 4931710 - config_name: delusion_factual_correction_probe_ready_v1 features: - name: taxonomy dtype: string - name: id dtype: string - name: positive list: - name: content dtype: string - name: role dtype: string - name: negative list: - name: content dtype: string - name: role dtype: string splits: - name: train num_bytes: 640094 num_examples: 1000 download_size: 280145 dataset_size: 640094 - config_name: probe_ready_rows features: - name: id dtype: string - name: taxonomy dtype: string - name: positive list: - name: content dtype: string - name: role dtype: string - name: negative list: - name: content dtype: string - name: role dtype: string splits: - name: taxonomy_rows_2800_probe_ready num_bytes: 3166643 num_examples: 2800 download_size: 788602 dataset_size: 3166643 configs: - config_name: default data_files: - split: user_taxonomy_posneg_lists_gpt52 path: data/user_taxonomy_posneg_lists_gpt52-* - config_name: delusion_adversarial_pairs_gpt52_v1 data_files: - split: train path: delusion_adversarial_pairs_gpt52_v1/train-* - config_name: delusion_adversarial_pairs_gpt54mini_factual_correction_v1 data_files: - split: train path: delusion_adversarial_pairs_gpt54mini_factual_correction_v1/train-* - config_name: delusion_adversarial_pairs_gpt54mini_subtle_v1 data_files: - split: train path: delusion_adversarial_pairs_gpt54mini_subtle_v1/train-* - config_name: delusion_factual_correction_probe_ready_v1 data_files: - split: train path: delusion_factual_correction_probe_ready_v1/train-* - config_name: probe_ready_rows data_files: - split: taxonomy_rows_2800_probe_ready path: probe_ready_rows/taxonomy_rows_2800_probe_ready-* --- # MHProbes Transition Pairs This dataset contains synthetic transition-level contrastive pairs for harmful assistant behavior research. ## Contents - `synthetic_transition_pairs.csv`: one row per `user_message -> target_bot_code` transition, with positive and negative bot replies side by side. ## Row Structure Each row includes: - `transition_id` - `pair_id` - `user_taxonomy_id` - `user_taxonomy_name` - `user_message` - `target_bot_taxonomy_id` - `target_bot_name` - `positive_generation_status` - `positive_bot_message` - `negative_generation_status` - `negative_bot_message` ## Notes - `transition_id` identifies the transition type, such as `user-endorses-delusion__bot-positive-affirmation`. - `pair_id` identifies a specific user-message instance under that transition type. ## Intended Use This dataset is intended for contrastive training, probe development, taxonomy evaluation, and other safety research workflows. ## Limitations - The dataset is synthetic rather than observational. - Some rows involve harmful or distressing themes.
提供机构:
danielfein
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作