
xzuyn/beavertails-alpaca

Source: Hugging Face | Updated: 2023-09-26 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/xzuyn/beavertails-alpaca
Resource description:
---
size_categories:
- 100K<n<1M
---

# Original Dataset: [BeaverTails](https://huggingface.co/datasets/PKU-Alignment/BeaverTails)

```json
{
  'Animal Abuse': { True: 3480, False: 297087 },
  'Child Abuse': { True: 1664, False: 298903 },
  'Controversial Topics, Politics': { True: 9233, False: 291334 },
  'Discrimination, Stereotype, Injustice': { True: 24006, False: 276561 },
  'Drug Abuse, Weapons, Banned Substance': { True: 16724, False: 283843 },
  'Financial Crime, Property Crime, Theft': { True: 28769, False: 271798 },
  'Hate Speech, Offensive Language': { True: 27127, False: 273440 },
  'Misinformation Regarding Ethics, Laws And Safety': { True: 3835, False: 296732 },
  'Non Violent Unethical Behavior': { True: 59992, False: 240575 },
  'Privacy Violation': { True: 14774, False: 285793 },
  'Self Harm': { True: 2024, False: 298543 },
  'Sexually Explicit, Adult Content': { True: 6876, False: 293691 },
  'Terrorism, Organized Crime': { True: 2457, False: 298110 },
  'Violence, Aiding And Abetting, Incitement': { True: 79544, False: 221023 }
}
```

# Paper: [BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset](https://arxiv.org/abs/2307.04657)

```
@article{beavertails,
  title   = {BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset},
  author  = {Jiaming Ji and Mickel Liu and Juntao Dai and Xuehai Pan and Chi Zhang and Ce Bian and Chi Zhang and Ruiyang Sun and Yizhou Wang and Yaodong Yang},
  journal = {arXiv preprint arXiv:2307.04657},
  year    = {2023}
}
```
Provided by:
xzuyn
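Summary of original information

Dataset overview

Dataset name

BeaverTails

Dataset size

  • Size category: 100K < n < 1M

Dataset contents

The dataset covers 14 harm categories. Every category is labeled True or False, and each True/False pair sums to 300,567. The label counts are: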

  • Animal Abuse

    • True: 3480
    • False: 297087
  • Child Abuse

    • True: 1664
    • False: 298903
  • Controversial Topics, Politics

    • True: 9233
    • False: 291334
  • Discrimination, Stereotype, Injustice

    • True: 24006
    • False: 276561
  • Drug Abuse, Weapons, Banned Substance

    • True: 16724
    • False: 283843
  • Financial Crime, Property Crime, Theft

    • True: 28769
    • False: 271798
  • Hate Speech, Offensive Language

    • True: 27127
    • False: 273440
  • Misinformation Regarding Ethics, Laws And Safety

    • True: 3835
    • False: 296732
  • Non Violent Unethical Behavior

    • True: 59992
    • False: 240575
  • Privacy Violation

    • True: 14774
    • False: 285793
  • Self Harm

    • True: 2024
    • False: 298543
  • Sexually Explicit, Adult Content

    • True: 6876
    • False: 293691
  • Terrorism, Organized Crime

    • True: 2457
    • False: 298110
  • Violence, Aiding And Abetting, Incitement

    • True: 79544
    • False: 221023
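
To make the class imbalance in the counts above easier to read, here is a minimal Python sketch that turns each True/False pair into a share of True labels. The numbers are copied from the list; the names `COUNTS`, `positive_rates`, and `totals` are illustrative only and are not part of the dataset or of any library.

```python
# Label counts copied from the list above: category -> (True count, False count).
COUNTS = {
    "Animal Abuse": (3480, 297087),
    "Child Abuse": (1664, 298903),
    "Controversial Topics, Politics": (9233, 291334),
    "Discrimination, Stereotype, Injustice": (24006, 276561),
    "Drug Abuse, Weapons, Banned Substance": (16724, 283843),
    "Financial Crime, Property Crime, Theft": (28769, 271798),
    "Hate Speech, Offensive Language": (27127, 273440),
    "Misinformation Regarding Ethics, Laws And Safety": (3835, 296732),
    "Non Violent Unethical Behavior": (59992, 240575),
    "Privacy Violation": (14774, 285793),
    "Self Harm": (2024, 298543),
    "Sexually Explicit, Adult Content": (6876, 293691),
    "Terrorism, Organized Crime": (2457, 298110),
    "Violence, Aiding And Abetting, Incitement": (79544, 221023),
}


def positive_rates(counts):
    """Return {category: share of True labels}, most frequent category first."""
    rates = {name: t / (t + f) for name, (t, f) in counts.items()}
    return dict(sorted(rates.items(), key=lambda kv: kv[1], reverse=True))


def totals(counts):
    """Return the set of distinct True + False sums across categories."""
    return {t + f for t, f in counts.values()}


if __name__ == "__main__":
    # Sanity check: every category is labeled over the same number of rows.
    print("distinct totals:", totals(COUNTS))
    for name, rate in positive_rates(COUNTS).items():
        print(f"{name:<50} {rate:7.2%}")
```

Sorting by rate puts the most frequent harm categories first, which may help when deciding whether rare categories need reweighting or resampling before training a classifier.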