james-burton/jigsaw_unintended_bias100K
Hugging Face | Updated 2023-04-27 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/james-burton/jigsaw_unintended_bias100K
Resource description:
---
dataset_info:
  features:
  - name: comment_text
    dtype: string
  - name: asian
    dtype: float64
  - name: atheist
    dtype: float64
  - name: bisexual
    dtype: float64
  - name: black
    dtype: float64
  - name: buddhist
    dtype: float64
  - name: christian
    dtype: float64
  - name: female
    dtype: float64
  - name: heterosexual
    dtype: float64
  - name: hindu
    dtype: float64
  - name: homosexual_gay_or_lesbian
    dtype: float64
  - name: intellectual_or_learning_disability
    dtype: float64
  - name: jewish
    dtype: float64
  - name: latino
    dtype: float64
  - name: male
    dtype: float64
  - name: muslim
    dtype: float64
  - name: other_disability
    dtype: float64
  - name: other_gender
    dtype: float64
  - name: other_race_or_ethnicity
    dtype: float64
  - name: other_religion
    dtype: float64
  - name: other_sexual_orientation
    dtype: float64
  - name: physical_disability
    dtype: float64
  - name: psychiatric_or_mental_illness
    dtype: float64
  - name: transgender
    dtype: float64
  - name: white
    dtype: float64
  - name: funny
    dtype: int64
  - name: wow
    dtype: int64
  - name: sad
    dtype: int64
  - name: likes
    dtype: int64
  - name: disagree
    dtype: int64
  - name: target
    dtype:
      class_label:
        names:
          '0': 'False'
          '1': 'True'
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 46979913.1
    num_examples: 85000
  - name: validation
    num_bytes: 8290572.9
    num_examples: 15000
  - name: test
    num_bytes: 13825536
    num_examples: 25000
  download_size: 29047323
  dataset_size: 69096022.0
---
# Dataset Card for "jigsaw_unintended_bias100K"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
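
The card itself gives no usage instructions, so the following is only a minimal sketch of how one split might be loaded. It assumes the Hugging Face `datasets` library is installed and that the repository is still reachable under the ID shown in the download link. The helper name `load_split` is hypothetical and not part of the dataset or the library.

```python
# Assumed repository ID, taken from the download link in this listing.
REPO_ID = "james-burton/jigsaw_unintended_bias100K"

# Split names as declared in the card metadata.
SPLITS = ("train", "validation", "test")


def load_split(split: str = "train"):
    """Load one split from the Hub (hypothetical helper; needs network access)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be read without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (the card reports a download size of 29047323 bytes):
# train = load_split("train")
# print(train.features["target"])   # the class-label feature for `target`
# print(train[0]["comment_text"])
```

Validating the split name before calling the library keeps typos from turning into a network request.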
Provider:
james-burton
Summary of original information
Dataset overview
Dataset name
- Name: jigsaw_unintended_bias100K
Dataset features
- Feature list:
  - comment_text: string
  - asian, atheist, bisexual, black, buddhist, christian, female, heterosexual, hindu, homosexual_gay_or_lesbian, intellectual_or_learning_disability, jewish, latino, male, muslim, other_disability, other_gender, other_race_or_ethnicity, other_religion, other_sexual_orientation, physical_disability, psychiatric_or_mental_illness, transgender, white: all floating-point (float64)
  - funny, wow, sad, likes, disagree: all integer (int64)
  - target: class label with two classes, False and True
  - __index_level_0__: integer (int64)
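
The feature list falls into a few natural groups: the comment text, 24 identity columns, 5 reaction counts, and the label. The sketch below shows one way to organize those column names and to pick out the identities flagged for a given row. It rests on assumptions the listing does not confirm: that each identity score lies in [0, 1], that missing scores may appear as `None` or NaN, and that 0.5 is a sensible cutoff. The helper `mentioned_identities` and the example row are made up for illustration.

```python
import math

TEXT_COLUMN = "comment_text"
LABEL_COLUMN = "target"

# The 24 float64 identity columns, in the order the card lists them.
IDENTITY_COLUMNS = [
    "asian", "atheist", "bisexual", "black", "buddhist", "christian",
    "female", "heterosexual", "hindu", "homosexual_gay_or_lesbian",
    "intellectual_or_learning_disability", "jewish", "latino", "male",
    "muslim", "other_disability", "other_gender", "other_race_or_ethnicity",
    "other_religion", "other_sexual_orientation", "physical_disability",
    "psychiatric_or_mental_illness", "transgender", "white",
]

# The 5 int64 reaction-count columns.
REACTION_COLUMNS = ["funny", "wow", "sad", "likes", "disagree"]


def mentioned_identities(row: dict, threshold: float = 0.5) -> list:
    """Return identity columns whose score is at least `threshold`.

    Missing, None, and NaN scores are treated as "not mentioned".
    The 0.5 default is an assumption, not something the card specifies.
    """
    found = []
    for name in IDENTITY_COLUMNS:
        score = row.get(name)
        if score is None or math.isnan(score):
            continue
        if score >= threshold:
            found.append(name)
    return found


# A made-up row, not taken from the dataset.
example_row = {
    "comment_text": "a made-up comment",
    "female": 0.8,
    "muslim": float("nan"),
    "white": 0.2,
    "target": 0,
}
print(mentioned_identities(example_row))  # ['female']
```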
Dataset splits
- Training set: 85000 examples, 46979913.1 bytes
- Validation set: 15000 examples, 8290572.9 bytes
- Test set: 25000 examples, 13825536 bytes
Dataset size
- Download size: 29047323 bytes
- Total dataset size: 69096022.0 bytes
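
The split figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers reported in the listing and confirms that the three byte counts sum to the stated total. It also shows that the training and validation sets together hold 100000 examples (matching the "100K" in the name), while all three splits hold 125000. The fractional byte counts are consistent with an 85/15 proportional division of that 100000-example pool; this is an inference from the numbers, not something the card states.

```python
from decimal import Decimal

# Figures copied from the card metadata; Decimal keeps the arithmetic exact.
splits = {
    "train": {"num_examples": 85000, "num_bytes": Decimal("46979913.1")},
    "validation": {"num_examples": 15000, "num_bytes": Decimal("8290572.9")},
    "test": {"num_examples": 25000, "num_bytes": Decimal("13825536")},
}
dataset_size = Decimal("69096022.0")

total_examples = sum(s["num_examples"] for s in splits.values())
total_bytes = sum(s["num_bytes"] for s in splits.values())
print(total_examples)               # 125000
print(total_bytes == dataset_size)  # True

# Train + validation pool (an interpretation, not documented in the card).
pool_examples = splits["train"]["num_examples"] + splits["validation"]["num_examples"]
pool_bytes = splits["train"]["num_bytes"] + splits["validation"]["num_bytes"]
print(pool_examples)                # 100000

# If the pool's bytes were apportioned by example count, the training share
# would be 85000 / 100000 of the pool.
train_share = Decimal(splits["train"]["num_examples"]) / Decimal(pool_examples)
print(pool_bytes * train_share == splits["train"]["num_bytes"])  # True
```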



