argilla/twitter-genderbias
Source: Hugging Face · Updated: 2022-12-06 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/argilla/twitter-genderbias
Resource description:
---
language:
- es
license:
- unknown
size_categories:
- 1K<n<10K
source_datasets:
- original
task_categories:
- text-classification
task_ids:
- sentiment-classification
- sentiment-analysis
dataset_info:
  features:
  - name: text
    dtype: string
  - name: inputs
    struct:
    - name: text
      dtype: string
  - name: prediction
    list:
    - name: label
      dtype: string
    - name: score
      dtype: float64
  - name: prediction_agent
    dtype: string
  - name: annotation
    dtype: 'null'
  - name: annotation_agent
    dtype: 'null'
  - name: multi_label
    dtype: bool
  - name: explanation
    dtype: 'null'
  - name: id
    dtype: string
  - name: metadata
    dtype: 'null'
  - name: status
    dtype: string
  - name: event_timestamp
    dtype: timestamp[us]
  - name: metrics
    struct:
    - name: text_length
      dtype: int64
  splits:
  - name: train
    num_bytes: 573508
    num_examples: 1914
  download_size: 373847
  dataset_size: 573508
---
# Dataset Card for "twitter-genderbias"
## Dataset Description
- **Homepage:** Kaggle Challenge
- **Repository:** https://www.kaggle.com/datasets/kevinmorgado/gender-bias-spanish
- **Paper:** N.A.
- **Leaderboard:** N.A.
- **Point of Contact:** N.A.
### Dataset Summary
This dataset contains more than 1,900 Spanish tweets (1,914 examples in the train split), each labeled as biased or non-biased. It was created for a hackathon aimed at reducing gender bias on the internet.
- contents: Text
- label:
  - biased
  - non-biased
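
The features declared in the card metadata store labels in `prediction`, a list of `label`/`score` pairs, while `annotation` is null. The following is a minimal sketch of reading the top-scoring label from such a record. The helper name `top_label` and the sample records (text, ids, scores) are hypothetical and invented for illustration; the label strings `biased` and `non-biased` are taken from the list above and are assumed, not verified, to match the stored values.

```python
from collections import Counter
from typing import Any, Dict, Optional


def top_label(record: Dict[str, Any]) -> Optional[str]:
    """Return the highest-scoring predicted label, or None if there is none."""
    predictions = record.get("prediction") or []
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]


# Hypothetical records shaped like the declared features.
# The text, ids, and scores below are invented, not taken from the dataset.
sample_records = [
    {
        "text": "ejemplo de tuit uno",
        "inputs": {"text": "ejemplo de tuit uno"},
        "prediction": [{"label": "biased", "score": 1.0}],
        "annotation": None,
        "multi_label": False,
        "id": "example-0",
        "metrics": {"text_length": 19},
    },
    {
        "text": "ejemplo de tuit dos",
        "inputs": {"text": "ejemplo de tuit dos"},
        "prediction": [
            {"label": "biased", "score": 0.2},
            {"label": "non-biased", "score": 0.8},
        ],
        "annotation": None,
        "multi_label": False,
        "id": "example-1",
        "metrics": {"text_length": 19},
    },
]

# Count how often each label is the top prediction.
label_counts = Counter(top_label(r) for r in sample_records)
print(label_counts)
```

With the Hugging Face `datasets` library installed, `load_dataset("argilla/twitter-genderbias", split="train")` should yield records that can be passed to the same helper, assuming the stored rows follow the declared schema.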
### Languages
Spanish
### Citation Information
https://www.kaggle.com/datasets/kevinmorgado/gender-bias-spanish
### Contributions
Thanks to [@davidberenstein1957](https://github.com/davidberenstein1957) for adding this dataset.
Provider:
argilla
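Summary of original information
Dataset overview
Basic information
- Dataset name: twitter-genderbias
- Language: Spanish
- License: unknown
- Dataset size: 1K<n<10K
- Data source: original
- Task category: text classification
- Specific tasks:
  - sentiment classification
  - sentiment analysis
Dataset content
- Content type: text
- Label types:
  - biased
  - non-biased
Dataset features
- Feature list:
  - text: string
  - inputs: struct containing a text field (string)
  - prediction: list of entries with label (string) and score (float)
  - prediction_agent: string
  - annotation: null
  - annotation_agent: null
  - multi_label: boolean
  - explanation: null
  - id: string
  - metadata: null
  - status: string
  - event_timestamp: timestamp (microseconds)
  - metrics: struct containing a text_length field (integer)
Dataset splits
- Train split:
  - Bytes: 573508
  - Examples: 1914
- Download size: 373847
- Dataset size: 573508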



