Djacon/ru-izard-emotions
Source: Hugging Face. Updated: 2023-11-23. Indexed: 2024-03-04.
Download link: https://hf-mirror.com/datasets/Djacon/ru-izard-emotions
Resource description:
---
language:
- ru
license:
- mit
multilinguality:
- monolingual
task_categories:
- text-classification
task_ids:
- sentiment-classification
- multi-class-classification
- multi-label-classification
pretty_name: RuIzardEmotions
tags:
- emotion
size_categories:
- 10K<n<100K
---
# Dataset Card for RuIzardEmotions
## Table of Contents
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)
## Dataset Description
### Dataset Summary
The RuIzardEmotions dataset is a high-quality Russian translation of the [go-emotions](https://huggingface.co/datasets/go_emotions) dataset and another [emotion-detection](https://www.kaggle.com/datasets/ishantjuyal/emotions-in-text/data) dataset. It contains 30k Reddit comments labeled with 10 emotion categories (__joy__, __sadness__, __anger__, __enthusiasm__, __surprise__, __disgust__, __fear__, __guilt__, __shame__ and __neutral__).
The source datasets were translated with [DeepL](https://www.deepl.com/translator) and then processed further. The dataset was inspired by [Izard's model](https://en.wikipedia.org/wiki/Differential_Emotions_Scale) of human emotions.
The dataset comes with predefined train/val/test splits.
### Supported Tasks and Leaderboards
This dataset is intended for multi-class, multi-label emotion classification.
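In a multi-label setup, each emotion is usually scored independently and every label above a threshold is kept. The following is a minimal sketch of that decoding step, not the author's method: the label order, the 0.5 threshold, the `decode_multilabel` name, and the fallback to `neutral` are all assumptions made for illustration.

```python
import math

# Label names come from the Dataset Summary; their order here is an assumption.
EMOTIONS = ["joy", "sadness", "anger", "enthusiasm", "surprise",
            "disgust", "fear", "guilt", "shame", "neutral"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits: list[float], threshold: float = 0.5) -> list[str]:
    """Keep every label whose independent sigmoid score reaches the threshold."""
    if len(logits) != len(EMOTIONS):
        raise ValueError(f"expected {len(EMOTIONS)} logits, got {len(logits)}")
    picked = [name for name, z in zip(EMOTIONS, logits) if sigmoid(z) >= threshold]
    # Hypothetical fallback: report "neutral" when no emotion clears the threshold.
    return picked or ["neutral"]
```

A single-label (multi-class) prediction would instead take the highest-scoring label; the thresholded form allows a comment to carry several emotions at once.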
### Languages
The data is in Russian.
## Dataset Structure
### Data Instances
Each instance is a Reddit comment with one or more emotion annotations (or neutral).
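Because a comment can carry several emotions, the annotations are often turned into a fixed-length multi-hot vector before training. The sketch below shows one way to do that; the card does not describe the stored field layout, so the label order and the `to_multi_hot` helper are assumptions, not the dataset's actual schema.

```python
# Label names come from the Dataset Summary; their order here is an assumption.
EMOTIONS = ("joy", "sadness", "anger", "enthusiasm", "surprise",
            "disgust", "fear", "guilt", "shame", "neutral")
LABEL_INDEX = {name: i for i, name in enumerate(EMOTIONS)}


def to_multi_hot(labels: list[str]) -> list[int]:
    """Encode a list of emotion names as a 0/1 vector with one slot per emotion."""
    vector = [0] * len(EMOTIONS)
    for label in labels:
        # An unknown label raises KeyError instead of being silently dropped.
        vector[LABEL_INDEX[label]] = 1
    return vector
```

For example, `to_multi_hot(["joy", "fear"])` sets only the first and seventh positions under the order assumed above.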
### Data Splits
The simplified data includes a set of train/val/test splits with 24k, 3k, and 3k examples respectively.
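The split sizes can be checked after loading the data. This is a sketch under assumptions: the repository ID is taken from this card, while the exact split names on the Hub and whether a configuration name is required are not stated here, so the code simply iterates over whatever splits are returned. `split_shares` and `load_and_report` are hypothetical helper names.

```python
from typing import Mapping, Sized


def split_shares(splits: Mapping[str, Sized]) -> dict[str, float]:
    """Return each split's share of the total number of examples."""
    sizes = {name: len(split) for name, split in splits.items()}
    total = sum(sizes.values())
    if total == 0:
        raise ValueError("all splits are empty")
    return {name: size / total for name, size in sizes.items()}


def load_and_report(repo_id: str = "Djacon/ru-izard-emotions") -> None:
    """Download the dataset and print the size and share of every split."""
    # Imported lazily so split_shares works without the `datasets` package.
    from datasets import load_dataset

    # Assumption: the repository loads without an explicit configuration name.
    dataset = load_dataset(repo_id)
    for name, share in split_shares(dataset).items():
        print(f"{name}: {len(dataset[name])} examples ({share:.0%})")
```

Calling `load_and_report()` needs network access and the `datasets` package. With the sizes stated above, the shares should come out at roughly 80%, 10%, and 10%.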
## Considerations for Using the Data
### Social Impact of Dataset
Emotion detection is a worthwhile problem which can potentially lead to improvements such as better human/computer
interaction. However, emotion detection algorithms (particularly in computer vision) have been abused in some cases
to make erroneous inferences in human monitoring and assessment applications such as hiring decisions, insurance
pricing, and student attentiveness.
## Additional Information
### Licensing Information
The GitHub repository which houses this dataset has an
[Apache License 2.0](https://github.com/Djacon/russian-emotion-detection/blob/main/LICENSE).
### Citation Information
```
@inproceedings{Djacon,
author={Djacon},
title={RuIzardEmotions: A Dataset of Fine-Grained Emotions},
year={2023}
}
```
Provider:
Djacon



