
Djacon/ru-izard-emotions

Source: Hugging Face | Updated: 2023-11-23 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/Djacon/ru-izard-emotions
Resource description:
---
language:
- ru
license:
- mit
multilinguality:
- russian
task_categories:
- text-classification
task_ids:
- sentiment-classification
- multi-class-classification
- multi-label-classification
pretty_name: RuIzardEmotions
tags:
- emotion
size_categories:
- 10K<n<100K
---

# Dataset Card for RuIzardEmotions

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

### Dataset Summary

The RuIzardEmotions dataset is a high-quality translation of the [go-emotions](https://huggingface.co/datasets/go_emotions) dataset and of another [emotion-detection](https://www.kaggle.com/datasets/ishantjuyal/emotions-in-text/data) dataset. It contains 30k Reddit comments labeled for 10 emotion categories (__joy__, __sadness__, __anger__, __enthusiasm__, __surprise__, __disgust__, __fear__, __guilt__, __shame__ and __neutral__). The source datasets were translated with [DeepL](https://www.deepl.com/translator) and then further processed.

The dataset was inspired by [Izard's model](https://en.wikipedia.org/wiki/Differential_Emotions_Scale) of human emotions. It comes with predefined train/val/test splits.

### Supported Tasks and Leaderboards

This dataset is intended for multi-class, multi-label emotion classification.

### Languages

The data is in Russian.

## Dataset Structure

### Data Instances

Each instance is a Reddit comment with one or more emotion annotations (or neutral).

### Data Splits

The simplified data includes a set of train/val/test splits with 24k, 3k, and 3k examples respectively.

## Considerations for Using the Data

### Social Impact of Dataset

Emotion detection is a worthwhile problem that can potentially lead to improvements such as better human/computer interaction. However, emotion detection algorithms (particularly in computer vision) have been abused in some cases to make erroneous inferences in human monitoring and assessment applications such as hiring decisions, insurance pricing, and student attentiveness.

## Additional Information

### Licensing Information

The GitHub repository that houses this dataset has an [Apache License 2.0](https://github.com/Djacon/russian-emotion-detection/blob/main/LICENSE).

### Citation Information

```
@inproceedings{Djacon,
  author={Djacon},
  title={RuIzardEmotions: A Dataset of Fine-Grained Emotions},
  year={2023}
}
```
Provider: Djacon
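Summary of original information

Dataset Card for RuIzardEmotions

Dataset Description

Dataset Summary

The RuIzardEmotions dataset is a high-quality translation of the go-emotions dataset and of another emotion-detection dataset. It contains 30k Reddit comments labeled for 10 emotion categories (joy, sadness, anger, enthusiasm, surprise, disgust, fear, guilt, shame, and neutral). The source datasets were translated with DeepL and then further processed. The dataset was inspired by Izard's model of human emotions.

The dataset comes with predefined train/val/test splits.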

Supported Tasks and Leaderboards

This dataset is intended for multi-class, multi-label emotion classification.
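
For multi-label classification, each comment's emotion annotations are commonly turned into a fixed-length multi-hot vector. The sketch below is illustrative only: the label order in `EMOTIONS` and the helper names `encode_labels` and `decode_labels` are assumptions, not part of the dataset, whose own column layout may differ.

```python
# The ten emotion categories named in the dataset summary.
# ASSUMPTION: this ordering is chosen for illustration; the dataset's
# own label order is not stated in the card and may differ.
EMOTIONS = [
    "joy", "sadness", "anger", "enthusiasm", "surprise",
    "disgust", "fear", "guilt", "shame", "neutral",
]


def encode_labels(labels):
    """Return a multi-hot vector (one slot per emotion) for a list of emotion names."""
    unknown = set(labels) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in EMOTIONS]


def decode_labels(scores, threshold=0.5):
    """Return the emotion names whose score reaches the threshold."""
    if len(scores) != len(EMOTIONS):
        raise ValueError("expected one score per emotion")
    return [name for name, score in zip(EMOTIONS, scores) if score >= threshold]


if __name__ == "__main__":
    vector = encode_labels(["joy", "surprise"])
    print(vector)
    print(decode_labels(vector))
```

A multi-hot target lets a single comment carry several emotions at once, which a single class index cannot express.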

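Languages

The data is in Russian.

Dataset Structure

Data Instances

Each instance is a Reddit comment with one or more emotion annotations (or neutral).

Data Splits

The simplified data includes a set of train/val/test splits with 24k, 3k, and 3k examples respectively.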
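
The split sizes quoted above work out to roughly an 80/10/10 partition. A minimal sketch of that arithmetic follows; it assumes the rounded counts from the card (24k, 3k, 3k), and the names `SPLIT_SIZES` and `split_fractions` are hypothetical.

```python
# Rounded split sizes as quoted in the dataset card (exact counts may differ).
SPLIT_SIZES = {"train": 24_000, "val": 3_000, "test": 3_000}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


if __name__ == "__main__":
    for name, fraction in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {fraction:.0%}")
```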

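Considerations for Using the Data

Social Impact of Dataset

Emotion detection is a worthwhile problem that can potentially lead to improvements such as better human/computer interaction. However, emotion detection algorithms (particularly in computer vision) have been abused in some cases to make erroneous inferences in human monitoring and assessment applications such as hiring decisions, insurance pricing, and student attentiveness.

Additional Information

Licensing Information

The GitHub repository that houses this dataset has an Apache License 2.0.

Citation Information

@inproceedings{Djacon, author={Djacon}, title={RuIzardEmotions: A Dataset of Fine-Grained Emotions}, year={2023} }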
