imranraad/github-emotion-surprise
Source: Hugging Face. Updated 2024-03-01; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/imranraad/github-emotion-surprise
Resource description:
---
task_categories:
- text-classification
---
# AutoTrain Dataset for project: github-emotion-love
## Dataset Description
Dataset used in the paper: Imran et al., ["Data Augmentation for Improving Emotion Recognition in Software Engineering Communication"](https://arxiv.org/abs/2208.05573), ASE-2022.
This is an annotated dataset of 2000 GitHub comments. Each comment is annotated for six basic emotions: Anger, Love, Fear, Joy, Sadness, and Surprise. This repository contains the annotations for all six emotions.
## Dataset Structure
The dataset is in CSV format, with the following columns:
```id, modified_comment, Anger, Love, Fear, Joy, Sadness, Surprise```
Here, `id` is a unique identifier for each comment, and each emotion column is marked as 1 or 0.
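As a rough illustration of the column layout described above, the following sketch parses CSV text with Python's standard `csv` module and checks that every emotion column holds 0 or 1. The function name `read_annotations`, the output field names, and the two sample rows are hypothetical and invented for this example; they are not taken from the dataset, and the actual file may differ in quoting or header spacing.

```python
import csv
import io

EMOTIONS = ["Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"]
EXPECTED_COLUMNS = ["id", "modified_comment"] + EMOTIONS


def read_annotations(handle):
    """Parse rows from an open CSV file, checking the header and the 0/1 labels."""
    # skipinitialspace tolerates a header written as "id, modified_comment, ..."
    reader = csv.DictReader(handle, skipinitialspace=True)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = []
    for record in reader:
        labels = {}
        for emotion in EMOTIONS:
            value = (record[emotion] or "").strip()
            if value not in ("0", "1"):
                raise ValueError(f"{emotion} must be 0 or 1, got {value!r}")
            labels[emotion] = int(value)
        rows.append(
            {"id": record["id"], "comment": record["modified_comment"], "labels": labels}
        )
    return rows


# Hypothetical two-row sample, invented for illustration; not real dataset rows.
SAMPLE = (
    "id,modified_comment,Anger,Love,Fear,Joy,Sadness,Surprise\n"
    '1,"Thanks, this fix works great!",0,1,0,1,0,0\n'
    "2,Why does this fail again?,1,0,0,0,0,1\n"
)

rows = read_annotations(io.StringIO(SAMPLE))
# List the emotions marked 1 for each comment.
active = [[e for e in EMOTIONS if row["labels"][e]] for row in rows]
print(active)  # [['Love', 'Joy'], ['Anger', 'Surprise']]
```

Because each emotion has its own 0/1 column, a single comment can carry more than one label in this layout, as the first sample row does.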
### Dataset Splits
The dataset is divided into a train split and a test split. The split sizes are as follows:
| Split name | Num samples |
| ------------ | ------------------- |
| train | 1600 |
| test | 400 |
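The split sizes can be sanity-checked against the 2000 comments stated in the description. The short sketch below does only that arithmetic; the names `SPLIT_SIZES`, `TOTAL_COMMENTS`, and `split_fractions` are hypothetical helpers for illustration.

```python
SPLIT_SIZES = {"train": 1600, "test": 400}  # values from the table above
TOTAL_COMMENTS = 2000  # stated in the dataset description


def split_fractions(sizes):
    """Return each split's share of the total number of samples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


fractions = split_fractions(SPLIT_SIZES)
# The two splits together should account for every annotated comment.
assert sum(SPLIT_SIZES.values()) == TOTAL_COMMENTS
print(fractions)  # {'train': 0.8, 'test': 0.2}
```

In other words, the splits follow an 80/20 ratio.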
Provider:
imranraad



