tyqiangz/multilingual-sentiments
Source: Hugging Face | Updated: 2023-05-23 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/tyqiangz/multilingual-sentiments
Resource description:
---
language:
- de
- en
- es
- fr
- ja
- zh
- id
- ar
- hi
- it
- ms
- pt
license: apache-2.0
multilinguality:
- monolingual
- multilingual
size_categories:
- 100K<n<1M
- 1M<n<10M
task_categories:
- text-classification
task_ids:
- sentiment-analysis
- sentiment-classification
---
# Multilingual Sentiments Dataset
A collection of multilingual sentiment datasets grouped into 3 classes: positive, neutral, and negative.
Most multilingual sentiment datasets are either 2-class (positive or negative), 5-class ratings of product reviews (e.g. the Amazon multilingual dataset), or multi-class emotion datasets. However, for an average person, positive, negative, and neutral classes often suffice and are more straightforward to perceive and annotate. A positive/negative classification is also too naive, because most of the text in the world is actually neutral in sentiment. Furthermore, most multilingual sentiment datasets are dominated by Western languages (e.g. English, German) and do not include Asian languages (e.g. Malay, Indonesian).
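To illustrate the difference between a 5-class rating scheme and the 3-class scheme described above, here is a minimal sketch that collapses star ratings into positive, neutral, and negative labels. The function name `stars_to_sentiment`, the thresholds (1-2 negative, 3 neutral, 4-5 positive), and the sample reviews are all assumptions made for illustration; the dataset card does not state how its source datasets were actually grouped.

```python
from collections import Counter

# Hypothetical helper: the thresholds below are an illustrative assumption,
# not the documented grouping rule of this dataset.
def stars_to_sentiment(stars: int) -> str:
    """Collapse a 1-5 star rating into one of three sentiment classes."""
    if not 1 <= stars <= 5:
        raise ValueError(f"expected a rating from 1 to 5, got {stars}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

# Made-up sample reviews as (text, star rating) pairs.
reviews = [
    ("Works exactly as described.", 5),
    ("It arrived on Tuesday.", 3),
    ("Stopped working after a week.", 1),
    ("Pretty good value.", 4),
]

# Relabel each review and count how many fall into each class.
labels = [stars_to_sentiment(stars) for _, stars in reviews]
label_counts = Counter(labels)
print(label_counts)
```

A different threshold choice (for example, treating 2 and 4 stars as neutral) would change the class balance, so any such mapping should be checked against the data it is applied to.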
Git repo: https://github.com/tyqiangz/multilingual-sentiment-datasets
## Dataset Description
- **Webpage:** https://github.com/tyqiangz/multilingual-sentiment-datasets
Provider:
tyqiangz
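Summary of original information
Multilingual Sentiments Dataset overview
Language support
- Supported languages: German, English, Spanish, French, Japanese, Chinese, Indonesian, Arabic, Hindi, Italian, Malay, Portuguese
- Multilinguality: monolingual, multilingual
Dataset size
- Size categories: 100K<n<1M, 1M<n<10M
Task type
- Task category: text classification
- Specific tasks: sentiment analysis, sentiment classification
Dataset contents
- The dataset contains multilingual sentiment data grouped into three classes: positive, neutral, and negative.
- Unlike most multilingual sentiment datasets, it focuses on a simple, intuitive positive/negative/neutral classification that is closer to how an average person perceives and annotates sentiment.
- It deliberately includes Asian languages such as Malay and Indonesian, which most multilingual sentiment datasets do not cover.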

The content above was collected and summarized by 遇见数据集.



