
EmoMix-3L

Source: arXiv; updated 2024-05-11; indexed 2024-06-21
Download link:
https://github.com/GoswamiDhiman/EmoMix-3L
Resource overview:
EmoMix-3L is a multi-label emotion detection dataset of code-mixed text drawn from three languages: Bengali, Hindi, and English. It was created at George Mason University to address the challenges of emotion detection in multilingual settings. The dataset contains 1,071 instances annotated by native speakers of the three languages and covers everyday topics such as politics and sports. To ensure data quality and avoid the ethical issues that come with using publicly available online data, the researchers used a controlled data collection method. EmoMix-3L is suited to social media analysis and emotion mining, and it serves as an evaluation resource for multi-label emotion detection models.
Provided by:
George Mason University
Created:
2024-05-11