A Sentiment Analysis Dataset for Code-Mixed Malayalam-English
Source: arXiv. Updated: 2020-05-30. Indexed: 2024-06-21.
Download link:
https://github.com/bharathichezhiyan/MalayalamMixSentiment
Description:
This dataset, "A Sentiment Analysis Dataset for Code-Mixed Malayalam-English", was created at the Insight SFI Research Centre for Data Analytics for sentiment analysis of code-mixed Malayalam-English text. It contains 6,739 comments collected from YouTube on 2019 Malayalam movie trailers. The comments were collected, filtered, and annotated to build the dataset. It is intended for research on sentiment analysis of code-mixed text, particularly in social media text processing.
Provider:
Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway
Created:
2020-05-30



