Dataset for Sentiment Analysis in Code-Mixed Language

Name: Dataset for Sentiment Analysis in Code-Mixed Language
Creator: National University of Computer and Emerging Sciences
License: No description available

Source: Mendeley Data (indexed 2026-04-09)

Download link:

https://data.mendeley.com/datasets/xxprggbztd/1

Description:

This dataset supports sentiment analysis research on low-resource code-mixed languages such as Roman Urdu, Roman Hindi, and Roman English. The data was collected from user reviews on an e-commerce platform and then cleaned. Each review is labeled with one of three classes: Positive, Negative, or Neutral.

Provided by:

National University of Computer and Emerging Sciences
