syedkhalid076/Sentiment-Analysis-Over-sampled
Source: Hugging Face. Updated 2024-12-03; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/syedkhalid076/Sentiment-Analysis-Over-sampled
Description:
This dataset is designed for sentiment analysis and offers a pre-processed collection of labeled text. It uses three sentiment labels: Negative (0), Neutral (1), and Positive (2). The training set has been oversampled so that the labels are evenly distributed, while the validation and test sets keep their original distribution for unbiased evaluation. The data is provided in CSV format with two columns: text and label. Preprocessing removes duplicates and null rows and filters out extremely short or extremely long entries.
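The preprocessing and oversampling described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the dataset author's actual pipeline. The function names `preprocess` and `oversample`, the length thresholds `min_len` and `max_len`, and the choice of random duplication with replacement are all assumptions; the description does not state the exact cutoffs or the oversampling method used.

```python
import random
from collections import Counter, defaultdict


def preprocess(rows, min_len=3, max_len=1000):
    """Drop null rows, duplicate texts, and too-short or too-long entries.

    min_len and max_len are hypothetical thresholds (in characters).
    """
    seen = set()
    cleaned = []
    for text, label in rows:
        if text is None or label is None:
            continue  # null row
        text = text.strip()
        if not (min_len <= len(text) <= max_len):
            continue  # extremely short or long entry
        if text in seen:
            continue  # duplicate text
        seen.add(text)
        cleaned.append((text, label))
    return cleaned


def oversample(rows, seed=0):
    """Randomly duplicate minority-class rows until every label
    has as many rows as the largest class (an assumed method)."""
    if not rows:
        return []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy training rows in (text, label) form: 0 = Negative, 1 = Neutral, 2 = Positive.
train = [
    ("great product, love it", 2),
    ("great product, love it", 2),  # duplicate
    ("terrible", 0),
    ("it is okay I guess", 1),
    ("works fine", 2),
    ("really nice", 2),
    (None, 1),  # null row
    ("", 0),    # too short
]

cleaned = preprocess(train)
balanced = oversample(cleaned)
print(Counter(label for _, label in cleaned))
print(Counter(label for _, label in balanced))
```

Only the training split would be passed through `oversample`; the validation and test splits would go through `preprocess` alone so that their label distribution stays as collected.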
Provided by:
syedkhalid076



