uznlp-uz/uz_med_sentiment
Source: Hugging Face. Updated 2026-04-02; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/uznlp-uz/uz_med_sentiment
Resource description:
---
pretty_name: UzMedSentiment
language:
- uz
license: cc-by-4.0 # ⚠️ change if needed
task_categories:
- text-classification
task_ids:
- sentiment-classification
size_categories:
- 100K<n<1M
configs:
- config_name: default
  data_files:
  - split: train
    path: dataset(UzMedSentiment).tsv
  sep: "\t"
---
# UzMedSentiment
## 📌 Dataset Summary
**UzMedSentiment** is a large-scale Uzbek medical-domain sentiment dataset designed for **aspect-based sentiment analysis** and **auxiliary linguistic signal detection**.
Each record contains:
- user-generated medical text
- aspect label
- sentiment label
- additional annotations (negation, speculation, sarcasm, ADR flags, etc.)
The dataset is released as a **single TSV file** compatible with the Hugging Face `datasets` library.
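For readers who want to inspect the TSV without the `datasets` library, the following is a minimal sketch using only the Python standard library. It assumes the first row of the file is a header; the column names in the in-memory sample (`text`, `sentiment`) are hypothetical, since this card does not list the 13 column names.

```python
import csv
import io
from typing import Iterator, TextIO


def read_tsv_rows(handle: TextIO) -> Iterator[dict]:
    """Yield one dict per data row, keyed by the header names."""
    # QUOTE_NONE keeps stray quotation marks in user-generated text from
    # being treated as field quoting (an assumption about the file).
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row


# Tiny in-memory stand-in for the real file; column names are hypothetical.
sample = io.StringIO('text\tsentiment\nShifokor "yaxshi" ekan\tPOS\n')
rows = list(read_tsv_rows(sample))
print(rows[0]["sentiment"])
```

To read the real file, pass an open file handle instead, for example `open("dataset(UzMedSentiment).tsv", encoding="utf-8", newline="")`; the UTF-8 encoding is an assumption.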
---
## 🎯 Supported Tasks
- Sentiment classification
- Aspect-based sentiment analysis (ABSA)
- Medical text mining (Uzbek)
- Detection of:
  - negation
  - speculation
  - sarcasm
  - adverse drug reactions (ADR)
---
## 🌍 Languages
- Uzbek (primary)
Script distribution:
- `uz-latin`: 165,779 rows
- `uz-kiril`: 991 rows
⚠️ Note: Some entries may include noise, code-switching, or non-standard spelling.
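The script split above can be approximated with a simple character-count heuristic. This is a sketch only, not the method used to produce the counts in this card; the function name `guess_script` is hypothetical, and code-switched or very noisy text may be misclassified.

```python
def guess_script(text: str) -> str:
    """Return 'uz-kiril' if Cyrillic letters outnumber Latin ones, else 'uz-latin'."""
    # The Unicode Cyrillic block (U+0400 to U+04FF) includes the
    # Uzbek-specific letters such as ў, қ, ғ and ҳ.
    cyrillic = sum(1 for ch in text if "\u0400" <= ch <= "\u04FF")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return "uz-kiril" if cyrillic > latin else "uz-latin"


print(guess_script("Shifokor juda yaxshi"))
print(guess_script("Шифокор жуда яхши"))
```

Text with no letters at all falls back to `uz-latin`, which matches the majority script in the dataset.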
---
## 📊 Dataset Size
- Total rows: **166,770**
- Format: **TSV**
- Columns: **13**
- Avg. text length: **85.49 chars**
- Median length: **73 chars**
- Max length: **2,494 chars**
---
## 📈 Label Distribution
### Sentiment
- `NEU`: 146,845
- `POS`: 17,915
- `NEG`: 2,010
⚠️ The dataset is **highly imbalanced** toward the neutral class: `NEU` accounts for about 88.1% of rows, `POS` for about 10.7%, and `NEG` for about 1.2%.
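One common way to compensate for this skew during training is to weight each class inversely to its frequency. The sketch below applies the usual "balanced" formula, `total / (num_classes * class_count)`, to the counts listed above. It is an illustration, not a recommendation from the dataset authors.

```python
# Sentiment counts as reported in this card.
counts = {"NEU": 146_845, "POS": 17_915, "NEG": 2_010}

total = sum(counts.values())
num_classes = len(counts)

# Inverse-frequency ("balanced") weights: rarer classes get larger weights.
weights = {label: total / (num_classes * n) for label, n in counts.items()}

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

The resulting weights can be passed to a weighted loss function. With `NEG` this rare, it may also be worth reporting per-class or macro-averaged metrics rather than accuracy alone.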
---
### Aspect Labels
- diagnostika
- dori
- infratuzilma
- kutish-vaqti
- muolaja
- narx
- parhez
- shifokor-munosabati
- simptom
- xizmat
- szolg‘a/xizmat (inconsistent variant)
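Because one label appears as an inconsistent variant, it may help to normalize aspect labels before training. The sketch below is one possible approach; mapping the variant to `xizmat` is an assumption, and the names `CANONICAL_ASPECTS`, `VARIANTS`, and `normalize_aspect` are hypothetical.

```python
CANONICAL_ASPECTS = {
    "diagnostika", "dori", "infratuzilma", "kutish-vaqti", "muolaja",
    "narx", "parhez", "shifokor-munosabati", "simptom", "xizmat",
}

# Assumption: the inconsistent variant is meant to be "xizmat".
VARIANTS = {"szolg‘a/xizmat": "xizmat"}


def normalize_aspect(label: str) -> str:
    """Map a raw aspect label to its canonical form, or raise if unknown."""
    cleaned = label.strip().lower()
    cleaned = VARIANTS.get(cleaned, cleaned)
    if cleaned not in CANONICAL_ASPECTS:
        raise ValueError(f"Unknown aspect label: {label!r}")
    return cleaned


print(normalize_aspect(" Xizmat "))
print(normalize_aspect("szolg‘a/xizmat"))
```

Raising on unknown labels, rather than passing them through silently, makes any further inconsistencies visible early.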
---
### Source Platforms
- youtube: 61,747
- telegram: 36,232
- instagram: 26,965
- tiktok: 16,140
- facebook: 13,463
- twitter_x: 8,508
- quora: 3,116
- forum: 520
- web: 72
- web-komment: 7
---
## 🧱 Dataset Structure
### Hugging Face Usage
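```python
from datasets import load_dataset

# Load from the Hugging Face Hub; the card defines a single "train" split.
dataset = load_dataset("uznlp-uz/uz_med_sentiment")
print(dataset["train"][0])  # inspect the first record
```
Provider: uznlp-uz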



