p988744/eland-sentiment-zh-data
Source: Hugging Face · Updated: 2025-12-10 · Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/p988744/eland-sentiment-zh-data
Resource description:
---
license: apache-2.0
language:
- zh
task_categories:
- text-classification
tags:
- sentiment-analysis
- finance
- chinese
- taiwan
- stock-market
size_categories:
- 1K<n<10K
---
# Eland Sentiment Chinese Financial Dataset
A Chinese financial sentiment analysis dataset for Taiwan stock market text.
## Dataset Description
This dataset contains 1,299 annotated samples of Chinese financial text from Taiwan stock market forums and news. It supports three sentiment analysis tasks:
1. **Overall Sentiment** - Classify the overall sentiment of the text
2. **Entity Sentiment** - Classify sentiment towards specific entities (companies, products)
3. **Opinion Sentiment** - Classify sentiment of specific opinions in text
## Dataset Statistics
| Split | Samples |
|-------|---------|
| Train | 999 |
| Test | 300 |
| **Total** | **1,299** |
### Task Distribution
| Task | Train | Test |
|------|-------|------|
| Overall Sentiment | 333 | 100 |
| Entity Sentiment | 333 | 100 |
| Opinion Sentiment | 333 | 100 |
### Sentiment Distribution (Train)
| Sentiment | Count | Percentage |
|-----------|-------|------------|
| Positive (正面) | 434 | 43.4% |
| Neutral (中立) | 350 | 35.0% |
| Negative (負面) | 215 | 21.5% |
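As a quick consistency check, the percentages above can be recomputed from the reported counts. This is a minimal sketch: the label list is rebuilt from the table rather than read from the dataset, and `label_distribution` is a hypothetical helper name. The card does not state which label field the table counts, so the sketch takes the counts as given.

```python
from collections import Counter

def label_distribution(labels):
    """Return {label: (count, percentage rounded to one decimal)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}

# Rebuilt from the counts reported in the table above (not loaded from the dataset).
train_labels = ["正面"] * 434 + ["中立"] * 350 + ["負面"] * 215

dist = label_distribution(train_labels)
for label, (count, pct) in dist.items():
    print(f"{label}: {count} ({pct}%)")
```

The three counts sum to 999, which matches the size of the train split.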
## Data Format
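Each sample is a JSON object. All samples share the `text`, `overall`, `task`, and `source` fields; the entity and opinion tasks add task-specific fields, as shown below.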
### Overall Sentiment Task
```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "task": "overall",
  "source": "forum_01"
}
```
### Entity Sentiment Task
```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "entity": "台積電",
  "entity_sentiment": "正面",
  "task": "entity",
  "source": "forum_01"
}
```
### Opinion Sentiment Task
```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "opinion": "股價上漲代表市場看好",
  "opinion_sentiment": "正面",
  "agrees_with_text": true,
  "task": "opinion",
  "source": "forum_01"
}
```
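Because the target label lives in a different field for each task, a small lookup can help when preparing training pairs. The sketch below assumes the field names shown in the three examples above; `LABEL_FIELD` and `target_label` are hypothetical names introduced here for illustration, not part of the dataset.

```python
# Maps each `task` value to the field holding that task's label
# (field names taken from the example records; the mapping itself is an assumption).
LABEL_FIELD = {
    "overall": "overall",
    "entity": "entity_sentiment",
    "opinion": "opinion_sentiment",
}

def target_label(sample):
    """Return the sentiment label that corresponds to the sample's task."""
    return sample[LABEL_FIELD[sample["task"]]]

entity_sample = {
    "text": "台積電今日股價大漲...",
    "overall": "正面",
    "entity": "台積電",
    "entity_sentiment": "正面",
    "task": "entity",
    "source": "forum_01",
}
print(target_label(entity_sample))
```

An unknown `task` value raises `KeyError`, which surfaces malformed records early instead of silently mislabeling them.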
## Labels
- **正面** (Positive)
- **中立** (Neutral)
- **負面** (Negative)
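Most classifiers expect integer class ids rather than string labels. The card does not define numeric ids, so the ordering below (negative = 0, neutral = 1, positive = 2) is an assumption made for this sketch, as are the names `LABEL2ID`, `ID2LABEL`, and `encode_labels`.

```python
# Hypothetical id assignment; the dataset card does not specify numeric ids.
LABEL2ID = {"負面": 0, "中立": 1, "正面": 2}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}

def encode_labels(labels):
    """Convert string sentiment labels to integer ids."""
    return [LABEL2ID[label] for label in labels]

print(encode_labels(["正面", "中立", "負面"]))  # [2, 1, 0]
```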
## Usage
### Load with Datasets Library
```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("p988744/eland-sentiment-zh-data")

# Access the train and test splits
train_data = dataset["train"]
test_data = dataset["test"]

# Print the first training sample
print(train_data[0])
```
### Load Manually
```python
import json

# Read one JSON object per line from the local JSONL file
with open("train.jsonl", "r", encoding="utf-8") as f:
    train_data = [json.loads(line) for line in f]
```
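Since the three tasks are mixed within each split, it is often convenient to bucket records by `task` right after loading. The sketch below assumes one JSON object per line, as in the manual loading snippet above. It uses an in-memory stand-in for `train.jsonl` so it runs without the file, and `group_by_task` is a hypothetical helper name.

```python
import io
import json
from collections import defaultdict

def group_by_task(lines):
    """Parse JSONL lines and bucket the records by their `task` field."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            record = json.loads(line)
            groups[record["task"]].append(record)
    return dict(groups)

# In-memory stand-in for train.jsonl (two records adapted from the examples above).
sample_jsonl = io.StringIO(
    '{"text": "台積電今日股價大漲...", "overall": "正面", "task": "overall", "source": "forum_01"}\n'
    '{"text": "台積電今日股價大漲...", "overall": "正面", "entity": "台積電", '
    '"entity_sentiment": "正面", "task": "entity", "source": "forum_01"}\n'
)

by_task = group_by_task(sample_jsonl)
print({task: len(records) for task, records in by_task.items()})
```

With the real file, passing the open file handle in place of `sample_jsonl` should work the same way, since both yield one line per iteration.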
## Associated Model
This dataset was used to train [p988744/eland-sentiment-zh](https://huggingface.co/p988744/eland-sentiment-zh), which achieves **89.38% Reliability** on the RGL benchmark.
## License
Apache 2.0
## Citation
```bibtex
@misc{eland-sentiment-zh-data,
author = {Eland AI},
title = {Eland Sentiment: Chinese Financial Sentiment Dataset},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/datasets/p988744/eland-sentiment-zh-data}
}
```
Provider:
p988744



