
p988744/eland-sentiment-zh-data

Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/p988744/eland-sentiment-zh-data
Resource description:
---
license: apache-2.0
language:
- zh
task_categories:
- text-classification
tags:
- sentiment-analysis
- finance
- chinese
- taiwan
- stock-market
size_categories:
- 1K<n<10K
---

# Eland Sentiment Chinese Financial Dataset

A Chinese financial sentiment analysis dataset for Taiwan stock market text.

## Dataset Description

This dataset contains 1,299 annotated samples of Chinese financial text from Taiwan stock market forums and news. It supports three sentiment analysis tasks:

1. **Overall Sentiment** - Classify the overall sentiment of the text
2. **Entity Sentiment** - Classify sentiment towards specific entities (companies, products)
3. **Opinion Sentiment** - Classify sentiment of specific opinions in text

## Dataset Statistics

| Split | Samples |
|-------|---------|
| Train | 999 |
| Test | 300 |
| **Total** | **1,299** |

### Task Distribution

| Task | Train | Test |
|------|-------|------|
| Overall Sentiment | 333 | 100 |
| Entity Sentiment | 333 | 100 |
| Opinion Sentiment | 333 | 100 |

### Sentiment Distribution (Train)

| Sentiment | Count | Percentage |
|-----------|-------|------------|
| Positive (正面) | 434 | 43.4% |
| Neutral (中立) | 350 | 35.0% |
| Negative (負面) | 215 | 21.5% |

## Data Format

Each sample is a JSON object with the following fields:

### Overall Sentiment Task

```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "task": "overall",
  "source": "forum_01"
}
```

### Entity Sentiment Task

```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "entity": "台積電",
  "entity_sentiment": "正面",
  "task": "entity",
  "source": "forum_01"
}
```

### Opinion Sentiment Task

```json
{
  "text": "台積電今日股價大漲...",
  "overall": "正面",
  "opinion": "股價上漲代表市場看好",
  "opinion_sentiment": "正面",
  "agrees_with_text": true,
  "task": "opinion",
  "source": "forum_01"
}
```

## Labels

- **正面** (Positive)
- **中立** (Neutral)
- **負面** (Negative)

## Usage

### Load with Datasets Library

```python
from datasets import load_dataset

dataset = load_dataset("p988744/eland-sentiment-zh-data")

# Access splits
train_data = dataset["train"]
test_data = dataset["test"]

# Example
print(train_data[0])
```

### Load Manually

```python
import json

with open("train.jsonl", "r", encoding="utf-8") as f:
    train_data = [json.loads(line) for line in f]
```

## Associated Model

This dataset was used to train [p988744/eland-sentiment-zh](https://huggingface.co/p988744/eland-sentiment-zh), which achieves **89.38% Reliability** on the RGL benchmark.

## License

Apache 2.0

## Citation

```bibtex
@misc{eland-sentiment-zh-data,
  author = {Eland AI},
  title = {Eland Sentiment: Chinese Financial Sentiment Dataset},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/p988744/eland-sentiment-zh-data}
}
```
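Because the three tasks store their labels in different fields, it can help to pick the label by the record's `task` value before counting or training. The following is a minimal sketch, not part of the original card: the `LABEL_FIELD` mapping and the helper names `task_label` and `label_counts` are assumptions inferred from the example records under Data Format, and the `samples` list is toy data modelled on those examples rather than real dataset rows.

```python
from collections import Counter

# Assumed mapping from the "task" field to the field that holds that
# task's label, inferred from the example records in the card.
LABEL_FIELD = {
    "overall": "overall",
    "entity": "entity_sentiment",
    "opinion": "opinion_sentiment",
}


def task_label(record):
    """Return the sentiment label that corresponds to the record's task."""
    return record[LABEL_FIELD[record["task"]]]


def label_counts(records, task=None):
    """Count task-specific labels, optionally restricted to one task."""
    return Counter(
        task_label(r) for r in records if task is None or r["task"] == task
    )


# Toy records modelled on the card's examples (not real dataset rows).
samples = [
    {"text": "台積電今日股價大漲...", "overall": "正面",
     "task": "overall", "source": "forum_01"},
    {"text": "台積電今日股價大漲...", "overall": "正面",
     "entity": "台積電", "entity_sentiment": "正面",
     "task": "entity", "source": "forum_01"},
    {"text": "台積電今日股價大漲...", "overall": "正面",
     "opinion": "股價上漲代表市場看好", "opinion_sentiment": "正面",
     "agrees_with_text": True, "task": "opinion", "source": "forum_01"},
]

print(label_counts(samples))
print(label_counts(samples, task="entity"))
```

The same helpers should accept a list loaded from `train.jsonl`, since each line there parses to a plain dictionary. Whether the card's published sentiment distribution counts the task-specific label or the `overall` field is not stated, so counts produced this way may not match that table exactly.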
Provided by:
p988744