chrissoria/trump-truth-social
Source: Hugging Face. Updated 2026-03-28; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/chrissoria/trump-truth-social
Resource description:
---
license: cc-by-nc-4.0
language:
- en
tags:
- politics
- social-media
- truth-social
- nlp
- text-classification
- sentiment-analysis
- finance
- geopolitics
size_categories:
- 10K<n<100K
task_categories:
- text-classification
- text-generation
pretty_name: Trump Truth Social Posts Archive
dataset_info:
  features:
  - name: date
    dtype: string
  - name: time
    dtype: string
  - name: day_of_week
    dtype: string
  - name: datetime
    dtype: string
  - name: text
    dtype: string
  - name: content_html
    dtype: string
  - name: url
    dtype: string
  - name: post_id
    dtype: string
  - name: is_president
    dtype: bool
  - name: is_president_elect
    dtype: bool
  - name: replies_count
    dtype: int64
  - name: reblogs_count
    dtype: int64
  - name: favourites_count
    dtype: int64
  - name: media_urls
    dtype: string
  - name: links
    dtype: string
  - name: has_media
    dtype: bool
  - name: image_alt_text
    dtype: string
  - name: sp500_open
    dtype: float64
  - name: sp500_close
    dtype: float64
  - name: sp500_1hr_before
    dtype: float64
  - name: sp500_5min_before
    dtype: float64
  - name: sp500_at_post
    dtype: float64
  - name: sp500_5min_after
    dtype: float64
  - name: sp500_1hr_after
    dtype: float64
  - name: sp500_resolution
    dtype: string
  - name: dia_open
    dtype: float64
  - name: dia_close
    dtype: float64
  - name: dia_1hr_before
    dtype: float64
  - name: dia_5min_before
    dtype: float64
  - name: dia_at_post
    dtype: float64
  - name: dia_5min_after
    dtype: float64
  - name: dia_1hr_after
    dtype: float64
  - name: qqq_open
    dtype: float64
  - name: qqq_close
    dtype: float64
  - name: qqq_1hr_before
    dtype: float64
  - name: qqq_5min_before
    dtype: float64
  - name: qqq_at_post
    dtype: float64
  - name: qqq_5min_after
    dtype: float64
  - name: qqq_1hr_after
    dtype: float64
  - name: djt_open
    dtype: float64
  - name: djt_close
    dtype: float64
  - name: djt_1hr_before
    dtype: float64
  - name: djt_5min_before
    dtype: float64
  - name: djt_at_post
    dtype: float64
  - name: djt_5min_after
    dtype: float64
  - name: djt_1hr_after
    dtype: float64
  - name: lmt_open
    dtype: float64
  - name: lmt_close
    dtype: float64
  - name: lmt_1hr_before
    dtype: float64
  - name: lmt_5min_before
    dtype: float64
  - name: lmt_at_post
    dtype: float64
  - name: lmt_5min_after
    dtype: float64
  - name: lmt_1hr_after
    dtype: float64
  - name: war_open
    dtype: float64
  - name: war_close
    dtype: float64
  - name: war_1hr_before
    dtype: float64
  - name: war_5min_before
    dtype: float64
  - name: war_at_post
    dtype: float64
  - name: war_5min_after
    dtype: float64
  - name: war_1hr_after
    dtype: float64
  - name: cnrg_open
    dtype: float64
  - name: cnrg_close
    dtype: float64
  - name: cnrg_1hr_before
    dtype: float64
  - name: cnrg_5min_before
    dtype: float64
  - name: cnrg_at_post
    dtype: float64
  - name: cnrg_5min_after
    dtype: float64
  - name: cnrg_1hr_after
    dtype: float64
  - name: xlv_open
    dtype: float64
  - name: xlv_close
    dtype: float64
  - name: xlv_1hr_before
    dtype: float64
  - name: xlv_5min_before
    dtype: float64
  - name: xlv_at_post
    dtype: float64
  - name: xlv_5min_after
    dtype: float64
  - name: xlv_1hr_after
    dtype: float64
  - name: xph_open
    dtype: float64
  - name: xph_close
    dtype: float64
  - name: xph_1hr_before
    dtype: float64
  - name: xph_5min_before
    dtype: float64
  - name: xph_at_post
    dtype: float64
  - name: xph_5min_after
    dtype: float64
  - name: xph_1hr_after
    dtype: float64
  - name: gld_open
    dtype: float64
  - name: gld_close
    dtype: float64
  - name: gld_1hr_before
    dtype: float64
  - name: gld_5min_before
    dtype: float64
  - name: gld_at_post
    dtype: float64
  - name: gld_5min_after
    dtype: float64
  - name: gld_1hr_after
    dtype: float64
  - name: uso_open
    dtype: float64
  - name: uso_close
    dtype: float64
  - name: uso_1hr_before
    dtype: float64
  - name: uso_5min_before
    dtype: float64
  - name: uso_at_post
    dtype: float64
  - name: uso_5min_after
    dtype: float64
  - name: uso_1hr_after
    dtype: float64
  - name: xli_open
    dtype: float64
  - name: xli_close
    dtype: float64
  - name: xli_1hr_before
    dtype: float64
  - name: xli_5min_before
    dtype: float64
  - name: xli_at_post
    dtype: float64
  - name: xli_5min_after
    dtype: float64
  - name: xli_1hr_after
    dtype: float64
  - name: eww_open
    dtype: float64
  - name: eww_close
    dtype: float64
  - name: eww_1hr_before
    dtype: float64
  - name: eww_5min_before
    dtype: float64
  - name: eww_at_post
    dtype: float64
  - name: eww_5min_after
    dtype: float64
  - name: eww_1hr_after
    dtype: float64
  - name: vgk_open
    dtype: float64
  - name: vgk_close
    dtype: float64
  - name: vgk_1hr_before
    dtype: float64
  - name: vgk_5min_before
    dtype: float64
  - name: vgk_at_post
    dtype: float64
  - name: vgk_5min_after
    dtype: float64
  - name: vgk_1hr_after
    dtype: float64
  - name: ibit_open
    dtype: float64
  - name: ibit_close
    dtype: float64
  - name: ibit_1hr_before
    dtype: float64
  - name: ibit_5min_before
    dtype: float64
  - name: ibit_at_post
    dtype: float64
  - name: ibit_5min_after
    dtype: float64
  - name: ibit_1hr_after
    dtype: float64
  - name: fxi_open
    dtype: float64
  - name: fxi_close
    dtype: float64
  - name: fxi_1hr_before
    dtype: float64
  - name: fxi_5min_before
    dtype: float64
  - name: fxi_at_post
    dtype: float64
  - name: fxi_5min_after
    dtype: float64
  - name: fxi_1hr_after
    dtype: float64
  - name: tlt_open
    dtype: float64
  - name: tlt_close
    dtype: float64
  - name: tlt_1hr_before
    dtype: float64
  - name: tlt_5min_before
    dtype: float64
  - name: tlt_at_post
    dtype: float64
  - name: tlt_5min_after
    dtype: float64
  - name: tlt_1hr_after
    dtype: float64
  - name: uup_open
    dtype: float64
  - name: uup_close
    dtype: float64
  - name: uup_1hr_before
    dtype: float64
  - name: uup_5min_before
    dtype: float64
  - name: uup_at_post
    dtype: float64
  - name: uup_5min_after
    dtype: float64
  - name: uup_1hr_after
    dtype: float64
  - name: gdelt_military
    dtype: int64
  - name: gdelt_sanctions
    dtype: int64
  - name: gdelt_threat
    dtype: int64
  - name: gdelt_protest
    dtype: int64
  - name: gdelt_force_posture
    dtype: int64
  - name: gdelt_diplomatic
    dtype: int64
  - name: gdelt_material_conflict
    dtype: int64
  - name: gdelt_verbal_conflict
    dtype: int64
  - name: gdelt_material_cooperation
    dtype: int64
  - name: gdelt_verbal_cooperation
    dtype: int64
  - name: gdelt_goldstein_avg
    dtype: float64
  - name: gdelt_avg_tone
    dtype: float64
  - name: gdelt_total_events
    dtype: int64
  - name: gdelt_military_pct
    dtype: float64
  - name: gdelt_sanctions_pct
    dtype: float64
  - name: gdelt_threat_pct
    dtype: float64
  - name: gdelt_protest_pct
    dtype: float64
  - name: gdelt_force_posture_pct
    dtype: float64
  - name: gdelt_diplomatic_pct
    dtype: float64
  - name: gdelt_military_zscore
    dtype: float64
  - name: gdelt_sanctions_zscore
    dtype: float64
  - name: gdelt_threat_zscore
    dtype: float64
  - name: gdelt_protest_zscore
    dtype: float64
  - name: gdelt_material_conflict_zscore
    dtype: float64
  - name: gdelt_military_delta
    dtype: int64
  - name: gdelt_sanctions_delta
    dtype: int64
  - name: gdelt_threat_delta
    dtype: int64
  - name: gdelt_protest_delta
    dtype: int64
  - name: gdelt_material_conflict_delta
    dtype: int64
  - name: gdelt_goldstein_avg_delta
    dtype: float64
  - name: gdelt_avg_tone_delta
    dtype: float64
  - name: time_eastern
    dtype: string
  - name: during_market_hours
    dtype: bool
  - name: market_period
    dtype: string
  - name: cat_attacking_individual
    dtype: float64
  - name: cat_attacking_opposition
    dtype: float64
  - name: cat_threatening_intl
    dtype: float64
  - name: cat_enacting_aggressive
    dtype: float64
  - name: cat_enacting_nonaggressive
    dtype: float64
  - name: cat_deescalating
    dtype: float64
  - name: cat_praising_endorsing
    dtype: float64
  - name: cat_self_promotion
    dtype: float64
  - name: cat_other
    dtype: float64
  splits:
  - name: train
    num_bytes: 66451329
    num_examples: 32255
  download_size: 12397314
  dataset_size: 66451329
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Trump Truth Social Posts Archive
Public posts ("Truths") by Donald J. Trump on Truth Social, enriched with market data, geopolitical event indicators, and LLM-based post classifications. Collected for academic research purposes.
## Dataset Description
- **Source**: [CNN/Stiles Truth Social Archive](https://github.com/stiles/trump-truth-social-archive) (live-updating public archive)
- **Posts**: 32,000+ and growing (32,255 in the `train` split per the metadata above)
- **Date range**: February 2022 – present
- **Update frequency**: Daily (Truth Social), weekly (all other sources)
- **Maintainer**: [Chris Soria](https://github.com/chrissoria) (UC Berkeley)
## Fields
### Post metadata
| Field | Type | Description |
|-------|------|-------------|
| `date` | string | Post date (YYYY-MM-DD) |
| `time` | string | Post time in UTC (HH:MM:SS) |
| `time_eastern` | string | Post time in US Eastern (HH:MM:SS, DST-aware) |
| `day_of_week` | string | Day name (Monday, Tuesday, etc.) |
| `datetime` | string | Full ISO 8601 timestamp (UTC) |
| `text` | string | Plain text content (HTML stripped) |
| `content_html` | string | Original HTML content |
| `url` | string | Direct link to post on Truth Social |
| `post_id` | string | Truth Social post ID |
| `is_president` | bool | Whether Trump was serving as president at time of post |
| `is_president_elect` | bool | Whether Trump was president-elect at time of post |
| `during_market_hours` | bool | Whether post was made during US market hours (9:30 AM – 4:00 PM ET, weekdays) |
| `market_period` | string | One of: `before_market`, `during_market`, `after_market` |
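
The relationship between `datetime`, `time_eastern`, `during_market_hours`, and `market_period` can be sketched as follows. This is a minimal illustration, not the maintainer's code: the function name `classify_market_period` is hypothetical, the boundary handling (open inclusive, close exclusive) is an assumption, weekend posts are assigned a period by clock time only, and market holidays are ignored.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")  # DST-aware US Eastern time
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def classify_market_period(utc_iso):
    """Return (market_period, during_market_hours) for a UTC ISO 8601 string.

    Hypothetical helper; the boundary and weekend rules are assumptions.
    """
    stamp = datetime.fromisoformat(utc_iso.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        # Treat naive timestamps as UTC, as the `datetime` field is documented.
        stamp = stamp.replace(tzinfo=timezone.utc)
    local = stamp.astimezone(EASTERN)
    clock = local.time()
    if clock < MARKET_OPEN:
        period = "before_market"
    elif clock < MARKET_CLOSE:
        period = "during_market"
    else:
        period = "after_market"
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    return period, (period == "during_market" and is_weekday)


print(classify_market_period("2025-01-15T15:00:00Z"))  # 10:00 ET on a Wednesday
```

Comparing the output of a helper like this with the stored columns is a quick way to check how the dataset actually treats edge cases such as weekends.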
### Engagement
| Field | Type | Description |
|-------|------|-------------|
| `replies_count` | int | Number of replies |
| `reblogs_count` | int | Number of re-truths (reposts) |
| `favourites_count` | int | Number of likes |
### Media
| Field | Type | Description |
|-------|------|-------------|
| `media_urls` | string | Semicolon-separated image/video URLs attached to the post |
| `links` | string | Semicolon-separated URLs found in post text |
| `has_media` | bool | Whether post contains media attachments |
| `image_alt_text` | string | AI-generated factual image description for accessibility (in progress) |
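
Because `media_urls` and `links` are stored as semicolon-separated strings, they usually need to be split before use. A minimal sketch, assuming that missing values arrive as `None`, NaN, or an empty string (the helper name `split_urls` and the example URLs are hypothetical):

```python
def split_urls(cell):
    """Split a semicolon-separated URL cell into a list of clean URLs."""
    if not isinstance(cell, str):
        return []  # None or NaN: the post has no URLs in this field
    return [part.strip() for part in cell.split(";") if part.strip()]


# Hypothetical cell value for illustration only.
print(split_urls("https://example.com/a.jpg; https://example.com/b.mp4"))
```

A URL that itself contains a semicolon would be split incorrectly by this approach, so it is worth spot-checking the results.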
### Post classification (5-model ensemble)
Post categories assigned by a 5-model unanimous-vote LLM ensemble (Llama 4 Maverick, Qwen3-32B, Claude 3 Haiku, GPT-4o-mini, Gemini 2.0 Flash). Classification is multi-label: a post can belong to several categories. Labels are available only for posts with text published on or after November 5, 2024 (Election Day). Values: 1 = present, 0 = not present.
| Field | Type | Description |
|-------|------|-------------|
| `cat_attacking_individual` | float | Targeting a specific person by name |
| `cat_attacking_opposition` | float | Targeting Democrats, a party, or political group broadly |
| `cat_threatening_intl` | float | Conditional threats, tariff warnings, military posturing |
| `cat_enacting_aggressive` | float | Imposing tariffs, sanctions, bans, military action (already done) |
| `cat_enacting_nonaggressive` | float | Signing bills, executive orders, domestic programs, appointments |
| `cat_deescalating` | float | Toning down, announcing deals, peace talks, ceasefire |
| `cat_praising_endorsing` | float | Positive statements about a person, leader, ally |
| `cat_self_promotion` | float | Boasting about achievements, economy, polls, ratings |
| `cat_other` | float | Does not fit any above category |
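
Since the labels are multi-label and cover only part of the archive, a small helper can turn the nine `cat_*` columns into a list of active labels per post. This sketch assumes that unclassified posts carry null/NaN in every `cat_*` column (the float dtype suggests this, but the card does not state it); `active_labels` is a hypothetical name.

```python
import math

CATEGORY_COLUMNS = [
    "cat_attacking_individual",
    "cat_attacking_opposition",
    "cat_threatening_intl",
    "cat_enacting_aggressive",
    "cat_enacting_nonaggressive",
    "cat_deescalating",
    "cat_praising_endorsing",
    "cat_self_promotion",
    "cat_other",
]


def _is_missing(value):
    return value is None or (isinstance(value, float) and math.isnan(value))


def active_labels(row):
    """Return the categories flagged 1 for one post (a dict-like row).

    Returns None when every category is missing, which this sketch
    treats as "post was not classified" (an assumption).
    """
    values = [row.get(col) for col in CATEGORY_COLUMNS]
    if all(_is_missing(v) for v in values):
        return None
    return [
        col[len("cat_"):]
        for col, v in zip(CATEGORY_COLUMNS, values)
        if v == 1
    ]


example = {col: 0.0 for col in CATEGORY_COLUMNS}
example["cat_threatening_intl"] = 1.0
example["cat_self_promotion"] = 1.0
print(active_labels(example))
```

Keeping "unclassified" (`None`) distinct from "classified with no positive label" (an empty list) avoids counting pre-election posts as negatives.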
### Market data (18 tickers)
Each ticker has 7 price columns named `{prefix}{suffix}`, for example `sp500_at_post`. Daily open and close prices are available for all posts. Intraday prices (1 hour before through 1 hour after) use the highest resolution available for the post's age: 1-minute (last ~7 days), 5-minute (last ~60 days), or hourly (last ~2 years). Posts made on weekends or holidays use the most recent trading day. The `sp500_resolution` column records the intraday resolution used.
**Metrics per ticker:**
| Suffix | Description |
|--------|-------------|
| `_open` | Daily open price |
| `_close` | Daily close price |
| `_1hr_before` | Price 1 hour before the post |
| `_5min_before` | Price 5 minutes before the post |
| `_at_post` | Price at time of post |
| `_5min_after` | Price 5 minutes after the post |
| `_1hr_after` | Price 1 hour after the post |
**Tickers:**
| Prefix | Ticker | Name | Category |
|--------|--------|------|----------|
| `sp500_` | ^GSPC | S&P 500 | Broad market |
| `dia_` | DIA | SPDR Dow Jones Industrial Average ETF | Broad market |
| `qqq_` | QQQ | Invesco QQQ (Nasdaq-100) | Tech/growth |
| `djt_` | DJT | Trump Media & Technology Group | Trump-linked |
| `lmt_` | LMT | Lockheed Martin | Defense |
| `war_` | WAR | U.S. Global Technology and Aerospace & Defense ETF | Defense |
| `xli_` | XLI | Industrial Select Sector SPDR | Industrials |
| `xlv_` | XLV | Health Care Select Sector SPDR | Healthcare |
| `xph_` | XPH | SPDR S&P Pharmaceuticals ETF | Pharma |
| `cnrg_` | CNRG | SPDR S&P Kensho Clean Power ETF | Clean energy |
| `gld_` | GLD | SPDR Gold Shares | Gold/commodities |
| `uso_` | USO | United States Oil Fund | Oil/energy |
| `fxi_` | FXI | iShares China Large-Cap ETF | China/trade |
| `eww_` | EWW | iShares MSCI Mexico ETF | Mexico/trade |
| `vgk_` | VGK | Vanguard FTSE Europe ETF | Europe |
| `ibit_` | IBIT | iShares Bitcoin Trust ETF | Crypto |
| `tlt_` | TLT | iShares 20+ Year Treasury Bond ETF | Bonds/rates |
| `uup_` | UUP | Invesco DB US Dollar Index Bullish Fund | USD strength |
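
The price snapshots are most useful as returns over a window around the post. The sketch below computes a percent change between two snapshots of one ticker. The helper name `window_return` is hypothetical, the prefix is passed without its trailing underscore, and the prices in the example are made up.

```python
import math


def window_return(row, prefix, start="at_post", end="5min_after"):
    """Percent price change between two snapshots of one ticker for one post.

    `row` is a dict-like record; `prefix` is e.g. "sp500" or "djt".
    Returns None when either price is missing or the start price is zero.
    """
    p0 = row.get(f"{prefix}_{start}")
    p1 = row.get(f"{prefix}_{end}")
    if p0 is None or p1 is None:
        return None
    if math.isnan(p0) or math.isnan(p1) or p0 == 0:
        return None
    return (p1 - p0) / p0 * 100.0


# Made-up prices for illustration only.
post = {"sp500_at_post": 5000.0, "sp500_5min_after": 5010.0}
print(window_return(post, "sp500"))
```

For posts made outside trading hours the snapshots come from the most recent trading day, so a window return there may not measure a reaction to the post; filtering on `during_market_hours` first is one way to handle this.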
### GDELT geopolitical events (daily)
Daily aggregates of US-involved events from the [GDELT Project](https://www.gdeltproject.org/) via BigQuery. Each row gets the event counts for its post date. Based on CAMEO event coding of global news coverage.
**Raw counts:**
| Field | Type | Description |
|-------|------|-------------|
| `gdelt_military` | int | US military assault/force/mass violence events (CAMEO 18-20) |
| `gdelt_sanctions` | int | Sanctions/embargo events (CAMEO 17) |
| `gdelt_threat` | int | Threat events (CAMEO 13) |
| `gdelt_protest` | int | Protest events (CAMEO 14) |
| `gdelt_force_posture` | int | Force posturing events (CAMEO 15) |
| `gdelt_diplomatic` | int | Diplomatic cooperation events (CAMEO 01-08) |
| `gdelt_material_conflict` | int | Material conflict events (QuadClass 4) |
| `gdelt_verbal_conflict` | int | Verbal conflict events (QuadClass 3) |
| `gdelt_material_cooperation` | int | Material cooperation events (QuadClass 2) |
| `gdelt_verbal_cooperation` | int | Verbal cooperation events (QuadClass 1) |
| `gdelt_goldstein_avg` | float | Average Goldstein scale for the day (-10 = max conflict, +10 = max cooperation) |
| `gdelt_avg_tone` | float | Average news tone for the day (negative = negative coverage) |
| `gdelt_total_events` | int | Total US-involved events |
**Derived:**
| Suffix | Description |
|--------|-------------|
| `_pct` | Share of total events (e.g., `gdelt_military_pct` = military events as % of total) |
| `_zscore` | Standard deviations above/below historical mean (flags unusual days) |
| `_delta` | Day-over-day change from previous day |
Available for: `military`, `sanctions`, `threat`, `protest`, `force_posture`, `diplomatic` (pct); `military`, `sanctions`, `threat`, `protest`, `material_conflict` (zscore and delta); `goldstein_avg`, `avg_tone` (delta).
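
The three derived suffixes can be reproduced approximately from the raw daily counts. The sketch below rests on several assumptions that the card does not confirm: `_pct` is scaled 0 to 100, `_zscore` uses the mean and population standard deviation of the whole series (the card says only "historical mean"), and the first day has no `_delta`. The function name `derive_indicators` is hypothetical.

```python
from statistics import mean, pstdev


def derive_indicators(daily_counts, daily_totals):
    """Return (pct, zscore, delta) lists for one event series.

    Both inputs are lists ordered by date with one entry per day.
    """
    # Share of total events, as a percentage (assumed 0-100 scale).
    pct = [100.0 * c / t if t else None for c, t in zip(daily_counts, daily_totals)]

    # Standard deviations from the series mean (assumed whole-sample baseline).
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    zscore = [(c - mu) / sigma if sigma else 0.0 for c in daily_counts]

    # Day-over-day change; undefined for the first day in this sketch.
    delta = [None] + [b - a for a, b in zip(daily_counts, daily_counts[1:])]
    return pct, zscore, delta


pct, zscore, delta = derive_indicators([10, 20, 30], [100, 100, 100])
print(pct, delta)
```

If the recomputed values do not match the stored columns, the difference most likely lies in the baseline window used for the z-score.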
## Intended Use
This dataset is intended for **academic research** in political science, computational social science, NLP, finance, and related fields. Example use cases:
- Analyzing the relationship between presidential social media activity and market movements
- Studying the timing and framing of aggressive policy announcements
- Discourse analysis and political communication research
- Event-driven analysis correlating posts with GDELT geopolitical indicators
- Accessibility research using AI-generated image descriptions
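
For the event-driven use case, note that GDELT values are daily and therefore repeat on every post made on the same date. Collapsing to one row per day before correlating avoids weighting busy posting days more heavily. A minimal sketch, assuming dict-like rows; `daily_pairs` is a hypothetical helper and the sample values are made up:

```python
from collections import defaultdict


def daily_pairs(rows, flag="cat_threatening_intl", indicator="gdelt_threat_zscore"):
    """Collapse post rows to one (date, flagged_post_count, indicator) per day."""
    counts = defaultdict(int)
    level = {}
    for row in rows:
        day = row["date"]
        counts[day] += 1 if row.get(flag) == 1 else 0
        level[day] = row.get(indicator)  # same value on every row of that date
    # YYYY-MM-DD strings sort chronologically.
    return [(day, counts[day], level[day]) for day in sorted(counts)]


# Made-up rows for illustration only.
sample = [
    {"date": "2025-04-02", "cat_threatening_intl": 1.0, "gdelt_threat_zscore": 1.5},
    {"date": "2025-04-02", "cat_threatening_intl": 0.0, "gdelt_threat_zscore": 1.5},
    {"date": "2025-04-01", "cat_threatening_intl": 1.0, "gdelt_threat_zscore": -0.2},
]
print(daily_pairs(sample))
```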
## Fair Use Notice
This dataset is compiled from publicly available posts by a public figure for academic research purposes under fair use (17 U.S.C. § 107). The data consists of factual records of public political speech. Source data is from the [CNN/Stiles public archive](https://github.com/stiles/trump-truth-social-archive). Market data sourced from Yahoo Finance via yfinance. Geopolitical data from the GDELT Project. Multiple peer-reviewed publications have established precedent for academic use of Truth Social data (see [ICWSM 2023](https://arxiv.org/abs/2303.11240), [arXiv:2411.01330](https://arxiv.org/abs/2411.01330)).
## Citation
If you use this dataset in your research, please cite this dataset and the underlying data sources:
### This dataset
```bibtex
@misc{soria2026trump_truth_social,
title={Trump Truth Social Posts Archive},
author={Soria, Christopher},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/datasets/chrissoria/trump-truth-social}
}
```
### Source data: Truth Social posts
The raw post data is sourced from Matt Stiles' CNN Truth Social archive:
```bibtex
@misc{stiles2024truthsocial,
title={Trump Truth Social Archive},
author={Stiles, Matt},
year={2024},
publisher={CNN},
url={https://github.com/stiles/trump-truth-social-archive}
}
```
### Market data: Yahoo Finance
Stock and ETF price data is sourced from Yahoo Finance via the [yfinance](https://github.com/ranaroussi/yfinance) Python library:
```bibtex
@software{yfinance,
title={yfinance: Download market data from Yahoo! Finance API},
author={Aroussi, Ran},
url={https://github.com/ranaroussi/yfinance},
license={Apache-2.0}
}
```
### Geopolitical events: GDELT Project
Daily geopolitical event aggregates are sourced from the [GDELT Project](https://www.gdeltproject.org/):
```bibtex
@article{leetaru2013gdelt,
title={GDELT: Global Data on Events, Location and Tone, 1979--2012},
author={Leetaru, Kalev and Schrodt, Philip A.},
journal={ISA Annual Convention},
year={2013},
url={https://www.gdeltproject.org/}
}
```
### LLM classification and image descriptions
Post classifications were generated using [cat-stack](https://github.com/chrissoria/cat-stack) with a 5-model ensemble (Llama 4 Maverick, Qwen3-32B, Claude 3 Haiku, GPT-4o-mini, Gemini 2.0 Flash). Image descriptions were generated using Qwen2.5-VL-72B.
```bibtex
@software{soria2026catstack,
title={cat-stack: Domain-agnostic text, image, and PDF classification engine powered by LLMs},
author={Soria, Christopher},
year={2026},
url={https://github.com/chrissoria/cat-stack}
}
```
## Part of the cat-pol ecosystem
This dataset is part of the [cat-pol](https://github.com/chrissoria/cat-pol) political text analysis toolkit. Install with:
```bash
pip install "cat-pol[sources]"
```
```python
from cat_pol.sources import fetch_trump_truths
df = fetch_trump_truths(since="2024-01-01")
```
Provided by:
chrissoria
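Collected summary
Dataset introduction

Construction
In political communication and computational social science, high-quality datasets are the foundation of empirical research. This dataset centers on Donald Trump's public posts on Truth Social and is built by merging several data sources. The raw posts come from the public archive maintained by CNN/Stiles and are updated daily. On top of these, the dataset adds financial market data from Yahoo Finance covering 18 instruments, including the S&P 500 index, sector ETFs, and Trump Media & Technology Group, with daily and intraday prices. It also brings in global event data from the GDELT Project, which provides daily counts of US-involved geopolitical events such as military action, sanctions, and protests. In addition, posts with text published since November 5, 2024 were classified by an ensemble of five large language models (including Llama 4 Maverick and GPT-4o-mini), yielding nine multi-label categories such as attacks, policy enactment, and self-promotion. The result is a time-aligned collection spanning social media text, market prices, and geopolitical events.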
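Features
The dataset's defining features are its multi-dimensional annotation and its time alignment. Beyond post metadata, engagement counts, and media attachment fields, it links each post's timestamp to market prices, with snapshots 1 hour and 5 minutes before and after the post as well as at posting time, at the best intraday resolution available (1-minute, 5-minute, or hourly, depending on the post's age). This supports research on immediate market reactions to political statements. The geopolitical indicators are derived from news coverage and provide daily measures of conflict and cooperation events plus derived statistics (percentage shares, z-scores, and day-over-day changes). The LLM-ensemble category labels let researchers analyze strategies and patterns in political discourse at the semantic level. Because text, market, and event data share one timeline, the dataset is well suited to studying how political statements, market moves, and macro events relate.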
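Usage
In research practice, the dataset serves cross-disciplinary empirical work in political science, finance, and computational social science. Researchers can run event studies that measure the short-term price effect of particular kinds of statements (for example, threats or policy enactments) on related stocks, ETFs, or broad indices. The category labels support quantitative analysis of communication strategy, such as the correlation between attacking posts and market sentiment or geopolitical tension. The GDELT indicators can serve as control variables that place a post in its wider international news context. The dataset can be loaded directly from Hugging Face or through the companion `cat-pol` Python library, and then sliced by date range, category label, or market period.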
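
The direct Hugging Face loading path and the slicing described above can be sketched as follows. This is an illustration, not the `cat-pol` API: `load_rows` and `slice_posts` are hypothetical helper names, and loading requires the `datasets` package and network access. The repository ID and the `train` split come from the dataset card.

```python
def load_rows():
    """Load the train split from the Hugging Face Hub (needs `pip install datasets`)."""
    from datasets import load_dataset  # imported lazily so the filter works offline

    return load_dataset("chrissoria/trump-truth-social", split="train")


def slice_posts(rows, since=None, market_period=None):
    """Filter an iterable of row dicts by start date and/or market period."""
    selected = []
    for row in rows:
        # `date` is YYYY-MM-DD, so string comparison is chronological.
        if since is not None and row["date"] < since:
            continue
        if market_period is not None and row["market_period"] != market_period:
            continue
        selected.append(row)
    return selected


# Example (requires network access):
# posts = slice_posts(load_rows(), since="2025-01-20", market_period="during_market")
```

For heavier analysis, converting the loaded split to a pandas DataFrame with `to_pandas()` and filtering with boolean masks is likely faster than iterating over rows.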
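Background and challenges
Background overview
In the digital era, politicians' social media statements have become a key variable shaping public opinion, financial markets, and even international relations. The Trump Truth Social Posts Archive is maintained by Chris Soria of UC Berkeley and covers posts from February 2022 onward; it systematically tracks Donald J. Trump's public posts on Truth Social. Its core research question is how a political leader's online statements relate to capital market fluctuations and geopolitical events, and it offers research material for political communication, computational social science, and financial econometrics. By combining post text, market time series, and GDELT geopolitical indicators, the dataset supports interdisciplinary empirical research. Its dataset card points to peer-reviewed work that has already used Truth Social data for academic purposes.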
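Current challenges
The dataset targets several challenges in the joint analysis of political text and financial markets. The first is estimating the causal effect of political statements on high-frequency market movements, which requires dealing with market noise and confounding variables. The second is data fusion during construction: ensuring the collected posts are complete, aligning heterogeneous time series from multiple sources, and checking LLM-based classification for labeling bias and consistency. In addition, integrating geopolitical indicators means handling large-scale news stream data in a timely and representative way, so that the derived indicators capture changes in international relations. Together these are the main methodological and engineering difficulties.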
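Common scenarios
Classic use case
At the intersection of political communication and financial markets, the dataset is a natural fit for studying how a political leader's social media statements affect capital markets. By pairing Trump's Truth Social posts with index and single-stock prices at up to 1-minute resolution (coarser for older posts), researchers can examine short-term price movements in related assets before and after a given statement or policy signal. This time-aligned combination of text and market data offers an empirical route to quantifying market reactions to political discourse.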
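Practical applications
In practice, the dataset provides data support for financial risk analysis and policy analysis. The summary suggests that investment firms could use it to analyze how political risk affects assets in particular sectors (such as defense, energy, and technology) and to refine event-driven trading strategies, though the dataset is licensed CC BY-NC 4.0 and intended for academic research. Policy researchers can track the interaction between political agenda-setting and public opinion and assess how policy signals propagate through markets, which provides empirical evidence on the real-world effects of political communication in the digital era.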
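Derived and related work
Related research, based on earlier versions of this dataset or on similar data, has examined the link between presidential tweets and the volatility index (VIX) and the effect of political sentiment on cryptocurrency markets; no specific studies are cited here. The multi-model LLM classification framework used by the dataset (cat-stack) is itself a methodological contribution: it provides a reproducible approach to fine-grained, multi-label automatic classification of political text, relevant to later work on model ensembling and domain adaptation in political NLP.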
The content above was collected and summarized by 遇见数据集.



