Mapping the evolution of cross-Strait relations via GDELT (2014–2023)

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://doi.org/10.7910/DVN/XYC5DY
Description:
This dataset contains the complete code and data for the paper “Mapping the evolution of cross-Strait relations via global news big data (2014–2023): an analysis integrating GDELT and machine learning.” It covers a data analysis based on GDELT and a text analysis of the 1,000 most important news reports on cross-Strait relations. Please read the code and data in conjunction with the manuscript.

The data analysis code includes the following parts:
(1) The global media's portrayal of cross-Strait relations from 2014 to 2023, measured across six dimensions: Attention Index, Balance Index, Impact Index, Tone Index, QuadClass, and EventRootCode.
(2) The distribution of high-impact media outlets reporting on cross-Strait relations and their internal differences, measured using the Deviation Index.
(3) The relationship between the influence and deviation indices of high-impact media outlets, measured through correlation and regression analysis.
(4) How all the figures were generated.

The text analysis code includes the following parts:
(1) LDA topic modeling of the 1,000 news reports. We provide detailed information on the report samples, stop word list, collocations, and data results.
(2) Sentiment analysis of the titles of the 1,000 news reports. We combined two LLMs (ChatGPT and Grok) for sentiment analysis, along with manual verification, to provide a final credibility score.

To address concerns about whether the Deviation Index can be calculated directly as the arithmetic mean of the six dimensions (Attention, Balance, Tone, Impact, QuadClass, and EventRootCode), this paper adds the following steps:
1. PCA analysis (code number 16);
2. Sensitivity analysis (code number 17);
3. Robustness checks that substitute alternative distance and error metrics (MAE and Hellinger distance in place of RMSE and Jensen–Shannon divergence; code numbers 18, 19, and 20).
[Added on 2026-01-26]

We welcome others to reuse and learn from this code, data, and research approach. Please remember to cite our work.
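
For readers unfamiliar with the metrics named in the robustness checks, the following is a minimal sketch of how RMSE, MAE, Jensen–Shannon divergence, and Hellinger distance are computed. It is not the authors' code (see code numbers 18, 19, and 20 in the dataset for that). It assumes, without confirmation from the description, that the error metrics compare numeric index series and that the distance metrics compare categorical distributions such as QuadClass shares. All function names and the sample values are hypothetical.

```python
import math


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def _check_same_length(a, b):
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(a, b):
    """Root mean squared error between two numeric series."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mae(a, b):
    """Mean absolute error between two numeric series."""
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def _kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    _check_same_length(p, q)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def hellinger(p, q):
    """Hellinger distance between two distributions (value lies in [0, 1])."""
    _check_same_length(p, q)
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)


if __name__ == "__main__":
    # Hypothetical event counts over the four GDELT QuadClass categories
    # for one outlet and for a global baseline (illustrative values only).
    outlet = normalize([40, 10, 35, 15])
    baseline = normalize([50, 15, 25, 10])
    print("JS divergence:", js_divergence(outlet, baseline))
    print("Hellinger:    ", hellinger(outlet, baseline))

    # Hypothetical yearly tone values for the same outlet and baseline.
    outlet_tone = [-1.2, -0.8, -2.1]
    baseline_tone = [-1.0, -1.1, -1.7]
    print("RMSE:", rmse(outlet_tone, baseline_tone))
    print("MAE: ", mae(outlet_tone, baseline_tone))
```

Both distance measures are bounded between 0 and 1 under these definitions, which makes them convenient to swap for one another in a robustness check; MAE is less sensitive than RMSE to a few large deviations.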
Created:
2026-01-26