NLP and machine learning to measure peace from news media
Source: DataCite Commons | Updated: 2026-03-14 | Indexed: 2025-04-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.2v6wwpzv6
Description:
“Hate speech” can mobilize violence and destruction. What are
the characteristics of “peace speech” that reflect and support the social
processes that maintain peace? In this study we used a data-driven,
machine learning approach to identify the words most associated
with lower-peace versus higher-peace countries. Logistic regression and
random forest classifiers were trained using five respected, traditional
peace indices: Global Peace Index, Positive Peace Index, World Happiness
Index, Fragile States Index, and Human Development Index. The feature
inputs into the machine learning model were the word frequencies from the
news media in each country and the output classifications were the level
of peace in that country. The machine learning model successfully
classified the level of peace from the news media in a country
(accuracy and F1 both 96% to 100%). We also used that trained
model to create a machine learning peace index that measured the
level of peace in countries, including countries not in the training set,
which correlated with the average of those five traditional peace indices
(r-squared = 0.8349). Using the random forest feature importance method we
found that the words in news media in lower-peace countries were
characterized by words related to government, order, control and fear
(such as government, state, law, security and court), while higher-peace
countries were characterized by an increased prevalence of words related
to optimism for the future and fun (such as time, like, home, believe and
game).
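
The pipeline described above (word frequencies in, peace level out, then a ranking of the most influential words) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four text snippets and their labels are invented rather than drawn from the dataset, the helper names (`word_frequencies`, `train_logistic`, `predict_proba`) are hypothetical, and a small hand-written logistic regression stands in for the study's classifiers. Words are ranked here by learned weight, which is a simpler substitute for the random forest feature importance method named in the description.

```python
import math
from collections import Counter

# Invented toy snippets standing in for news text; NOT the study's data.
# Labels are hypothetical: 0 = lower-peace, 1 = higher-peace.
DOCS = [
    ("government state law security court order", 0),
    ("court law government security state control", 0),
    ("time like home believe game family", 1),
    ("game home time like believe friends", 1),
]

def word_frequencies(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    counts = Counter(text.split())
    total = sum(counts.values())
    return [counts[word] / total for word in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=1.0, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability that a text belongs to the higher-peace class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

vocab = sorted({word for text, _ in DOCS for word in text.split()})
X = [word_frequencies(text, vocab) for text, _ in DOCS]
y = [label for _, label in DOCS]
w, b = train_logistic(X, y)

# Most negative weights pull toward lower-peace, most positive toward higher-peace.
ranked = sorted(zip(vocab, w), key=lambda pair: pair[1])
lower_peace_words = [word for word, _ in ranked[:3]]
higher_peace_words = [word for word, _ in ranked[-3:]]
```

In this sketch, `predict_proba` returns a value between 0 and 1 that could serve as a rough per-text peace score, loosely analogous to the machine learning peace index mentioned in the description. With only four invented snippets, the output says nothing about real countries.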
Provider:
Dryad
Created:
2023-11-15



