A machine learning model for assessing IO legitimation in the media

ICPSR | Updated 2021-01-01 | Indexed 2026-04-16

Download link:

https://www.openicpsr.org/openicpsr/project/152961/version/V1/view


Description:

The media channel is crucial for the legitimation of International Organizations (IOs). This paper applies a machine learning framework to develop a new model of IO legitimation. The model uses Natural Language Processing and regression analysis to quantify intensity, tone and narrative in the media as three critical dimensions of legitimation. The model is applied to a corpus of 1.3 million newspaper articles from six Eurasian post-socialist countries and twelve IOs in the period 2018-2020. It yields four key results. First, and contrary to earlier studies, the tone of articles on IOs is predominantly positive. We document, however, significant differences between countries and IOs. The Russian media contribute to the delegitimation of IOs, while the Polish media engage primarily in their legitimation. The tone of articles featuring the WHO and the IMF is more negative than that of articles on other IOs, after the topic of the narrative is controlled for. Second, articles mentioning influential domestic politicians contribute to the delegitimation of the IOs featured. Third, we show for the first time that the Covid-19 pandemic lowered the tone for most of the IOs analyzed. The impact is most pronounced for the WHO and is related to China’s influence on the WHO. Fourth, we verify the symmetric hypothesis of guilt by association and document that the world powers contribute significantly to the delegitimation of IOs in the post-socialist press in Eurasia. The model developed is easily scalable and can be applied to any number of countries and IOs, provided that sentiment lexicons exist for the languages spoken in those countries.

The dataset analyzed during the current study is called replication_datafile.Rdata and is available at https://www.openicpsr.org/openicpsr/project/152961. All calculations were conducted in R, and the file is saved in the Rdata format. It contains one data frame called datafile with 1,255,294 rows (observations) and 77 columns (variables).
Each observation describes the characteristics of one article from the corpus of 1,255,294 articles. List of variables and their definitions:

sent: sentiment, the number of positive-inclination words minus the number of negative-inclination words, divided by the number of words in the article.
rsent: relative sentiment, the sentiment of the article minus the average sentiment of all articles published in the same newspaper.
usa, china, european_union: one variable for each of the three world powers; equals zero if the world power is not mentioned in the article and 1+log(N) if it is mentioned N times in the article.
un, wto, who, oecd, scun, nato, imf, ebrd, adb, aiib, wb, cc: one variable for each of the twelve IOs; equals zero if the IO is not mentioned in the article and 1+log(N) if it is mentioned N times in the article.
dip_ru, dip_kz, dip_bel, dip_ukr, dip_pl, dip_hu: one variable for each of the six countries; equals zero if no influential domestic politician is mentioned in the article and 1+log(N) if influential domestic politicians are mentioned N times in the article.

Thirteen topic dummies: a topic dummy equals one if an article was classified as discussing this topic, zero otherwise. The LDA topic modelling was conducted for k=30 topics. Topics were grouped by their analytical relevance and semantic similarity into thirteen categories: politics and legislation (POL); economy, finance, various sectors of the economy (ECO); military, war, protests, crime, security threats (MIL); international affairs, specific issues concerning foreign countries (INT); technology (TECH); family issues, culture, sport, education (FAM); regional issues and housing (REG); health issues and the Covid-19 pandemic (HEA); media (MED); accidents (ACC); religion (REL); the Soviet Union (USSR); and articles for which no topic could be determined (MISC).
Twenty newspaper dummies “n_newspaper_COUNTRY”: a newspaper dummy equals one if an article was published in this newspaper, zero otherwise. The twenty media outlets are: iz.ru, kommersant.ru, novayagazeta.ru, vedomosti.ru, informburo.kz, nur.kz, tengrinews.kz, zakon.kz, bdg.by, belgazeta.by, sb.by, kp.ua, segodnya.ua, vesti.ua, gazeta.pl, rp.pl, wpolityce.pl, index.hu, origo.hu, alfahir.hu. Six country dummies “c_COUNTRY”. Twelve month dummies and three year dummies indicate the month and year in which an article was published.
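
The definitions of sent, rsent, and the 1+log(N) mention variables can be illustrated with a minimal Python sketch. The original calculations were conducted in R, so this is not the authors' code. The function names, the exact-match lexicon lookup on pre-tokenized words, the reading of sent as (positive minus negative) divided by the word count, and the use of the natural logarithm are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def sentiment(tokens, positive, negative):
    """Assumed reading of `sent`: (positive - negative) / number of words."""
    if not tokens:
        return 0.0  # assumption: an empty article gets a neutral score
    pos = sum(1 for t in tokens if t in positive)
    neg = sum(1 for t in tokens if t in negative)
    return (pos - neg) / len(tokens)


def relative_sentiment(articles):
    """Assumed reading of `rsent`: sent minus the mean sent of the same newspaper.

    `articles` is a list of (newspaper, sent) pairs; the result keeps the order.
    """
    totals = defaultdict(lambda: [0.0, 0])  # newspaper -> [sum of sent, count]
    for paper, sent in articles:
        totals[paper][0] += sent
        totals[paper][1] += 1
    means = {paper: s / n for paper, (s, n) in totals.items()}
    return [sent - means[paper] for paper, sent in articles]


def mention_score(n):
    """Zero if an entity is not mentioned, 1 + log(N) if mentioned N times.

    The natural logarithm is an assumption; the source only writes log(N).
    """
    if n < 0:
        raise ValueError("mention count must be non-negative")
    return 0.0 if n == 0 else 1.0 + math.log(n)


if __name__ == "__main__":
    # Hypothetical toy lexicons and article, for illustration only.
    tokens = ["good", "good", "bad", "news"]
    print(sentiment(tokens, {"good"}, {"bad"}))
    print(relative_sentiment([("iz.ru", 0.2), ("iz.ru", 0.4), ("rp.pl", 0.1)]))
    print(mention_score(0), mention_score(1), mention_score(3))
```

Under this encoding a single mention scores exactly 1, and each further mention adds less than the previous one, so heavily repeated names do not dominate a regression the way raw counts would.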

Created:

2021-01-01
