Sentiment Analysis outputs based on the combination of three classifiers for news headlines and body text
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6326347
Description:
Sentiment analysis outputs, based on the combination of three classifiers, for news headlines and body text covering the Olympic legacy of Rio 2016 and London 2012. The articles were found via Google search. The dataset consists of sentiment labels assigned to 1271 news articles in total.
News outlets:
BBC
Daily Mail
The Telegraph
The Guardian
Globo
Estadao
Folha de S. Paulo
Events covered by the articles:
London 2012 Olympic legacy
Rio 2016 Olympic legacy
All classifiers were applied to English text. Texts originally published in Portuguese by the Brazilian media were automatically translated into English.
Sentiment classifiers used:
VADER
BERT (trained on Amazon data)
BERT (trained on Twitter data - 140)
Each document (an xlsx spreadsheet) covers one outlet and one event (London 2012 or Rio 2016).
How were labels assigned to the texts?
The labels combine the outputs of the three sentiment classifiers listed above. If at least two of them agreed on a label, that label was taken as the final one. Otherwise, the label 'other' was assigned.
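The voting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset authors' code: the function name `combine_labels` and its parameter names are hypothetical, and it assumes each classifier's output has already been mapped to one of the shared labels (Pos, Neg, Neutral).

```python
from collections import Counter


def combine_labels(vader: str, bert_amazon: str, bert_twitter: str) -> str:
    """Return the label at least two classifiers agree on, else 'other'.

    Hypothetical helper: assumes all three inputs use the same label
    vocabulary (e.g. 'Pos', 'Neg', 'Neutral').
    """
    # most_common(1) yields the single most frequent label and its vote count.
    label, votes = Counter([vader, bert_amazon, bert_twitter]).most_common(1)[0]
    return label if votes >= 2 else "other"


print(combine_labels("Pos", "Pos", "Neg"))      # two classifiers agree
print(combine_labels("Pos", "Neg", "Neutral"))  # all three disagree
```

With three voters, "at least two agree" and "not all three differ" are the same condition, which matches the definition of 'other' given for the label set.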
For article body text, the label for the whole article was derived from the proportion of sentences of each sentiment type rather than by averaging the sentence scores. For example, if more than 50% of the sentences are labeled negative, the article is assigned a negative label.
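A rough sketch of the proportion rule for body text follows. The description only states what happens when one label covers more than 50% of the sentences, so what is returned when no label crosses that threshold is an assumption here, exposed as the hypothetical `fallback` parameter. The function name `article_label` is also illustrative.

```python
from collections import Counter
from typing import Optional, Sequence


def article_label(
    sentence_labels: Sequence[str],
    threshold: float = 0.5,
    fallback: Optional[str] = None,
) -> Optional[str]:
    """Label an article by the share of its sentences carrying each label.

    Assumption: when no label exceeds the threshold (or there are no
    sentences), `fallback` is returned; the source does not specify this case.
    """
    if not sentence_labels:
        return fallback
    # Find the most frequent sentence label and how often it occurs.
    label, count = Counter(sentence_labels).most_common(1)[0]
    # Strictly greater than the threshold, as in "greater than 50%".
    if count / len(sentence_labels) > threshold:
        return label
    return fallback


print(article_label(["Neg", "Neg", "Pos"]))  # 2 of 3 sentences are negative
print(article_label(["Neg", "Pos"]))         # exactly 50%, so no majority
```

Counting sentences instead of averaging scores means a few strongly worded sentences cannot outweigh a larger number of mildly worded ones.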
The documents are composed of the following columns:
Rank: the article's position in the Google search ranking
Date: date of article's publication (DD/MM/YYYY)
Link: article's link
Title: article's title
Sentiment_Title: final sentiment for article headline
Sentiment_Text: final sentiment for article's body text
Note: The documents do not include the articles' body text.
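As an illustration of the column layout, the sketch below turns one spreadsheet row into a typed record using only the Python standard library. It assumes the row has already been read into a dict of strings keyed by the column names above. How the xlsx file is read is left out, and a spreadsheet reader may return the Date cell as a date object rather than a string. The names `ArticleRow` and `parse_row` and the sample values are hypothetical, not taken from the dataset.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class ArticleRow:
    rank: int
    published: date
    link: str
    title: str
    sentiment_title: str
    sentiment_text: str


def parse_row(raw: dict) -> ArticleRow:
    """Convert a dict of string cells into a typed record."""
    return ArticleRow(
        rank=int(raw["Rank"]),
        # Dates are documented as DD/MM/YYYY, so parse day first.
        published=datetime.strptime(raw["Date"], "%d/%m/%Y").date(),
        link=raw["Link"],
        title=raw["Title"],
        sentiment_title=raw["Sentiment_Title"],
        sentiment_text=raw["Sentiment_Text"],
    )


# Made-up sample row for demonstration only.
sample = parse_row({
    "Rank": "3",
    "Date": "05/08/2016",
    "Link": "https://example.com/article",
    "Title": "Example headline",
    "Sentiment_Title": "Neutral",
    "Sentiment_Text": "Neg",
})
print(sample.published.isoformat())
```

Parsing the date explicitly with the day-first format avoids the common mistake of reading 05/08/2016 as May 8 instead of 5 August.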
Sentiment is presented in labels as follows:
Pos: Positive
Neg: Negative
Neutral: Neutral
other: Inconclusive. If each of the three classifiers assigned a different label to the article, the label 'other' was used, so 'other' identifies contradictory results.
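When working with these labels, it can help to expand the abbreviations and tally them. The following is a small sketch under the assumption that the cells contain exactly the four strings listed above. `LABEL_NAMES` and `tally_labels` are hypothetical names, and the input list is made up.

```python
from collections import Counter

# Mapping from the abbreviations used in the spreadsheets to readable names.
LABEL_NAMES = {
    "Pos": "Positive",
    "Neg": "Negative",
    "Neutral": "Neutral",
    "other": "Inconclusive",
}


def tally_labels(labels):
    """Count how often each sentiment occurs, using the expanded names.

    Raises KeyError for any value outside the four documented labels.
    """
    return dict(Counter(LABEL_NAMES[label] for label in labels))


# Made-up labels for demonstration only.
print(tally_labels(["Pos", "Neg", "Pos", "other"]))
```

Failing loudly on an unknown label is a deliberate choice in this sketch, since a silent default could hide spelling or capitalization differences in the cells.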
Created:
2022-03-04



