Sentiment Analysis outputs based on the combination of three classifiers for news headlines and body text

Indexed from NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/6326347
Description:
Sentiment analysis outputs based on the combination of three classifiers for news headlines and body text covering the Olympic legacy of Rio 2016 and London 2012. The articles were found through the Google search engine. The dataset consists of sentiment labels assigned to 1271 news articles in total.

News outlets:
- BBC
- Daily Mail
- The Telegraph
- The Guardian
- Globo
- Estadao
- Folha de S. Paulo

Events covered by the articles:
- London 2012 Olympic legacy
- Rio 2016 Olympic legacy

All classifiers were applied to English text. Text originally published in Portuguese by the Brazilian media was automatically translated.

Sentiment classifiers used:
- Vader
- BERT (trained on Amazon data)
- BERT (trained on Twitter data - 140)

Each document (an xlsx spreadsheet) refers to one outlet and one event (London 2012 or Rio 2016).

How were labels assigned to the texts?
The labels are a combination of the three sentiment classifiers listed above. If two of them agree on a label, that label is taken as the final one. Otherwise, the label 'other' is assigned.

For news article body text, the proportion of sentences of each sentiment type was used to assign a label to the whole article, instead of averaging the sentence scores. For example, if the proportion of sentences with negative labels is greater than 50%, the article is assigned a negative label.

The documents are composed of the following columns:
- Rank: the position of the article in the Google search ranking
- Date: the article's publication date (DD/MM/YYYY)
- Link: the article's link
- Title: the article's title
- Sentiment_Title: final sentiment for the article headline
- Sentiment_Text: final sentiment for the article's body text

PS: The documents do not include the articles' body text.

Sentiment is presented in labels as follows:
- Pos: Positive
- Neg: Negative
- Neutral: Neutral
- other: inconclusive. If each of the 3 classifiers assigned a different label to the article, the label 'other' was used, so 'other' identifies contradictory results.
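The labelling rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `combine_labels` and `article_label` are hypothetical, and two details are assumptions because the description does not state them: an article whose sentences give no label more than 50% falls back to 'other', and an empty article is also labelled 'other'. The description likewise does not say in which order the two steps were composed, so they are shown as independent functions.

```python
from collections import Counter


def combine_labels(vader, bert_amazon, bert_twitter):
    """Return the label that at least two of the three classifiers agree on.

    If all three classifiers disagree, return 'other' (inconclusive).
    """
    label, count = Counter([vader, bert_amazon, bert_twitter]).most_common(1)[0]
    return label if count >= 2 else "other"


def article_label(sentence_labels, threshold=0.5):
    """Label an article from the proportion of its sentence labels.

    A label is assigned when its share of sentences is strictly greater
    than `threshold`. Falling back to 'other' when no label passes the
    threshold, or when there are no sentences, is an assumption.
    """
    if not sentence_labels:
        return "other"
    total = len(sentence_labels)
    for label, count in Counter(sentence_labels).most_common():
        if count / total > threshold:
            return label
    return "other"


if __name__ == "__main__":
    # Two classifiers agree, so their label wins.
    print(combine_labels("Pos", "Pos", "Neg"))
    # Three different labels are treated as contradictory.
    print(combine_labels("Pos", "Neg", "Neutral"))
    # Two of three sentences are negative (about 67%).
    print(article_label(["Neg", "Neg", "Pos"]))
```

Voting on labels rather than averaging scores matches the description's statement that sentence proportions, not mean scores, decide the article label.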
Created:
2022-03-04