SEN - Sentiment analysis of Entities in News headlines
收藏Mendeley Data2024-05-11 更新2024-06-29 收录
下载链接:
https://zenodo.org/records/5211931
下载链接
链接失效反馈官方服务:
资源简介:
If you wish to use this data please cite: Katarzyna Baraniak, Marcin Sydow, A dataset for Sentiment analysis of Entities in News headlines (SEN), Procedia Computer Science, Volume 192, 2021, Pages 3627-3636, ISSN 1877-0509, https://doi.org/10.1016/j.procs.2021.09.136. (https://www.sciencedirect.com/science/article/pii/S1877050921018755) bibtex: users.pja.edu.pl/~msyd/bibtex/sydow-baraniak-SENdataset-kes21.bib SEN is a novel publicly available human-labelled dataset for training and testing machine learning algorithms for the problem of entity level sentiment analysis of political news headlines. On-line news portals play a very important role in the information society. Fair media should present reliable and objective information. In practice there is an observable positive or negative bias concerning named entities (e.g. politicians) mentioned in the on-line news headlines. Our dataset consists of 3819 human-labelled political news headlines coming from several major on-line media outlets in English and Polish. Each record contains a news headline, a named entity mentioned in the headline and a human annotated label (one of “positive”, “neutral”, “negative” ). Our SEN dataset package consists of 2 parts: SEN-en (English headlines that split into SEN-en-R and SEN-en-AMT), and SEN-pl (Polish headlines). Each headline-entity pair was annotated via team of volunteer researchers (the whole SEN-pl dataset and a subset of 1271 English records: the SEN-en-R subset, “R” for “researchers”) or via the Amazon Mechanical Turk service (a subset of 1360 English records: the SEN-en-AMT subset). During analysis of annotation outlying annotations and removed . Separate version of dataset without outliers is marked by "noutliers" in data file name. Details of the process of preparing the dataset and presenting its analysis are presented in the paper. In case of any questions, please contact one of the authors. Email adresses are in the paper.
创建时间:
2024-05-10



