
The ALPIN Sentiment Dictionary: Austrian Language Polarity in Newspapers

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/5857150
Description:
These datasets accompany the paper "The ALPIN Sentiment Dictionary: Austrian Language Polarity in Newspapers", submitted to the LREC 2022 conference. The data sources and the methodology are explained in detail in the research paper, which will be available soon.

ALPIN stands for Austrian Language Polarity in Newspapers. The dictionary consists of three parts, which were merged:

Austrian Media Corpus: AMC (AMC_v1.0.csv)
STANDARD posts: STP (STP_v1.0.csv)
Austriacisms: AUT (AUT_v1.0.csv)

The AMC (Ransmayr et al., 2017) and STP (Schabus et al., 2017) parts rely on the SPLM algorithm as used in SentiDraw (Sharma & Dutta, 2021). The AUT part was generated using Best-Worst Scaling (BWS) (Kiritchenko and Mohammad, 2017b). The AUT word list was collected from the "Variantenwörterbuch des Deutschen" (Ammon et al., 2016), selecting only words that occur in Austrian German and in no other variety of German, and from a Wikipedia list of Austriacisms (https://de.wikipedia.org/wiki/Liste_von_Austriazismen).

The scores are scaled to the interval [-1, 1] using min-max-abs scaling, where negative values indicate negative polarity and positive values indicate positive polarity.

References:

Ammon, U., Bickel, H., & Ebner, J. (2016). Variantenwörterbuch des Deutschen: die Standardsprache in Österreich, der Schweiz, Deutschland, Liechtenstein, Luxemburg, Ostbelgien und Südtirol sowie Rumänien, Namibia und Mennonitensiedlungen. Walter de Gruyter.

Kiritchenko, S., & Mohammad, S. M. (2017b). Capturing reliable fine-grained sentiment associations by crowdsourcing and best-worst scaling.

Ransmayr, J., Mörth, K., & Ďurčo, M. (2017). AMC (Austrian Media Corpus). In Korpusbasierte Forschungen zum österreichischen Deutsch. In Digitale Methoden der Korpusforschung in Österreich (= Veröffentlichungen zur Linguistik und Kommunikationsforschung Nr. 30) (pp. 27–38). Verlag der Österreichischen Akademie der Wissenschaften.

Schabus, D., Skowron, M., & Trapp, M. (2017). One Million Posts: A Data Set of German Online Discussions. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1241–1244. https://doi.org/10.1145/3077136.3080711

Sharma, S. S., & Dutta, G. (2021). SentiDraw: Using star ratings of reviews to develop domain specific sentiment lexicon for polarity determination. Information Processing & Management, 58(1), 102412.
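
The description mentions scaling scores to [-1, 1] but does not spell out the formula. The sketch below shows one common reading of such a scaling: dividing every score by the largest absolute value, which keeps the sign and the zero point intact. This is an assumption and not necessarily the authors' exact procedure; the function name max_abs_scale and the sample values are hypothetical.

```python
def max_abs_scale(scores):
    """Scale scores into [-1, 1] by dividing by the largest absolute value.

    Assumption: this is one plausible reading of "min-max-abs scaling";
    the dataset description does not give the formula.
    """
    largest = max((abs(s) for s in scores), default=0.0)
    if largest == 0:
        # All scores are zero (or the list is empty): nothing to scale.
        return [0.0 for _ in scores]
    return [s / largest for s in scores]


# Hypothetical raw polarity scores, for illustration only.
raw = [-4.0, -1.0, 0.0, 2.0]
scaled = max_abs_scale(raw)
print(scaled)
```

With this kind of scaling, a raw score of zero stays at zero, so the sign of a scaled score can still be read directly as negative or positive polarity.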
Created:
2022-01-27