TRACES Bulgarian Telegram Dataset Annotated with Linguistic Markers of Lies

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/7614293
Description:
This dataset was created within Project TRACES (more information: https://traces.gate-ai.eu/). It contains 8791 anonymized Telegram social media posts written in Bulgarian. The dataset is annotated with general information (named entities, part-of-speech tags, sentence length, etc.) and with specific markers signaling details, and it can be used for general purposes or for building applications that detect lies, manipulation, and disinformation. Note: this dataset is not fact-checked; the social media messages were retrieved via keywords. For fact-checked datasets, see our other datasets. The posts were collected via Telegram Desktop in June-July 2022. Explanations of which fields can be used as markers of lies (or of intentional disinformation) are provided in our forthcoming paper: Irina Temnikova, Silvia Gargova, Ruslana Margova, Veneta Kireva, Ivo Dzhumerov, Tsvetelina Stefanova and Hristiana Nikolaeva (2023). New Bulgarian Resources for Detecting Disinformation. 10th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC'23). Poznań, Poland.
Created:
2024-12-03