TRACES Bulgarian Telegram Dataset Annotated with Linguistic Markers of Lies

Indexed from NIAID Data Ecosystem on 2026-05-02

Download link:

https://zenodo.org/record/7614293

Description:

This dataset was created within Project TRACES (more information: https://traces.gate-ai.eu/). It contains 8791 anonymized Telegram social media posts written in Bulgarian. The posts are annotated with general information (named entities, part-of-speech tags, sentence length, etc.) and with specific markers signaling details. The dataset can be used for general purposes or for building applications that detect lies, manipulation, and disinformation. Note: this dataset is not fact-checked; the social media messages were retrieved via keywords. For fact-checked datasets, see our other datasets. The posts were collected via Telegram Desktop in June-July 2022.

Explanations of which fields can be used as markers of lies (or of intentional disinformation) are provided in our forthcoming paper: Irina Temnikova, Silvia Gargova, Ruslana Margova, Veneta Kireva, Ivo Dzhumerov, Tsvetelina Stefanova and Hristiana Nikolaeva (2023) New Bulgarian Resources for Detecting Disinformation. 10th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC'23). Poznań, Poland.

Created:

2024-12-03
