aletheos-ngo/xenophobia_migrants_telegram
Source: Hugging Face, updated 2026-03-07, indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/aletheos-ngo/xenophobia_migrants_telegram
Resource description:
---
license: mit
task_categories:
- text-classification
language:
- ru
size_categories:
- 1M<n<10M
---
# Anti-Immigrant Narrative Detection in Telegram News
## Dataset Summary
This dataset contains **Telegram messages from major news-oriented Telegram channels**, collected to examine the presence of **negative and anti-immigrant narratives in media reporting** relative to general crime reporting.
The dataset is intended for **studying the prevalence, dynamics, and framing of anti-immigrant messaging in news sources**, rather than detecting hate speech or legal violations.
---
## Motivation and Research Goal
Public discourse on migration can be shaped not only by overt xenophobia, but also by **selective framing**, especially through disproportionate **crime reporting involving immigrants**.
The core research hypothesis motivating this dataset is:
> **The frequency of news reporting on crimes committed by immigrants is higher if an anti-immigrant or xenophobic agenda is being promoted.**
By systematically identifying **negative narrative stances toward migrants**, including neutral-toned crime reporting that contributes to negative collective framing, this dataset enables longitudinal and comparative analysis of migration-related media narratives.
---
## Data Collection and Filtering Pipeline
1. **Source**
   * Messages are collected from **major Telegram news channels**.
   * The dataset focuses on news-style reporting rather than personal conversations.
2. **Thematic Filtering (Migration Relevance)**
   * A **Logistic Regression classifier with TF-IDF features** selects messages likely to concern migration or migrants, as well as crime news in general.
   * Only messages scoring above a predefined confidence threshold are retained for further processing.
3. **Narrative Annotation (Negative Stance Detection)**
   * Filtered messages are then processed with an **LLM-based classifier** using a strict prompt that defines *anti-immigrant attitude*.
   * The task is **not limited to explicit xenophobia**.
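
The thematic filtering step can be illustrated with a small sketch. The card does not document the implementation, the training data, or the threshold value, so everything below is an assumption: the class `TfidfLogReg`, the helper `filter_relevant`, the toy English training texts, and the `0.5` threshold are all hypothetical. The model is written out in plain Python only to show the mechanics; in practice a library such as scikit-learn would be the more usual choice.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (handles Cyrillic and Latin)."""
    return re.findall(r"\w+", text.lower())


class TfidfLogReg:
    """Tiny TF-IDF + logistic regression, written out for illustration."""

    def fit(self, texts, labels, epochs=300, lr=0.5, l2=0.01):
        docs = [tokenize(t) for t in texts]
        n = len(docs)
        doc_freq = Counter(tok for doc in docs for tok in set(doc))
        # Smoothed inverse document frequency per vocabulary token.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
        vectors = [self._vectorize(doc) for doc in docs]
        self.w = dict.fromkeys(self.idf, 0.0)
        self.b = 0.0
        for _ in range(epochs):  # plain batch gradient descent on log loss
            grad_w = dict.fromkeys(self.idf, 0.0)
            grad_b = 0.0
            for vec, y in zip(vectors, labels):
                err = self._sigmoid(self._score(vec)) - y
                grad_b += err
                for tok, val in vec.items():
                    grad_w[tok] += err * val
            for tok in self.w:
                self.w[tok] -= lr * (grad_w[tok] / n + l2 * self.w[tok])
            self.b -= lr * grad_b / n
        return self

    def _vectorize(self, tokens):
        # TF * IDF, L2-normalised; tokens outside the vocabulary are ignored.
        counts = Counter(tok for tok in tokens if tok in self.idf)
        vec = {tok: c * self.idf[tok] for tok, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        return {tok: v / norm for tok, v in vec.items()}

    def _score(self, vec):
        return self.b + sum(self.w[tok] * val for tok, val in vec.items())

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def predict_proba(self, text):
        """Probability that the text is relevant (label 1)."""
        return self._sigmoid(self._score(self._vectorize(tokenize(text))))


def filter_relevant(model, texts, threshold=0.5):
    """Keep only texts whose predicted relevance meets the threshold."""
    return [t for t in texts if model.predict_proba(t) >= threshold]


# Toy English placeholders (assumption); the real data is Russian and far larger.
train_texts = [
    "migrants detained at the border crossing",
    "new work permit rules for labour migrants",
    "police arrested a suspect after a robbery",
    "court sentenced a man for fraud",
    "heavy rain expected in the city tomorrow",
    "football team wins the national cup",
    "central bank keeps the interest rate unchanged",
    "new museum exhibition opens this weekend",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = migration or crime related

model = TfidfLogReg().fit(train_texts, train_labels)
candidates = ["migrants arrested near the border", "rain and football this weekend"]
for text in candidates:
    print(f"{model.predict_proba(text):.2f}  {text}")
print(filter_relevant(model, candidates, threshold=0.5))
```

Raising the threshold trades recall for precision: fewer off-topic messages reach the more expensive LLM annotation step, at the cost of discarding some relevant ones.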
---
## Annotation Task Definition
The annotation task is to determine whether a message expresses a **negative or anti-immigrant narrative stance**.
A message is labeled as *anti-immigrant* if it includes **any of the following**:
* Xenophobia or hostility toward migrants as a group
* Collective blame of migrants for crimes or social problems
* Dehumanization, stereotyping, or fear-mongering
* Calls for exclusion, repression, deportation, or discrimination
* **Reporting of crimes committed by immigrants**, even when presented in a neutral or factual tone
### Important Clarifications
* The task focuses on **narrative framing and stance**, not on legality, factual accuracy, or moral judgment.
* Explicit hate speech is **not required** for a positive label.
* For example, straightforward reporting such as *“An immigrant was arrested for drug trafficking”* is considered relevant, as it contributes to negative framing of migrants as a group.
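
As a rough illustration of how the criteria and clarifications above might be turned into a binary labeling prompt, consider the sketch below. The actual prompt used for annotation is not published in this card; the wording, the `YES`/`NO` answer format, and the helper names `build_prompt` and `parse_label` are assumptions made for this example, and no specific LLM API is implied.

```python
from typing import Optional

# Criteria copied from the annotation task definition in this card.
CRITERIA = [
    "Xenophobia or hostility toward migrants as a group",
    "Collective blame of migrants for crimes or social problems",
    "Dehumanization, stereotyping, or fear-mongering",
    "Calls for exclusion, repression, deportation, or discrimination",
    "Reporting of crimes committed by immigrants, even in a neutral or factual tone",
]


def build_prompt(message: str) -> str:
    """Assemble a hypothetical labeling prompt (not the authors' prompt)."""
    bullet_list = "\n".join(f"- {c}" for c in CRITERIA)
    return (
        "Label the message as anti-immigrant if it includes ANY of the following:\n"
        f"{bullet_list}\n"
        "Judge narrative framing and stance, not legality or factual accuracy. "
        "Explicit hate speech is not required.\n"
        "Answer with exactly one word: YES or NO.\n\n"
        f"Message: {message}"
    )


def parse_label(reply: str) -> Optional[int]:
    """Map a model reply to 1 (anti-immigrant), 0 (not), or None if unparseable."""
    answer = reply.strip().upper().rstrip(".")
    if answer == "YES":
        return 1
    if answer == "NO":
        return 0
    return None


print(build_prompt("An immigrant was arrested for drug trafficking"))
```

Returning `None` for unparseable replies keeps malformed model output from being silently counted as a negative label.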
---
## Intended Uses
* Monitoring trends in anti-immigrant messaging over time
* Studying correlations between crime reporting and political narratives
* Media analysis and narrative research
* Training or evaluating models for stance and narrative detection
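
For trend monitoring, one simple measure is the monthly share of messages labeled anti-immigrant among all retained messages. The sketch below assumes each record carries a `date` and a binary `anti_immigrant` field; these field names are hypothetical, since the card does not document the dataset schema, and the sample records are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_share(records):
    """Return {(year, month): share of messages labeled anti-immigrant}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        key = (rec["date"].year, rec["date"].month)
        totals[key] += 1
        positives[key] += int(rec["anti_immigrant"])
    return {key: positives[key] / totals[key] for key in sorted(totals)}


# Made-up records with hypothetical field names.
sample = [
    {"date": date(2024, 1, 5), "anti_immigrant": 1},
    {"date": date(2024, 1, 20), "anti_immigrant": 0},
    {"date": date(2024, 2, 1), "anti_immigrant": 1},
    {"date": date(2024, 2, 9), "anti_immigrant": 1},
    {"date": date(2024, 2, 14), "anti_immigrant": 0},
    {"date": date(2024, 2, 28), "anti_immigrant": 1},
]

for (year, month), share in monthly_share(sample).items():
    print(f"{year}-{month:02d}: {share:.2f}")
```

A share rather than a raw count is used so that months with more overall posting activity do not appear to show more anti-immigrant messaging by volume alone.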
---
## Limitations
* The dataset reflects **Telegram news ecosystems** and may not generalize to other media platforms.
* Labels are based on **narrative interpretation**, not factual correctness.
* LLM-based annotation may reflect prompt-induced biases and should be validated when used for downstream modeling.
Provider:
aletheos-ngo



