
LGBTQIAphobia dataset (augmented)

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/13756091
Resource description:
Name: LGBTQIAphobia_dataset (augmented)

Description: Labelled dataset of phrases retrieved from different digital sources (X/Twitter, Instagram, TikTok) containing diverse messages directed at the LGBTQIA+ community. It contains 1234 phrases classified as Non-LGBTQIAphobic (0) or LGBTQIAphobic (1).

Language: Spanish
Format: CSV (UTF-8)
Structure: id;phrase;class {0,1}
Purpose: Fine-tuning models that detect language offensive to Spanish or Latin LGBT communities in digital environments.
Sources: X/Twitter, Instagram, TikTok
Size: 20 KB

Ethical considerations: This dataset was created strictly for academic and research purposes. The person targeted by the hate speech has been anonymized, and there is no intention to harm either them or the person who delivered the speech. We prioritize protecting the privacy and confidentiality of vulnerable individuals. To safeguard privacy, we carefully remove any identifying details such as user IDs, phone numbers, and addresses before sharing the data with our annotators. All the data we collect comes from publicly available sources and does not contain any personal or sensitive information that may jeopardize anyone's privacy. We ask researchers to commit to abiding by ethical guidelines so as not to harm individuals unnecessarily.

How was it created?
- Discriminatory phrases directed at the LGBTQIA+ community were first collected from X/Twitter, Instagram and TikTok (197 phrases).
- Three raters labelled the phrases as non-lgbtphobic (0) or lgbtphobic (1).
- Text augmentation was applied using back-translation and random synonym replacement.
- Part of the McGiff, J., & Nikolov, N. S. (2024) dataset was translated into Spanish and added under the CC-BY-4.0 licence.
- The result is 1234 labelled phrases for version 1.0.1 of LGBTQIAphobia_augmented.

Class distribution:
class    number of phrases
0        507
1        727

where class is 0: non-lgbtphobic, 1: lgbtphobic.
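The declared structure (semicolon-separated `id;phrase;class`, UTF-8) can be read with Python's standard `csv` module. The following is a minimal sketch, not the authors' tooling: it assumes the file starts with a header row named `id;phrase;class`, and the helper names (`load_phrases`, `class_distribution`) and the sample rows are hypothetical placeholders rather than records from the dataset.

```python
import csv
import io
from collections import Counter

# Label names taken from the dataset description.
LABELS = {0: "non-lgbtphobic", 1: "lgbtphobic"}


def load_phrases(handle):
    """Read `id;phrase;class` rows into a list of (id, phrase, label) tuples.

    Assumes a header row is present (an assumption, not stated in the source).
    """
    reader = csv.DictReader(handle, delimiter=";")
    rows = []
    for record in reader:
        label = int(record["class"])
        if label not in LABELS:
            raise ValueError(f"unexpected class {label!r} for id {record['id']}")
        rows.append((record["id"], record["phrase"], label))
    return rows


def class_distribution(rows):
    """Count how many phrases fall into each class."""
    return Counter(label for _, _, label in rows)


# Placeholder rows for illustration only; they are not taken from the dataset.
SAMPLE = (
    "id;phrase;class\n"
    "1;frase de ejemplo uno;0\n"
    "2;frase de ejemplo dos;1\n"
    "3;frase de ejemplo tres;1\n"
)

rows = load_phrases(io.StringIO(SAMPLE))
counts = class_distribution(rows)
for label, name in LABELS.items():
    print(f"{label} ({name}): {counts[label]}")
```

To run this against the downloaded file, replace the `io.StringIO(SAMPLE)` handle with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the CSV was saved. On the full version 1.0.1 file the counts should match the class distribution stated above (507 and 727, totalling 1234).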
Created:
2024-12-27