Reddit Comments Dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/xfold/LanguageBiasesInReddit
Description:
This dataset is a corpus of comments from the "/r/TheRedPill" subreddit on Reddit, intended for analyzing gender-related language bias. The comments have been processed with Gensim word2vec to support that analysis, with a focus on gender-biased content in the subreddit. The goal of the task is to discover and categorize language biases.
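
As a rough illustration of how word embeddings can support this kind of bias analysis, the sketch below scores a word by how much closer it sits to one set of target words than to another, using cosine similarity. It is a minimal sketch under stated assumptions, not the repository's method: the vectors are invented 3-dimensional toy values (real word2vec vectors would come from a trained model), and the names `bias_score`, `centroid`, `word_x`, and `word_y` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def bias_score(word_vec, set_a, set_b):
    """Positive if word_vec is closer to set_a's centroid, negative if closer to set_b's."""
    return cosine(word_vec, centroid(set_a)) - cosine(word_vec, centroid(set_b))


# Toy vectors invented for illustration only (assumption, not data from the corpus).
toy_vectors = {
    "she": [1.0, 0.0, 0.0],
    "her": [0.9, 0.1, 0.0],
    "he": [0.0, 1.0, 0.0],
    "him": [0.1, 0.9, 0.0],
    "word_x": [0.8, 0.2, 0.1],
    "word_y": [0.2, 0.8, 0.1],
}

female = [toy_vectors["she"], toy_vectors["her"]]
male = [toy_vectors["he"], toy_vectors["him"]]

score_x = bias_score(toy_vectors["word_x"], female, male)
score_y = bias_score(toy_vectors["word_y"], female, male)
print(f"word_x: {score_x:+.3f}, word_y: {score_y:+.3f}")
```

With embeddings trained on the actual comments, the toy dictionary would be replaced by lookups into the trained model, and the two target sets would be chosen to match the bias being studied.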
Provider:
Pushshift



