mdsajjadullah/explainDepression-social-media-xai
Source: Hugging Face. Updated: 2026-04-22. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/mdsajjadullah/explainDepression-social-media-xai
Description:
A cleaned, balanced, and clinically annotated social media dataset for explainable depression detection research. It combines real Twitter and Reddit posts, adds engineered features from a DSM-5-aligned clinical lexicon, and was used to train a DistilBERT model that reached 96.17% accuracy. The dataset contains 40,770 rows and 12 feature columns, with balanced classes (50% depressed, 50% not depressed). Preprocessing consisted of URL removal, @mention stripping, emoji and non-ASCII character removal, HTML entity cleaning, lowercase normalisation, and random over-sampling to balance the classes.
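The preprocessing steps listed in the description could look roughly like the sketch below. This is not the dataset author's pipeline: the regular expressions, the step order (HTML entities are decoded first here), the function names `clean_post` and `random_oversample`, and the `label` key are all assumptions made for illustration, using only the Python standard library.

```python
import html
import random
import re

# Hypothetical patterns; the dataset's real cleaning rules are not published here.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")  # also drops emoji
WHITESPACE_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Apply the cleaning steps named in the description to one post."""
    text = html.unescape(text)         # HTML entity cleaning, e.g. "&amp;" -> "&"
    text = URL_RE.sub(" ", text)       # URL removal
    text = MENTION_RE.sub(" ", text)   # @mention stripping
    text = NON_ASCII_RE.sub(" ", text) # emoji and non-ASCII removal
    text = text.lower()                # lowercase normalisation
    return WHITESPACE_RE.sub(" ", text).strip()


def random_oversample(rows, label_key="label", seed=0):
    """Duplicate randomly chosen minority-class rows until classes are equal."""
    if not rows:
        return []
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the majority size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    print(clean_post("Check THIS out @bob https://t.co/abc I&amp;m sad"))
    toy = [{"text": "a", "label": 0}] * 3 + [{"text": "b", "label": 1}]
    print(len(random_oversample(toy)))
```

Because random over-sampling repeats rows, a balanced file built this way contains exact duplicates; splitting into train and test sets before over-sampling avoids the same post appearing on both sides.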
Provider:
mdsajjadullah



