
Urdu Social Media Dataset for Depression Detection

Source: IEEE. Indexed: 2026-04-17
Download link:
https://ieee-dataport.org/documents/depression-dataset
Description:
To address the need for a comprehensive dataset for depression detection in Urdu, I adapted an existing dataset originally compiled for the same task in Russian. That dataset was built from VKontakte (VK), a prominent social network in Russia and other CIS countries, and offers a valuable collection of user-generated content showing how depression is expressed within that demographic. The researchers whose work I drew this data from collected posts through the VK API. The posts were public and contained words referring to depressive states; for instance, the search phrases included "suicide," "I don't want to live," and "I want to die." The choice of keywords was therefore strategic and is central to the dataset's relevance to this investigation. The most important step in building the dataset was the careful categorization of all collected posts into two classes: depressive and non-depressive. This categorization was done by a team of psychologists, adding a layer of human expertise to ensure proper labeling and validity of each item. The data contains a total of 64,039 items, split almost evenly: 32,018 depressive and 32,021 non-depressive. This balance matters for both training and evaluating machine learning models, because it avoids bias toward one class or the other. To adapt this resource to my research on Urdu-speaking populations, I converted the entire dataset into Urdu. The translation was carried out with the Google Translate API, converting all Russian posts into Urdu while preserving their original meaning and sentiment.
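As a quick check of the class balance quoted in the description, the following minimal Python sketch computes each class's share and the imbalance ratio. Only the two counts come from the description; the label names `depressive` and `non_depressive` are placeholders chosen for illustration and may not match the labels in the released files.

```python
# Class counts as stated in the dataset description.
# The label names are hypothetical; the released files may use different ones.
counts = {"depressive": 32_018, "non_depressive": 32_021}

total = sum(counts.values())

# Share of each class in the whole dataset.
shares = {label: n / total for label, n in counts.items()}

# Imbalance ratio: size of the largest class divided by the smallest.
# A value close to 1.0 means the classes are effectively balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(f"total items: {total}")
for label, share in shares.items():
    print(f"{label}: {share:.4%}")
print(f"imbalance ratio: {imbalance_ratio:.5f}")
```

The two counts sum to the stated total of 64,039, and the classes differ by only three items, so plain accuracy is a reasonable headline metric for models trained on this data.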
Provided by:
Muhammad Haris Javed