Arabic Depression Tweets Dataset (15,000 Tweets) with Linguistic Augmentation
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://doi.org/10.7910/DVN/UWLHRI
Description:
This dataset contains 15,000 Arabic tweets annotated for depression detection and includes linguistic feature augmentations to support research in natural language processing (NLP), sentiment analysis, and mental health detection. The dataset was curated to enable studies on automatic depression detection in Arabic social media and to support machine learning and deep learning approaches in the domain of computational mental health.

Contents
The dataset consists of the following columns:
tweet: The original Arabic tweet text.
label: Binary label indicating whether the tweet expresses signs of depression (1 = Depression, 0 = Non-depression).
negation_flag: Indicates presence (1) or absence (0) of negation in the tweet.
intensifier_flag: Indicates presence (1) or absence (0) of intensifiers (words that strengthen the degree of emotion).
Class: Textual label corresponding to the binary label (Depression or Non-depression). Redundant, but included for convenience.
Binary Classification: Contains the count of instances in each class. It appears as an artifact in the provided file.

Key Features
Language: Arabic (varied dialects and Modern Standard Arabic).
Source: Publicly available tweets collected from Twitter (X).
Annotation: Manual labeling by native Arabic speakers trained in psychology and linguistics.
Linguistic augmentation: Flags for negation and intensifier usage are included to support linguistically informed NLP models.

Potential Use Cases
Depression detection models for Arabic texts.
Linguistic analysis of depression expression in Arabic social media.
Cross-lingual studies comparing depression signals across languages.
Development of clinical decision support systems leveraging social media data.

Licensing & Ethical Considerations
The dataset consists of public social media posts. Researchers are advised to use it strictly for research purposes, respecting privacy and ethical guidelines. No personally identifiable information (PII) is included.
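
The column layout described under Contents can be sketched as a small loader. This is a minimal illustration, not the dataset authors' tooling: it assumes the file is distributed as a CSV whose header uses exactly the column names listed above, and the function name `load_rows` and the two sample rows are hypothetical placeholders rather than real tweets. The sketch keeps the five documented columns, discards the "Binary Classification" artifact column, and checks that the textual Class value agrees with the binary label.

```python
import csv
import io

# Column names as listed in the dataset description (exact header spelling is an assumption).
EXPECTED = ["tweet", "label", "negation_flag", "intensifier_flag", "Class"]
LABEL_TEXT = {1: "Depression", 0: "Non-depression"}


def load_rows(handle):
    """Read rows from a CSV file handle, keeping only the documented columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for raw in reader:
        row = {
            "tweet": raw["tweet"],
            "label": int(raw["label"]),
            "negation_flag": int(raw["negation_flag"]),
            "intensifier_flag": int(raw["intensifier_flag"]),
            "Class": raw["Class"].strip(),
        }
        # The textual Class column should agree with the binary label.
        if LABEL_TEXT.get(row["label"]) != row["Class"]:
            raise ValueError(f"label/Class mismatch: {row}")
        rows.append(row)
    return rows


# Tiny made-up sample standing in for the real file (placeholder text, not real tweets).
SAMPLE = (
    "tweet,label,negation_flag,intensifier_flag,Class,Binary Classification\n"
    "placeholder text A,1,1,0,Depression,\n"
    "placeholder text B,0,0,1,Non-depression,\n"
)

rows = load_rows(io.StringIO(SAMPLE))

# Recompute the per-class counts instead of relying on the artifact column.
counts = {}
for r in rows:
    counts[r["Class"]] = counts.get(r["Class"], 0) + 1
print(counts)
```

For the real file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`. The path and the encoding are assumptions to verify against the downloaded file, and rows with blank label or flag cells would need handling before the integer conversion.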

Citation
If you use this dataset, please cite it appropriately in your research publications and acknowledge the creators.
Created:
2025-06-12



