Depression Indicators in Twitter

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://data.mendeley.com/datasets/s25h5tzgyf

Description:

The dataset was created to identify relevant features for detecting individuals with depression based on their Twitter posts. It consists of 3,758 tweets and 5,902 unique words, structured in a binary matrix format where each row represents a tweet and each column represents a word. The values indicate the presence (1) or absence (0) of a word in a given tweet. In addition to textual data, the dataset incorporates nontextual features, stored in a separate table. Each row represents a tweet, and each column corresponds to a specific attribute: the number of likes, retweets, mentions, and the time of publication, as well as the device used for posting. The posting time was transformed into a numerical format ranging from 0 to 47, where each value represents a 30-minute interval throughout the day. In contrast, the device type is stored as raw text containing the name of the device used to post each tweet. The numerical values (likes, retweets, and mentions) were also kept as raw counts, preserving their original scale for further analysis. This dataset was used in the study "Characteristics for depression detection using Twitter data" (DOI: 10.59681/2175-4411.v16.iEspecial.2024.1319).
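
The two encodings described above (the binary tweet-by-word matrix and the 30-minute posting-time slots) can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the helper names `time_to_slot` and `binary_matrix`, the whitespace tokenization, the sample tweets, and the assumption that slot 0 begins at midnight are all illustrative choices not stated in the dataset description.

```python
from datetime import time


def time_to_slot(t: time) -> int:
    """Map a time of day to a 30-minute slot index in 0..47.

    Assumption: slot 0 covers 00:00-00:29, slot 1 covers 00:30-00:59, and so on.
    """
    return t.hour * 2 + t.minute // 30


def binary_matrix(tweets, vocabulary):
    """Build one row per tweet and one column per vocabulary word.

    A cell is 1 if the word occurs in the tweet and 0 otherwise.
    Assumption: tweets are lowercased and split on whitespace.
    """
    rows = []
    for text in tweets:
        tokens = set(text.lower().split())
        rows.append([1 if word in tokens else 0 for word in vocabulary])
    return rows


# Hypothetical sample tweets, not taken from the dataset.
tweets = [
    "feeling tired again",
    "great day with friends",
    "tired of feeling alone",
]

# Vocabulary = sorted set of unique words across all tweets.
vocabulary = sorted({word for text in tweets for word in text.lower().split()})
matrix = binary_matrix(tweets, vocabulary)

for text, row in zip(tweets, matrix):
    print(row, text)

print(time_to_slot(time(14, 45)))
```

In the published dataset the same layout would have 3,758 rows and 5,902 columns, with the slot value stored alongside likes, retweets, mentions, and device name in the separate nontextual table.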

Created:

2025-03-05
