Datasets for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings

NIAID Data Ecosystem, indexed 2026-03-14

Download link:

https://zenodo.org/record/7095099


Resource summary:

Introduction

These datasets were used to train and evaluate our model for explainable depression detection on Twitter aided by metaphor concept mappings, proposed in the following paper:

Sooji Han, Rui Mao, and Erik Cambria. "Hierarchical Attention Network for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings." In Proceedings of the 29th International Conference on Computational Linguistics (COLING), 2022. In press.

Source code for our model is available at github.com/soojihan/HAN. These datasets were generated from the dataset proposed in Shen et al., 2017. The original dataset is available at github.com/sunlightsgy/MDDL.

Description

There are three datasets.

1. mdl_HAN: This dataset contains tweets and metaphor concept mappings (MCMs) for 5,899 positive (i.e., depressed) and 4,469 negative users. Tweets in this dataset are extracted from the original MDDL dataset (Shen et al., 2017). MCMs were extracted using MetaPro (Mao et al., 2022). Please refer to our paper for more details. The name of each subfolder under the 'positive' and 'negative' subfolders is the Twitter user ID. Each user's folder contains one or two JSON files:

[userid].json: contains tweet text objects, each represented as [timestamp, tweet text].

[userid]_cm.json: contains MCMs, each represented as [timestamp, MCM].

Note that some users do not have MCMs. There is no [userid]_cm.json file in those users' folders.

2. imdl_HAN: This dataset has the same contents and structure as mdl_HAN, except that explicit linguistic cues for depression (i.e., “I’m/I was/I am/I’ve been diagnosed depression” and words containing “depress”, “diagnos”, “anxiety”, “bipolar” and “disorder”) were removed from all tweets.

3. sampled_training_eval_data: This dataset contains five randomly sampled cross-validation sets. Each of the five sets contains train.csv, test.csv and dev.csv. Each CSV file contains user IDs.
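The folder layout described for mdl_HAN and imdl_HAN can be read with a few lines of Python. The following is a minimal sketch, not code from the authors' repository. It assumes that each JSON file holds a top-level list of [timestamp, value] pairs and that the dataset root contains the 'positive' and 'negative' subfolders directly. The function names load_user and load_dataset and the 1/0 label encoding are hypothetical choices made for illustration.

```python
import json
from pathlib import Path


def load_user(user_dir):
    """Load one user's tweets and, if present, their MCMs.

    Assumption: each JSON file is a list of [timestamp, value] pairs.
    """
    user_dir = Path(user_dir)
    user_id = user_dir.name  # the folder name is the user ID

    with open(user_dir / f"{user_id}.json", encoding="utf-8") as f:
        tweets = json.load(f)

    # Some users have no MCMs, so the _cm.json file may be missing.
    mcms = []
    cm_path = user_dir / f"{user_id}_cm.json"
    if cm_path.exists():
        with open(cm_path, encoding="utf-8") as f:
            mcms = json.load(f)

    return {"user_id": user_id, "tweets": tweets, "mcms": mcms}


def load_dataset(root):
    """Yield (label, user_record) pairs; label is 1 for positive, 0 for negative.

    The integer label encoding is an illustrative choice, not part of the data.
    """
    root = Path(root)
    for folder_name, label in (("positive", 1), ("negative", 0)):
        label_dir = root / folder_name
        if not label_dir.is_dir():
            continue
        for user_dir in sorted(p for p in label_dir.iterdir() if p.is_dir()):
            yield label, load_user(user_dir)
```

Calling list(load_dataset("mdl_HAN")) on an extracted copy would then give one record per user, with an empty "mcms" list for users whose folder has no [userid]_cm.json file. The path "mdl_HAN" is a placeholder for wherever the archive was unpacked.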
Remarks

If you use the datasets, please cite our paper:

Sooji Han, Rui Mao, and Erik Cambria. "Hierarchical Attention Network for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings." In Proceedings of the 29th International Conference on Computational Linguistics (COLING), 2022. In press.

Contact

If you have any questions about the datasets or the source code for our model, please contact Sooji Han.

References

Guangyao Shen, Jia Jia, Liqiang Nie, Fuli Feng, Cunjun Zhang, Tianrui Hu, Tat-Seng Chua, and Wenwu Zhu. 2017. Depression detection via harvesting social media: A multimodal dictionary learning solution. In IJCAI, pp. 3838-3844.

Rui Mao, Xiao Li, Mengshi Ge, and Erik Cambria. 2022. MetaPro: A computational metaphor processing model for text pre-processing. Information Fusion, 86-87:30-43.

Created:

2022-09-20
