lung_data.xlsx
Source: DataCite Commons. Updated 2025-11-14; indexed 2026-04-25.
Download link:
https://figshare.com/articles/dataset/coloncancer_top_1000_updated_04052024/30509012/2
Description:
This dataset contains a curated collection of publicly available Reddit posts and comment threads related to lung cancer, collected from multiple cancer-focused subreddits. It includes discussions authored by patients, caregivers, medical professionals, and community members, drawn from r/lungcancer, r/nsclc, r/cancer, r/cancercaregivers, r/cancerfamilysupport, and related communities. The dataset spans posts from 2019 to 2024.

Each entry captures both the original post and its associated top-level comments, along with structured metadata describing user role, cancer stage, treatment type, treatment intent, symptom discussions, emotional support themes, and other communication characteristics. Variables include subreddit, post date, engagement metrics, full post text, up to three comments, and an aggregated text field combining post and comments.

A set of manually coded variables provides detailed annotation of clinical and psychosocial attributes. These include:

- Author type (patient, caregiver, medical professional, other)
- Cancer stage (I–IV, recurrence, NED, or unknown)
- Treatment perceptions (curative vs. palliative intent)
- Treatment modality (chemotherapy, radiation, surgery, immunotherapy, targeted therapy, or unknown)
- Treatment completion status
- Question-asking behavior (e.g., side effects, recurrence, clinical trial eligibility, emotional support)
- Experience sharing
- Presence of external links
- Image inclusion
- Symptom validation seeking
- Post-treatment concerns
- Thematic categorization (clinical, non-clinical, or other)

This dataset enables research on online health communication, patient information needs, treatment decision-making, caregiver burden, emotional coping, and digital support networks in the context of lung cancer. It is suitable for qualitative, quantitative, computational linguistic, and machine-learning analyses.
All content is sourced from publicly accessible Reddit pages and has been fully anonymized to remove usernames and personally identifiable information.
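
As a rough illustration of the row structure described above (a post, up to three comments, an aggregated text field, and coded variables), the sketch below rebuilds a combined text field and tallies one coded variable. The column names (`post_text`, `comment_1` to `comment_3`, `author_type`, `combined_text`) are assumptions made for this example, since the listing does not give the actual headers in lung_data.xlsx, and the sample rows are invented.

```python
from collections import Counter

# Hypothetical column names: the listing does not state the real headers,
# so adjust these after inspecting the spreadsheet.
POST_COL = "post_text"
COMMENT_COLS = ["comment_1", "comment_2", "comment_3"]
AUTHOR_COL = "author_type"


def aggregate_text(row):
    """Join the post and its non-empty comments (up to three) with newlines."""
    parts = [row.get(POST_COL)] + [row.get(col) for col in COMMENT_COLS]
    # Skip anything that is not a non-blank string (None, NaN, "").
    return "\n".join(p.strip() for p in parts if isinstance(p, str) and p.strip())


def count_by(rows, column):
    """Tally a coded variable, treating missing or blank values as 'unknown'."""
    counts = Counter()
    for row in rows:
        value = row.get(column)
        if isinstance(value, str) and value.strip():
            counts[value.strip()] += 1
        else:
            counts["unknown"] += 1
    return counts


# Invented rows for illustration only; not taken from the dataset.
rows = [
    {"post_text": "Post A", "comment_1": "Reply 1", "comment_2": None,
     "comment_3": None, "author_type": "patient"},
    {"post_text": "Post B", "comment_1": "Reply 1", "comment_2": "Reply 2",
     "comment_3": "", "author_type": "caregiver"},
    {"post_text": "Post C", "author_type": None},
]

for row in rows:
    row["combined_text"] = aggregate_text(row)

author_counts = count_by(rows, AUTHOR_COL)
```

With pandas and openpyxl installed, rows of this shape could be obtained with `pandas.read_excel("lung_data.xlsx").to_dict("records")`; empty cells would then arrive as NaN, which both helpers treat as missing.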
Provider:
figshare
Created:
2025-11-14



