Haodf Doctor Recommendation Dataset
Source: DataCite Commons. Updated: 2025-01-22. Indexed: 2025-04-16.
Download link:
https://ieee-dataport.org/documents/haodf-doctor-recommendation-dataset
Description:
We collected patient-doctor interaction data from the Haodf online consultation platform for six common diseases, categorized by risk level. Low-risk diseases include Common Cold (Cold) and Pneumonia (Pneu.), medium-risk diseases include Diabetes (Diab.) and Depression (Depr.), and high-risk diseases include Coronary Heart Disease (CHD) and Lung Cancer (Lung.). We use only publicly accessible data, and all patients and doctors remain anonymous, which protects their privacy. To further evaluate how effectively the most relevant doctors are identified for a patient's symptoms, we also collected disease tags t for each patient suffering from disease x. These tags describe the patient's condition in more detail, allowing for more precise treatment matching. For instance, the tag Viral Pneumonia is a more specific category under the broader category of Pneumonia. Similarly, Malignant Tumor is a detailed tag used for patients diagnosed with Lung Cancer. Note that the disease tags are used solely for evaluation and are not involved in any training process. Detailed statistics of the dataset are provided in Table I. For the dataset split, we divided each doctor's consultation records into training, validation, and test sets in a ratio of 8:1:1.
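The per-doctor 8:1:1 split described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the `(doctor_id, case)` record layout, the function name `split_per_doctor`, the shuffling, the seed, and the way leftover cases are assigned are all assumptions, since the description does not specify them.

```python
import random
from collections import defaultdict


def split_per_doctor(records, ratios=(8, 1, 1), seed=0):
    """Split consultation records into train/valid/test per doctor.

    `records` is an iterable of (doctor_id, case) pairs. This schema is
    hypothetical; the real dataset files may be organized differently.
    """
    # Group cases by doctor so the ratio is applied to each doctor separately.
    by_doctor = defaultdict(list)
    for doctor_id, case in records:
        by_doctor[doctor_id].append(case)

    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)
    total = sum(ratios)
    splits = {"train": [], "valid": [], "test": []}

    for doctor_id in sorted(by_doctor):
        cases = by_doctor[doctor_id]
        rng.shuffle(cases)
        n = len(cases)
        # Integer division; any leftover cases fall into the test set (assumption).
        n_train = n * ratios[0] // total
        n_valid = n * ratios[1] // total
        splits["train"] += [(doctor_id, c) for c in cases[:n_train]]
        splits["valid"] += [(doctor_id, c) for c in cases[n_train:n_train + n_valid]]
        splits["test"] += [(doctor_id, c) for c in cases[n_train + n_valid:]]
    return splits
```

Splitting within each doctor, rather than over the pooled records, keeps every doctor represented in all three sets whenever that doctor has enough cases. With very few cases per doctor the validation set can be empty for that doctor under this rounding choice.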
Provider:
IEEE DataPort
Created:
2025-01-22
Collected summary
Dataset introduction

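Background and challenges
Background overview
This dataset collects patient-doctor interaction data for six common diseases from the Haodf online consultation platform, categorized by risk level. It contains doctor metadata and interaction files and is intended for evaluating doctor recommendation. The data is anonymized to protect the privacy of patients and doctors.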
The content above was collected and summarized by 遇见数据集.



