Medical and Health Disorder Screening Data
Source: 浙江省数据知识产权登记平台 (Zhejiang Data Intellectual Property Registration Platform); updated 2024-09-05; indexed 2024-09-06
Download link:
https://www.zjip.org.cn/home/announce/trends/58476
Resource description:
This dataset is built by collecting records from mental health consultations and passing them through data processing and refinement workflows that turn the annotated data into a high-quality, accurately annotated training set for disorder screening models. The data comprise corpora from the enterprise's self-developed applications and from manual counseling records.
First, the collected data are annotated manually, with a review mechanism in place to ensure annotation accuracy and consistency. Next, a Retrieval-Augmented Generation (RAG) knowledge base containing disorder screening information is built and used to retrieve similar intents and scenarios during model training and inference. A Transformer-based mental health large language model (LLM) then uses this knowledge base to select matching screening results, and hyperparameter tuning and model optimization are applied to improve its accuracy and robustness.
The trained disorder screening model can accurately identify disorders such as depression and anxiety disorders and provide screening results. It can be applied in scenarios including social media monitoring, mental health hotlines, online psychological counseling, and school mental health management, to assist mental health professionals with intervention and treatment.
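The retrieval step described above can be illustrated with a minimal sketch. The source does not say which retriever the RAG knowledge base uses, so this example substitutes bag-of-words cosine similarity over a tiny in-memory list. The sample entries, the field names (`id`, `text`, `label`), and the `retrieve` function are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical knowledge-base entries; the real knowledge base is not published here.
KNOWLEDGE_BASE = [
    {"id": "kb-1", "text": "persistent low mood and loss of interest", "label": "depression"},
    {"id": "kb-2", "text": "excessive worry and restlessness", "label": "anxiety"},
    {"id": "kb-3", "text": "difficulty falling asleep at night", "label": "sleep"},
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems use embeddings)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, knowledge_base, top_k=2):
    """Return up to top_k entries most similar to the query, skipping zero-similarity ones."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(entry["text"]))), entry)
        for entry in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:top_k] if score > 0]
```

Calling `retrieve("constant worry and restlessness at work", KNOWLEDGE_BASE, top_k=1)` returns the `kb-2` entry, whose label could then be passed to the model as retrieved context.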
The detailed construction steps are as follows:
(1) Data Source: The original data comes from corpora gathered through the enterprise's own channels (the 连小信 app, the 洞见人和 official website, and manual counseling records).
(2) Data Processing and Annotation: The collected corpora are cleaned and standardized to guarantee data quality. Manual annotation is performed on the data, paired with a review mechanism to ensure annotation accuracy and consistency.
(3) Construction of Mental Health Knowledge Graph: A mental health knowledge graph containing disorder screening-related information is constructed, which aids in identifying disorder-related intents and scenarios during model training and inference.
(4) Selection of Deep Learning Architecture: A deep learning architecture suited to text processing is selected, specifically a Transformer-based mental health large language model.
(5) Model Training: The deep learning model is trained on the annotated dataset using supervised learning so that it learns to recognize disorder intents. Cross-validation and several performance metrics (such as accuracy and recall) are used to evaluate the model's recognition ability.
(6) Hyperparameter Tuning: Hyperparameters such as learning rate, batch size, and number of network layers are tuned to optimize model performance.
(7) Model Optimization and Validation: Based on the evaluation results, optimization measures such as model pruning and regularization are implemented. The model's performance is validated on an independent test set to ensure it performs well on unseen data.
(8) Generation of Disorder Screening Results: The trained model and the mental health knowledge graph are combined with evidence-based rules to make the screening judgment and generate the corresponding screening results.
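Step (5) mentions cross-validation with accuracy and recall. A minimal sketch of such an evaluation loop is given below, assuming contiguous (unshuffled) folds and string labels. `majority_baseline` is a hypothetical placeholder that stands in for the Transformer-based model, which is not available here; all function names are illustrative.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        return 0.0
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred, positive):
    """True positives divided by all actual positives for the given label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def majority_baseline(train_texts, train_labels):
    """Placeholder 'model': always predicts the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda text: most_common


def cross_validate(texts, labels, k, fit, positive):
    """Train on k-1 folds, evaluate on the held-out fold, return per-fold (accuracy, recall)."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        predict = fit([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        y_true = [labels[i] for i in test_idx]
        y_pred = [predict(texts[i]) for i in test_idx]
        scores.append((accuracy(y_true, y_pred), recall(y_true, y_pred, positive)))
    return scores
```

A real model would replace `majority_baseline` with a function that fine-tunes on the training fold and returns a prediction callable; the split and metric code would stay the same.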
Provider:
浙江连信科技有限公司 (Zhejiang Lianxin Technology Co., Ltd.)
Created:
2024-08-09
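Collected Summary
Dataset Introduction

Features
Medical and Health Disorder Screening Data is an enterprise dataset of 570 records, updated daily, used to train disorder screening models. The data cover user input, disorder manifestations, and treatment suggestions, and are applied in mental health management and other scenarios.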
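The three kinds of information listed above suggest a simple record shape. The sketch below is only an illustration: the actual file format and field names are not published on this page, so the JSON-lines layout and the names `user_input`, `disorder_representation`, and `treatment_suggestion` are assumptions.

```python
import json
from dataclasses import dataclass

# Hypothetical field names; the published schema is not given in the source.
REQUIRED_FIELDS = ("user_input", "disorder_representation", "treatment_suggestion")


@dataclass(frozen=True)
class ScreeningRecord:
    user_input: str
    disorder_representation: str
    treatment_suggestion: str


def parse_record(line: str) -> ScreeningRecord:
    """Parse one JSON line into a record, rejecting missing or empty fields."""
    raw = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if not str(raw.get(name) or "").strip()]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    return ScreeningRecord(**{name: str(raw[name]).strip() for name in REQUIRED_FIELDS})
```

Validating each record on load makes it easier to catch incomplete entries before they reach annotation review or training.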
The content above was collected and summarized by 遇见数据集.



