Kidney Disease Specialized Dataset (肾病专病数据集)

Name: Kidney Disease Specialized Dataset (肾病专病数据集)
Creator: 天津健康医疗大数据有限公司
Published: 2024-11-11 09:58:41
License: No description available

Source: 天津市数据知识产权登记平台 (Tianjin Data Intellectual Property Registration Platform). Updated 2024-11-11. Indexed 2024-11-25.

Download link:

https://dengji.tjippc.cn/xxgg_nr?id=2e4ec07d-3a45-46de-a8d1-6bad10387b4f

Resource description:

Specialized disease diagnosis name classification model: A diagnosis database is built by analyzing medical literature, clinical data, and expert knowledge. After preprocessing (word segmentation and random shuffling), the model is trained with the train_supervised function (200 iterations, learning rate 0.1, word N-gram length 1, loss function "hs"). Performance is evaluated with the classification_report method and is reported as good. For parameter updates, commands synchronize the model, labels, and label names, so that specialized disease types can be diagnosed quickly and accurately.

Electronic medical record quality control classification model: This model uses natural language processing to identify and analyze text in electronic medical records, such as the chief complaint, history of present illness, and past medical history, extracting key information and classifying it. The data comprise 7 categories with 250 samples each. Data processing includes labeling, word segmentation, and conversion to TXT files. The BERT tokenizer converts the medical record text into the input format BERT requires, and the quality control labels are converted to numeric labels. The training and test sets are split 9:1. Training uses the BertForSequenceClassification model, and evaluation uses the classification_report method. To update parameters, the data are placed in the designated folder and the training and update commands are run, which keeps the model, labels, and label names in sync.
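The data preparation steps described above can be sketched in Python. This is a minimal sketch, not the provider's code, and it rests on several assumptions. The description does not name the library behind train_supervised; the parameter names match fastText's Python API, where "hs" selects hierarchical softmax, so fastText is assumed here. The helper names, the example tokens and labels, and the fixed random seed are all hypothetical. The sketch stops before BERT tokenization and model training, which need the transformers package and pretrained weights.

```python
import random

# Hyperparameters quoted in the description, mapped onto the keyword names
# used by fastText's train_supervised (assumed library, see lead-in).
FASTTEXT_PARAMS = {"epoch": 200, "lr": 0.1, "wordNgrams": 1, "loss": "hs"}


def to_fasttext_lines(samples, seed=0):
    """Format (tokens, label) pairs as '__label__<label> tok tok ...' lines,
    then shuffle them, mirroring the segment-and-shuffle preprocessing."""
    lines = [f"__label__{label} {' '.join(tokens)}" for tokens, label in samples]
    random.Random(seed).shuffle(lines)
    return lines


def train_diagnosis_model(train_path):
    """Train a fastText classifier on a prepared file.

    fasttext is imported lazily so the rest of this sketch runs without it.
    """
    import fasttext
    return fasttext.train_supervised(input=train_path, **FASTTEXT_PARAMS)


def encode_labels(labels):
    """Map each distinct label string to an integer id (sorted for stability)."""
    label_to_id = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [label_to_id[name] for name in labels], label_to_id


def split_train_test(items, train_ratio=0.9, seed=0):
    """Shuffle a copy of items and split it at train_ratio (9:1 by default)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical quality control labels: 7 categories, 250 samples each.
    labels = [f"class_{i}" for i in range(7) for _ in range(250)]
    ids, mapping = encode_labels(labels)
    train, test = split_train_test(ids)
    print(len(mapping), len(train), len(test))
```

With 7 categories of 250 samples, the 9:1 split leaves 1,575 training and 175 test samples. The split here is a plain shuffle; a stratified split would keep the per-category proportions equal in both sets, and the description does not say which was used.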

Provider:

天津健康医疗大数据有限公司

Created:

2024-11-05

Compiled summary

Dataset introduction

Features

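The Kidney Disease Specialized Dataset is provided by 天津健康医疗大数据有限公司 and contains 410,000 records, updated monthly. Its data structure covers 20 fields, including admission date, diagnosis name, drug name, and laboratory test item. The dataset is suitable for medical, teaching, and research use, mainly for studies of diagnosis and treatment patterns and for pharmacoeconomic research. It can help analyze the incidence, clinical characteristics, and treatment patterns of kidney disease and support clinical decision-making. The dataset has been deposited as evidence on the 天津知识产权交易平台 (Tianjin intellectual property trading platform).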

The content above was collected and summarized by 遇见数据集.
