Advanced Liver Cancer Dataset
Tianjin Data Intellectual Property Registration Platform | Updated 2024-10-16 | Indexed 2024-10-30
Download link:
https://dengji.tjippc.cn/xxgg_nr?id=0ec877e7-8226-4ce0-827f-e0df680b4467
Resource description:
Specialized disease diagnosis-name classification model: A diagnosis database is built from medical literature, clinical data, and expert knowledge. The text is tokenized and shuffled, and the model is then trained with the train_supervised function (200 iterations, learning rate 0.1, word n-gram length 1, loss function "hs"). Performance is evaluated with the classification_report method and is reported to be good. To update the parameters, a command synchronizes the model, the labels, and the label names, so that the specialized disease type can be identified quickly and accurately.
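The training setup described above can be sketched as follows. This is a minimal illustration, not the provider's code: it assumes that train_supervised refers to the fastText Python package (the listing does not name the library), that each sample is a (label, already-segmented text) pair, and that classification_report is the scikit-learn function. The helper names prepare_lines, train, and evaluate are hypothetical.

```python
import random


def prepare_lines(samples, seed=0):
    """Format (label, segmented_text) pairs as fastText lines and shuffle them.

    Assumption: the text is already tokenized, with tokens separated by spaces.
    """
    lines = [f"__label__{label} " + " ".join(text.split()) for label, text in samples]
    random.Random(seed).shuffle(lines)  # fixed seed keeps the shuffle reproducible
    return lines


def train(train_path):
    """Train with the hyperparameters quoted in the listing.

    Assumption: 'train_supervised' is fastText's function; imported lazily so
    the preprocessing helpers work without fastText installed.
    """
    import fasttext

    return fasttext.train_supervised(
        input=train_path,  # file with one '__label__X tok tok ...' line per sample
        epoch=200,         # "200 iterations"
        lr=0.1,            # learning rate
        wordNgrams=1,      # word n-gram length
        loss="hs",         # loss function "hs"
    )


def evaluate(model, samples):
    """Return a per-class report (assumes scikit-learn's classification_report)."""
    from sklearn.metrics import classification_report

    y_true = [f"__label__{label}" for label, _ in samples]
    y_pred = [model.predict(" ".join(text.split()))[0][0] for _, text in samples]
    return classification_report(y_true, y_pred)
```

In use, the lines returned by prepare_lines would be written to a UTF-8 text file, one per line, and that file's path passed to train.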
Electronic medical record quality-control classification model: This model applies natural language processing (NLP) to the free-text fields of electronic medical records, such as the chief complaint, history of present illness, and past medical history, extracting key information and classifying it. The data comprises 7 categories with 250 samples each. Preprocessing consists of labeling, tokenization, and conversion to TXT files. The BERT tokenizer converts the record text into the input format BERT requires, and the quality-control labels are converted into numerical labels. The data is split into training and test sets at a 9:1 ratio, and a BertForSequenceClassification model is trained on it. The model is evaluated with the classification_report method. To update the parameters, the data is placed in the specified folder and the training and update commands are run, which keeps the model, the labels, and the label names synchronized.
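The data preparation for the quality-control model might look like the sketch below. It is illustrative only: the listing does not say how the 9:1 split is drawn or which checkpoint is used, so the seeded random split, the "bert-base-chinese" checkpoint, the 512-token limit, and the helper names encode_labels, split_9_to_1, and build_model_and_inputs are all assumptions. BertTokenizer and BertForSequenceClassification are taken to be the Hugging Face transformers classes.

```python
import random


def encode_labels(labels):
    """Map quality-control label names to integer ids (sorted for stability)."""
    label2id = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [label2id[name] for name in labels], label2id


def split_9_to_1(items, seed=0):
    """Shuffle and split into train/test at 9:1 (assumed: a plain random split)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = len(items) * 9 // 10  # integer arithmetic avoids float rounding
    return items[:cut], items[cut:]


def build_model_and_inputs(texts, num_labels, checkpoint="bert-base-chinese"):
    """Tokenize texts and create the classifier.

    Assumptions: Hugging Face transformers with a PyTorch backend; the
    checkpoint name and max_length are placeholders, not from the listing.
    """
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    model = BertForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return model, inputs
```

With 7 categories of 250 samples each (1,750 samples in total), this split yields 1,575 training samples and 175 test samples.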
Provider:
Tianjin Health and Medical Big Data Co., Ltd. (天津健康医疗大数据有限公司)
Created:
2024-10-14