
Clinical situations text database for Polish language

Source: DataCite Commons
Updated: 2025-12-01
Indexed: 2025-04-16
Download link:
https://mostwiedzy.pl/en/open-research-data/clinical-situations-text-database-for-polish-language,11080602391139881-0
Description:
The dataset is a database of anonymized Polish-language texts intended for building a medical speech corpus. It covers clinical situations in the following areas: medical interview, interview and description of the result of an oncological examination, description of a radiological examination, description of a pathomorphological examination, description of a cardiological examination, description of a surgical procedure, description of a reanimation procedure, medical recommendations, and prescriptions (including lists of drug names).

The texts in the database are divided into 10 clinical situations:

1. Medical interview
2. Radiological examination
3. Oncology examination
4. Pathomorphological examination
5. Cardiology examination
6. Course of surgical procedure
7. Course of reanimation procedure
8. Recommendations
9. Referral to treatment
10. Prescriptions with pharmaceutical names

The texts are saved in CSV format in the file phrases.csv. The first row of the file is a header row that names each column:

id - unique number of the phrase
phrase - the phrase (a sentence or several related sentences)
CategoryID - number of the clinical situation
SubCategoryID - subcategory number (present only for some CategoryIDs)

The classification of the clinical situations (categories) is provided in the file situations.csv.
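As an illustration of the phrases.csv layout described above, the following is a minimal Python sketch that groups phrases by clinical situation. Only the four column names (id, phrase, CategoryID, SubCategoryID) come from the description. The comma delimiter, the empty-string convention for a missing SubCategoryID, the function name group_phrases_by_category, and the sample rows are all assumptions made for the example and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def group_phrases_by_category(csv_file):
    """Group rows of a phrases.csv-style file by CategoryID.

    Returns a dict mapping CategoryID (int) to a list of
    (id, phrase, SubCategoryID-or-None) tuples.
    """
    # Assumption: comma-delimited with the header row described above.
    reader = csv.DictReader(csv_file)
    groups = defaultdict(list)
    for row in reader:
        # Assumption: a missing SubCategoryID appears as an empty field.
        sub = row.get("SubCategoryID") or None
        groups[int(row["CategoryID"])].append(
            (int(row["id"]), row["phrase"], int(sub) if sub else None)
        )
    return dict(groups)


# Invented placeholder rows, NOT real dataset content.
SAMPLE = """id,phrase,CategoryID,SubCategoryID
1,placeholder phrase A,1,
2,placeholder phrase B,2,3
3,placeholder phrase C,1,
"""

groups = group_phrases_by_category(io.StringIO(SAMPLE))
for category_id, rows in sorted(groups.items()):
    print(category_id, len(rows))
```

With the downloaded file, the same function could be called on open("phrases.csv", newline="", encoding="utf-8"). The UTF-8 encoding is also an assumption, so check the actual file before relying on it. Category names can then be looked up in situations.csv, whose column layout is not given in the description.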

Provider:
Gdańsk University of Technology
Created:
2023-11-08