cburger/md_cleaned
Hugging Face · Updated 2023-05-28 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/cburger/md_cleaned
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': ' Allergy / Immunology'
          '1': ' Autopsy'
          '2': ' Bariatrics'
          '3': ' Cardiovascular / Pulmonary'
          '4': ' Chiropractic'
          '5': ' Consult - History and Phy.'
          '6': ' Cosmetic / Plastic Surgery'
          '7': ' Dentistry'
          '8': ' Dermatology'
          '9': ' Diets and Nutritions'
          '10': ' Discharge Summary'
          '11': ' ENT - Otolaryngology'
          '12': ' Emergency Room Reports'
          '13': ' Endocrinology'
          '14': ' Gastroenterology'
          '15': ' General Medicine'
          '16': ' Hematology - Oncology'
          '17': ' Hospice - Palliative Care'
          '18': ' IME-QME-Work Comp etc.'
          '19': ' Lab Medicine - Pathology'
          '20': ' Letters'
          '21': ' Nephrology'
          '22': ' Neurology'
          '23': ' Neurosurgery'
          '24': ' Obstetrics / Gynecology'
          '25': ' Office Notes'
          '26': ' Ophthalmology'
          '27': ' Orthopedic'
          '28': ' Pain Management'
          '29': ' Pediatrics - Neonatal'
          '30': ' Physical Medicine - Rehab'
          '31': ' Podiatry'
          '32': ' Psychiatry / Psychology'
          '33': ' Radiology'
          '34': ' Rheumatology'
          '35': ' SOAP / Chart / Progress Notes'
          '36': ' Sleep Medicine'
          '37': ' Speech - Language'
          '38': ' Surgery'
          '39': ' Urology'
  splits:
  - name: train
    num_bytes: 15217210
    num_examples: 4948
  download_size: 7196712
  dataset_size: 15217210
---
# Dataset Card for "md_cleaned"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
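The card itself gives no usage instructions. As a minimal sketch, assuming the repository is still published under the ID shown in the download link and that the Hugging Face `datasets` library is installed, the train split could be loaded as follows. The helper name `load_md_cleaned` is hypothetical and not part of the card.

```python
def load_md_cleaned(repo_id: str = "cburger/md_cleaned", split: str = "train"):
    """Load one split and return (dataset, label_names).

    Assumption: the repo ID is taken from the download link on this page
    and may no longer resolve.
    """
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    ds = load_dataset(repo_id, split=split)
    # `label` is declared as a class_label feature, so it exposes `.names`.
    label_names = ds.features["label"].names
    return ds, label_names


if __name__ == "__main__":
    ds, label_names = load_md_cleaned()
    print(ds.num_rows)  # the metadata lists 4948 examples for train
    first = ds[0]
    # Names in the metadata carry a leading space, hence the strip().
    print(label_names[first["label"]].strip(), first["text"][:80])
```

The function needs network access on first use; later calls read from the local `datasets` cache.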
Provider:
cburger
Original information summary
Dataset overview
Dataset features
- text: string.
- label: class label, with the following categories:
- 0: Allergy / Immunology
- 1: Autopsy
- 2: Bariatrics
- 3: Cardiovascular / Pulmonary
- 4: Chiropractic
- 5: Consult - History and Phy.
- 6: Cosmetic / Plastic Surgery
- 7: Dentistry
- 8: Dermatology
- 9: Diets and Nutritions
- 10: Discharge Summary
- 11: ENT - Otolaryngology
- 12: Emergency Room Reports
- 13: Endocrinology
- 14: Gastroenterology
- 15: General Medicine
- 16: Hematology - Oncology
- 17: Hospice - Palliative Care
- 18: IME-QME-Work Comp etc.
- 19: Lab Medicine - Pathology
- 20: Letters
- 21: Nephrology
- 22: Neurology
- 23: Neurosurgery
- 24: Obstetrics / Gynecology
- 25: Office Notes
- 26: Ophthalmology
- 27: Orthopedic
- 28: Pain Management
- 29: Pediatrics - Neonatal
- 30: Physical Medicine - Rehab
- 31: Podiatry
- 32: Psychiatry / Psychology
- 33: Radiology
- 34: Rheumatology
- 35: SOAP / Chart / Progress Notes
- 36: Sleep Medicine
- 37: Speech - Language
- 38: Surgery
- 39: Urology
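The class names in the card metadata each begin with a space (for example `' Surgery'`), which is easy to miss when matching on names. The following sketch builds lookup tables in both directions with that whitespace stripped. The names `RAW_NAMES`, `ID2LABEL`, and `LABEL2ID` are illustrative and do not come from the dataset.

```python
# Class names copied from the card metadata, leading space included.
RAW_NAMES = [
    " Allergy / Immunology", " Autopsy", " Bariatrics",
    " Cardiovascular / Pulmonary", " Chiropractic",
    " Consult - History and Phy.", " Cosmetic / Plastic Surgery",
    " Dentistry", " Dermatology", " Diets and Nutritions",
    " Discharge Summary", " ENT - Otolaryngology",
    " Emergency Room Reports", " Endocrinology", " Gastroenterology",
    " General Medicine", " Hematology - Oncology",
    " Hospice - Palliative Care", " IME-QME-Work Comp etc.",
    " Lab Medicine - Pathology", " Letters", " Nephrology", " Neurology",
    " Neurosurgery", " Obstetrics / Gynecology", " Office Notes",
    " Ophthalmology", " Orthopedic", " Pain Management",
    " Pediatrics - Neonatal", " Physical Medicine - Rehab", " Podiatry",
    " Psychiatry / Psychology", " Radiology", " Rheumatology",
    " SOAP / Chart / Progress Notes", " Sleep Medicine",
    " Speech - Language", " Surgery", " Urology",
]

# Integer id -> cleaned name, and the reverse lookup.
ID2LABEL = {i: name.strip() for i, name in enumerate(RAW_NAMES)}
LABEL2ID = {name: i for i, name in ID2LABEL.items()}

print(ID2LABEL[38])           # Surgery
print(LABEL2ID["Radiology"])  # 33
```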
Dataset splits
- train: 4,948 examples; dataset size 15,217,210 bytes.
Dataset size
- Download size: 7,196,712 bytes
- Dataset size: 15,217,210 bytes
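For a sense of scale, the figures above work out to roughly 3 KB of text per example, with a download a little under half the dataset size. A short calculation using only the numbers listed (the variable names are illustrative):

```python
NUM_EXAMPLES = 4948
DATASET_SIZE = 15_217_210   # bytes, from the card metadata
DOWNLOAD_SIZE = 7_196_712   # bytes, from the card metadata

# Average record size and how much smaller the download is.
avg_bytes_per_example = DATASET_SIZE / NUM_EXAMPLES
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"{avg_bytes_per_example:.0f} bytes per example on average")
print(f"download is {download_ratio:.1%} of the dataset size")
```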
Collected summary
Dataset introduction

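Background and challenges
Background overview
This is a medical text classification dataset containing 4,948 medical record texts across 40 medical categories, including surgery, radiology, and psychiatry. The data is stored in Parquet format and is suited to natural language processing tasks such as medical text classification or information extraction.
The content above was collected and summarized by 遇见数据集.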
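The metadata lists only a train split, so a classification experiment would need to carve out its own held-out set. The following is one possible sketch of a reproducible index split using the standard library. The function name, the 20% test fraction, and the seed are assumptions made for illustration, and the split is not stratified by class.

```python
import random


def holdout_split(n_examples: int, test_fraction: float = 0.2, seed: int = 0):
    """Return (train_indices, test_indices) as sorted, disjoint lists."""
    indices = list(range(n_examples))
    # A dedicated Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_examples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


train_idx, test_idx = holdout_split(4948)
print(len(train_idx), len(test_idx))  # 3958 990
```

With the `datasets` library, the index lists could be passed to `Dataset.select`. Because some of the 40 classes may be rare, a stratified split is likely the safer choice in practice.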



