
cburger/md_cleaned

Source: Hugging Face. Updated 2023-05-28; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/cburger/md_cleaned
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': ' Allergy / Immunology'
          '1': ' Autopsy'
          '2': ' Bariatrics'
          '3': ' Cardiovascular / Pulmonary'
          '4': ' Chiropractic'
          '5': ' Consult - History and Phy.'
          '6': ' Cosmetic / Plastic Surgery'
          '7': ' Dentistry'
          '8': ' Dermatology'
          '9': ' Diets and Nutritions'
          '10': ' Discharge Summary'
          '11': ' ENT - Otolaryngology'
          '12': ' Emergency Room Reports'
          '13': ' Endocrinology'
          '14': ' Gastroenterology'
          '15': ' General Medicine'
          '16': ' Hematology - Oncology'
          '17': ' Hospice - Palliative Care'
          '18': ' IME-QME-Work Comp etc.'
          '19': ' Lab Medicine - Pathology'
          '20': ' Letters'
          '21': ' Nephrology'
          '22': ' Neurology'
          '23': ' Neurosurgery'
          '24': ' Obstetrics / Gynecology'
          '25': ' Office Notes'
          '26': ' Ophthalmology'
          '27': ' Orthopedic'
          '28': ' Pain Management'
          '29': ' Pediatrics - Neonatal'
          '30': ' Physical Medicine - Rehab'
          '31': ' Podiatry'
          '32': ' Psychiatry / Psychology'
          '33': ' Radiology'
          '34': ' Rheumatology'
          '35': ' SOAP / Chart / Progress Notes'
          '36': ' Sleep Medicine'
          '37': ' Speech - Language'
          '38': ' Surgery'
          '39': ' Urology'
  splits:
  - name: train
    num_bytes: 15217210
    num_examples: 4948
  download_size: 7196712
  dataset_size: 15217210
---
# Dataset Card for "md_cleaned"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
cburger
Summary of original information

Dataset overview

Dataset features

  • text: string.
  • label: class label, with the following categories:
    • 0: Allergy / Immunology
    • 1: Autopsy
    • 2: Bariatrics
    • 3: Cardiovascular / Pulmonary
    • 4: Chiropractic
    • 5: Consult - History and Phy.
    • 6: Cosmetic / Plastic Surgery
    • 7: Dentistry
    • 8: Dermatology
    • 9: Diets and Nutritions
    • 10: Discharge Summary
    • 11: ENT - Otolaryngology
    • 12: Emergency Room Reports
    • 13: Endocrinology
    • 14: Gastroenterology
    • 15: General Medicine
    • 16: Hematology - Oncology
    • 17: Hospice - Palliative Care
    • 18: IME-QME-Work Comp etc.
    • 19: Lab Medicine - Pathology
    • 20: Letters
    • 21: Nephrology
    • 22: Neurology
    • 23: Neurosurgery
    • 24: Obstetrics / Gynecology
    • 25: Office Notes
    • 26: Ophthalmology
    • 27: Orthopedic
    • 28: Pain Management
    • 29: Pediatrics - Neonatal
    • 30: Physical Medicine - Rehab
    • 31: Podiatry
    • 32: Psychiatry / Psychology
    • 33: Radiology
    • 34: Rheumatology
    • 35: SOAP / Chart / Progress Notes
    • 36: Sleep Medicine
    • 37: Speech - Language
    • 38: Surgery
    • 39: Urology
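
The id-to-category mapping above can be written down as a small lookup table. The following is a minimal sketch in plain Python; the helper names `id_to_name` and `name_to_id` are hypothetical and not part of the dataset. It assumes the names in the raw dataset card carry a leading space (as they appear there), so lookups strip whitespace first.

```python
# Category names in id order (0..39), copied from the list above.
LABEL_NAMES = [
    "Allergy / Immunology", "Autopsy", "Bariatrics",
    "Cardiovascular / Pulmonary", "Chiropractic",
    "Consult - History and Phy.", "Cosmetic / Plastic Surgery",
    "Dentistry", "Dermatology", "Diets and Nutritions",
    "Discharge Summary", "ENT - Otolaryngology",
    "Emergency Room Reports", "Endocrinology", "Gastroenterology",
    "General Medicine", "Hematology - Oncology",
    "Hospice - Palliative Care", "IME-QME-Work Comp etc.",
    "Lab Medicine - Pathology", "Letters", "Nephrology", "Neurology",
    "Neurosurgery", "Obstetrics / Gynecology", "Office Notes",
    "Ophthalmology", "Orthopedic", "Pain Management",
    "Pediatrics - Neonatal", "Physical Medicine - Rehab", "Podiatry",
    "Psychiatry / Psychology", "Radiology", "Rheumatology",
    "SOAP / Chart / Progress Notes", "Sleep Medicine",
    "Speech - Language", "Surgery", "Urology",
]


def id_to_name(label_id: int) -> str:
    """Return the category name for an integer label id (hypothetical helper)."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"label id out of range: {label_id}")
    return LABEL_NAMES[label_id]


def name_to_id(name: str) -> int:
    """Return the label id for a category name, ignoring surrounding whitespace."""
    return LABEL_NAMES.index(name.strip())
```

For example, `id_to_name(38)` gives `"Surgery"`, and `name_to_id(" Urology")` gives `39` even with the leading space kept.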

Dataset splits

  • train: 4,948 examples; dataset size 15,217,210 bytes.

Dataset size

  • Download size: 7,196,712 bytes
  • Dataset size: 15,217,210 bytes
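
As a quick sanity check on these figures, the short calculation below derives the average size per example and the ratio of download size to dataset size. It is only arithmetic on the numbers listed above; the variable names are illustrative.

```python
# Figures taken from the listing above.
NUM_EXAMPLES = 4948
DATASET_SIZE_BYTES = 15_217_210
DOWNLOAD_SIZE_BYTES = 7_196_712

# Average stored size of one example, in bytes.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"average bytes per example: {avg_bytes_per_example:.1f}")  # 3075.4
print(f"download / dataset size:   {download_ratio:.3f}")         # 0.473
```

So each example averages roughly 3 KB, and the download is a little under half the size of the dataset.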
Collected summary
Dataset introduction
Background and challenges
Background overview
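This is a medical text classification dataset containing 4,948 medical record texts across 40 medical categories, including Surgery, Radiology, and Psychiatry / Psychology. The data is stored in Parquet format and is suited to natural language processing tasks such as medical text classification or information extraction.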
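
A first step for a classification task is usually to load the data and inspect the label distribution. The sketch below assumes the Hugging Face `datasets` package is installed and that the dataset is reachable under the id `cburger/md_cleaned`, as in the download link. The helper names `load_train_split` and `label_counts` are hypothetical, and the sample records are made-up stand-ins so the counting logic can run offline.

```python
from collections import Counter
from typing import Iterable, Mapping


def load_train_split():
    """Load the train split (needs the `datasets` package and network access)."""
    # Imported lazily so the counting helper below works without the package.
    from datasets import load_dataset

    return load_dataset("cburger/md_cleaned", split="train")


def label_counts(records: Iterable[Mapping]) -> Counter:
    """Count how many records carry each integer label id."""
    return Counter(record["label"] for record in records)


# Made-up stand-in records with the same two fields as the dataset.
sample = [
    {"text": "example operative note", "label": 38},
    {"text": "example imaging report", "label": 33},
    {"text": "another operative note", "label": 38},
]

print(label_counts(sample).most_common(1))  # [(38, 2)]
```

With the package installed, `label_counts(load_train_split())` would tally the real train split, since iterating over the loaded split yields one record per example with `text` and `label` fields.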
The above content was collected and summarized by 遇见数据集.