Medical Transcription
Source: Databricks · Indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/cd0b8356-8ae8-4178-a55b-7f69f040c0b8/John-Snow-Labs_Medical-Transcription
Resource description:
**Overview**
Medical data is extremely hard to find due to HIPAA privacy regulations. This data package offers a solution by providing medical transcription samples.
**Description**
This data package contains sample medical transcriptions for various medical specialties. Medical transcription (MT) is the manual processing of voice reports dictated by physicians and other healthcare professionals into text format. The MT team of a hospital typically receives the voice files with dictation of medical documents from healthcare providers. The voice files are then converted into text.
The transcribed medical reports are usually created in digital format and submitted to the hospital's Electronic Health Record (EHR) or Electronic Medical Record (EMR) system. Currently, the medical field relies on speech recognition software and medical transcription software (MTS) for transcribing.
**Benefits**
- Medical transcription is the primary mechanism for a physician to clearly communicate with other healthcare providers who access the patient record, to advise them on the state of the patient's health and past/current treatment, and to assure continuity of care.
**License Information**
John Snow Labs datasets are free to use for personal and research purposes. For commercial use, please subscribe to the [Data Library](https://www.johnsnowlabs.com/marketplace/) on the John Snow Labs website. The subscription allows you to use all John Snow Labs datasets and data packages for commercial purposes.
**Included Datasets**
- [Medical Transcription Samples](https://www.johnsnowlabs.com/marketplace/medical-transcription-samples)
  - This dataset contains sample medical transcriptions for various medical specialties.
**Data Engineering Overview**
**We deliver high-quality data**
- Each dataset goes through 3 levels of quality review
  - Two manual reviews are done by domain experts
  - Then an automated set of 60+ validations checks that every datum matches its metadata and defined constraints
- Data is normalized into one unified type system
  - All dates, units, codes, and currencies follow a consistent format
  - All null values are normalized to the same value
  - All dataset and field names are SQL and Hive compliant
- Data and Metadata
  - Data is available in both CSV and Apache Parquet format, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
  - Metadata is provided in the open Frictionless Data standard, and every metadata field is normalized & validated
- Data Updates
  - Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted
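As a rough illustration of what checking data against metadata and defined constraints can look like, the sketch below validates CSV rows against a small descriptor written in the style of a Frictionless Table Schema. It is not John Snow Labs' validation suite: the field names (`Medical_Specialty`, `Sample_Name`, `Transcription`), the sample rows, and the `validate_rows` helper are all hypothetical, and only the `required` constraint is checked.

```python
import csv
import io

# Hypothetical descriptor in the style of a Frictionless Table Schema.
# The field names are illustrative, not the dataset's actual columns.
SCHEMA = {
    "fields": [
        {"name": "Medical_Specialty", "type": "string", "constraints": {"required": True}},
        {"name": "Sample_Name", "type": "string", "constraints": {"required": True}},
        {"name": "Transcription", "type": "string"},
    ],
    # Cell values that count as null.
    "missingValues": [""],
}


def validate_rows(csv_text, schema):
    """Return a list of (row_number, field_name, message) problems."""
    problems = []
    missing = set(schema.get("missingValues", [""]))
    expected = [field["name"] for field in schema["fields"]]

    reader = csv.DictReader(io.StringIO(csv_text))
    # The header must match the declared field names, in order.
    if reader.fieldnames != expected:
        return [(0, None, f"header mismatch: {reader.fieldnames}")]

    for row_number, row in enumerate(reader, start=1):
        for field in schema["fields"]:
            required = field.get("constraints", {}).get("required", False)
            if required and row[field["name"]] in missing:
                problems.append((row_number, field["name"], "required value is missing"))
    return problems


# Made-up sample data: the second row lacks a required value.
SAMPLE = (
    "Medical_Specialty,Sample_Name,Transcription\n"
    "Cardiology,Echo Report,Normal left ventricular function.\n"
    ",Untitled,Text without a specialty.\n"
)

if __name__ == "__main__":
    for problem in validate_rows(SAMPLE, SCHEMA):
        print(problem)
```

The same pattern extends to other constraints by adding further checks inside the per-field loop.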
**Our data is curated and enriched by domain experts**
Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:
- Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
- Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
- Both manual and automated data enrichment are supported for clinical codes, providers, drugs, and geo-locations
- The data is always kept up to date – even when the source requires manual effort to get updates
- Support for data subscribers is provided directly by the domain experts who curated the data sets
- Every data source’s license is manually verified to allow for royalty-free commercial use and redistribution
**Need Help?**
If you have questions about our products, contact us at [info@johnsnowlabs.com](mailto:info@johnsnowlabs.com).
Provider:
John Snow Labs



