Clinical deidentification for PDF
Source: Snowflake | Updated: 2024-12-15 | Indexed: 2024-12-17
Download link:
https://app.snowflake.com/marketplace/listing/GZTYZ4386LJAS
Resource description:
The Clinical De-Identification model is designed to recognize and anonymize protected health information (PHI) in English-language clinical notes. It uses natural language processing techniques to detect sensitive information such as patient names, addresses, medical record numbers, and other identifiers. Once identified, the PHI is masked or obfuscated, making the text safe for broader use while preserving its informational content.
Up to 1.9M chars/hour; 0.0018 USD/100 processed chars.
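The throughput and price figures above can be turned into a rough budget for a given corpus. The following is a minimal sketch, assuming billing scales linearly with processed characters (no minimum charge, rounding, or tiering, none of which the listing specifies); the function name `estimate` is hypothetical.

```python
# Figures taken from the listing; the linear billing model is an assumption.
PRICE_USD_PER_100_CHARS = 0.0018
MAX_CHARS_PER_HOUR = 1_900_000  # listed as an upper bound ("up to")


def estimate(total_chars: int) -> tuple[float, float]:
    """Return (cost in USD, minimum processing hours) for total_chars characters."""
    cost_usd = total_chars / 100 * PRICE_USD_PER_100_CHARS
    min_hours = total_chars / MAX_CHARS_PER_HOUR
    return cost_usd, min_hours


# One hour of work at the listed maximum throughput.
cost, hours = estimate(1_900_000)
print(f"{cost:.2f} USD, at least {hours:.2f} h")
```

Because the listed throughput is a maximum, the hour figure is a lower bound rather than an expected duration.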
**Key Features:**
- The model is finely tuned to identify a wide range of PHI elements in medical texts, ensuring comprehensive de-identification.
- The de-identification process aligns with HIPAA and other healthcare privacy regulations, aiding in legal compliance and data protection.
- Ideal for research, analytics, and training purposes, this model enables the safe utilization of medical texts without compromising patient privacy.
Covered entities: AGE, BIOID, CITY, COUNTRY, DATE, DEVICE, DOCTOR, EMAIL, FAX, HEALTHPLAN, HOSPITAL, IDNUM, LOCATION, MEDICALRECORD, ORGANIZATION, PATIENT, PHONE, PROFESSION, STATE, STREET, URL, USERNAME, ZIP, ACCOUNT, LICENSE, VIN, SSN, DLN, PLATE, and IPADDR.
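To illustrate what masking with these entity labels might look like, here is a minimal sketch. It is not the model's API: the `Span` type, the `mask` function, the `<LABEL>` placeholder format, and the hand-written spans are all hypothetical, and the sketch assumes the detected spans do not overlap.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # entity label, e.g. "PATIENT" or "DATE"


def mask(text: str, spans: list[Span]) -> str:
    """Replace each (non-overlapping) span with a <LABEL> placeholder."""
    parts = []
    cursor = 0
    for span in sorted(spans):  # process spans left to right
        parts.append(text[cursor:span.start])
        parts.append(f"<{span.label}>")
        cursor = span.end
    parts.append(text[cursor:])
    return "".join(parts)


# Hypothetical example: spans written by hand, not produced by the model.
note = "John Smith was admitted on 2024-01-05."
spans = [Span(0, 10, "PATIENT"), Span(27, 37, "DATE")]
masked = mask(note, spans)
print(masked)
```

Replacing spans with their label keeps the sentence readable for downstream analysis; obfuscation with realistic surrogate values would need a different replacement step.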
Provider:
John Snow Labs
Created:
2024-10-29
Background and challenges
Background overview
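This dataset provides a de-identification model for clinical PDF documents. It recognizes and anonymizes protected health information (PHI) in English-language clinical notes, covering entities such as age, address, and medical record number. The model aligns with HIPAA privacy regulations and supports uses such as research. It processes up to 1.9 million characters per hour at a cost of 0.0018 USD per 100 processed characters.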
The content above was collected and summarized by 遇见数据集.



