Cancer model to extract tumor size, stage and biomarkers | Maps to ICD, ICDO, SNOMED, HPO, OMIM
Source: Databricks. Indexed 2024-05-26.
Download link:
https://marketplace.databricks.com/details/8511b17c-86de-4453-9be2-22fb1a80428f/Intelligent-Medical-Objects-IMO_Cancer-model-to-extract-tumor-size,-stage-and-biomarkers-Maps-to-ICD,-ICDO,-SNOMED,-HPO,-OMIM-
Resource description:
**Overview**
- IMO’s Cancer NLP model extracts tumor size, tumor stage, and biomarkers from unstructured text, along with many associated attributes such as specimen, body location, severity, temporal expression, subject, condition, uncertainty, negation, and course. The extracted information can also be mapped to standard concepts in the Observational Medical Outcomes Partnership (OMOP) Common Data Model.
- If FHIR-conformant responses are needed, the pipeline and model can support various versions of FHIR as well as other output models (including OMOP).
- The model uses a hybrid approach that combines deep learning-based models, curated lexicons, and pattern-based rules. These components can be built, maintained, and expanded quickly using CLAMP, a proven industry NLP development tool now owned by IMO. The workflow can be repurposed for other use cases where existing clinical natural language processing tools need to be customized for specific information within a short time.
- Predictable text entity recognition, extraction, and codification covers Specimen, Primary-Site, Sub-Site, Procedure, Histology, Tumor Grade, Tumor Size, Tumor Margin, Invasion, Dimension Extend, Dimension Unit, Date range, and Biomarker.
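To make the lexicon and pattern-rule parts of the hybrid approach concrete, here is a minimal Python sketch. It is not IMO's implementation and does not use the CLAMP API. The `Entity` class, the `extract_entities` function, the size pattern, and the five-term biomarker lexicon are all hypothetical, chosen only for illustration, and the deep learning component is omitted entirely.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon; a real curated lexicon would be far larger.
BIOMARKER_LEXICON = {"er", "pr", "her2", "ki-67", "egfr"}

# Hypothetical pattern rule: one to three numeric dimensions separated by
# "x" (or the multiplication sign), followed by a unit of cm or mm.
SIZE_PATTERN = re.compile(
    r"(?P<dims>\d+(?:\.\d+)?(?:\s*[x×]\s*\d+(?:\.\d+)?){0,2})\s*(?P<unit>cm|mm)\b",
    re.IGNORECASE,
)

# Simple word-like tokens used for the lexicon lookup.
TOKEN_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


@dataclass
class Entity:
    label: str                  # entity type, e.g. "Tumor Size"
    text: str                   # surface text of the mention
    start: int                  # character offset where the mention starts
    end: int                    # character offset just past the mention
    unit: Optional[str] = None  # set only for size mentions


def extract_entities(text: str) -> List[Entity]:
    """Return size and biomarker mentions found by the rule and the lexicon."""
    entities: List[Entity] = []

    # Pattern-based rule: tumor dimensions plus their unit.
    for m in SIZE_PATTERN.finditer(text):
        entities.append(
            Entity("Tumor Size", m.group("dims"), m.start("dims"),
                   m.end("dims"), m.group("unit").lower())
        )

    # Lexicon lookup: case-insensitive match on whole tokens.
    for m in TOKEN_PATTERN.finditer(text):
        if m.group(0).lower() in BIOMARKER_LEXICON:
            entities.append(Entity("Biomarker", m.group(0), m.start(), m.end()))

    # Report mentions in document order.
    return sorted(entities, key=lambda e: e.start)
```

Calling `extract_entities("Invasive ductal carcinoma, 2.5 x 1.8 x 1.2 cm. ER positive, HER2 negative.")` yields one size mention followed by two biomarker mentions. The sketch does not interpret "positive" or "negative", and it does not map anything to ICD, SNOMED, or OMOP codes. Those steps would rely on the attribute extraction and terminology services the listing describes.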
**Publications**
- [Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP ](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359882/)
**Use Cases**
- Wherever clinical data contains cancer-related specimen information and resulting data sets (large pathology reports, large patient data assets, research registries, or tumor board registries), IMO’s Cancer NLP model can extract and encode entities and find relations among clinically important data, for a single patient or across populations.
- While several NLP systems have been developed to process pathology reports (for example MedLEE, caTIES, or homegrown pipelines built on cTAKES), IMO’s Cancer NLP model, coupled with managed terminology services, yields a significant decrease in false positives and false negatives in NER and resolution.
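The listing does not say how false positives and false negatives are measured. As a rough sketch of one common convention, the Python below scores predicted entity spans against a gold annotation using exact span-and-label matching. The `Span` alias and the `ner_scores` function are hypothetical names, and exact matching is an assumption rather than IMO's stated evaluation method.

```python
from typing import Dict, Set, Tuple

# A span is (start offset, end offset, label); exact match is assumed.
Span = Tuple[int, int, str]


def ner_scores(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Count false positives/negatives and derive precision, recall, and F1."""
    tp = len(gold & predicted)   # predicted spans that are in the gold set
    fp = len(predicted - gold)   # predicted spans absent from the gold set
    fn = len(gold - predicted)   # gold spans the system missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    return {
        "false_positives": float(fp),
        "false_negatives": float(fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Fewer false positives raise precision and fewer false negatives raise recall, so a claim of reducing both corresponds to a higher F1 under this scoring scheme.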
**Product Details**
- IMO’s Cancer NLP model for extracting tumor-related data, its pipeline, the underlying NLP development platform, and its named entity extraction solutions are award-winning offerings that draw on more than 30 years of terminology experience to extract meaningful concepts and map these entities to all appropriate standard codes. The result is a trustworthy, accurate, and usable dataset that is deployed more quickly, in a fraction of the time and with fewer resources required from high-cost FTEs.
- Comprehensive review of pathology reports, lab results, clinical data assets, tumor board datasets, and EMR information sets to find, extract, and relate tumor size, tumor stage, and biomarker data that is hard to find and understand.
- Intuitive interface to review, accept, and adjust matched data as needed.
- CLAMP, a proven industry pipeline development platform now owned by IMO, offers updated out-of-the-box NLP models, such as the one for extracting tumor size, stage, and biomarkers, and can also output metadata important for entity resolution and for insight data processing across large patient populations.
- Data processed through this NLP model is updated at least monthly, so you don't lose sleep over miscoding or under-coding during regulatory updates.
- Deploy locally, in the cloud, or through SaaS connection points.
- Please reach out with any questions.
Provider:
Intelligent Medical Objects, IMO



