Table1_Weakly Supervised Learning for Categorization of Medical Inquiries for Customer Service Effectiveness.DOCX

Name: Table1_Weakly Supervised Learning for Categorization of Medical Inquiries for Customer Service Effectiveness.DOCX
Creator: Frontiers
Published: 2023-05-31 00:00:00
License: No description available

Source: frontiersin.figshare.com (updated 2023-05-31, indexed 2025-01-15)

Download link:

https://frontiersin.figshare.com/articles/dataset/Table1_Weakly_Supervised_Learning_for_Categorization_of_Medical_Inquiries_for_Customer_Service_Effectiveness_DOCX/15089658/1

Description:

With the growth of unstructured data in healthcare and pharmaceuticals, natural language processing has been widely adopted for generating actionable insights from text data sources. One of the key areas of our exploration is the Medical Information function within our organization. We receive a significant number of medical information inquiries in the form of unstructured text. An enterprise-level solution must deal with medical information interactions across multiple communication channels, which are nuanced with a variety of keywords and emotions unique to the pharmaceutical industry. There is a strong need for an effective solution that leverages the contextual knowledge of the medical information business along with the digital tenets of natural language processing (NLP) and machine learning to build an automated and scalable process that generates real-time insights on conversation categories. Traditional supervised learning methods rely on a large set of manually labeled training data, which is difficult to obtain because of high labeling costs. Thus, the solution is incomplete without the ability to self-learn and improve. This necessitates techniques that automatically build relevant training data using a weakly supervised approach from textual inquiries across consumers, healthcare professionals, sales, and service providers. The solution has two fundamental layers of NLP and machine learning. The first layer leverages heuristics and a knowledge base to identify potential categories and build annotated training data. The second layer, based on machine learning and deep learning, uses the training data generated by the heuristic approach to identify the categories and sub-categories associated with each verbatim. Here, we present a novel approach harnessing the power of weakly supervised learning combined with multi-class classification for improved categorization of medical information inquiries.
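
The two-layer design described above can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, the keyword lists, and the helper names (`weak_label`, `NaiveBayes`, `train_from_unlabeled`) are hypothetical, and a small naive Bayes classifier stands in for the machine learning and deep learning models the abstract mentions. It handles flat categories only, not sub-categories.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for the heuristics and knowledge base.
KEYWORDS = {
    "adverse_event": {"rash", "nausea", "headache", "reaction"},
    "dosage": {"dose", "dosage", "mg", "tablet"},
    "availability": {"stock", "pharmacy", "shortage", "order"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def weak_label(text):
    """Layer 1: pick the category with the most keyword hits, or abstain."""
    tokens = tokenize(text)
    hits = {cat: sum(t in words for t in tokens) for cat, words in KEYWORDS.items()}
    best, count = max(hits.items(), key=lambda kv: kv[1])
    # Abstain (return None) when nothing matches or the top score is tied.
    if count == 0 or list(hits.values()).count(count) > 1:
        return None
    return best


class NaiveBayes:
    """Layer 2: multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.class_counts[label] / total)
            for t in tokens:
                log_prob += math.log((counts[t] + 1) / denom)
            return log_prob

        return max(self.class_counts, key=score)


def train_from_unlabeled(texts):
    """Label raw inquiries with layer 1, drop abstentions, then fit layer 2."""
    labeled = [(t, weak_label(t)) for t in texts]
    labeled = [(t, y) for t, y in labeled if y is not None]
    xs, ys = zip(*labeled)
    return NaiveBayes().fit(xs, ys)
```

Calling `train_from_unlabeled` on a list of raw inquiry strings returns a fitted model, and `model.predict(text)` then assigns a category even to inquiries that contain none of the keywords, because the classifier also learns from the surrounding words in the weakly labeled examples.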

Provider:

Frontiers
