
ViMQ

Source: arXiv. Updated 2023-04-28. Indexed 2024-06-21.
Download link:
https://github.com/tadeephuy/ViMQ
Description:
ViMQ is a dataset of Vietnamese medical questions created by VinBrain for building medical question answering systems. It contains 9,000 questions crawled from the consultation section of www.vinmec.com, the website of a Vietnamese general hospital, and covers health issues ranging from common illnesses to specialized medical fields. The questions are annotated at both the sentence level and the entity level, so the dataset supports intent classification and named entity recognition, with the goal of helping medical dialogue systems better understand patient queries. Annotation followed a hierarchical supervised seed method intended to ensure label quality. The dataset is mainly intended for developing medical dialogue systems, particularly their natural language understanding modules, to reduce the workload of physicians in telemedicine.
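
To make the two annotation levels concrete, the sketch below shows one way a question carrying both a sentence-level intent label and entity-level spans could be represented and converted to BIO tags for named entity recognition. The class name `AnnotatedQuestion`, its field names, the token-index span convention, the label names, and the English example question are all illustrative assumptions; they are not taken from the ViMQ repository, whose actual file format should be checked before use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AnnotatedQuestion:
    """Hypothetical record layout; not the official ViMQ schema."""
    tokens: List[str]                      # question split into tokens
    intent: str                            # sentence-level label
    entities: List[Tuple[int, int, str]]   # (start, end_exclusive, label) over token indices


def to_bio(example: AnnotatedQuestion) -> List[str]:
    """Convert entity spans into one BIO tag per token."""
    tags = ["O"] * len(example.tokens)
    for start, end, label in example.entities:
        if not (0 <= start < end <= len(tags)):
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


# Made-up example; the intent and entity labels are placeholders.
example = AnnotatedQuestion(
    tokens=["I", "have", "a", "sore", "throat", "for", "three", "days"],
    intent="diagnosis",
    entities=[(3, 5, "SYMPTOM")],
)
bio_tags = to_bio(example)
```

In this layout the intent label would feed an intent classifier, while the BIO tags would serve as token-level targets for a sequence tagger.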
Provider:
VinBrain
Created:
2023-04-28