cMedQA: Chinese Medical Community Question Answering Dataset
Source: Qianyan (千言) datasets, indexed 2024-05-15
Download link:
https://www.luge.ai/#/luge/dataDetail?id=70
Overview:
This is the largest publicly available Chinese medical community question answering (QA) dataset to date. The data comes from the Xunyiwenyao (寻医问药) online medical platform. cMedQA v1.0 contains more than 50,000 medical questions and more than 100,000 answers written by professional physicians [1]; cMedQA v2.0 contains nearly 110,000 medical questions and nearly 230,000 such answers [2]. Researchers train an automatic medical QA model on the training set and then, on the validation and test sets, select the best answer for each question from a given pool of 100 candidate answers, where the best answer is the one actually written by a physician. The dataset is suited to training Chinese medical QA systems, which can reduce physicians' workload in online services and help medical community platforms serve users more efficiently.
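The candidate-selection task described above is naturally scored as top-1 accuracy: the share of questions for which the model's highest-scoring candidate is a physician's answer. The sketch below is only an illustration under stated assumptions. The record layout (`question`, `candidates`, `positive_indices`), the function names, and the character-overlap scorer are all hypothetical and are not taken from the dataset's files or official tooling; the toy pools hold 2 candidates instead of 100 to keep the example short.

```python
from typing import Callable, Sequence


def top1_accuracy(
    examples: Sequence[dict],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of questions whose top-scored candidate is a correct answer.

    Each example is assumed (hypothetical layout) to hold:
      - "question": the question text
      - "candidates": the list of candidate answer texts (the pool)
      - "positive_indices": set of indices of physician-written answers
    """
    if not examples:
        raise ValueError("examples must be non-empty")
    hits = 0
    for ex in examples:
        candidates = ex["candidates"]
        # Index of the candidate the scorer ranks highest (first one wins ties).
        best = max(
            range(len(candidates)),
            key=lambda i: score(ex["question"], candidates[i]),
        )
        if best in ex["positive_indices"]:
            hits += 1
    return hits / len(examples)


def char_overlap(question: str, answer: str) -> float:
    """Toy stand-in for a trained model: Jaccard overlap of character sets."""
    q, a = set(question), set(answer)
    union = q | a
    return len(q & a) / len(union) if union else 0.0


# Two made-up examples with tiny candidate pools (the real pools have 100).
toy = [
    {
        "question": "头痛怎么办",
        "candidates": ["多喝水注意休息", "头痛可以服用止痛药"],
        "positive_indices": {1},
    },
    {
        "question": "感冒发烧吃什么药",
        "candidates": ["感冒发烧可以吃退烧药", "建议做胃镜检查"],
        "positive_indices": {0},
    },
]

accuracy = top1_accuracy(toy, char_overlap)
print(f"top-1 accuracy: {accuracy:.2f}")
```

In practice `char_overlap` would be replaced by the trained QA model's matching score. Passing the scorer as a function keeps the metric independent of any particular model.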
Provided by:
National University of Defense Technology
Aggregated summary
Dataset introduction

Background and challenges
Background overview
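cMedQA is a question answering dataset focused on the Chinese medical domain and intended for natural language processing research. It is provided for study and research only, and commercial use is prohibited. Any use must credit the original authors, who retain ownership of the data.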
The content above was collected and summarized by 遇见数据集.



