Chinese Medical Corpus
Source: arXiv; indexed 2025-09-30
Download link:
https://github.com/scienceasdf/medical-books
Description:
This dataset is a large-scale corpus drawn from medical textbooks and online medical Q&A forums, built for disease mention extraction. It contains 100,000 disease mention records extracted from user query logs and aggregates data from 14 medical textbooks and multiple online forums.



