Medical Text Dataset

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/Stephen-SMJ/LLamaCare

Description:

This dataset collects medical texts from a variety of sources, including both real and simulated dialogues on medical topics such as disease diagnosis and treatment recommendations. It combines professional medical knowledge with reasoning content in order to strengthen a model's conversational ability. In terms of scale, the dataset contains 116.41 thousand samples and 15.61 million words. Its intended task is knowledge injection: supplying domain-specific medical knowledge during the training of medical language models.
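As a rough sanity check on the stated scale, the two figures above imply an average sample length. The sketch below is only illustrative arithmetic: the function name `avg_words_per_sample` is hypothetical, and the inputs are the rounded counts quoted in the description, not values read from the dataset itself.

```python
def avg_words_per_sample(num_samples: int, num_words: int) -> float:
    """Return the mean number of words per sample."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return num_words / num_samples


# Rounded figures quoted in the description (assumed, not measured):
# 116.41 thousand samples and 15.61 million words.
NUM_SAMPLES = 116_410
NUM_WORDS = 15_610_000

if __name__ == "__main__":
    mean_len = avg_words_per_sample(NUM_SAMPLES, NUM_WORDS)
    print(f"Approximate words per sample: {mean_len:.1f}")
```

Under these rounded figures the average comes to roughly 134 words per sample, which is consistent with short dialogue-style entries; the true per-sample distribution is not stated in the listing.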
