
Medical Text Dataset

Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/Stephen-SMJ/LLamaCare
Resource overview:
This dataset collects medical texts from a variety of sources, including both real and simulated dialogues on medical topics such as disease diagnosis and treatment recommendations. It combines professional medical knowledge with reasoning content in order to strengthen conversational ability. In terms of scale, the dataset contains 116.41 thousand samples and 15.61 million words. Its intended task is knowledge injection when training medical language models.
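The two scale figures above imply a rough average sample length. The sketch below only divides the stated totals; the function and constant names are illustrative, and because both totals are rounded in the description, the result is an approximation rather than a reported statistic.

```python
def average_words_per_sample(total_words: float, total_samples: float) -> float:
    """Return the mean number of words per sample."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return total_words / total_samples


# Totals as stated in the dataset description (both are rounded figures).
TOTAL_SAMPLES = 116.41e3  # 116.41 thousand samples
TOTAL_WORDS = 15.61e6     # 15.61 million words

avg = average_words_per_sample(TOTAL_WORDS, TOTAL_SAMPLES)
print(f"Approximate words per sample: {avg:.1f}")
```

With the stated totals this comes to roughly 134 words per sample, which fits a corpus of short dialogue turns and brief knowledge passages rather than long documents.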