EQUES/JMedBench-Train

Name: EQUES/JMedBench-Train
Creator: EQUES
Published: 2024-11-15 03:51:11
License: No description available

Hugging Face, updated 2024-11-15, indexed 2024-12-14

Download link:

https://hf-mirror.com/datasets/EQUES/JMedBench-Train

Description:

This dataset is extracted and modified from part of JMedBench. It contains 674,954 lines of Japanese and English medical text in a single column named text. Only the train subset is included, and each question-answering sample is merged into one sentence, formatted as Questions:{Question}Answer:{Answer} for English samples and 質問:{Question}回答:{Answer} for Japanese samples. No other modifications were applied. The dataset is primarily intended for the continual pretraining of medical large language models and is not recommended for other uses.
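
The merging step described above can be illustrated with a minimal Python sketch. This is not the creators' actual preprocessing code: the helper names `merge_qa` and `to_row` and the language keys `en` and `ja` are hypothetical, and the absence of any whitespace between the fields is an assumption taken from the templates exactly as they are printed in the description.

```python
# Templates copied verbatim from the dataset description. That no separator
# sits between the question and answer fields is an assumption based on how
# the templates are printed there.
TEMPLATES = {
    "en": "Questions:{question}Answer:{answer}",
    "ja": "質問:{question}回答:{answer}",
}


def merge_qa(question: str, answer: str, lang: str) -> str:
    """Merge one question-answer pair into a single string (hypothetical helper)."""
    if lang not in TEMPLATES:
        raise ValueError(f"unsupported language: {lang!r}")
    return TEMPLATES[lang].format(question=question, answer=answer)


def to_row(question: str, answer: str, lang: str) -> dict:
    """Wrap the merged string in a row with the single column named 'text'."""
    return {"text": merge_qa(question, answer, lang)}


if __name__ == "__main__":
    # Illustrative inputs only; these are not samples from the dataset.
    print(to_row("What is aspirin?", "An NSAID.", "en"))
    print(to_row("頭痛の原因は?", "様々です。", "ja"))
```

Each resulting row carries one merged string under `text`, which matches the single-column layout suited to continual pretraining rather than to supervised question answering.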

Provided by:

EQUES
