Clinical-T5: Large Language Models Built Using MIMIC Clinical Text
Source: DataCite Commons; updated 2023-01-25; indexed 2025-04-16
Download link:
https://physionet.org/content/clinical-t5/1.0.0/
Resource description:
Recent advances in scaling large language models (LLMs) have resulted in
significant improvements on a number of natural language processing
benchmarks. Some prior work has pretrained these language models on
clinical text, demonstrating that training a language model using
masked language modeling (MLM) on clinical notes is an effective technique for
boosting performance on downstream tasks. All of these previous works use
encoder-only architectures. We train four different clinical T5 models on the
union of MIMIC-III and MIMIC-IV notes. Two of the models are initialized from
existing T5 models (T5-Base and SciFive). We additionally train a T5-Base and a
T5-Large model from scratch. These models should not be distributed to
non-credentialed users. Research has shown that these language models have the
potential to leak sensitive information. Due to this potential risk, we
release the model weights under PhysioNet credentialed access.
Provider:
PhysioNet
Created:
2023-01-25