MPRG/Mouse-Genecorpus-20M
Source: Hugging Face. Updated: 2025-02-24. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/MPRG/Mouse-Genecorpus-20M
Description:
Mouse-Genecorpus-20M is a large-scale pretraining corpus of approximately 21 million mouse single-cell transcriptomes, drawn from publicly available data covering a broad range of tissues. The corpus is used to pretrain Mouse-Geneformer, a transformer model that enables context-aware predictions in network biology when data are limited.
Provider:
MPRG
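
For readers who prefer to fetch the corpus programmatically rather than through the download link, the following is a minimal sketch, not an official recipe. It assumes the `huggingface_hub` package is installed in a recent version whose `snapshot_download` accepts an `endpoint` argument, and that the mirror at hf-mirror.com serves the same repository layout as the Hub. The helper names `dataset_url` and `download_corpus` are hypothetical, introduced here for illustration only; the on-disk format of the downloaded files is not described in this listing and is not assumed.

```python
REPO_ID = "MPRG/Mouse-Genecorpus-20M"
MIRROR_ENDPOINT = "https://hf-mirror.com"  # mirror host taken from the download link


def dataset_url(endpoint: str = MIRROR_ENDPOINT, repo_id: str = REPO_ID) -> str:
    """Build the dataset page URL for a given Hub endpoint (hypothetical helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_corpus(local_dir: str, endpoint: str = MIRROR_ENDPOINT) -> str:
    """Download the whole dataset repository and return the local directory path.

    Assumption: huggingface_hub is installed and its snapshot_download
    supports the `endpoint` keyword. The import is done lazily so that the
    URL helper above works without the package.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=local_dir,
        endpoint=endpoint,
    )


if __name__ == "__main__":
    # Only prints the URL; call download_corpus("...") explicitly to fetch data,
    # since the full corpus is large.
    print(dataset_url())
```

Calling `download_corpus("./mouse-genecorpus")` would fetch every file in the repository, so it is worth checking available disk space first.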



