jo-mengr/cellxgene_pseudo_bulk_200k_normlog_multiplets_natural_language_annotation
Source: Hugging Face. Updated: 2025-11-12. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/jo-mengr/cellxgene_pseudo_bulk_200k_normlog_multiplets_natural_language_annotation
Description:
This multimodal dataset pairs RNA sequencing data with text descriptions and is suited to contrastive-learning or inference tasks. Each cell sentence in the dataset has a length of cs_length genes. The RNA sequencing data was originally gathered and annotated in the CellWhisperer project and is derived from CellxGene and GEO. The dataset can be used to train multimodal models that combine transcriptome and text modalities.
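The listing does not define "cell sentence". A common reading, assumed here and not confirmed by the dataset card, is a space-separated list of gene names ranked by expression and truncated to a fixed number of genes. The sketch below illustrates that idea only; `make_cell_sentence`, the gene names, and the expression values are hypothetical and are not taken from the dataset or from the CellWhisperer code.

```python
from typing import Sequence


def make_cell_sentence(
    gene_names: Sequence[str],
    expression: Sequence[float],
    cs_length: int,
) -> str:
    """Build a cell sentence: top `cs_length` gene names by expression.

    Assumption: genes with zero expression are dropped before ranking.
    """
    if len(gene_names) != len(expression):
        raise ValueError("gene_names and expression must have equal length")

    # Keep only genes that are actually expressed in this sample.
    expressed = [(g, x) for g, x in zip(gene_names, expression) if x > 0]

    # Sort by expression, highest first; ties keep their input order.
    ranked = sorted(expressed, key=lambda pair: pair[1], reverse=True)

    # Truncate to the requested sentence length and join the gene names.
    return " ".join(g for g, _ in ranked[:cs_length])


# Hypothetical toy sample: four genes with normalized, log-scaled values.
genes = ["GAPDH", "CD3E", "MS4A1", "ACTB"]
values = [5.2, 0.0, 1.3, 7.9]
print(make_cell_sentence(genes, values, cs_length=3))
```

In a multimodal setup, a string like this could serve as the transcriptome-side input that is paired with the natural-language annotation, although the dataset's actual preprocessing may differ.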
Provider:
jo-mengr



