Gene-Metabolite Association Dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/SysBioChalmers/yeast-GEM
Description:
This dataset contains 2474 metabolites and 1947 genes from two microorganisms, Saccharomyces cerevisiae (SC) and Issatchenkia orientalis (IO), and is intended for gene-metabolite association prediction. It includes multimodal features (text content, SMILES strings, and gene sequences) and is split into a training set (60%), a test set (30%), and a validation set (10%).
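As a rough illustration of a 60/30/10 partition like the one described, the following is a minimal sketch only. It assumes the split is a seeded random shuffle over individual records; the listing does not say whether the dataset is split by gene-metabolite pair, by gene, or by metabolite, nor how the authors actually performed it. The function name `split_indices` and the record count of 1000 are hypothetical.

```python
import random


def split_indices(n_items, fractions=(0.6, 0.3, 0.1), seed=0):
    """Shuffle range(n_items) and cut it into train/test/validation index lists.

    Assumption: a plain random split; the dataset's real procedure is not documented here.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    n_train = round(n_items * fractions[0])
    n_test = round(n_items * fractions[1])

    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# Hypothetical record count, chosen only to make the proportions easy to read.
train, test, validation = split_indices(1000)
print(len(train), len(test), len(validation))
```

Assigning the remainder to the validation set keeps every record in exactly one subset even when the fractions do not divide the record count evenly.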
Provided by:
SysBioChalmers and Maranas Group



