Knowledge Graph of RB-Tnseq Data from Fitness Browser (KP-DP1)

Source: DataCite Commons | Updated: 2026-03-12 | Indexed: 2026-04-25
Download link:
https://www.osti.gov/servlets/purl/3022643
Description:
Motivation: Predicting microbial gene fitness across environmental conditions remains a central challenge for predictive phenomics and autonomous experimentation. Fitness assays generate large volumes of genotype–phenotype measurements that are difficult to integrate with experimental metadata and biological function in a form that supports mechanistic reasoning. Knowledge graphs offer a semantic framework for unifying these modalities and enabling context-aware inference.

Results: We build GIMME (Graph Inference for Microbial Metabolism Exploration), a semantically grounded knowledge graph that unifies gene fitness measurements spanning 10 Pseudomonas species with experimental metadata and biological context. Media are decomposed into chemical components, and experiments carry structured links to natural-language descriptions. The resulting graph supports two inference modes: (1) symbolic graph traversal to surface candidate gene–environment and gene–chemical associations, and (2) learned inference using heterogeneous graph neural networks that propagate information across neighborhoods. We formulate link regression over (gene, media, experiment) triplets, combining learned gene embeddings with text embeddings of node descriptions from a pretrained LLM to predict gene fitness. We then augment a baseline MLP with an auxiliary message-passing encoder (GraphSAGE/GAT) that propagates information over gene–protein–function and media–chemical subgraphs, and fuse the two pathways with a gated residual connection. This approach produces strong agreement with held-out fitness measurements (Pearson r = 0.74 with GraphSAGE) while also highlighting inference challenges in extreme-fitness regimes. We aggregate GAT edge-attention weights by relation type and layer to estimate which biological and environmental relations most influence fitness predictions.

Conclusion: This work explores using knowledge graphs as “context graphs” for microbial phenotype prediction. Such graphs provide a rich substrate that enables explainable retrieval of supporting evidence and offers a natural bridge to autonomous workflows that prioritize the next experiment.
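
The description mentions two mechanisms without giving their exact form: a gated residual connection that fuses the MLP and message-passing pathways, and an aggregation of GAT edge-attention weights by relation type and layer. The following is a minimal pure-Python sketch of both ideas, not the dataset authors' implementation. The gate form (a sigmoid over the concatenated pathway outputs), the function names, and the relation labels in the usage note are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_residual_fuse(h_mlp, h_gnn, gate_weights, gate_bias):
    """Fuse two equal-length vectors as h_mlp + g * h_gnn (elementwise).

    ASSUMPTION: the gate is g = sigmoid(W @ concat(h_mlp, h_gnn) + b).
    The source only says "gated residual connection"; this is one common form.
    gate_weights has one row per output dimension, each of length 2 * len(h_mlp).
    """
    concat = list(h_mlp) + list(h_gnn)
    fused = []
    for i, (m, n) in enumerate(zip(h_mlp, h_gnn)):
        pre_activation = sum(w * x for w, x in zip(gate_weights[i], concat))
        gate = sigmoid(pre_activation + gate_bias[i])
        # Residual path keeps the MLP output; the gate scales the graph signal.
        fused.append(m + gate * n)
    return fused


def mean_attention_by_relation(edge_records):
    """Average attention weights per (layer, relation_type) pair.

    edge_records: iterable of (layer, relation_type, attention_weight) tuples,
    a hypothetical flat export of per-edge attention from a trained GAT.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for layer, relation, weight in edge_records:
        totals[(layer, relation)] += weight
        counts[(layer, relation)] += 1
    return {key: totals[key] / counts[key] for key in totals}
```

With a zero gate weight matrix and zero bias, the gate is 0.5 in every dimension, so `gated_residual_fuse([1.0, 2.0], [0.5, -1.0], [[0.0] * 4, [0.0] * 4], [0.0, 0.0])` adds half of the graph pathway to the MLP pathway. Calling `mean_attention_by_relation` on records such as `(0, "gene-protein", 0.2)` and `(0, "media-chemical", 0.9)` returns one mean per layer and relation, which can then be ranked to see which relation types receive the most attention.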
Provider:
PNNL (PNNL2)
Created:
2026-03-12