Knowledge Graph of RB-Tnseq Data from Fitness Browser (KP-DP1)
Source: DataCite Commons | Updated: 2026-03-12 | Indexed: 2026-04-25
Download link:
https://www.osti.gov/servlets/purl/3022643
Description:
Motivation: Predicting microbial gene fitness across environmental conditions remains a central challenge for predictive phenomics and autonomous experimentation. Fitness assays generate large volumes of genotype–phenotype measurements that are difficult to integrate with experimental metadata and biological function in a form that supports mechanistic reasoning. Knowledge graphs offer a semantic framework for unifying these modalities and enabling context-aware inference.

Results: We build GIMME (Graph Inference for Microbial Metabolism Exploration), a semantically grounded knowledge graph that unifies gene fitness measurements spanning 10 Pseudomonas species with experimental metadata and biological context. Media are decomposed into chemical components, and experiments carry structured links to natural-language descriptions. The resulting graph supports two inference modes: (1) symbolic graph traversal to surface candidate gene–environment and gene–chemical associations, and (2) learned inference using heterogeneous graph neural networks that propagate information across neighborhoods. We formulate link regression over (gene, media, experiment) triplets, combining learned gene embeddings with text embeddings of node descriptions from a pretrained LLM to predict gene fitness. We then augment a baseline MLP with an auxiliary message-passing encoder (GraphSAGE/GAT) that propagates information over gene–protein–function and media–chemical subgraphs, and fuse the two pathways with a gated residual connection. This approach produces strong agreement with held-out fitness measurements (Pearson r = 0.74 with GraphSAGE) while also highlighting inference challenges in extreme-fitness regimes. We aggregate GAT edge-attention weights by relation type and layer to estimate which biological and environmental relations most influence fitness predictions.

Conclusion: This work explores using knowledge graphs as “context graphs” for microbial phenotype prediction. Such graphs provide a rich substrate that enables explainable retrieval of supporting evidence and offers a natural bridge to autonomous workflows that prioritize the next experiment.
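
The symbolic traversal mode described in the abstract can be illustrated with a minimal sketch. It assumes a toy triple store; every node identifier, relation name (`has_fitness_in`, `uses_media`, `contains`), fitness score, and the `min_abs_fitness` threshold below are hypothetical and are not taken from the GIMME schema or the dataset. The sketch walks gene -> experiment -> media -> chemical and keeps only paths that start from a strong fitness effect, which is one plausible way to surface candidate gene–chemical associations.

```python
from collections import defaultdict

# Toy, hypothetical triples: (subject, relation, object, optional weight).
# None of these identifiers or scores come from the actual dataset.
TRIPLES = [
    ("gene:G1", "has_fitness_in", "exp:E1", -2.5),
    ("gene:G1", "has_fitness_in", "exp:E2", -0.1),
    ("gene:G2", "has_fitness_in", "exp:E2", -3.0),
    ("exp:E1", "uses_media", "media:M1", None),
    ("exp:E2", "uses_media", "media:M2", None),
    ("media:M1", "contains", "chem:glucose", None),
    ("media:M1", "contains", "chem:ammonium", None),
    ("media:M2", "contains", "chem:citrate", None),
]


def build_index(triples):
    """Index outgoing edges as index[subject][relation] -> [(object, weight), ...]."""
    index = defaultdict(lambda: defaultdict(list))
    for subj, rel, obj, weight in triples:
        index[subj][rel].append((obj, weight))
    return index


def candidate_chemicals(index, gene, min_abs_fitness=1.0):
    """Follow gene -> experiment -> media -> chemical for strong fitness effects.

    Returns a sorted list of (chemical, experiment, media, fitness) tuples, so
    each candidate association keeps the path that supports it.
    """
    hits = []
    for exp, fitness in index[gene]["has_fitness_in"]:
        if abs(fitness) < min_abs_fitness:
            continue  # skip near-neutral fitness scores
        for media, _ in index[exp]["uses_media"]:
            for chem, _ in index[media]["contains"]:
                hits.append((chem, exp, media, fitness))
    return sorted(hits)


index = build_index(TRIPLES)
for hit in candidate_chemicals(index, "gene:G1"):
    print(hit)
```

Returning the full path alongside each chemical keeps the supporting evidence attached to every candidate, in the spirit of the explainable retrieval the abstract mentions. A production graph would more likely be queried through a graph database or an RDF store than through in-memory dictionaries.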
Provider:
PNNL (PNNL2)
Created:
2026-03-12



