Table 4_GViT-GP: injecting the genomic relationship matrix as an inductive bias into a vision transformer via cross-attention for genomic prediction.xlsx

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

https://figshare.com/articles/dataset/Table_4_GViT-GP_injecting_the_genomic_relationship_matrix_as_an_inductive_bias_into_a_vision_transformer_via_cross-attention_for_genomic_prediction_xlsx/31800043

Description:

Introduction: Genomic Prediction (GP) faces significant challenges in balancing model complexity with computational efficiency, particularly for high-dimensional genomic data under limited sample sizes.

Methods: We propose GViT-GP, a Vision Transformer architecture that injects the Genomic Relationship Matrix (GRM) as a biological prior via a dual-pathway cross-attention fusion mechanism, coupled with a Selective Patch Embedding strategy to reduce redundancy and improve data efficiency.

Results: We evaluated GViT-GP on 20 traits across four datasets from three species (soybean, cattle, and chicken). GViT-GP outperformed established linear and non-linear baselines (including GBLUP, LightGBM, and DNNGP), achieving the best accuracy in 16/20 tasks. Ablation studies supported the effectiveness of Selective Patch Embedding and cross-attention fusion, and visualization analyses suggest adaptive attention to informative genomic regions.

Discussion: These results indicate that injecting GRM-informed inductive bias improves robustness and generalization in “p ≫ n” settings. GViT-GP provides a practical, high-performance framework for capturing complex genotype–phenotype relationships in modern digital breeding.

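The description does not say how the Genomic Relationship Matrix is built. As a minimal sketch, assuming the commonly used VanRaden (2008) first method and genotypes coded as 0/1/2 counts of one allele, a GRM could be computed as below. The function name `vanraden_grm` and the toy genotype matrix are illustrative assumptions and are not taken from the GViT-GP code or this dataset.

```python
def vanraden_grm(genotypes):
    """Compute a genomic relationship matrix (assumed: VanRaden method 1).

    genotypes: list of rows, one per individual; each row holds 0/1/2
    allele counts, one per marker. Returns an n x n list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]

    # Center each genotype by twice the allele frequency: Z = M - 2p.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]

    # Scaling factor 2 * sum_j p_j * (1 - p_j).
    denom = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if denom == 0.0:
        raise ValueError("all markers are monomorphic; GRM is undefined")

    # G = Z Z^T / denom.
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / denom for k in range(n)]
        for i in range(n)
    ]


# Toy example (hypothetical data): 3 individuals, 3 markers.
example = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]
grm = vanraden_grm(example)
```

The result is a symmetric n × n matrix over individuals. In a model of the kind described above, such a matrix would presumably feed the pathway that carries the relationship prior, but how GViT-GP actually consumes it is not stated here.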

Created:

2026-03-18