
General Intelligence framework to predict Virus Adaptation based on genome Language model

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14233091
Description:
Current artificial intelligence (AI) solutions for assessing virus phenotypes are mostly limited to fixed tasks addressed by models trained in advance. We aim to build a General Intelligence framework to predict Virus Adaptation based on a Language model (GIVAL) that works instantly for any input gene or gene segment from any virus. GIVAL's gene embedder, named virus Bidirectional Encoder Representations from Transformers (vBERT), was pretrained on context-dependently segmented tokens from currently available viruses. The host adaptation of an input virus is predicted from its vBERT embedding by an input-specified deep learning model trained on input-specified training data and labels. GIVAL's vBERT embedded virus genes well, outperforming a vBERT pretrained on tokens of a fixed number of amino acids, at both the coarse-grained intact-gene level and the fine-grained gene-site level. Given segmented Hemagglutinin (HA) or Spike as input, GIVAL interpretably predicted the high human adaptation of swine H3N2 and equine H3N8 influenza viruses and the receptor-binding variance of various types of coronaviruses. Based on multiple intact viral genes, GIVAL predicted a significant adaptation shift in monkeypox viruses since 2022. In summary, this study provides a general AI solution for assessing virus risk and highlights the importance of adaptation shift for the transmission risk of multiple viruses. Scripts and data are available on GitHub (https://github.com/Jamalijama/GIVAL) and Zenodo (10.5281/zenodo.14233092).
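The description compares GIVAL's context-dependent tokens against a baseline that uses a fixed number of amino acids per token. The baseline alone can be sketched as follows. This is a minimal illustration, not code from the GIVAL repository: the function name `fixed_length_tokens`, the token size `k=3`, and the example sequence are all assumptions, and GIVAL's own context-dependent segmentation is not reproduced here because the description does not specify it.

```python
from typing import List


def fixed_length_tokens(protein: str, k: int = 3) -> List[str]:
    """Split an amino-acid sequence into non-overlapping tokens of k residues.

    Hypothetical helper for illustration only. A trailing token shorter
    than k is kept so that no residue is dropped.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = protein.strip().upper()
    # Step through the sequence k residues at a time.
    return [seq[i:i + k] for i in range(0, len(seq), k)]


# Illustrative input string (an assumption, not taken from the dataset).
example = "MKTIIALSYIFCLVFA"
tokens = fixed_length_tokens(example, k=3)
print(tokens)
```

Because the token boundaries here depend only on position, the same motif can be split differently depending on where it starts in the sequence, which is one plausible reason a context-dependent segmentation might embed genes better.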
Created:
2024-12-11