
Protein Large Language Models Can Predict Flavivirus Protease Target Specificity

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Protein_Large_Language_Models_Can_Predict_Flavivirus_Protease_Target_Specificity/31910617
Description:
Viral proteases are essential enzymes in many viral strains and play a crucial role in the viral replication cycle. They are key targets for antiviral drug development and have significant implications for viral pathogenesis. To address the issue of Flavivirus protease substrate promiscuity, Yellow Fever virus protease (YFP), West Nile virus protease (WNP), Zika virus protease (ZVP), Usutu virus protease (UVP), and Rocio virus protease (RVP) were recombinantly expressed in E. coli BL21(DE3) and purified. Mass spectrometric Proteomic Identification of protease Cleavage Sites (PICS) was performed using peptide libraries derived from a murine cell line lysate. A surprisingly high promiscuity in substrate specificity was detected for all five viral proteases, with a recurrence of arginine in the P1 position. Using homology modeling, specific subsites could be identified. However, the promiscuity of peptide binding was difficult to elucidate using these models. For this reason, the ProtTrans protein language model (pLM) was fine-tuned with the obtained peptide sequences. The ProtTrans T5-Encoder model, originally trained on a very large corpus of protein sequences to predict amino acids within the same protein chain, could, when fine-tuned with target peptides from the PICS experiments and decoy peptides, classify each of these groups with up to 76% test-set accuracy. Dimensionality reduction indicated that the T5 embeddings may indeed contain information useful for recognizing protein–peptide interactions. These results confirm the usefulness of pLMs for the prediction of protein–protein interactions and thus have important implications for antiviral drug design.
Created:
2026-04-01
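
The description reports a recurrence of arginine at the P1 position, the residue immediately before the cleaved bond. As a rough illustration of how such a positional preference could be tallied, the minimal sketch below counts residues at P1 across a handful of cleavage windows. The sequences in `TOY_WINDOWS`, the 8-residue window layout (P4 to P4'), and the helper `position_frequencies` are all assumptions made up for this example; they are not taken from the dataset and do not reproduce the authors' PICS analysis.

```python
from collections import Counter

# Hypothetical 8-residue windows laid out as P4 P3 P2 P1 | P1' P2' P3' P4'.
# These sequences are invented for illustration; they are NOT from the dataset.
TOY_WINDOWS = [
    "GKKRSAGA",
    "LTKRGGVL",
    "AQRRSWPL",
    "EGLKGTDA",
    "VSGRAEIT",
]

# With the layout above, the cleaved bond sits between index 3 and index 4,
# so the P1 residue is at index 3.
P1_INDEX = 3


def position_frequencies(windows, index):
    """Return the relative frequency of each residue at one window position."""
    counts = Counter(window[index] for window in windows)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


p1_freqs = position_frequencies(TOY_WINDOWS, P1_INDEX)
most_common_residue = max(p1_freqs, key=p1_freqs.get)
print(most_common_residue, p1_freqs[most_common_residue])
```

The same helper can be pointed at any other index to compare subsites, for example `position_frequencies(TOY_WINDOWS, 2)` for P2. Real PICS output would first need each identified peptide mapped back to its source protein to recover the full window around the cleavage site.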