
Fine-tuning Pre-trained Antibody Language Models for Antigen Specificity Prediction

Source: Figshare. Updated 2024-05-23; indexed 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Fine-tuning_Pre-trained_Antibody_Language_Models_for_Antigen_Specificity_Prediction/25342924/1
Description:
Abstract
Antibodies play a crucial role in the adaptive immune response, and their specificity to antigens is a fundamental determinant of immune function. Accurate prediction of antibody antigen specificity is vital for understanding immune responses, guiding vaccine design, and developing antibody-based therapeutics. In this study, we explore the effect of fine-tuning pre-trained antibody language models on binding specificity prediction for SARS-CoV-2 spike protein and influenza hemagglutinin. We fine-tuned four pre-trained antibody language models on labeled data specific to these antigens and demonstrated that fine-tuned language model classifiers achieve higher predictive accuracy than classifiers trained on pre-trained model embeddings. Additionally, we investigated how model attention activations change after fine-tuning to gain insight into the molecular basis of antigen recognition by antibodies. Furthermore, we applied the fine-tuned models to BCR repertoire data related to influenza and SARS-CoV-2 vaccination, demonstrating their ability to capture changes in the repertoire following vaccination. Overall, our study highlights fine-tuning of pre-trained antibody language models as a valuable tool for improving antigen specificity prediction.

Code
All code used for model training and testing is available on Bitbucket: https://bitbucket.org/kleinstein/projects/src/master/Wang2024/
An archived version of the Bitbucket repository is included in code.zip.

Data
The following files are included in data.zip:
S_FULL.parquet: Sequences and labels for S binding prediction (full-length).
S_CDR3.parquet: Sequences and labels for S binding prediction (CDR3 only).
HA_FULL.parquet: Sequences and labels for HA binding prediction (full-length).
HA_CDR3.parquet: Sequences and labels for HA binding prediction (CDR3 only).
S_repertoires.parquet: Repertoires used for S binding prediction.
HA_repertoires.parquet: Repertoires used for HA binding prediction.
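
Because the data ship as Parquet files, a short loading helper may be useful. The following is only a minimal sketch, not part of the authors' code: it assumes data.zip has been extracted to a local directory (the name `data` is hypothetical), that the labeled files follow an `<antigen>_<region>.parquet` naming pattern, and that pandas with a Parquet engine such as pyarrow is installed. The function names are illustrative, and the column names are not documented here, so the sketch does not assume any.

```python
from pathlib import Path

# Antigen and region codes taken from the file listing; treating them as a
# strict naming pattern is an assumption of this sketch.
ANTIGENS = ("S", "HA")
REGIONS = ("FULL", "CDR3")


def dataset_path(data_dir, antigen, region):
    """Build the path to a labeled sequence file, e.g. S_FULL.parquet."""
    if antigen not in ANTIGENS:
        raise ValueError(f"unknown antigen: {antigen!r}")
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return Path(data_dir) / f"{antigen}_{region}.parquet"


def repertoire_path(data_dir, antigen):
    """Build the path to a repertoire file, e.g. HA_repertoires.parquet."""
    if antigen not in ANTIGENS:
        raise ValueError(f"unknown antigen: {antigen!r}")
    return Path(data_dir) / f"{antigen}_repertoires.parquet"


def load_dataset(data_dir, antigen, region):
    """Read one labeled file into a pandas DataFrame."""
    # Imported lazily so the path helpers work without pandas installed.
    import pandas as pd

    return pd.read_parquet(dataset_path(data_dir, antigen, region))
```

For example, `load_dataset("data", "S", "CDR3")` would read the CDR3-only spike file; inspecting `.columns` on the result is the safest way to discover the actual sequence and label fields.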
Provided by:
Patsenker, Jonathan; Wang, Mamie; Kluger, Yuval; Kleinstein, Steven; Li, Henry
Created:
2024-05-23