AVIDa-SARS-CoV-2, VHHCorpus-2M
Source: arXiv. Updated 2024-06-03. Indexed 2024-06-21.
Download link:
https://avida-sars-cov-2.cognanous.com
Overview:
AVIDa-SARS-CoV-2 is a dataset created by COGNANO and partner institutions that covers interactions between SARS-CoV-2 and VHHs, the variable domains of heavy-chain antibodies. It contains 77,003 samples, each labeling whether a VHH sequence binds to one of 12 SARS-CoV-2 variants. The data were generated through immunization and affinity selection, which supports the reliability of the labels. VHHCorpus-2M is a companion pre-training dataset of more than two million VHH sequences, intended for training antibody language models. Together, the two datasets are aimed at accelerating the discovery of antibody therapeutics, in particular at predicting and evaluating how well antibodies bind to viral variants.
Provided by:
COGNANO Inc., SAKURA internet Inc., Biorhodes Inc.
Created:
2024-05-29



