
ABCpred: Prediction of Continuous B-Cell Epitopes in an Antigen Using Recurrent Neural Network

Source: DataCite Commons. Updated: 2026-05-06. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20047945
Description:
Dataset for Continuous B-Cell Epitope Prediction

Overview

This repository contains the curated dataset used in our study on the prediction of continuous B-cell epitopes using machine learning approaches, specifically recurrent neural networks (RNNs). The dataset was originally developed for the identification and prediction of linear (continuous) B-cell epitopes in antigenic protein sequences.

This resource may be useful for researchers working in:

- Immunoinformatics
- Vaccine design
- Antibody epitope prediction
- Computational immunology
- Machine learning-based peptide classification

Reference

Saha S, Raghava GP. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins. 2006 Oct 1;65(1):40-8. doi: 10.1002/prot.21078. PMID: 16894596.
https://onlinelibrary.wiley.com/doi/10.1002/prot.21078

Dataset Description

The dataset consists of experimentally validated B-cell epitopes and negative peptide samples.

Positive Dataset

- Contains 700 non-redundant, experimentally validated continuous B-cell epitopes
- Collected from the Bcipep database
- Only epitopes of length ≤ 20 amino acids were considered
- Redundant sequences were removed to reduce bias

Negative Dataset

- Contains 700 non-epitope peptide sequences
- Randomly generated from Swiss-Prot proteins
- Any sequence identical to known epitopes was removed

Thus, the final benchmark dataset contains:

Dataset Type       Number of Sequences
B-cell Epitopes    700
Non-Epitopes       700
Total              1400

Data Processing

To create fixed-length patterns suitable for neural network training:

- Variable-length epitopes were normalized to fixed window lengths
- Neighboring residues from the parent antigen sequence were added when needed
- Multiple window sizes (10, 12, 14, 16, 18, and 20 residues) were evaluated

The best performance was achieved with:

- Window Length: 16 residues
- Model: Recurrent Neural Network (Jordan Network)

Applications

This dataset can be used for:

- Training machine learning/deep learning models
- Benchmarking epitope prediction tools
- Feature engineering on peptide sequences
- Comparative studies with modern protein language models
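
The window normalization and negative sampling steps described under Data Processing and Negative Dataset can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `to_fixed_window` and `sample_negatives`, the even left/right split of added residues, the symmetric trimming of epitopes longer than the window, and the shifting of the window at sequence ends are all assumptions made for illustration. The proteins passed to `sample_negatives` stand in for Swiss-Prot sequences that the caller has already loaded.

```python
import random


def to_fixed_window(antigen: str, epitope: str, window: int = 16) -> str:
    """Return a `window`-residue pattern around `epitope` inside `antigen`.

    Assumption: flanking residues are added (or epitope residues trimmed)
    as evenly as possible on both sides; the window is shifted inward if
    it would run past either end of the antigen.
    """
    if len(antigen) < window:
        raise ValueError("parent antigen is shorter than the window")
    start = antigen.find(epitope)
    if start < 0:
        raise ValueError("epitope not found in parent antigen")
    # extra > 0: residues to add from the antigen; extra < 0: residues to trim.
    extra = window - len(epitope)
    lo = start - extra // 2
    # Keep the window inside the antigen.
    lo = max(0, min(lo, len(antigen) - window))
    return antigen[lo:lo + window]


def sample_negatives(proteins, epitopes, n, length=16, seed=0, max_tries=100_000):
    """Draw up to `n` distinct random peptides that match no known epitope.

    Assumption: "identical to known epitopes" is checked by exact string
    equality against the sequences in `epitopes`. Fewer than `n` peptides
    are returned if `max_tries` is exhausted first.
    """
    rng = random.Random(seed)
    known = set(epitopes)
    candidates = [p for p in proteins if len(p) >= length]
    if not candidates:
        raise ValueError("no protein is long enough for the requested length")
    negatives = set()
    for _ in range(max_tries):
        if len(negatives) == n:
            break
        protein = rng.choice(candidates)
        start = rng.randrange(len(protein) - length + 1)
        peptide = protein[start:start + length]
        if peptide not in known:
            negatives.add(peptide)
    return sorted(negatives)


if __name__ == "__main__":
    # Toy sequence, not taken from the dataset.
    antigen = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
    pattern = to_fixed_window(antigen, "MNPQ", window=16)
    print(pattern, len(pattern))
    print(sample_negatives([antigen], [pattern], n=3))
```

Repeating the call with `window` set to 10, 12, 14, 18, or 20 would produce the other pattern lengths listed above.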
Provider:
Zenodo
Created:
2026-05-06