AbSet: A Standardized Dataset of Antibody Structures for Machine Learning Applications
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14888001
Description:
AbSet is a dataset of antibodies extracted from the PDB, carefully standardized, and enriched with a subset of in silico-generated antibody-antigen complexes containing poses similar to the bound state, along with a novel set of decoys. In total, AbSet comprises over 800,000 structures, encompassing antibodies with paired heavy and light chains (VH-VL), heavy chains only (VH), light chains only (VL), and single-chain variable fragments (scFv), including both free antibodies and those complexed with protein antigens.
The in silico dataset was generated through molecular docking using HADDOCK following two distinct approaches:
Blind Docking: Conducted using 2135 experimentally determined antibody-antigen complexes.
Site-Directed Docking: Applied to 1755 complexes, in which the antibody sequences were extracted, modeled using ABodyBuilder2, and then docked to their original crystallized antigen.
Each docking run produced 250 poses, which were classified into four quality categories based on DockQ: high quality, medium quality, acceptable quality, and incorrect.
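The description does not state the DockQ cutoffs used for the four categories. As a minimal sketch, the snippet below assumes the commonly used DockQ thresholds (0.23, 0.49, 0.80); the function names `classify_dockq` and `summarize_poses` are hypothetical and are not part of the AbSet code.

```python
from collections import Counter

# Assumed cutoffs: the standard DockQ quality bands, not confirmed by the dataset description.
THRESHOLDS = [(0.80, "high"), (0.49, "medium"), (0.23, "acceptable")]


def classify_dockq(score: float) -> str:
    """Map a DockQ score in [0, 1] to one of the four quality labels."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"DockQ score must be in [0, 1], got {score}")
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "incorrect"


def summarize_poses(scores):
    """Count how many poses of one docking run fall into each category."""
    return Counter(classify_dockq(s) for s in scores)
```

Applied to the 250 scores of a single docking run, `summarize_poses` would give the per-category pose counts for that system.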
This dataset includes molecular descriptors of amino acid residues. These descriptors were calculated for all standardized antibody structures obtained from the PDB. For in silico structures generated via docking, molecular descriptors were computed for four selected structures out of the 250 poses generated per system. The code used to calculate the molecular descriptors is available in the GitHub repository.
The descriptors include:
Solvent Accessible Surface Area
Relative Accessible Surface Area
Atomic depth
Protrusion index
Hydrophobicity
Sequence
Half-sphere exposure calculations
Cα coordinates
ϕ and ψ dihedral angles
Secondary structure of the protein
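To illustrate one of the descriptors listed above, the sketch below computes ϕ and ψ from backbone atom coordinates using the standard four-point dihedral formula. It is a generic illustration, not the code from the authors' repository; the helper names are hypothetical, and coordinates are assumed to be (x, y, z) tuples in Å.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi of residue i: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi of residue i: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)
```

The first residue of a chain has no ϕ and the last has no ψ, because the neighbouring atom each angle needs is missing.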
Organization of Available Data:
📂 PDBs_Files (Antibodies extracted from the PDB)
│── 📂 Structures
│── 📂 Descriptors
📂 InSilicoComplexStructures-MonomersFromXtal (Blind Docking)
│── 📂 Structures
│── 📂 Descriptors
│── 📂 Index DockQ
📂 InSilicoComplexStructures-MonomersFromModeling (Site-Directed Docking)
│── 📂 Structures
│── 📂 Descriptors
│── 📂 Index DockQ
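After downloading and unpacking the archive, a quick check of the folder layout can catch incomplete extractions. The sketch below assumes the folder names are exactly as written in the listing above (including the space in "Index DockQ"), which may differ in the actual archive; `missing_folders` is a hypothetical helper.

```python
from pathlib import Path

# Folder names copied from the listing; treat them as assumptions about the archive.
DOCKING_SUBFOLDERS = ["Structures", "Descriptors", "Index DockQ"]
LAYOUT = {
    "PDBs_Files": ["Structures", "Descriptors"],
    "InSilicoComplexStructures-MonomersFromXtal": DOCKING_SUBFOLDERS,
    "InSilicoComplexStructures-MonomersFromModeling": DOCKING_SUBFOLDERS,
}


def missing_folders(root):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [
        (Path(top) / sub).as_posix()
        for top, subs in LAYOUT.items()
        for sub in subs
        if not (root / top / sub).is_dir()
    ]
```

An empty list from `missing_folders("path/to/AbSet")` means all eight expected sub-folders are present.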
Created:
2025-02-25



