HelixSeed: A MaSIF-Ready α-Helical Seed Library from the Human Proteome for Surface-Driven de novo Binder Design
Source: DataCite Commons | Updated: 2026-01-16 | Indexed: 2026-05-05
Download link:
https://www.scidb.cn/detail?dataSetId=d5f3105d1c23423b9b13f9066942bb8b
Description:
HelixSeed is a publicly released, MaSIF-ready α-helical seed resource designed to accelerate surface-driven de novo protein binder design and other protein-surface learning tasks. The current release focuses on the human (Homo sapiens) reference proteome by integrating structural models from the AlphaFold Protein Structure Database (AFDB) and experimentally determined structures from the Protein Data Bank (PDB). In total, HelixSeed contains 19,715 human protein structure entries and 109,603 extracted α-helical segments curated using DSSP-based secondary-structure annotation and length/quality filtering (≥10 residues). For each helix fragment, we provide standardized structure files and precomputed MaSIF-compatible molecular surface representations, including solvent-excluded surface meshes and mapped physicochemical features (e.g., hydropathy, hydrogen-bond potential, and electrostatics). Following the MaSIF preprocessing procedure, surfaces are further partitioned into geodesic patches (9 Å for MaSIF-site scoring and 12 Å for MaSIF-search fingerprinting), yielding 79,927,906 local surface patches with associated descriptors/fingerprints. HelixSeed enables rapid nearest-neighbor seed retrieval for target surface sites and can be readily reused as a plug-and-play seed space for MaSIF-search workflows, interface recognition, and fragment-driven interface engineering pipelines.
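The helix-extraction step described above (DSSP annotation followed by a ≥10-residue length filter) can be illustrated with a minimal sketch. It assumes the secondary structure is already available as a per-residue DSSP-style string in which `H` marks α-helix; the function name `extract_helices`, the toy string, and the choice to count only `H` (not 3-10 or π helices) are illustrative assumptions, not the dataset's actual pipeline, and no quality filtering is shown.

```python
from itertools import groupby

MIN_HELIX_LEN = 10  # the >=10-residue threshold stated in the description


def extract_helices(ss, min_len=MIN_HELIX_LEN):
    """Return (start, end) pairs (0-based, end exclusive) for runs of 'H'
    that are at least min_len residues long."""
    segments = []
    pos = 0
    for code, run in groupby(ss):
        length = len(list(run))
        if code == "H" and length >= min_len:
            segments.append((pos, pos + length))
        pos += length
    return segments


# Toy DSSP-style string (hypothetical): a 12-residue helix, a turn,
# a 4-residue helix (too short), a strand, and a 10-residue helix.
toy_ss = "--" + "H" * 12 + "TT" + "H" * 4 + "EEE" + "H" * 10
helices = extract_helices(toy_ss)
print(helices)  # [(2, 14), (23, 33)]
```

Only the two runs that meet the length threshold are kept; the 4-residue helix is discarded.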
Provider:
Science Data Bank
Created:
2026-01-16
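
The nearest-neighbor seed retrieval mentioned in the description can be sketched as a top-k search over patch fingerprints. This is a minimal sketch under stated assumptions: fingerprints are treated as fixed-length numeric vectors compared by Euclidean distance, the dimension `DIM = 80` is assumed, the library is random synthetic data rather than HelixSeed content, and `top_k_seeds` is a hypothetical helper. The dataset's actual file layout and loading code are not shown.

```python
import heapq
import math
import random


def top_k_seeds(query, library, k=5):
    """Return the k (index, distance) pairs from library closest to query,
    sorted by increasing Euclidean distance."""
    scored = ((math.dist(query, fp), i) for i, fp in enumerate(library))
    return [(i, d) for d, i in heapq.nsmallest(k, scored)]


random.seed(0)
DIM = 80  # assumed fingerprint length, for illustration only

# Synthetic stand-in for a fingerprint library (1000 random vectors).
library = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(1000)]

# A query that is a slightly perturbed copy of entry 42, so entry 42
# should come back as the closest seed.
query = [x + random.gauss(0, 0.01) for x in library[42]]

hits = top_k_seeds(query, library, k=5)
print(hits[0][0])  # 42
```

A linear scan like this is only practical for small libraries; at the scale of tens of millions of patches, a vectorized or approximate nearest-neighbor index would likely be needed.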



