Protein Monomer Dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/flagshippioneering/flash_ipa
Description:
This dataset contains 40,492 single-chain protein monomers. Applying a maximum length cutoff of 512 residues reduces it to 36,600 structures. The FoldFlow model was retrained on this dataset to evaluate the performance of the FlashIPA algorithm. The dataset is intended for training generative models for protein structure generation.
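The 512-residue length cutoff mentioned in the description can be illustrated with a minimal sketch. It is not the dataset authors' preprocessing code: the function name `filter_by_length`, the toy chain IDs, and the choice to treat the cutoff as inclusive (chains of exactly 512 residues are kept) are all assumptions made for illustration.

```python
MAX_LEN = 512  # residue cutoff stated in the description


def filter_by_length(chain_lengths, max_len=MAX_LEN):
    """Return the IDs of chains with at most max_len residues.

    chain_lengths maps a chain ID to its residue count.
    Assumption: the cutoff is inclusive (length == max_len is kept).
    """
    return [cid for cid, n in chain_lengths.items() if n <= max_len]


# Hypothetical toy input, not real dataset entries.
toy = {"chain_a": 120, "chain_b": 512, "chain_c": 513, "chain_d": 900}
kept = filter_by_length(toy)
print(kept)
```

In practice the residue counts would come from parsing the structure files; the dictionary here only stands in for that step.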



