ActivityFinder: Toward the Fully Automatic Integration of Structural and Binding Affinity Data

Source: Figshare. Updated 2026-01-02. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/ActivityFinder_Toward_the_Fully_Automatic_Integration_of_Structural_and_Binding_Affinity_Data/30987421
Description:
The reliable integration of structural and bioactivity data remains a significant bottleneck in computational chemistry and cheminformatics. While curated databases such as PDBbind and ChEMBL provide valuable resources, integrating structural data (for example, PDB files) with bioactivity assay data held in structured databases such as ChEMBL is inherently complex, and no fully integrated solution has been published so far. This work introduces ActivityFinder, a fully automated method for linking protein–ligand crystal structures to bioactivity assay data without relying on external services or continuous data connections. The method requires only structural information in the form of PDB files and a structured SQL database such as ChEMBL, making it well suited to the proprietary or unpublished data sets typically used in the pharmaceutical industry and in early-stage research in general. ActivityFinder uses sequence alignments and detailed chemical structure matching. Its accuracy is demonstrated on the task of associating PDB entries with the corresponding data in ChEMBL. Applying this method, we linked 20197 PDB structures and 13734 ligands with 17829 unique ChEMBL ligands across 2585 targets, covering over one million bioactivity data points. Compared to existing approaches based on identifier mapping, ActivityFinder reproduces reported links and also broadens the set of linked data by explicitly addressing ligand heterogeneity, sequence variants, and binding-site mutations at an atomistic level. ActivityFinder is available via the REST API of the ProteinsPlus platform, and the data is published as a PostgreSQL database dump, enabling scientists to integrate and explore structural and bioactivity data reliably.
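The two-step idea in the description (match the protein by sequence, then match the ligand by chemical structure) can be illustrated with a toy sketch. This is not the ActivityFinder implementation: the record types, field names, threshold, and the `link` function are all hypothetical, `difflib.SequenceMatcher` is only a crude stand-in for a real sequence alignment, and comparing a precomputed ligand key for equality stands in for detailed chemical structure matching.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class StructureRecord:
    """Hypothetical record for one protein-ligand structure."""
    structure_id: str
    sequence: str     # protein sequence, one-letter codes
    ligand_key: str   # assumed canonical ligand identifier


@dataclass(frozen=True)
class AssayRecord:
    """Hypothetical record for one bioactivity measurement."""
    target_id: str
    sequence: str
    ligand_key: str
    value_nm: float


def sequence_similarity(a: str, b: str) -> float:
    # Stand-in for an alignment-based identity score (assumption):
    # the ratio is 2 * matched characters / total characters.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def link(structures, assays, min_similarity=0.9):
    """Pair structures with assays that share a ligand and a similar sequence."""
    links = []
    for s in structures:
        for a in assays:
            if s.ligand_key != a.ligand_key:
                continue  # ligand must match first
            if sequence_similarity(s.sequence, a.sequence) >= min_similarity:
                links.append((s.structure_id, a.target_id, a.value_nm))
    return links


# Toy data, invented purely for illustration.
structures = [StructureRecord("TOY1", "MKTAYIAKQR", "ligand-A")]
assays = [
    AssayRecord("target-1", "MKTAYIAKQR", "ligand-A", 12.0),  # same protein, same ligand
    AssayRecord("target-2", "GGGGGGGGGG", "ligand-A", 5.0),   # same ligand, unrelated protein
    AssayRecord("target-1", "MKTAYIAKQR", "ligand-B", 7.0),   # same protein, other ligand
]
toy_links = link(structures, assays)
```

In this toy data only the first assay shares both the ligand key and a sufficiently similar sequence with the structure. Lowering `min_similarity` is the simplest way to tolerate sequence variants in this sketch; the atomistic handling of binding-site mutations that the description mentions is out of scope here.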
Created:
2026-01-02