A Benchmark Set of Bioactive Molecules for Diversity Analysis of Compound Libraries and Combinatorial Chemical Spaces
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/A_Benchmark_Set_of_Bioactive_Molecules_for_Diversity_Analysis_of_Compound_Libraries_and_Combinatorial_Chemical_Spaces/29950401
Description:
Sources of commercially available compounds have grown continuously
for several years, culminating in billion- to trillion-sized
combinatorial Chemical Spaces. To assess how well a compound collection
provides relevant chemistry, a benchmark set of pharmaceutically
relevant structures is required that enables an unbiased comparison.
For this purpose, the ChEMBL database was
mined for molecules displaying biological activity, and three benchmark
sets of successive orders of magnitude were created by systematic
filtering and processing: Set L (“large-sized,” 379k), Set M
(“medium-sized,” 25k), and Set S (“small-sized,” 3k). Tailored
for broad coverage of the physicochemical and topological landscape,
the benchmark Set S was then employed to analyze
the chemical diversity capacities of commercial combinatorial Chemical
Spaces and enumerated compound libraries. Across the three search
methods used (FTrees, based on pharmacophore features; SpaceLight,
on molecular fingerprints; and SpaceMACS, on maximum common
substructure), eXplore and REAL Space consistently performed best.
In general, each Chemical
Space was able to provide a larger number of compounds more similar
to the respective query molecule than the enumerated libraries, while
also individually offering unique scaffolds for each method.
Created:
2025-08-20



