An HTS-derived AI evaluation dataset with realistic virtual screening space and non-trivial class separation
Source: DataCite Commons. Updated: 2026-05-05. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20030795
Description:
An evaluation benchmark derived from high-throughput screening (HTS) data. UMAP-based sampling and clustering are used to represent the virtual screening (VS) space realistically while enforcing non-trivial class separability. The dataset is designed primarily for classification tasks; continuous percent-inhibition values are also provided, together with their standard error and standard deviation, for potential regression applications. These activity measurements are inherently noisy: they come from primary HTS conditions and are based on a fluorescence readout of helicase activity. This indirect measurement is sensitive to experimental variability, including plate effects, signal plateauing, and non-enzymatic contributions, which are mitigated with trap DNA. Despite assay optimization (e.g., buffer conditions and reaction timing) and high overall assay quality (Z' = 0.86), these factors make the continuous labels noisy and can produce false positives and false negatives.
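The description quotes an assay quality of Z' = 0.86 and mentions both a standard deviation and a standard error for each activity value. As a minimal sketch of what those statistics mean, the snippet below computes the Z'-factor from control wells using its conventional definition, Z' = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg|, and a standard error as SD / √n. The function names and the control readings are hypothetical illustrations, not part of the dataset, and the record does not state how the authors computed their standard errors, so the SD / √n form is an assumption.

```python
import math
import statistics


def z_prime(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    mean_pos = statistics.mean(positive_controls)
    mean_neg = statistics.mean(negative_controls)
    # Sample standard deviations (n - 1 in the denominator).
    sd_pos = statistics.stdev(positive_controls)
    sd_neg = statistics.stdev(negative_controls)
    window = abs(mean_pos - mean_neg)
    if window == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (sd_pos + sd_neg) / window


def standard_error(replicates):
    """Standard error of the mean, assuming the usual SD / sqrt(n) form."""
    return statistics.stdev(replicates) / math.sqrt(len(replicates))


if __name__ == "__main__":
    # Hypothetical control readings (percent inhibition), for illustration only.
    pos = [100.0, 102.0, 98.0]
    neg = [0.0, 2.0, -2.0]
    print("Z' =", round(z_prime(pos, neg), 2))
    print("SE of negative controls =", round(standard_error(neg), 3))
```

A Z' close to 1 indicates a wide, clean separation between controls. Even so, a high Z' describes the controls only, which is consistent with the description's caution that individual compound readouts can still be noisy.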
Provider:
Zenodo
Created:
2026-05-04



