Datasets for practical model selection for prospective virtual screening

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/1257462
Description:
This repository contains datasets for the manuscript "Practical model selection for prospective virtual screening":

pria_rmi_cv.tar.gz: A compressed directory containing chemical screening data for the PriA-SSB AS, PriA-SSB FP, and RMI-FANCM FP binary datasets. The files also contain the associated continuous % inhibition values and chemical features represented as SMILES and Morgan fingerprints. The dataset has been split into five folds for cross validation.

pria_rmi_pcba_cv.tar.gz: A compressed directory containing chemical screening data for the PriA-SSB AS, PriA-SSB FP, and RMI-FANCM FP binary datasets as well as public PubChem BioAssay datasets. The files also contain the PriA-SSB and RMI-FANCM continuous % inhibition values and chemical features represented as SMILES and Morgan fingerprints. The dataset has been split into five folds for cross validation. Missing values are left blank.

pria_prospective.csv.gz: A compressed file containing chemical screening data for the binary dataset PriA-SSB prospective. The file also contains the continuous % inhibition values and chemical features represented as SMILES and Morgan fingerprints.

If you use these data in a publication, please cite:

Shengchao Liu+, Moayad Alnammi+, Spencer S. Ericksen, Andrew F. Voter, Gene E. Ananiev, James L. Keck, F. Michael Hoffmann, Scott A. Wildman, Anthony Gitter. Practical Model Selection for Prospective Virtual Screening. Journal of Chemical Information and Modeling. 2018 doi:10.1021/acs.jcim.8b00363

PubChem data were provided by the PubChem database. Follow the PubChem citation guidelines if you use the PubChem data. See Voter et al. 2017 (PubChem AID 1272365) for the PriA-SSB screening data and Voter et al. 2016 (PubChem AID 1159607) for RMI-FANCM.

Version 1.1.0 updates all of the data files. We standardized the SMILES in all files by generating canonical SMILES with RDKit version 2016.03.4. In addition, we removed 2845 chemicals from pria_prospective.csv.gz that were duplicates of compounds in pria_rmi_cv.tar.gz.
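
The duplicate removal described for version 1.1.0 can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a column named SMILES (hypothetical, the actual header is not stated here), uses toy in-memory data in place of the real files, and relies on exact string matching, which is only meaningful when both sets hold SMILES canonicalized the same way.

```python
import csv
import gzip
import io


def read_gz_csv(data):
    """Parse gzip-compressed CSV bytes into a list of row dictionaries."""
    with gzip.open(io.BytesIO(data), mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))


def drop_known_compounds(rows, known_smiles, smiles_column="SMILES"):
    """Keep only rows whose SMILES string is absent from known_smiles.

    The column name "SMILES" is an assumption made for this sketch.
    """
    known = set(known_smiles)
    return [row for row in rows if row[smiles_column] not in known]


def make_gz_csv(text):
    """Build gzip-compressed CSV bytes so the example needs no files."""
    return gzip.compress(text.encode("utf-8"))


# Toy stand-ins for the cross-validation and prospective sets.
cv_bytes = make_gz_csv("SMILES,label\nCCO,0\nc1ccccc1,1\n")
prospective_bytes = make_gz_csv("SMILES,label\nCCO,0\nCC(=O)O,1\nCCN,0\n")

cv_rows = read_gz_csv(cv_bytes)
prospective_rows = read_gz_csv(prospective_bytes)

# Remove prospective compounds that already occur in the cross-validation set.
unique_rows = drop_known_compounds(
    prospective_rows, (row["SMILES"] for row in cv_rows)
)
print([row["SMILES"] for row in unique_rows])  # ['CC(=O)O', 'CCN']
```

With the real data, the same helper could be applied to rows read from pria_prospective.csv.gz after inspecting its header to find the actual SMILES column name.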
Created:
2020-01-24