Comprehensive Benchmark Datasets and High-Throughput Virtual Screening Libraries for Acute Oral Toxicity Prediction Across Ten Privileged Drug Scaffolds
Source: DataCite Commons | Updated: 2026-03-20 | Indexed: 2026-05-04
Download link:
https://data.mendeley.com/datasets/jy4bx6gz4y/1
Description:
This data repository contains the complete modeling datasets and true external screening libraries developed for predicting the acute oral toxicity (in rats and mice) of ten privileged heterocyclic drug scaffolds. Designed to advance computational toxicology and green drug design, the dataset integrates traditional 2D-QSTR, Machine Learning (ML), q-RASTR, and ARKA-RASTR methodologies, where ARKA stands for Arithmetic Residual in K-groups Analysis.
The repository is divided into three subsets, each covering a different group of scaffolds:
1. Subset A (Six Typical Scaffolds): Covers pyrazine, piperazine, thiazole, thiophene, indole, and benzimidazole. Includes the experimental modeling sets and a true external set of >23,000 untested compounds.
2. Subset B (Pyridine and Piperidine Scaffolds): Contains 373 dual-species (rat/mouse) modeling records and an external screening library of ~13,000 compounds.
3. Subset C (Pyrazole and Pyrrolidine Scaffolds): Comprises 552 experimentally curated modeling molecules and an external set of ~18,000 compounds evaluated with ARKA-RASTR, described as an intelligent physical mechanism analysis.
All true external compounds were retrieved from the PubChem database, have no experimental toxicity values, and were evaluated with Applicability Domain (AD) and Predictive Reliability Indicator (PRI) checks. The dataset serves as a benchmark for developing in silico toxicity models and offers prioritized lists of potentially low-toxicity drug leads for pharmaceutical risk assessment.
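For readers unfamiliar with Applicability Domain checks, the sketch below illustrates one simple heuristic: a query compound counts as inside the domain only if each of its descriptors, standardized against the modeling set, lies within three standard deviations of the mean. This is an assumption made for illustration, not necessarily the AD procedure used for this dataset. The descriptor values, the names `fit_scaler` and `inside_domain`, and the cutoff of 3.0 are all hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(train_rows):
    """Return per-descriptor (mean, std) pairs computed on the modeling set."""
    columns = list(zip(*train_rows))
    return [(mean(col), pstdev(col)) for col in columns]


def inside_domain(row, scaler, cutoff=3.0):
    """True if every standardized descriptor of `row` lies within +/- cutoff."""
    for value, (mu, sigma) in zip(row, scaler):
        if sigma == 0:
            # Descriptor is constant in the modeling set:
            # any deviation from it is treated as outside the domain.
            if value != mu:
                return False
            continue
        if abs((value - mu) / sigma) > cutoff:
            return False
    return True


# Hypothetical two-descriptor rows (illustrative only, not taken from the dataset).
train = [[1.0, 200.0], [1.5, 220.0], [2.0, 240.0], [2.5, 260.0], [3.0, 280.0]]
scaler = fit_scaler(train)

queries = {"query_1": [2.2, 250.0], "query_2": [9.0, 250.0]}
flags = {name: inside_domain(row, scaler) for name, row in queries.items()}
print(flags)
```

Here `query_1` falls inside the domain, while `query_2` is flagged because its first descriptor lies far from the modeling set. Predictions for compounds outside the domain would normally be treated as unreliable rather than discarded silently.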
Provider:
Mendeley Data
Created:
2026-03-20



