The toxicity data sourced from TOXRIC database
Source: DataCite Commons | Updated: 2025-06-01 | Indexed: 2025-01-06
Download link:
https://figshare.com/articles/dataset/toxric_30_datasets/27195339/3
Resource description:
The expanded predictive toxicology dataset is sourced from TOXRIC, a comprehensive and standardized toxicology database. The toxric_30_datasets collection contains 30 assay datasets with ~150,000 measurements across five categories of toxicity assessment: genetic toxicity, organic toxicity, clinical toxicity, developmental and reproductive toxicity, and reactive toxicity. The acute toxicity dataset includes 59 endpoints with 80,081 unique compounds and 122,594 measurements. It covers 15 species (such as mouse, rabbit, and cat), 8 administration routes (such as intraperitoneal, intravenous, and oral), and 3 types of evaluation record: LD50 (lethal dose, 50%), LDLo (lethal dose low), and TDLo (toxic dose low). The dataset is very sparse: nearly 97.4% of compound-to-endpoint measurements are missing. It is also heavily imbalanced. Some endpoints have on the order of 30,000 measurements (for example, mouse-intraperitoneal-LD50 has 36,295), while others, such as rabbit-intraperitoneal-LD50, mouse-intravenous-LDLo, and rat-intravenous-LDLo, have only around 100. This disparity makes acute toxicity prediction a challenging regression problem.
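The 97.4% sparsity figure can be checked from the counts quoted in the description. The sketch below is only an arithmetic illustration, not part of the dataset's tooling: the function name `missing_fraction` is hypothetical, and it assumes that each of the 122,594 measurements fills exactly one distinct cell of an 80,081 by 59 compound-by-endpoint matrix.

```python
# Counts quoted in the dataset description.
N_COMPOUNDS = 80_081
N_ENDPOINTS = 59
N_MEASUREMENTS = 122_594


def missing_fraction(n_compounds: int, n_endpoints: int, n_measurements: int) -> float:
    """Fraction of empty cells in a compound-by-endpoint matrix.

    Assumption (not stated in the source): every measurement occupies
    exactly one distinct (compound, endpoint) cell.
    """
    total_cells = n_compounds * n_endpoints
    return 1.0 - n_measurements / total_cells


sparsity = missing_fraction(N_COMPOUNDS, N_ENDPOINTS, N_MEASUREMENTS)
print(f"{sparsity:.1%} of compound-endpoint cells are empty")
```

Under that assumption the matrix has 4,724,779 cells, of which about 2.6% are filled, which agrees with the sparsity stated in the description.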
Provider:
figshare
Created:
2024-10-09



