Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure
Source: NIAID Data Ecosystem. Indexed: 2026-03-11
Download link:
https://figshare.com/articles/dataset/Ensemble_QSAR_Modeling_to_Predict_Multispecies_Fish_Toxicity_Lethal_Concentrations_and_Points_of_Departure/9963854
Description:
QSAR modeling can be used to aid testing prioritization of the
thousands of chemical substances for which no ecological toxicity
data are available. We drew on the U.S. Environmental Protection Agency’s
ECOTOX database with additional data from ECHA to build a large data
set containing in vivo test data on fish for thousands of chemical
substances. This was used to create QSAR models to predict two types
of end points: acute LC50 (median lethal concentration)
and points of departure similar to the NOEC (no observed effect concentration)
for any duration (named the “LC50” and “NOEC”
models, respectively). These models used study covariates, such as
species and exposure route, as features to facilitate the simultaneous
use of varied data types. A novel method of substituting taxonomy
groups for species dummy variables was introduced to maximize generalizability
to different species. A stacked ensemble of three machine learning
methods (random forest, gradient boosted trees, and support
vector regression) was implemented to best make use of a large
data set with many descriptors. The LC50 and NOEC models
predicted end points within 1 order of magnitude 81% and 76% of the
time, respectively, and had RMSEs of roughly 0.83 and 0.98 log10(mg/L), respectively. Benchmarks against the existing TEST
and ECOSAR tools suggest improved prediction accuracy.
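
The taxonomy-group substitution described above can be illustrated with a minimal sketch. It assumes that each species is encoded by indicator features for its family and order rather than by a per-species dummy variable, so a species absent from the training data still receives informative features. The `TAXONOMY` lookup, the choice of ranks, and the function names are hypothetical illustrations and are not taken from the data set or the authors' code.

```python
# Hypothetical taxonomy lookup for illustration only (not from the data set).
TAXONOMY = {
    "Oncorhynchus mykiss": {"family": "Salmonidae", "order": "Salmoniformes"},
    "Salmo salar": {"family": "Salmonidae", "order": "Salmoniformes"},
    "Cyprinus carpio": {"family": "Cyprinidae", "order": "Cypriniformes"},
    "Carassius auratus": {"family": "Cyprinidae", "order": "Cypriniformes"},
}

RANKS = ("family", "order")  # assumed ranks; the paper may use others


def build_columns(training_species):
    """Collect one indicator column per taxonomic group seen in training."""
    return sorted(
        {f"{rank}={TAXONOMY[s][rank]}" for s in training_species for rank in RANKS}
    )


def encode(species, columns):
    """Encode a species as 0/1 indicators over the taxonomy-group columns."""
    active = {f"{rank}={TAXONOMY[species][rank]}" for rank in RANKS}
    return [1 if column in active else 0 for column in columns]


# Columns are built from two training species only.
columns = build_columns(["Oncorhynchus mykiss", "Cyprinus carpio"])

# Salmo salar was not a training species, yet it shares family and order
# with Oncorhynchus mykiss, so both get the same feature vector.
unseen = encode("Salmo salar", columns)
seen = encode("Oncorhynchus mykiss", columns)
```

With per-species dummy variables, the unseen species would map to an all-zero vector; with group indicators it inherits whatever the model learned about its relatives.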
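
The two reported accuracy measures can be computed as in the sketch below: the RMSE in log10(mg/L) units and the fraction of predictions that fall within 1 order of magnitude of the observed value. The concentrations are made-up numbers for illustration, not values from the data set, and the function names are assumptions.

```python
import math


def rmse(observed_log10, predicted_log10):
    """Root-mean-square error between log10-transformed concentrations."""
    squared = [(p - o) ** 2 for o, p in zip(observed_log10, predicted_log10)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(observed_log10, predicted_log10, tolerance=1.0):
    """Share of predictions within `tolerance` log10 units of the observation.

    A tolerance of 1.0 corresponds to 1 order of magnitude (a factor of 10).
    """
    hits = sum(
        1 for o, p in zip(observed_log10, predicted_log10) if abs(p - o) <= tolerance
    )
    return hits / len(observed_log10)


# Made-up LC50 values in mg/L, for illustration only.
observed_mg_per_l = [0.5, 12.0, 150.0, 3.0]
predicted_mg_per_l = [1.0, 2.0, 100.0, 45.0]

observed = [math.log10(v) for v in observed_mg_per_l]
predicted = [math.log10(v) for v in predicted_mg_per_l]

error = rmse(observed, predicted)
share = fraction_within(observed, predicted)  # 3 of 4 predictions are within a factor of 10
```

Working on the log10 scale means that an RMSE near 1 corresponds to a typical error of about a factor of 10 in concentration.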
Created:
2019-09-27



