Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Ensemble_QSAR_Modeling_to_Predict_Multispecies_Fish_Toxicity_Lethal_Concentrations_and_Points_of_Departure/9963854
Resource description:
QSAR modeling can be used to aid testing prioritization of the thousands of chemical substances for which no ecological toxicity data are available. We drew on the U.S. Environmental Protection Agency’s ECOTOX database with additional data from ECHA to build a large data set containing in vivo test data on fish for thousands of chemical substances. This was used to create QSAR models to predict two types of end points: acute LC50 (median lethal concentration) and points of departure similar to the NOEC (no observed effect concentration) for any duration (named the “LC50” and “NOEC” models, respectively). These models used study covariates, such as species and exposure route, as features to facilitate the simultaneous use of varied data types. A novel method of substituting taxonomy groups for species dummy variables was introduced to maximize generalizability to different species. A stacked ensemble of three machine learning methods (random forest, gradient boosted trees, and support vector regression) was implemented to best make use of a large data set with many descriptors. The LC50 and NOEC models predicted end points within 1 order of magnitude 81% and 76% of the time, respectively, and had RMSEs of roughly 0.83 and 0.98 log10(mg/L), respectively. Benchmarks against the existing TEST and ECOSAR tools suggest improved prediction accuracy.
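The two accuracy figures quoted above (share of predictions within 1 order of magnitude, and RMSE in log10(mg/L)) can be illustrated with a minimal sketch. It assumes both metrics are computed on log10-transformed concentrations, which the units suggest but the description does not spell out. The function names and the sample values are hypothetical and are not taken from the data set.

```python
import math


def rmse_log10(observed_mg_l, predicted_mg_l):
    """RMSE of log10-transformed concentrations, in log10(mg/L)."""
    errors = [
        math.log10(p) - math.log10(o)
        for o, p in zip(observed_mg_l, predicted_mg_l)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def fraction_within_one_order(observed_mg_l, predicted_mg_l):
    """Share of predictions within a factor of 10 of the observed value."""
    hits = sum(
        1
        for o, p in zip(observed_mg_l, predicted_mg_l)
        if abs(math.log10(p) - math.log10(o)) <= 1.0
    )
    return hits / len(observed_mg_l)


# Made-up concentrations in mg/L, for illustration only.
observed = [1.0, 10.0, 100.0, 0.5]
predicted = [2.0, 10.0, 5.0, 0.4]

rmse = rmse_log10(observed, predicted)
within = fraction_within_one_order(observed, predicted)
```

In this toy case three of the four predictions fall within a factor of 10 of the observed value; the third (5 mg/L predicted against 100 mg/L observed) misses by more than one order of magnitude and dominates the RMSE.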
Created:
2019-09-27