AI-Enabled Mapping of Structure-Hazard Relationships for Emerging Contaminants
Source: Figshare | Updated: 2026-01-29 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/AI-Enabled_Mapping_of_Structure-Hazard_Relationships_for_Emerging_Contaminants/31191948
Resource description:
Emerging contaminant hazards are widespread, yet evidence is fragmented across the literature, registries, inventories, and bioassays. This study presents an integrated, reproducible framework that aligns literature-derived signals with molecular structure and regulatory hazard dimensions within a persistence, bioaccumulation, mobility, and toxicity (PBMT) lens. A task-tuned large language model extracted 21,277 mentions from 9,557 publications; harmonization across CompTox, PubChem, and Wikipedia yielded 1,081 unique candidates with 94.6% name consistency against 17 inventories. The landscape shows abundant toxicity information but limited mobility data. Structure profiling identified a recurring motif, an aromatic core with branched alkyl substituents and oxygenated groups, and directional associations with composite hazard (positive for halogenation and negative for several oxygenated groups). To extend pathway inference to list-naïve chemicals, AutoGluon models trained on Tox21/ToxCast were applied to 49 additional candidates. This produced end-point-level predictions concentrated at the protein-function level (receptors, enzymes, transcription factors), which is operationally useful for prioritization. Combining model scores with PBMT evidence density yielded an auditable prioritization, highlighting where mobility measurements and focused assays would most increase confidence. The framework links textual, structural, regulatory, and predictive streams to enable rapid screening, targeted data acquisition, safer substitution, and adaptive regulatory updating.
Created:
2026-01-29



