Exposome-Scale Investigation of Cl-/Br-Containing Chemicals Using High-Resolution Mass Spectrometry, Multistage Machine Learning, and Cloud Computing
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://figshare.com/articles/dataset/Exposome-Scale_Investigation_of_Cl-_Br-Containing_Chemicals_Using_High-Resolution_Mass_Spectrometry_Multistage_Machine_Learning_and_Cloud_Computing/29126605
Description:
Over 70% of organic halogens, representing chlorine-
and bromine-containing
disinfection byproducts (Cl-/Br-DBPs), remain unidentified after 50
years of research. This work introduces a streamlined and cloud-based
exposomics workflow that integrates high-resolution mass spectrometry
(HRMS) analysis, multistage machine learning, and cloud computing
for efficient analysis and characterization of Cl-/Br-DBPs. In particular,
the multistage machine learning structure employs progressively different
heavy isotopic peaks at each layer and captures the distinct isotopic
characteristics of nonhalogenated compounds and Cl-/Br-compounds at
different halogenation levels. This innovative approach enables the
recognition of 22 types of Cl-/Br-compounds with up to 6 Br and 8
Cl atoms. To address the data imbalance among different classes, particularly
the limited number of heavily chlorinated and brominated compounds,
data perturbation is performed to generate hypothetical/synthetic
molecular formulas containing multiple Cl and Br atoms, facilitating
data augmentation. To further support environmental chemists who have
limited computational experience or hardware access,
these innovations are incorporated into HalogenFinder (http://www.halogenfinder.com/), a user-friendly, web-based platform for Cl-/Br-compound characterization,
with statistical analysis support via MetaboAnalyst. In benchmarking,
HalogenFinder outperformed two established tools, achieving a higher
recognition rate for 277 authentic Cl-/Br-compounds and uniquely identifying
the number of Cl/Br atoms. In laboratory tests of DBP mixtures, it
identified 72 Cl-/Br-DBPs with proposed structures, of which eight
were confirmed with chemical standards. A retrospective analysis of
2022 finished-water HRMS data revealed temporal trends
in Cl-DBP features. These results demonstrate HalogenFinder’s
effectiveness in advancing Cl-/Br-compound identification for environmental
science and exposomics.
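
To illustrate why heavy isotopic peaks carry information about the halogenation level, here is a minimal Python sketch. It is not HalogenFinder's implementation, and the names `binomial` and `halogen_pattern` are hypothetical. It assumes natural abundances of 24.24% for 37Cl and 49.31% for 81Br, treats both heavy isotopes as a nominal +2 mass shift, and ignores contributions from C, H, N, O, and S isotopes.

```python
from math import comb

# Assumed natural abundances of the heavy isotopes.
P_CL37 = 0.2424  # 37Cl
P_BR81 = 0.4931  # 81Br


def binomial(n, p):
    """Probability of k heavy isotopes among n atoms, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def halogen_pattern(n_cl, n_br):
    """Relative intensities of M, M+2, M+4, ... from Cl and Br alone.

    Simplification: 37Cl and 81Br are both counted as a nominal +2 shift,
    so their contributions are merged into the same peak index.
    """
    cl = binomial(n_cl, P_CL37)
    br = binomial(n_br, P_BR81)
    out = [0.0] * (n_cl + n_br + 1)
    # Convolve the two distributions: i heavy Cl plus j heavy Br
    # lands on peak index i + j.
    for i, a in enumerate(cl):
        for j, b in enumerate(br):
            out[i + j] += a * b
    top = max(out)
    return [x / top for x in out]  # normalise to the most intense peak


if __name__ == "__main__":
    # Compare a few halogenation levels, up to the 8 Cl / 6 Br limit
    # mentioned in the description.
    for n_cl, n_br in [(1, 0), (0, 1), (2, 1), (8, 6)]:
        peaks = ", ".join(f"{x:.3f}" for x in halogen_pattern(n_cl, n_br))
        print(f"Cl{n_cl}Br{n_br}: {peaks}")
```

Under these assumptions, each combination of Cl and Br counts gives a different ratio profile across the M+2, M+4, and higher peaks. This is the kind of signal a classifier could use to separate halogenation levels.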
Created:
2025-06-03



