Exposome-Scale Investigation of Cl-/Br-Containing Chemicals Using High-Resolution Mass Spectrometry, Multistage Machine Learning, and Cloud Computing

Source: NIAID Data Ecosystem; indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/Exposome-Scale_Investigation_of_Cl-_Br-Containing_Chemicals_Using_High-Resolution_Mass_Spectrometry_Multistage_Machine_Learning_and_Cloud_Computing/29126605
Resource description:
Over 70% of organic halogens, representing chlorine- and bromine-containing disinfection byproducts (Cl-/Br-DBPs), remain unidentified after 50 years of research. This work introduces a streamlined, cloud-based exposomics workflow that integrates high-resolution mass spectrometry (HRMS) analysis, multistage machine learning, and cloud computing for efficient analysis and characterization of Cl-/Br-DBPs. In particular, the multistage machine learning structure employs progressively different heavy isotopic peaks at each layer and captures the distinct isotopic characteristics of nonhalogenated compounds and Cl-/Br-compounds at different halogenation levels. This approach enables the recognition of 22 types of Cl-/Br-compounds with up to 6 Br and 8 Cl atoms. To address the data imbalance among classes, particularly the limited number of heavily chlorinated and brominated compounds, data perturbation is performed to generate hypothetical/synthetic molecular formulas containing multiple Cl and Br atoms, facilitating data augmentation. To further benefit members of the environmental chemistry community who have limited computational experience and hardware access, these innovations are incorporated into HalogenFinder (http://www.halogenfinder.com/), a user-friendly, web-based platform for Cl-/Br-compound characterization, with statistical analysis support via MetaboAnalyst. In benchmarking, HalogenFinder outperformed two established tools, achieving a higher recognition rate for 277 authentic Cl-/Br-compounds and uniquely identifying the number of Cl/Br atoms. In laboratory tests of DBP mixtures, it identified 72 Cl-/Br-DBPs with proposed structures, of which eight were confirmed with chemical standards. A retrospective analysis of 2022 finished water HRMS data revealed insightful temporal trends in Cl-DBP features. These results demonstrate HalogenFinder’s effectiveness in advancing Cl-/Br-compound identification for environmental science and exposomics.
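To illustrate why heavy isotopic peaks can separate halogenation levels, the following minimal sketch computes the relative intensities of the M, M+2, M+4, ... peaks that arise from Cl and Br alone. It is not HalogenFinder's method and does not reproduce its machine learning stages. The function names (`convolve`, `halogen_pattern`) are hypothetical, the abundances are standard natural-abundance values, and contributions from other elements such as 13C are ignored as a simplifying assumption.

```python
# Natural isotopic abundances as (light, heavy) pairs; standard reference values.
CL = (0.7576, 0.2424)  # 35Cl, 37Cl
BR = (0.5069, 0.4931)  # 79Br, 81Br


def convolve(a, b):
    """Discrete convolution of two intensity lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def halogen_pattern(n_cl, n_br):
    """Relative intensities of the M, M+2, M+4, ... peaks from Cl/Br only.

    Assumption: other elements (C, H, N, O, S) are ignored, so this is a
    halogen-only approximation of the real isotopic envelope.
    """
    pattern = [1.0]
    for _ in range(n_cl):
        pattern = convolve(pattern, CL)  # each Cl adds one possible +2 Da step
    for _ in range(n_br):
        pattern = convolve(pattern, BR)  # each Br adds one possible +2 Da step
    peak = max(pattern)
    return [p / peak for p in pattern]   # normalize to the most intense peak


if __name__ == "__main__":
    for n_cl, n_br in [(1, 0), (0, 1), (2, 1), (8, 6)]:
        print(n_cl, n_br, [round(p, 3) for p in halogen_pattern(n_cl, n_br)])
```

Each added halogen atom extends the pattern by one more heavy peak, so a compound with 8 Cl and 6 Br atoms has 15 halogen-derived peaks. This gives a rough sense of why different classifier layers might look at different heavy peaks.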
Created:
2025-06-03