Machine Learning-based Classification for the Prioritization of Potentially Hazardous Chemicals with Structural Alerts in Nontarget Screening
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Machine_Learning-based_Classification_for_the_Prioritization_of_Potentially_Hazardous_Chemicals_with_Structural_Alerts_in_Nontarget_Screening/28553480
Description:
Nontarget screening (NTS) with liquid chromatography high-resolution
mass spectrometry (LC-HRMS) is commonly used to detect unknown organic
micropollutants in the environment. One of the main challenges in
NTS is the prioritization of relevant LC-HRMS features. A novel prioritization
strategy based on structural alerts to select NTS features that correspond
to potentially hazardous chemicals is presented here. This strategy
leverages raw tandem mass spectra (MS2) and machine learning
models to predict the probability that NTS features correspond to
chemicals with structural alerts. The models were trained on fragments
and neutral losses from the experimental MS2 data. The
feasibility of this approach is evaluated for two groups: aromatic
amines and organophosphorus structural alerts. The neural network
classification model for organophosphorus structural alerts achieved
an area under the receiver operating characteristic curve
(AUC-ROC) of 0.97 and a true positive rate of 0.65 on the test set.
The random forest model for the classification of aromatic amines
achieved an AUC-ROC value of 0.82 and a true positive rate of 0.58
on the test set. The models were successfully applied to prioritize
LC-HRMS features in surface water samples, showcasing the high potential to
develop and implement this approach further.
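
The description above mentions two ideas that a small example may make more concrete: encoding an MS2 spectrum as fragment and neutral-loss features, and reporting both AUC-ROC and a true positive rate. The following is a minimal sketch only, not the authors' pipeline. The function names, the two-decimal m/z rounding, and the 0.5 probability threshold are all assumptions made for illustration; the listing does not state how the dataset encodes features or which threshold was used.

```python
from typing import Dict, List, Sequence


def ms2_features(precursor_mz: float,
                 fragment_mzs: Sequence[float],
                 decimals: int = 2) -> Dict[str, int]:
    """Encode a spectrum as binary fragment / neutral-loss features.

    Hypothetical encoding: each fragment m/z and each neutral loss
    (precursor m/z minus fragment m/z) becomes a presence flag keyed
    by its rounded value. Rounding precision is an assumption.
    """
    features: Dict[str, int] = {}
    for mz in fragment_mzs:
        if mz >= precursor_mz:
            continue  # ignore the precursor peak and anything above it
        loss = precursor_mz - mz
        features[f"frag_{mz:.{decimals}f}"] = 1
        features[f"loss_{loss:.{decimals}f}"] = 1
    return features


def true_positive_rate(y_true: Sequence[int],
                       y_prob: Sequence[float],
                       threshold: float = 0.5) -> float:
    """Fraction of true positives scored at or above the threshold."""
    positives: List[float] = [p for t, p in zip(y_true, y_prob) if t == 1]
    if not positives:
        raise ValueError("no positive examples")
    return sum(p >= threshold for p in positives) / len(positives)


def auc_roc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a
    random negative (ties count as half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative examples")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, not taken from the dataset.
feats = ms2_features(200.0, [150.0, 182.0])
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
tpr = true_positive_rate(labels, probs)
auc = auc_roc(labels, probs)
```

AUC-ROC summarizes ranking quality across all thresholds, while the true positive rate depends on the single threshold chosen, so a model can have a high AUC-ROC and a moderate true positive rate at the same time.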
Created:
2025-03-07



