Toward Machine Learning Electrospray Ionization Sensitivity Prediction for Semiquantitative Lipidomics in Stem Cells
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Toward_Machine_Learning_Electrospray_Ionization_Sensitivity_Prediction_for_Semiquantitative_Lipidomics_in_Stem_Cells/28352119
Description:
Specificity, sensitivity, and high metabolite coverage make mass
spectrometry (MS) one of the most valuable tools in metabolomics and
lipidomics. However, translation of metabolomics MS methods to multiyear
studies conducted across multiple batches is limited by variability
in electrospray ionization response, making batch-to-batch comparisons
challenging. This limitation creates an artificial divide between
nontargeted discovery work that is broad in scope but limited in terms
of absolute quantitation ability and targeted work that is highly
accurate but limited in scope due to the need for matched isotopically
labeled standards. These issues are often observed in stem cell studies
using metabolomic and lipidomic MS approaches, where patient recruitment
can be a years-long process and samples become available in discrete
batches every few months. To bridge this gap, we developed a machine
learning model that predicts electrospray ionization sensitivity for
lipid classes that have shown correlation with stem cell potency.
Molecular descriptors derived from these lipids’ chemical structures
are used as model input to predict electrospray response, enabling
quantitation by MS with moderate accuracy (semiquantitation). Model
performance was evaluated via internal and external validation using
cultured cells from various stem cell donors, achieving global percent
errors of 40% and 20% for positive and negative electrospray ion modes,
respectively. Although this accuracy is typically insufficient for
traditional targeted lipidomics experiments, it is sufficient for
semiquantitative estimation of lipid marker concentrations across
batches without the need for specific chemical standards, which are
often unavailable. Furthermore, the precision for model-predicted
concentrations was 16.9% for the positive mode and 7.5% for the negative
mode, indicating promise for data harmonization across batches. The
set of molecular descriptors used by the models described here
yielded higher accuracy than descriptor sets previously published in the
literature, showing high promise for semiquantitative lipidomics.
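
The quantities named in the description (a predicted electrospray response, a percent error, and a precision) can be illustrated with a minimal sketch. This is not the authors' model or their evaluation code. It assumes that a predicted response factor is expressed as peak area per unit concentration, that percent error is the mean absolute relative error against a reference concentration, and that precision is the percent relative standard deviation across replicate estimates. The description does not define these metrics, and all function names and numbers below are hypothetical.

```python
from statistics import mean, stdev


def semiquant_concentration(peak_area, predicted_response_factor):
    """Estimate a concentration from a peak area and a predicted response
    factor (assumed here to be peak area per unit concentration)."""
    if predicted_response_factor <= 0:
        raise ValueError("response factor must be positive")
    return peak_area / predicted_response_factor


def mean_percent_error(estimated, reference):
    """Mean absolute relative error, in percent (assumed metric)."""
    return mean(abs(e - r) / r * 100.0 for e, r in zip(estimated, reference))


def percent_rsd(values):
    """Relative standard deviation in percent (assumed precision metric)."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical example values, not taken from the dataset.
peak_areas = [2.0e6, 4.5e6, 1.2e6]
predicted_rf = [1.0e6, 1.5e6, 0.8e6]   # would come from the ML model
reference_conc = [2.5, 3.0, 1.2]       # would come from matched standards

estimated = [
    semiquant_concentration(area, rf)
    for area, rf in zip(peak_areas, predicted_rf)
]
print("estimated concentrations:", estimated)
print("mean percent error:", mean_percent_error(estimated, reference_conc))
print("percent RSD of replicates:", percent_rsd([9.0, 10.0, 11.0]))
```

In this framing the model replaces only the response factor, which would otherwise require a matched isotopically labeled standard for each lipid.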
Created:
2025-02-05



