High-Throughput Non-targeted Chemical Structure Identification Using Gas-Phase Infrared Spectra
Indexed by NIAID Data Ecosystem on 2026-03-12
Download link:
https://figshare.com/articles/dataset/High-Throughput_Non-targeted_Chemical_Structure_Identification_Using_Gas-Phase_Infrared_Spectra/15032080
Description:
The high-throughput identification
of unknown metabolites in biological
samples remains challenging. Most current non-targeted metabolomics
studies rely on mass spectrometry, followed by computational methods
that rank thousands of candidate structures based on how closely their
predicted mass spectra match the experimental mass spectrum of an
unknown. We reasoned that the infrared (IR) spectra could be used
in an analogous manner and could add orthogonal structure discrimination;
however, this has never been evaluated on large data sets. Here, we
present results of a high-throughput computational method for predicting
IR spectra of candidate compounds obtained from the PubChem database.
Predicted spectra were ranked based on their similarity to gas-phase
experimental IR spectra of test compounds obtained from NIST.
Our computational workflow (IRdentify) consists of a fast semiempirical
quantum mechanical method for initial IR spectra prediction, ranking,
and triaging, followed by a final IR spectra prediction and ranking
using density functional theory. This approach resulted in the correct
identification of 47% of 258 test compounds. On average, there were
2152 candidate structures evaluated for each test compound, giving
a total of approximately 555,200 candidate structures evaluated. We
discuss several variables that influenced the identification accuracy
and then demonstrate the potential application of this approach in
three areas: (1) combining IR and mass spectra rankings into a single
composite rank score, (2) identifying the precursor and fragment ions
using cryogenic ion vibrational spectroscopy, and (3) the incorporation
of a trimethylsilyl derivatization step to extend the method compatibility
to less-volatile compounds. Overall, our results suggest that matching
computational with experimental IR spectra is a potentially powerful
orthogonal option for adding significant high-throughput chemical
structure discrimination when used with other non-targeted chemical
structure identification methods.
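
The two-stage ranking described above (a cheap prediction for triage, then a costlier prediction for the final ranking) can be illustrated with a minimal sketch. This is not the IRdentify implementation: the description does not name the similarity metric, so cosine similarity is used here purely as an assumption, and the function names, candidate names, and toy spectra are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two intensity vectors on a shared wavenumber grid.

    Assumption: the source does not state its similarity metric; cosine
    similarity is a common stand-in for spectral matching.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(experimental, predicted):
    """Return candidate names ordered from most to least similar."""
    scores = {
        name: cosine_similarity(experimental, spectrum)
        for name, spectrum in predicted.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def two_stage_rank(experimental, fast_spectra, accurate_spectra, keep):
    """Triage with cheap predictions, then re-rank the survivors.

    fast_spectra stands in for the semiempirical predictions and
    accurate_spectra for the density functional theory predictions.
    """
    shortlist = rank_candidates(experimental, fast_spectra)[:keep]
    refined = {name: accurate_spectra[name] for name in shortlist}
    return rank_candidates(experimental, refined)


# Toy spectra (hypothetical): five intensity values on a shared grid.
experimental = [0.0, 1.0, 0.2, 0.0, 0.8]
fast = {
    "cand_A": [0.0, 0.9, 0.3, 0.1, 0.6],
    "cand_B": [0.1, 0.8, 0.1, 0.0, 0.9],
    "cand_C": [1.0, 0.0, 0.0, 0.9, 0.0],
}
# Only the shortlisted candidates need the expensive prediction.
accurate = {
    "cand_A": [0.0, 0.7, 0.5, 0.2, 0.4],
    "cand_B": [0.0, 1.0, 0.2, 0.0, 0.8],
}

ranking = two_stage_rank(experimental, fast, accurate, keep=2)
print(ranking)
```

In this toy case the cheap stage discards cand_C and places cand_A first, while the refined stage moves cand_B to the top, which is the kind of reordering the second prediction step is meant to allow.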
Created:
2021-07-21



