Introducing “Identification Probability” for Automated and Transferable Assessment of Metabolite Identification Confidence in Metabolomics and Related Studies

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Introducing_Identification_Probability_for_Automated_and_Transferable_Assessment_of_Metabolite_Identification_Confidence_in_Metabolomics_and_Related_Studies/28063321
Description:
Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort in 2007 focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence: the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high-resolution mass spectrometry studies. Further refinements of the identification levels have followed, for example to accommodate the use of ion mobility spectrometry in metabolomics workflows, and alternative approaches to communicating compound identification confidence have also been developed based on identification points schemas. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered. Nor are they easily automated or transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a database that match an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement accuracy or retention time accuracy.
We demonstrate the utility of identification probability in an in silico analysis of multiproperty reference libraries constructed from a subset of the Human Metabolome Database and computational property predictions, provide guidance to the community in transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
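
The 1/N definition can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Compound` record, the `identification_probability` function, the tolerance defaults, and the toy library are all assumptions made for illustration, and the retention times are invented. Returning `None` when nothing matches is also a choice made here, since 1/N is undefined for N = 0.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Compound:
    """Hypothetical library entry with two measurable properties."""
    name: str
    mass: float  # monoisotopic mass in Da
    rt: float    # retention time in minutes (invented values below)


def identification_probability(
    measured_mass: float,
    measured_rt: float,
    library: Sequence[Compound],
    mass_tol_ppm: float = 5.0,   # assumed mass tolerance
    rt_tol_min: float = 0.5,     # assumed retention-time tolerance
) -> Optional[float]:
    """Return 1/N, where N counts library entries that match the
    measurement within every tolerance; None if nothing matches."""
    mass_window = measured_mass * mass_tol_ppm * 1e-6
    n_matches = 0
    for compound in library:
        mass_ok = abs(compound.mass - measured_mass) <= mass_window
        rt_ok = abs(compound.rt - measured_rt) <= rt_tol_min
        if mass_ok and rt_ok:
            n_matches += 1
    return 1.0 / n_matches if n_matches else None


# Toy library: leucine and isoleucine are isomers, so they share a mass.
TOY_LIBRARY = [
    Compound("leucine", 131.0946, 5.0),
    Compound("isoleucine", 131.0946, 5.8),
    Compound("glucose", 180.0634, 1.2),
]

# A very wide retention-time window makes mass the only effective filter.
p_mass_only = identification_probability(131.0946, 5.0, TOY_LIBRARY, rt_tol_min=10.0)

# Adding a tighter retention-time window separates the two isomers.
p_mass_and_rt = identification_probability(131.0946, 5.0, TOY_LIBRARY, rt_tol_min=0.5)
```

In this toy case the mass-only query matches both isomers (probability 0.5), while adding the retention-time constraint leaves one candidate (probability 1.0). This shows how the value depends on both the properties measured and the database searched.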

Created:
2024-12-19