Evaluation of In Silico Multifeature Libraries for Providing Evidence for the Presence of Small Molecules in Synthetic Blinded Samples
Source: acs.figshare.com. Updated: 2023-05-30. Indexed: 2025-01-21.
Download link:
https://acs.figshare.com/articles/dataset/Evaluation_of_i_In_Silico_i_Multifeature_Libraries_for_Providing_Evidence_for_the_Presence_of_Small_Molecules_in_Synthetic_Blinded_Samples/9696290/1
Description:
The current gold standard for unambiguous molecular identification in metabolomics analysis is comparing two or more orthogonal properties from the analysis of authentic reference materials (standards) to experimental data acquired in the same laboratory with the same analytical methods. This represents a significant limitation for comprehensive chemical identification of small molecules in complex samples. The process is time consuming and costly, and the majority of molecules are not yet represented by standards. Thus, there is a need to assemble evidence for the presence of small molecules in complex samples through the use of libraries containing calculated chemical properties. To address this need, we developed a Multi-Attribute Matching Engine (MAME) and a library derived in part from our in silico chemical library engine (ISiCLE). Here, we describe an initial evaluation of these methods in a blinded analysis of synthetic chemical mixtures as part of the U.S. Environmental Protection Agency’s (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT, Phase 1). For molecules in all mixtures, the initial blinded false negative rate (FNR), false discovery rate (FDR), and accuracy were 57%, 77%, and 91%, respectively. For high evidence scores, the FDR was 35%. After unblinding of the sample compositions, we optimized the scoring parameters to better exploit the available evidence and increased the accuracy for molecules suspected as present. The final FNR, FDR, and accuracy were 67%, 53%, and 96%, respectively. For high evidence scores, the FDR was 10%. This study demonstrates that multiattribute matching methods in conjunction with in silico libraries may one day enable reduced reliance on experimentally derived libraries for building evidence for the presence of molecules in complex samples.
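
The abstract reports FNR, FDR, and accuracy without giving their formulas. The sketch below assumes the conventional confusion-matrix definitions, which may differ in detail from those used in the study. The `Confusion` class, the function names, and the example counts are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of library molecules by call outcome (hypothetical structure)."""
    tp: int  # present in the mixture and reported as present
    fp: int  # absent from the mixture but reported as present
    tn: int  # absent from the mixture and not reported
    fn: int  # present in the mixture but not reported


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a rate is undefined.
    return numerator / denominator if denominator else float("nan")


def false_negative_rate(c: Confusion) -> float:
    # Fraction of truly present molecules that were missed.
    return _ratio(c.fn, c.fn + c.tp)


def false_discovery_rate(c: Confusion) -> float:
    # Fraction of reported molecules that were not actually present.
    return _ratio(c.fp, c.fp + c.tp)


def accuracy(c: Confusion) -> float:
    # Fraction of all library molecules that were called correctly.
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


# Illustrative counts only, not values from the ENTACT mixtures.
example = Confusion(tp=40, fp=10, tn=940, fn=10)

if __name__ == "__main__":
    print(f"FNR      = {false_negative_rate(example):.2f}")
    print(f"FDR      = {false_discovery_rate(example):.2f}")
    print(f"accuracy = {accuracy(example):.2f}")
```

Under these definitions, accuracy counts true negatives while FNR and FDR do not. When most library entries are absent from a given mixture, accuracy can therefore stay high even though FNR and FDR are large, which is one plausible reading of the pattern in the figures above.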
Provided by:
ACS Publications



