Omics Forecasting: Predictive Calculations Permit the Rapid Interpretation of High-Resolution Mass Spectral Data from Complex Mixtures

Figshare, updated 2019-10-11, indexed 2026-04-29

Download link:

https://figshare.com/articles/dataset/Omics_Forecasting_Predictive_Calculations_Permit_the_Rapid_Interpretation_of_High-Resolution_Mass_Spectral_Data_from_Complex_Mixtures/10553618

Description:

For some complex mixtures, chromatographic techniques are insufficient to separate the large numbers of compounds present. In addition, these mixtures often contain compounds with similar or identical molecular masses and shared fragmentation transitions. Advances in mass spectrometry have provided increasingly detailed molecular profiles with significant increases in resolution. This has made it possible to distinguish a very large number of compounds in complex mixtures, but it also produces overwhelming data sets. Calculating molecular formulas from a mass list has become increasingly problematic as the number of signals has increased exponentially, to the point that it is impossible to manually interpret the thousands of mass signals. The current approach is to calculate a list of possible formulas that fall within a specific mass error of the observed signal, and then to look for possible structures that can be derived from each formula on the list. An alternative approach is to anticipate the possible structures of a particular set of compounds, such as red wine pigments, and then compare the ion signals against a predicted list. To that end, starting with known wine pigment types, we have generated a set of expected wine pigment variants based on known derivatives of condensed tannin oligomers, anthocyanins, and fermentation products. After the ability to distinguish compounds by mass spectrometry was accounted for, over 1 million results were generated, consisting of known and anticipated wine pigments. A comparison with a small sample of wine phenolic fractions shows a large number of matches, suggesting that this approach may be helpful.
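
The comparison step described above, checking observed ion signals against a predicted list within a mass error, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the 5 ppm tolerance, the two example pigment formulas, and the observed m/z values are all assumptions chosen for demonstration, and the atomic masses are rounded to six decimals.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic masses in unified atomic mass units, rounded to six decimals.
ATOM_MASS = {"C": 12.0, "H": 1.007825, "O": 15.994915}
ELECTRON_MASS = 0.000549


def cation_mz(formula):
    """m/z of a singly charged cation given as an element-count dict."""
    neutral = sum(ATOM_MASS[el] * n for el, n in formula.items())
    return neutral - ELECTRON_MASS


def match_signals(observed, predicted, ppm=5.0):
    """Map each observed m/z to the predicted ion names within +/- ppm."""
    ranked = sorted(predicted, key=lambda item: item[1])
    masses = [mz for _, mz in ranked]
    hits = {}
    for mz in observed:
        tol = mz * ppm / 1e6  # absolute window for this signal
        lo = bisect_left(masses, mz - tol)
        hi = bisect_right(masses, mz + tol)
        hits[mz] = [ranked[i][0] for i in range(lo, hi)]
    return hits


# Illustrative predicted list (assumed example formulas for flavylium cations).
predicted = [
    ("malvidin-3-glucoside", cation_mz({"C": 23, "H": 25, "O": 12})),
    ("vitisin A", cation_mz({"C": 26, "H": 25, "O": 14})),
]

# Hypothetical observed signals; the last one has no predicted counterpart.
observed = [493.1340, 561.1245, 600.0000]
matches = match_signals(observed, predicted)

for mz, names in matches.items():
    print(mz, names if names else "no match")
```

Sorting the predicted list once and using binary search keeps each lookup logarithmic in the list size, which matters when the predicted list holds on the order of a million entries, as in the data set described here.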

Created:

2019-10-11
