
Chemical informatics combined with Kendrick mass analysis to enhance annotation and identify pathways in soybean metabolomics

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.np5hqc046
Resource description:
Among abiotic stresses affecting agricultural crops, drought is the most prevalent and has detrimental impacts worldwide. The soybean (Glycine max) is one of the most important sources of nutrition for both livestock and humans. Different plant introductions (PI) of soybean have been found to differ in drought tolerance. Here, two soybean lines, Pana (drought sensitive) and PI 567731 (drought tolerant), were selected to identify chemical compounds and pathways that could be targets for metabolomic analysis of responses induced by abiotic stress. Extracts from the two lines were analyzed by direct infusion electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. The high mass resolution and accuracy of the method allow identification of ions from hundreds of different compounds in each cultivar. The exact m/z values of these species were filtered through SoyCyc and the Human Metabolome Database to identify possible molecular formulas of the ions. Next, the exact m/z values were converted into Kendrick masses and their Kendrick mass defects (KMD) were computed; the ions were then sorted from high to low KMD. This latter process assists in identifying many additional molecular formulas and is particularly useful for identifying formulas whose mass difference corresponds to two hydrogen atoms. In this study, more than 460 ionic formulas were identified in Pana and more than 340 ionic formulas were identified in PI 567731, with many of these formulas reported from soybean for the first time. Using the SoyCyc matches, the metabolic pathways from each cultivar were compared, providing lists of molecular targets for profiling the effects of abiotic stress on these soybean cultivars.
Key metabolites include chlorophylls, pheophytins, mono- and diacylglycerols, cycloeucalenone, squalene, and plastoquinones. They involve pathways that include the anabolism and catabolism of chlorophyll, glycolipid desaturation, and biosynthesis of phytosterols, plant sterols, and carotenoids.

Methods:
Direct infusion ESI FT-ICR mass spectrometry was conducted using three replicates from each cultivar; the time-domain data were converted to m/z-domain data prior to processing to identify features in the mass spectra. Direct infusion ESI FT-ICR data sets were processed as follows using Bruker Daltonics (Bremen, Germany) Data Analysis 4.0 software. The software was instructed to find all peaks with a signal-to-noise ratio > 3 to produce a peak list. Next, the peak list was deconvoluted so that isotopic envelopes were determined and each individual ionic species was grouped as part of its isotopic cluster. A threshold of 0.1% peak area relative to the most intense peak (m/z 1073.506 in each cultivar list, corresponding to ion C67H94NaN4O6) was used. The peak list was reduced to the monoisotopic peak of each isotopic cluster, and this was the m/z value used in compiling the list for each cultivar. After compilation, the m/z list for each cultivar was first passed through the SoyCyc database of metabolites (https://soycyc.soybase.org/); a match of a protonated, sodiated, or potassiated ion to a known metabolite within 3 ppm mass error was considered a confirmation of the ionic formula. Each list was then filtered through HMDB to find matches to protonated, sodiated, or potassiated ions in the database. For endogenous compounds, the 3 ppm mass error was again used to constitute a match. For non-natural compounds, however, a stricter limit of 1 ppm was used to constitute a match between the database and the m/z list.
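The database matching step described above can be illustrated with a minimal sketch. It is not the authors' software: the function names, the cation mass constants, and the glucose example are assumptions introduced here for illustration only. The sketch checks whether an observed m/z lies within a ppm tolerance of the protonated, sodiated, or potassiated form of a candidate neutral mass.

```python
# Illustrative sketch only; names and the example compound are hypothetical.
# Cation masses in Da (atomic mass minus one electron mass).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989221,
    "[M+K]+": 38.963158,
}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_adducts(observed_mz, neutral_mass, tol_ppm=3.0):
    """Return (adduct, theoretical m/z, ppm error) for each adduct in tolerance."""
    hits = []
    for adduct, shift in ADDUCT_SHIFTS.items():
        theoretical = neutral_mass + shift
        err = ppm_error(observed_mz, theoretical)
        if abs(err) <= tol_ppm:
            hits.append((adduct, theoretical, err))
    return hits


# Example: neutral monoisotopic mass of C6H12O6 (about 180.063388 Da),
# tested against an observed ion near its sodiated m/z.
hits = match_adducts(203.0526, 180.063388, tol_ppm=3.0)
for adduct, theoretical, err in hits:
    print(f"{adduct}: theoretical m/z {theoretical:.6f}, error {err:.2f} ppm")
```

Passing `tol_ppm=1.0` would correspond to the stricter window the text applies to non-natural compounds.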
To further annotate the m/z values with ionic formulas, each list was converted to the corresponding Kendrick masses and the KMD was calculated for each ion; ions were then sorted by KMD and plotted as nominal Kendrick mass vs. KMD to help assign ionic formulas to those m/z values that did not yet have one. Final lists of ionic formulas from each cultivar were then recorded and compared. For those m/z values that matched entries in the SoyCyc database, the metabolic pathways involved were also examined to obtain context on how the cultivars might respond to drought at a molecular level. Note: the absence of an annotated peak in the list does not mean that the metabolite is not present; rather, the metabolite was not detected with an abundance greater than 0.1% within the restrictive mass accuracy window employed. Metabolites from each cultivar identified in SoyCyc were the inputs to the Pathway Covering tool (https://pmn.plantcyc.org/cmpd-pwy-coverage.shtml) using a constant cost function; the tool then computed a minimal-cost set of metabolic pathways for Glycine max from each cultivar's data set. For this analysis, Pathway Tools version 26.0 [42] was used with data from the SoyCyc 10.0.2 database.
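The Kendrick conversion described above can be sketched as follows. The description does not state which base unit or sign convention was used, so this sketch assumes the conventional CH2 base (14.01565 Da rescaled to exactly 14) and defines KMD as nominal Kendrick mass minus Kendrick mass, with the nominal mass taken by rounding to the nearest integer. Function names are hypothetical. Under these assumptions, ions that differ only by CH2 units share the same KMD, while ions that differ by two hydrogen atoms are offset in KMD by about 0.0134.

```python
# Illustrative sketch only; base unit, sign convention and rounding are assumptions.
CH2_EXACT = 14.01565   # exact mass of CH2 in Da
CH2_NOMINAL = 14.0     # mass assigned to CH2 on the Kendrick scale


def kendrick(mz):
    """Return (Kendrick mass, nominal Kendrick mass, KMD) for one m/z value."""
    km = mz * CH2_NOMINAL / CH2_EXACT
    nominal = round(km)
    return km, nominal, nominal - km


def sort_by_kmd(mz_values):
    """Rows of (m/z, KM, nominal KM, KMD), sorted from high to low KMD."""
    rows = [(mz, *kendrick(mz)) for mz in mz_values]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical m/z values: a base ion, the same ion plus CH2, and plus H2.
base = 300.0
for mz, km, nominal, kmd in sort_by_kmd([base, base + 14.01565, base + 2.01565]):
    print(f"m/z {mz:.5f}  KM {km:.5f}  nominal {nominal}  KMD {kmd:.5f}")
```

Plotting the nominal Kendrick mass against the KMD from these rows gives the kind of plot the text describes, in which homologous series line up horizontally.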
Created:
2025-01-28