DataSheet_1_An advanced systems biology framework of feature engineering for cold tolerance genes discovery from integrated omics and non-omics data in soybean.pdf
Indexed by NIAID Data Ecosystem on 2026-03-14
Download link:
https://figshare.com/articles/dataset/DataSheet_1_An_advanced_systems_biology_framework_of_feature_engineering_for_cold_tolerance_genes_discovery_from_integrated_omics_and_non-omics_data_in_soybean_pdf/21250902
Resource description:
Soybean is sensitive to low temperatures during the growing season, so there is an urgent demand for breeding cold-tolerant cultivars to reduce production losses. Cold tolerance is a complex quantitative trait controlled by multiple genes, environmental factors, and their interactions. In this study, we propose an advanced systems biology framework of feature engineering for the discovery of cold tolerance genes (CTgenes) from integrated omics and non-omics (OnO) data in soybean. An integrative pipeline performs feature selection and feature extraction across the different layers of the integrated OnO data, using data ensemble methods and non-parametric random forest prioritization to minimize uncertainty and false positives and thereby improve accuracy. In total, 44, 143, and 45 CTgenes were identified under short-, mid-, and long-term cold treatment, respectively, from the corresponding gene pools. These CTgenes outperformed the remaining genes, random genes, and candidate genes identified by other approaches in an independent RNA-seq database. Furthermore, we applied pathway enrichment and crosstalk network analyses to uncover relevant physiological pathways, revealing cold tolerance mechanisms in hormone- and defense-related modules. The CTgenes were validated using genotype data for 55 SNPs from 56 soybean samples in cold tolerance experiments. This validation suggests that CTgenes identified by the proposed systematic framework can effectively distinguish cold-resistant from cold-sensitive lines, an important advance in understanding the soybean cold-stress response. The proposed pipelines provide a robust and efficient alternative for biomarker discovery, module discovery, and sample classification for a particular trait in plants.
Created:
2022-09-30



