MOESM9 of An integrative methodology based on protein-protein interaction networks for identification and functional annotation of disease-relevant genes applied to channelopathies

NIAID Data Ecosystem, indexed 2026-03-11

Download link:

https://figshare.com/articles/dataset/MOESM9_of_An_integrative_methodology_based_on_protein-protein_interaction_networks_for_identification_and_functional_annotation_of_disease-relevant_genes_applied_to_channelopathies/10297679

Description:

Additional file 9. Quantitative validation, by significance analysis, of the DAVID search against other phenotype-oriented resources. We searched for the nine relevant genes resulting from the workflow in PheGenI [56], ToppGene [57] and g:Profiler [58]. We evaluated these searches quantitatively by selecting only the terms with a Benjamini-Hochberg FDR-corrected p-value below 0.05. The DAVID search returned fewer results (the OMIM search, unlike the GAP DISEASE database, did not provide p-values for the phenotypes). Even so, the results are useful for a quantitative comparison between semi-automatic platforms and bibliographic search systems (sheet 1). From these results we drew the genotype-phenotype association networks so that the p-value obtained for each phenotype can be compared easily (sheet 2). Note that p-values for clinical phenotypes could be obtained from only one of the two databases explored through DAVID (the GAP DISEASE database), so this genotype-phenotype association network is sparser than the network in the manuscript (section A in Figs. 5, 6, 7). Nevertheless, the workflow results are shown to be statistically significant and as valid as, or even better than, those of systematic or exhaustive reviews. We then created three Boolean tables (sheets 3, 4, 5) comparing the phenotypes obtained from each search; these tables were converted to binary matrices, and multivariate cluster analyses with bootstrap validation were carried out. This approach showed that the results provided in the manuscript, obtained from DAVID (DAVID_m) and from systematic and exhaustive reviews, clustered together in a robust and significant way (sheets 3, 4, 5). Hence, the workflow yields results as productive as non-automated research, but more quickly, and allows the extraction of information that might not seem relevant a priori when the starting point is a very large group of disease genes. Moreover, the results obtained using only the significant FDR-corrected p-values also cluster in particular branches.
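The FDR filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the example terms and p-values are hypothetical, and it assumes raw p-values are available for each term. It applies the standard Benjamini-Hochberg step-up adjustment and keeps terms whose adjusted p-value is below 0.05.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_terms(term_pvalues, alpha=0.05):
    """Keep the terms whose adjusted p-value is below alpha."""
    terms = list(term_pvalues)
    adjusted = benjamini_hochberg([term_pvalues[t] for t in terms])
    return {t: q for t, q in zip(terms, adjusted) if q < alpha}


# Hypothetical raw p-values for four phenotype terms (illustration only).
example = {"term_a": 0.01, "term_b": 0.04, "term_c": 0.03, "term_d": 0.20}
kept = significant_terms(example)
```

In this toy input only `term_a` survives the correction, even though three raw p-values are below 0.05, which shows why a corrected search can return fewer terms than an uncorrected one.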

Created:

2019-11-12