DataSheet1_Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference.PDF

Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://figshare.com/articles/dataset/DataSheet1_Optimal_Sparsity_Selection_Based_on_an_Information_Criterion_for_Accurate_Gene_Regulatory_Network_Inference_PDF/20286639
Description:
Accurate inference of gene regulatory networks (GRNs) is important for unraveling unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false positives and false negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. Lacking criteria for sparsity selection, a simplistic solution is to pick the GRN that has a certain number of links per gene, a number that is guessed to be reasonable. However, this does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a “GRN information criterion” (GRNIC) that is inspired by two commonly used model selection criteria, the Akaike and Bayesian Information Criteria (AIC and BIC), but adapted to GRN inference. The results show that the approach can, in most cases, find the GRN whose sparsity is close to the true sparsity and that is close to as accurate as possible with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/.
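
To make the idea of choosing a sparsity with an information criterion concrete, here is a minimal sketch. It uses the textbook Gaussian forms of AIC and BIC; it is not the GRNIC or the SPA algorithm, whose definitions are not given in this description. The linear model `Y ≈ A @ X`, the function names `information_criteria` and `pick_network`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def information_criteria(rss, n_obs, n_params):
    """Textbook Gaussian AIC and BIC, up to a shared additive constant."""
    rss = max(rss, np.finfo(float).tiny)  # guard against log(0) on a perfect fit
    fit = n_obs * np.log(rss / n_obs)     # goodness-of-fit term
    aic = fit + 2 * n_params              # AIC penalty: 2 per link
    bic = fit + n_params * np.log(n_obs)  # BIC penalty: ln(n) per link
    return aic, bic


def pick_network(candidates, X, Y):
    """Score candidate networks of different sparsity and pick the lowest BIC.

    Assumed (hypothetical) setup: each candidate A is a genes x genes matrix
    and the data follow the linear model Y ~ A @ X.
    Returns (index of best candidate, list of (n_links, aic, bic)).
    """
    n_obs = Y.size
    scores = []
    for A in candidates:
        rss = float(np.sum((Y - A @ X) ** 2))  # residual sum of squares
        n_links = int(np.count_nonzero(A))     # number of links = free parameters
        aic, bic = information_criteria(rss, n_obs, n_links)
        scores.append((n_links, aic, bic))
    best = min(range(len(scores)), key=lambda i: scores[i][2])
    return best, scores


# Toy data: 3 genes, two replicates, true network is diagonal (3 links).
X = np.hstack([np.eye(3), np.eye(3)])
Y = np.hstack([2 * np.eye(3) + 0.2, 2 * np.eye(3)])  # first replicate is offset by 0.2

candidates = [
    np.zeros((3, 3)),     # empty network, 0 links
    2.1 * np.eye(3),      # sparse network, 3 links
    2 * np.eye(3) + 0.1,  # fully connected network, 9 links
]
best, scores = pick_network(candidates, X, Y)
print("selected candidate:", best, "with", scores[best][0], "links")
```

In this toy case the fully connected candidate has the smallest residual, but its six extra links cost more in penalty than they gain in fit, so the 3-link candidate is selected. A criterion adapted to GRN inference, as the description says GRNIC is, would presumably differ from these textbook forms.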

Created:
2022-07-11