
Improving gene function predictions using independent transcriptional components - Raw Figure Data

DataCite Commons | Updated 2025-06-01 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/dataset/Improving_gene_function_predictions_using_independent_transcriptional_components_-_Raw_Figure_Data/13265159/1
Description:
Refer to the descriptions of the files below, also available in README.txt.

> *****_medians_file
gene_set_name : Name of gene set
size : Number of genes with prediction scores in that gene set
gene_set_db : Name of host gene set collection
Method : Method used to calculate prediction scores
Subset : Whether the median was calculated using member genes (n = size) or genes that are never members in that gene set collection
median_prediction_score : Median prediction score (NA for gene sets with fewer than 10 or more than 500 genes)

> ******_multifunctonality_file
gene_set_name : Name of gene set
multifunctionality_score_correlation : Distance correlation between member prediction scores and the multifunctionality score calculated using the host gene set collection (empty if the gene set size was outside the 10-500 range)
gene_set_db : Name of host gene set collection
method : Method used to calculate prediction scores

> *****_old_version_comparison_file
gene_set_name : Name of gene set
size : Number of genes that were added between v3.0 and v6.2
gene_set_db : Name of host gene set collection
type : Version of the gene set used to calculate prediction scores of these genes
method : Method used to calculate prediction scores
Median prediction score : Median prediction score of the subset of genes added between v3.0 and v6.2

> unbiased_clustering
File with probe entrez_id to cluster mapping.
GROUP : Cluster number
LABEL : Affymetrix probe
ENTREZID : Corresponding Entrez number
SYMBOL : Corresponding symbol
uncharacterized : 0 or 1; indicates whether the gene is an ORF or LOC gene

> cluster_predictability
File with cluster metrics.
GROUP : Cluster number
size : Size of cluster
median_max_prediction : Cluster median of the maximum prediction score for each gene across 16 collections
density : Density metric
median_multifunctionality : Median distance correlation association to multifunctionality, calculated using all gene sets from all collections
ORF or LOC : Number of ORF or LOC genes in the cluster

> durocher_comparison_figure_data
Comparison data for Olivieri et al. hit genes.
entrez_id : Entrez number
variable : Comparison is GO_DNA_REPAIR to GO_DNA_REPAIR
pca_prediction_scores : PCA based prediction score
i.variable : Comparison is GO_DNA_REPAIR to GO_DNA_REPAIR
ica_prediction_scores : ICA based prediction score
known_link_to_DDR : 0 or 1; indicates whether the gene was called a known link in Olivieri et al.

> 57_lps_network_all_genes
ICA-TC based prediction scores for the Immunological Signatures gene set collection, for the subset of genes identified in a CRISPR-Cas9 screen that have a high co-functionality.
entrez_number : Entrez ID of gene
gene_name : HGNC Gene Symbol
gene_set : Immunological Signatures gene set
value : Z-score of gene for gene set

> HALLMARK_ICA_ZTpvalues_CORFsandLOCS_wardClustered
Matrix containing ICA-TC based prediction scores for all Corf and LOC genes, hierarchically clustered using Ward's method and 1-cor(dist) as distance function. Columns correspond to hallmark gene sets, rows to genes.

> HALLMARK_ICA_ZTpvalues_CORFsandLOCS_wardClustered_cutoff_0.8
Cluster membership of the 835 Corf and LOC genes at a dendrogram cutoff height of 0.8.
GROUP : Cluster number
LABEL : HGNC Gene Symbol

> HALLMARK_ICAvPCA_ZTpvalues_CORFsandLOCS
Comparison between ICA-TC based and PCA-TC based prediction scores of Hallmark gene sets for all Corf and LOC genes.
gene : HGNC Gene Symbol
variable : Gene set
i.value : ICA-TC based prediction score
value : ICA-TC based prediction score according to the README (likely the PCA-TC based score, since the file compares the two methods)
category : Logical; 1 if i.value > value, 0 if value > i.value

> ICAvPCA_GO_negviralregulation.txt
Comparison between ICA-TC based and PCA-TC based prediction scores of Hallmark gene sets for all Corf and LOC genes.
entrez_id : Entrez ID of gene
variable : Gene set
i.value : ICA-TC based prediction score
value : ICA-TC based prediction score according to the README (likely the PCA-TC based score, since the file compares the two methods)
known_link : One of three strings: "yes" if the gene is a member of the gene set; "no" if the gene is not a member of the gene set; "screen" if the gene is one of the hits of the investigated CRISPR-Cas screen

> ICAvPCA_KEGG_Lysosome
Comparison between ICA-TC based and PCA-TC based prediction scores of Hallmark gene sets for all Corf and LOC genes.
entrez_id : Entrez ID of gene
variable : Gene set
i.value : ICA-TC based prediction score
value : ICA-TC based prediction score according to the README (likely the PCA-TC based score, since the file compares the two methods)
known_link : One of three strings: "yes" if the gene is a member of the gene set; "no" if the gene is not a member of the gene set; "screen" if the gene is one of the hits of the investigated CRISPR-Cas screen

> Mouse_v_Human_barcode_spearman_correlations.txt
Spearman correlations of mouse gene barcodes with ortholog human gene barcodes for each of the 16 gene set collections.
mouse_gene : Mouse Entrez ID
assoc_human : Entrez ID of corresponding human ortholog
spearman_r : Spearman correlation coefficient
collection : Number corresponding to gene set collection
name : Gene set collection name
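
Two of the rules stated in the file descriptions can be sketched in a few lines of Python: the size limit on median_prediction_score (NA outside 10 to 500 genes) and the category flag (1 if i.value > value). This is only an illustrative sketch, not the authors' code. The function names are hypothetical, NA is represented as None, and treating ties in category as 0 is an assumption, since the README does not say how ties are handled.

```python
from statistics import median
from typing import Optional, Sequence

MIN_SIZE = 10   # gene sets with fewer genes than this get NA
MAX_SIZE = 500  # gene sets with more genes than this get NA


def median_prediction_score(scores: Sequence[float]) -> Optional[float]:
    """Return the median score, or None (NA) if the set size is outside 10-500."""
    if len(scores) < MIN_SIZE or len(scores) > MAX_SIZE:
        return None
    return median(scores)


def category(i_value: float, value: float) -> int:
    """Return 1 if i.value > value, else 0 (ties map to 0: an assumption)."""
    return 1 if i_value > value else 0
```

For example, a gene set with 5 scored genes yields None rather than a median, which mirrors the NA entries described for the medians files.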

Provider:
figshare
Created:
2020-11-20