Additional file 2 of ChromGene: gene-based modeling of epigenomic data
收藏NIAID Data Ecosystem2026-05-01 收录
下载链接:
https://figshare.com/articles/dataset/Additional_file_2_of_ChromGene_gene-based_modeling_of_epigenomic_data/24104324
下载链接
链接失效反馈官方服务:
资源简介:
Additional file 2: Table S1. Description of ChromGene annotation enrichments, state emissions and enrichments, and transition probabilities. Annotation enrichments tab – The columns on this tab after the annotation colors and numbers are as follows: Mnemonic: short identifying name of ChromGene annotation. Description: short description of ChromGene annotation. Overall percentage: percentage of gene-cell type combinations assigned to annotation. Median expression: median expression (RPKM) of genes assigned to annotation across 56 cell types with expression [1]. Median length (kb): median length (kb) of genes assigned to annotation across all cell types, not including flanking regions. Percentage of high-pLI genes (pLI ≥ 0.9): percentage of genes across cell types that have a pLI score ≥ 0.9. Percentage of high-pLI genes (pLI ≥ 0.9), conditioned on gene length: percentage of genes across cell types that have a pLI score ≥ 0.9 in a gene length matched distribution (“Methods”). Contingency table diagonal / confusion matrix diagonal: percentage consistency of ChromGene assignments across non-replicate cell types divided by percentage consistency of assignments across replicate cell types. Cell type specificity: 1 - (contingency table diagonal / confusion matrix diagonal), a metric of cell type specificity. # Housekeeping gene: gene annotated as housekeeping [39]. Housekeeping gene percentage: percentage of gene-cell type combinations annotated as a housekeeping gene. Housekeeping gene enrichment: fold enrichment of housekeeping genes compared to overall percentage. Housekeeping gene log2 enrichment: log2 fold enrichment of housekeeping genes. Housekeeping gene enrichment median enrichment p-value: median p-value of housekeeping gene enrichment across cell types. # Constitutively unexpressed gene: gene that has RPKM < 1 across 56 cell types with expression [1]. Constitutively unexpressed gene percentage: percentage of gene-cell type combinations annotated as constitutively unexpressed. Constitutively unexpressed gene enrichment: fold enrichment of constitutively unexpressed genes compared to overall percentage. Constitutively unexpressed gene log2 enrichment: log2 fold enrichment of constitutively unexpressed genes. Constitutively unexpressed gene median enrichment p-value: median p-value of constitutively unexpressed gene enrichment across cell types. # Constitutively expressed gene: gene that has RPKM> 1 across 56 cell types with expression [1]. Constitutively expressed gene percentage: percentage of gene-cell type combinations annotated as constitutively expressed. Constitutively expressed gene enrichment: fold enrichment of constitutively expressed genes compared to overall percentage. Constitutively expressed gene log2 enrichment: log2 fold enrichment of constitutively expressed genes. Constitutively expressed gene median enrichment p-value: median p-value of constitutively expressed gene enrichment across cell types. # Olfactory gene: gene annotated as olfactory [34]. Olfactory gene percentage: percentage of gene / cell type combinations annotated as olfactory. Olfactory gene enrichment: fold enrichment of olfactory genes compared to overall percentage. Olfactory gene log2 enrichment: log2 fold enrichment of olfactory genes. Olfactory gene median enrichment p-value: median p-value of olfactory gene enrichment across cell types. # ZNF gene: gene starts with "ZNF". ZNF gene percentage: percentage of gene / cell type combinations annotated as ZNF. ZNF gene enrichment: fold enrichment of ZNF genes compared to overall percentage. ZNF gene log2 enrichment: log2 fold enrichment of ZNF genes. ZNF gene median enrichment p-value: median p-value of ZNF gene enrichment across cell types. Cancer gene sets enriched (adj p < 0.01): the number of cancer gene sets enriched across all cell types for given annotation. Cancer gene sets enriched percentage: the percentage of cancer gene sets enriched across all cell types for given annotation. BP GO terms enriched (adj p < 0.01): the number of 'Biological Process' GO term gene sets enriched across all cell types for given annotation. BP GO terms enriched percentage: the percentage of 'Biological Process' GO term gene sets enriched across all cell types for given annotation. Color (hex): hex color for ChromGene annotation. Matplotlib color name: color used in matplotlib for ChromGene annotation [54]. State emissions and enrichments tab – The first column gives a color and number for each annotation. The second column gives the annotation mnemonic. The third column gives a number to each individual state of the mixture component. The next 12 columns give the emission probabilities for each epigenomic mark as indicated. The next two columns give the maximum and minimum emission probabilities represented as percentages. The next column gives the enrichment of the individual states for annotated TSS. Individual states within each mixture are ordered in decreasing value of this enrichment. The next column gives the initial probability of starting in the state overall. The last column gives the initial probability of the state given the component. Transition probabilities tab – This tab shows the transition probability, which indicates the probability, when in the state of the row, of transitioning to the state of the column. Probabilities are shown for individual states of the model, which are ordered and colored based on the component to which they belong, as indicated.
创建时间:
2023-09-07



