Table S1 - Identification of Nuclear and Cytoplasmic mRNA Targets for the Shuttling Protein SF2/ASF
Source: NIAID Data Ecosystem. Indexed: 2026-03-06.
Download link:
https://figshare.com/articles/dataset/Identification_of_Nuclear_and_Cytoplasmic_mRNA_Targets_for_the_Shuttling_Protein_SF2_ASF/149443
Description:
A complete list of binding site annotations based on the Ensembl, UCSC Known Gene and Rfam databases. The Excel file can be filtered to find binding sites identified by CLIP in nuclear, cytoplasmic or polysomal extracts.
Headers for the table are as follows:
1. Chromosome: The chromosome of the human genome to which the sequence block mapped.
2. Strand: CLIP preserves the orientation of the captured RNA, so the strand of the locus can be determined.
3. Id: Generic identifier given to each binding site during the annotation workflow.
4. Region start/end: The precise chromosomal coordinates defining the unique or overlapping sequence blocks. Regions are defined by at least 2 partially overlapping binding sites mapping to the locus.
5. # of fragments in the Cytoplasm/Nucleus/Polysome: Indicates whether the binding site was identified in each compartment. The number specifies whether a sequence block was absent, present in a single assay, or present in multiple assays.
6. # of sample targets: Useful for finding binding sites that are present in 1, 2 or all 3 cellular fractions.
7. Gene Annotation: Describes the relationship of the sequence block to annotated protein-coding genes, based on the UCSC Known Gene database.
8. Exon Style: Describes the relationship of the sequence block to annotated parts of protein-coding genes (exon, intron, etc.). The strategy is presented in Supplementary Figure 2.
9. UCSC Known Gene database ID: The name of a specific gene cluster in the UCSC Known Gene database.
10. Gene Symbol: The approved HUGO Gene Nomenclature Committee (HGNC) symbol for each protein-coding gene.
11. Exon Position: The position of the exon within the protein-coding gene.
12. First/Last exon columns: A "1" in the First or Last column indicates a 5′ or 3′ terminal exon, respectively. A "1" in both columns denotes that the sequence block is in a single-exon gene.
13. Upstream/Downstream Exon Position: Useful for determining the position of introns within a protein-coding gene.
14. ncRNA Annotation: Describes the relationship of a sequence block to annotated non-coding RNA (ncRNA). Annotation is based on the Rfam database.
15. ncRNA Name: The gene symbol of each ncRNA containing a sequence block.
16. UTR type: Describes the relationship between sequence blocks and untranslated regions of protein-coding genes.
17. Splicing Event: Gives alternative splicing annotation for exonic binding sites, based on the AceVIEW, ALT Events and Fast-db databases.
(0.13 MB XLS)
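The filtering by compartment described above could look roughly like the following sketch. It is only an illustration: the column names are guessed from the header descriptions and may not match the spreadsheet exactly, the three rows are made-up toy values rather than data from the table, and a count of 0 is assumed to mean "absent". In practice the rows would be loaded from the Excel file with a spreadsheet reader of your choice.

```python
# Assumed column names (taken from the header descriptions, not verified
# against the actual spreadsheet).
CYTO = "# of fragments in the Cytoplasm"
NUC = "# of fragments in the Nucleus"
POLY = "# of fragments in the Polysome"
FRACTIONS = (CYTO, NUC, POLY)

# Toy rows for illustration only; these are NOT real entries from Table S1.
rows = [
    {"Id": "site_A", CYTO: 0, NUC: 2, POLY: 0},
    {"Id": "site_B", CYTO: 1, NUC: 1, POLY: 0},
    {"Id": "site_C", CYTO: 2, NUC: 1, POLY: 1},
]


def fractions_detected(row):
    """Count the fractions with a non-zero fragment count (assumes 0 = absent)."""
    return sum(1 for col in FRACTIONS if row[col] > 0)


def sites_in(rows, column):
    """Ids of binding sites detected in the given fraction."""
    return [r["Id"] for r in rows if r[column] > 0]


def sites_only_in(rows, column):
    """Ids of binding sites detected in the given fraction and no other."""
    return [r["Id"] for r in rows
            if r[column] > 0 and fractions_detected(r) == 1]


nuclear = sites_in(rows, NUC)
nuclear_only = sites_only_in(rows, NUC)
shared_by_all = [r["Id"] for r in rows if fractions_detected(r) == 3]
```

Here `fractions_detected` recomputes what the "# of sample targets" column is described as providing; if that column is present, filtering on it directly should be equivalent under the assumption above.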
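The First/Last exon flags from the header descriptions can be decoded with a small helper. This is a minimal sketch under assumptions: the function names are hypothetical, and the cells are assumed to hold 1 (as a number or text) when flagged and anything else (0, blank, missing) otherwise.

```python
def is_flagged(value):
    """True if a cell holds 1, 1.0 or "1"; blanks and non-numeric text are False."""
    try:
        return float(value) == 1.0
    except (TypeError, ValueError):
        return False


def classify_exon(first_exon, last_exon):
    """Interpret the First/Last exon columns for one sequence block."""
    first = is_flagged(first_exon)
    last = is_flagged(last_exon)
    if first and last:
        return "single-exon gene"
    if first:
        return "5' terminal exon"
    if last:
        return "3' terminal exon"
    return "not flagged as terminal"
```

For example, `classify_exon(1, 0)` gives `"5' terminal exon"`, while a row with "1" in both columns is reported as a single-exon gene.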
Created:
2008-10-08



