Additional file 1 of Widespread 3′UTR capped RNAs derive from G-rich regions in proximity to AGO2 binding sites
Source: DataCite Commons. Updated 2024-11-08; indexed 2025-05-06.
Download link:
https://springernature.figshare.com/articles/dataset/Additional_file_1_of_Widespread_3_UTR_capped_RNAs_derive_from_G-rich_regions_in_proximity_to_AGO2_binding_sites/27634817
Description:
Additional file 1: Figures S1, S2, S3, S4. Each supplementary figure corresponds to the main Figs. in the manuscript, in the same order.

Fig. S1 Related to Fig. 1. A Pearson’s correlation of raw CAGE read counts per TSS or consensus cluster across biological replicates and cell types. B Reverse cumulative distribution of CAGE reads after normalisation using the CAGEr package [82]. C Total number of CAGE reads in each sample. D Density of total 5′ CAGE read positions normalised by the length of the corresponding transcript region identified in CAGE-seq libraries of K562 and HeLa samples with two biological replicates each, provided by ENCODE. E Percentage of CAGE tags per transcript region using random primers, Oligo(dT) primers, and a combination of both primers (1:4 Oligo(dT):Random primers) in CAGE-seq libraries of THP-1 cells generated by RIKEN. F-I Pearson’s correlation between CAGE-seq replicates and different cell line samples in 3′UTRs, 5′UTRs, CDS and introns. J Top: Plot of the normalised coverage of the 5′ ends of forward paired-end reads (yellow line) and 3′ ends of reverse paired-end reads (blue line) of RNA-seq relative to 3′UTR CAGE peaks in HeLa cells. Bottom: Schematic representation of paired-end read positioning. Forward and reverse paired-end reads are presented in yellow and blue, respectively. The black box represents the ends of reads that are plotted in the top graph. K RT-qPCR data of gene expression using primers designed to amplify sequences located downstream (3'C), upstream (5'C) and overlapping (AC) the 3′UTR CAGE sites of CDKN1B and JPT2. Data represents fold detection (six biological replicates) using downstream versus upstream/overlapping primers relative to the 3′UTR CAGE peaks. Primer target sequences relative to the 3′UTR CAGE peak are schematically represented on the top right-hand side and visualised for each gene using the IGV genome browser on the bottom. Each dot represents the value of an independent biological replicate.
L Top gene examples with strongest 3′UTR CAGE peaks present in K562 and HeLa cell lines using IGV-genome browser for visualisation. M Visualisation of RNA-seq reads relative to dominant 3′UTR CAGE peaks in CDKN1B and JPT2 genes using IGV-genome browser. N Visualisation of 10 gene examples with 3′UTR CAGE peaks, rG4-seq clusters and long-read CAGE reads using IGV-genome browser.

Fig. S2 Related to Fig. 2. A RNA-map showing normalised density of CBP20-iCLIP crosslink sites relative to dominant 3′UTR and 5′UTR CAGE peaks. B Mean score of GRO-cap seq coverage per CAGE peak for 5′UTR, CDS, intron and 3′UTR regions. The heatmap represents GRO-cap seq scores plotted with deeptools. C RNA-map showing normalised density of cap-CLIP crosslink sites relative to dominant 3′UTR and 5′UTR CAGE peaks. D Mean coverage of conservation score from UCSC phastCons30way track relative to inner 3′UTR CAGE peaks and randomised control positions around the same region of 150-nt window for each peak. E Normalised motif enrichment of canonical PolyA motifs relative to 3′UTR ends and to the dominant 3′UTR CAGE peaks. F Sequence logos around K562 cells’ CAGE peaks across different transcript regions. G The 75-nt region centred on K562 cells’ CAGE peaks at different transcript regions was used to calculate pairing probability with the RNAfold program, and the average pairing probability of each nucleotide is shown for the 50-nt region around CAGE peaks. H GGG-motif enrichment relative to CAGE peaks. I GGG-motif enrichment relative to CAGE peaks. J Summarised score from G4-Hunter prediction tool [36] in the region of 50 nts upstream and downstream relative to CAGE peaks. K Summarised score from G4-Hunter prediction tool [36] in the region of 50 nts upstream and downstream relative to CAGE peaks. L Enrichment of RNA-G-quadruplex sequencing hits from HeLa cells relative to CAGE peaks. M Percentage of G4-seq sites per transcript region.
N Enrichment of eCLIP cross-linking clusters surrounding 5′UTR CAGE peaks from 80 different RBP samples in K562 cells from the ENCODE database using sum of log ratios. The red line represents the threshold of top 10 RBP targets. O Enrichment of eCLIP cross-linking clusters surrounding intronic CAGE peaks from 80 different RBP samples in K562 cells from the ENCODE database using sum of log ratios. The red line represents the threshold of top 10 RBP targets. P Enrichment of eCLIP cross-linking clusters surrounding CDS CAGE peaks from 80 different RBP samples in K562 cells from the ENCODE database using sum of log ratios. The red line represents the threshold of top 10 RBP targets. Q Pearson’s correlation between the 3′UTR CAGE tags and RNA-seq read coverage per gene (top-left). Pearson’s correlation between the 3′UTR crosslink coverage of UPF1-eCLIP and 3′UTR CAGE tags (top-right). Pearson’s correlation between the 3′UTR length and 3′UTR crosslink coverage of UPF1-eCLIP (bottom-left). Pearson’s correlation between the 3′UTR length and 3′UTR CAGE tags (bottom-right). R UPF1-eCLIP crosslink enrichment relative to the distance from the 3′UTR CAGE peaks. S Heatmap of UPF1-eCLIP crosslink site enrichment showing the top 500 3′UTR AGO2 targets in 100-nt flanking region relative to 3′UTR CAGE peaks. The heatmap represents log2 of crosslink counts, normalised by the mean of all counts within 200 nts of the targeting site. T K562 cells were transfected with siRNAs targeting UPF1 or non-targeting controls (C). Left-hand side panel: UPF1 expression was measured with RT-qPCR. Data is normalised by the housekeeping gene RPLP0 and presented as fold change of the control. Right-hand side panel: RT-qPCR data of gene expression using primers designed to amplify sequences located downstream (3'C) and upstream (5'C) of the 3′UTR CAGE sites of CDKN1B and JPT2. Data represents fold detection using downstream (3'C) versus upstream (5'C) primers. Each dot represents an independent biological replicate.
Primer target sequences relative to the 3′UTR CAGE peaks are schematically represented on the right.

Fig. S3 Related to Fig. 3. A Visualisation of 5′ CAGE reads relative to dominant transcription start site (TSS) and relative to small interfering RNA (siRNA) of ISL1 target (in red) for CAGE-ISL1-KD and CAGE-control samples with 3 biological replicates using IGV-genome browser. B Percentage of AGO2-eiCLIP binding sites per transcript region. C Binding enrichment of AGO2-eiCLIP relative to miRNA-regulated transcripts and non-miRNA-regulated transcripts in HeLa. D Heatmap of miRNA-seed sequence enrichment in 30-nt flanking region showing the top 500 AGO2 binding sites relative to AGO-eiCLIP crosslink sites. Metaplot visualises the miRNA-seed sequence composition relative to the 5′ end of the AGO2 binding site. E Heatmap of AGO2-eiCLIP crosslink site enrichment showing the top 500 3′UTR AGO2 targets in 100-nt flanking region relative to 3′UTR CAGE peaks. The heatmap represents log2 of crosslink counts, normalised by the mean of all counts within 200 nts of the targeting site. F RNA-map showing normalised density of AGO2 crosslink sites from AGO2 binding sites that contain (mir+) or are absent (mir-) from predicted miRNA binding sites relative to 3′UTR CAGE peaks. G AGO2 mRNA expression was measured with RT-qPCR in 3 clonal subpopulations generated by single cell sorting of K562 cells transfected with a plasmid expressing Cas9 and two gRNAs targeting AGO2 and preselected by genomic DNA sequencing. Both KO1 and KO2 contained edited AGO2 sequences and control wild-type sequences. Data is normalised by the housekeeping gene RPLP0 and presented as a fold change of the control. Each dot represents an independent biological replicate. H AGO2 protein was detected by western blot in 6 clonal subpopulations generated as in Fig. S3G. Clones 2 and 6 were selected for further experiments and renamed as KO1 and KO2. This experiment was performed once.
I RT-qPCR data of gene expression using primers designed to amplify sequences located downstream (3'C) and upstream (5'C) of the 3′UTR CAGE sites of CDKN1B and JPT2. Data represents fold detection using downstream (3'C) versus upstream (5'C) primers. Each dot represents an independent biological replicate. J Sequence logos and statistics of top 12 significantly enriched motifs of AGO2-eiCLIP binding sites using Homer for de novo motif discovery. K Enrichment of AGO2-eiCLIP cross-linking sites relative to the 3′end of the rG4-seq site. L Heatmap for AGO2-eiCLIP crosslink site enrichment to show the top 500 3′UTR AGO2 targets in 100-nt flanking region relative to 3′end of rG4-seq site. The heatmap represents log2 of crosslink counts, normalised by the mean of all counts within 200 nts of the targeting site. M Enrichment of UPF1-eCLIP cross-linking sites relative to the 3′end of the rG4-seq site. N Heatmap for UPF1-eCLIP crosslink site enrichment to show the top 500 3′UTR UPF1 targets in 100-nt flanking region relative to 3′end of rG4-seq site. The heatmap represents log2 of crosslink counts, normalised by the mean of all counts within 200 nts of the targeting site. O Upset plot of intersection of AGO2-eiCLIP and UPF1-eCLIP binding sites, G4-seq sites relative to 3′UTR CAGE peaks. P RNA-map showing normalised density of AGO2 crosslink sites relative to 3′UTR CAGE peaks intersecting G4-seq site (G4+) or not (G4-) from Fig. S3L. Q RNA-map showing normalised density of UPF1 crosslink sites relative to 3′UTR CAGE peaks intersecting G4-seq site (G4+) or not (G4-) from Fig. S3L.

Fig. S4 Related to Fig. 4. A Density plots showing the shortest distance per detected signal in pixels to a signal of the opposite colour. The dashed line shows the cutoff used to distinguish colocalising and non-colocalising signals.
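
The distance measure described for Fig. S4A (shortest distance from each detected signal to a signal of the opposite colour, compared against a cutoff) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the example coordinates and the 6-pixel cutoff are all hypothetical and chosen only to show the idea.

```python
import math

def nearest_opposite_distances(signals, opposite_signals):
    """Return, for each (x, y) signal, the Euclidean distance in pixels
    to the closest signal of the opposite colour.

    Assumes opposite_signals is non-empty; min() raises ValueError otherwise.
    """
    return [
        min(math.dist(point, other) for other in opposite_signals)
        for point in signals
    ]

def classify_colocalising(distances, cutoff_px):
    """Mark a signal as colocalising when its nearest-neighbour distance
    is at or below the cutoff (the dashed line in a density plot)."""
    return [distance <= cutoff_px for distance in distances]

# Hypothetical spot coordinates in pixels, for illustration only.
red = [(0.0, 0.0), (10.0, 10.0)]
green = [(3.0, 4.0), (50.0, 50.0)]
cutoff_px = 6.0  # hypothetical cutoff, not the value used in the study

red_distances = nearest_opposite_distances(red, green)
red_colocalising = classify_colocalising(red_distances, cutoff_px)
print(red_distances)
print(red_colocalising)
```

A brute-force search like this is quadratic in the number of signals, which is adequate for a few thousand spots per image; a spatial index would be the usual choice for larger sets.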
Provider:
figshare
Created:
2024-11-08



