Human and Mouse UTRomes
DataCite Commons | Updated: 2024-04-01 | Indexed: 2024-08-18
Download link:
https://figshare.com/articles/dataset/Human_and_Mouse_UTRomes/23549526
Resource description:
Overview
This dataset contains BED and GTF files representing the cleavage sites and 3'UTR isoform annotations derived from reprocessing Microwell-seq data. These objects are part of the minimum dataset required for verifying the analysis reported in Fansler et al., bioRxiv, 2023.

Description
The <b>BED files</b> contain candidate cleavage sites from the Mouse Cell Atlas and Human Cell Landscape datasets. In brief, overlapping paired-end reads were merged with PEAR, cell barcodes were extracted with <code>umi_tools</code>, poly-A tails were removed with <code>cutadapt</code>, and the remaining reads were mapped to the <code>hg38</code> or <code>mm10</code> genome using HISAT2. Reads were partitioned into cell types according to annotations from the original publications. Per cell type, the 5' ends of alignments were summarized, counts were merged to the mode within 30 nts, and the resulting sites were filtered to a minimum threshold of 5 TPM. The resulting BED files identify the cell type cluster in the <code>name</code> column and the number of observed reads in the <code>score</code> column.

The <b>GTF files</b> are augmentations of GENCODE vM25 (mouse) and v39 (human) with novel cleavage sites, with each transcript truncated to 500 nt. In brief, the sites provided in the BED files were harmonized across cell types by merging to the mode within 30 nts. The candidate sites were then serially classified as (1) "validated" if already in GENCODE; (2) "supported" if found in PolyASite2.0 at 3 TPM or higher; (3) "likely" if <code>cleanUpdTSeq</code> scored the posterior probability of being an internal priming site below 0.0001%; and (4) "unlikely" otherwise. The "supported" and "likely" candidates were then used to augment the GENCODE annotations of protein-coding transcripts, and each transcript was truncated to the 500 nts at its 3' end. The final annotations identify the regions where the scUTRquant pipeline will quantify scRNA-seq data.

Data Generation
All code required to generate these files is available at:
https://github.com/Mayrlab/mca-utrome (https://doi.org/10.5281/zenodo.8118416)
https://github.com/Mayrlab/hcl-utrome (https://doi.org/10.5281/zenodo.8118411)
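To make the "merge to the mode, then filter by TPM" step in the description more concrete, here is a minimal Python sketch. It is not taken from the linked repositories. The greedy highest-count-first strategy, the inclusive 30 nt window, the TPM denominator (all reads in the input), and the function names `merge_to_mode` and `filter_tpm` are all assumptions made for illustration.

```python
def merge_to_mode(counts, radius=30):
    """Greedily collapse per-position read counts onto local modes.

    counts: dict mapping genomic position -> read count, for a single
            chromosome and strand (assumed input layout).
    radius: positions within this distance (inclusive, an assumption)
            of a mode are absorbed into it.
    Returns a dict mapping mode position -> summed read count.
    """
    remaining = dict(counts)
    merged = {}
    # Visit positions from highest to lowest count; ties go to the
    # smaller coordinate so the result is deterministic.
    for pos in sorted(counts, key=lambda p: (-counts[p], p)):
        if pos not in remaining:
            continue  # already absorbed into an earlier mode
        neighbors = [p for p in remaining if abs(p - pos) <= radius]
        merged[pos] = sum(remaining.pop(p) for p in neighbors)
    return merged


def filter_tpm(merged, min_tpm=5.0):
    """Keep sites whose count is at least min_tpm tags per million.

    The denominator is the sum of all counts passed in (an assumption).
    """
    total = sum(merged.values())
    if total == 0:
        return {}
    return {pos: n for pos, n in merged.items() if n / total * 1e6 >= min_tpm}


# Toy example: positions 110 and 125 collapse onto the mode at 100.
toy_counts = {100: 10, 110: 3, 125: 2, 200: 1, 500: 7}
toy_merged = merge_to_mode(toy_counts)
```

The neighbor scan is quadratic in the number of positions, which is fine for a toy example. A sorted-array or interval-based approach would be needed at genome scale.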
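The serial classification of candidate sites described above amounts to a first-match-wins cascade, which can be sketched as follows. The function and argument names are hypothetical. The inputs are assumed to have been computed already: GENCODE overlap, the PolyASite2.0 TPM value, and the `cleanUpdTSeq` posterior probability expressed as a fraction rather than a percentage.

```python
def classify_site(in_gencode, polyasite_tpm, internal_priming_prob,
                  min_tpm=3.0, max_ip_prob=1e-6):
    """Assign one label to a candidate cleavage site; first match wins.

    in_gencode:            True if the site is already annotated in GENCODE.
    polyasite_tpm:         TPM of the site in PolyASite2.0, or None if absent.
    internal_priming_prob: posterior probability (0 to 1) that the site is an
                           internal priming artifact, or None if unscored.
    max_ip_prob:           0.0001% written as a fraction, i.e. 1e-6.
    """
    if in_gencode:
        return "validated"
    if polyasite_tpm is not None and polyasite_tpm >= min_tpm:
        return "supported"
    if internal_priming_prob is not None and internal_priming_prob < max_ip_prob:
        return "likely"
    return "unlikely"


def augments_annotation(label):
    """Only 'supported' and 'likely' sites are added to the annotation."""
    return label in {"supported", "likely"}
```

Because the checks run in order, a site that is already in GENCODE is labeled "validated" regardless of its PolyASite2.0 or `cleanUpdTSeq` values.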
Provider:
figshare
Created:
2023-06-20



