Files From The Exon Analysis Pipeline of Pediatric Solid Tumors

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/10607083

Description:

Long-read Iso-Seq validation of exon regions. Long-read sequencing methods: Libraries for each tumor were prepared and sequenced on a Pacific Biosciences (PacBio) RSII instrument at JAX. The raw data files were processed according to the PacBio Isoseq3 pipeline, which uses command-line tools provided in PacBio SMRT Tools v10.2 (https://www.pacb.com/support/software-downloads/). For each tumor sample, the pipeline generates non-redundant full-length (FL) transcripts in the following steps: (i) compute consensus sequences and read quality; (ii) remove primers and adapters; (iii) remove polyA tails and artificial concatemers; (iv) perform de novo isoform-level clustering; (v) align FL transcripts to the human reference (GENCODE v40) with minimap2; (vi) collapse transcripts based on genomic mapping, estimate long-read abundance, and generate a GTF annotation file; (vii) run SQANTI3 to classify transcripts and create both a reference-corrected transcriptome FASTA file and a corrected GTF file for each tumor.

Gene target exon PSI value calculation: The “generateEvents” operation from SUPPA2 v2.3 (https://github.com/comprna/SUPPA) was used to extract splice sites from each tumor GTF. The “psiPerEvent” operation was then applied to calculate the Percent Spliced In (PSI) statistic for each alternative splice site, using Kallisto-estimated transcript abundance values. Gene target exons of interest were selected from the output tables by an absolute splice site genomic coordinate match.

Please contact Dr. Ching Lau for additional inquiries (ching.lau@jax.org).
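
As an illustration of the PSI statistic and the coordinate-based exon selection described in this entry, here is a minimal Python sketch. It is not the authors' code and does not call SUPPA2. It assumes the usual transcript-level definition of PSI (summed abundance of the transcripts that include the event, divided by the summed abundance of all transcripts spanning the event) and the skipped-exon event ID layout documented for SUPPA2 ("gene;SE:chrom:e1-s2:e2-s3:strand"). It also assumes that the coordinate match means exact equality. The transcript IDs, TPM values, gene name, and coordinates are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def psi(inclusion_ids: Iterable[str],
        all_ids: Iterable[str],
        tpm: Dict[str, float]) -> Optional[float]:
    """PSI = TPM of inclusion transcripts / TPM of all transcripts of the event.

    Returns None when the event has no expression at all (a simplification
    chosen for this sketch).
    """
    total = sum(tpm.get(t, 0.0) for t in all_ids)
    if total == 0:
        return None
    included = sum(tpm.get(t, 0.0) for t in inclusion_ids)
    return included / total


def parse_se_event(event_id: str) -> Tuple[str, int, int, str]:
    """Return (chrom, exon_start, exon_end, strand) for a skipped-exon event.

    Assumed ID layout: 'gene;SE:chrom:e1-s2:e2-s3:strand', where s2 and e2
    are the start and end of the skipped exon.
    """
    _gene, rest = event_id.split(";", 1)
    event_type, chrom, upstream, downstream, strand = rest.split(":")
    if event_type != "SE":
        raise ValueError(f"not a skipped-exon event: {event_id}")
    _e1, s2 = (int(x) for x in upstream.split("-"))
    e2, _s3 = (int(x) for x in downstream.split("-"))
    return chrom, s2, e2, strand


def matches_target(event_id: str, target: Tuple[str, int, int, str]) -> bool:
    """Select an event only if its exon coordinates equal the target exactly."""
    return parse_se_event(event_id) == target


# Hypothetical example: two transcripts span the event, one includes the exon.
example_tpm = {"tx_incl": 30.0, "tx_skip": 10.0}
example_event = "GENE1;SE:chr20:1000-1200:1300-1500:-"
example_psi = psi(["tx_incl"], ["tx_incl", "tx_skip"], example_tpm)
is_target = matches_target(example_event, ("chr20", 1200, 1300, "-"))
```

In this toy case the inclusion transcript carries 30 of 40 TPM, so the exon's PSI is 0.75. In the pipeline described in this entry, the abundance values would come from Kallisto and the event IDs from the SUPPA2 output tables.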

Created:

2024-03-20
