Files From The Exon Analysis Pipeline of Pediatric Solid Tumors
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10607083
Description:
Long-read iso-seq validation of exon regions:
Methods of the Long-read Sequencing:
Libraries for each tumor were prepared and sequenced on a Pacific Biosciences (PacBio) RSII instrument at JAX. The raw data files were processed according to the PacBio Isoseq3 pipeline, which uses a number of command-line tools provided in PacBio SMRT Tools v10.2 (https://www.pacb.com/support/software-downloads/).
The pipeline generates non-redundant full-length (FL) transcripts for each tumor sample in the following steps: (i) compute consensus sequences and read quality, (ii) remove primers and adapters, (iii) remove polyA tails and artificial concatemers, (iv) perform de novo isoform-level clustering, (v) align FL transcripts to the human reference (GENCODE v40) with minimap2, (vi) collapse transcripts based on genomic mapping, estimate long-read abundance, and generate a GTF annotation file, (vii) classify transcripts with SQANTI3, which also creates a reference-corrected transcriptome FASTA file and a corrected GTF file for each tumor.
Gene target exon PSI value calculation:
The “generateEvents” operation from SUPPA2 v2.3 (https://github.com/comprna/SUPPA) was used to extract splice sites from each tumor GTF. The “psiPerEvent” operation was then applied to calculate the Percent Spliced In (PSI) statistic for each alternative splice site, using Kallisto-estimated transcript abundance values. Gene target exons of interest were selected from the output tables by an exact match on the absolute genomic coordinates of the splice site.
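The PSI calculation and coordinate-based selection described above can be illustrated with a minimal Python sketch. It is not the SUPPA2 implementation. It assumes the usual transcript-level definition of PSI (summed abundance of the transcripts that contain the event, divided by the summed abundance of all transcripts that define it) and an event identifier of the form `gene;TYPE:chrom:coord:...:strand`. The function names `psi_for_event` and `select_by_coordinates` and all example identifiers and values are hypothetical.

```python
from typing import Dict, Iterable, Optional


def psi_for_event(
    tpm: Dict[str, float],
    inclusion_ids: Iterable[str],
    total_ids: Iterable[str],
) -> Optional[float]:
    """Return PSI for one event, or None when the event has no expression.

    tpm maps transcript id -> abundance (e.g. Kallisto TPM).
    inclusion_ids are the transcripts that contain the event;
    total_ids are all transcripts that define the event.
    """
    included = sum(tpm.get(t, 0.0) for t in inclusion_ids)
    total = sum(tpm.get(t, 0.0) for t in total_ids)
    if total == 0:
        return None  # PSI is undefined without any supporting abundance
    return included / total


def select_by_coordinates(
    psi_by_event: Dict[str, Optional[float]],
    chrom: str,
    junction: str,
) -> Dict[str, Optional[float]]:
    """Keep events whose id carries an exact chrom and coordinate match.

    Assumes ids shaped like 'gene;TYPE:chrom:coord:...:strand' (illustrative).
    """
    selected = {}
    for event_id, psi in psi_by_event.items():
        _, _, coords = event_id.partition(";")
        fields = coords.split(":")  # [type, chrom, coord..., strand]
        if len(fields) >= 3 and fields[1] == chrom and junction in fields[2:-1]:
            selected[event_id] = psi
    return selected


# Hypothetical example values, not taken from the dataset.
tpm = {"t1": 30.0, "t2": 10.0, "t3": 0.0}
events = {
    "G1;SE:chr1:100-200:300-400:+": psi_for_event(tpm, ["t1"], ["t1", "t2"]),
    "G2;SE:chr2:5-6:7-8:-": psi_for_event(tpm, ["t3"], ["t3"]),
}
targets = select_by_coordinates(events, "chr1", "100-200")
```

Matching on the exact coordinate string, rather than on overlap, mirrors the absolute coordinate match described above: an event is kept only if one of its coordinate fields is identical to the target.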
Please contact Dr Ching Lau for additional inquiries (ching.lau@jax.org).
Created:
2024-03-20



