Improving the diversity of captured full-length isoforms using a normalized single-molecule RNA sequencing method

Source: NIAID Data Ecosystem (indexed 2026-04-25)

Download link:

https://www.ncbi.nlm.nih.gov/sra/SRP187501

Resource description:

Human genes form a large variety of isoforms after transcription, encoding distinct transcripts that exert different functions. Single-molecule RNA sequencing facilitates accurate identification of these isoforms by significantly extending nucleotide read length. However, gene and isoform diversity is poorly represented among the mRNA molecules captured by sequencing, because gene expression levels vary widely and sequence output is relatively low. Here, we present a modified protocol that adds cDNA normalization before library preparation for PacBio RS II sequencing, thereby generating an increased number of molecules representing different isoforms. Validation sequencing of blood cells, and of gastric cancer and adjacent non-malignant tissues, showed an additional 1.8-2.3-fold and 1.8-4.7-fold increase, respectively, in high-quality isoform species per 100,000 raw reads with the new cDNA normalization-based capture procedure, as compared to the non-normalized procedure. The normalized libraries detected a substantially increased number of low-abundance transcripts encoding functionally important proteins such as transcription factors and kinases. In addition, we developed an allele-specific isoform identification and quantification tool (ASIIQT) for non-normalized next-generation RNA sequencing data, which sequentially corrects, phases, and quantifies the isoforms identified by normalized single-molecule sequencing. Finally, we aim to provide proof-of-concept data establishing the superiority of the new RNA sequencing protocol and ASIIQT over existing protocols by profiling and comparing the transcriptomes of gastric signet-ring cell carcinomas and paired non-malignant gastric tissues and by identifying new cancer-specific transcriptome signatures, thereby demonstrating the utility of the newly developed protocols in gene expression data analyses.

Created:

2020-03-01
