Long-read cDNA sequencing identifies functional pseudogenes in the human transcriptome

Source: NIAID Data Ecosystem. Indexed 2026-03-14.

Download link:

https://www.ncbi.nlm.nih.gov/sra/SRP287687

Description:

Pseudogenes are gene copies that were presumed to be functionless relics of evolution due to deleterious mutations or transcriptional silencing. Where pseudogenes are transcribed, they can encode intact or truncated proteins or act through RNA-intrinsic mechanisms. However, the extent, characteristics and functional impact of the human pseudogene transcriptome are unclear. Short-read sequencing platforms have limited power to identify full-length pseudogene transcripts or accurately quantify their expression due to the high similarity of pseudogenes to their parent genes. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identified hundreds of novel transcribed pseudogenes that are expressed in tissue-specific patterns. Pseudogene transcripts exhibit complex splicing patterns and contribute to the coding sequences of genes. Many pseudogene transcripts contain intact open reading frames and are potential unannotated protein-coding genes, some of which we demonstrate to be efficiently translated in human cells. To demonstrate the impact of noncoding pseudogenes on the cellular transcriptome, we ablated the pseudogene PDCL3P4 and identified hundreds of dysregulated genes. The PDCL3P4 transcript is enriched in the nucleus and may constitute a novel pseudogene-derived trans-acting long noncoding RNA. This study identifies a complex, dynamic human pseudogene transcriptome, which will facilitate determination of the impact of pseudogene transcripts in health and disease.
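
The description mentions pseudogene transcripts that contain intact open reading frames. As an illustration only, the sketch below shows one simple way such a check could be made on a single transcript sequence. It is not the authors' pipeline: the function name `longest_orf`, the `min_codons` threshold, and the choice to scan only the forward strand from ATG to the first in-frame stop codon are all assumptions made for this example.

```python
# Hypothetical helper, not taken from the study: scan the three forward
# reading frames of a transcript for the longest ATG-to-stop ORF.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, min_codons=2):
    """Return (start, end) of the longest intact ORF, or None.

    start is the 0-based index of the ATG; end is exclusive and
    includes the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or lowercase input
    best = None
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3  # coding codons, stop excluded
                length = i + 3 - start
                if n_codons >= min_codons and (
                    best is None or length > best[1] - best[0]
                ):
                    best = (start, i + 3)
                start = None  # close this ORF and look for the next ATG
    return best
```

An ORF that runs off the end of the sequence without a stop codon is not reported, so a truncated reading frame returns `None` rather than a partial hit. A real analysis would also need to consider the reverse strand, alternative start codons, and a much larger minimum length.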

Created:

2022-11-02
