
Redefining normal breast cell populations using long noncoding RNAs

NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA875146
Description:
Single-cell transcriptomics (scRNAseq) has emerged as a powerful tool to assess the transcriptome of individual cells, revealing new and rare cell types and improving the reconstruction of lineage hierarchies. However, a major limitation of current studies is that they only quantify sequencing reads that map to annotated isoforms, which represent roughly one third of all human transcripts. Since the vast majority of unannotated genes are long noncoding RNAs (lncRNAs), these remain largely unexplored. The human breast is a complex organ that harbours different cell populations. Interestingly, different mammary cell populations give rise to different breast tumour subtypes, and the cell of origin determines the tumour's molecular characteristics and clinical outcomes. Using deep bulk RNAseq of normal breast epithelial cells, we discovered >13,000 lncRNAs (95% of them unannotated) and mapped their expression levels in scRNAseq, showing that they perform better than protein-coding genes at clustering the different cell types. On average, each cell expressed 900 lncRNAs and 4,000 protein-coding genes. LncRNAs had significantly higher cluster specificity levels and were expressed in fewer cells than their protein-coding counterparts, which is in line with the view that lncRNAs are highly cell type-specific and that their apparently lower expression levels are a result of bulk RNAseq estimates. Indeed, when investigating the expression of lncRNAs at the cellular level, we confirmed it to be comparable to that of protein-coding genes. We conducted a thorough assessment of these lncRNAs, their expression levels in individual cells and across populations, and their correlation with breast cancer subtypes. On average, each cell population has nearly 300 lncRNA markers, of which at least 30 (10%) can be used to classify breast tumours into their different subtypes. Notably, using their predicted protein-coding targets or annotation, we were able to link several specific lncRNAs to tumour subtypes and cancer hallmarks.

Created:
2022-08-30