Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE160910
Description:
Systematic delineation of complex biological systems is an ever-challenging and resource-intensive process. Single-cell transcriptomics allows us to study cell-to-cell variability in complex tissues at an unprecedented resolution. Accurate modeling of gene expression plays a critical role in the statistical determination of tissue-specific gene expression patterns. In the past few years, considerable efforts have been made to identify appropriate parametric models for single-cell expression data. Zero-inflated versions of the Poisson/Negative Binomial and Log-Normal distributions have emerged as the most popular alternatives because they can accommodate the high dropout rates commonly observed in single-cell data. While the majority of parametric approaches model expression estimates directly, we explore the potential of modeling expression ranks as robust surrogates for transcript abundance. Here we examined the performance of the Discrete Generalized Beta Distribution (DGBD) on real data and devised a Wald-type test for comparing gene expression across two phenotypically divergent groups of single cells. We performed a comprehensive assessment of the proposed method to understand its advantages over some of the existing best-practice approaches. We concluded that ROSeq, the proposed differential expression test, strikes a reasonable balance between Type 1 and Type 2 errors, is exceptionally robust to expression noise, and scales rapidly with increasing sample size. For wider dissemination and adoption of the method, we created an R package called ROSeq and made it available on the Bioconductor platform.
Two single-cell datasets are included: (1) 150 single human foreskin BJ fibroblast cells and (2) 352 single K562 cells. Both were profiled using a Fluidigm-based single-cell RNA-seq protocol to characterize the cellular heterogeneity of two phenotypically divergent cell types. In addition, RNA expression profiling of bulk cells (3 replicates for BJ and 4 replicates for K562) was performed in tubes. The Polaris™ integrated fluidic circuit (IFC) was used on the Polaris system to sequester up to 48 single cells per cell type per IFC run. The captured single cells were processed for cell lysis, reverse transcription (RT), and full-length transcriptome amplification using template-switching chemistry. Following harvest from the IFC, sequencing libraries were generated using a modified Nextera® protocol and sequenced on the Illumina® NextSeq 500 platform.
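The description names the Discrete Generalized Beta Distribution (DGBD) as the model for expression ranks but does not give its form. As a rough illustration only, the sketch below assumes the commonly cited two-parameter form, in which the probability of rank r among N ranks is proportional to (N + 1 - r)^b / r^a. The function names, the parameter names `a` and `b`, and the example counts are hypothetical. This is not the ROSeq implementation, and it does not cover how expression values are binned into ranks or the Wald-type test.

```python
import numpy as np


def dgbd_log_pmf(n_ranks, a, b):
    """Log-probability of each rank r = 1..n_ranks under an assumed DGBD form.

    Assumption: P(r) is proportional to (N + 1 - r)**b / r**a with N = n_ranks.
    """
    r = np.arange(1, n_ranks + 1, dtype=float)
    # Unnormalized log-weights for every rank.
    log_w = b * np.log(n_ranks + 1 - r) - a * np.log(r)
    # Normalize in log space (log-sum-exp) for numerical stability.
    m = log_w.max()
    log_norm = m + np.log(np.sum(np.exp(log_w - m)))
    return log_w - log_norm


def dgbd_pmf(n_ranks, a, b):
    """Probabilities of ranks 1..n_ranks; they sum to 1."""
    return np.exp(dgbd_log_pmf(n_ranks, a, b))


def dgbd_log_likelihood(rank_counts, a, b):
    """Log-likelihood of observed per-rank counts (hypothetical input format).

    rank_counts[i] is the number of cells whose expression fell into rank i + 1.
    """
    counts = np.asarray(rank_counts, dtype=float)
    return float(np.sum(counts * dgbd_log_pmf(counts.size, a, b)))


if __name__ == "__main__":
    # Made-up counts for six ranks, purely for illustration.
    example_counts = [40, 25, 15, 10, 6, 4]
    print(dgbd_pmf(len(example_counts), a=1.0, b=0.5))
    print(dgbd_log_likelihood(example_counts, a=1.0, b=0.5))
```

Under these assumptions, one would estimate `a` and `b` for each group of cells by maximizing `dgbd_log_likelihood` and then compare the fitted parameters between groups. The ROSeq package on Bioconductor remains the reference for the actual procedure.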
Created:
2021-03-09



