Modeling bias and variation in the stochastic processes of small RNA sequencing
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93399
Description:
The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers is hindered by high variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias, stochastic variation in library amplification steps, and variation in sequencing depth. Our analytical contributions are the description of the Linear Quadratic (LQ) relation between the mean and variance of the sequence counts in an RNA-seq experiment and the derivation of the Poisson truncated mixture as the underlying probability distribution for RNA-seq data. Using a large number of sequencing datasets, we demonstrate how this modeling framework can be used to calculate empirical correction factors for ligase bias while accounting for random variation in sequence counts. Bias correction may remove the majority of bias in the absence of differential expression and more than 40% of the bias in the presence of variable expression of miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Small RNA libraries were prepared from equimolar or ratiometric mixes of two synthetic RNA pools at the indicated input concentrations using various protocol variations.
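The two quantities named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the parameterization of the LQ relation, so the form variance ≈ a·mean + b·mean² is an assumption here, as is the definition of a correction factor as expected count divided by observed count in an equimolar pool. The function names `fit_lq` and `equimolar_bias_factors` are hypothetical.

```python
import numpy as np

def fit_lq(means, variances):
    """Least-squares fit of variance ~ a*mean + b*mean**2.

    Assumed form of a linear-quadratic mean-variance relation
    (no intercept); the source does not state the exact model.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Design matrix with one linear and one quadratic column.
    design = np.column_stack([means, means ** 2])
    coef, *_ = np.linalg.lstsq(design, variances, rcond=None)
    return float(coef[0]), float(coef[1])

def equimolar_bias_factors(counts):
    """Per-sequence correction factors from an equimolar pool.

    Assumption: every sequence is present at the same molar amount,
    so each is expected to receive the same share of reads and
    factor = expected / observed. Counts must be positive.
    """
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() / counts.size
    return expected / counts

# Synthetic per-sequence means and variances across replicates.
means = np.array([10.0, 100.0, 1000.0])
variances = 1.0 * means + 0.05 * means ** 2
a, b = fit_lq(means, variances)

# Synthetic equimolar counts; corrected counts come out equal.
observed = np.array([50.0, 100.0, 150.0])
factors = equimolar_bias_factors(observed)
corrected = observed * factors
```

Multiplying observed counts by the factors equalizes them for the calibration pool by construction; whether the same factors transfer to other inputs is the empirical question the study addresses.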
Created:
2019-05-15



