Data from: On the optimal trimming of high-throughput mRNA sequence data
Source: DataCite Commons. Updated 2025-06-01; indexed 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.7rm34
Description:
The widespread and rapid adoption of high-throughput sequencing
technologies has afforded researchers the opportunity to gain a deep
understanding of genome level processes that underlie evolutionary change,
and perhaps more importantly, the links between genotype and phenotype. In
particular, researchers interested in functional biology and adaptation
have used these technologies to sequence mRNA transcriptomes of specific
tissues, which in turn are often compared to other tissues, or other
individuals with different phenotypes. While these techniques are
extremely powerful, careful attention to data quality is required. In
particular, because high-throughput sequencing is more error-prone than
traditional Sanger sequencing, quality trimming of sequence reads should
be an important step in all data processing pipelines. While several
software packages for quality trimming exist, no general guidelines for
the specifics of trimming have been developed. Here, using empirically
derived sequence data, I provide general recommendations regarding the
optimal strength of trimming, specifically in mRNA-Seq studies. Although
very aggressive quality trimming is common, this study suggests that a
more gentle trimming, specifically of those nucleotides whose Phred score
< 2 or < 5, is optimal for most studies across a wide
variety of metrics.
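
To make the thresholds above concrete, here is a minimal Python sketch of gentle 3'-end quality trimming. It is not the procedure or software used in the study; it assumes Phred+33 (Sanger/Illumina 1.8+) quality encoding, and the function names `phred_scores` and `trim_3prime` are hypothetical, introduced only for illustration.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into integer Phred scores.

    Assumes Phred+33 encoding by default (an assumption, not from the dataset).
    """
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, threshold=5, offset=33):
    """Drop bases from the 3' end while their Phred score is below threshold.

    A simple end-trimming illustration; real trimmers typically use
    sliding-window or running-sum strategies instead.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality string must have equal length")
    scores = phred_scores(quality, offset)
    end = len(seq)
    # Walk inward from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


# 'I' encodes Phred 40, '#' encodes Phred 2, '!' encodes Phred 0.
read_seq = "ACGTACGT"
read_qual = "IIIII#!#"
trimmed_at_5 = trim_3prime(read_seq, read_qual, threshold=5)
trimmed_at_2 = trim_3prime(read_seq, read_qual, threshold=2)
```

With this toy read, a threshold of 5 removes the last three bases, while a threshold of 2 removes none, because the final base has a Phred score of exactly 2.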
Provider:
Dryad
Created:
2014-01-14



