Supporting data for "Using synthetic RNA to benchmark poly(A) length inference from direct RNA sequencing."
Source: DataCite Commons. Updated: 2025-07-29. Indexed: 2026-05-03.
Download link:
http://gigadb.org/dataset/102736
Description:
Polyadenylation is a dynamic process that is important in cellular physiology. Oxford Nanopore Technologies direct RNA sequencing provides a strategy for sequencing full-length RNA molecules and for analysing the transcriptome and epitranscriptome. Several tools are currently available for poly(A) tail-length estimation, including the well-established tailfindr and nanopolish and two more recent deep learning models, Dorado and BoostNano. However, there has been limited benchmarking of the accuracy of these tools against gold-standard datasets. In this paper we evaluate the four poly(A) estimation tools using synthetic RNA standards (Sequins), which have known poly(A) tail lengths and so allow the accuracy of tail-length estimation to be measured directly. All four tools generate mean tail-length estimates that lie within 12% of the correct value. Overall, Dorado is recommended as the preferred approach because of its relatively fast run times, low coefficient of variation, and ease of use through its integration with base-calling.
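The two accuracy measures named in the description, the deviation of the mean estimate from the known tail length and the coefficient of variation, can be illustrated with a minimal Python sketch. The function name `tail_length_summary`, the example read values, and the expected length of 30 nt are all hypothetical and are not taken from the dataset or from any of the four tools.

```python
from statistics import mean, stdev


def tail_length_summary(estimates, expected_length):
    """Summarise per-read poly(A) tail-length estimates against a known length.

    estimates: per-read tail-length estimates in nucleotides (hypothetical input).
    expected_length: the known tail length of the synthetic standard.
    """
    if len(estimates) < 2:
        raise ValueError("need at least two estimates to compute a standard deviation")
    if expected_length <= 0:
        raise ValueError("expected_length must be positive")

    avg = mean(estimates)
    sd = stdev(estimates)  # sample standard deviation
    return {
        "mean": avg,
        # Signed deviation of the mean from the known value, as a fraction.
        "relative_error": (avg - expected_length) / expected_length,
        # Coefficient of variation: spread relative to the mean.
        "cv": sd / avg,
    }


# Made-up estimates for a standard assumed to carry a 30 nt tail.
example = tail_length_summary([28, 30, 32], expected_length=30)
print(example)
```

Under this sketch, a tool would meet the 12% figure quoted above when the absolute value of `relative_error` is at most 0.12, and a lower `cv` indicates more consistent per-read estimates.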
Provider:
GigaScience Database
Created:
2025-07-29



