ModiCal: A targeted calibration workflow for site-specific m5C validation by nanopore direct RNA sequencing
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP187198
Description:
Accurate identification of RNA 5-methylcytidine (m5C) at single-nucleotide resolution remains a central challenge in nanopore direct RNA sequencing (DRS). Current global scanning and modification-aware basecalling methods enable transcriptome-wide profiling but often yield high false-positive rates and lack site-specific accuracy. To address this, we repurposed ModiDeC, originally a de novo multi-modification classifier, into a targeted, high-precision validation tool for RNA modification sites supported by prior biochemical knowledge. This was implemented through a three-step calibration workflow that alternates between biochemical and computational modules, using the well-characterized m5C2278 site in Saccharomyces cerevisiae 25S rRNA as a starting point. The workflow begins with baseline training using short synthetic RNAs carrying either a methylated or an unmodified C2278, continues with calibration using unmodified reference signals curated from full-length in vitro transcribed (IVT) RNA, and concludes with biological validation on wild-type and methyltransferase-knockout yeast samples. The baseline model accurately detected the bona fide m5C2278 site but initially produced off-target predictions. Iterative retraining with unmodified IVT signals progressively reduced and ultimately eliminated these false positives while maintaining a strong signal at the bona fide site. The final model retained enzyme-dependent detection in wild-type versus knockout yeast and, when explicitly targeted, also detected the second rRNA site, C2870, which had remained invisible in the initial analysis. Application to the less well-established m5C1218 site in Dengue virus genomic RNA confirmed that the same calibration logic generalizes across RNA contexts.
Together, this study establishes a reproducible and transferable framework that integrates biochemical validation with iterative neural network refinement, providing a route toward reliable site-specific m5C confirmation by nanopore direct RNA sequencing.
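The iterative calibration step described above can be illustrated with a minimal Python sketch. It is not the ModiDeC API or the authors' implementation: the function names, the callables `predict_ivt` and `retrain`, the 0.5 threshold, and the round limit are all hypothetical. The only idea taken from the description is that IVT RNA is unmodified, so any position called as modified on IVT reads is a false positive whose signals can be fed back as unmodified training examples until no false positives remain.

```python
from typing import Callable, Dict, List


def false_positive_sites(ivt_mod_fraction: Dict[int, float],
                         threshold: float = 0.5) -> List[int]:
    """Return positions called as modified on unmodified IVT reads.

    ivt_mod_fraction maps a transcript position to the fraction of IVT
    reads predicted as modified there. Because IVT RNA carries no
    modifications, every call at or above the (assumed) threshold is a
    false positive.
    """
    return sorted(pos for pos, frac in ivt_mod_fraction.items()
                  if frac >= threshold)


def calibrate(predict_ivt: Callable[[], Dict[int, float]],
              retrain: Callable[[List[int]], None],
              threshold: float = 0.5,
              max_rounds: int = 10) -> List[List[int]]:
    """Alternate prediction and retraining until IVT calls disappear.

    predict_ivt and retrain are hypothetical stand-ins for running the
    current model on IVT reads and for retraining it with unmodified
    signals drawn from the flagged positions.
    Returns the false-positive positions found in each round.
    """
    history: List[List[int]] = []
    for _ in range(max_rounds):
        flagged = false_positive_sites(predict_ivt(), threshold)
        history.append(flagged)
        if not flagged:
            break  # no off-target calls left on the unmodified control
        retrain(flagged)
    return history
```

The returned history makes the progressive reduction of false positives visible round by round. Whether the true site keeps a strong signal would have to be checked separately on modified samples, as in the wild-type versus knockout comparison.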
Created:
2026-01-06



