
Lambda genome data with either CpG/GpC methylation, CpG glucosylation, or no modifications. ReQuant data

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB77524
Description:
Nanopore sequencing allows identification of base modifications, such as methylation, directly from raw current data. Prevailing approaches, including deep learning (DL) methods, require training data covering all possible sequence contexts. This data can be prohibitively expensive or impossible to obtain for some modifications. Hence, research into DNA modifications focuses on the most prevalent modification in human DNA: 5mC in a CpG context. Improved generalisation is required to reach the technology’s full potential: calling any modification from raw current data. We developed ReQuant, an algorithm to impute full, k-mer based, modification models from limited k-mer context examples. Our method is highly accurate for calling modifications (CpG/GpC methylation, and CpG glucosylation) in Lambda phage R9 data when fitting on ≤25% of all possible 6-mers with a modification, and extends to human R10 data. The success of our approach shows that DNA modifications have a consistent and therefore predictable effect on Nanopore current levels, suggesting that interpretable rule-based imputation in unseen contexts is possible. Our approach circumvents the complexity of modification-specific DL tools and enables modification calling when not all sequence contexts can be obtained, opening up a vast field of biological base modification research.
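To give a sense of scale for the "≤25% of all possible 6-mers with a modification" figure, the sketch below enumerates DNA 6-mers and counts those containing a CpG site. It assumes that "6-mers with a modification" means 6-mers containing the motif at least once; the description does not define this, and the helper name `kmers_with_motif` is hypothetical. It is not part of ReQuant.

```python
from itertools import product


def kmers_with_motif(k: int, motif: str) -> list[str]:
    """Return all DNA k-mers over ACGT that contain `motif` at least once."""
    kmers = ("".join(bases) for bases in product("ACGT", repeat=k))
    return [kmer for kmer in kmers if motif in kmer]


# Assumption: a modifiable CpG context is any 6-mer containing "CG".
total_6mers = 4 ** 6
cpg_6mers = kmers_with_motif(6, "CG")

# A 25% fitting budget under that assumption, rounded down.
budget = len(cpg_6mers) // 4

print(total_6mers, len(cpg_6mers), budget)  # 4096 1185 296
```

Under this reading, fitting on a quarter of the CpG-containing contexts would mean roughly 300 of 1185 6-mers, with the rest imputed. The same count holds for GpC (`"GC"`) by symmetry.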
Created:
2024-08-13