five

REPARATION: Ribosome Profiling Assisted (Re-) Annotation of Bacterial genomes

收藏
NIAID Data Ecosystem2026-04-30 收录
下载链接:
https://www.ncbi.nlm.nih.gov/sra/SRP094797
下载链接
链接失效反馈
官方服务:
资源简介:
The delineation of genes in bacteria has remained an important challenge because prokaryotic genomes are often tightly packed frequently resulting in overlapping genes. We hereby present a de novo approach called REPARATION (RibosomeE Profiling Assisted (Re-)AnnotaTION) to delineate translated open reading frames (ORFs) in bacteria independent of (available) genome annotation. By deep sequencing of ribosome protected mRNA fragments (RPF) to map translating ribosomes across the entire genome, REPARATION takes advantage of the recently developed ribosome profiling (Ribo-seq) technique. REPARATION starts by traversing the entire genome to generate all possible ORFs and then collects their corresponding RPF signal information. Based on a growth curve model to estimate minimum ORF read density and Ribo-seq RPF coverage, thresholds indicative of translation is estimated. Finally, our algorithm applies a random forest model to build a classifier to classify putative protein coding ORFs. We evaluated the performance of REPARATION on 3 annotated bacterial species using in-house generated Ribo-seq data and matching N-terminal and shotgun proteomics data next to publically available Ribo-seq data. In all cases, about 80% of the ORFs predicted by REPARATION were previously annotated as protein coding. While 13-20% were variants of previously annotated ORFs and about 3-4% point to novel translated ORFs within intergenic or other regions previously annotated as non-coding. Without stringent length restrictions REPARATION was able to identify several small ORFs (sORFs). Multiple supportive evidence from matching MS data and sequence conservation analysis was obtained to validate predicted ORFs. Overall design: Ribosome profiling and mRNA sequencing of monosome and polysome-enriched samples of Salmonella enterica serovar Typhimurium - strain SL1344.
创建时间:
2021-11-05
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作