Minimal repeats identified by machine learning as novel cross-species markers of translation initiation sites
Source: Figshare. Updated: 2025-09-26. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_b_Minimal_repeats_identified_by_interpretable_machine_learning_as_novel_cross-species_markers_of_translation_initiation_sites_b_/28934366
Description:
Statistical approaches indicate that tandem repeats (TRs) play a crucial regulatory role in translation initiation site (TIS) selection and proteomic diversity. Using machine learning, we extended these earlier approaches and investigated the impact of the TR spectrum, ranging from 2 to 85 repeats, on TIS selection across four species: human, mouse, bovine, and fruit fly. We identified a subset of 50 key motifs that distinguish TIS-present from TIS-absent regions with high accuracy. Most of these motif sequences (80%) were minimal repeats (MRs), mainly of 2 to 3 repeats. Most of these motifs were evolutionarily conserved, underscoring their functional importance, while certain species-specific MRs served as genomic fingerprints, reflecting unique regulatory adaptations. Additionally, the dense distribution of MRs around TISs highlighted their potential as genomic codes for identifying TIS hotspots. This study indicates that, alongside TRs, MRs are key genomic markers for TIS selection, offering insights into their biological roles and evolutionary significance. The mechanisms by which MRs and TRs act remain to be established in future studies.
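To make the repeat terminology concrete, the following is a minimal Python sketch of how runs of back-to-back motif copies could be located in a sequence and labelled as minimal repeats. It is an illustration only, not the pipeline used to build this dataset: the function names `find_tandem_runs` and `is_minimal_repeat`, the example sequence, and the motifs are all hypothetical, and the 2-to-3-copy threshold simply follows the range mentioned in the description.

```python
def find_tandem_runs(seq, motif, min_copies=2):
    """Return (start, copies) for each run of back-to-back motif copies.

    Hypothetical helper for illustration; not the dataset authors' method.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    runs = []
    i = 0
    while i + k <= len(seq):
        # Count consecutive copies of the motif starting at position i.
        copies = 0
        while seq.startswith(motif, i + copies * k):
            copies += 1
        if copies >= min_copies:
            runs.append((i, copies))
            i += copies * k  # jump past the whole run
        else:
            i += 1  # no qualifying run here; try the next base
    return runs


def is_minimal_repeat(copies):
    """Assumed definition: a minimal repeat has 2 or 3 tandem copies."""
    return 2 <= copies <= 3


# Made-up sequence containing GCC x2 and CAG x3.
example = "TTGCCGCCATGGCAGCAGCAGAA"
gcc_runs = find_tandem_runs(example, "GCC")
cag_runs = find_tandem_runs(example, "CAG")
```

Here `gcc_runs` and `cag_runs` each hold one run, and both would count as minimal repeats under the assumed threshold. Counting such runs in a window around an annotated TIS would be one simple way to turn them into features for a classifier.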
Created:
2025-09-26



