
Annotation-free prediction of microbial dioxygen utilization

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Annotation-free_prediction_of_microbial_dioxygen_utilization/26065345
Description:
Aerobes require dioxygen to grow; anaerobes do not. But nearly all microbes -- aerobes, anaerobes, and facultative organisms alike -- express enzymes whose substrates include oxygen, if only for detoxification. This presents a challenge when trying to assess which organisms are aerobic from genomic data alone. This challenge can be overcome by noting that oxygen utilization has wide-ranging effects on microbes: aerobes typically have larger genomes encoding distinctive oxygen-utilizing enzymes, for example. These effects permit high-quality prediction of oxygen utilization from annotated genome sequences, with several models displaying ~80% accuracy on a ternary classification task wherein blind guessing is only 33% accurate. Since genome annotation is compute-intensive and relies on many assumptions, we asked if annotation-free methods also perform well. We discovered that simple and efficient models based entirely on genome sequence content -- e.g., triplets of amino acids -- perform as well as intensive annotation-based classifiers, enabling rapid processing of genomes. We further show that amino acid trimers are useful because they encode information about protein composition and phylogeny. To showcase the utility of rapid prediction, we estimated the prevalence of aerobes and anaerobes in diverse natural environments cataloged in the Earth Microbiome Project. Focusing on a well-studied oxygen gradient in the Black Sea, we found quantitative correspondence between local chemistry (oxygen-to-sulfide concentration ratio) and the composition of microbial communities. We therefore suggest that statistical methods like ours might be used to estimate, or "sense," pivotal features of the chemical environment using DNA sequencing data.
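To make the "triplets of amino acids" idea concrete, here is a minimal sketch of how amino acid trimer frequencies might be computed as features for a classifier. It is an illustration only, not the authors' pipeline: the function names `trimer_counts` and `trimer_frequencies` are hypothetical, and it assumes the protein sequences for a genome are already available as plain strings.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; trimers containing any other symbol
# (e.g. X for unknown residues) are skipped. This is an assumption of the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_VALID = set(AMINO_ACIDS)


def trimer_counts(proteins):
    """Count overlapping amino acid trimers across a list of protein strings."""
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        for i in range(len(seq) - 2):
            trimer = seq[i:i + 3]
            if all(residue in _VALID for residue in trimer):
                counts[trimer] += 1
    return counts


def trimer_frequencies(proteins):
    """Return a fixed-length (20^3 = 8000) mapping of trimer -> relative frequency."""
    counts = trimer_counts(proteins)
    total = sum(counts.values())
    all_trimers = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
    if total == 0:
        return {t: 0.0 for t in all_trimers}
    return {t: counts[t] / total for t in all_trimers}


if __name__ == "__main__":
    # Hypothetical toy input; real inputs would be all proteins from one genome.
    features = trimer_frequencies(["MKVLAAGIV", "MKVLSTW"])
    print(len(features), features["MKV"])
```

A vector like this, one per genome, could then be passed to any standard classifier to predict the aerobe, anaerobe, or facultative label. Normalizing counts to frequencies is one possible design choice that keeps genomes of different sizes comparable.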
Created:
2024-06-22