
CAZyO_GH

DataCite Commons | Updated 2025-11-04 | Indexed 2026-02-09
Download link:
https://figshare.com/articles/dataset/CAZyO_GH/30535823
Description:
Glycoside hydrolases (GHs) are central to carbon cycling across environments, yet accurately annotating GHs in metagenomic datasets remains challenging because carbohydrate-active enzymes have multidomain architectures and unassembled short reads are prevalent. Here, we present CAZyO<sub>GH</sub> (CAZymes Open-source GH annotation), a curated reference database for the domain-specific identification of 135 protein domains spanning 99 GH families. Unlike existing tools such as dbCAN and GeneHunt, which rely on full-length protein sequences, CAZyO<sub>GH</sub> focuses on individual GH domains, enabling robust annotation of both assembled and unassembled metagenomic data. We first validated CAZyO<sub>GH</sub> by reanalyzing genomes listed in the CAZy database, where predicted GH profiles closely matched reported values. Next, we used CAZyO<sub>GH</sub> to analyze twelve human gut metagenomes and twelve newly sequenced soil microbiomes, revealing environment-specific GH repertoires. By accurately detecting catalytic domains independent of genomic context, CAZyO<sub>GH</sub> improves sensitivity and specificity in short-read metagenomic annotation. This framework provides a scalable and reproducible approach to investigating carbohydrate-active enzymes across ecosystems, advancing our capacity to characterize microbial functional potential in global carbon cycling.
Provider:
figshare
Created:
2025-11-04