Metagenomics uncovers dietary adaptations for chitin digestion in the gut microbiota of convergent myrmecophagous mammals
Source: Mendeley Data. Updated 2024-05-10; indexed 2024-06-28.
Download link:
https://zenodo.org/records/7995394
Description:
Sophie Teullet (a, #), Marie-Ka Tilak (a), Amandine Magdeleine (a), Roxane Schaub (b, c), Nora M. Weyer (d), Wendy Panaino (d, e), Andrea Fuller (d), William J. Loughry (f), Nico L. Avenant (g), Benoit de Thoisy (h, i), Guillaume Borrel (j) and Frédéric Delsuc (a, #)

(a) Institut des Sciences de l’Evolution de Montpellier (ISEM), Univ Montpellier, CNRS, IRD, Montpellier, France
(b) CIC AG/Inserm 1424, Centre Hospitalier de Cayenne Andrée Rosemon, Cayenne, French Guiana
(c) Tropical Biome and immunopathology, Université de Guyane, Labex CEBA, DFR Santé, Cayenne, French Guiana
(d) Brain Function Research Group, School of Physiology, University of the Witwatersrand, Johannesburg, South Africa
(e) Centre for African Ecology, School of Animals, Plant, and Environmental Sciences, University of the Witwatersrand, Johannesburg, South Africa
(f) Department of Biology, Valdosta State University, Valdosta, GA, USA
(g) National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa
(h) Institut Pasteur de la Guyane, Cayenne, French Guiana, France
(i) Kwata NGO, Cayenne, French Guiana, France
(j) Institut Pasteur, Université Paris Cité, UMR CNRS 6047, Evolutionary Biology of the Microbial Cell, Paris, France
(#) Corresponding authors: sophie.teullet@umontpellier.fr; frederic.delsuc@umontpellier.fr

Abstract

In mammals, myrmecophagy (ant and termite consumption) represents a striking example of dietary convergence. This trait evolved independently at least five times in placentals with myrmecophagous species comprising aardvarks, anteaters, some armadillos, pangolins, and aardwolves. The gut microbiome plays an important role in dietary adaptation, and previous analyses of 16S rRNA metabarcoding data have revealed convergence in the composition of the gut microbiota among some myrmecophagous species.
However, the functions performed by these gut bacterial symbionts and their potential role in the digestion of prey chitinous exoskeletons remain open questions. Using long- and short-read sequencing of fecal samples, we generated 29 gut metagenomes from nine myrmecophagous and closely related insectivorous species sampled in French Guiana, South Africa, and the USA. From these, we reconstructed 314 high-quality bacterial genome bins of which 132 carried chitinase genes, highlighting their potential role in insect prey digestion. These chitinolytic bacteria belonged mainly to the family Lachnospiraceae, and some were likely convergently recruited in the different myrmecophagous species as they were detected in several host orders (i.e., Enterococcus faecalis, Blautia sp), suggesting that they could be directly involved in the adaptation to myrmecophagy. Others were found to be more host-specific, possibly reflecting phylogenetic constraints and environmental influences. Overall, our results highlight the potential role of the gut microbiome in chitin digestion in myrmecophagous mammals and provide the basis for future comparative studies performed at the mammalian scale to further unravel the mechanisms underlying the convergent adaptation to myrmecophagy. Main figures and corresponding datasets Figure_1_dataset.zip contains: FIGURE 1. Phylogenetic position of the 314 high-quality selected bins reconstructed from 29 gut metagenomes of the nine focal myrmecophagous species within a reference prokaryotic phylogeny. A: Phylogeny of the 314 selected bins (red branches) with 2496 prokaryote reference genomes. Circles respectively indicate (from inner to outer circles): the bacterial phyla and kingdom to which these genome bins were assigned based on the Genome Taxonomy Database release 7 (Parks et al, 2021). Clades, where a subtree was defined, are highlighted in blue for the Firmicutes (Fig. 1B), green for the Bacteroidetes, and pink for the Proteobacteria (Figs. 
S2 A and B, respectively). B: Subtree within Firmicutes showing myrmecophagous-specific clades (blue highlights; dark blue corresponds to the three clades mentioned in the results, light blue to the other clades). The outer circle indicates the bacterial family to which these genome bins were assigned based on the Genome Taxonomy Database. Bin names of the myrmecophagous-specific clades are indicated at the leaves of the phylogenetic tree together with the genus to which they were assigned. phylophlan_LR_SR_ToL_FINAL_concatenated.aln: Alignment of the concatenated markers assembled by PhyloPhlAn v3.0.58. phylophlan_LR_SR_ToL_FINAL.tre: Phylogenetic tree reconstructed by PhyloPhlAn v3.0.58 for the 314 high-quality selected genome bins and the 2496 prokaryote reference genomes. Figure_2_dataset.zip contains: FIGURE 2. Phylogeny of the 394 GH18 sequences identified in 132 high-quality selected bins reconstructed from 29 gut metagenomes of the nine focal myrmecophagous species and relatives. Red branches indicate the 237 sequences having an active chitinolytic site (DXXDXDXE). Circles respectively indicate (from inner to outer circles): the bacterial family and phylum of the bin the sequence was retrieved from. Colored sequence names indicate the host species. Colored circles at certain nodes indicate enzymes to which sequences are similar when blasting them against the NCBI non-redundant protein database. Sequence names are indicated at the leaves of the tree and begin with the genus to which the bin they were identified in was assigned. GH18_sequences_from_selected_bins_alignment.fasta: Alignment of the 394 GH18 sequences identified in 132 high-quality selected bins computed with MAFFT v7.450. GH18__sequences_from_selected_bins_tree.newick: Phylogenetic tree of the 394 GH18 sequences inferred with RAxML v8.2.11 within Geneious Prime 2022.0.2. Figure_3_dataset.zip contains: FIGURE 3. 
Detection of the 314 high-quality bacterial genomes (lines) in the 29 gut metagenomes (columns) of the nine focal species. Each square indicates the detection of a genome bin in a sample as estimated by anvi’o v7 (Eren et al, 2021). Names of bins are indicated on the left with red indicating chitinolytic bins (Table S2). The names begin with the genus to which the bin was assigned to. Asterisks (*) indicate bins detected in at least one soil sample (detection > 0.25) (Fig. S4, Table S2, and detection table available via Zenodo). Phylogenetic relationships of host species distinguished by different color strips are represented at the bottom of the graph. Columns on the right indicate (from left to right): the number of GH18 sequences identified in each bin (from 0 to 17), the bin’s taxonomic phylum, class, order, and family. The phylogeny of the 314 selected bins inferred with PhyloPhlAn v3.0.58 (Asnicar et al, 2020) is also represented on the right of the graph (see Fig. S1). Silhouettes were downloaded from phylopic.org. detection_bins_across_gut_metagenomes.txt: Detection table as tab-delimited file containing the detection values inferred by anvi'o v7 for the 314 high quality selected bins across the 29 gut metagenomes from the nine focal myrmecophagous species. Figure_4_dataset.zip contains: FIGURE 4. Distribution of chitinolytic selected bins (red links) among the nine focal myrmecophagous species and relatives. Phylogenies of the 314 high-quality selected bins (Fig. S1) and of the nine host species (downloaded from timetree.org) are represented respectively on the left and the right of the graph. Links illustrate, for each bin, in which host species the bin was detected (detection threshold > 0.25). Red links indicate bins in which at least one GH18 sequence with an active chitinolytic site (DXXDXDXE) was found (chitinolytic bins). The size of the circles at the tips of the host phylogeny is proportional to the number of samples (n = 1 for D. 
kap; n = 2 for D. nov, C. uni and M. tri; n = 3 for T. tet and O. afe; n = 4 for D. sp. nov FG; n = 6 for P. cri and S. tem). Bins’ names are indicated at the tip of the bins’ phylogeny and main bacterial phyla are indicated by colored vertical bars. This graph was done with the cophylo R package within the phytools suite (Revell, 2012). Silhouettes were downloaded from phylopic.org. presence_absence_MAGs_in_metagenomes.txt: Presence/absence matrix of the 314 selected genome bins across the 29 gut metagenomes. host_species_phylo_reduced_fig4.newick: Host phylogenetic timetree. Table_1_sample_infos.xls: Detailed sample information for the 33 fecal samples collected. N.B.: Diet was determined based on field observations (i.e., dissections) and the literature. Supplementary results Supplementary_results_Teullet_etal_2023.zip includes a comparison of genome statistics of the selected bins reconstructed from the long-read vs the short-read datasets, a phylogeny of the set of selected bins before dereplication (n = 407) and a comparison of the distribution of shared and specific genome bins carrying GH18 among host orders. Supplementary material Supplementary_material_Teullet_etal_2023.zip contains Supplementary figures (S1-S4) and tables (S1-S4). phylophlan_314_bins_phylogeny_FINAL_concatenated.aln and phylophlan_314_bins_phylogeny_FINAL.tre: Alignment of the concatenated markers and the final tree (respectively) reconstructed by PhyloPhlAn v3.0.58 for the 314 high-quality selected and dereplicated genome bins. phylophlan_407_selected_bins_nodRep_concatenated.aln and phylophlan_407_selected_bins_phylogeny_FINAL.tre: Alignment of the concatenated markers and the final tree (respectively) reconstructed by PhyloPhlAn v3.0.58 for the 407 high-quality selected genome bins before dereplication. 
abundance_bins_across_gut_metagenomes.txt: A tab-delimited file corresponding to the absolute abundance values inferred by anvi'o v7 for the 314 high-quality selected bins across the 29 gut metagenomes from the nine focal myrmecophagous species. detection_bins_across_soil_samples.txt: A tab-delimited file corresponding to the detection values inferred by anvi'o v7 for the 140 high-quality selected bins reconstructed from the aardvark, ground pangolin and southern aardwolf gut metagenomes across the eight soil samples collected on sample sites in South Africa. Assemblies Long-read_metagenomic_assemblies_polished.zip contains the 31 long-read metagenomes assembled with metaFlye strain v2.9 and polished with short reads using Pilon v1.4, which were used for binning. Long-read_metagenomic_assemblies_not_polished.zip contains the 33 long-read metagenomes assembled with metaFlye strain v2.9 before polishing. Short-read_metagenomic_assemblies.zip contains the 31 short-read metagenomes assembled with metaSPAdes and MEGAHIT. N.B: Two samples (DASY M1746 and DASY VLD168) were not sequenced using Illumina short reads. Only long reads were generated and assembled for these two samples and are made available here. As these assemblies could not be polished, these samples were not included in downstream analyses. Two samples (CAB M3141 and MYR M5293) were highly contaminated by host reads and not used in downstream analyses. As they were still assembled with the other samples, the corresponding metagenomes are made available here. Binning: genome bins and dereplication results High-quality_selected_bins_dereplicated.zip contains the 314 high quality selected bins (>90% completion, <5% redundancy) reconstructed from long- and short-read metagenomes with metaBAT2 and dereplicated with dRep at 98% ANI. metaBAT2_short-read_assemblies_bins.zip contains all bins reconstructed from the short-read assemblies with metaBAT2 (i.e., output of metaBAT2). 
metaBAT2_long-read_assemblies_bins.zip contains all bins reconstructed from the long-read polished assemblies with metaBAT2 (i.e., output of metaBAT2). Output_dRep_98ANI_407_bins_long-short-reads.zip contains the output of the dereplication analysis done on the set of 407 high-quality selected genome bins reconstructed from long- (n = 201) and short-read (n = 206; labeled "spad") metagenomes. It was performed with dRep using default parameters. After this step, the final dataset included 314 high-quality non-redundant genome bins. This folder includes: LR_SR_407_bins_dRep_98ANI_Primary_clustering_dendrogram.pdf: The primary clustering of selected genome bins using the Mash algorithm with an ANI threshold of 90%. LR_SR_407_bins_dRep_98ANI_Secondary_clustering_dendrograms.pdf: The secondary clustering of selected genome bins using the fastANI algorithm with an ANI threshold of 98%. LR_SR_407_bins_dRep_98ANI_Cluster_scoring.pdf: The clustering score attributed to each genome bin during dereplication. Asterisks (*) indicate genomes chosen to be the representative genomes of their cluster.
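# Paper title
Metagenomics uncovers dietary adaptations for chitin digestion in the gut microbiota of convergent myrmecophagous mammals
# Authors
Sophie Teullet (a, #), Marie-Ka Tilak (a), Amandine Magdeleine (a), Roxane Schaub (b, c), Nora M. Weyer (d), Wendy Panaino (d, e), Andrea Fuller (d), William J. Loughry (f), Nico L. Avenant (g), Benoit de Thoisy (h, i), Guillaume Borrel (j) and Frédéric Delsuc (a, #)
# Affiliations
(a) Institut des Sciences de l’Evolution de Montpellier (ISEM), Univ Montpellier, CNRS, IRD, Montpellier, France
(b) CIC AG/Inserm 1424, Centre Hospitalier de Cayenne Andrée Rosemon, Cayenne, French Guiana
(c) Tropical Biome and immunopathology, Université de Guyane, Labex CEBA, DFR Santé, Cayenne, French Guiana
(d) Brain Function Research Group, School of Physiology, University of the Witwatersrand, Johannesburg, South Africa
(e) Centre for African Ecology, School of Animals, Plant, and Environmental Sciences, University of the Witwatersrand, Johannesburg, South Africa
(f) Department of Biology, Valdosta State University, Valdosta, GA, USA
(g) National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa
(h) Institut Pasteur de la Guyane, Cayenne, French Guiana, France
(i) Kwata NGO, Cayenne, French Guiana, France
(j) Institut Pasteur, Université Paris Cité, UMR CNRS 6047, Evolutionary Biology of the Microbial Cell, Paris, France
(#) Corresponding authors: sophie.teullet@umontpellier.fr; frederic.delsuc@umontpellier.fr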
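## Abstract
In mammals, myrmecophagy (ant and termite consumption) represents a striking example of dietary convergence. This trait evolved independently at least five times in placentals, with myrmecophagous species comprising aardvarks, anteaters, some armadillos, pangolins, and aardwolves. The gut microbiome plays an important role in dietary adaptation, and previous analyses of 16S rRNA metabarcoding data have revealed convergence in the composition of the gut microbiota among some myrmecophagous species. However, the functions performed by these gut bacterial symbionts and their potential role in the digestion of prey chitinous exoskeletons remain open questions. Using long- and short-read sequencing of fecal samples, we generated 29 gut metagenomes from nine myrmecophagous and closely related insectivorous species sampled in French Guiana, South Africa, and the USA. From these, we reconstructed 314 high-quality bacterial genome bins, of which 132 carried chitinase genes, highlighting their potential role in insect prey digestion. These chitinolytic bacteria belonged mainly to the family Lachnospiraceae, and some were likely convergently recruited in the different myrmecophagous species, as they were detected in several host orders (e.g., Enterococcus faecalis, Blautia sp.), suggesting that they could be directly involved in the adaptation to myrmecophagy. Others were found to be more host-specific, possibly reflecting phylogenetic constraints and environmental influences. Overall, our results highlight the potential role of the gut microbiome in chitin digestion in myrmecophagous mammals and provide the basis for future comparative studies performed at the mammalian scale to further unravel the mechanisms underlying the convergent adaptation to myrmecophagy.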
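## Main figures and corresponding datasets
### Figure_1_dataset.zip
Contains:
Figure 1: Phylogenetic position of the 314 high-quality selected bins reconstructed from 29 gut metagenomes of the nine focal myrmecophagous species within a reference prokaryotic phylogeny.
A: Phylogeny of the 314 selected bins (red branches) with 2496 prokaryote reference genomes. Circles indicate, from inner to outer, the bacterial phylum and kingdom to which these genome bins were assigned based on the Genome Taxonomy Database release 7 (Parks et al., 2021). Clades for which a subtree was defined are highlighted in blue for the Firmicutes (Fig. 1B), green for the Bacteroidetes (Fig. S2A), and pink for the Proteobacteria (Fig. S2B).
B: Subtree within Firmicutes showing myrmecophagous-specific clades (blue highlights; dark blue corresponds to the three clades mentioned in the results, light blue to the other clades). The outer circle indicates the bacterial family to which these genome bins were assigned based on the Genome Taxonomy Database. Bin names of the myrmecophagous-specific clades are indicated at the leaves of the phylogenetic tree together with the genus to which they were assigned.
- phylophlan_LR_SR_ToL_FINAL_concatenated.aln: Alignment of the concatenated markers assembled by PhyloPhlAn v3.0.58.
- phylophlan_LR_SR_ToL_FINAL.tre: Phylogenetic tree reconstructed by PhyloPhlAn v3.0.58 for the 314 high-quality selected genome bins and the 2496 prokaryote reference genomes.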
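### Figure_2_dataset.zip
Contains:
Figure 2: Phylogeny of the 394 GH18 sequences identified in 132 high-quality selected bins reconstructed from 29 gut metagenomes of the nine focal myrmecophagous species and relatives. Red branches indicate the 237 sequences having an active chitinolytic site (DXXDXDXE). Circles indicate, from inner to outer, the bacterial family and phylum of the bin the sequence was retrieved from. Colored sequence names indicate the host species. Colored circles at certain nodes indicate enzymes to which sequences are similar when blasting them against the NCBI non-redundant protein database. Sequence names are indicated at the leaves of the tree and begin with the genus to which the bin they were identified in was assigned.
- GH18_sequences_from_selected_bins_alignment.fasta: Alignment of the 394 GH18 sequences identified in 132 high-quality selected bins, computed with MAFFT v7.450.
- GH18_sequences_from_selected_bins_tree.newick: Phylogenetic tree of the 394 GH18 sequences inferred with RAxML v8.2.11 within Geneious Prime 2022.0.2.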
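### Figure_3_dataset.zip
Contains:
Figure 3: Detection of the 314 high-quality bacterial genomes (rows) in the 29 gut metagenomes (columns) of the nine focal species. Each square indicates the detection of a genome bin in a sample as estimated by anvi'o v7 (Eren et al., 2021). Names of bins are indicated on the left, with red indicating chitinolytic bins (Table S2). The names begin with the genus to which the bin was assigned. Asterisks (*) indicate bins detected in at least one soil sample (detection > 0.25) (Fig. S4, Table S2, and detection table available via Zenodo). Phylogenetic relationships of host species, distinguished by different color strips, are represented at the bottom of the graph. Columns on the right indicate, from left to right, the number of GH18 sequences identified in each bin (from 0 to 17) and the bin's taxonomic phylum, class, order, and family. The phylogeny of the 314 selected bins inferred with PhyloPhlAn v3.0.58 (Asnicar et al., 2020) is also represented on the right of the graph (see Fig. S1). Silhouettes were downloaded from phylopic.org.
- detection_bins_across_gut_metagenomes.txt: Tab-delimited detection table containing the detection values inferred by anvi'o v7 for the 314 high-quality selected bins across the 29 gut metagenomes from the nine focal myrmecophagous species.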
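
The detection threshold used in the legends (detection > 0.25) turns a table of detection values into presence/absence calls. The Python sketch below shows one way to do that, under assumptions that should be checked against the real file: the first column is taken to hold bin names, the remaining columns one sample each, and the toy table and its names are invented for illustration.

```python
import csv
import io

DETECTION_THRESHOLD = 0.25  # value quoted in the figure legends


def detection_to_presence(handle, threshold=DETECTION_THRESHOLD):
    """Map each bin to {sample: True/False} using a strict '>' comparison."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # assumption: first column holds the bin names
    presence = {}
    for row in reader:
        if not row:
            continue
        bin_name, values = row[0], row[1:]
        presence[bin_name] = {
            sample: float(value) > threshold
            for sample, value in zip(samples, values)
        }
    return presence


# Hypothetical toy table; the layout of the real file may differ.
toy_table = (
    "bins\tsample_A\tsample_B\n"
    "bin_1\t0.80\t0.10\n"
    "bin_2\t0.25\t0.30\n"
)

presence = detection_to_presence(io.StringIO(toy_table))
print(presence)
```

With a real file, the `io.StringIO` object would be replaced by an opened file handle, for example `open("detection_bins_across_gut_metagenomes.txt", newline="")`. Note that a value of exactly 0.25 is not counted as detected, because the legends state a strict inequality.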
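### Figure_4_dataset.zip
Contains:
Figure 4: Distribution of chitinolytic selected bins (red links) among the nine focal myrmecophagous species and relatives. Phylogenies of the 314 high-quality selected bins (Fig. S1) and of the nine host species (downloaded from timetree.org) are represented on the left and the right of the graph, respectively. Links illustrate, for each bin, in which host species the bin was detected (detection threshold > 0.25). Red links indicate bins in which at least one GH18 sequence with an active chitinolytic site (DXXDXDXE) was found (chitinolytic bins). The size of the circles at the tips of the host phylogeny is proportional to the number of samples (n = 1 for D. kap; n = 2 for D. nov, C. uni and M. tri; n = 3 for T. tet and O. afe; n = 4 for D. sp. nov FG; n = 6 for P. cri and S. tem). Bin names are indicated at the tips of the bin phylogeny, and the main bacterial phyla are indicated by colored vertical bars. This graph was produced with the cophylo function of the phytools R package (Revell, 2012). Silhouettes were downloaded from phylopic.org.
- presence_absence_MAGs_in_metagenomes.txt: Presence/absence matrix of the 314 selected genome bins (metagenome-assembled genomes, MAGs) across the 29 gut metagenomes.
- host_species_phylo_reduced_fig4.newick: Host phylogenetic timetree.
- Table_1_sample_infos.xls: Detailed sample information for the 33 fecal samples collected. N.B.: Diet was determined based on field observations (i.e., dissections) and the literature.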
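
The links drawn in Figure 4 connect each bin to the host species in which it was detected. A minimal Python sketch of that bookkeeping follows; it is illustrative only, and the presence/absence values, sample names, and sample-to-host mapping are all hypothetical stand-ins for what would be read from the matrix and the sample information table.

```python
from collections import defaultdict


def hosts_per_bin(presence, sample_to_host):
    """Collect, for each bin, the set of host species with at least one positive sample."""
    links = defaultdict(set)
    for bin_name, per_sample in presence.items():
        for sample, present in per_sample.items():
            if present:
                links[bin_name].add(sample_to_host[sample])
    return dict(links)


def shared_bins(links, min_hosts=2):
    """Bins linked to at least `min_hosts` different host species."""
    return sorted(name for name, hosts in links.items() if len(hosts) >= min_hosts)


# Hypothetical inputs for illustration only.
toy_presence = {
    "bin_1": {"s1": True, "s2": True, "s3": False},
    "bin_2": {"s1": False, "s2": False, "s3": True},
    "bin_3": {"s1": False, "s2": False, "s3": False},
}
toy_sample_to_host = {"s1": "host_A", "s2": "host_B", "s3": "host_B"}

links = hosts_per_bin(toy_presence, toy_sample_to_host)
shared = shared_bins(links)
print(links, shared)
```

Bins that are never detected simply have no entry in `links`, which mirrors a bin with no link in the figure.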
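## Supplementary results
Supplementary_results_Teullet_etal_2023.zip includes a comparison of genome statistics of the selected bins reconstructed from the long-read vs. the short-read datasets, a phylogeny of the set of selected bins before dereplication (n = 407), and a comparison of the distribution of shared and specific genome bins carrying GH18 among host orders.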
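## Supplementary material
Supplementary_material_Teullet_etal_2023.zip contains supplementary figures (S1-S4) and tables (S1-S4).
- phylophlan_314_bins_phylogeny_FINAL_concatenated.aln and phylophlan_314_bins_phylogeny_FINAL.tre: Alignment of the concatenated markers and the final tree, respectively, reconstructed by PhyloPhlAn v3.0.58 for the 314 high-quality selected and dereplicated genome bins.
- phylophlan_407_selected_bins_nodRep_concatenated.aln and phylophlan_407_selected_bins_phylogeny_FINAL.tre: Alignment of the concatenated markers and the final tree, respectively, reconstructed by PhyloPhlAn v3.0.58 for the 407 high-quality selected genome bins before dereplication.
- abundance_bins_across_gut_metagenomes.txt: Tab-delimited file containing the absolute abundance values inferred by anvi'o v7 for the 314 high-quality selected bins across the 29 gut metagenomes from the nine focal myrmecophagous species.
- detection_bins_across_soil_samples.txt: Tab-delimited file containing the detection values inferred by anvi'o v7 for the 140 high-quality selected bins reconstructed from the aardvark, ground pangolin, and southern aardwolf gut metagenomes across the eight soil samples collected at the sampling sites in South Africa.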
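## Assemblies
Long-read_metagenomic_assemblies_polished.zip: Contains the 31 long-read metagenomes assembled with metaFlye v2.9 and polished with short reads using Pilon v1.4, which were used for binning.
Long-read_metagenomic_assemblies_not_polished.zip: Contains the 33 long-read metagenomes assembled with metaFlye v2.9 before polishing.
Short-read_metagenomic_assemblies.zip: Contains the 31 short-read metagenomes assembled with metaSPAdes and MEGAHIT. N.B.: Two samples (DASY M1746 and DASY VLD168) were not sequenced using Illumina short reads. Only long reads were generated and assembled for these two samples, and the assemblies are made available here. As these assemblies could not be polished, the samples were not included in downstream analyses. Two other samples (CAB M3141 and MYR M5293) were highly contaminated by host reads and were not used in downstream analyses. As they were still assembled with the other samples, the corresponding metagenomes are made available here.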
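## Binning: genome bins and dereplication results
High-quality_selected_bins_dereplicated.zip: Contains the 314 high-quality selected bins (>90% completion, <5% redundancy) reconstructed from long- and short-read metagenomes with metaBAT2 and dereplicated with dRep at 98% average nucleotide identity (ANI).
metaBAT2_short-read_assemblies_bins.zip: Contains all bins reconstructed from the short-read assemblies with metaBAT2 (i.e., the output of metaBAT2).
metaBAT2_long-read_assemblies_bins.zip: Contains all bins reconstructed from the long-read polished assemblies with metaBAT2 (i.e., the output of metaBAT2).
Output_dRep_98ANI_407_bins_long-short-reads.zip: Contains the output of the dereplication analysis done on the set of 407 high-quality selected genome bins reconstructed from long-read (n = 201) and short-read (n = 206; labeled "spad") metagenomes. It was performed with dRep using default parameters. After this step, the final dataset included 314 high-quality non-redundant genome bins. This folder includes:
- LR_SR_407_bins_dRep_98ANI_Primary_clustering_dendrogram.pdf: The primary clustering of selected genome bins using the Mash algorithm with an ANI threshold of 90%.
- LR_SR_407_bins_dRep_98ANI_Secondary_clustering_dendrograms.pdf: The secondary clustering of selected genome bins using the fastANI algorithm with an ANI threshold of 98%.
- LR_SR_407_bins_dRep_98ANI_Cluster_scoring.pdf: The clustering score attributed to each genome bin during dereplication. Asterisks (*) indicate genomes chosen as the representative genomes of their cluster.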
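
The selection criteria stated for the high-quality bins (>90% completion, <5% redundancy) amount to a simple two-condition filter. The Python sketch below illustrates it with invented statistics. The `BinStats` record and the bin names are hypothetical, and the code does not reproduce how the authors computed completion or redundancy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinStats:
    """Hypothetical per-bin summary; both values are percentages."""
    name: str
    completion: float
    redundancy: float


def is_high_quality(stats, min_completion=90.0, max_redundancy=5.0):
    """Apply the stated thresholds as strict inequalities."""
    return stats.completion > min_completion and stats.redundancy < max_redundancy


# Toy values for illustration only.
toy_bins = [
    BinStats("bin_a", 97.2, 1.4),
    BinStats("bin_b", 90.0, 0.5),  # exactly 90% completion fails the strict '>' test
    BinStats("bin_c", 95.0, 6.1),  # too redundant
]

selected = [b.name for b in toy_bins if is_high_quality(b)]
print(selected)
```

Bins passing this filter would then go through dereplication, so that near-identical genomes recovered from the long-read and short-read assemblies are represented once.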
Created:
2023-06-28



