Comparative genomic analysis of chemosensory-related gene families in gastropods
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.cz8w9gj7h
Description:
Chemoreception is critical for the survival and reproduction of animals. Except in a small group of insects and spiders, the molecular identity of chemosensory proteins is poorly understood in invertebrates. Gastropoda is the extant mollusk class with the greatest species richness, including marine, freshwater, and terrestrial lineages, and it likely has highly diverse chemoreception systems. Here, we performed a comprehensive comparative genomic analysis taking advantage of the chromosome-level information of two Gastropoda species, one of which belongs to a lineage that underwent a whole genome duplication event. We identified thousands of previously uncharacterized chemosensory-related genes, the majority of them encoding G protein-coupled receptors (GPCRs), mostly organized into clusters distributed across all chromosomes. We also detected gene families encoding degenerin epithelial sodium channels (DEG-ENaC), ionotropic receptors (IR), sensory neuron membrane proteins (SNMP), Niemann–Pick type C2 (NPC2) proteins, and lipocalins, although these families are much smaller. Our phylogenetic analysis of the GPCR gene family across protostomes revealed: (i) large gene family expansions in Gastropoda; (ii) clades including members from all protostomes; and (iii) species-specific clades with a very large number of receptors. We provide, for the first time, new insight into the evolution of chemosensory gene families in invertebrates other than arthropods.
Methods
Please see the README document. The starting datasets were files containing the genome and proteome sequences of three gastropod species, together with their respective annotation files. These data are available from the National Center for Biotechnology Information (NCBI) servers; they were not generated in this work, as indicated in the Materials and Methods. We used BITACORA v.1.2.1 to identify genes from chemosensory families; the output of this analysis consists of files with the protein sequences identified by the tool. These sequences were aligned with MAFFT v.7.453, and phylogenetic trees were built with IQ-TREE v.2.1.2. In addition, we ran custom scripts to identify gene clusters by measuring the physical distance among genes. Finally, the genetic distances among genes were estimated with MEGA-CC v.11.0.11.
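The cluster-identification step is described only briefly, so the following is a minimal sketch of how genes could be grouped by physical distance. It is not the authors' script (see the README for that). The `Gene` record, the `find_clusters` function, the 100 kb `max_gap` threshold, and the `min_size` of two genes are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are in base pairs."""
    name: str
    chrom: str
    start: int
    end: int


def find_clusters(genes, max_gap=100_000, min_size=2):
    """Group genes on the same chromosome into clusters.

    A gene joins the current cluster when the distance between its start
    and the furthest end seen so far in that cluster is at most max_gap.
    Both max_gap and min_size are illustrative assumptions, not values
    taken from the dataset description.
    """
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene.chrom].append(gene)

    clusters = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom], key=lambda g: (g.start, g.end))
        current = [ordered[0]]
        furthest_end = ordered[0].end
        for gene in ordered[1:]:
            if gene.start - furthest_end <= max_gap:
                # Close enough to the running cluster: extend it.
                current.append(gene)
                furthest_end = max(furthest_end, gene.end)
            else:
                # Gap too large: close the running cluster, start a new one.
                if len(current) >= min_size:
                    clusters.append(current)
                current = [gene]
                furthest_end = gene.end
        if len(current) >= min_size:
            clusters.append(current)
    return clusters


if __name__ == "__main__":
    # Made-up coordinates, for demonstration only.
    example = [
        Gene("gpcr_a", "chr1", 1_000, 2_000),
        Gene("gpcr_b", "chr1", 50_000, 52_000),
        Gene("gpcr_c", "chr1", 500_000, 501_000),
    ]
    for cluster in find_clusters(example):
        print([g.name for g in cluster])
```

Sorting by start coordinate and tracking the furthest end keeps overlapping or nested gene models in the same cluster. A real analysis would read the coordinates from the annotation files and would choose the distance threshold on biological grounds.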
Created:
2023-05-02



