
Higher evolutionary dynamics of gene copy number for Drosophila glue genes located near short repeat sequences

Source: DataONE. Updated: 2023-04-21. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:dc0d7c33415b8057aeca266476426402cf0710dd8b85ac2d9f0aaaef34454771
Resource description:

## Background

During evolution, genes can experience duplications, losses, inversions and gene conversions. Why certain genes are more dynamic than others is poorly understood. Here we examine how several Sgs genes encoding glue proteins, which make up a bioadhesive that sticks the animal to a substrate during metamorphosis, have evolved in Drosophila species.

## Results

We examined high-quality genome assemblies of 24 Drosophila species to study the evolutionary dynamics of four glue genes that are present in D. melanogaster and belong to the same gene family (Sgs1, Sgs3, Sgs7 and Sgs8) across approximately 30 million years. We annotated a total of 102 Sgs genes and grouped them into 4 subfamilies. We present here a new nomenclature for these Sgs genes based on protein sequence conservation, genomic location and presence/absence of internal repeats. Two types of glue genes were uncovered. The first category (Sgs1, Sgs3x, Sgs3e) showed a few gene losses but no duplication, no local inversion and no ...

## Supplementary Files

File S1. Compressed zip file of the gene annotations (GenBank .gb files, inputs for Easyfig) of large genomic regions containing all the Sgs genes and their neighboring genes in the 24 studied species.
File S2. FASTA file of all the Sgs amino acid sequences used to create Figure 1B and Figure S1.
File S3. Compressed zip file of reference and corrected nucleotide sequences used to create Figure S2.
File S4. Compressed zip file of Sgs protein alignments (.fasta files) used to compute phylogenetic trees and make Weblogo figures.
File S5. Sgs coding sequence length in bp for species having an Sgs3x copy (.csv file, input for R script sgs_size.R).
File S6. Sgs coding sequence length in bp for species not having an Sgs3x copy (.csv file, input for R script sgs_size.R).
File S7. Compressed zip file of comparisons between pairs of large genomic regions (.out files obtained as outputs from Easyfig).
File S8. Table of pairwise percent identity between several Sgs1 and Sgs...
Created: 2023-11-30