Data from: A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences
Source: DataONE. Updated: 2015-05-26. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Background: The reptiles, characterized by both diversity and unique evolutionary adaptations, provide a comprehensive system for comparative studies of metabolism, physiology, and development. However, molecular resources for ectothermic reptiles are severely limited, hampering our ability to study the genetic basis for many evolutionarily important traits such as metabolic plasticity, extreme longevity, limblessness, venom, and freeze tolerance. Here we use massively parallel sequencing (454 GS-FLX Titanium) to generate a transcriptome of the western terrestrial garter snake (Thamnophis elegans) with two goals in mind. First, we develop a molecular resource for an ectothermic reptile; and second, we use these sex-specific transcriptomes to identify differences in the presence of expressed transcripts and potential genes of evolutionary interest.

Results: Using sex-specific pools of RNA (one pool for females, one pool for males) representing 7 tissue types and 35 diverse individuals, we produced 1.24 million sequence reads, which averaged 366 bp in length after cleaning. Assembly of the cleaned reads from both sexes with NEWBLER and MIRA resulted in 96,379 contigs containing 87% of the cleaned reads. Over 34% of these contigs and 13% of the singletons were annotated based on homology to previously identified proteins. From these homology assignments, additional clustering, and ORF predictions, we estimate that this transcriptome contains ~13,000 unique genes that were previously identified in other species and over 66,000 transcripts from unidentified protein-coding genes. Furthermore, we use a graph-clustering method to identify contigs linked by NEWBLER-split reads that represent divergent alleles, gene duplications, and alternatively spliced transcripts. Beyond gene identification, we identified 95,295 SNPs and 31,651 INDELs. From these sex-specific transcriptomes, we identified 190 genes that were only present in the mRNA sequenced from one of the sexes (84 female-specific, 106 male-specific), and many highly variable genes of evolutionary interest.

Conclusions: This is the first large-scale, multi-organ transcriptome for an ectothermic reptile. This resource provides the most comprehensive set of EST sequences available for an individual ectothermic reptile species, increasing the number of snake ESTs 50-fold. We have identified genes that appear to be under evolutionary selection and those that are sex-specific. This resource will assist studies on gene expression and comparative genomics, and will facilitate the study of evolutionarily important traits at the molecular level.
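
The Results mention a graph-clustering step that groups contigs linked by NEWBLER-split reads. The abstract does not say which algorithm was used, so the following is only a minimal sketch of one plausible reading: treat contigs as nodes, connect two contigs whenever a single read was split across them, and report connected components. The function name `cluster_contigs`, the input mapping of read IDs to contig IDs, and the example IDs are all hypothetical and do not reflect NEWBLER's actual output format or the authors' pipeline.

```python
from collections import defaultdict


def cluster_contigs(split_reads):
    """Group contigs into connected components.

    split_reads: hypothetical mapping of read ID -> list of contig IDs
    that the read was split across. Two contigs land in the same cluster
    if a chain of split reads connects them.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for contigs in split_reads.values():
        contigs = list(contigs)
        if not contigs:
            continue
        find(contigs[0])  # register contigs that appear alone
        for other in contigs[1:]:
            union(contigs[0], other)

    clusters = defaultdict(set)
    for contig in list(parent):
        clusters[find(contig)].add(contig)
    return sorted(sorted(members) for members in clusters.values())


# Illustrative input only; these IDs are made up.
example = {
    "read1": ["contig00001", "contig00002"],
    "read2": ["contig00002", "contig00003"],
    "read3": ["contig00010", "contig00011"],
    "read4": ["contig00020"],
}
clusters = cluster_contigs(example)
```

Each resulting cluster would then be a candidate set of divergent alleles, duplicated genes, or alternatively spliced transcripts; telling those cases apart requires further analysis that this sketch does not attempt.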
Created:
2015-05-26



