Data from: Genome reannotation of the lizard Anolis carolinensis based on 14 adult and embryonic deep transcriptomes
Source: DataONE. Updated: 2013-03-19. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Background: The green anole lizard, Anolis carolinensis, is a key species for both laboratory and field-based studies of evolutionary genetics, development, neurobiology, physiology, behavior, and ecology. As the first non-avian reptilian genome sequenced, A. carolinensis is also a prime reptilian model for comparison with other vertebrate genomes. The public databases of Ensembl and NCBI have provided a first-generation gene annotation of the anole genome that relies primarily on sequence conservation with related species. A second-generation annotation based on tissue-specific transcriptomes would provide a valuable resource for molecular studies.
Results: Here we provide an annotation of the A. carolinensis genome based on de novo assembly of deep transcriptomes of 14 adult and embryonic tissues. This revised annotation describes 59,373 transcripts (compared with 16,533 and 18,939 in the current Ensembl and NCBI annotations, respectively) and 22,962 predicted protein-coding genes. A key improvement in this revised annotation is coverage of untranslated region (UTR) sequences, with 79% and 59% of transcripts containing 5' and 3' UTRs, respectively. Gaps in the genome sequence of the current A. carolinensis build (Anocar2.0) are highlighted by our identification of 16,542 unmapped transcripts, representing 6,695 orthologues, with less than 70% genomic coverage.
Conclusions: Incorporation of tissue-specific transcriptome sequence into the A. carolinensis genome annotation has markedly improved its utility for comparative and functional studies. Increased UTR coverage allows for more accurate protein sequence prediction and regulatory analysis. This revised annotation also provides an atlas of gene expression specific to adult and embryonic tissues.
Created:
2013-03-19



