The genome of the Australian dragon lizard Pogona vitticeps
Source: Research Data Australia (indexed 2024-12-14)
Download link:
https://researchdata.edu.au/genome-australian-dragon-pogona-vitticeps/616355
Resource description:
We present a genomic resource for a wild-caught male (ZZ) Central Bearded Dragon, Pogona vitticeps, from Australia. The genomic sequence, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (185x raw read depth) from 13 insert libraries ranging from 250 bp to 40 Kbp. After filtering for low-quality and duplicated reads, a total of 146 Gbp of data (86x) was available for assembly. Exceptionally high levels of heterozygosity (0.85% SNPs plus indels) complicated assembly; nevertheless, 96.4% of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. The total length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp, with an N50 of 2.29 Mbp.
We also present transcriptome data from brain, heart, lung, liver, kidney, skeletal muscle and gonads of male and female P. vitticeps, and assembled these datasets (from seven individuals) into 595,564 contigs.
Genes were annotated on the basis of de novo prediction, similarity to Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and Pogona vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63% of which had intact open reading frames. Our assembly captured 99% (246 of 248) of the core CEGMA genes, with 93% (231) being complete.
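The summary statistics quoted above (N50, read depth, CEGMA completeness) follow from simple arithmetic, sketched here in Python. This is an illustration only, not the authors' pipeline: the function names and the toy scaffold lengths are hypothetical, and only the figures 317 Gbp, 185x, 146 Gbp, 86x, 246, 231 and 248 come from the description.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def implied_genome_size_gbp(total_gbp, depth):
    """Genome size implied by sequenced bases divided by read depth."""
    return total_gbp / depth


# Toy scaffold lengths (made up for illustration, not from the dataset).
toy_scaffolds = [10, 8, 6, 4, 2]
print("toy N50:", n50(toy_scaffolds))

# Figures taken from the description: both depth estimates imply ~1.7 Gbp.
print("raw reads:", round(implied_genome_size_gbp(317, 185), 2), "Gbp")
print("filtered reads:", round(implied_genome_size_gbp(146, 86), 2), "Gbp")

# CEGMA completeness as reported: 246 and 231 of 248 core genes.
print("captured:", round(100 * 246 / 248), "%")
print("complete:", round(100 * 231 / 248), "%")
```

Both depth figures imply a genome of roughly 1.7 Gbp, which is in line with the reported 1.8 Gbp total assembly length.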
Provider:
University of Canberra



