Supporting data for "De novo genome assembly of the Indian Blue Peacock (Pavo cristatus), from Oxford Nanopore and Illumina sequencing"
Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100559
Description:
Pavo cristatus, the Indian peafowl, is found in natural habitats of South Asia. The male, the blue peacock, is known for its elegance, majestic looks and beauty. Peafowl have been described in Indian culture since prehistoric times, and the species has been adopted as the national bird of India. In this study, we present the first draft genome sequence of the peacock using Illumina and Oxford Nanopore Technologies (ONT) sequencing.
ONT sequencing resulted in approximately 2.3-fold sequencing coverage, whereas Illumina generated 150 bp paired-end sequence data at 284.6-fold sequencing coverage from five libraries. Subsequently, we generated a de novo assembly of the peacock genome of 0.915 gigabases (Gb) with a scaffold N50 of 0.23 megabases (Mb). We also predicted that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences.
We report a high-quality genome assembly of the peacock using a hybrid assembly generated from Illumina and ONT sequencing platforms. Long reads from ONT were found to be useful in addressing challenges in de novo assembly, particularly in regions containing repetitive sequences that span longer than the read length and that cannot be resolved using short-read-based assembly alone. The contig assembly from Illumina short reads alone resulted in an N50 of 1,639 bases, whereas adding 2.3x coverage from ONT increased the N50 ninefold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone resulted in a total of 685,241 contigs. Further scaffolding of the assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds with an N50 of about 0.23 Mb. The completeness of our genome assembly was supported by the fact that 95% of proteins predicted by homology matched those submitted to a public repository.
Furthermore, in concordance with other phylogenetic studies, the avian phylogeny based on conserved genes showed P. cristatus to be closest to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current hybrid assembly appears to be much superior, with greater sequencing depth, fewer non-ATGC bases in the assembly, and a reduced number of scaffolds, as evidenced by a nearly 9.1-fold improvement in N50 statistics.
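The description leans heavily on the N50 statistic (contig N50 of 1,639 and 14,749 bases, scaffold N50 of about 0.23 Mb). As a reminder of what that number means, the following is a minimal sketch of the usual N50 definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the dataset or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("n50() needs at least one sequence length")


# Hypothetical toy assembly (made-up lengths, not from this dataset).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

For a real assembly, the lengths would be read from the FASTA file of contigs or scaffolds. A higher N50 for the same total length indicates that more of the assembly sits in long sequences.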
Provider:
GigaScience Database
Created:
2019-03-20



