
A high-quality genome assembly for Dillenia turbinata (Dilleniales)

Source: DataONE. Updated: 2023-05-24. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:60a1efb29a4cb38816575549c10f60e4b9107dbec7ebd2b828d75049f7c43d76
Description:
Objectives: Dillenia turbinata (Dilleniaceae) is a member of the order Dilleniales, an enigmatic clade of critical importance for understanding the diversification history of flowering plants but for which genome sequences are not available. We have produced and annotated a chromosome-scale whole-genome assembly for D. turbinata through the resources of the 10KP (10,000 Plants) Genomes Project. The genome assembly and associated data provided here will serve as a useful resource for comparative and evolutionary genomics research across the flowering plants.

Data description: The D. turbinata genome was assembled from Oxford Nanopore Technologies (ONT) and whole-genome shotgun (WGS) sequences, and scaffolded into chromosome-scale pseudomolecules using Hi-C data. The genome assembly is 723,739,077 base pairs in length with a BUSCO completeness score of 97%. Twenty-eight scaffolds contain more than 99% of the assembly. The repeat-masked genome sequence is annotated with 36,967 protein-codin...

Genome assembly and annotation: Raw nanopore reads in FASTQ format were assembled with Canu v2.2 (Koren et al. 2017) using an estimated genome size of 900 Mb to guide coverage parameters during the read correction, trimming, and assembly steps of the pipeline. The resulting primary assembly was polished with the WGS reads using NextPolish v1.3.1 (Hu et al. 2020), and duplicated constructs were removed by Purge Haplotigs (Roach et al. 2018). The set of deduplicated contigs was scaffolded on the basis of Hi-C reads using the Juicer pipeline (Durand et al. 2016) and 3d-dna tools (Dudchenko et al. 2017) with default parameters. Genome annotation was performed using the MAKER-P pipeline (Campbell et al. 2014) supplied with coding DNA sequences (CDS) from a Trinity (Grabherr et al. 2011) assembly of the Dillenia transcriptome reads, proteomes from four publicly available eudicot genomes (Arabidopsis thaliana, Aquilegia coerulea, Nelumbo nucifera, and Vitis vinifera), and a custom repeat library of tr...

Usage notes: The included data files may be opened with MS Word (Detailed Methods.docx), MS Excel (Dillenia.genome.assembly.stats.xlsx), standard image viewer software (Dillenia.BUSCO.summaries.png), and standard text editor programs (Dillenia.genome.fasta and Dillenia.maker.predict.36967.final.gff).
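
The headline statistics in the description (total assembly length and the number of scaffolds holding 99% of it) can be recomputed directly from the FASTA file. The following is a minimal sketch, not the authors' method: the function names `sequence_lengths`, `scaffolds_to_cover`, and `summarize` are hypothetical, and it assumes a plain, uncompressed FASTA with unique sequence names.

```python
from typing import Dict, Iterable


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else f"seq{len(lengths) + 1}"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def scaffolds_to_cover(lengths: Dict[str, int], fraction: float) -> int:
    """Smallest number of longest sequences whose summed length
    reaches the given fraction of the total."""
    target = fraction * sum(lengths.values())
    covered = 0
    ordered = sorted(lengths.values(), reverse=True)
    for count, length in enumerate(ordered, start=1):
        covered += length
        if covered >= target:
            return count
    return 0  # only reached for empty input


def summarize(path: str, fraction: float = 0.99) -> Dict[str, int]:
    """Read a FASTA file and report basic assembly statistics."""
    with open(path, "r", encoding="utf-8") as handle:
        lengths = sequence_lengths(handle)
    return {
        "sequences": len(lengths),
        "total_bp": sum(lengths.values()),
        "scaffolds_for_fraction": scaffolds_to_cover(lengths, fraction),
    }
```

Calling `summarize("Dillenia.genome.fasta")` on the downloaded file should give a `total_bp` value that can be compared with the 723,739,077 bp reported above, and a scaffold count for the 99% threshold that should not exceed the twenty-eight scaffolds mentioned in the description.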
Created:
2023-11-29