The haplotype-resolved chromosome pairs and transcriptome data of a heterozygous diploid African cassava cultivar

Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/102193
Description:
Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and sub-tropical regions worldwide. Genetic gain by molecular breeding has been limited, partially because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome.

Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler Hifiasm, produced genome assemblies at near complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present two chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy above QV46, contig N50 above 18 Mbp, BUSCO completeness of 99%, and 35 K phased gene loci, it is the most accurate, continuous, complete and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue-specific and inconsistent across different tissues. Direction-shifting was observed in less than 2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome re-arrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR-retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding.

The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy and continuity.
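
For readers less familiar with the assembly metrics quoted in the description, the sketch below illustrates how a contig N50 and a consensus quality value (QV) are commonly interpreted. It assumes the usual Phred-scaled definition of QV (QV = -10 log10 of the per-base error probability) and the standard N50 definition; the function names and the toy contig lengths are hypothetical and are not taken from the dataset or from the authors' pipeline.

```python
import math


def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled QV to a per-base error probability (assumed definition)."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Inverse conversion: per-base error probability to Phred-scaled QV."""
    return -10 * math.log10(error_rate)


# Toy contig lengths (hypothetical, not from the TME204 assembly).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = contig_n50(toy_lengths)

# Error probability implied by the QV threshold quoted in the description.
qv46_error = qv_to_error_rate(46)
```

Under this definition, QV46 corresponds to roughly one consensus error per 40,000 bases, and "contig N50 above 18 Mbp" means that at least half of the assembled sequence lies in contigs of 18 Mbp or longer.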
Provider:
GigaScience Database
Created:
2022-02-11