Blueprint for phasing and assembling the genomes of heterozygous polyploids: Application to the octoploid genome of strawberry
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.25338/B8TP7G
Description:
The challenge of allelic diversity for assembling haplotypes is
exemplified in polyploid genomes containing homoeologous chromosomes of
identical ancestry, and significant homologous variation within their
ancestral subgenomes. Cultivated strawberry (Fragaria × ananassa) and its
progenitors are outbred octoploids in which up to eight homologous and
homoeologous alleles are preserved. This introduces significant risk of
haplotype collapse, switching, and chimeric fusions during assembly. Using
third-generation HiFi sequences from PacBio, we assembled the genome of
the day-neutral octoploid F. × ananassa hybrid ‘Royal Royce’ from the
University of California. Our goal was to produce subgenome- and
haplotype-resolved assemblies of all 56 chromosomes, accurately
reconstructing the parental haploid chromosome complements.
Previous work has demonstrated that partitioning sequences by parental
phase supports direct assembly of haplotypes in heterozygous diploid
species. We combined the accuracy of HiFi sequence data with
pedigree-informed sequencing to partition long-read sequences by phase
and reduce the downstream risk of subgenomic chimeras during assembly. We
used an octoploid strawberry recombination breakpoint map containing
3.6 M variants to identify and break chimeric junctions and to
scaffold the phase-1 and phase-2 octoploid assemblies. The
N50 contiguity of the phase-1 and phase-2 assemblies prior to scaffolding
and gap-filling was 11 Mb. The final haploid assembly represented seven of
28 chromosomes in a single contiguous sequence, and averaged fewer than
three gaps per pseudomolecule. Additionally, we re-annotated the octoploid
genome to produce a custom F. × ananassa repeat library and an improved set
of gene models based on IsoSeq transcript data and an expansive RNA-seq
expression atlas. Here we present ‘FaRR1’, a gold-standard reference
genome of F. × ananassa cultivar ‘Royal Royce’ to assist future genomic
research and molecular breeding of allo-octoploid strawberry.
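
The description reports contiguity as N50, the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows, assuming contig lengths are already available as a list of numbers. The function name `n50` and the example lengths are illustrative and are not taken from the FaRR1 assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk contigs from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in Mb (not values from the dataset).
example_lengths_mb = [20, 15, 11, 8, 6, 4, 2]
print(n50(example_lengths_mb))  # prints 15
```

In the example the total is 66 Mb, so the threshold of 33 Mb is reached at the second contig (20 + 15 = 35), giving an N50 of 15 Mb.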
Provider:
Dryad
Created:
2021-11-08



