Three haplotype-resolved pentaploid Rosa assemblies with assembled and extracted single copy orthologue (SCO) sequences from Rosa canina genome, diploid Rosa species, and sect. Caninae pollen
Source: DataCite Commons. Updated: 2025-06-05. Indexed: 2025-05-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.cc2fqz6fh
Description:
This dataset was created to analyze nuclear single-copy regions in the Rosa
canina genome assembly based on single-copy orthologue (SCO) tags (Debray
et al. 2019). We used 24 diploid rose species, pollen data from sect.
Caninae roses, Rosa canina subgenome-specific SCO sequences, and outgroup
species from the Rosaceae family. Target capture baits, designed by
Agilent Technologies, covered exons, UTRs, and small introns. The raw reads
from target capture sequencing were processed with GATK to obtain a
sample-specific reference for each SCO locus and each sample. Additionally,
single-copy loci from all whole-genome assemblies (Rosa
canina, Rubus idaeus, and Fragaria species) were extracted and
concatenated into subgenome-specific sequences and, together with the
sample-specific SCO references, used for alignment analysis. The dataset also
includes haplotype-resolved genome assemblies of Rosa canina S27,
Rosa canina (DToL), and Rosa agrestis (DToL). All three Rosa samples are
pentaploid (2n=5x=35), and all three assemblies are chromosome-level. Note
that the assemblies contain only pseudochromosomal sequences, with no
unplaced contigs. The PacBio HiFi and Hi-C data of Rosa canina S27 were
sequenced in-house and are available under NCBI BioProject PRJNA1111045,
while the sequencing data of Rosa canina (DToL) and Rosa agrestis (DToL)
are from Darwin Tree of Life (DToL). The NCBI BioProject accessions for
the two DToL Rosa datasets are PRJEB79802 and PRJEB79880,
respectively. The chromosomes of Rosa canina are named
"Rca#_Subgenome", where # denotes the chromosome number
(possible values: 1-7) and "Subgenome" is one of S1_h1,
S1_h2, S2, R3, and R4. Similarly, the chromosomes of Rosa
agrestis are named "Rag#_Subgenome", where # also denotes the
chromosome number and "Subgenome" is one of S1, S2, R3,
R4_h1, and R4_h2. See our publication for details on how the haplotypes
were resolved.
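
The chromosome naming scheme described above can be checked programmatically. The following is a minimal sketch, not part of the dataset: the function name `parse_chromosome_name` is hypothetical, and it assumes that sequence names appear literally as described (for example "Rca3_S1_h2"), with no zero padding and no extra text, and that Rosa agrestis chromosome numbers also run from 1 to 7.

```python
import re

# Allowed subgenome labels per species prefix, as listed in the description.
SUBGENOMES = {
    "Rca": {"S1_h1", "S1_h2", "S2", "R3", "R4"},   # Rosa canina
    "Rag": {"S1", "S2", "R3", "R4_h1", "R4_h2"},   # Rosa agrestis
}

# Assumed literal layout: prefix, chromosome number 1-7, underscore, label.
NAME_RE = re.compile(r"(Rca|Rag)([1-7])_(\w+)")


def parse_chromosome_name(name):
    """Split a name such as 'Rca3_S1_h2' into (prefix, number, subgenome).

    Raises ValueError if the name does not follow the described scheme.
    """
    match = NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized chromosome name: {name!r}")
    prefix, number, subgenome = match.groups()
    if subgenome not in SUBGENOMES[prefix]:
        raise ValueError(f"subgenome {subgenome!r} is not valid for {prefix}")
    return prefix, int(number), subgenome
```

For example, `parse_chromosome_name("Rca3_S1_h2")` returns `("Rca", 3, "S1_h2")`, while a name such as "Rca1_R4_h1" is rejected because R4_h1 is listed only for Rosa agrestis.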
Provider:
Dryad
Created:
2024-07-17



