
Data_Sheet_3_Pan-genome analysis of three main Chinese chestnut varieties.zip

Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://figshare.com/articles/dataset/Data_Sheet_3_Pan-genome_analysis_of_three_main_Chinese_chestnut_varieties_zip/20367981
Description:
Chinese chestnut (Castanea mollissima Blume) is one of the earliest domesticated plants and has high nutritional and ecological value, yet the mechanisms underlying its growth and development are poorly understood. Although individual chestnut species differ greatly, the molecular basis of their characteristic traits remains unknown. Draft genomes of chestnut have been released previously, but the pan-genome of different varieties still needs to be studied. Here we report the genome sequences of three cultivated chestnut varieties, Hei-Shan-Zhai-7 (H7, a drought-resistant variety), Yan-Hong (YH, an easy-pruning variety), and Yan-Shan-Zao-Sheng (ZS, an early-maturing variety), to make genetics-based breeding more convenient and efficient. We obtained three chromosome-level chestnut genome assemblies by combining Oxford Nanopore sequencing, Illumina HiSeq X sequencing, and Hi-C mapping. The final genome assemblies are 671.99 Mb (YH), 790.99 Mb (ZS), and 678.90 Mb (H7), each spanning 12 chromosomes, with scaffold N50 sizes of 50.50 Mb (YH), 65.05 Mb (ZS), and 52.16 Mb (H7). Through homologous gene identification and gene family cluster analysis, we found that H7, YH, and ZS had 159, 131, and 91 unique gene families, respectively, and that the three varieties shared 13,248 single-copy orthologous genes. To support further research, a chestnut genome database was constructed. Based on the gene family identification results, presence/absence variation (PAV) information was calculated for the genes of the three samples, and 2,364, 2,232, and 1,475 unique genes were identified in H7, YH, and ZS, respectively. Our results suggest that the GBSS II-b gene family underwent expansion in chestnut relative to the most closely related species. Overall, we developed high-quality, well-annotated genome sequences of three C. mollissima varieties, which will help clarify the molecular mechanisms underlying important traits and shorten the breeding process.

Created:
2022-07-25