Sorghum bicolor strain:Keller, E-Tian, Ji2731 Genome sequencing and assembly
Source: DataCite Commons. Updated: 2020-10-10. Indexed: 2025-04-09.
Download link:
https://db.cngb.org/search/project/CNPhis0000340/
Description:
Sorghum (Sorghum bicolor) is produced globally as a source of food, feed, fibre and fuel. Grain and sweet sorghums differ in a number of important traits, including stem sugar and juice accumulation, plant height, and production of grain and biomass. The first whole-genome sequence of a grain sorghum is available, but additional genome sequences are required to study genome-wide and intraspecies variation, both for dissecting the genetic basis of these traits and for tailor-designed breeding of this important C4 crop. We resequenced two sweet and one grain sorghum inbred lines and identified a set of nearly 1,500 genes differentiating sweet and grain sorghum. We also uncovered 1,057,018 SNPs, 99,948 indels of 1-10 bp in length, 16,487 presence/absence variations and 17,111 CNVs. This is the first report on the identification of genome-wide patterns of genetic variation in sorghum. Because some genes might exist in sorghum but be missing from the currently assembled BTx623 sorghum genome, we assembled the unmapped reads with SOAPdenovo and obtained contigs with a total length of 7.2 Mb. Annotation of these contigs showed 73 putative absent genes with an average length of 409 bp (only coding regions were considered). A BLAST search against Arabidopsis, rice and maize genome databases revealed that 33 of these genes showed homology with known proteins (E-value < 1e-6).
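The homology step in the description (a BLAST search with an E-value cutoff of 1e-6) can be illustrated with a minimal sketch. It assumes BLAST results saved in the standard 12-column tabular format (`-outfmt 6`), where the E-value is the 11th column. The function name `queries_with_hits` and the example rows are hypothetical and are not taken from the project's data or pipeline.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the description


def queries_with_hits(blast_tsv, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit below the E-value cutoff.

    blast_tsv is any iterable of lines in BLAST tabular format (outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        # Skip blank lines and comment lines (as produced by outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        if float(row[10]) < cutoff:  # column 11 holds the E-value
            hits.add(row[0])
    return hits


# Made-up rows for illustration only (hypothetical gene and protein IDs).
example = io.StringIO(
    "gene_01\tprotA\t85.0\t120\t18\t0\t1\t120\t5\t124\t3e-40\t150\n"
    "gene_01\tprotB\t40.0\t60\t36\t0\t1\t60\t10\t69\t0.5\t30\n"
    "gene_02\tprotC\t35.0\t50\t32\t1\t1\t50\t3\t52\t2e-3\t35\n"
)
print(sorted(queries_with_hits(example)))  # ['gene_01']
```

Counting the returned set gives the number of genes with at least one hit under the cutoff, which is the kind of figure the description reports (33 of 73).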
Provider:
CNGB
Created:
2018-10-20



