Genic sub-assembly sequences and low copy number assembly of Triticum aestivum (bread wheat) 5x 454 raw reads (study ERP000319). WGS assembly 1 'OA' has EMBL bank accession range CALO01000001-CALO01945079. WGS assembly 2 'LCG' has EMBL bank accession range CALP010000001-CALP015321847. Triticum aesti
Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB217
Resource description:
Bread wheat has a hexaploid genome of approximately 17 Gb, making it one of the largest and most complex plant genomes. Wheat is of fundamental importance to world agriculture, with an estimated 2007 harvest of approximately 550 million tonnes. The OA dataset (EMBL bank accession range CALO01000001-CALO01945079) represents a genic sub-assembly of the 5x coverage 454 sequence of bread wheat (Chinese Spring line) described in SRA study ERP000319. In this project we constructed a set of orthologous representative grass genes incorporating genes from Brachypodium distachyon, Sorghum bicolor, Oryza sativa and Hordeum vulgare. We mapped and assembled wheat raw reads on each Orthologous Group (OG) representative, using stringent parameters to avoid collapsing homologous sequences. Sub-assembly sequences were assigned sub-genome predictions (A, B, D, or X for unknown) using a trained machine learning classifier. For this submission, all sub-assembly sequences shorter than 100 bp were removed. The Low Copy-number Genome assembly (LCG) (EMBL bank accession range CALP010000001-CALP015321847) was constructed by filtering out repetitive sequences and assembling the remaining low-copy sequences de novo with gsAssembler from the Newbler package (development version 2.6pre), using the “-large” parameter. The data sets are outputs of the BBSRC-funded grant “Mining the allohexaploid wheat genome for useful sequence polymorphisms”. The grant is led by Prof. Keith Edwards (University of Bristol) and is a collaboration between Prof. Neil Hall and Dr. Anthony Hall (University of Liverpool), Dr. Gary Barker (University of Bristol) and Prof. Mike Bevan (John Innes Centre) (BB/G013004/1, BB/G012865/1). WGS assembly 1 'OA' has EMBL bank accession range CALO01000001-CALO01945079. WGS assembly 2 'LCG' has EMBL bank accession range CALP010000001-CALP015321847.
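The two accession ranges quoted above also say how many contig records each assembly holds. The following is a minimal Python sketch, assuming the usual WGS accession layout of a four-letter prefix, a two-digit assembly version and a zero-padded serial number. The helper names (parse_wgs, count_range, nth_accession) are hypothetical and not part of any EMBL or NCBI tool.

```python
import re

# Assumed layout: 4-letter project prefix, 2-digit assembly version,
# then a zero-padded contig serial (6 digits for OA, 7 digits for LCG).
WGS_RE = re.compile(r"^([A-Z]{4})(\d{2})(\d+)$")


def parse_wgs(accession):
    """Split a WGS accession into (prefix, version, serial-as-string)."""
    match = WGS_RE.match(accession)
    if match is None:
        raise ValueError(f"not a WGS-style accession: {accession!r}")
    return match.groups()


def count_range(first, last):
    """Number of accessions in an inclusive first-last range."""
    p1, v1, s1 = parse_wgs(first)
    p2, v2, s2 = parse_wgs(last)
    if (p1, v1) != (p2, v2) or len(s1) != len(s2):
        raise ValueError("range endpoints belong to different assemblies")
    return int(s2) - int(s1) + 1


def nth_accession(first, offset):
    """Accession located `offset` records after `first`, same padding."""
    prefix, version, serial = parse_wgs(first)
    return f"{prefix}{version}{int(serial) + offset:0{len(serial)}d}"


# Ranges exactly as quoted in the description.
oa_count = count_range("CALO01000001", "CALO01945079")
lcg_count = count_range("CALP010000001", "CALP015321847")
print("OA contigs:", oa_count)
print("LCG contigs:", lcg_count)
```

Under this assumption the OA range covers 945,079 records and the LCG range 5,321,847. nth_accession can be used to generate individual identifiers for batch retrieval.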
Created:
2013-01-03



