Integration of mate pair sequences to improve shotgun assemblies of flow-sorted chromosome arms of hexaploid wheat

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP002001
Description:
Assembling a wheat genome sequence is challenging because the genome is hexaploid and has an extreme repeat content (>80%). Flow sorting of single chromosome arms can now be used to overcome the difficulties associated with a polyploid genome; however, the high repeat content still causes extreme assembly fragmentation. Long-jump paired sequences (mate pairs) can reduce assembly fragmentation by joining multiple contigs into single scaffolds. The aim of this work was to assess how mate pair data generated from multiple displacement amplified DNA of flow-sorted chromosomes can help reduce fragmentation and increase the information content of shotgun assemblies of wheat chromosomes. Three mate pair libraries of ~2 kb, ~3 kb, and ~5 kb were integrated into 7BS and 7BL shotgun assemblies using SSPACE. We produced total mate pair coverage of 88.9 and 63.9 for 7BS and 7BL, respectively, but only ~1% of the mate pair data was used in scaffolding. In total, 17,481 7BS and 23,365 7BL scaffolds were built, with an average of 3.8-3.9 contigs per scaffold. Mate pair information improved the assembly N50 by 4.5-fold for 7BS (N50 = 11 kb) and 5.3-fold for 7BL (N50 = 8.3 kb). Scaffolds represented 70% and 80% of the total lengths of the 7BS and 7BL SSPACE assemblies, respectively, covering ~45% of the chromosome 7B length. Contigs found in scaffolds were biased towards gene-containing contigs. Integration of mate pair data increased average gene coverage per sequence by 3-5% and the number of estimated full-length genes by 10-20%. Experimental design of mate pair sequencing to improve cost efficiency is discussed.
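The N50 figures quoted in the description can be read with a small illustration. The sketch below is not the study's pipeline and does not use its data: it assumes the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), and the function name `n50` and the toy length lists are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: sort lengths from longest to shortest
    and return the length at which the running total first reaches half
    of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths in kb (made up for illustration, not from the 7BS/7BL data).
contigs = [2, 2, 3, 3, 4, 4, 6]   # fragmented assembly, 24 kb total
scaffolds = [4, 6, 14]            # the same contigs joined: 2+2, 3+3, 4+4+6

fold_improvement = n50(scaffolds) / n50(contigs)
print(n50(contigs), n50(scaffolds), fold_improvement)
```

With these toy values the total length is unchanged by scaffolding, but the N50 rises because the same bases sit in fewer, longer sequences. The 4.5-fold and 5.3-fold improvements reported for 7BS and 7BL describe that same kind of change.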
Created:
2021-02-04