
The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP002046
Resource description:
In today’s age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but we have little information on the nature and identity of repetitive units in gymnosperms. The pines (Pinus) have extensive genetic resources, with approximately 329,000 ESTs from eleven species and genetic maps in eight species, including a dense map of the twelve linkage groups in Pinus taeda. We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of the ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. This study indicates that the majority of repeats in the P. taeda genome are ‘novel’ and will therefore require additional BAC or genomic sequencing for accurate characterization. This study provides the first evidence that sequencing a pine genome using a WGS approach is a feasible, and highly advantageous, goal.
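The identity-threshold comparison in the description (similar copies at ≥ 75% identity versus identical copies at 99% identity) can be illustrated with a small Python sketch. This is not the authors' pipeline: the function name `covered_fraction`, the `(start, end, percent_identity)` hit format, and the toy coordinates are assumptions made for illustration only. It merges the BAC intervals hit by WGS reads at or above a chosen identity and reports the fraction of BAC bases covered.

```python
from typing import Iterable, Tuple

# Hypothetical hit format: (start, end, percent_identity) in half-open
# BAC coordinates. Real alignment output would need to be parsed into this.
Hit = Tuple[int, int, float]


def covered_fraction(bac_length: int, hits: Iterable[Hit], min_identity: float) -> float:
    """Fraction of BAC bases covered by at least one hit at or above min_identity."""
    # Keep only hits that pass the identity threshold, sorted by start.
    intervals = sorted((s, e) for s, e, ident in hits if ident >= min_identity)

    covered = 0
    cur_start = cur_end = None
    for s, e in intervals:
        # Clamp to the BAC boundaries and skip empty intervals.
        s, e = max(s, 0), min(e, bac_length)
        if e <= s:
            continue
        if cur_end is None or s > cur_end:
            # Disjoint from the running interval: flush it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent: extend the running interval.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / bac_length


# Toy values, not data from the study.
toy_hits = [(0, 50, 80.0), (40, 80, 99.5), (90, 100, 60.0)]
print(covered_fraction(100, toy_hits, 75.0))  # bases with a "similar" copy
print(covered_fraction(100, toy_hits, 99.0))  # bases with an "identical" copy
```

Merging intervals before counting keeps a base that is hit by many reads from being counted more than once, which matters in a highly repetitive genome.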
Created:
2013-08-23