Data from: High repeat content in the genomes of sparrows: the importance of genome assembly completeness for transposable element discovery
Source: DataCite Commons. Updated: 2026-03-05. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.cjsxksncs
Description:
Transposable elements (TEs) play critical roles in shaping genome
evolution. However, the highly repetitive sequence content of TEs is a
major source of assembly gaps. This makes it difficult to decipher the
impact of these elements on the dynamics of genome evolution. The
increased capacity of long-read sequencing technologies to span highly
repetitive regions of the genome should provide novel insights into
patterns of TE diversity. Here we report the generation of highly
contiguous reference genomes using PacBio long-read and Omni-C
technologies for three species of sparrows in the family Passerellidae. To
assess the influence of sequencing technology on TE annotation, we
compared these assemblies to three chromosome-level sparrow assemblies
recently generated by the Vertebrate Genomes Project and nine other
sparrow species generated using a variety of short- and long-read
technologies. All long-read-based assemblies were longer (range:
1.12-1.41 Gb) than short-read assemblies (0.91-1.08 Gb). Assembly length
was strongly correlated with the amount of repeat content, with longer
genomes showing much higher levels of repeat content than typically
reported for the avian order Passeriformes. Repeat content for the
Bell's sparrow (31.2% of genome) was the highest level reported to
date for a songbird genome assembly and was more in line with woodpecker
(order Piciformes) genomes. CR1 LINE elements retained from an expansion
that occurred 25-30 million years ago were the most abundant TEs in the
song sparrow genome. Although the other five sparrow species also exhibit
evidence for a spike in CR1 LINE activity at 25-30 million years ago, LTR
elements stemming from more recent expansions were the most abundant
elements in these species. LTRs were uniquely abundant in the Bell's
sparrow genome, deriving from two recent peaks of activity. Higher levels
of repeat content (79.2-93.7%) were found on the W chromosome relative to
the Z (20.7-26.5%) or autosomes (16.1-30.9%). These patterns support a
dynamic model of transposable element expansion and contraction
underpinning the seemingly constrained, small genomes of birds.
Our work highlights how the resolution of difficult-to-assemble regions of
the genome with new sequencing technologies promises to transform our
understanding of avian genome evolution.
Provider:
Dryad
Created:
2023-12-14



