Data from: The Genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos)
Source: DataCite Commons | Updated: 2025-05-01 | Indexed: 2025-05-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.d0md6
Description:
Biologists routinely use molecular markers to identify conservation units,
to quantify genetic connectivity, to estimate population sizes, and to
identify targets of selection. Many imperiled eagle populations require
such efforts and would benefit from enhanced genomic resources. We
sequenced, assembled, and annotated the first eagle genome using DNA from
a male golden eagle (Aquila chrysaetos) captured in western North America.
We constructed genomic libraries that were sequenced using Illumina
technology and assembled the high-quality data to a depth of ~40x
coverage. The genome assembly includes 2,552 scaffolds >10 Kb and
415 scaffolds >1.2 Mb. We annotated 16,571 genes that are involved
in myriad biological processes, including such disparate traits as beak
formation and color vision. We also identified repetitive regions spanning
92 Mb (~6% of the assembly), including LINEs, SINEs, LTR-RTs, and DNA
transposons. The mitochondrial genome encompasses 17,332 bp and is ~91%
identical to the Mountain Hawk-Eagle (Nisaetus nipalensis). Finally, the
data reveal that several anonymous microsatellites commonly used for
population studies are embedded within protein-coding genes and thus may
not have evolved in a neutral fashion. Because the genome sequence
includes ~800,000 novel polymorphisms, markers can now be chosen based on
their proximity to functional genes involved in migration, carnivory, and
other biological processes.
Provider:
Dryad
Created:
2014-04-23



