Data from: Single molecule sequencing resolves the detailed structure of complex satellite DNA loci in Drosophila melanogaster
Source: DataONE. Updated: 2017-04-04. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Resource description:
Highly repetitive satellite DNA (satDNA) repeats are found in most eukaryotic genomes. SatDNAs are rapidly evolving and have roles in genome stability and chromosome segregation. Their repetitive nature poses a challenge for genome assembly and makes progress on the detailed study of satDNA structure difficult. Here we use single-molecule sequencing long reads from Pacific Biosciences (PacBio) to determine the detailed structure of all major autosomal complex satDNA loci in Drosophila melanogaster, with a particular focus on the 260-bp and Responder satellites. We determine the optimal de novo assembly methods and parameter combinations required to produce a high-quality assembly of these previously unassembled satDNA loci and validate this assembly using molecular and computational approaches. We determined that the computationally intensive PBcR-BLASR assembly pipeline yielded better assemblies than the faster and more efficient pipelines based on the MHAP hashing algorithm, and that it is essential to validate assemblies of repetitive loci. The assemblies reveal that satDNA repeats are organized into large arrays interrupted by transposable elements. The repeats in the center of the array tend to be homogenized in sequence, suggesting that gene conversion and unequal crossovers lead to repeat homogenization through concerted evolution, though the degree of unequal crossing-over may differ among complex satellite loci. We find evidence for higher-order structure within satDNA arrays that suggests recent structural rearrangements. These assemblies provide a platform for the evolutionary and functional genomics of satDNAs in pericentric heterochromatin.
Created:
2017-04-04



