Supporting data for "CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes"

Name: Supporting data for "CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes"
Creator: GigaScience Database
Published: 2025-05-26 17:20:04
License: No description available

DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.

Download link:

http://gigadb.org/dataset/100729

Description:

Easy-to-use and fast bioinformatics pipelines for long-read assembly that go beyond the contig level to generate high-quality chromosome-scale genomes from raw data remain scarce.

Chromosome Scale Assembler (CSA) is a novel, computationally highly efficient bioinformatics pipeline that fills this gap. CSA integrates information from scaffolded assemblies (e.g., Hi-C or 10X Genomics) or even from diverged reference genomes into the assembly process. As CSA performs automated assembly of chromosome-sized scaffolds, we benchmark its performance against state-of-the-art reference genomes that have been built in a laborious fashion using multiple separate assembly tools and manual curation. CSA increases the contig length using scaffolding, local re-assembly and gap-closing. On certain datasets, the initial contig N50 can be increased up to 4.5-fold. For smaller vertebrate genomes, chromosome-scale assemblies can be achieved within 12 h using low-cost, high-end desktop computers. Mammalian genomes can be processed within 16 h on compute servers. Using diverged reference genomes for fish, birds and mammals, we demonstrate that CSA calculates chromosome-scale assemblies from long-read data and genome comparisons alone. Even contig-level draft assemblies of diverged genomes are helpful for reconstructing chromosome-scale sequences. CSA is capable of assembling ultra-long reads.

CSA can speed up and simplify chromosome-level assembly and significantly lower the costs of large-scale family-level vertebrate genome projects.
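The contig N50 figure quoted above is a standard assembly statistic: the length L such that contigs of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that calculation and of a fold-change comparison. It is not taken from CSA; the function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy data (arbitrary units), not results from the dataset.
before = [2, 2, 2, 3, 3, 4, 8, 8]   # contig lengths before scaffolding
after = [16, 9, 7]                  # same total length, fewer and longer contigs

fold_change = n50(after) / n50(before)
print(n50(before), n50(after), fold_change)
```

In this toy case the total length is unchanged while contigs are merged, so the N50 rises purely because fewer, longer sequences are needed to reach half of the assembly.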

Provider:

GigaScience Database

Created:

2020-03-23