Supporting data for "LRTK: A platform agnostic toolkit for linked-read analysis of both human genomes and metagenomes"

Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2024-07-13.
Download link:
http://gigadb.org/dataset/102524
Description:
Linked-read sequencing technologies generate short reads with high base quality that contain extrapolative information on long-range DNA connectedness. These advantages of linked-read technologies are well known and have been demonstrated in many human genomic and metagenomic studies. However, existing linked-read analysis pipelines (e.g., Long Ranger) were primarily developed to process sequencing data from the human genome and are not suited to analyzing metagenomic sequencing data. Moreover, linked-read analysis pipelines are typically limited to one specific sequencing platform. To address these limitations, we present the Linked-Read ToolKit (LRTK), a unified and versatile toolkit for platform-agnostic processing of linked-read sequencing data from both human genomes and metagenomes. LRTK provides functions to perform linked-read simulation, barcode sequencing error correction, barcode-aware read alignment and metagenome assembly, reconstruction of long DNA fragments, taxonomic classification and quantification, and barcode-assisted genomic variant calling and phasing. LRTK can process multiple samples automatically and gives users the option to generate reproducible reports during processing of raw sequencing data and at multiple checkpoints throughout downstream analysis. We applied LRTK to linked reads from simulated, mock community, and real datasets for both human genomes and metagenomes. We showcased LRTK's ability to generate comparative performance results from preceding benchmark studies and to report these results in publication-ready HTML document plots. LRTK provides comprehensive and flexible modules along with an easy-to-use Python-based workflow for processing linked-read sequencing datasets, thereby filling the current gap in the field caused by platform-centric, genome-specific linked-read data analysis tools.
Provider:
GigaScience Database
Created:
2024-05-07