PATHOEXTRACT: A BIOINFORMATIC PIPELINE FOR QUALITY CONTROL AND HOST DNA REMOVAL IN PLASMODIUM FALCIPARUM NGS DATA
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/records/14006897
Description:
Malaria, caused by Plasmodium falciparum, is a significant global health burden, particularly in sub-Saharan Africa. Next-generation sequencing (NGS) of parasite genomes has revolutionized our understanding of parasite biology and the emergence of drug resistance. However, human host DNA and other microbial contaminants in patient samples can hinder accurate and efficient analysis of the parasite genome. To address this challenge, we have developed PathoExtract, a bioinformatics pipeline that integrates commonly used tools into a single streamlined workflow. PathoExtract uses Snakemake, a workflow management system, to provide a flexible and reproducible framework for data processing. The pipeline includes quality control steps that identify and remove low-quality reads and contaminants. Host DNA and microbial sequences are filtered out using a combination of alignment-based and alignment-free methods, so that only Plasmodium falciparum reads are retained for downstream analysis. The pipeline also offers a graphical user interface, which makes it accessible to researchers with varying levels of bioinformatics expertise, including those unfamiliar with command-line tools. The code and documentation for PathoExtract are freely available at: https://github.com/stanlasso/DREPAL-PATHOEXTRACT.
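The two filtering ideas in the description (dropping low-quality reads and dropping reads attributed to the host) can be illustrated with a minimal Python sketch. This is not PathoExtract's code: the function names, the mean-quality threshold of 20, and the `host_ids` set (standing in for read IDs that an aligner or classifier has already assigned to the human genome) are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str  # Phred+33 encoded quality string


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from 4-line FASTQ entries (header, sequence, '+', quality)."""
    cleaned = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(cleaned) - 3, 4):
        header, seq, _plus, qual = cleaned[i:i + 4]
        # Read ID is the first token after the leading '@'
        yield FastqRecord(header[1:].split()[0], seq, qual)


def mean_phred(quality: str) -> float:
    """Mean Phred score of a Phred+33 quality string."""
    return sum(ord(c) - 33 for c in quality) / len(quality)


def filter_reads(
    records: Iterable[FastqRecord],
    host_ids: Set[str],              # hypothetical: IDs already flagged as host
    min_mean_quality: float = 20.0,  # assumed threshold, not from the source
) -> Iterator[FastqRecord]:
    """Keep reads that are neither host-flagged nor below the quality threshold."""
    for rec in records:
        if rec.read_id in host_ids:
            continue  # attributed to the host, drop
        if not rec.quality or mean_phred(rec.quality) < min_mean_quality:
            continue  # low quality, drop
        yield rec


example_fastq = """@read1
ACGT
+
IIII
@read2
ACGT
+
!!!!
@read3
ACGT
+
IIII
""".splitlines()

kept = list(filter_reads(parse_fastq(example_fastq), host_ids={"read3"}))
print([r.read_id for r in kept])  # ['read1']
```

In a real workflow the host-flagged IDs would come from mapping reads against a human reference, and each step would typically be a separate rule in the workflow manager rather than a single script.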
Created:
2024-10-29



