PATHOEXTRACT: A BIOINFORMATIC PIPELINE FOR QUALITY CONTROL AND HOST DNA REMOVAL IN PLASMODIUM FALCIPARUM NGS DATA

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/records/14006897
Description:
Malaria caused by Plasmodium falciparum is a significant global health burden, particularly in sub-Saharan Africa. Deep sequencing (NGS) of parasite genomes has revolutionized our understanding of the parasite's biology and the emergence of drug resistance. However, host human DNA and other microbial contaminants in patient samples can hinder accurate and efficient parasite genome analysis. To address this challenge, we have developed PathoExtract, a bioinformatics pipeline that integrates commonly used tools into a streamlined workflow. PathoExtract uses Snakemake, a workflow management system, to provide a flexible and reproducible framework for data processing. The pipeline incorporates quality control steps to identify and remove low-quality reads and contaminants. Host DNA and microbial sequences are filtered out using a combination of alignment-based and alignment-free methods, so that only Plasmodium falciparum reads are retained for downstream analysis. The pipeline also offers a graphical user interface, making it accessible to researchers with varying levels of bioinformatics expertise, including those unfamiliar with command-line tools. The code and documentation for PathoExtract are freely available at: https://github.com/stanlasso/DREPAL-PATHOEXTRACT.
Created:
2024-10-29