Plasmodium falciparum strain CO01: genome sequencing and assembly from PacBio Sequel
Source: NIAID Data Ecosystem. Indexed: 2026-03-12
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA757237
Description:
Recent advances in long-read technologies not only enable large consortia to aim to sequence all eukaryotes on Earth but also allow individual laboratories to sequence their species of interest at relatively low cost. Although long-read technologies promise 'perfect genomes', the number of contigs often far exceeds the number of chromosomes, and the assemblies contain many insertion and deletion errors around homopolymer tracts. To overcome these issues, we implemented the ILRA pipeline to correct long-read-based assemblies: contigs are reordered, renamed, merged, circularized, or filtered if erroneous or contaminated, and Illumina reads are used to correct homopolymer errors. We successfully tested our approach by improving the genomes of four novel Plasmodium falciparum assemblies from field samples. The PacBio RSII assemblies are in PRJNA714074; the PacBio Sequel assemblies are in PRJNA757237.
Created:
2021-08-30



