
Supporting data for "Highly accurate whole genome imputation of SARS-CoV-2 from partial or low-quality sequences"

Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100946
Description:
The current SARS-CoV-2 pandemic has emphasized the utility of viral whole genome sequencing (WGS) in the surveillance and control of the pathogen. An unprecedented ongoing global initiative is producing hundreds of thousands of sequences worldwide. However, the complex circumstances in which viruses are sequenced, along with the demand for urgent results, cause a high rate of incomplete, and therefore useless, sequences. Viral sequences evolve in the context of a complex phylogeny, and different positions along the genome are in linkage disequilibrium. Therefore, an imputation method would be able to predict missing positions from the available sequencing data.

The impuSARS application has been developed to impute missing data in viral genomes. It takes advantage of the enormous number of SARS-CoV-2 genomes available, using a reference panel containing 239,301 sequences. impuSARS was tested under a wide range of conditions (missing continuous fragments, amplicons, or sparse individual positions) and showed great fidelity when reconstructing the original sequences, recovering the lineage with 100% precision for almost all lineages, even in very poorly covered genomes (< 20%).

Imputation can improve the pace of SARS-CoV-2 sequencing production by recovering many incomplete or low-quality sequences that would otherwise be discarded. impuSARS can be incorporated into any primary data processing pipeline for SARS-CoV-2 WGS.
Provider:
GigaScience Database
Created:
2021-11-11