SCoV2-VAR: A light-weighted, customizable, and open-source database of 12 million SARS-CoV-2 genomes
Source: Zenodo | Updated: 2025-01-16 | Indexed: 2026-05-29
Download link:
https://zenodo.org/doi/10.5281/zenodo.14664720
Description:
The explosive accumulation of SARS-CoV-2 variants poses a challenge for monitoring viral mutations and for other data processing, particularly when relying on centralized databases. The present study aimed to establish a lightweight, customizable, and open-source database of SARS-CoV-2 genomes and annotations, without any access restrictions. The database, named SCoV2-VAR, was built from the variations (VAR) of full-length SARS-CoV-2 (SCoV2) sequences uploaded to websites. All sequence samples underwent quality control, single nucleotide polymorphism (SNP) annotation, format conversion, and final compression before being appended to SCoV2-VAR. The final version of SCoV2-VAR (up to Feb 2024) contains more than 12 million SARS-CoV-2 records, each with a full genome and annotations. SCoV2-VAR is extremely lightweight, with a storage size of 937 Mb for all 12 million sequences after 1:596 compression. SCoV2-VAR supports timely updates, fast queries, and customizable output of SARS-CoV-2 sequences and their annotations. Additionally, the present study provides an overview of all 12 million SARS-CoV-2 samples, covering both sequences and annotations.
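The description says the database stores variations rather than full sequences, but it does not specify the on-disk format. As a minimal sketch of the general idea, assuming substitution-only differences against a single reference, the following Python shows how a genome can be reduced to a short list of SNPs and restored from it. The function names `encode_snps` and `decode_snps`, and the toy sequences, are hypothetical and are not taken from SCoV2-VAR.

```python
from typing import List, Tuple

# (0-based position, alternate base); an illustrative record layout,
# not the format used by SCoV2-VAR.
SNP = Tuple[int, str]


def encode_snps(reference: str, sample: str) -> List[SNP]:
    """Return the positions where sample differs from reference."""
    if len(reference) != len(sample):
        # Insertions and deletions are out of scope for this sketch.
        raise ValueError("substitution-only sketch: lengths must match")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def decode_snps(reference: str, snps: List[SNP]) -> str:
    """Rebuild the sample sequence by applying SNPs to the reference."""
    bases = list(reference)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


# Toy sequences for illustration only.
reference = "ATGCGTACGTTAGC"
sample = "ATGCGTACCTTAGA"

snps = encode_snps(reference, sample)
restored = decode_snps(reference, snps)
```

Because most samples differ from the reference at only a few positions, storing the difference list instead of the full sequence is one plausible reason a variation-based layout can be far smaller than the raw genomes. A real pipeline would also need to handle insertions, deletions, and ambiguous bases.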
Provider:
Zenodo
Created:
2025-01-16



