PRJNA1088471 SARS-CoV-2 Wastewater Sequences Processed
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/fmwvksr66f
Description:
The data come from NCBI BioProject PRJNA1088471, as originally analyzed in Overton et al. (2024). These data are short-read sequences from wastewater in Toronto, Ontario. See Overton et al. (2024) for a detailed description of the genomic sequencing process.
Data processing involved aligning the short reads to the Wuhan-1 reference sequence (NC_045512) with `minimap2` v2.28, identifying mutations relative to the reference, and recording, for each mutation, the number of times it was observed (counts) and the depth of coverage. The frequency is calculated as the counts divided by the coverage. The mutation pre-processing pipeline is available at https://github.com/DASL-Lab/data-treatment-plant and relies heavily on the GromStole pipeline (https://github.com/PoonLab/gromstole).
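As an illustration of the frequency definition above, here is a minimal Python sketch. It is not taken from the data-treatment-plant or GromStole code. The function name `mutation_frequency` is hypothetical, and returning `None` for zero coverage is an assumption, since the description does not say how uncovered positions are handled.

```python
from typing import Optional


def mutation_frequency(count: int, coverage: int) -> Optional[float]:
    """Frequency of a mutation: observed counts divided by depth of coverage.

    Assumption: positions with zero coverage have no defined frequency,
    so None is returned instead of raising ZeroDivisionError.
    """
    if count < 0 or coverage < 0:
        raise ValueError("count and coverage must be non-negative")
    if coverage == 0:
        return None
    return count / coverage


# A mutation observed in 25 of the 100 reads covering its position
example_frequency = mutation_frequency(25, 100)
```

Here `example_frequency` is 0.25, meaning a quarter of the reads covering that position carry the mutation.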
After processing into counts and coverage, the data were filtered to include only mutations relevant to the analysis, because many mutations had either consistently low counts (possibly due to sequencing errors) or low coverage.
We retained every mutation that had a frequency of at least 0.1 and below 0.9, with a coverage of at least 40, at two or more time points during the study in any location. This ensures that we keep all mutations that were potentially part of a circulating lineage, without relying on lineage definitions.
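The filtering rule can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation. The record fields (`mutation`, `location`, `date`, `count`, `coverage`), the function names, and the sample records are all hypothetical. The sketch also assumes that the two time points must fall within the same location; pooling time points across locations is another possible reading of the rule.

```python
from collections import defaultdict

MIN_FREQ = 0.1        # frequency must be at least this
MAX_FREQ = 0.9        # and strictly below this
MIN_COVERAGE = 40     # minimum depth of coverage
MIN_TIME_POINTS = 2   # minimum number of qualifying time points


def qualifies(count, coverage):
    """True if a single observation meets the coverage and frequency thresholds."""
    if coverage < MIN_COVERAGE:
        return False
    frequency = count / coverage  # safe: coverage >= 40 here
    return MIN_FREQ <= frequency < MAX_FREQ


def select_mutations(records):
    """Return mutations that qualify at >= 2 distinct dates in some location.

    Assumption: time points are counted per location, not pooled.
    """
    dates_seen = defaultdict(set)
    for rec in records:
        if qualifies(rec["count"], rec["coverage"]):
            dates_seen[(rec["mutation"], rec["location"])].add(rec["date"])
    return {
        mutation
        for (mutation, _location), dates in dates_seen.items()
        if len(dates) >= MIN_TIME_POINTS
    }


# Hypothetical records for illustration only (not real data)
records = [
    {"mutation": "mut_A", "location": "site1", "date": "d1", "count": 50, "coverage": 100},
    {"mutation": "mut_A", "location": "site1", "date": "d2", "count": 30, "coverage": 60},
    {"mutation": "mut_B", "location": "site1", "date": "d1", "count": 20, "coverage": 100},
    {"mutation": "mut_B", "location": "site1", "date": "d2", "count": 15, "coverage": 30},
    {"mutation": "mut_C", "location": "site2", "date": "d1", "count": 95, "coverage": 100},
    {"mutation": "mut_C", "location": "site2", "date": "d2", "count": 96, "coverage": 100},
]
selected = select_mutations(records)
```

In this toy input only `mut_A` is selected: `mut_B` qualifies at just one time point because its second observation has coverage below 40, and `mut_C` is at a frequency of 0.9 or above at both time points.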
Overton, Alyssa K., Jennifer J. Knapp, Opeyemi U. Lawal, et al. “Genomic Surveillance of a Canadian Airport Wastewater Samples Allows Early Detection of Emerging SARS-CoV-2 Lineages.” Preprint, April 9, 2024. https://doi.org/10.21203/rs.3.rs-4183960/v1.
Created:
2025-11-17



