Gene-Level DRACH Motif Atlas of SARS-CoV-2 (2020 - 2025): N, Spike, ORF6, 5′ UTR, 3′ UTR
Source: NIAID Data Ecosystem. Indexed: 2026-05-10
Download link:
https://data.mendeley.com/datasets/6y75tk8j5b
Description:
This dataset contains per-genome DRACH motif annotations for 9,356,279 SARS-CoV-2 genomes across five genomic regions: 5′ UTR, Spike (S), ORF6, Nucleocapsid (N), and 3′ UTR. For each genome and region, it provides:
seq_id: Genome identifier
year: Collection year (2020–2025)
drach_count: Number of DRACH motifs ([AGT][AG]AC[ACT])
drach_positions: 1-based genomic positions of motifs
drach_sequences: Actual DRACH k-mers (e.g., GGACT)
drach_density_per_kb: Motif count normalized per kilobase
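The fields above can be illustrated with a minimal sketch of a DRACH scan over one region's sequence. This is not the authors' pipeline: the function name `annotate_drach`, the `region_start` parameter, the decision to count overlapping motifs, and the use of the ungapped region length as the density denominator are all assumptions made for illustration.

```python
import re

# DRACH = [AGT][AG]AC[ACT]. The lookahead reports overlapping hits
# (two motifs can share one base, e.g. GGACT + TGACT in GGACTGACT).
# Whether the dataset counts overlaps is an assumption here.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")


def annotate_drach(region_seq, region_start=1):
    """Return the per-region fields for one sequence.

    region_start is a hypothetical parameter: the 1-based genomic
    coordinate of the first base of the region.
    """
    seq = region_seq.upper().replace("U", "T")
    hits = [(m.start() + region_start, m.group(1)) for m in DRACH.finditer(seq)]
    count = len(hits)
    # Density: motifs per 1,000 bases of the region (assumed denominator).
    density = count / len(seq) * 1000 if seq else 0.0
    return {
        "drach_count": count,
        "drach_positions": [pos for pos, _ in hits],
        "drach_sequences": [kmer for _, kmer in hits],
        "drach_density_per_kb": density,
    }
```

For example, `annotate_drach("GGACTGACT")` finds two motifs, GGACT at position 1 and TGACT at position 5. Alignment gap characters are not stripped in this sketch, so a gapped sequence would need cleaning first.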
Data is derived from the Wuhan-Hu-1 reference (NC_045512.2) and processed from a globally representative aligned FASTA.
Files are provided in tab-separated (TSV) format, with compressed .zst versions for efficient storage.
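As a rough sketch of how the TSV files might be read, the example below parses rows with Python's standard `csv` module. Several things are assumed rather than documented: that each file has a header row with the column names listed above, that `drach_positions` and `drach_sequences` are comma-separated within their cells, and that the `.zst` files have already been decompressed. The sample rows and the `parse_rows` helper are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample rows; identifiers and values are invented.
SAMPLE_TSV = (
    "seq_id\tyear\tdrach_count\tdrach_positions\tdrach_sequences\tdrach_density_per_kb\n"
    "example_genome_1\t2020\t2\t10,42\tGGACT,AAACA\t7.55\n"
    "example_genome_2\t2021\t0\t\t\t0.0\n"
)


def parse_rows(handle):
    """Yield one typed dict per TSV row (assumed column layout)."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield {
            "seq_id": row["seq_id"],
            "year": int(row["year"]),
            "drach_count": int(row["drach_count"]),
            # Assumed comma-separated lists; empty cells become empty lists.
            "drach_positions": [int(p) for p in row["drach_positions"].split(",") if p],
            "drach_sequences": [s for s in row["drach_sequences"].split(",") if s],
            "drach_density_per_kb": float(row["drach_density_per_kb"]),
        }


rows = list(parse_rows(io.StringIO(SAMPLE_TSV)))
```

With roughly 9.4 million rows per region, iterating over the generator row by row, rather than building a list as done here for the small sample, keeps memory use flat.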
Summary:
SARS-CoV-2 Gene-Level DRACH Motif Analysis Summary
=======================================================
5′ UTR
------------------------------
Total genomes: 9,356,279
Mean DRACH density: 21.48 motifs/kb
Median DRACH density: 22.64 motifs/kb
Temporal trend (2020–2025): 8.5% decline
Spike (S)
------------------------------
Total genomes: 9,356,279
Mean DRACH density: 21.42 motifs/kb
Median DRACH density: 21.72 motifs/kb
Temporal trend (2020–2025): 1.3% decline
ORF6
------------------------------
Total genomes: 9,356,279
Mean DRACH density: 21.44 motifs/kb
Median DRACH density: 21.50 motifs/kb
Temporal trend (2020–2025): 0.9% increase
Nucleocapsid (N)
------------------------------
Total genomes: 9,356,279
Mean DRACH density: 28.03 motifs/kb
Median DRACH density: 28.57 motifs/kb
Temporal trend (2020–2025): 1.3% increase
3′ UTR
------------------------------
Total genomes: 9,356,279
Mean DRACH density: 12.33 motifs/kb
Median DRACH density: 13.10 motifs/kb
Temporal trend (2020–2025): 5.9% decline
This dataset accompanies our upcoming article.
A preprint of the full-genome analysis has been posted at: https://doi.org/10.21203/rs.3.rs-7926428/v1
Please note that this is a large-scale dataset and is therefore suited to population-level analyses across the globe.
Data were processed by TahirHB@Hotmail.Com
Created:
2025-10-28



