Global SARS-CoV-2 Proteome Mutation Burden Atlas Derived From 103 Million Amino Acid Sequences Covering S, N, M, E, ORF1a, ORF1b, ORF3a, ORF6, ORF7a, ORF7b, and ORF8 Proteins Spanning 2020 to Q3-2025
Figshare | Updated 2025-11-19 | Indexed 2026-04-08
Download link:
https://figshare.com/articles/dataset/Global_SARS-CoV-2_Proteome_Mutation_Burden_Atlas_Derived_From_103_Million_Amino_Acid_Sequences_Covering_S_N_M_E_ORF1a_ORF1b_ORF3a_ORF6_ORF7a_ORF7b_and_ORF8_Proteins_Spanning_2020_to_Q3-2025/30655373/1
Resource description:
This dataset presents the largest known per-protein amino acid mutation burden analysis of SARS-CoV-2 to date, encompassing 103,378,188 high-quality translated sequences from global surveillance.

For each of the 11 canonical SARS-CoV-2 proteins (structural, replicase, and accessory), we provide:
- Per-sequence mutation burden vs. the Wuhan-Hu-1 reference
- Year-stratified evolutionary statistics (where metadata is available)
- Chunked, analysis-ready records in JSON.zst and TSV.zst formats

SARS-CoV-2 Mutation Burden Summary
==================================================
S: 9398268 seqs, mean burden = 113.66
N: 9397174 seqs, mean burden = 37.83
M: 9398092 seqs, mean burden = 12.10
E: 9398041 seqs, mean burden = 2.24
ORF1a: 9398286 seqs, mean burden = 178.16
ORF1b: 9398287 seqs, mean burden = 6940.96
ORF3a: 9398086 seqs, mean burden = 5.92
ORF6: 9397993 seqs, mean burden = 1.04
ORF7a: 9397998 seqs, mean burden = 7.06
ORF7b: 9397956 seqs, mean burden = 2.47
ORF8: 9397712 seqs, mean burden = 4.93

Note: ORF1b burden reflects an alignment artifact due to frameshift-dependent expression; values are not biologically comparable to other genes.

Total Sequences: 103,378,188

Please cite this dataset if you use it. It was prepared for my upcoming journal article. Processing originally covered 2025 only and was later expanded to 2020 through 2025, so although a few folders are labeled 2025, all of them cover 2020 through Q3 2025.

Study & Data Processed by: TahirHB@Hotmail.Com
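The ORF1b note in the summary above may be easier to follow with a toy example. The description does not state how burden is computed, so the sketch below assumes a simple position-by-position count of amino acid differences against the reference. The function name `mutation_burden` and the short sequences are hypothetical and are not taken from the dataset. Under that assumption, a single substitution gives a burden of 1, while a one-residue shift in register makes nearly every position mismatch, which is the kind of inflation an alignment artifact can produce.

```python
def mutation_burden(seq: str, ref: str) -> int:
    """Count positions where seq differs from ref, compared position by position.

    Assumed definition for illustration only; the dataset's actual method
    is not documented in the description.
    """
    # Differences at positions present in both sequences.
    mismatches = sum(1 for a, b in zip(seq, ref) if a != b)
    # Positions present in only one sequence are counted as differences.
    return mismatches + abs(len(seq) - len(ref))


# Toy sequences, not real SARS-CoV-2 proteins.
ref = "MKTAYIAKQRQISFVK"
one_substitution = "MKTAYIAKQRQISFVR"  # last residue changed
shifted = ref[1:] + "X"                # same residues, shifted by one position

print(mutation_burden(ref, ref))
print(mutation_burden(one_substitution, ref))
print(mutation_burden(shifted, ref))
```

In this sketch the shifted sequence differs from the reference at every position even though its residues are almost identical, so a very large mean burden for one protein need not indicate a correspondingly large number of real mutations.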
Provider:
Bhatti, Tahir
Created:
2025-11-19



