
Mutation accumulation of main spike genotypes.

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://figshare.com/articles/dataset/Mutation_accumulation_of_main_spike_genotypes_/23527185
Description:
The global COVID-19 pandemic has lasted for three years since its outbreak, yet its origin remains unknown. Here, we analyzed the genotypes of 3.14 million SARS-CoV-2 genomes based on amino acid 614 of the Spike (S) protein and amino acid 84 of NS8 (nonstructural protein 8), and identified 16 linkage haplotypes. The GL haplotype (S_614G and NS8_84L) was the major haplotype driving the global pandemic and accounted for 99.2% of the sequenced genomes, while the DL haplotype (S_614D and NS8_84L) caused the pandemic in China in the spring of 2020 and accounted for approximately 60% of the genomes in China and 0.45% of the global genomes. The GS (S_614G and NS8_84S), DS (S_614D and NS8_84S), and NS (S_614N and NS8_84S) haplotypes accounted for 0.26%, 0.06%, and 0.0067% of the genomes, respectively. The main evolutionary trajectory of SARS-CoV-2 is DS→DL→GL, whereas the other haplotypes are minor byproducts of this evolution. Surprisingly, the newest haplotype, GL, had the oldest time of most recent common ancestor (tMRCA), with a mean of May 1, 2019, while the oldest haplotype, DS, had the newest tMRCA, with a mean of October 17, 2019. This indicates that the ancestral strains that gave rise to GL had gone extinct and been replaced by better-adapted newcomers at the place of origin, much like the sequential rise and fall of the Delta and Omicron variants. However, the DL haplotype arrived in China, where the GL strains had not yet arrived by the end of 2019, evolved into toxic strains, and ignited a pandemic there. The GL strains had spread all over the world before they were discovered and ignited the global pandemic, which went unnoticed until the virus was reported in China. However, the GL haplotype had little influence in China during the early phase of the pandemic because of its late arrival and the strict transmission controls in China.
Therefore, we propose two major onsets of the COVID-19 pandemic: one driven mainly by the DL haplotype in China, and the other driven by the GL haplotype globally.
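
The haplotype naming used in the description (first letter for the residue at S position 614, second letter for the residue at NS8 position 84) can be illustrated with a minimal Python sketch. The function names `haplotype` and `haplotype_shares` and the toy input are hypothetical and are not taken from the dataset or the authors' pipeline; the sketch assumes the two residues have already been called for each genome.

```python
from collections import Counter


def haplotype(s_614: str, ns8_84: str) -> str:
    """Build a two-letter label such as 'GL' from the residues at
    S position 614 and NS8 position 84 (hypothetical helper)."""
    return f"{s_614.upper()}{ns8_84.upper()}"


def haplotype_shares(calls):
    """Return the percentage of genomes carrying each haplotype.

    `calls` is an iterable of (S_614 residue, NS8_84 residue) pairs,
    one pair per genome.
    """
    counts = Counter(haplotype(s, n) for s, n in calls)
    total = sum(counts.values())
    return {label: 100.0 * count / total for label, count in counts.items()}


# Toy input, invented for illustration only (not real data).
toy_calls = [("G", "L"), ("G", "L"), ("D", "L"), ("D", "S")]
shares = haplotype_shares(toy_calls)
```

Applied to per-genome residue calls, the same tally would yield shares of the kind quoted in the description, although the authors' actual genotyping procedure is not described on this page.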

Created:
2023-06-15