Limitations of Genomic Surveillance: A Biologically Plausible Synthetic SARS-CoV-2 Spike Variant Escapes Detection by Nextclade
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://figshare.com/articles/dataset/Limitations_of_Genomic_Surveillance_A_Biologically_Plausible_Synthetic_SARS-CoV-2_Spike_Variant_Escapes_Detection_by_Nextclade/29484023
Description:
I report the design, computational modeling, and genomic analysis of a synthetically engineered SARS-CoV-2 spike gene derived from the XFG lineage (XBB.1.5). The construct incorporates 19 biologically plausible amino acid substitutions, selected to preserve structural stability, ACE2 binding affinity, and codon pair usage consistent with natural Omicron evolution.
Structural fidelity was validated using SWISS-MODEL and PyMOL visualization tools. Codon Pair Bias Index (CPBI) and Codon Adaptation Index (CAI) analyses revealed no significant deviation from the natural isolate population, indicating compatibility with human translation machinery.
Mutation mapping confirmed overlap with known antigenic sites, suggesting potential immune escape features while maintaining functional constraints.
Upon integration into a full-genome context and submission to Nextclade for lineage classification, the sequence was assigned to clade XFG with mutation patterns indistinguishable from circulating recombinants such as XBB.1.5. Notably, key mutations including F486P, S373P, and Q954H were found within the natural variation reported in global databases.
Total mutations detected by Nextclade:
Total substitutions = 175 SNVs
Total deletions = 77 deletions
Total insertions = 33 insertions
Total frame shifts = 0
Total amino acid substitutions = 117 AA mutations
Total amino acid deletions = 20 AA deletions
Total amino acid insertions = 1 AA insertion
Spike protein-level mutations reported by Nextclade:
S:F2L, S:S13F, S:L18F, S:T22N, S:S31P, S:T33I, S:V90F, S:K182R,
S:R190S, S:F194L, S:L226F, S:R346T, S:N370S, S:S383P,
S:K444R, S:H445R, S:N487D, S:S490F, S:Q493E, S:F497P, S:N501Y,
S:P621S, S:S680F, S:H681R, S:A903V, S:T941S, S:Q965H,
S:V1065L, S:V1068I, S:T1077I, S:P1143L, S:Y1155H, S:L1197I,
S:T1231S.
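For readers who want to work with this list programmatically, the following is a minimal sketch of parsing the `gene:RefPosAlt` labels into structured records. The names `AaSub`, `parse_mutation`, and `parse_list` are hypothetical helpers introduced here for illustration; they are not part of Nextclade, and the list is simply transcribed from the text above.

```python
import re
from typing import List, NamedTuple


class AaSub(NamedTuple):
    """One amino acid change, e.g. S:N501Y (hypothetical helper type)."""
    gene: str
    ref: str
    pos: int
    alt: str


# gene, reference residue, 1-based position, new residue ('-' marks a deletion)
_LABEL = re.compile(r"^([A-Za-z0-9]+):([A-Z*])(\d+)([A-Z*-])$")


def parse_mutation(token: str) -> AaSub:
    """Parse a single label such as 'S:F2L' into an AaSub record."""
    cleaned = token.strip().rstrip(".,")
    match = _LABEL.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {token!r}")
    gene, ref, pos, alt = match.groups()
    return AaSub(gene, ref, int(pos), alt)


def parse_list(text: str) -> List[AaSub]:
    """Parse a comma-separated, possibly line-wrapped list of labels."""
    tokens = text.replace("\n", " ").split(",")
    return [parse_mutation(t) for t in tokens if t.strip()]


# The spike list as transcribed from this page.
SPIKE_LIST = """S:F2L, S:S13F, S:L18F, S:T22N, S:S31P, S:T33I, S:V90F, S:K182R,
S:R190S, S:F194L, S:L226F, S:R346T, S:N370S, S:S383P,
S:K444R, S:H445R, S:N487D, S:S490F, S:Q493E, S:F497P, S:N501Y,
S:P621S, S:S680F, S:H681R, S:A903V, S:T941S, S:Q965H,
S:V1065L, S:V1068I, S:T1077I, S:P1143L, S:Y1155H, S:L1197I,
S:T1231S."""

subs = parse_list(SPIKE_LIST)
print(len(subs), "spike substitutions parsed")
```

Keeping the position as an integer makes it straightforward to sort or filter the records by residue number later.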
Some notable mutations reported by Nextclade:
S:F456L: Seen frequently in JN.1
S:S383P, S:S373P: Found in newer Omicron lineages
S:H445R, S:K444R: Affect antibody binding
S:N450D: Enhances ACE2 binding
S:P621S: Fusion domain mutation
All of these are consistent with natural evolution and there were no red flags for engineering.
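When a summary singles out individual mutations, it can be useful to confirm that each one also appears in the full list it is drawn from. The sketch below does that with plain set membership; `LISTED` and `CITED` are hypothetical variable names, and both lists are transcribed from this page rather than taken from any Nextclade output file.

```python
# Spike substitutions listed on this page (transcribed, not re-derived).
LISTED = {
    "S:F2L", "S:S13F", "S:L18F", "S:T22N", "S:S31P", "S:T33I", "S:V90F",
    "S:K182R", "S:R190S", "S:F194L", "S:L226F", "S:R346T", "S:N370S",
    "S:S383P", "S:K444R", "S:H445R", "S:N487D", "S:S490F", "S:Q493E",
    "S:F497P", "S:N501Y", "S:P621S", "S:S680F", "S:H681R", "S:A903V",
    "S:T941S", "S:Q965H", "S:V1065L", "S:V1068I", "S:T1077I", "S:P1143L",
    "S:Y1155H", "S:L1197I", "S:T1231S",
}

# Mutations singled out as notable, in the order they are mentioned.
CITED = [
    "S:F456L", "S:S383P", "S:S373P", "S:H445R",
    "S:K444R", "S:N450D", "S:P621S",
]

# Labels that are cited as notable but absent from the full list.
not_listed = [label for label in CITED if label not in LISTED]

for label in CITED:
    status = "listed" if label in LISTED else "not in list"
    print(f"{label:10s} {status}")
```

With the lists as transcribed here, S:F456L, S:S373P, and S:N450D do not appear in the spike list; the page does not explain the difference.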
Founder Mutations:
S:S31P, S:S27-, S:I68-, S:I212-
These are also consistent with the expected clade assignment.
Sequence Quality Report:
QC overall score: 362.111 (considered bad; possibly because I introduced a high number of mutations)
QC missing: 4 (considered low)
Non-ACGTN characters: 1 (minimal)
Coverage: ~99.8% (high quality)
CDS coverage: ~99.8% (good CDS alignment)
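To put the overall QC score in context, here is a minimal sketch of mapping a score to a status label. The cut-offs used (below 30 good, 30 to 99 mediocre, 100 and above bad) are an assumption based on the bands described in the Nextclade documentation and should be checked against the version used; `qc_status` is a hypothetical helper, not a Nextclade function.

```python
def qc_status(score: float) -> str:
    """Map an overall QC score to a status label.

    Thresholds are assumed, not taken from this dataset:
    < 30 good, 30 to < 100 mediocre, >= 100 bad.
    """
    if score < 0:
        raise ValueError("QC score cannot be negative")
    if score < 30:
        return "good"
    if score < 100:
        return "mediocre"
    return "bad"


# Overall score reported in the quality report on this page.
reported_score = 362.111
print(reported_score, "->", qc_status(reported_score))
```

Under these assumed bands, the reported score falls well inside the "bad" range, which matches the label given in the report.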
Conclusion:
Engineered sequences can be designed to fall within the bounds of natural variation, making them hard to detect using standard biosurveillance pipelines.
This work demonstrates that synthetic constructs designed under biological constraints can be assigned to a natural lineage by a standard classification tool such as Nextclade without being identified as engineered. These findings raise concerns about the ability of standard biosurveillance pipelines to identify engineered sequences embedded within expected evolutionary trajectories.
This work builds upon my previous study:
DOI: https://doi.org/10.5281/zenodo.15809071
Created:
2025-07-05



