Limitations of Genomic Surveillance: A Biologically Plausible Synthetic SARS-CoV-2 Spike Variant Escapes Detection by Nextclade
Source: Figshare. Updated: 2025-07-05. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Limitations_of_Genomic_Surveillance_A_Biologically_Plausible_Synthetic_SARS-CoV-2_Spike_Variant_Escapes_Detection_by_Nextclade/29484023
Description:
I report the design, computational modeling, and genomic analysis of a synthetically engineered SARS-CoV-2 spike gene derived from the XFG lineage (XBB.1.5). The construct incorporates 19 biologically plausible amino acid substitutions, selected to preserve structural stability, ACE2 binding affinity, and codon pair usage consistent with natural Omicron evolution. Structural fidelity was validated using SWISS-MODEL and PyMOL visualization tools. Codon Pair Bias Index (CPBI) and Codon Adaptation Index (CAI) analyses revealed no significant deviation from the natural isolate population, indicating compatibility with human translation machinery. Mutation mapping confirmed overlap with known antigenic sites, suggesting potential immune escape features while maintaining functional constraints. Upon integration into a full-genome context and submission to Nextclade for lineage classification, the sequence was assigned to clade XFG with mutation patterns indistinguishable from circulating recombinants such as XBB.1.5.
Notably, key mutations including F486P, S373P, and Q954H were found within the natural variation reported in global databases.

Total mutations detected by Nextclade:
Total substitutions = 175 SNVs
Total deletions = 77 deletions
Total insertions = 33 insertions
Total frame shifts = 0
Total amino acid substitutions = 117 AA mutations
Total amino acid deletions = 20 AA deletions
Total amino acid insertions = 1 AA insertion

Spike protein level mutations described by Nextclade:
S:F2L, S:S13F, S:L18F, S:T22N, S:S31P, S:T33I, S:V90F, S:K182R, S:R190S, S:F194L, S:L226F, S:R346T, S:N370S, S:S383P, S:K444R, S:H445R, S:N487D, S:S490F, S:Q493E, S:F497P, S:N501Y, S:P621S, S:S680F, S:H681R, S:A903V, S:T941S, S:Q965H, S:V1065L, S:V1068I, S:T1077I, S:P1143L, S:Y1155H, S:L1197I, S:T1231S.

Some notable mutations given by Nextclade:
S:F456L: Frequently seen in JN.1
S:S383P, S:S373P: Found in newer Omicron lineages
S:H445R, S:K444R: Affects antibody binding
S:N450D: Enhances ACE2 binding
S:P621S: Fusion domain mutation
All of these are consistent with natural evolution and there were no red flags for engineering.

Founder mutations:
S:S31P, S:S27-, S:I68-, S:I212-. These are also consistent with the clade assignment.

Sequence Quality Report
QC overall score: 362.111 (considered bad, possibly because I introduced a high number of mutations)
QC missing: 4 (considered low)
Non-ACGTNs: 1 (which is minimal)
Coverage: ~99.8% (high quality)
CDS coverage: ~99.8% (good CDS alignment)

Conclusion:
Engineered sequences can be designed to fall within the bounds of natural variation, making them hard to detect using standard biosurveillance pipelines. This work demonstrates that engineered synthetic constructs can evolve under biological constraints and remain undetectable by current genomic surveillance systems.
These findings raise concerns about the ability of standard biosurveillance pipelines to identify engineered sequences embedded within expected evolutionary trajectories.

This work builds upon my previous study:
DOI: https://doi.org/10.5281/zenodo.15809071
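
The amino acid mutation labels in this description use a gene:reference-position-alternate notation (for example S:N501Y, with "-" marking a deletion). As a reading aid, the following is a minimal Python sketch of how such labels could be parsed and how one list of labels could be cross-checked against another. The names `AaMutation`, `parse_mutation`, and `split_by_presence` are hypothetical helpers introduced here and are not part of Nextclade. The two lists are transcribed from the description above, and the comparison is purely textual.

```python
import re
from dataclasses import dataclass

# Label shape assumed from the description: GENE:<ref aa><position><alt aa or "-">
_LABEL = re.compile(
    r"^(?P<gene>[A-Za-z0-9]+):(?P<ref>[A-Z*])(?P<pos>\d+)(?P<alt>[A-Z*-])$"
)


@dataclass(frozen=True)
class AaMutation:
    """One amino acid change, e.g. S:N501Y (hypothetical helper type)."""
    gene: str
    ref: str
    pos: int
    alt: str

    @property
    def is_deletion(self) -> bool:
        return self.alt == "-"


def parse_mutation(label: str) -> AaMutation:
    """Parse a label such as 'S:N501Y' or 'S:S27-' into its parts."""
    m = _LABEL.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    return AaMutation(m["gene"], m["ref"], int(m["pos"]), m["alt"])


def split_by_presence(queries, reported):
    """Return (present, absent): which query labels occur in the reported list."""
    reported_set = {parse_mutation(r) for r in reported}
    present = [q for q in queries if parse_mutation(q) in reported_set]
    absent = [q for q in queries if parse_mutation(q) not in reported_set]
    return present, absent


# Spike-level list as transcribed from the description above.
REPORTED_SPIKE = [s.strip() for s in (
    "S:F2L, S:S13F, S:L18F, S:T22N, S:S31P, S:T33I, S:V90F, S:K182R, "
    "S:R190S, S:F194L, S:L226F, S:R346T, S:N370S, S:S383P, S:K444R, "
    "S:H445R, S:N487D, S:S490F, S:Q493E, S:F497P, S:N501Y, S:P621S, "
    "S:S680F, S:H681R, S:A903V, S:T941S, S:Q965H, S:V1065L, S:V1068I, "
    "S:T1077I, S:P1143L, S:Y1155H, S:L1197I, S:T1231S"
).split(",")]

# Labels from the "notable mutations" list, also transcribed from above.
NOTABLE = ["S:F456L", "S:S383P", "S:S373P", "S:H445R",
           "S:K444R", "S:N450D", "S:P621S"]

if __name__ == "__main__":
    present, absent = split_by_presence(NOTABLE, REPORTED_SPIKE)
    print("listed in spike-level report:", present)
    print("not listed in spike-level report:", absent)
```

Running the script prints which of the notable labels also appear in the spike-level list. It is a bookkeeping check on the reported text only and makes no statement about the biology of any mutation.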
Created:
2025-07-05



