Distinct Evolutionary Trajectory of SARS-CoV-2 Supported by Phylogenetic Analysis of 11 Coronaviruses: A Study of CpG Dynamics, Mutations, and Maximum Likelihood Tree Construction (RAxML) Relative to Wuhan-Hu-1 (NC_045512)
Source: DataCite Commons | Updated: 2025-04-02 | Indexed: 2025-05-07
Download link:
https://figshare.com/articles/dataset/Distinct_Evolutionary_Trajectory_of_SARS-CoV-2_Supported_by_Phylogenetic_Analysis_of_11_Coronaviruses_A_Study_of_CpG_Dynamics_Mutations_and_Maximum_Likelihood_Tree_Construction_RAxML_Relative_to_Wuhan-Hu-1_NC_045512_/28718957
Description:
This project analyzes the genomic and evolutionary features of 11 coronaviruses, including SARS-CoV-2, its predecessors, and related viruses. The analysis focuses on understanding CpG dynamics, mutation profiles, and phylogenetic relationships relative to the reference genome Wuhan-Hu-1 (NC_045512).

A key finding is the distinct evolutionary trajectory of SARS-CoV-2, which is strongly supported by phylogenetic evidence, suggesting that it has accumulated unique mutations that differentiate it from other coronaviruses.

Key Analyses:
Sequence Alignment: Sequences were aligned using MAFFT to identify conserved and divergent regions.
CpG Analysis: Sliding window analysis was performed to calculate CpG counts and observed/expected (O/E) ratios across the genomes, revealing patterns of CpG depletion consistent with evolutionary pressures.
Mutation Profiling: Mutations relative to Wuhan-Hu-1 were identified using custom Python scripts, highlighting the genetic divergence of SARS-CoV-2 and other coronaviruses.
Phylogenetic Analysis: RAxML was used to construct a maximum likelihood tree with 100 bootstrap replicates, providing robust statistical support for the branching patterns. Visualization using FigTree revealed SARS-CoV-2’s distinct placement in the tree, supported by high bootstrap values that underscore its unique evolutionary trajectory.

Dataset Includes:
Input GenBank files (.gb).
Combined FASTA file (combined_sequences.fasta) containing sequences from 11 coronaviruses.
Aligned sequences (aligned_sequences.fasta and aligned_sequences_renamed.fasta).
Mutation analysis results (mutations_analysis.txt).
RAxML output files for phylogenetic analysis:
  Best-scoring ML tree (RAxML_bestTree.output_tree).
  Bootstrap trees (RAxML_bootstrap.output_tree).
  ML tree with bootstrap support values (RAxML_bipartitions.output_tree).
Visualizations:
  Bar charts showing mutation counts per genome.
  Phylogenetic tree visualization (phylogenetic_tree.png).
Scripts used for data processing, alignment, mutation profiling, and phylogenetic analysis.
CpG analysis results (Civet_SARS_CoV_cpg_analysis.txt).

Methods:
Converted GenBank files to FASTA format using Biopython.
Aligned sequences using MAFFT to ensure accurate identification of conserved and divergent regions.
Identified mutations relative to Wuhan-Hu-1 using a custom Python script, enabling detailed comparison of genetic differences.
Performed phylogenetic analysis using RAxML, constructing a maximum likelihood tree with 100 bootstrap replicates to assess confidence in branching patterns.
Visualized results using Matplotlib and FigTree to interpret and communicate findings effectively.

Implications:
This work contributes to understanding the genomic architecture and evolutionary relationships of coronaviruses. Key findings include:
Evidence of CpG depletion across most coronaviruses, consistent with broader trends in viral genomics.
Identification of unique mutations in SARS-CoV-2 that differentiate it from other coronaviruses.
Strong phylogenetic support for SARS-CoV-2’s distinct evolutionary trajectory, as evidenced by high bootstrap values in the maximum likelihood tree.
These findings highlight the shared mechanisms of CpG depletion and mutation accumulation in coronaviruses while underscoring the unique evolutionary path of SARS-CoV-2.

Published on Figshare: 10.6084/m9.figshare.28718957

Special thanks to the open science community for tools like Biopython, MAFFT, RAxML, and FigTree that made this analysis possible.
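
The CpG and mutation-profiling steps described above can be illustrated with a minimal Python sketch. It is not the dataset's own script, which is not reproduced in this record. It assumes the commonly used ratio O/E = (CpG count × window length) / (C count × G count), a hypothetical window of 500 nt with a 100 nt step, and a simple column-by-column comparison of two already aligned sequences in which "-" marks a gap. All function names are illustrative.

```python
from typing import List, Tuple


def cpg_observed_expected(seq: str) -> float:
    """Return the CpG observed/expected ratio for one sequence or window.

    Assumed formula: (count of 'CG' * length) / (count of 'C' * count of 'G').
    Returns 0.0 when the window has no C or no G, to avoid dividing by zero.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def sliding_cpg(seq: str, window: int = 500, step: int = 100) -> List[Tuple[int, int, float]]:
    """Scan a genome and return (0-based start, CpG count, O/E) per window.

    The window and step sizes are illustrative defaults, not values taken
    from the dataset. A sequence shorter than one window yields one entry.
    """
    results = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        results.append((start, chunk.upper().count("CG"), cpg_observed_expected(chunk)))
    return results


def call_mutations(ref_aln: str, query_aln: str) -> List[Tuple[int, str, str]]:
    """List differences between two aligned sequences of equal length.

    Each entry is (1-based reference position, reference base, query base).
    Insertions relative to the reference (reference shows '-') are reported
    at the position of the preceding reference base.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have the same length")
    mutations = []
    ref_pos = 0
    for ref_base, query_base in zip(ref_aln.upper(), query_aln.upper()):
        if ref_base != "-":
            ref_pos += 1  # advance only on real reference bases
        if ref_base != query_base:
            mutations.append((ref_pos, ref_base, query_base))
    return mutations
```

In practice the aligned sequences would be read from a file such as aligned_sequences.fasta, with the Wuhan-Hu-1 row passed as `ref_aln`. Counting the entries returned by `call_mutations` for each genome would give the kind of per-genome totals shown in the bar charts, and O/E values well below 1 in `sliding_cpg` output would indicate CpG depletion in that window.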
Provider:
figshare
Created:
2025-04-02



