Structural modelling and biophysical analyses reveal a dimeric coiled-coil architecture in the FAZ10 central region of Trypanosoma brucei
Source: DataCite Commons. Updated: 2026-05-05. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19234680
Description:
This Zenodo record contains the datasets associated with the FAZ10 study. The deposit includes experimental data, modeling predictions, all-atom molecular dynamics (AA-MD) simulations, coarse-grained molecular dynamics (CG-MD) simulations with GōMartini 3, and ESPResSo polymer reference simulations used for structural and conformational analysis.
The dataset is organized into several compressed archives:
1. Experimental_Data.tar.xz
This archive contains experimental data derived from the biophysical characterization of the FAZ10 central region and its individual domains (coiled-coil and globular). The archive is organized into three main directories corresponding to the experimental techniques used: Size Exclusion Chromatography (SEC), SEC coupled with Multi-Angle Light Scattering (SEC–MALS), and Circular Dichroism (CD).
SEC directory
This directory contains chromatographic profiles for each recombinant protein analyzed using an ÄKTA Purifier 10 system (GE Healthcare Life Sciences) coupled to a Superdex 200 column. It includes:
SEC_CentralRegion.csv: elution profile of the FAZ10 central region.
SEC_CoiledCoilDomain.csv: elution profile of the coiled-coil domain.
SEC_GlobularDomain.csv: elution profile of the globular domain.
SEC–MALS directory
This directory contains data obtained from size exclusion chromatography coupled to a miniDAWN® TREOS® multi-angle light scattering detector and an Optilab T-rEX differential refractometer (Wyatt Technology), enabling determination of oligomeric state and molecular mass. It includes:
SEC-MALS_CentralRegion.csv: data for the FAZ10 central region.
SEC-MALS_CoiledCoilDomain.csv: data for the coiled-coil domain.
SEC-MALS_GlobularDomain.csv: data for the globular domain.
Circular_Dichroism directory
This directory contains experimental and theoretical CD spectra for the FAZ10 central region and its globular domain. Experimental spectra were normalized based on protein concentration and residue number using CDTool. Theoretical spectra were calculated from AlphaFold2 structural models using PDBMD2CD and scaled to match the experimental spectra at 222 nm for comparison. It includes:
CD_CentralRegion_experimental_normalized.csv: experimental CD data for the central region.
CD_CentralRegion_theoretical_normalized.csv: theoretical CD data for the central region.
CD_GlobularDomain_experimental_normalized.csv: experimental CD data for the globular domain.
CD_GlobularDomain_theoretical_normalized.csv: theoretical CD data for the globular domain.
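The scaling step described for the theoretical spectra can be sketched as follows. This is a minimal illustration, not the procedure used by the authors: it assumes each spectrum is held as a list of (wavelength in nm, signal) pairs and that 222 nm is sampled in both. The function name and the toy values are hypothetical, and the column layout of the deposited CSV files is not described in this record.

```python
def scale_to_match(theoretical, experimental, wavelength_nm=222.0, tol=1e-6):
    """Scale `theoretical` so its signal equals `experimental` at `wavelength_nm`.

    Both inputs are lists of (wavelength_nm, signal) pairs (an assumed layout).
    Returns the scaled theoretical spectrum and the scale factor applied.
    """
    def signal_at(spectrum):
        # Look up the signal at the reference wavelength.
        for wl, value in spectrum:
            if abs(wl - wavelength_nm) <= tol:
                return value
        raise ValueError(f"{wavelength_nm} nm is not sampled in the spectrum")

    theo_ref = signal_at(theoretical)
    if theo_ref == 0:
        raise ValueError("theoretical signal is zero at the reference wavelength")
    factor = signal_at(experimental) / theo_ref
    return [(wl, value * factor) for wl, value in theoretical], factor


# Toy values for illustration only (not data from the archive).
theo = [(208.0, -12.0), (222.0, -10.0)]
expt = [(208.0, -21.0), (222.0, -20.0)]
scaled, factor = scale_to_match(theo, expt)
```

A single multiplicative factor preserves the shape of the theoretical spectrum, so any remaining differences away from 222 nm reflect differences in band shape rather than in overall amplitude.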
2. AlphaFold2.tar.xz
This dataset contains the structural prediction of the FAZ10 central region using AlphaFold2. The prediction was performed for the FAZ10 central region as a homodimer. The dataset includes:
config.json: Configuration parameters used for the prediction.
Faz10_center_dimer.a3m: Multiple sequence alignment (MSA) used as input for structure prediction.
Predicted_aligned_error_v1.json: Raw PAE data in JSON format.
Rank_001_model_2.pdb: Unrelaxed version of the top-ranked model.
Rank_002_model_1.pdb: Second-ranked model (model 1).
Rank_001_model_2_relaxed.pdb: Relaxed structure (AMBER refinement) selected for MD simulations (model 2, ranked 1).
Rank_003_model_4.pdb: Third-ranked model (model 4).
Rank_004_model_3.pdb: Fourth-ranked model (model 3).
Rank_005_model_5.pdb: Fifth-ranked model (model 5).
3. In_silico_predictions.tar.xz
This dataset contains in silico predictions derived from the analysis of the full-length FAZ10 protein and its central region using computational tools. The dataset includes:
IUPred2A_disorder.xlsx: File containing per-residue intrinsic disorder scores for the full-length FAZ10 protein, calculated using IUPred2A in long disorder mode. Values above the standard threshold (0.5) indicate regions with a high propensity for intrinsic disorder.
MARCOIL_coiledcoil_SAH.xlsx: File containing coiled-coil probability (P-score) and SAH window scores predicted using the MARCOIL algorithm implemented in the Waggawagga platform. These predictions identify regions with coiled-coil propensity and potential single alpha-helix behavior.
PLIP_report_region.txt: Report of molecular interactions identified within the FAZ10 central region, including hydrophobic contacts, hydrogen bonds, salt bridges, and π–cation interactions. The analysis was performed using PLIP, taking as input the last frame of atomistic molecular dynamics (AA-MD) simulations.
PLIP_input_last_frame_central_region.pdb: PDB file corresponding to the final frame of the atomistic molecular dynamics simulation of the FAZ10 central region. This structure was used as input for the PLIP interaction analysis, ensuring full reproducibility of the reported interactions.
PLIP_output_last_frame_session.pse: PyMOL session file corresponding to the last frame of the FAZ10 central region after PLIP analysis. This file enables direct visualization of the detected interactions and their structural context.
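The 0.5 threshold mentioned for the IUPred2A scores can be applied with a few lines of code. The sketch below assumes the per-residue scores have already been read from the spreadsheet into a plain list ordered by residue number. The function name and the toy scores are hypothetical.

```python
def disordered_segments(scores, threshold=0.5):
    """Return 1-based inclusive (start, end) residue ranges with score > threshold."""
    segments = []
    start = None
    for index, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = index  # open a new segment
        elif start is not None:
            segments.append((start, index - 1))  # close the current segment
            start = None
    if start is not None:
        # The last segment runs to the end of the sequence.
        segments.append((start, len(scores)))
    return segments


# Toy scores for illustration only (not values from the archive).
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.9]
segments = disordered_segments(toy_scores)
```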
4. AA_MD_1_us.tar.xz
This archive contains the AA-MD dataset for FAZ10 (100 ns per replica). It includes:
FAZ10_dry.pdb: reference structure of FAZ10 used as the topology/coordinate file for trajectory analysis.
output_fit_R1_10.xtc
output_fit_R2_10.xtc
output_fit_R3_10.xtc
The output_fit_*.xtc files are fitted/processed trajectories corresponding to independent AA-MD replicas without solvent. The compressed coordinates were written every 5000 steps with a timestep of 0.002 ps, corresponding to one saved frame every 10 ps (0.01 ns).
5. CG_MD_50_us.tar.xz
This archive contains the CG-MD dataset for FAZ10 (10 μs per replica). It includes:
FAZ10_cg_dry.pdb: coarse-grained reference structure used for trajectory analysis.
output_fit_R1_10.xtc
output_fit_R2_10.xtc
output_fit_R3_10.xtc
output_fit_R4_10.xtc
output_fit_R5_10.xtc
The output_fit_*.xtc files are fitted/processed trajectories from independent CG-MD replicas without solvent. The compressed coordinates were written every 5000 steps with a timestep of 0.02 ps, corresponding to one saved frame every 100 ps (0.1 ns).
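The frame spacing quoted for the AA-MD and CG-MD trajectories is the number of steps between writes multiplied by the timestep. The short sketch below reproduces that arithmetic. The frame counts it derives assume that the deposited trajectories keep every saved frame and exclude the initial frame, which the record does not state. The function names are illustrative.

```python
def frame_interval_ps(steps_between_frames, timestep_ps):
    """Time between saved frames, in ps."""
    return steps_between_frames * timestep_ps


def frames_per_replica(replica_length_ns, interval_ps):
    """Number of saved frames in one replica (initial frame not counted)."""
    return round(replica_length_ns * 1000.0 / interval_ps)


# AA-MD: 5000 steps x 0.002 ps; CG-MD: 5000 steps x 0.02 ps (values from the text).
aa_interval = frame_interval_ps(5000, 0.002)
cg_interval = frame_interval_ps(5000, 0.02)

# Replica lengths from the text: 100 ns (AA-MD) and 10 us = 10,000 ns (CG-MD).
aa_frames = frames_per_replica(100, aa_interval)
cg_frames = frames_per_replica(10_000, cg_interval)
```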
6. ESPRESSO_MD.tar.xz
This archive contains the ESPResSo polymer simulation files used as a reference dataset for polymer-like behavior analyses. It includes:
polymer_md_507.py: ESPResSo simulation script.
polymer_507_saw.vtf: trajectory/structure file generated in VTF format.
polymer_507_saw.log: simulation log file.
polymer_507_saw_Ree_ns_nm.dat: end-to-end distance (Ree) as a function of time, reported in ns and nm.
polymer_507_saw_Rg_ns_nm.dat: radius of gyration (Rg) as a function of time, reported in ns and nm.
polymer_507_saw_Ree_over_Rg_ns.dat: time series of the Ree/Rg ratio.
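For readers who want to recompute these observables from bead coordinates, the following sketch shows the standard definitions of Ree, Rg, and their ratio for a single frame. It assumes equal-mass beads given as (x, y, z) tuples in chain order. It is not taken from polymer_md_507.py, and the function names are hypothetical.

```python
import math


def end_to_end_distance(coords):
    """Distance between the first and last bead of the chain."""
    return math.dist(coords[0], coords[-1])


def radius_of_gyration(coords):
    """Root-mean-square distance of the beads from their centroid (equal masses)."""
    n = len(coords)
    center = [sum(p[i] for p in coords) / n for i in range(3)]
    mean_sq = sum(math.dist(p, center) ** 2 for p in coords) / n
    return math.sqrt(mean_sq)


# Toy frame: three beads on a straight line, unit spacing (illustration only).
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ree = end_to_end_distance(frame)
rg = radius_of_gyration(frame)
ratio = ree / rg
```

Applying the two functions frame by frame gives time series of the same kind as the .dat files listed above. The ratio is dimensionless, so it does not depend on the length unit.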
Together, these files provide the simulation trajectories, reference structures, and derived observables used in the manuscript. The dataset is intended to support reproducibility, reanalysis, and comparison of the structural and conformational behavior of FAZ10 across atomistic, coarse-grained, and polymer-reference models.
7. DATA_analysis.tar.xz
This archive contains the derived analysis data obtained from the AA-MD and CG-MD simulations of FAZ10. The archive is organized into two main directories, AA and CG, each containing the processed outputs used for structural and conformational analysis.
AA directory
The AA folder contains the analysis outputs derived from the AA-MD simulations. It includes:
E2E: end-to-end distance data in .dat format for chain 1, chain 2, and the full system, each for replicas R1-R3
Rg: radius of gyration data in .xvg format for chain 1, chain 2, and the full system, each for replicas R1-R3
RMSD: RMSD time series in .xvg format for chain 1, chain 2, and the full system, each for replicas R1-R3
RMSF: residue-wise RMSF profiles in .xvg format for chain 1, chain 2, and the full system, each for replicas R1-R3
Thus, the AA analysis set provides processed observables for three independent replicas at both chain and whole-system levels.
CG directory
The CG folder contains the analysis outputs derived from the CG-MD simulations. It includes:
angle: three-dimensional angle time series in .txt format for the full system
E2E: end-to-end distance data in .dat format
RG: radius of gyration data in .xvg format
RMSD: RMSD time series in .xvg format
RMSF: residue-wise RMSF profiles in .xvg format
For the CG dataset:
chain-level analyses (chain1, chain2) are provided for replicas R1-R3
system-level analyses (system) are provided for replicas R1-R5
This organization reflects the processed CG-MD dataset used for comparison with the AA simulations across multiple structural descriptors.
File formats
.dat files contain plain-text numerical time series, used here for end-to-end distance analyses.
.xvg files are GROMACS-formatted text outputs containing time series or residue-based profiles for radius of gyration, RMSD, and RMSF.
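The .xvg files can be read without GROMACS, because header and plot-directive lines start with `#` or `@` and the remaining lines are whitespace-separated numbers. The parser below is a minimal sketch under that assumption. It does not handle multi-dataset files separated by `&`, and the example text is made up rather than copied from the archive.

```python
def parse_xvg(text):
    """Return the numeric rows of an .xvg file as lists of floats.

    Lines starting with '#' (comments) or '@' (plot directives) are skipped.
    """
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#@":
            continue
        rows.append([float(field) for field in stripped.split()])
    return rows


# Made-up example content (illustration only).
example = """# comment line
@    title "Radius of gyration"
@    xaxis  label "Time (ps)"
0.0   2.10
10.0  2.15
"""
rows = parse_xvg(example)
```

To read a file from disk, pass its contents to the function, for example `parse_xvg(open(path).read())`, where `path` points to one of the extracted .xvg files.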
Provider:
Zenodo
Created:
2026-04-09



