Sequence-structure-function relationships in class I MHC: a local frustration perspective

NIAID Data Ecosystem, indexed 2026-03-11

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.gmsbcc2hx


Description:

Class I Major Histocompatibility Complex (MHC) binds short antigenic peptides with the help of the Peptide Loading Complex (PLC), and presents them to T-cell Receptors (TCRs) of cytotoxic T-cells and Killer-cell Immunoglobulin-like Receptors (KIRs) of Natural Killer (NK) cells. With more than 10,000 alleles, the Human Leukocyte Antigen (HLA) chain of MHC is the most polymorphic protein in humans. This allelic diversity provides wide coverage of peptide sequence space, yet does not affect the three-dimensional structure of the complex. Moreover, TCRs mostly interact with pMHC in a common diagonal binding mode, and the KIR-pMHC interaction is allele-dependent. With the aim of establishing a framework for understanding the relationships between polymorphism (sequence), structure (conserved fold) and function (protein interactions) of the MHC, we performed a local frustration analysis on pMHC homology models covering 1436 HLA I alleles. The local frustration profiles indicated that (1) variations in the MHC fold are unlikely, because residues within the HLA peptide-binding groove are minimally frustrated and relatively conserved, (2) high-frustration patches on HLA helices are either involved in or near the interaction sites of MHC with the TCR, KIR, or Tapasin of the PLC, and (3) peptide ligands mainly stabilize the F-pocket of the HLA binding groove.

Methods

Data collection for records_matureHLA.fasta: The sequences contained in this FASTA file were obtained from the IMGT/HLA dataset. Only HLA binding groove residues are included (residues 1-180).

Data collection for data_frame_SRFI.csv: This table mainly contains single residue frustration index (SRFI) data from pMHC structures covering 1436 HLA Class I alleles, each in complex with 3-10 nonamer peptides. For each allele, 3-10 high-affinity peptide ligands were predicted using netMHCpan 3.0, then homology models were created using Modeller v9.19. Local frustration analysis was then carried out using frustratometer2 (the stand-alone version from https://github.com/gonzaparra/frustratometer2 was used). The column names are as obtained from frustratometer2. The FrstIndex column contains the single residue frustration index values. The SASA, RSASA, Peptide and Allele columns contain the position-specific Solvent Accessible Surface Area (SASA), relative SASA, peptide sequences and allele names, respectively.

Data collection for df_SF_R_20200428.csv: This table is a "reduced" version of data_frame_SRFI.csv, with the following columns:

Allele: Specific HLA I allele name.
Sequence: Binding-groove sequence.
Chain: The chain on which the respective position is located.
Res: Position of the residue within the sequence.
AA: One-letter amino acid code.
ChainRes: A concatenated string of the Chain and Res fields.
SASA, RSASA: As described above.
FI_mean, FI_mean_sd, FI_median: Mean and median SRFI values calculated for each position using data_frame_SRFI. FI_mean_sd denotes the standard deviation of the mean SRFI values.
rvET: Position-specific real-value Evolutionary Trace scores.
FI_median_diff: The difference in median SRFI values upon peptide binding.
Locus: Gene locus (A, B or C).
Core Allele: True if the allele is among the core alleles reported by Robinson et al. (2017) (https://doi.org/10.1371/journal.pgen.1006862); False otherwise.
Pocket: The peptide-binding pocket in which the respective residue is located. None if the residue is not included in any pocket.
Interface: True if the residue is a protein-protein interface residue within the MHC.
SS: Secondary-structure assignment.
Domain: The structural domain in which the respective residue is located.
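The per-position summary columns of the reduced table (FI_mean, FI_mean_sd, FI_median) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a long-format table with one row per allele, peptide and residue position, and it reads FI_mean_sd as the sample standard deviation of FrstIndex across the peptide models of one allele, which is one possible interpretation. The toy rows, the FrstIndex values in them, and the function name summarize_srfi are hypothetical; only the column names Allele, Peptide, Res, AA and FrstIndex come from the description above.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, median, stdev

# Toy rows in the ASSUMED layout of data_frame_SRFI.csv.
# The FrstIndex values are made up for illustration only.
TOY_CSV = """Allele,Peptide,Res,AA,FrstIndex
A*02:01,LLFGYPVYV,1,G,0.50
A*02:01,GILGFVFTL,1,G,0.70
A*02:01,NLVPMVATV,1,G,0.90
A*02:01,LLFGYPVYV,2,S,-1.20
A*02:01,GILGFVFTL,2,S,-0.80
A*02:01,NLVPMVATV,2,S,-1.00
"""


def summarize_srfi(handle):
    """Aggregate FrstIndex over peptide models for each (Allele, Res) pair."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["Allele"], int(row["Res"]))
        groups[key].append(float(row["FrstIndex"]))

    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "FI_mean": mean(values),
            # Assumption: spread across peptide models (sample SD).
            "FI_mean_sd": stdev(values) if len(values) > 1 else 0.0,
            "FI_median": median(values),
            "n_peptides": len(values),
        }
    return summary


if __name__ == "__main__":
    result = summarize_srfi(io.StringIO(TOY_CSV))
    for (allele, res), stats in sorted(result.items()):
        print(allele, res, round(stats["FI_mean"], 3), round(stats["FI_median"], 3))
```

To run this on the real file, the io.StringIO(TOY_CSV) argument would be replaced by an open file handle for data_frame_SRFI.csv, after confirming that the actual header matches the column names assumed here.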

Created:

2020-04-28
