Data for: Connecting MHC-I-binding motifs with HLA alleles via deep learning
Mendeley Data, indexed 2026-04-18
Download link:
https://data.mendeley.com/datasets/c249p8gdzd
Description:
This dataset contains the research data supporting the study "Connecting MHC-I-binding motifs with HLA alleles via deep learning".
1. MHCI_res182_seq.json: the peptide-binding cleft sequence of each MHC-I allele extracted from the IPD-IMGT/HLA database (version 3.41.0)
2. MHCI_res182_onehot.npy: the one-hot encoding of the peptide-binding cleft sequence of each MHC-I allele
3. dataframe.tar.gz: this folder contains the training, validation, and benchmark datasets
    [the common columns]
        1. sequence: the peptide sequence
        2. peptide_length: the length of the peptide sequence
        3. mhc: the MHC-I allele
        4. meas: the binding affinity (assay data only)
        5. value: the value (between 0 and 1) converted from the binding affinity
        6. bind: the binding label
        7. source: the data source (assay, MS, random decoy, or peptide decoy)
    1. train_hit.csv
        1. data: measurements extracted from IEDB for the training process
        2. columns
            1. the common columns
            2. MHCfovea: the prediction score of MHCfovea (used for ScoreCAM analysis)
    2. train_decoy_{1-90}.csv
        1. data: artificial decoy peptides for the training process; the number of decoys in each file is approximately equal to the number of eluted peptides
        2. columns
            1. the common columns
    3. valid.csv
        1. data: measurements extracted from IEDB and decoy peptides for validation
        2. columns
            1. the common columns
            2. batch_size_{16, 32, 64}_and_learning_rate_{0.00100, 0.00010, 0.00001}: for optimizing the batch size and learning rate hyperparameters
            3. DE_{1, 5, 10, 15, 30}_and_{30, 60, 90}: for optimizing the D-E ratios of the training and downsized datasets; the first number is the D-E ratio of the downsized dataset and the second number is the D-E ratio of the training dataset
    4. benchmark.csv
        1. data: eluted peptides extracted from IEDB and decoy peptides for the testing process
        2. columns
            1. the common columns
            2. NetMHCpan4.1, MHCflurry2.0, MixMHCpred2.1, MHCfovea: the prediction score of each predictor
4. allele_expansion.tar.gz: this folder contains the data for the allele expansion
    1. peptides.csv: the peptides used for allele expansion
    2. output/{MHC-I group}
        1. allele.json: the alleles of the MHC-I group
        2. motif.npy: the binding motifs for each allele
        3. prediction.npy: the prediction scores for each allele (in the same order as peptides.csv)
5. MHCfovea-v1.0.0.zip: the source code of MHCfovea
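
The brace patterns used in the listing above (train_decoy_{1-90}.csv, batch_size_{...}_and_learning_rate_{...}, DE_{...}_and_{...}) can be expanded into concrete names with a short Python sketch. This is only an illustration: the function names are hypothetical, and it assumes that each pattern expands to the full set of combinations and that the values are spelled exactly as printed in the listing (for example 0.00100 rather than 0.001). Check the names against the actual files before relying on them.

```python
from itertools import product


def train_decoy_files():
    """File names implied by train_decoy_{1-90}.csv (assumed inclusive range)."""
    return [f"train_decoy_{i}.csv" for i in range(1, 91)]


def hyperparameter_columns():
    """Column names implied by the batch-size / learning-rate pattern in valid.csv.

    Learning rates are kept as strings so the trailing zeros shown in the
    listing are preserved.
    """
    batch_sizes = (16, 32, 64)
    learning_rates = ("0.00100", "0.00010", "0.00001")
    return [
        f"batch_size_{b}_and_learning_rate_{lr}"
        for b, lr in product(batch_sizes, learning_rates)
    ]


def de_ratio_columns():
    """Column names implied by DE_{downsized}_and_{training} in valid.csv.

    Assumes every downsized/training combination is present, which the
    listing does not state explicitly.
    """
    downsized = (1, 5, 10, 15, 30)
    training = (30, 60, 90)
    return [f"DE_{d}_and_{t}" for d, t in product(downsized, training)]
```

After extracting dataframe.tar.gz, these lists could be compared with the actual directory contents and with the header of valid.csv to see which combinations are really present.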
Created:
2021-09-10



