Data for: Connecting MHC-I-binding motifs with HLA alleles via deep learning

Source: Mendeley Data. Updated: 2021-07-09. Indexed: 2026-04-09.
Download link:
https://data.mendeley.com/datasets/c249p8gdzd/2
Description:
This dataset contains the research data supporting the study “Connecting MHC-I-binding motifs with HLA alleles via deep learning”.

1. MHCI_res182_seq.json: the peptide-binding cleft sequence of each MHC-I allele, extracted from the IPD-IMGT/HLA database (version 3.41.0)
2. MHCI_res182_onehot.npy: the one-hot encoding of the peptide-binding cleft sequence of each MHC-I allele
3. dataframe.tar.gz: this folder contains the training, validation, and benchmark datasets
   [the common columns]
      1. sequence: the peptide sequence
      2. peptide_length: the length of the peptide sequence
      3. mhc: the MHC-I allele
      4. meas: binding affinity for assay data
      5. value: the value (between 0 and 1) converted from the binding affinity
      6. bind: the binding label
      7. source: data source (assay, MS, random decoy, or peptide decoy)
   1. train_hit.csv
      1. data: measurements extracted from IEDB for the training process
      2. columns
         1. the common columns
         2. MHCfovea: the prediction score of MHCfovea (used for ScoreCAM analysis)
   2. train_decoy_{1-90}.csv
      1. data: artificial decoy peptides for the training process; the number of entries in each file is approximately equal to the number of eluted peptides
      2. columns
         1. the common columns
   3. valid.csv
      1. data: measurements extracted from IEDB and decoy peptides for validation
      2. columns
         1. the common columns
         2. batch_size_{16, 32, 64}_and_learning_rate_{0.00100, 0.00010, 0.00001}: for optimizing the batch size and learning rate hyperparameters
         3. DE_{1, 5, 10, 15, 30}_and_{30, 60, 90}: for optimizing the D-E ratio of the training and downsized datasets; the first number is the D-E ratio of the downsized dataset and the second number is the D-E ratio of the training dataset
   4. benchmark.csv
      1. data: eluted peptides extracted from IEDB and decoy peptides for the testing process
      2. columns
         1. the common columns
         2. NetMHCpan4.1, MHCflurry2.0, MixMHCpred2.1, MHCfovea: the prediction score of each predictor
4. allele_expansion.tar.gz: this folder contains data for the allele expansion
   1. peptides.csv: peptides used for allele expansion
   2. output/{MHC-I group}
      1. allele.json: alleles of the MHC-I group
      2. motif.npy: binding motifs for each allele
      3. prediction.npy: prediction score for each allele (the order is the same as that of peptides.csv)
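As a rough illustration of how the common columns listed above might be checked after extracting dataframe.tar.gz, the following sketch reads one CSV with the Python standard library, verifies that the seven common columns are present, and counts rows per `source`. The function name `summarize`, the toy rows in `SAMPLE`, and the allele spelling used there are assumptions made for illustration only; they are not taken from the dataset, and the real files may encode values differently.

```python
import csv
import io
from collections import Counter

# Column names as listed under "the common columns" in the description.
COMMON_COLUMNS = ["sequence", "peptide_length", "mhc", "meas",
                  "value", "bind", "source"]


def summarize(handle):
    """Check that the common columns exist and count rows per data source."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in COMMON_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    counts = Counter()
    for row in reader:
        # Sanity check: peptide_length should equal the sequence length.
        # float() first, in case the length was written as e.g. "9.0".
        if int(float(row["peptide_length"])) != len(row["sequence"]):
            raise ValueError(f"length mismatch for {row['sequence']}")
        counts[row["source"]] += 1
    return counts


# Toy rows with made-up values, only to show the expected shape.
SAMPLE = """sequence,peptide_length,mhc,meas,value,bind,source
AAAAAAAAA,9,A*02:01,,1.0,1,MS
CCCCCCCCCC,10,A*02:01,,0.0,0,random decoy
"""

if __name__ == "__main__":
    print(summarize(io.StringIO(SAMPLE)))
```

For a real file, the same function could be called on an opened handle, for example `with open(path, newline="") as f: summarize(f)`, where `path` points to wherever train_hit.csv or benchmark.csv ends up after extraction. Files with extra columns (such as the predictor scores in benchmark.csv) pass the check, since only the common columns are required.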
Created:
2021-07-09