Different model configurations evaluated on Cohort 1.

NIAID Data Ecosystem, indexed 2026-03-11

Download link:

https://figshare.com/articles/dataset/Different_model_configurations_evaluated_on_Cohort_1_/11944533

Description:

Each row represents a different model, and the columns describe the configuration of each model. The first row (bold font) corresponds to the model configuration with the best performance for the breast and colorectal cancer datasets [25]. The second row (bold underlined font) corresponds to the best-performing model configuration presented here.

The first column gives the number of amino acid residues in the motif. The second column gives the number of CDR3 amino acid residues that could be skipped when assembling a motif. For example, if the value is 2, then up to 2 CDR3 amino acid residues could be skipped. The third column indicates whether binary indicators were used to flag the motif positions at which a CDR3 residue was ignored. For example, if a CDR3 residue was ignored but would have been in the third position of the motif had it been included, then the third indicator has a value of 1. The fourth column indicates if an amino acid was skipped in the CDR3 for the given position in the motif. The fifth column indicates whether the expected frequency of the motif in blood was included as a feature. The expected frequency was estimated using publicly available data from 786 presumed healthy individuals [31]. The sixth column indicates whether the log of the motif relative abundance was used for the relative abundance term. The seventh column indicates whether each feature was squared and used as an additional feature, resulting in second-order terms in the model. The eighth column indicates whether batch normalization was used.

The ninth column (fourth from last) is the log-loss averaged across the one-holdout cross-validations. The tenth column (third from last) is the accuracy computed over the one-holdout cross-validations. The eleventh column (second from last) is the number of gradient steps used to fit the model, as determined by early stopping. The twelfth column (last) is the number of fits to the training data, of which the best fit to the training data is applied to the holdout sample.
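To make the motif length, skip count, and skip indicators concrete, the following is a minimal Python sketch of how gapped motifs might be enumerated from a single CDR3 sequence. It is not the dataset authors' code: the function name `gapped_motifs`, its parameters, and the rule that an indicator is set at the motif position where a skipped residue would otherwise have landed are assumptions based only on the description above.

```python
from itertools import combinations


def gapped_motifs(cdr3, motif_len, max_skips):
    """Enumerate motifs of `motif_len` residues from `cdr3`, allowing up to
    `max_skips` skipped residues in total (hypothetical helper, illustrative only).

    Returns a list of (motif, indicators) pairs. indicators[i] is 1 when at
    least one residue was skipped immediately before motif position i
    (assumed reading of the "binary indicators" column).
    """
    results = []
    for start in range(len(cdr3)):
        for n_skips in range(max_skips + 1):
            window = motif_len + n_skips
            if start + window > len(cdr3):
                break  # larger windows will not fit either
            # The first and last residues of the window are always kept,
            # so only interior residues are candidates for skipping.
            interior = range(start + 1, start + window - 1)
            for skipped in combinations(interior, n_skips):
                kept = [i for i in range(start, start + window) if i not in skipped]
                motif = "".join(cdr3[i] for i in kept)
                indicators = [0] * motif_len
                for pos in range(1, motif_len):
                    if kept[pos] - kept[pos - 1] > 1:
                        indicators[pos] = 1
                results.append((motif, tuple(indicators)))
    return results


if __name__ == "__main__":
    # Made-up CDR3 fragment, motif length 4, at most 1 skipped residue.
    for motif, indicators in gapped_motifs("CASSL", 4, 1):
        print(motif, indicators)
```

In this sketch, skipping the third residue of `CASSL` yields the motif `CASL` with indicators `(0, 0, 1, 0)`, which mirrors the "third indicator" example in the description. Whether the original models treat leading or trailing skips differently is not stated in the source, so the sketch only skips interior residues.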

Created:

2020-03-05