Predictive performance of models trained to predict the presence of Methanobacteriaceae from metagenomes.
Source: Figshare. Updated: 2022-12-14. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Predictive_performance_of_models_trained_to_predict_the_presence_of_i_Methanobacteriaceae_i_from_metagenomes_/21727599
Description:
Samples: Models were trained on cross-validation (CV) sets drawn either from all samples (n = 2203 in total; train = 1542, test = 661) or from only the samples with complete metadata for age, gender, and BMI (n = 748 in total; train = 524, test = 224).
Feature selection: The feature selection algorithm and its tuned parameters.
Model: Random forests (RF) were fitted using the ranger R package [44]; ntrees = 250 and 500 were tested for parameter tuning. Gradient boosted models (XGBoost) were fitted using the XGBoost R package [46]; nrounds in {10, 50, 100, 250, 500, 750, 1000, 1500} and maxdepth in {1, …, 10} were tested for hyperparameter tuning (only the results of the model with the highest average Cohen's kappa are given). Sample weights were provided to increase the probability that samples from the under-represented class are drawn at each bootstrap.
Accuracy, Cohen's kappa, and number of selected features: Average +/- standard deviation across 10x cross-validation with 70–30% train-test sets. The selected model is indicated in bold. (CSV)
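The quantities named in the description can be illustrated with a minimal Python sketch. It is not the authors' code: the original models were fitted in R, and the dataset does not state how the splits were drawn or which standard deviation estimator was used. The sketch assumes that a 70% training fraction rounded to the nearest integer gives the split sizes (this does reproduce the reported 1542/661 and 524/224 counts), that maxdepth covers the integers 1 through 10, and that the sample standard deviation is reported. All function names are hypothetical, and the labels and predictions in `toy_repeated_kappa` are synthetic stand-ins, not data from the study.

```python
import itertools
import random
import statistics


def split_sizes(n_samples, train_fraction=0.7):
    """Return (train, test) counts for one split (assumed rounding rule)."""
    n_train = round(n_samples * train_fraction)
    return n_train, n_samples - n_train


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    expected = sum(
        (y_true.count(label) / n) * (y_pred.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 0.0  # degenerate case: only one label present on both sides
    return (observed - expected) / (1.0 - expected)


def mean_and_sd(values):
    """Average and sample standard deviation (estimator is an assumption)."""
    return statistics.mean(values), statistics.stdev(values)


# Hyperparameter grid as listed in the description; maxdepth is assumed
# to mean the integers 1 through 10.
NROUNDS = [10, 50, 100, 250, 500, 750, 1000, 1500]
MAX_DEPTH = list(range(1, 11))
XGB_GRID = list(itertools.product(NROUNDS, MAX_DEPTH))


def toy_repeated_kappa(n_samples=200, n_repeats=10, seed=0):
    """Kappa on the test part of repeated 70-30 splits of synthetic data.

    No model is trained here: the "predictions" are the synthetic labels
    with about 20% of them flipped, purely to exercise the metric.
    """
    rng = random.Random(seed)
    labels = [int(rng.random() < 0.3) for _ in range(n_samples)]
    preds = [y if rng.random() < 0.8 else 1 - y for y in labels]
    n_train, _ = split_sizes(n_samples)
    kappas = []
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        test_idx = order[n_train:]
        kappas.append(
            cohen_kappa(
                [labels[i] for i in test_idx],
                [preds[i] for i in test_idx],
            )
        )
    return kappas
```

Calling `mean_and_sd(toy_repeated_kappa())` gives an "average +/- standard deviation" pair of the kind the table reports, and `len(XGB_GRID)` gives the number of nrounds and maxdepth combinations under the stated assumption.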
Created:
2022-12-14



