Results of machine learning experiments for "Multi-classifier prediction of knee osteoarthritis progression from incomplete imbalanced longitudinal data"
Source: data.ncl.ac.uk. Updated 2019-10-30. Indexed 2025-01-15.
Download link:
https://data.ncl.ac.uk/articles/dataset/_Multi-classifier_prediction_of_knee_osteoarthritis_progression_from_incomplete_imbalanced_longitudinal_data_-_results_of_machine_learning_experiments/10043060/2
Description:
The archive file includes results of machine learning experiments performed for the article "Multi-classifier prediction of knee osteoarthritis progression from incomplete imbalanced longitudinal data". The hypothesis of the article is that prediction models trained on historical data will be more effective at identifying fast-progressing knee OA patients than conventional inclusion criteria.

For all experiments, the first-level folder hierarchy indicates the method used. Where parameter tuning is performed, the second-level folders indicate algorithm parameters. Each experiment output is stored in an xz-compressed text file in JSON format.

In experiments measuring the learning curves (training-*), each results file describes:
* experiment setup (algorithm, number of subsets, down-sampled class size)
* list of training set sizes
* performance measure statistics for all subsets at each training size (flat list), including min, median and max score, and median deviation from median (mad), given for both test and training set instances

In parameter tuning experiments (prediction-multi-*), each results file contains:
* experiment setup (method / algorithm, number of CV repeats, number of model runs)
* imputer parameters (not important, kept constant in all experiments)
* classifier parameters (for random forest)
* true class for each instance
* class predictions by the median model from each CV-repeat
* class probabilities estimated by the median model from each CV-repeat
* performance measure statistics for each CV-repeat, including min, median and max score, and median deviation from median (mad)

In RFE experiments (prediction-multi-rfe-*), the results additionally include:
* scores for all RFE steps for each CV-repeat
* number of times each feature was selected (across all folds and CV-repeats)
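Since each result is described as an xz-compressed JSON text file nested under method and parameter folders, the files could be read with the Python standard library alone. The following is a minimal sketch under stated assumptions, not the authors' own tooling: the `*.xz` file suffix, the helper names `load_result`, `iter_results` and `summarize`, and the reading of "mad" as the unscaled median absolute deviation from the median are all assumptions. The key names inside each JSON file are not documented here, so the sketch does not rely on any.

```python
import json
import lzma
import statistics
from pathlib import Path


def load_result(path):
    """Read one xz-compressed JSON results file into Python objects."""
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def iter_results(root, pattern="*.xz"):
    """Yield (method, relative_path, data) for each results file under root.

    The method is taken from the first-level folder name, as the
    description states. The "*.xz" suffix is an assumption.
    """
    root = Path(root)
    for path in sorted(root.rglob(pattern)):
        relative = path.relative_to(root)
        method = relative.parts[0]  # first-level folder = method used
        yield method, relative, load_result(path)


def summarize(scores):
    """Return min, median, max and median absolute deviation from the median.

    Assumes "mad" means the unscaled median of |score - median|.
    """
    scores = list(scores)
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    return {"min": min(scores), "median": med, "max": max(scores), "mad": mad}
```

Calling `iter_results` on the directory where the archive was unpacked and printing `method` and `relative` for each item is a quick way to confirm the folder layout before inspecting the contents of any single file.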
Provider:
Newcastle University



