
Metatasks for Auto-Sklearn 1 - ROC AUC and Balanced Accuracy

Source: Figshare. Updated: 2023-07-01. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Metatasks_for_Auto-Sklearn_1_-_ROC_AUC_and_Balanced_Accuracy/23613627
Resource description:
Prediction Data of Base Models from Auto-Sklearn 1 on 71 classification datasets from the AutoML Benchmark for Balanced Accuracy and ROC AUC.

The files of this figshare item include data that was collected for the paper: Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML, Lennart Purucker, Lennart Schneider, Marie Anastacio, Joeran Beel, Bernd Bischl, Holger Hoos, Second International Conference on Automated Machine Learning, 2023.

The data was stored and used with the assembled framework: https://github.com/ISG-Siegen/assembled.

In detail, the data contains the predictions of base models on validation and test as produced by running Auto-Sklearn 1 for 4 hours. Such prediction data is included for each model produced by Auto-Sklearn 1 on each fold of 10-fold cross-validation on the 71 classification datasets from the AutoML Benchmark. The data exists for two metrics (ROC AUC and Balanced Accuracy). More details can be found in the paper.

The data was collected by code created for the paper and is available in its reproducibility repository: https://doi.org/10.6084/m9.figshare.23613624.

Its usage is intended for but not limited to using assembled to evaluate post hoc ensembling methods for AutoML.

Details

The link above points to a hosted server that facilitates the download. We opted for a hosted server, as we found no other suitable solution to share these large files (due to file size or storage limits) for a reasonable price. If you want to obtain the data in another way or know of a more suitable alternative, please contact Lennart Purucker.

The link resolves to a directory containing the following:

example_metatasks: contains an example metatask for test purposes before committing to downloading all files.
metatasks_roc_auc.zip: The Metatasks obtained by running Auto-Sklearn 1 for ROC AUC.
metatasks_bacc.zip: The Metatasks obtained by running Auto-Sklearn 1 for Balanced Accuracy.
The sizes after unzipping each archive in full are:

metatasks_roc_auc.zip: ~450GB
metatasks_bacc.zip: ~330GB

We suggest extracting only the files of interest from the .zip archives, as these can be much smaller and might suffice for experiments.

Each metatask .zip file contains 2 subdirectories, one for Metatasks produced with TopN pruning and one for Metatasks produced with SiloTopN pruning (see the paper for details). Each of these subdirectories holds 2 files per metatask: one .json file with metadata and one .hdf or .csv file containing the prediction data. Details on how to read and use these files as a Metatask can be found in the assembled framework and the reproducibility repository.

To obtain the data without Metatasks, we advise looking at the file content and metadata individually or parsing them by using Metatasks first.
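The suggestion to extract only the files of interest can be sketched with Python's standard zipfile module. This is a minimal illustration, not part of the assembled framework: the helper names (list_members, extract_matching, read_json_member) and the glob patterns are hypothetical, and the member paths inside the real archives are an assumption that should be checked by listing the archive first.

```python
import json
import zipfile
from fnmatch import fnmatchcase
from pathlib import Path


def list_members(archive_path):
    """Return (name, uncompressed_size_in_bytes) for every file in the archive."""
    with zipfile.ZipFile(archive_path) as zf:
        return [(i.filename, i.file_size) for i in zf.infolist() if not i.is_dir()]


def extract_matching(archive_path, pattern, out_dir):
    """Extract only members whose full path matches the glob `pattern`.

    Nothing else in the archive is written to disk.
    """
    out_dir = Path(out_dir)
    extracted = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not fnmatchcase(info.filename, pattern):
                continue
            zf.extract(info, out_dir)
            extracted.append(info.filename)
    return extracted


def read_json_member(archive_path, member_name):
    """Parse one .json metadata file directly from the archive, without extracting it."""
    with zipfile.ZipFile(archive_path) as zf:
        with zf.open(member_name) as fh:
            return json.load(fh)
```

A possible workflow, assuming the archive has already been downloaded: call list_members("metatasks_bacc.zip") to see the actual paths and sizes, then pass a pattern such as "*/some_task.*" (a placeholder, not a real file name) to extract_matching to pull the .json and prediction file of a single metatask. The pattern is matched against the whole member path, so a pattern beginning with one directory name does not accidentally match a directory whose name merely ends with it.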
Created:
2023-07-01