Metatasks for Auto-Sklearn 1 - ROC AUC and Balanced Accuracy

Source: DataCite Commons. Updated 2025-06-01. Indexed 2024-08-26.
Download link:
https://figshare.com/articles/dataset/Metatasks_for_Auto-Sklearn_1_-_ROC_AUC_and_Balanced_Accuracy/23613627/1
Description:
Prediction data of base models from Auto-Sklearn 1 on 71 classification datasets from the AutoML Benchmark, for Balanced Accuracy and ROC AUC.

The files of this figshare item include data that was collected for the paper:
Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML. Lennart Purucker, Lennart Schneider, Marie Anastacio, Joeran Beel, Bernd Bischl, Holger Hoos. Second International Conference on Automated Machine Learning, 2023.

The data was stored and used with the assembled framework: https://github.com/ISG-Siegen/assembled.

In detail, the data contains the predictions of base models on the validation and test data, as produced by running Auto-Sklearn 1 for 4 hours. Such prediction data is included for each model produced by Auto-Sklearn 1 on each fold of 10-fold cross-validation on the 71 classification datasets from the AutoML Benchmark. The data exists for two metrics (ROC AUC and Balanced Accuracy). More details can be found in the paper.

The data was collected by code created for the paper, which is available in its reproducibility repository: https://doi.org/10.6084/m9.figshare.23613624.

Its usage is intended for, but not limited to, using assembled to evaluate post hoc ensembling methods for AutoML.

Details

The link above points to a hosted server that facilitates the download. We opted for a hosted server, as we found no other suitable solution to share these large files (due to file size or storage limits) for a reasonable price. If you want to obtain the data in another way or know of a more suitable alternative, please contact Lennart Purucker.

The link resolves to a directory containing the following:
example_metatasks: an example metatask for test purposes before committing to downloading all files.
metatasks_roc_auc.zip: the Metatasks obtained by running Auto-Sklearn 1 for ROC AUC.
metatasks_bacc.zip: the Metatasks obtained by running Auto-Sklearn 1 for Balanced Accuracy.

The size of each archive after unzipping it in full is:
metatasks_roc_auc.zip: ~450 GB
metatasks_bacc.zip: ~330 GB

We suggest extracting only the files that are of interest from the .zip archive, as these can be much smaller and might suffice for experiments.

The metatask .zip files contain 2 subdirectories for Metatasks produced based on TopN or SiloTopN pruning (see the paper for details). In each of these subdirectories, 2 files exist per metatask: one .json file with metadata and one .hdf or .csv file containing the prediction data. The details on how these should be read and used as a Metatask can be found in the assembled framework and the reproducibility repository. To obtain the data without Metatasks, we advise looking at the file content and metadata individually, or parsing them by using Metatasks first.
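The advice to extract only the files of interest can be followed without unpacking the full archive. The following is a minimal sketch using only the Python standard library. It assumes that the .json metadata file and the .hdf or .csv prediction file of one metatask share the same file stem inside the archive; this is an assumption, so check it against the actual archive listing. The function names and the pattern shown in the usage note are hypothetical and are not part of the assembled framework.

```python
import fnmatch
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def pair_metatask_files(member_names):
    """Group archive members by path without extension.

    Returns {stem_path: {suffix: member_name}} for every stem that has a
    .json metadata file plus at least one .hdf or .csv prediction file.
    Assumption: both files of a metatask share the same stem.
    """
    pairs = defaultdict(dict)
    for name in member_names:
        path = PurePosixPath(name)
        if path.suffix in {".json", ".hdf", ".csv"}:
            pairs[str(path.with_suffix(""))][path.suffix] = name
    return {
        key: files
        for key, files in pairs.items()
        if ".json" in files and len(files) > 1
    }


def extract_metatasks(zip_path, dest_dir, pattern):
    """Extract only the metatasks whose stem path matches a glob pattern.

    The pattern is matched case-sensitively against the whole stem path,
    and "*" also matches "/" characters.
    """
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        pairs = pair_metatask_files(archive.namelist())
        for key in sorted(pairs):
            if fnmatch.fnmatchcase(key, pattern):
                for name in pairs[key].values():
                    archive.extract(name, dest_dir)
                    extracted.append(name)
    return extracted
```

As a usage example, `extract_metatasks("metatasks_bacc.zip", "out", "TopN/*")` would extract only the matching pairs, provided the archive really has a top-level directory with that name. Listing `zipfile.ZipFile(...).namelist()` first shows the actual layout and reads only the archive index, not the compressed data.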
Provider:
figshare
Created:
2023-07-01