Synthetic Data for Model Ensembling

Name: Synthetic Data for Model Ensembling
Creator: Generated by authors
License: No description available

Source: arXiv, indexed 2025-09-30

Download link:

https://github.com/globusharris/ensembling-constrained-optimization

Description:

This synthetic dataset has 20 features and 4-dimensional labels and is designed to evaluate ensembling methods in constrained optimization settings. The features follow a multivariate normal distribution, except for a final categorical feature that randomly assigns each data point to one of five categories. The labels are a noisy linear function of the features. The training set contains 10,000 data points and the evaluation set contains 400. The task is to ensemble models for use in downstream optimization.
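The description above can be illustrated with a minimal generation sketch. This is not the authors' generator (see the repository linked above for that); it is an illustration under several assumptions the description does not settle: the 20 features are taken to be 19 continuous columns plus the categorical one, the multivariate normal is taken to be standard (zero mean, identity covariance), categories are drawn uniformly, labels depend only on the continuous columns, and the weight matrix and `noise_std` value are arbitrary. The names `make_split`, `weights`, and `noise_std` are hypothetical.

```python
import random

N_FEATURES = 20          # total feature columns, per the description
N_LABELS = 4             # label dimension, per the description
N_CATEGORIES = 5         # number of categories, per the description
N_TRAIN, N_EVAL = 10_000, 400
N_CONT = N_FEATURES - 1  # assumption: the categorical column counts toward the 20


def make_split(n, weights, rng, noise_std=0.1):
    """Draw n points: Gaussian features, one random category, noisy linear labels."""
    X, Y = [], []
    for _ in range(n):
        # Assumption: standard normal with identity covariance,
        # so each continuous feature is an independent N(0, 1) draw.
        cont = [rng.gauss(0.0, 1.0) for _ in range(N_CONT)]
        # Assumption: categories 0..4 are drawn uniformly at random.
        category = rng.randrange(N_CATEGORIES)
        # Assumption: each label is a linear function of the continuous
        # features plus Gaussian noise; the category does not enter.
        label = [
            sum(w * x for w, x in zip(weights[k], cont)) + rng.gauss(0.0, noise_std)
            for k in range(N_LABELS)
        ]
        X.append(cont + [category])
        Y.append(label)
    return X, Y


rng = random.Random(0)
# Hypothetical ground-truth weights, one row per label dimension.
weights = [[rng.gauss(0.0, 1.0) for _ in range(N_CONT)] for _ in range(N_LABELS)]
X_train, Y_train = make_split(N_TRAIN, weights, rng)
X_eval, Y_eval = make_split(N_EVAL, weights, rng)
```

Both splits share the same `weights`, so the evaluation set is drawn from the same linear relationship as the training set.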

Provider:

Generated by authors
