
Synthetic Data for Model Ensembling

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/globusharris/ensembling-constrained-optimization
Description:
This synthetic dataset has 20 features and 4-dimensional labels and is designed to evaluate ensemble learning methods in constrained optimization settings. The features follow a multivariate normal distribution, with a final categorical feature that randomly assigns each data point to one of five categories. The labels are a noisy linear function of the features. The training set contains 10,000 data points and the evaluation set contains 400. The task is to ensemble models for use in optimization.
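The generating process described above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' script (see the linked repository for that). Several details are assumptions that the description does not state: that the categorical feature is the 20th of the 20 features, that it enters the linear map as an integer code, and the covariance matrix, weight matrix, and noise level. The function name `make_synthetic` and its parameters are hypothetical.

```python
import numpy as np


def make_synthetic(n_train=10_000, n_eval=400, n_features=20,
                   n_labels=4, n_categories=5, noise_std=0.1, seed=0):
    """Draw a train/eval split from one shared noisy linear model."""
    rng = np.random.default_rng(seed)
    n = n_train + n_eval

    # Assumption: 19 continuous features plus 1 categorical feature.
    n_cont = n_features - 1

    # Assumption: an arbitrary positive-definite covariance matrix.
    a = rng.normal(size=(n_cont, n_cont))
    cov = a @ a.T / n_cont + 1e-3 * np.eye(n_cont)
    x_cont = rng.multivariate_normal(np.zeros(n_cont), cov, size=n)

    # Final feature: a category in {0, ..., 4}, assigned uniformly at random.
    category = rng.integers(0, n_categories, size=n)
    X = np.column_stack([x_cont, category])

    # Assumption: labels are X @ W plus Gaussian noise, with random W.
    W = rng.normal(size=(n_features, n_labels))
    Y = X @ W + noise_std * rng.normal(size=(n, n_labels))

    return (X[:n_train], Y[:n_train]), (X[n_train:], Y[n_train:])


(X_train, Y_train), (X_eval, Y_eval) = make_synthetic()
print(X_train.shape, Y_train.shape, X_eval.shape, Y_eval.shape)
```

Drawing both splits in a single call keeps the covariance and weight matrix identical for training and evaluation, which is the simplest way to ensure the two sets come from the same distribution.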
Provider:
Generated by authors