
Wild-Tab

arXiv, updated 2023-12-04, indexed 2024-08-06
Download link:
http://arxiv.org/abs/2312.01792v1
Description:
Wild-Tab is a large-scale benchmark for out-of-distribution (OOD) generalization in tabular regression. It comprises three industrial datasets, V PowerS, V PowerR, and Weather, drawn from domains such as weather prediction and power consumption estimation. These datasets provide a challenging testbed for evaluating OOD performance under real-world conditions. They were built from publicly available data using canonical partitions, with the splits following a specified methodology. Wild-Tab targets the question of how well machine learning models generalize to data outside the training distribution, which matters for any model deployed in real-world settings where distribution shift occurs.
Provider:
Tinkoff
Created:
2023-12-04
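
The description refers to evaluating regression models on data outside the training distribution. As a rough illustration of what such an evaluation measures, the sketch below compares in-distribution and OOD error for a simple model. It does not use the Wild-Tab data or its splits: the synthetic data, the feature ranges, the linear model, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def make_split(rng, n, low, high):
    """Hypothetical synthetic data: one feature drawn from [low, high],
    with a noisy nonlinear target. Not related to the Wild-Tab datasets."""
    x = rng.uniform(low, high, size=(n, 1))
    y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=n)
    return x, y


def fit_linear(x, y):
    """Ordinary least squares with a bias term."""
    design = np.hstack([x, np.ones((len(x), 1))])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights


def predict(weights, x):
    """Apply the fitted linear model to new rows."""
    return np.hstack([x, np.ones((len(x), 1))]) @ weights


rng = np.random.default_rng(0)

# Train and in-distribution test share the same feature range (assumed).
x_train, y_train = make_split(rng, 2000, 0.0, 1.5)
x_id, y_id = make_split(rng, 500, 0.0, 1.5)

# The OOD test set uses a shifted feature range (assumed covariate shift).
x_ood, y_ood = make_split(rng, 500, 3.0, 4.5)

weights = fit_linear(x_train, y_train)
id_rmse = rmse(y_id, predict(weights, x_id))
ood_rmse = rmse(y_ood, predict(weights, x_ood))

print(f"in-distribution RMSE: {id_rmse:.3f}")
print(f"out-of-distribution RMSE: {ood_rmse:.3f}")
print(f"OOD gap: {ood_rmse - id_rmse:.3f}")
```

The difference between the two errors is one simple way to summarize how much a model degrades under shift. A real benchmark run would replace the synthetic splits with the dataset's own train, in-distribution, and OOD partitions.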