five

Multi Class Datasets

NIAID Data Ecosystem, indexed 2026-03-10
Download link:
https://doi.org/10.7910/DVN/O4RIRM
Description:
We present 20 new multi-labeled artificial datasets, which can also be used for evaluating ambiguity-resolving classifiers. The ambiguous, or multi-labeled, points are those lying in the overlapping regions of two or more classes. Of the 20 datasets, 10 are 2-dimensional, while the rest are 5- or 10-dimensional extended versions of the 2-dimensional ones. The extensions follow one of two techniques. In the first, a dataset is extended by appending 3 new dimensions, each sampled uniformly at random and scaled to a specified range. The new 5-dimensional dataset is then rotated by a random rotation matrix. This is a general technique by which any dataset can be transformed to a higher-dimensional feature space while conserving the properties of the ambiguous points. The second technique extends the datasets by sampling them from a 10-dimensional real-valued feature space using the analogous class distributions of the corresponding 2-dimensional dataset. Such a strategy can extend a dataset to a feature space of arbitrarily high dimension. However, the datasets become sparse with increasing dimensionality. To tackle this issue, the number of data points is increased in this case.
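The first extension technique described above can be illustrated with a minimal sketch. It is not the dataset authors' code: the function names `random_rotation` and `extend_and_rotate`, the default range `[0, 1]`, the seed, and the QR-based way of drawing the rotation matrix are all assumptions made for illustration. The idea is that the appended dimensions carry no class information and a rotation preserves pairwise distances, so the labels of the original points (including the ambiguous ones) can be carried over unchanged.

```python
import numpy as np


def random_rotation(dim, rng):
    """Draw a random rotation matrix (orthogonal, determinant +1).

    Assumption: the source does not say how the rotation is sampled;
    here we orthogonalize a Gaussian matrix with a QR decomposition.
    """
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Normalize column signs so the result does not depend on QR conventions.
    q = q * np.sign(np.diag(r))
    # Flip one column if we ended up with a reflection (determinant -1).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def extend_and_rotate(points_2d, extra_dims=3, low=0.0, high=1.0, seed=0):
    """Append uniformly sampled dimensions, then rotate the result.

    `low` and `high` stand in for the "specified range" in the description;
    their default values here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    n_points = points_2d.shape[0]
    # New dimensions: uniform samples scaled to [low, high).
    extra = rng.uniform(low, high, size=(n_points, extra_dims))
    extended = np.hstack([points_2d, extra])
    rotation = random_rotation(extended.shape[1], rng)
    # Rotate every point; labels are left untouched by the caller.
    return extended @ rotation.T, rotation


# Usage: extend a toy 2-dimensional point set to 5 dimensions.
toy_points = np.random.default_rng(1).standard_normal((100, 2))
rotated, rotation = extend_and_rotate(toy_points)
```

Multiplying `rotated` by `rotation` undoes the rotation, so the first two columns of the result recover the original 2-dimensional points.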
Created:
2017-03-09