Colored MNIST dataset
Source: DataCite Commons. Updated: 2026-01-07. Indexed: 2026-05-05.
Download link:
https://service.tib.eu/ldmservice/dataset/2ac1ef5c-065e-404e-9cd7-03356817e9c7
Description:
The dataset used in the paper is a binary classification task in a 300-dimensional space. The training set is generated as follows. Each label y ∈ {−1, 1} is sampled uniformly at random. The first component x1 is sampled from a mixture of two Gaussian distributions, each with variance 0.15, centered at y and 1 − y with mixing proportions 0.9 and 0.1 respectively. This is the only informative feature: as the training set grows, the model learns it more reliably, which improves test accuracy. The remaining 299 dimensions (x2, ..., x300) are drawn from a normal distribution with zero mean and variance 0.1. They form the nuisance subspace, which the model uses primarily to memorise label noise.
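The generation procedure described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the function name `make_dataset`, the `seed` parameter, and the list-of-lists output format are assumptions, as is the reading that both mixture components share the variance 0.15 and that the nuisance dimensions are sampled independently. The mixture centers are taken literally from the description (y and 1 − y).

```python
import math
import random


def make_dataset(n, dim=300, seed=0):
    """Sample n points following the procedure in the description (sketch)."""
    rng = random.Random(seed)          # seed is an assumption, for reproducibility
    sd_signal = math.sqrt(0.15)        # std. dev. of each mixture component
    sd_nuisance = math.sqrt(0.1)       # std. dev. of each nuisance dimension
    X, Y = [], []
    for _ in range(n):
        # Label drawn uniformly from {-1, 1}.
        y = rng.choice((-1, 1))
        # Mixture: weight 0.9 on the component centered at y,
        # weight 0.1 on the component centered at 1 - y.
        center = y if rng.random() < 0.9 else 1 - y
        x1 = rng.gauss(center, sd_signal)
        # Remaining dim - 1 coordinates: zero-mean Gaussian noise, variance 0.1.
        nuisance = [rng.gauss(0.0, sd_nuisance) for _ in range(dim - 1)]
        X.append([x1] + nuisance)
        Y.append(y)
    return X, Y


if __name__ == "__main__":
    X, Y = make_dataset(1000)
    print(len(X), len(X[0]), sorted(set(Y)))
```

For large sample sizes, a vectorised version with an array library would be faster; the loop form is used here only to keep each step of the description visible.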
Provider:
TIB
Created:
2024-12-16



