Expert, Pitbull, MNIST, FMNIST, CIFAR10, Tiny ImageNet, Stanford Dogs, Random Gaussian, Augmented CIFAR10
Source: arXiv. Updated: 2023-06-27. Indexed: 2024-06-21.
Download link:
https://github.com/szmazurek/ds_assessment
Description:
This study uses nine curated datasets, each posing a classification task of a different complexity. They include standard image datasets such as MNIST, FMNIST, and CIFAR-10, plus two special-purpose datasets: one maximizes entropy by drawing pixel values and class labels at random, and the other exhibits high redundancy because it is built by augmenting images from CIFAR-10. All datasets are standardized and resized to a common resolution of 128×128. The collection is intended to support dataset quality assessment, with the goal of improving the performance and accuracy of machine learning models.
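As an illustration of the description above, the following is a minimal sketch of how a maximum-entropy dataset with random pixels and random labels could be generated, and how an image could be brought to the 128×128 resolution. It is not the code from the linked repository. The function names, the Gaussian parameters, the nearest-neighbour interpolation, and the per-image zero-mean, unit-variance reading of "standardized" are all assumptions made for this example.

```python
import random

TARGET_SIZE = 128  # common resolution stated in the description


def resize_nearest(image, size=TARGET_SIZE):
    """Nearest-neighbour resize of a 2D list of pixels to size x size.

    The interpolation method is an assumption; the description only
    states the target resolution.
    """
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]


def random_gaussian_dataset(n_samples, n_classes, size=TARGET_SIZE, seed=0):
    """Images of i.i.d. Gaussian pixels paired with uniformly random labels.

    Mean 0 and standard deviation 1 are illustrative choices.
    """
    rng = random.Random(seed)
    images = [
        [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
        for _ in range(n_samples)
    ]
    labels = [rng.randrange(n_classes) for _ in range(n_samples)]
    return images, labels


def standardize(image):
    """Scale one image to zero mean and unit variance (assumed meaning)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = var ** 0.5 or 1.0  # avoid division by zero on constant images
    return [[(p - mean) / std for p in row] for row in image]
```

For example, `random_gaussian_dataset(4, 10, size=32)` returns four 32×32 images with labels in the range 0 to 9, and `standardize(resize_nearest(img))` brings one of them to the common 128×128 format. Because the labels are drawn independently of the pixels, no classifier can do better than chance on such data, which is what makes it a useful high-entropy reference point.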
Provider:
ACK Cyfronet AGH
Created:
2023-06-27



