fashion_mnist
Source: ModelScope community. Updated 2026-05-15; indexed 2024-11-16.
Download link:
https://modelscope.cn/datasets/cutedataset/fashion_mnist
# Dataset Card for FashionMNIST
## Table of Contents
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)
## Dataset Description
- **Homepage:** [GitHub](https://github.com/zalandoresearch/fashion-mnist)
- **Repository:** [GitHub](https://github.com/zalandoresearch/fashion-mnist)
- **Paper:** [arXiv](https://arxiv.org/pdf/1708.07747.pdf)
- **Leaderboard:**
- **Point of Contact:**
### Dataset Summary
Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
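Because the dataset mirrors MNIST's layout, loaders written for MNIST files can usually be reused unchanged. The sketch below assumes you are working with the uncompressed IDX image files distributed in the upstream GitHub repository, which use the MNIST layout: a big-endian header of magic number, image count, rows, and columns, followed by one byte per pixel. The helper name `parse_idx_images` and the synthetic input are illustrative only and are not part of this card.

```python
import struct

IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions


def parse_idx_images(buf: bytes) -> list[list[list[int]]]:
    """Parse an uncompressed IDX3 image buffer into nested lists of pixel values."""
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number: {magic}")
    expected = 16 + count * rows * cols
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    images = []
    offset = 16
    for _ in range(count):
        # Slice one image out of the flat byte stream, row by row.
        image = [
            list(buf[offset + r * cols: offset + (r + 1) * cols])
            for r in range(rows)
        ]
        images.append(image)
        offset += rows * cols
    return images


# Synthetic stand-in for a real file: two blank 28x28 images.
fake_file = struct.pack(">IIII", IMAGE_MAGIC, 2, 28, 28) + bytes(2 * 28 * 28)
images = parse_idx_images(fake_file)
print(len(images), len(images[0]), len(images[0][0]))
```

For a real file, the bytes would first be read through `gzip.open(path, "rb")`, since the upstream files are gzip-compressed.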
### Supported Tasks and Leaderboards
- `image-classification`: The goal of this task is to classify a given image of Zalando's article into one of 10 classes. The leaderboard is available [here](https://paperswithcode.com/sota/image-classification-on-fashion-mnist).
### Languages
[More Information Needed]
## Dataset Structure
### Data Instances
A data point comprises an image and its label.
```
{
'image': <PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at 0x27601169DD8>,
'label': 9
}
```
### Data Fields
- `image`: A `PIL.Image.Image` object containing the 28x28 image. Note that the image file is automatically decoded whenever the image column is accessed, e.g. via `dataset[0]["image"]`. Decoding a large number of image files can take a significant amount of time, so it is important to query the sample index before the `"image"` column, *i.e.* `dataset[0]["image"]` should **always** be preferred over `dataset["image"][0]`.
- `label`: an integer between 0 and 9 representing the classes with the following mapping:
| Label | Description |
| --- | --- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
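The access-order advice for the `image` field can be illustrated with a toy model. The class below is a hypothetical stand-in, not the actual dataset implementation: it only assumes that images are decoded lazily on access, and it counts decode calls to show why indexing the row first is cheaper. `LABEL_NAMES` simply restates the table above as a list indexed by label.

```python
LABEL_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


class ToyImageDataset:
    """Toy stand-in for a dataset that decodes images lazily on access."""

    def __init__(self, encoded_images, labels):
        self._encoded = encoded_images
        self._labels = labels
        self.decode_calls = 0  # how many images have been decoded so far

    def _decode(self, raw):
        self.decode_calls += 1
        return list(raw)  # placeholder for real PNG decoding

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: decode only the requested image.
            return {
                "image": self._decode(self._encoded[key]),
                "label": self._labels[key],
            }
        if key == "image":
            # Column access: every image is decoded before indexing.
            return [self._decode(raw) for raw in self._encoded]
        if key == "label":
            return list(self._labels)
        raise KeyError(key)


dataset = ToyImageDataset([bytes(784)] * 1000, [9] * 1000)

row_first = dataset[0]["image"]
decodes_row_first = dataset.decode_calls  # 1

dataset.decode_calls = 0
column_first = dataset["image"][0]
decodes_column_first = dataset.decode_calls  # 1000

print(decodes_row_first, decodes_column_first, LABEL_NAMES[dataset[0]["label"]])
```

Both access patterns return the same image; only the amount of decoding work differs.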
### Data Splits
The data is split into a training set and a test set. The training set contains 60,000 images and the test set contains 10,000 images.
## Dataset Creation
### Curation Rationale
**From the arXiv paper:**
The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."
Here are some good reasons:
- MNIST is too easy. Convolutional nets can achieve 99.7% on MNIST. Classic machine learning algorithms can also achieve 97% easily. Check out our side-by-side benchmark for Fashion-MNIST vs. MNIST, and read "Most pairs of MNIST digits can be distinguished pretty well by just one pixel."
- MNIST is overused. In an April 2017 Twitter thread, Google Brain research scientist and deep learning expert Ian Goodfellow calls for people to move away from MNIST.
- MNIST cannot represent modern CV tasks, as noted in an April 2017 Twitter thread by deep learning expert and Keras author François Chollet.
### Source Data
#### Initial Data Collection and Normalization
**From the arXiv paper:**
Fashion-MNIST is based on the assortment on Zalando’s website. Every fashion product on Zalando has a set of pictures shot by professional photographers, demonstrating different aspects of the product, i.e. front and back looks, details, looks with model and in an outfit. The original picture has a light-gray background (hexadecimal color: #fdfdfd) and is stored in 762 × 1000 JPEG format. To serve different frontend components efficiently, the original picture is resampled at multiple resolutions, e.g. large, medium, small, thumbnail and tiny.
We use the front look thumbnail images of 70,000 unique products to build Fashion-MNIST. Those products come from different gender groups: men, women, kids and neutral. In particular, white-color products are not included in the dataset as they have low contrast with the background. The thumbnails (51 × 73) are then fed into the following conversion pipeline:
1. Converting the input to a PNG image.
2. Trimming any edges that are close to the color of the corner pixels. The “closeness” is defined by the distance within 5% of the maximum possible intensity in RGB space.
3. Resizing the longest edge of the image to 28 by subsampling the pixels, i.e. some rows and columns are skipped over.
4. Sharpening pixels using a Gaussian operator with a radius and standard deviation of 1.0, with increasing effect near outlines.
5. Extending the shortest edge to 28 and placing the image at the center of the canvas.
6. Negating the intensities of the image.
7. Converting the image to 8-bit grayscale pixels.
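The geometric steps above (2, 3, 5 and 6) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' code: it works on an already-grayscale image stored as nested lists, uses only the top-left pixel as the corner reference, and omits PNG conversion, sharpening and the final grayscale conversion. All function names are hypothetical.

```python
def trim_edges(img, tol=0.05 * 255):
    """Step 2: drop border rows/columns whose pixels are all within `tol` of the corner."""
    bg = img[0][0]  # assumption: the top-left pixel represents the background

    def is_bg(value):
        return abs(value - bg) <= tol

    rows = [i for i, row in enumerate(img) if not all(is_bg(v) for v in row)]
    cols = [j for j in range(len(img[0])) if not all(is_bg(row[j]) for row in img)]
    if not rows or not cols:
        return img  # nothing but background; leave unchanged
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def subsample(img, target=28):
    """Step 3: make the longest edge `target` pixels by picking rows and columns."""
    h, w = len(img), len(img[0])
    scale = target / max(h, w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    row_idx = [min(h - 1, int(i * h / new_h)) for i in range(new_h)]
    col_idx = [min(w - 1, int(j * w / new_w)) for j in range(new_w)]
    return [[img[r][c] for c in col_idx] for r in row_idx]


def pad_to_square(img, size=28, fill=255):
    """Step 5: center the image on a `size` x `size` canvas filled with `fill`."""
    h, w = len(img), len(img[0])
    top, left = (size - h) // 2, (size - w) // 2
    canvas = [[fill] * size for _ in range(size)]
    for i in range(h):
        for j in range(w):
            canvas[top + i][left + j] = img[i][j]
    return canvas


def negate(img):
    """Step 6: invert 8-bit intensities so the background becomes dark."""
    return [[255 - v for v in row] for row in img]


def to_fashion_mnist_style(img):
    bg = img[0][0]
    out = trim_edges(img)
    out = subsample(out)
    out = pad_to_square(out, fill=bg)
    return negate(out)


# Synthetic 51 x 73 thumbnail (73 rows, 51 columns): dark product on a white background.
thumb = [
    [40 if 10 <= r <= 62 and 15 <= c <= 35 else 255 for c in range(51)]
    for r in range(73)
]
result = to_fashion_mnist_style(thumb)
print(len(result), len(result[0]), result[0][0], result[14][14])
```

In the synthetic example the product ends up bright (215) on a black (0) background, which matches the appearance of the published images.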
#### Who are the source language producers?
**From the arXiv paper:**
Every fashion product on Zalando has a set of pictures shot by professional photographers, demonstrating different aspects of the product, i.e. front and back looks, details, looks with model and in an outfit.
### Annotations
#### Annotation process
**From the arXiv paper:**
For the class labels, they use the silhouette code of the product. The silhouette code is manually labeled by the in-house fashion experts and reviewed by a separate team at Zalando (Zalando is Europe’s largest online fashion platform). Each product contains only one silhouette code.
#### Who are the annotators?
**From the arXiv paper:**
The silhouette code is manually labeled by the in-house fashion experts and reviewed by a separate team at Zalando.
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
Han Xiao, Kashif Rasul, and Roland Vollgraf
### Licensing Information
MIT Licence
### Citation Information
```
@article{DBLP:journals/corr/abs-1708-07747,
author = {Han Xiao and
Kashif Rasul and
Roland Vollgraf},
title = {Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning
Algorithms},
journal = {CoRR},
volume = {abs/1708.07747},
year = {2017},
url = {http://arxiv.org/abs/1708.07747},
archivePrefix = {arXiv},
eprint = {1708.07747},
timestamp = {Mon, 13 Aug 2018 16:47:27 +0200},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1708-07747},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
### Contributions
Thanks to [@gchhablani](https://github.com/gchhablani) for adding this dataset.
Provider: maas
Created: 2024-11-04



