
flwrlabs/cinic10

Source: Hugging Face. Updated: 2024-08-07. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/flwrlabs/cinic10
Resource description:
---
license: cc-by-4.0
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': airplane
          '1': automobile
          '2': bird
          '3': cat
          '4': deer
          '5': dog
          '6': frog
          '7': horse
          '8': ship
          '9': truck
  splits:
  - name: train
    num_bytes: 178662714
    num_examples: 90000
  - name: validation
    num_bytes: 180126542
    num_examples: 90000
  - name: test
    num_bytes: 178913694
    num_examples: 90000
  download_size: 771149160
  dataset_size: 537702950
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
task_categories:
- image-classification
size_categories:
- 100K<n<1M
---

# Dataset Card for CINIC-10

CINIC-10 has a total of 270,000 images split equally among three subsets: train, validation, and test. This means that CINIC-10 has 4.5 times as many samples as CIFAR-10.

## Dataset Details

Each subset (90,000 images) contains ten classes, identical to the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) classes, with 9,000 images per class per subset. Using the suggested data split (an equal three-way split), CINIC-10 has 1.8 times as many training samples as CIFAR-10. CINIC-10 is designed to be directly swappable with CIFAR-10.

To understand the motivation behind the dataset's creation, please visit the [GitHub repository](https://github.com/BayesWatch/cinic-10).

### Dataset Sources

- **Repository:** https://github.com/BayesWatch/cinic-10
- **Paper:** https://arxiv.org/abs/1810.03505
- **Dataset:** http://dx.doi.org/10.7488/ds/2448
- **Benchmarking, Papers with code:** https://paperswithcode.com/sota/image-classification-on-cinic-10

## Use in FL

To prepare the dataset for FL settings, we recommend using [Flower Datasets](https://flower.ai/docs/datasets/) (flwr-datasets) for downloading and partitioning the dataset, and [Flower](https://flower.ai/docs/framework/) (flwr) for conducting FL experiments.

To partition the dataset, do the following.

1. Install the package.

```bash
pip install flwr-datasets[vision]
```

2. Load the dataset through Flower Datasets, which uses the HF Dataset under the hood.

```python
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import IidPartitioner

# Split the train split into 10 IID partitions.
fds = FederatedDataset(
    dataset="flwrlabs/cinic10",
    partitioners={"train": IidPartitioner(num_partitions=10)}
)
# Load the first partition (partition IDs start at 0).
partition = fds.load_partition(partition_id=0)
```

## Dataset Structure

### Data Instances

The first instance of the train split is presented below:

```
{
    'image': <PIL.PngImagePlugin.PngImageFile image mode=RGB size=32x32>,
    'label': 0
}
```

### Data Split

```
DatasetDict({
    train: Dataset({
        features: ['image', 'label'],
        num_rows: 90000
    })
    validation: Dataset({
        features: ['image', 'label'],
        num_rows: 90000
    })
    test: Dataset({
        features: ['image', 'label'],
        num_rows: 90000
    })
})
```

## Citation

When working with the CINIC-10 dataset, please cite the original paper. If you're using this dataset with Flower Datasets and Flower, cite Flower.

**BibTeX:**

Original paper:

```
@misc{darlow2018cinic10imagenetcifar10,
      title={CINIC-10 is not ImageNet or CIFAR-10},
      author={Luke N. Darlow and Elliot J. Crowley and Antreas Antoniou and Amos J. Storkey},
      year={2018},
      eprint={1810.03505},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/1810.03505},
}
```

Flower:

```
@article{DBLP:journals/corr/abs-2007-14390,
  author       = {Daniel J. Beutel and
                  Taner Topal and
                  Akhil Mathur and
                  Xinchi Qiu and
                  Titouan Parcollet and
                  Nicholas D. Lane},
  title        = {Flower: {A} Friendly Federated Learning Research Framework},
  journal      = {CoRR},
  volume       = {abs/2007.14390},
  year         = {2020},
  url          = {https://arxiv.org/abs/2007.14390},
  eprinttype   = {arXiv},
  eprint       = {2007.14390},
  timestamp    = {Mon, 03 Aug 2020 14:32:13 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2007-14390.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
```

## Dataset Card Contact

If you have any questions about the dataset preprocessing and preparation, please contact [Flower Labs](https://flower.ai/).
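The size figures quoted in this card can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and needs no download. The split sizes are taken from the card; the CIFAR-10 sizes (50,000 training and 10,000 test images) are an assumption added here, not something the card states. The last line assumes that a 10-way IID split of the train split is perfectly even, which is a simplification and not a statement about how `IidPartitioner` is implemented.

```python
# Split sizes as listed in the dataset card metadata.
CINIC10_SPLITS = {"train": 90_000, "validation": 90_000, "test": 90_000}
NUM_CLASSES = 10

# Assumption (not stated in the card): the standard CIFAR-10 sizes.
CIFAR10_TRAIN = 50_000
CIFAR10_TOTAL = 60_000

# Total number of CINIC-10 images across the three splits.
cinic10_total = sum(CINIC10_SPLITS.values())

# Images per class within one split, given the card's balanced classes.
per_class_per_split = CINIC10_SPLITS["train"] // NUM_CLASSES

# Ratios quoted in the card.
total_ratio = cinic10_total / CIFAR10_TOTAL
train_ratio = CINIC10_SPLITS["train"] / CIFAR10_TRAIN

# Hypothetical even split of the train split into 10 partitions.
even_partition_size = CINIC10_SPLITS["train"] // 10

print(f"total images: {cinic10_total}")
print(f"images per class per split: {per_class_per_split}")
print(f"CINIC-10 / CIFAR-10 (all samples): {total_ratio}")
print(f"CINIC-10 / CIFAR-10 (training samples): {train_ratio}")
print(f"examples per partition (even 10-way split): {even_partition_size}")
```

Under these assumptions the numbers agree with the card: 270,000 images in total, 9,000 per class per split, and ratios of 4.5 and 1.8 relative to CIFAR-10.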
Provider:
flwrlabs