
Federated EMNIST Dataset

Source: Figshare. Updated: 2024-07-16. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Federated_EMNIST_Dataset/26308777/1
Description:
This dataset is derived from the Leaf repository (https://github.com/TalwalkarLab/leaf) pre-processing of the Extended MNIST dataset, grouping examples by writer. Details about Leaf were published in "LEAF: A Benchmark for Federated Settings" (https://arxiv.org/abs/1812.01097).

Note: This dataset does not include some additional preprocessing that MNIST includes, such as size-normalization and centering. In the Federated EMNIST data, a value of 1.0 corresponds to the background and 0.0 corresponds to the color of the digits themselves. This is the inverse of some MNIST representations, e.g. in tensorflow_datasets, where 0 corresponds to the background color and 255 represents the color of the digit.

Data set sizes:

only_digits=True: 3,383 users, 10 label classes
  train: 341,873 examples
  test: 40,832 examples

only_digits=False: 3,400 users, 62 label classes
  train: 671,585 examples
  test: 77,483 examples

Rather than holding out specific users, each user's examples are split across train and test so that all users have at least one example in train and one example in test. Writers that had fewer than 2 examples are excluded from the data set.

The tf.data.Dataset objects returned by tff.simulation.datasets.ClientData.create_tf_dataset_for_client yield collections.OrderedDict objects at each iteration, with the following keys and values, in lexicographic order by key:

'label': a tf.Tensor with dtype=tf.int32 and shape [1], the class label of the corresponding pixels. Labels [0-9] correspond to the digit classes, labels [10-35] correspond to the uppercase classes (e.g., label 11 is 'B'), and labels [36-61] correspond to the lowercase classes (e.g., label 37 is 'b').

'pixels': a tf.Tensor with dtype=tf.float32 and shape [28, 28], containing the pixels of the handwritten digit, with values in the range [0.0, 1.0].

Args:

only_digits: (Optional) whether to only include examples from the digit [0-9] classes. If False, includes lowercase and uppercase characters, for a total of 62 class labels.

cache_dir: (Optional) directory in which to cache the downloaded file. If None, caches in Keras' default cache directory.

Returns:

Tuple of (train, test) where the tuple elements are tff.simulation.datasets.ClientData objects.
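
The label layout and the inverted pixel convention described above can be illustrated with a small, dependency-free Python sketch. The helper names label_to_char and to_mnist_style are hypothetical and are not part of TensorFlow Federated. The sketch assumes the label has already been extracted as a plain int and the pixels as a nested list of floats, and that the uppercase and lowercase classes are in alphabetical order, which is consistent with the examples given (11 is 'B', 37 is 'b').

```python
import string

# Label layout as described above (assumed alphabetical within each group):
# 0-9 digits, 10-35 uppercase letters, 36-61 lowercase letters.
_CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def label_to_char(label: int) -> str:
    """Map a Federated EMNIST class label (0..61) to its character."""
    if not 0 <= label < len(_CLASSES):
        raise ValueError(f"label must be in [0, 61], got {label}")
    return _CLASSES[label]


def to_mnist_style(pixels):
    """Convert rows of floats in [0.0, 1.0], where 1.0 is background,
    to MNIST-style integers in [0, 255], where 0 is background."""
    return [[round((1.0 - p) * 255) for p in row] for row in pixels]
```

For example, label_to_char(11) gives 'B' and label_to_char(37) gives 'b', matching the description above. A background pixel of 1.0 maps to 0, and a fully inked pixel of 0.0 maps to 255.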
Provided by:
Mali, Saroj
Created:
2024-07-16