Non-IID Datasets

Name: Non-IID Datasets
Creator: Duke University
Published: 2020-08-08 04:45:12
License: Not specified

Source: arXiv. Updated 2020-08-08. Indexed 2024-06-21.

Download link:

https://github.com/jeremy313/non-iid-dataset-for-personalized-federated-learning


Description:

The non-IID datasets are a collection built at Duke University to simulate the non-independent and identically distributed (non-IID) data encountered in federated learning. They are derived from the standard MNIST, CIFAR-10, and EMNIST datasets by applying skew strategies such as feature distribution skew, label distribution skew, and quantity skew, which make the data more complex and more realistic. In constructing the datasets, the researchers focused on how different skew strategies can approximate real-world data distributions. The datasets are intended mainly for evaluating and improving federated learning frameworks on non-IID data, particularly with respect to personalization and communication efficiency.
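To make label distribution skew concrete, the sketch below splits a labeled dataset across clients using per-class Dirichlet proportions, a partitioning scheme commonly used in federated learning experiments. It is an illustration only and not necessarily the procedure used to build these datasets. The function names `dirichlet` and `label_skew_partition`, the `alpha` concentration parameter, and the toy labels are all assumptions introduced here.

```python
import random
from collections import Counter


def dirichlet(alpha, k, rng):
    """Sample a symmetric Dirichlet(alpha) vector of length k via gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # extremely small alpha can underflow; fall back to uniform
        return [1.0 / k] * k
    return [d / total for d in draws]


def label_skew_partition(labels, num_clients, alpha, seed=0):
    """Assign every sample index to exactly one client (hypothetical helper).

    For each class, the samples are divided among clients according to a
    Dirichlet(alpha) draw. Smaller alpha gives stronger label skew.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        props = dirichlet(alpha, num_clients, rng)

        # Turn proportions into cut points over this class's samples.
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += props[c]
            if c == num_clients - 1:
                end = len(idxs)  # last client takes the remainder
            else:
                end = min(len(idxs), max(start, int(round(cumulative * len(idxs)))))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    # Toy stand-in for MNIST-style labels: 10 classes, 100 samples each.
    toy_labels = [i % 10 for i in range(1000)]
    parts = label_skew_partition(toy_labels, num_clients=5, alpha=0.5, seed=1)
    for c, part in enumerate(parts):
        print(c, len(part), dict(Counter(toy_labels[i] for i in part)))
```

Because each class is divided independently, client sizes also end up unequal, so this sketch produces a mild form of quantity skew as a side effect. Feature distribution skew is not covered here; it would need a per-client transformation of the inputs.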

Provider:

Duke University

Created:

2020-08-08
