B-FHTL

Name: B-FHTL
Creator: Alibaba Group
Published: 2022-06-21 15:09:02
License: Not specified

Source: arXiv; updated 2022-06-21; indexed 2024-06-21

Download link:

https://github.com/alibaba/FederatedScope/tree/master/benchmark/B-FHTL


Description:

B-FHTL is a benchmark developed by Alibaba Group to simulate heterogeneous-task scenarios in federated learning. It comprises three purpose-built datasets drawn from graph learning and natural language processing (NLP), covering graph classification, regression, sentiment classification, reading comprehension, and sentence-pair similarity prediction. Each dataset simulates clients that differ both in their data, which is not independent and identically distributed (non-IID), and in their learning tasks. B-FHTL provides high-level application programming interfaces (APIs) intended to prevent privacy leakage, and it predefines commonly used evaluation metrics for the different task types, such as regression, classification, and text generation. It also supports client-level observation, so users can inspect how much each individual client improves. The benchmark targets the challenges that data and task heterogeneity pose for federated learning, and aims to promote research at the intersection of federated learning with multi-task learning, model pre-training, and automated machine learning (including meta-learning and hyperparameter optimization).
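To make the idea of client-level observation across heterogeneous tasks concrete, the following is a minimal Python sketch. It does not use the B-FHTL or FederatedScope APIs; every name in it (`METRICS`, `ClientResult`, `client_improvements`) is hypothetical, and the choice of accuracy and mean squared error as the per-task metrics is an assumption made for illustration. It shows one plausible way to report each client's improvement over a locally trained baseline when clients are scored with different metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of predictions that exactly match the labels (higher is better)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error (lower is better)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical registry: task type -> (metric function, higher_is_better).
# The task names and metric choices are assumptions, not the benchmark's own.
METRICS: Dict[str, Tuple[Callable[..., float], bool]] = {
    "classification": (accuracy, True),
    "regression": (mse, False),
}


@dataclass
class ClientResult:
    """Held-out predictions for one simulated client."""
    client_id: int
    task: str                        # key into METRICS
    y_true: Sequence[float]
    local_pred: Sequence[float]      # from a model trained on this client only
    federated_pred: Sequence[float]  # from the model after federated training


def client_improvements(results: List[ClientResult]) -> Dict[int, float]:
    """Return each client's improvement, oriented so positive means better."""
    improvements: Dict[int, float] = {}
    for r in results:
        metric, higher_is_better = METRICS[r.task]
        local_score = metric(r.y_true, r.local_pred)
        fed_score = metric(r.y_true, r.federated_pred)
        # Flip the sign for error-style metrics so that every task reads
        # the same way: a positive value means federated training helped.
        delta = fed_score - local_score if higher_is_better else local_score - fed_score
        improvements[r.client_id] = delta
    return improvements
```

Because the raw values come from different metrics, improvements are comparable in sign but not in magnitude across task types; a real evaluation would likely normalize them or report them per task.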

Provider:

Alibaba Group

Created:

2022-06-08
