Hard-Bench

Name: Hard-Bench
Creator: Shanghai AI Laboratory
Published: 2023-03-09 14:28:38
License: Not specified

Source: arXiv. Updated: 2023-03-09. Indexed: 2024-06-21.

Download link:

https://github.com/Qian2333/Hard-Bench

Description:

Hard-Bench is a challenging benchmark for evaluating how well state-of-the-art neural networks learn in low-resource settings. Created by the Shanghai AI Laboratory, it comprises 11 sub-datasets: 3 computer vision (CV) datasets and 8 natural language processing (NLP) datasets. The benchmark is designed to expose the performance degradation of existing models on hard examples. In particular, pre-trained networks that perform well on traditional low-resource benchmarks show no improvement on Hard-Bench. This finding suggests that a substantial robustness gap remains between current models and human-level performance.

Provider:

Shanghai AI Laboratory

Created:

2023-03-07
