Hard-Bench

Source: arXiv. Updated: 2023-03-09. Indexed: 2024-06-21.
Download link:
https://github.com/Qian2333/Hard-Bench
Description:
Hard-Bench is a challenging benchmark for evaluating how well state-of-the-art neural networks learn in low-resource settings. Created by the Shanghai AI Laboratory, it comprises 11 sub-datasets: 3 computer vision (CV) datasets and 8 natural language processing (NLP) datasets. The benchmark is designed to expose the performance degradation of existing models on hard examples. In particular, pre-trained networks that perform well on traditional low-resource benchmarks show no improvement on Hard-Bench. This finding indicates that a substantial robustness gap remains between existing models and human-level performance.
Provided by:
Shanghai AI Laboratory
Created:
2023-03-07