Expanded Benchmark

Name: Expanded Benchmark
Creator: Télécom Paris
Published: 2023-05-17 23:20:31
License: No description available

Source: arXiv; updated 2023-05-17; indexed 2024-06-21

Download link:

https://github.com/AnasHimmi/MissingDataRanking

Description:

The Expanded Benchmark is a large-scale dataset created by a research team at Télécom Paris to evaluate how natural language processing (NLP) systems can be assessed when some scores are missing. It contains more than 131 million scores across a wide range of tasks and metrics, far exceeding existing benchmarks in scale. To build it, the team collected and merged data from several sources, including the GLUE, SGLUE, XTREME, and GEM benchmarks. The dataset is intended for comprehensive evaluation of NLP systems, in particular for handling the missing scores that arise in real-world settings, with the goal of making evaluation more reliable and valid.

Provider:

Télécom Paris

Created:

2023-05-17
