SuperCLUE

Name: SuperCLUE
Creator: CLUE
License: Not specified

Indexed from arXiv on 2025-09-30

Download link:

https://www.cluebenchmarks.com/en/index.html

Description:

SuperCLUE is a comprehensive benchmark for evaluating the performance of Chinese large language models (LLMs) across several task types: user queries, open-ended questions, and closed-ended questions. It comprises three subtasks: 1) user queries and their ratings, collected from an LLM battle platform; 2) open-ended questions, covering both single-turn and multi-turn dialogues; 3) closed-ended questions, used to assess model performance in real-world applications.

Provided by:

CLUE
