TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_r1distill-eval_rl

Name: TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_r1distill-eval_rl
Creator: TAUR-dev
Published: 2025-11-03 22:04:34
License: not specified

Source: Hugging Face. Updated: 2025-11-03. Indexed: 2025-11-15.

Download link:

https://hf-mirror.com/datasets/TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_r1distill-eval_rl

Description:

Each record contains a question, its answer, the task configuration and task source, the prompt (content and role), the model responses, and evaluation information for those responses. The evaluation fields cover the correctness of each response, the best-answer label for the model responses, and related evaluation metadata; records also carry the original dataset split and general metadata. Further fields are: evaluation date, answer index, choices (label and text), unique identifier, difficulty, domain, evaluation type, expected answer format, original answer, source, task type, variant, acronym, formed acronym, word count, word list, length, and letters. The dataset provides a training split of 801,821,338 bytes containing 11,481 examples.
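As a rough illustration of the figures above, the following sketch computes the average size of one example and shows one way the training split might be loaded. It assumes the Hugging Face `datasets` package is installed and that the split is exposed under the name "train"; the helper names `mean_bytes_per_example` and `load_train_split` are hypothetical and not part of the dataset. Exact column names are not given in the description, so the sketch does not guess them.

```python
# Figures quoted in the dataset description.
TOTAL_BYTES = 801_821_338
NUM_EXAMPLES = 11_481

REPO_ID = (
    "TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval__3args_r1distill-eval_rl"
)


def mean_bytes_per_example(total_bytes: int, num_examples: int) -> float:
    """Return the average size of one example, in bytes."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return total_bytes / num_examples


def load_train_split():
    """Load the training split (needs the `datasets` package and network access)."""
    # Imported lazily so the size helper above works without the package.
    from datasets import load_dataset

    # Assumption: the training set is published as a split named "train".
    return load_dataset(REPO_ID, split="train")
```

Calling `mean_bytes_per_example(TOTAL_BYTES, NUM_EXAMPLES)` gives roughly 70 KB per example, which suggests long model responses in each record. After `ds = load_train_split()`, inspecting `ds.column_names` shows the actual field names instead of relying on the prose list.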

Provider:

TAUR-dev
