TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-longmult_2dig-eval_rl

Name: TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-longmult_2dig-eval_rl
Creator: TAUR-dev
Published: 2025-11-09 23:46:47
License: No description available

Source: Hugging Face. Updated 2025-11-09; indexed 2025-11-15.

Download link:

https://hf-mirror.com/datasets/TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-longmult_2dig-eval_rl

Description:

The dataset contains question, answer, and prompt fields, along with several fields related to model responses and evaluation. The test split contains 1,000 examples, and the dataset totals 7,407,636 bytes. Its structure is designed to store and evaluate model answers to questions, including details on best-answer selection and evaluation metrics.
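To put the stated figures in perspective, the following is a minimal Python sketch, not an official loading recipe. It assumes the test split is the only split, so that the total size can be divided by the example count, and that the dataset loads with its default configuration through the Hugging Face `datasets` library. The helper names `average_bytes_per_example` and `load_test_split` are hypothetical and chosen for illustration.

```python
# Repository ID as given on this page.
DATASET_ID = (
    "TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_"
    "NoReflects-RL-longmult_2dig-eval_rl"
)
TOTAL_BYTES = 7_407_636    # total dataset size stated in the description
NUM_TEST_EXAMPLES = 1000   # test split size stated in the description


def average_bytes_per_example(total_bytes: int, num_examples: int) -> float:
    """Return the mean size of one example in bytes."""
    if num_examples <= 0:
        raise ValueError("num_examples must be positive")
    return total_bytes / num_examples


def load_test_split():
    """Load the test split from the Hugging Face Hub (needs network access).

    Assumption: the default configuration is sufficient; a config name
    may have to be passed if the repository defines more than one.
    """
    # Imported lazily so the size arithmetic works without the library.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="test")
```

Calling `average_bytes_per_example(TOTAL_BYTES, NUM_TEST_EXAMPLES)` gives roughly 7.4 KB per example under the single-split assumption. After `load_test_split()`, inspecting `column_names` and `num_rows` on the returned object is a quick way to confirm the actual field names and the example count, since this page does not list the exact column names.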

Provided by:

TAUR-dev
