TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_InstOnly-RL-commonsenseQA-eval_rl
收藏Hugging Face2025-11-09 更新2025-11-15 收录
下载链接:
https://hf-mirror.com/datasets/TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_InstOnly-RL-commonsenseQA-eval_rl
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含问题、答案以及与任务相关的配置信息。每个样本包括一个或多个提示,每个提示包含内容和角色信息。数据集还包含了模型响应及其评估信息,包括是否正确、提取的答案、提取和评估的元数据等。此外,还包括了答案的索引和键、选项、ID等信息。数据集分为测试集,测试集包含了1221个示例,大小为9007001字节。
The dataset includes questions, answers, and task configuration information. Each sample consists of one or more prompts, each with content and role information. The dataset also contains model responses and their evaluation information, including correctness, extracted answers, extraction and evaluation metadata, etc. In addition, it includes answer index, answer key, choices, ID, and other information. The dataset is split into a test set, which contains 1221 examples and is 9007001 bytes in size.
提供机构:
TAUR-dev



