TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-gsm8k__v1
Source: Hugging Face. Updated 2025-11-09; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-gsm8k__v1
Description:
This is an evaluation experiment dataset for assessing model performance on the gsm8k task. It contains the fields needed for evaluation, such as questions, answers, task configurations, prompts, and model responses, together with log information and metadata about the experiment.
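As a rough illustration of how records with answers and model responses might be scored, the sketch below computes a final-number accuracy over a list of rows. The column names `answer` and `model_response` are hypothetical, since the listing does not give the actual column names, and the scoring rule (compare the last number in each string) is an assumption rather than the method used in the experiment.

```python
import re
from typing import Iterable, Mapping, Optional

# Hypothetical column names: the listing only says the dataset holds
# answers and model responses, not what the columns are called.
ANSWER_FIELD = "answer"
RESPONSE_FIELD = "model_response"

# An optional minus sign, digits with optional thousands separators,
# and an optional decimal part.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def last_number(text: str) -> Optional[float]:
    """Return the last number found in text as a float, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def final_number_accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Fraction of rows whose response ends on the same number as the answer."""
    rows = list(rows)
    if not rows:
        return 0.0
    correct = 0
    for row in rows:
        predicted = last_number(str(row[RESPONSE_FIELD]))
        expected = last_number(str(row[ANSWER_FIELD]))
        if predicted is not None and predicted == expected:
            correct += 1
    return correct / len(rows)
```

Any iterable of dict-like rows can be passed to `final_number_accuracy`; the field constants would need to be changed to match the real column names once they are known.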
Provider:
TAUR-dev



