unlearning-cleanslate/generations-llama-3_1-8b-pre_val
Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/unlearning-cleanslate/generations-llama-3_1-8b-pre_val
Description:
This dataset is a multi-task reasoning evaluation collection. It covers the ARC Challenge and multiple BBH (Big-Bench Hard) tasks: boolean expressions, causal judgement, date understanding, disambiguation QA, formal fallacies, geometric shapes, logical deduction, movie recommendation, multistep arithmetic, navigation, object counting, table-based reasoning, and reasoning about colored objects. Each task configuration contains inputs, targets, generation arguments, model responses, and evaluation metrics, and is intended for assessing language models' performance on complex reasoning tasks.
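As a rough illustration of how records that pair a target with a model response might be scored, the sketch below computes an exact-match accuracy in plain Python. The field names `target` and `response`, the helper `exact_match_accuracy`, and the sample records are all hypothetical; the actual column names and metrics stored in each task configuration are not listed here and may differ.

```python
from typing import Iterable, Mapping


def exact_match_accuracy(
    records: Iterable[Mapping[str, str]],
    target_key: str = "target",      # hypothetical field name
    response_key: str = "response",  # hypothetical field name
) -> float:
    """Return the fraction of records whose response equals its target.

    The comparison ignores surrounding whitespace and letter case.
    Returns 0.0 for an empty input.
    """
    total = 0
    correct = 0
    for record in records:
        total += 1
        target = record[target_key].strip().lower()
        response = record[response_key].strip().lower()
        if response == target:
            correct += 1
    return correct / total if total else 0.0


# Made-up records for illustration only; not taken from the dataset.
sample = [
    {"input": "not ( True ) and ( True ) is", "target": "False", "response": " false"},
    {"input": "True and not False is", "target": "True", "response": "False"},
]
accuracy = exact_match_accuracy(sample)
```

Exact match is only one possible metric; multiple-choice tasks such as the ARC Challenge are often scored differently, so the metrics shipped with each configuration should take precedence over this sketch.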
Provided by:
unlearning-cleanslate



