unlearning-cleanslate/generations-07-qwen3-8b-simnpo-gentle-bm25-10b-target-100-checkpoint-374

Name: unlearning-cleanslate/generations-07-qwen3-8b-simnpo-gentle-bm25-10b-target-100-checkpoint-374
Creator: unlearning-cleanslate
Published: 2026-04-29 22:38:13
License: not specified

Hugging Face | Updated 2026-04-29 | Indexed 2026-05-03

Download link:

https://hf-mirror.com/datasets/unlearning-cleanslate/generations-07-qwen3-8b-simnpo-gentle-bm25-10b-target-100-checkpoint-374

Resource overview:

This dataset contains multiple configurations, intended mainly for AI reasoning and evaluation tasks. They include the ARC Challenge dataset of multiple-choice reasoning questions, and variants of Big-Bench Hard (BBH) tasks run with chain-of-thought (CoT), few-shot prompting. The BBH tasks cover boolean expressions, causal judgement, date understanding, disambiguation QA, Dyck languages, formal fallacies, geometric shapes, hyperbaton, logical deduction (with three, five, and seven objects), movie recommendation, multistep arithmetic, navigation, object counting, penguins in a table, reasoning about colored objects, and ruin names. Each configuration has features for the input, target, generation arguments, model responses, filtered responses, and evaluation scores, for use in model generation and performance assessment.
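As a rough illustration of how the per-example features described above might be used, the sketch below averages an evaluation score over a handful of records. The field names (`input`, `target`, `filtered_resps`, `exact_match`) and the sample records are hypothetical and made up for illustration; the actual column names and values in each configuration may differ.

```python
# Minimal sketch: average a per-example score over records from one configuration.
# All field names and sample values below are assumptions, not taken from the dataset.

def mean_score(records, score_key="exact_match"):
    """Return the mean of `score_key` over records that have it, or None if none do."""
    scores = [float(r[score_key]) for r in records if score_key in r]
    return sum(scores) / len(scores) if scores else None


# Hypothetical records shaped like the features listed in the overview.
sample = [
    {"input": "not ( True ) and ( True ) is", "target": "False",
     "filtered_resps": ["False"], "exact_match": 1.0},
    {"input": "True and not not ( not False ) is", "target": "True",
     "filtered_resps": ["False"], "exact_match": 0.0},
]

print(mean_score(sample))
```

Passing the score column name as a parameter keeps the helper usable if different configurations report their scores under different names.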

Provider:

unlearning-cleanslate