WebInstruct-CFT
Source: ModelScope community · Updated 2026-01-02 · Indexed 2025-02-08
Download link:
https://modelscope.cn/datasets/AI-ModelScope/WebInstruct-CFT
Description:
# WebInstruct-CFT Dataset
This dataset is introduced in our paper [Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate](https://huggingface.co/papers/2501.17703).
| [**🚀Project Page**](https://tiger-ai-lab.github.io/CritiqueFineTuning/) | [**📖Paper**](https://arxiv.org/pdf/2501.17703) | [**🔗Github**](https://github.com/TIGER-AI-Lab/CritiqueFineTuning) | [**🤗7B Model**](https://huggingface.co/TIGER-Lab/Qwen2.5-Math-7B-CFT) | [**🤗32B Model**](https://huggingface.co/TIGER-Lab/Qwen2.5-32B-Instruct-CFT) |
## Overview
WebInstruct-CFT is a critique-based instruction dataset derived from WebInstruct. Unlike traditional instruction datasets that focus on correct answers, our dataset includes critiques of responses, enabling models to learn through critical analysis.
## Dataset Composition
The original WebInstruct dataset covers diverse topics:
- Mathematics (65%)
- Business (10%)
- Physics (8%)
- Chemistry (4%)
- Humanities (4%)
- Other topics
We provide three variants:
- `WebInstruct-CFT-600K`: Full version of our dataset
- `WebInstruct-CFT-50K`: Medium-sized subset used to train [Qwen2.5-Math-7B-CFT](https://huggingface.co/TIGER-Lab/Qwen2.5-Math-7B-CFT)
- `WebInstruct-CFT-4K`: Small subset used to train [Qwen2.5-32B-Instruct-CFT](https://huggingface.co/TIGER-Lab/Qwen2.5-32B-Instruct-CFT)
## Data Format
Each example follows this structure:
```json
{
"instruction": "Please critique whether the following solution to the question is correct.",
"input": "Question:\n[The original question]\n\nSolution:\n[The original response to be critiqued]",
"output": "[GPT-4o generated detailed critique of the response]"
}
```
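As a minimal sketch of working with this layout, the snippet below assembles one record and splits its `input` field back into question and solution. The helper names `build_example` and `split_input`, and the sample question, solution, and critique, are hypothetical and made up for illustration; they are not part of the dataset or its tooling.

```python
import json

# Instruction string copied from the format shown above.
INSTRUCTION = "Please critique whether the following solution to the question is correct."

# Layout markers of the "input" field, as shown in the format above.
QUESTION_PREFIX = "Question:\n"
SOLUTION_MARKER = "\n\nSolution:\n"


def build_example(question: str, solution: str, critique: str) -> dict:
    """Assemble one record with the instruction/input/output keys (hypothetical helper)."""
    return {
        "instruction": INSTRUCTION,
        "input": f"{QUESTION_PREFIX}{question}{SOLUTION_MARKER}{solution}",
        "output": critique,
    }


def split_input(input_text: str) -> tuple[str, str]:
    """Recover (question, solution) from an "input" field (hypothetical helper).

    Assumption: the text starts with the question prefix and the first
    occurrence of the solution marker separates the two parts.
    """
    if not input_text.startswith(QUESTION_PREFIX) or SOLUTION_MARKER not in input_text:
        raise ValueError("input does not match the expected layout")
    body = input_text[len(QUESTION_PREFIX):]
    question, solution = body.split(SOLUTION_MARKER, 1)
    return question, solution


# Made-up sample content, for illustration only.
example = build_example(
    question="What is 2 + 2?",
    solution="2 + 2 = 5",
    critique="The solution is incorrect: 2 + 2 = 4.",
)
question, solution = split_input(example["input"])
print(json.dumps(example, indent=2))
```

Splitting on the first marker is a simplification: a question that itself contains a blank line followed by `Solution:` would be split in the wrong place, so real records may need a stricter check.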
## Citations
```bibtex
@misc{wang2025critiquefinetuninglearningcritique,
title={Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate},
author={Yubo Wang and Xiang Yue and Wenhu Chen},
year={2025},
eprint={2501.17703},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.17703},
}
```
Provider:
maas
Created:
2025-02-05



