COIG-P
COIG-P Dataset Overview
Basic Information
- Name: COIG-P (Chinese Open Instruction Generalist - Preference)
- Type: Chinese preference dataset
- Size: 1,006k Chinese preference pairs
- Domains: 6 diverse domains
  - Chat
  - Code
  - Math
  - Logic
  - Novel
  - Role
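Dataset Features
- High quality: built with an LLM-based annotation pipeline for Chinese preference data
- Automated annotation: 15 strong LLMs are used to generate and score chosen-rejected response pairs
- Source data: 92k high-quality Chinese queries, crawled and then filtered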
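As a rough illustration of how scored responses can be turned into chosen-rejected pairs, here is a minimal sketch. It is not the authors' pipeline: the function name `build_preference_pairs`, the record keys, the score scale, and the `min_gap` threshold are all assumptions made for this example.

```python
from itertools import combinations


def build_preference_pairs(query, scored_responses, min_gap=2.0):
    """Turn scored responses to one query into chosen/rejected records.

    scored_responses: list of (response_text, score) tuples; the scores are
    assumed to come from an LLM judge.
    min_gap: hypothetical threshold; pairs with a smaller score difference
    are treated as ambiguous and dropped.
    """
    pairs = []
    for (resp_a, score_a), (resp_b, score_b) in combinations(scored_responses, 2):
        if abs(score_a - score_b) < min_gap:
            continue  # scores too close to express a clear preference
        if score_a > score_b:
            chosen, rejected = resp_a, resp_b
        else:
            chosen, rejected = resp_b, resp_a
        pairs.append({"query": query, "chosen": chosen, "rejected": rejected})
    return pairs


# Toy input: three candidate answers with made-up judge scores.
example = build_preference_pairs(
    "用一句话解释什么是递归。",
    [("answer A", 9.0), ("answer B", 6.5), ("answer C", 8.0)],
)
```

With these toy scores only the A/B pair clears the gap, so `example` holds a single record. Filtering on a score gap is one common way to keep only clear-cut preferences; the threshold would need tuning per domain.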
Related Resources
- Paper: COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
- Dataset repository: m-a-p/COIG-P
- Derived model and benchmark:
  - 8B-sized Chinese Reward Model (CRM)
  - Chinese Reward Benchmark (CRBench)
Loading the Dataset

    from datasets import load_dataset

    # Load COIG-P from the Hugging Face Hub
    dataset = load_dataset("m-a-p/COIG-P")
Related Datasets
- COIG-P-CRM: a data subset for training Chinese reward models

      from datasets import load_dataset

      dataset = load_dataset("m-a-p/COIG-P-CRM")

- Chinese Reward Benchmark: a benchmark for Chinese reward models

      from datasets import load_dataset

      dataset = load_dataset("m-a-p/COIG-CRBench")
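A reward benchmark of chosen-rejected pairs is typically scored by checking how often a reward model rates the chosen response above the rejected one. The sketch below shows that metric under stated assumptions: the record keys (`query`, `chosen`, `rejected`) are hypothetical and may not match the actual CRBench columns, and `length_scorer` is a toy stand-in for a real reward model.

```python
def pairwise_accuracy(records, score_fn):
    """Fraction of records where score_fn ranks chosen above rejected."""
    if not records:
        raise ValueError("records must not be empty")
    correct = sum(
        1
        for r in records
        if score_fn(r["query"], r["chosen"]) > score_fn(r["query"], r["rejected"])
    )
    return correct / len(records)


def length_scorer(query, response):
    """Toy scorer (not a real reward model): longer responses score higher."""
    return len(response)


toy_records = [
    {"query": "q1", "chosen": "a detailed answer", "rejected": "short"},
    {"query": "q2", "chosen": "ok", "rejected": "a long but wrong answer"},
]
toy_accuracy = pairwise_accuracy(toy_records, length_scorer)
```

The length-based scorer gets the first toy record right and the second wrong, which is why a benchmark needs pairs where surface features such as length do not predict the label.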
Use Cases
- DPO training: training scripts are provided, built on Llama-Factory
- Model evaluation: models are evaluated on AlignBench and KOR-Bench
- Reward model training: built on RLHF-Reward-Modeling
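For readers new to these objectives, the following sketch computes the standard DPO loss and the Bradley-Terry pairwise loss commonly used for reward models, each for a single chosen-rejected pair. It is a plain-Python illustration, not the code from Llama-Factory or RLHF-Reward-Modeling; the function names and the default `beta=0.1` are assumptions.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one pair, given sequence log-probabilities.

    The loss falls as the policy raises the chosen response's likelihood
    relative to the reference model more than it raises the rejected one's.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_logratio - rejected_logratio))


def reward_model_loss(chosen_reward, rejected_reward):
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected)."""
    return -log_sigmoid(chosen_reward - rejected_reward)
```

When the policy equals the reference model, both log-ratios are zero and `dpo_loss` returns log 2; training pushes the loss below that value by widening the margin between chosen and rejected responses.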
Pretrained Models
- Qwen2-Instruct-7B-COIG-P
- Qwen2.5-Instruct-7B-COIG-P
- Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P
- Infinity-Instruct-3M-0625-Mistral-7B-COIG-P
- Infinity-Instruct-3M-0625-Llama3-8B-COIG-P
Citation

    @misc{pteam2025coigphighqualitylargescalechinese,
      title={COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values},
      author={P Team and Siwei Wu and Jincheng Ren and Xinrun Du and Shuyue Guo and Xingwei Qu and Yiming Liang and Jie Liu and Yunwen Li and Tianyu Zheng and Boyu Feng and Huaqing Yuan and Zenith Wang and Jiaheng Liu and Wenhao Huang and Chenglin Cai and Haoran Que and Jian Yang and Yuelin Bai and Zekun Moore Wang and Zhouliang Yu and Qunshu Lin and Ding Pan and Yuchen Jiang and Tiannan Wang and Wangchunshu Zhou and Shenzhi Wang and Xingyuan Bu and Minghao Liu and Guoyin Wang and Ge Zhang and Chenghua Lin},
      year={2025},
      eprint={2504.05535},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.05535},
    }