COIG-CQIA
Source: arXiv. Indexed 2025-09-30.
Download link:
https://huggingface.co/datasets/m-a-p/COIG-CQIA
Overview:
This is a new, rigorously human-reviewed Chinese instruction-tuning dataset drawn from a variety of real-world sources. It spans general knowledge, STEM (science, technology, engineering, and mathematics), and the humanities, and covers task types such as information extraction and question answering. It contains 48,375 instances from 22 sources and is intended for instruction tuning of Chinese language models.
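As a rough illustration of how the download link above might be used, the following is a minimal sketch that assumes the Hugging Face `datasets` library is installed and that the repository is split into named subsets. The helper names (`dataset_url`, `list_subsets`, `load_subset`) are hypothetical, and the default `"train"` split is an assumption rather than something this listing states.

```python
REPO_ID = "m-a-p/COIG-CQIA"  # repository id taken from the download link


def dataset_url(repo_id: str = REPO_ID) -> str:
    """Build the Hugging Face dataset page URL for a repository id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def list_subsets(repo_id: str = REPO_ID) -> list[str]:
    """Return the configuration (subset) names published for the repository."""
    # Imported lazily so the module loads without the library; needs network access.
    from datasets import get_dataset_config_names
    return get_dataset_config_names(repo_id)


def load_subset(subset: str, split: str = "train", repo_id: str = REPO_ID):
    """Load one subset of the dataset; the "train" default is an assumption."""
    from datasets import load_dataset
    return load_dataset(repo_id, subset, split=split)
```

Calling `list_subsets()` first and then passing one of the returned names to `load_subset` avoids guessing subset names. Whether the subsets map one-to-one onto the 22 sources is not stated in this listing.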



