five

m-a-p/COIG-Writer

收藏
Hugging Face2026-02-09 更新2025-05-31 收录
下载链接:
https://hf-mirror.com/datasets/m-a-p/COIG-Writer
下载链接
链接失效反馈
官方服务:
资源简介:
该数据集提供了一个高质量的中文创作与思考过程数据集,包含了各种类型的中文创意写作文本,每个文本都配有一个详细的“Query”(提示)和一个“Thought”(清晰的思考过程)。该数据集旨在解决机器生成文本中常见的“AI风味”问题,如逻辑不一致、缺乏个性、分析肤浅、语言过于繁琐或叙事发展薄弱等。主要目标是提供一个资源,帮助训练语言模型生成流畅、具有深度连贯性、个性、洞察力和复杂叙事结构的内容,更接近人类创作的作品。数据集涵盖了大约50个子领域的中文创意写作和其他文本生成任务。数据集中的所有文本均为**简体中文(zh-CN**)。

This dataset provides a collection of high-quality Chinese creative writing pieces and other text types (like scientific popularization articles), each accompanied by a detailed Query (prompt) and a Thought (an articulated thinking process). It has been developed to tackle the common AI flavor often found in machine-generated text, which can include issues like logical inconsistencies, a lack of distinct personality, superficial analysis, overly elaborate language, or weak narrative development. The primary goal is to offer a resource that aids in training language models to produce content that is not only fluent but also exhibits deeper coherence, individuality, insightful perspectives, and sophisticated narrative construction, aligning more closely with human-authored compositions. The dataset covers approximately 50 sub-fields within Chinese creative writing and other text generation tasks. All text in this dataset is in **Simplified Chinese (zh-CN)**.
提供机构:
m-a-p
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作