internlm/Condor-SFT-20K
Source: Hugging Face. Updated: 2025-01-23. Indexed: 2025-04-08.
Download link:
https://hf-mirror.com/datasets/internlm/Condor-SFT-20K
Overview:
Condor is a two-stage data generation pipeline, developed by InternLM, for producing high-quality Supervised Fine-Tuning (SFT) data; Condor-SFT-20K is a dataset produced with it. In the data synthesis stage, a World Knowledge Tree is used to generate data. In the data refinement stage, a Self-Reflection Refinement strategy is applied to improve the responses. Condor aims to improve the conversational ability of Large Language Models (LLMs), particularly when only a small amount of high-quality data is available. The dataset is intended for text generation tasks and covers both English and Chinese.
Provider:
internlm



