cognitivecomputations/OpenCoder-LLM_opc-sft-stage1-DolphinLabeled
Source: Hugging Face. Updated: 2025-01-07. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/cognitivecomputations/OpenCoder-LLM_opc-sft-stage1-DolphinLabeled
Description:
The OpenCoder-LLM SFT DolphinLabeled dataset is intended to make the OpenCoder-LLM SFT dataset easier to filter. It differs from the original in two ways: duplicate instructions have been removed, and a flags column has been added that labels each output for refusal to answer, unsolicited advice, inappropriate content, personal information, and disclaimers. The code-related instructions come from three sources: a filtered subset of infinity_instruct, real user conversation histories, and diverse instructions generated from seed data.
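As a rough illustration of how the flags column might be used for filtering, the sketch below keeps only rows whose flags are all unset. The key names in FLAG_KEYS and the "instruction" field are assumptions inferred from the description, not confirmed column names, and the rows are toy stand-ins rather than real records.

```python
from typing import Any

# Hypothetical flag names inferred from the description; the real
# flags column may use different keys.
FLAG_KEYS = ("refusal", "unsolicited", "nsfw", "pii", "disclaimer")


def is_clean(row: dict[str, Any], blocked: tuple[str, ...] = FLAG_KEYS) -> bool:
    """Return True when none of the blocked flags is set on the row."""
    flags = row.get("flags") or {}
    return not any(flags.get(key, False) for key in blocked)


# Toy rows standing in for dataset records (not real data).
rows = [
    {"instruction": "Reverse a string in Python.",
     "flags": {"refusal": False, "disclaimer": False}},
    {"instruction": "Write a port scanner.",
     "flags": {"refusal": True, "disclaimer": False}},
    {"instruction": "Sort a list.",
     "flags": {"refusal": False, "disclaimer": True}},
]

# Strict filter: drop a row if any flag is set.
clean_rows = [row for row in rows if is_clean(row)]

# Looser filter: drop refusals only.
no_refusals = [row for row in rows if is_clean(row, blocked=("refusal",))]
```

Passing a narrower blocked tuple lets a caller drop only the categories it cares about, for example refusals, while keeping rows that merely carry a disclaimer.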
Provider:
cognitivecomputations



