allenai/Dolci-Think-RL-7B-Completions-SFT

Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/allenai/Dolci-Think-RL-7B-Completions-SFT
Overview:
Dolci-Think-Completions-SFT contains 5,031,398 completions generated by the Olmo-3-7B-Think-SFT model over the prompts considered when building Dolci-Think-RL. The completions were used mainly to filter out easy data. The set covers 636,095 high-quality prompts across five domains: Math, Code, Precise Instruction Following, General Chat, and Puzzles. Prompts are drawn from multiple sources, including IF Multi-Constraint, OMEGA Math, and AceCoder, and were processed with keyword and topic filtering, execution-based test-case validation, and F1-score filtering. The dataset has five splits (coding, general, if, math, and puzzle) and is licensed under ODC-BY for research and educational use.
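The overview says the completions were used mainly to filter easy data but does not say how. One common approach is to drop prompts that the model already solves most of the time. The sketch below illustrates that idea only; the field names `prompt_id` and `is_correct`, the helper names, and the 0.8 threshold are all hypothetical and are not taken from the dataset card.

```python
from collections import defaultdict


def easy_prompt_ids(completions, max_pass_rate=0.8):
    """Return ids of prompts whose pass rate exceeds max_pass_rate.

    `completions` is an iterable of dicts with hypothetical keys
    `prompt_id` and `is_correct` (assumed schema, for illustration).
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for row in completions:
        totals[row["prompt_id"]] += 1
        passes[row["prompt_id"]] += int(row["is_correct"])
    return {pid for pid in totals if passes[pid] / totals[pid] > max_pass_rate}


def drop_easy_prompts(prompts, completions, max_pass_rate=0.8):
    """Keep only prompts that are not classified as easy."""
    easy = easy_prompt_ids(completions, max_pass_rate)
    return [p for p in prompts if p["prompt_id"] not in easy]


# Toy data: prompt "a" is always solved, prompt "b" is solved half the time.
completions = [
    {"prompt_id": "a", "is_correct": True},
    {"prompt_id": "a", "is_correct": True},
    {"prompt_id": "b", "is_correct": True},
    {"prompt_id": "b", "is_correct": False},
]
prompts = [{"prompt_id": "a"}, {"prompt_id": "b"}]
kept = drop_easy_prompts(prompts, completions)
```

With the toy data, prompt "a" has a pass rate of 1.0 and is dropped, while prompt "b" has a pass rate of 0.5 and is kept. The actual criterion used for Dolci-Think-RL may differ.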
Provider:
allenai