allenai/Dolci-Think-RL-7B-Completions-DPO

Name: allenai/Dolci-Think-RL-7B-Completions-DPO
Creator: allenai
Published: 2025-12-12 03:35:25
License: No description available

Hugging Face | Updated 2025-12-12 | Indexed 2025-12-20

Download link:

https://hf-mirror.com/datasets/allenai/Dolci-Think-RL-7B-Completions-DPO

Resource overview:

Dolci-Think-Completions-DPO is a set of 4,345,797 completions generated by the Olmo-3-7B-Think-DPO model over the prompts considered when building Dolci-Think-RL. It contains 556,095 high-quality prompts covering Math, Code, Precise Instruction Following, General Chat, and Puzzles. Each split covers one of these domains, and the original_dataset column records the source dataset. Sources include multiple public datasets and papers, such as IF Multi-Constraint, OMEGA Math, and AceCoder. Processing steps include keyword and topic filtering, execution-based test-case validation, and F1-score filtering, among others.
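To get a feel for how the splits and the original_dataset column fit together, the following minimal sketch may help. It assumes the Hugging Face `datasets` library is installed and that the repository can be streamed. The helper names (`count_by_source`, `sample_split`, `summarize_all_splits`) and the sample size are illustrative choices, not part of the dataset. Split names are queried at run time rather than hard-coded, since they are not listed here.

```python
from collections import Counter
from itertools import islice

REPO_ID = "allenai/Dolci-Think-RL-7B-Completions-DPO"


def count_by_source(rows, column="original_dataset"):
    """Count records per source dataset; `rows` is any iterable of dict-like records."""
    return Counter(row[column] for row in rows)


def sample_split(split, limit=1000):
    """Stream the first `limit` rows of one split (needs network access)."""
    # Imported lazily so the pure helpers above work without the library.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, limit))


def summarize_all_splits(limit=1000):
    """Print the most common sources in a small sample of every split."""
    from datasets import get_dataset_split_names

    for split in get_dataset_split_names(REPO_ID):
        counts = count_by_source(sample_split(split, limit))
        print(split, counts.most_common(5))


# Rough average from the figures quoted in the description. This assumes both
# counts refer to the same prompt set, which the description does not confirm.
completions_per_prompt = 4_345_797 / 556_095
```

Calling `summarize_all_splits()` streams only a small sample per split, so it avoids downloading all of the completions. The counts it prints describe that sample only, not the full split.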

Provider:

allenai