five

leideng/Dolci-Think-SFT-7B-4K-Plus

收藏
Hugging Face2026-04-19 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/leideng/Dolci-Think-SFT-7B-4K-Plus
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: messages list: - name: content dtype: string - name: role dtype: string - name: dataset_source dtype: string - name: id dtype: string splits: - name: train num_bytes: 77877465405 num_examples: 2268178 download_size: 36139346939 dataset_size: 77877465405 configs: - config_name: 4k_plus data_files: - split: train path: data_4k/train-* default: true - config_name: full data_files: - split: train path: data/train-* --- # Note >[!NOTE] > >This is the filtered version of [allenai/Dolci-Think-SFT-7B](https://huggingface.co/datasets/allenai/Dolci-Think-SFT-7B) where only thoese data samples with more than 4096 tokens by Qwen3 tokenizer are kept. # Dolci-Think-SFT Sources include a mixture of existing reasoning traces: * [OpenThoughts 3](https://huggingface.co/datasets/open-thoughts/OpenThoughts3-1.2M) (Apache 2.0): Extended to 32K context length and downsampled code prompts to 16X multiple, to 941,166 total prompts. Access our version, Dolci OpenThoughts 3 here. * [SYNTHETIC-2](https://huggingface.co/datasets/PrimeIntellect/SYNTHETIC-2-SFT-verified) (Apache 2.0) via the SFT-Verified split, 104,569 prompts. * [Nemotron Post-training dataset](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v1) (CC BY 4), code split only, 113,777 prompts. New prompts and new reasoning traces from us (all ODC-BY-1.0): * Dolci Think Persona IF: New precise instruction following prompts and traces created with [Nvidia's Nemotron Post-training Personas](https://huggingface.co/datasets/nvidia/Nemotron-Personas-USA). 223,123 prompts. * Dolci Precise IF: New multi-constraint instruction following data building off Pyatkin, Valentina, et al. "[Generalizing Verifiable Instruction Following](https://arxiv.org/abs/2507.02833)." (2025). 135,792 prompts. * [Dolci Think Python](https://huggingface.co/datasets/allenai/Dolci-Think-SFT-Python): 466,677 prompts (subsampled from larger mix). Existing prompts with new reasoning traces, largely repurposed from Tülu 3 / OLMo 2, with new traces generated by a mix of DeepSeek R1 and DeepSeek R1 0528: * [WildChat](https://huggingface.co/datasets/allenai/WildChat-1M) (ODC-BY-1.0), 83,054 prompts. * [OpenAssistant Guanaco](https://huggingface.co/datasets/OpenAssistant/oasst1) (Apache 2.0), 6,800 prompts. * [CoCoNot](https://huggingface.co/datasets/allenai/coconot) (ODC-BY-1.0), 10,227 prompts. * [WildGuardMix ](https://huggingface.co/datasets/allenai/wildguardmix) (Apache 2.0), 38,315 prompts. * [WildJailbreak](https://huggingface.co/datasets/allenai/wildjailbreak) (ODC-BY-1.0) 41,100 prompts. * [Aya](https://huggingface.co/datasets/CohereForAI/aya_dataset) (Apache 2.0), 98,597 prompts. * [TableGPT](https://huggingface.co/datasets/LipengCS/Table-GPT) (MIT), 4,981 prompts. * Olmo Identity Prompts, 58 samples (we trained with 290, 5 repetitions per prompt, uploaded single repetition to HuggingFace) The counts are smaller than the original prompt sources pulled from Tülu 3 / OLMo 2 due to more extensive filtering for data quality and by topics within the Azure API (blocked requests). This dataset was used for 7B post-training, the [7B dataset](https://huggingface.co/datasets/allenai/Dolci-Think-SFT) is slightly different. ## Dataset Structure Each example in the dataset contains the standard instruction-tuning data points as follow: - `id` (str): a unique identifier - `messages` (list): message format used for supervised fine-tuning (this contains user prompt and assistant responses) - `source` (str): the source dataset for the given sample Every datapoint contains the model's reasoning in `<think>...</think>` and NO `<answer>...</answer>` tags -- the answer follows directly after `</think>`. ## Model Family | **Stage** | **Olmo 3 7B Think** | **Olmo 3 32B Think** | **Olmo 3 7B Instruct** | |--------------------------|-----------------------|------------------------|---------------------------| | **Base Model** | [Olmo-3-7B](https://huggingface.co/allenai/Olmo-3-1025-7B) | [Olmo-3-32B](https://huggingface.co/allenai/Olmo-3-1125-32B) | [Olmo-3-7B](https://huggingface.co/allenai/Olmo-3-1025-7B) | | **SFT** | [Olmo-3-7B-Think-SFT](https://huggingface.co/allenai/Olmo-3-7B-Think-SFT) | [Olmo-3-32B-Think-SFT](https://huggingface.co/allenai/Olmo-3-32B-Think-SFT) | [Olmo-3-7B-Instruct-SFT](https://huggingface.co/allenai/Olmo-3-7B-Instruct-SFT) | | **DPO** | [Olmo-3-7B-Think-DPO](https://huggingface.co/allenai/Olmo-3-7B-Think-DPO) | [Olmo-3-32B-Think-DPO](https://huggingface.co/allenai/Olmo-3-32B-Think-DPO) | [Olmo-3-7B-Instruct-DPO](https://huggingface.co/allenai/Olmo-3-7B-Instruct-DPO) | | **Final Models (RLVR)** | [Olmo-3-7B-Think](https://huggingface.co/allenai/Olmo-3-7B-Think) | [Olmo-3-32B-Think](https://huggingface.co/allenai/Olmo-3-32B-Think) | [Olmo-3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) | ## License This dataset is licensed under ODC-BY. It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use). ## Citation ``` @misc{olmo2025olmo3, title={Olmo 3}, author={Team Olmo and Allyson Ettinger and Amanda Bertsch and Bailey Kuehl and David Graham and David Heineman and Dirk Groeneveld and Faeze Brahman and Finbarr Timbers and Hamish Ivison and Jacob Morrison and Jake Poznanski and Kyle Lo and Luca Soldaini and Matt Jordan and Mayee Chen and Michael Noukhovitch and Nathan Lambert and Pete Walsh and Pradeep Dasigi and Robert Berry and Saumya Malik and Saurabh Shah and Scott Geng and Shane Arora and Shashank Gupta and Taira Anderson and Teng Xiao and Tyler Murray and Tyler Romero and Victoria Graf and Akari Asai and Akshita Bhagia and Alexander Wettig and Alisa Liu and Aman Rangapur and Chloe Anastasiades and Costa Huang and Dustin Schwenk and Harsh Trivedi and Ian Magnusson and Jaron Lochner and Jiacheng Liu and Lester James V. Miranda and Maarten Sap and Malia Morgan and Michael Schmitz and Michal Guerquin and Michael Wilson and Regan Huff and Ronan Le Bras and Rui Xin and Rulin Shao and Sam Skjonsberg and Shannon Zejiang Shen and Shuyue Stella Li and Tucker Wilde and Valentina Pyatkin and Will Merrill and Yapei Chang and Yuling Gu and Zhiyuan Zeng and Ashish Sabharwal and Luke Zettlemoyer and Pang Wei Koh and Ali Farhadi and Noah A. Smith and Hannaneh Hajishirzi}, year={2025}, eprint={2512.13961}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2512.13961}, } ```
提供机构:
leideng
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作