marin-community/open-thoughts-4-128-math-gpt-oss-120b-high-annotated-32768-tokens

Name: marin-community/open-thoughts-4-128-math-gpt-oss-120b-high-annotated-32768-tokens
Creator: marin-community
Published: 2026-03-21 15:31:16
License: No description available

Hugging Face | Updated 2026-03-21 | Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/marin-community/open-thoughts-4-128-math-gpt-oss-120b-high-annotated-32768-tokens

Resource summary:

# open-thoughts-4-128-math-gpt-oss-120b-high-annotated-32768-tokens

Math reasoning responses generated by **GPT-OSS-120B (reasoning_effort=high)** (openai/gpt-oss-120b) via the Together AI serverless API.

## Overview

- **Total rows:** 1,024
- **Unique prompts:** 128 (each with 8 response annotations)
- **Source prompts:** [marin-community/open-thoughts-4-128-math-qwen3-32b-annotated-32768-tokens-n8-reformatted](https://huggingface.co/datasets/marin-community/open-thoughts-4-128-math-qwen3-32b-annotated-32768-tokens-n8-reformatted)
- **Generation model:** [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b)
- **Max tokens:** 32,768
- **Temperature:** 0.8
- **Extra parameters:** reasoning_effort=high
- **Tokenizer used for stats:** Qwen/Qwen2.5-3B

## Statistics

| Metric | Value |
|--------|-------|
| Avg tokens per response | 17,721 |
| Median tokens per response | 14,270 |
| Responses with `<think>` tag | 100.0% |
| Complete responses (has `</think>` + `\boxed{...}`) | 785/1024 (76.7%) |
| Truncated responses | 239/1024 (23.3%) |
| Empty responses | 0 |

## Columns

| Column | Description |
|--------|-------------|
| `row_id` | Sequential identifier (0-1023) |
| `instruction_seed` | The math problem prompt |
| `gpt_oss_120b_high_generated_text` | GPT-OSS-120B (reasoning_effort=high) generated response (with `<think>...</think>` reasoning trace) |
| `ms_id` | Math seed ID -- groups all 8 responses for the same prompt |
| `_source` | Source dataset identifier |
| `gpt41_mini_response` | GPT-4.1 mini reference response |
| `length` | Response length |

## Response Format

Each response in the `gpt_oss_120b_high_generated_text` column follows this format:

```
<think>
[model's reasoning trace]
</think>
[final answer, typically containing \boxed{...}]
```

Responses that are truncated (hit the 32,768 token limit) may be missing the closing `</think>` tag and/or the `\boxed{...}` answer.
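The complete/truncated distinction described above can be approximated with a small check on each response string. This is a minimal sketch, not the script used to produce the statistics: the function name `classify_response` is hypothetical, and requiring `\boxed{` to appear after the closing `</think>` tag is an assumption that may be stricter than the card's own criterion.

```python
import re

THINK_CLOSE = "</think>"
# Matches the opening of a LaTeX \boxed{...} answer.
BOXED_OPEN = re.compile(r"\\boxed\{")


def classify_response(text: str) -> str:
    """Label a response as 'empty', 'complete', or 'truncated'.

    Assumption (not confirmed by the dataset card): a response counts as
    complete only if \\boxed{ appears after the closing </think> tag.
    """
    if not text.strip():
        return "empty"
    _reasoning, close_tag, final_answer = text.partition(THINK_CLOSE)
    if close_tag and BOXED_OPEN.search(final_answer):
        return "complete"
    return "truncated"
```

Applied to the `gpt_oss_120b_high_generated_text` column, a check of this kind should give counts close to the 785 complete and 239 truncated responses reported in the statistics table, although exact agreement depends on how the original criterion was implemented.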
## Construction

Generated by sending each of the 128 math prompts to GPT-OSS-120B (reasoning_effort=high) 8 times (n=8) via the Together AI serverless endpoint, with `max_tokens=32768` and `temperature=0.8`. The model's reasoning trace (from the `message.reasoning` API field) is wrapped in `<think>...</think>` tags.
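The wrapping step can be sketched as follows. This is an illustration under stated assumptions rather than the actual generation script: `GENERATION_PARAMS` only records the settings listed in the card and is not a verified Together AI request payload, `wrap_reasoning` is a hypothetical helper, and the exact whitespace around the tags is a guess.

```python
# Settings as listed in the dataset card. The key names are illustrative
# and are not a verified request schema for any API.
GENERATION_PARAMS = {
    "model": "openai/gpt-oss-120b",
    "max_tokens": 32768,
    "temperature": 0.8,
    "n": 8,
    "reasoning_effort": "high",
}


def wrap_reasoning(reasoning: str, content: str) -> str:
    """Combine a reasoning trace and a final answer into one response string.

    `reasoning` stands in for the `message.reasoning` field and `content`
    for the final answer text. The newline placement is an assumption.
    """
    return f"<think>\n{reasoning}\n</think>\n{content}"


if __name__ == "__main__":
    example = wrap_reasoning("2 + 2 = 4", "The answer is \\boxed{4}.")
    print(example)
```

With 128 prompts and `n=8` samples per prompt, this procedure yields the 1,024 rows reported in the overview.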

Provided by:

marin-community
