AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1

Name: AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1
Creator: AmanPriyanshu
Published: 2026-03-14 09:32:11
License: cc-by-4.0

Source: Hugging Face. Updated 2026-03-14; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1

Description:

---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- reasoning
- tool-calling
- agentic
- multi-turn
- interactive
- sft
size_categories:
- 100K<n<1M
---

# Nemotron-Agentic-v1: Cleaned & Rectified

335k multi-turn agentic tool-use trajectories from NVIDIA's Nemotron-Agentic-v1, converted into a strict reasoning + tool-call format with validated FSM transitions.

## Origin

Derived from [nvidia/Nemotron-Agentic-v1](https://huggingface.co/datasets/nvidia/Nemotron-Agentic-v1).

Nemotron-Agentic-v1 is a synthetic dataset of multi-turn conversations where language models decompose user goals, decide when to call tools, and reason over tool outputs. Trajectories are generated by simulating user, agent, and tool-execution roles using Qwen3-235B-A22B-Thinking, Qwen3-32B, GPT-OSS-120B, and Qwen3-235B-A22B-Instruct, with turn-level quality judgments to filter inconsistent or incorrect tool use.

## Format

Each row contains a structured multi-turn conversation with explicit reasoning traces and validated tool calls.

### Message Roles

| Role | Content |
|---|---|
| `system` | Tool-use protocol + cleaned JSON tool schemas + domain instructions |
| `user` | User request or follow-up in multi-turn dialogue |
| `reasoning` | `<think>…</think>`: model's step-by-step reasoning |
| `tool_call` | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>`: function invocation |
| `tool_output` | `<tool_response>…</tool_response>`: tool execution result |
| `answer` | `<answer>…</answer>`: agent's response to the user |

### Trajectory Structure

```
system → user → reasoning → [tool_call → tool_output → reasoning →]* answer → [user → reasoning → ...]
```

Conversations range from 4 to 166 turns (avg 12.6), with 0–54 tool calls per row (avg 2.1).

## Schema

Single Parquet file with zstd compression.

| Column | Type | Description |
|---|---|---|
| `messages` | string | Converted conversation (JSON list of `{role, content}`) |
| `uuid` | string | Original row UUID from Nemotron-Agentic-v1 |
| `split` | string | Source subset: `interactive_agent` or `tool_calling` |

## Data Distribution

| Split | Source Rows | Converted Rows | Pass Rate |
|---|---|---|---|
| `interactive_agent` | 19,028 | 19,028 | 100.00% |
| `tool_calling` | 316,094 | 312,166 | 98.76% |
| **Total** | **335,122** | **331,194** | **98.83%** |

### Failure Reasons (tool_calling split only)

| Reason | Count | % of Failures |
|---|---|---|
| `tool_call→tool_call` transition | 3,229 | 82.2% |
| `tool_call→reasoning` transition | 698 | 17.8% |
| `reasoning→user` transition | 1 | 0.0% |

## Usage

```py
import json
import random
from collections import Counter

import pyarrow.parquet as pq
from huggingface_hub import hf_hub_download

# Dataset ID as listed at the top of this page.
REPO = "AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1"

print("Downloading data.parquet...")
local = hf_hub_download(REPO, "data.parquet", repo_type="dataset")
t = pq.read_table(local)
print(f"Rows: {t.num_rows:,} | Cols: {t.column_names}\n")

# Count rows per source subset.
splits = [s.as_py() for s in t.column("split")]
print(f"Splits: {dict(Counter(splits).most_common())}\n")

# Pick a random row and decode its JSON-encoded conversation.
idx = random.randint(0, t.num_rows - 1)
row = {col: t.column(col)[idx].as_py() for col in t.column_names}
msgs = json.loads(row["messages"])
roles = [m["role"] for m in msgs]
print(f"Row {idx} | uuid={row['uuid']} | split={row['split']} | {len(msgs)} turns")
print(f"Roles: {' -> '.join(roles)}\n")

# Print each turn, truncating long content for readability.
for m in msgs:
    content = m["content"]
    if m["role"] == "system":
        content = content[:200] + "..."
    elif len(content) > 400:
        content = content[:400] + "..."
    print(f"[{m['role']}]")
    print(content)
    print()
```

## Subset Characteristics

### interactive_agent (19,028 rows)

- **Domain:** Customer service (food delivery, event ticketing)
- **Reasoning:** All synthesized from template pools (source `reasoning_content` is always empty)
- **Tool args:** JSON strings parsed to dicts
- **Multi-turn:** 2–5 user turns per conversation, model learns when NOT to call tools
- **Tools:** 14 fixed tools per row (domain-specific: `authenticate_user`, `get_order_status`, etc.)

### tool_calling (312,166 rows)

- **Domain:** General-purpose API usage (search, finance, utilities, social media, etc.)
- **Reasoning:** Rich native chain-of-thought (avg 2,562 chars, no `<think>` tags in source)
- **Tool args:** Already dicts (passed through directly)
- **Tools:** 1–60 per row (avg 5.6), diverse tool schemas
- **Pre-filtered:** `reasoning="on"` rows only (100% of interactive_agent, 100% of tool_calling)

## Conversion Details

- OpenAI-style `tool_calls` (with `function.name` + `function.arguments`) parsed into canonical `{"name", "arguments": dict}` format
- `reasoning_content` field on assistant messages → `<think>...</think>` reasoning turns
- When `reasoning_content` is empty (interactive_agent), bridge reasoning synthesized from 12-variation template pools
- Tool `content` handled as string, dict, or list (all serialized to JSON)
- Tool output matching via positional look-ahead from assistant message
- `strict` field stripped from tool schemas
- Bridge reasoning inserted for all forbidden transitions: `tool_output→tool_call`, `tool_output→answer`, `user→tool_call`, `user→answer`
- Consecutive reasoning turns merged
- Ending fixes: conversations always terminate with `answer` role
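
The role grammar in the trajectory diagram, together with the forbidden transitions listed above, can be checked mechanically. The following is a minimal sketch of such a check, not the validator used to build the dataset: the `ALLOWED` table and the names `find_violations` and `check_row` are assumptions made for illustration, inferred from the diagram and from the failure and forbidden-transition lists.

```python
import json

# Allowed role transitions, inferred from the trajectory diagram.
# ASSUMPTION: this table is a reading of the README, not the author's code.
ALLOWED = {
    "system": {"user"},
    "user": {"reasoning"},
    "reasoning": {"tool_call", "answer"},
    "tool_call": {"tool_output"},
    "tool_output": {"reasoning"},
    "answer": {"user"},
}


def find_violations(roles):
    """Return a list of problems found in a role sequence (empty if it conforms)."""
    problems = []
    if not roles or roles[0] != "system":
        problems.append("does not start with system")
    # Check every adjacent pair of roles against the transition table.
    for prev, nxt in zip(roles, roles[1:]):
        if nxt not in ALLOWED.get(prev, set()):
            problems.append(f"{prev}->{nxt}")
    if roles and roles[-1] != "answer":
        problems.append("does not end with answer")
    return problems


def check_row(messages_json):
    """Validate one value of the `messages` column, which is a JSON string."""
    roles = [m["role"] for m in json.loads(messages_json)]
    return find_violations(roles)


if __name__ == "__main__":
    good = ["system", "user", "reasoning", "tool_call",
            "tool_output", "reasoning", "answer"]
    bad = ["system", "user", "reasoning", "tool_call",
           "tool_call", "tool_output", "reasoning", "answer"]
    print(find_violations(good))
    print(find_violations(bad))
```

Applied to the `messages` column loaded in the Usage snippet, `check_row` would be expected to return an empty list for every row if the table above matches the conversion rules; any non-empty result would point to either a gap in this sketch or a row worth inspecting.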

Provided by:

AmanPriyanshu
