AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1
收藏Hugging Face2026-03-14 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/AmanPriyanshu/tool-reasoning-sft-CODING-nvidia-Nemotron-Agentic-v1
下载链接
链接失效反馈官方服务:
资源简介:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- reasoning
- tool-calling
- agentic
- multi-turn
- interactive
- sft
size_categories:
- 100K<n<1M
---
# Nemotron-Agentic-v1 — Cleaned & Rectified
335k multi-turn agentic tool-use trajectories from NVIDIA's Nemotron-Agentic-v1, converted into a strict reasoning + tool-call format with validated FSM transitions.
## Origin
Derived from [nvidia/Nemotron-Agentic-v1](https://huggingface.co/datasets/nvidia/Nemotron-Agentic-v1).
Nemotron-Agentic-v1 is a synthetic dataset of multi-turn conversations where language models decompose user goals, decide when to call tools, and reason over tool outputs. Trajectories are generated by simulating user, agent, and tool-execution roles using Qwen3-235B-A22B-Thinking, Qwen3-32B, GPT-OSS-120B, and Qwen3-235B-A22B-Instruct, with turn-level quality judgments to filter inconsistent or incorrect tool use.
## Format
Each row contains a structured multi-turn conversation with explicit reasoning traces and validated tool calls.
### Message Roles
| Role | Content |
|---|---|
| `system` | Tool-use protocol + cleaned JSON tool schemas + domain instructions |
| `user` | User request or follow-up in multi-turn dialogue |
| `reasoning` | `<think>…</think>` — model's step-by-step reasoning |
| `tool_call` | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>` — function invocation |
| `tool_output` | `<tool_response>…</tool_response>` — tool execution result |
| `answer` | `<answer>…</answer>` — agent's response to the user |
### Trajectory Structure
```
system → user → reasoning → [tool_call → tool_output → reasoning →]* answer → [user → reasoning → ...]
```
Conversations range from 4 to 166 turns (avg 12.6), with 0–54 tool calls per row (avg 2.1).
## Schema
Single Parquet file with zstd compression.
| Column | Type | Description |
|---|---|---|
| `messages` | string | Converted conversation (JSON list of `{role, content}`) |
| `uuid` | string | Original row UUID from Nemotron-Agentic-v1 |
| `split` | string | Source subset: `interactive_agent` or `tool_calling` |
## Data Distribution
| Split | Source Rows | Converted Rows | Pass Rate |
|---|---|---|---|
| `interactive_agent` | 19,028 | 19,028 | 100.00% |
| `tool_calling` | 316,094 | 312,166 | 98.76% |
| **Total** | **335,122** | **331,194** | **98.83%** |
### Failure Reasons (tool_calling split only)
| Reason | Count | % of Failures |
|---|---|---|
| `tool_call→tool_call` transition | 3,229 | 82.2% |
| `tool_call→reasoning` transition | 698 | 17.8% |
| `reasoning→user` transition | 1 | 0.0% |
## Usage
```py
import json, random
from huggingface_hub import hf_hub_download
import pyarrow.parquet as pq
REPO = "AmanPriyanshu/tool-reasoning-sft-nvidia-Nemotron-Agentic-v1"
print("Downloading data.parquet...")
local = hf_hub_download(REPO, "data.parquet", repo_type="dataset")
t = pq.read_table(local)
print(f"Rows: {t.num_rows:,} | Cols: {t.column_names}\n")
from collections import Counter
splits = [s.as_py() for s in t.column("split")]
print(f"Splits: {dict(Counter(splits).most_common())}\n")
idx = random.randint(0, t.num_rows - 1)
row = {col: t.column(col)[idx].as_py() for col in t.column_names}
msgs = json.loads(row["messages"])
roles = [m["role"] for m in msgs]
print(f"Row {idx} | uuid={row['uuid']} | split={row['split']} | {len(msgs)} turns")
print(f"Roles: {' -> '.join(roles)}\n")
for m in msgs:
content = m["content"]
if m["role"] == "system":
content = content[:200] + "..."
elif len(content) > 400:
content = content[:400] + "..."
print(f"[{m['role']}]")
print(content)
print()
```
## Subset Characteristics
### interactive_agent (19,028 rows)
- **Domain:** Customer service (food delivery, event ticketing)
- **Reasoning:** All synthesized from template pools (source `reasoning_content` is always empty)
- **Tool args:** JSON strings parsed to dicts
- **Multi-turn:** 2–5 user turns per conversation, model learns when NOT to call tools
- **Tools:** 14 fixed tools per row (domain-specific: `authenticate_user`, `get_order_status`, etc.)
### tool_calling (312,166 rows)
- **Domain:** General-purpose API usage (search, finance, utilities, social media, etc.)
- **Reasoning:** Rich native chain-of-thought (avg 2,562 chars, no `<think>` tags in source)
- **Tool args:** Already dicts (passed through directly)
- **Tools:** 1–60 per row (avg 5.6), diverse tool schemas
- **Pre-filtered:** `reasoning="on"` rows only (100% of interactive_agent, 100% of tool_calling)
## Conversion Details
- OpenAI-style `tool_calls` (with `function.name` + `function.arguments`) parsed into canonical `{"name", "arguments": dict}` format
- `reasoning_content` field on assistant messages → `<think>...</think>` reasoning turns
- When `reasoning_content` is empty (interactive_agent), bridge reasoning synthesized from 12-variation template pools
- Tool `content` handled as string, dict, or list (all serialized to JSON)
- Tool output matching via positional look-ahead from assistant message
- `strict` field stripped from tool schemas
- Bridge reasoning inserted for all forbidden transitions: `tool_output→tool_call`, `tool_output→answer`, `user→tool_call`, `user→answer`
- Consecutive reasoning turns merged
- Ending fixes: conversations always terminate with `answer` role
提供机构:
AmanPriyanshu



