AmanPriyanshu/tool-reasoning-sft-RESEARCH-REDSearcher_SFT_10K
Hugging Face · Updated 2026-03-24 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/AmanPriyanshu/tool-reasoning-sft-RESEARCH-REDSearcher_SFT_10K
---
task_categories:
- text-generation
language:
- en
- zh
tags:
- reasoning
- tool-calling
- agentic
- multi-turn
- deep-search
- multi-step-reasoning
size_categories:
- 1K<n<10K
---
# REDSearcher SFT 10K — Cleaned & Rectified
~8,850 multi-turn deep-search agent trajectories converted into a strict reasoning + tool-call format with validated FSM transitions.
## Origin
Derived from [Zchu/REDSearcher_SFT_10K](https://huggingface.co/datasets/Zchu/REDSearcher_SFT_10K).
REDSearcher is a deep search assistant dataset featuring rigorous, multi-step, multi-source investigations. Each trajectory contains a complex research question answered through iterative search → visit → reason → answer cycles, with extensive chain-of-thought reasoning already present in the source data.
📄 **Paper:** [REDSearcher: A Novel Framework for Real-time Exploration and Discovery](https://huggingface.co/datasets/Zchu/REDSearcher_SFT_10K) *(see dataset card for citation)*
## Format
Each row contains a structured multi-turn conversation with explicit reasoning traces and validated tool calls.
### Message Roles
| Role | Content |
|---|---|
| `system` | Tool-use protocol + JSON tool schemas + deep-search instructions |
| `user` | Research question or follow-up |
| `reasoning` | `<think>…</think>` — model's step-by-step reasoning |
| `tool_call` | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>` — function invocation |
| `tool_output` | `<tool_response>…</tool_response>` — tool execution result |
| `answer` | `<answer>…</answer>` — final synthesized response |
### Trajectory Structure
```
system → user → reasoning → [tool_call → tool_output → reasoning →]* answer
```
Trajectories range from ~30 to ~360 turns, with 14–152 tool calls per row (avg ~64).
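As a rough illustration of the pattern above, a role sequence can be encoded as one letter per turn and matched with a regular expression. The letter codes and the function name `matches_trajectory_shape` are hypothetical, introduced only for this sketch. It checks the single-question shape shown above; the transition table in the Usage section is broader, since it also permits `answer → user` follow-ups.

```python
import re

# Hypothetical one-letter shorthand for each role (not part of the dataset).
ROLE_CODES = {
    "system": "S", "user": "U", "reasoning": "R",
    "tool_call": "C", "tool_output": "O", "answer": "A",
}

# system, user, reasoning, then zero or more (tool_call, tool_output, reasoning), then answer
SHAPE = re.compile(r"SUR(?:COR)*A")


def matches_trajectory_shape(roles):
    """Return True if the role list follows the single-question pattern."""
    try:
        encoded = "".join(ROLE_CODES[r] for r in roles)
    except KeyError:
        return False  # unknown role name
    return SHAPE.fullmatch(encoded) is not None


if __name__ == "__main__":
    example = ["system", "user", "reasoning",
               "tool_call", "tool_output", "reasoning", "answer"]
    print(matches_trajectory_shape(example))
```

A regex over letter codes is compact, but unlike the per-transition check it does not report where a sequence goes wrong.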
## Schema
Single Parquet file with zstd compression.
| Column | Type | Description |
|---|---|---|
| `messages` | string | Converted conversation (JSON list of `{role, content}`) |
| `language` | string | Language of the query (`en` or `zh`) |
## Tools
Five tools are available in every trajectory:
| Tool | Description |
|---|---|
| `search` | Google web search with multiple queries |
| `visit` | Visit webpage(s) and extract content |
| `google_scholar` | Academic publication search |
| `google_maps` | Google Maps place search |
| `PythonInterpreter` | Sandboxed Python code execution |
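To see which of these tools a given trajectory actually uses, the `tool_call` turns can be tallied by tool name. The following is a minimal sketch, assuming `messages` is the already-decoded list of `{role, content}` dicts; the helper name `count_tools` is hypothetical, and the empty `arguments` objects in the example are placeholders, not the real argument schemas.

```python
import json
import re
from collections import Counter

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def count_tools(messages):
    """Count tool invocations per tool name in one decoded trajectory."""
    counts = Counter()
    for m in messages:
        if m["role"] != "tool_call":
            continue
        match = TOOL_CALL_RE.search(m["content"])
        if match is None:
            continue  # tag missing; skipped rather than guessed
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed payload; skipped
        if isinstance(call, dict) and "name" in call:
            counts[call["name"]] += 1
    return counts


if __name__ == "__main__":
    # Placeholder turns for illustration only.
    demo = [
        {"role": "tool_call", "content": '<tool_call>{"name": "search", "arguments": {}}</tool_call>'},
        {"role": "tool_call", "content": '<tool_call>{"name": "visit", "arguments": {}}</tool_call>'},
    ]
    print(count_tools(demo).most_common())
```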
## Conversion Details
- Source data already uses `<think>`, `<tool_call>`, `<tool_response>`, and `<answer>` XML tags — conversion is **decomposition** of compound assistant messages into separate FSM turns, not synthesis
- Assistant messages with `<think>` + `<tool_call>` split into separate `reasoning` + `tool_call` turns
- Assistant messages with `<think>` + `<answer>` split into separate `reasoning` + `answer` turns
- User messages containing `<tool_response>` mapped to `tool_output` turns
- Tool schemas extracted from `<tools>` XML block in system prompt, converted to clean JSON
- Bridge reasoning synthesized only when FSM requires it (rare — source already has `<think>` blocks)
- ~88.5% conversion rate; failures are malformed source rows with `reasoning→tool_output` transition violations (assistant has `<think>` but no `<tool_call>`/`<answer>`, followed immediately by `<tool_response>`)
- Two validation layers: FSM transition check + content-tag non-empty check
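The decomposition described in the list above can be sketched roughly as follows. This is not the author's conversion script: the function names `split_assistant` and `map_user_role` are hypothetical, whitespace handling is an assumption, and bridge-reasoning synthesis is omitted.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Matches either a <tool_call> or an <answer> block; \1 requires the matching close tag.
ACTION_RE = re.compile(r"<(tool_call|answer)>(.*?)</\1>", re.DOTALL)


def split_assistant(content):
    """Split one compound assistant message into separate FSM turns.

    Returns None when the message has no <tool_call> or <answer>,
    which corresponds to the failure case described above.
    """
    action = ACTION_RE.search(content)
    if action is None:
        return None
    turns = []
    think = THINK_RE.search(content)
    if think and think.group(1).strip():
        body = think.group(1).strip()
        turns.append({"role": "reasoning", "content": f"<think>{body}</think>"})
    tag, body = action.group(1), action.group(2).strip()
    turns.append({"role": tag, "content": f"<{tag}>{body}</{tag}>"})
    return turns


def map_user_role(content):
    """Source 'user' messages that carry a tool response become tool_output turns."""
    return "tool_output" if "<tool_response>" in content else "user"


if __name__ == "__main__":
    msg = '<think>Need sources.</think>\n<tool_call>{"name": "search", "arguments": {}}</tool_call>'
    for turn in split_assistant(msg):
        print(turn["role"], "->", turn["content"])
```

When a message has an action but no `<think>` block, this sketch returns the action turn alone; a fuller converter would insert bridge reasoning at that point so the FSM stays valid.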
## Usage
```py
import json, random, re
from datasets import load_dataset

# Allowed role transitions (the FSM used for validation)
VALID_NEXT = {
    "system": {"user"}, "user": {"reasoning"},
    "reasoning": {"tool_call", "answer"}, "tool_call": {"tool_output"},
    "tool_output": {"reasoning"}, "answer": {"user"},
}

ds = load_dataset("AmanPriyanshu/tool-reasoning-sft-RESEARCH-REDSearcher_SFT_10K", split="train")
print(f"Loaded: {len(ds):,} rows\n")

# Pick one random row and decode its JSON-encoded conversation
idx = random.randint(0, len(ds) - 1)
row = ds[idx]
msgs = json.loads(row["messages"])
lang = row["language"]
roles = [m["role"] for m in msgs]
tc = sum(1 for r in roles if r == "tool_call")
print(f"Row {idx} | language={lang} | {len(msgs)} turns | {tc} tool_calls")
print(f"Roles: {' -> '.join(roles[:20])}{'...' if len(roles)>20 else ''}\n")

# ── Validation 1: FSM transitions ────────────────────────────────────────────
bad = [(j, roles[j], roles[j+1]) for j in range(len(roles)-1)
       if roles[j+1] not in VALID_NEXT.get(roles[j], set())]
if bad:
    print(f"!! FSM VIOLATIONS: {len(bad)}")
    for pos, a, b in bad[:5]:
        print(f"  [{pos}] {a} -> {b}")
else:
    print("✓ FSM transitions: all valid")

# ── Validation 2: content tags ───────────────────────────────────────────────
tag_errors = []
for i, t in enumerate(msgs):
    r, c = t["role"], t["content"]
    if r == "reasoning":
        if not re.search(r'<think>.+</think>', c, re.DOTALL):
            tag_errors.append((i, r, "empty <think>"))
    elif r == "tool_call":
        if not re.search(r'<tool_call>.+</tool_call>', c, re.DOTALL):
            tag_errors.append((i, r, "empty <tool_call>"))
        else:
            # Parse the JSON object between the outermost braces
            blob = c[c.find("{"):c.rfind("}") + 1]
            try:
                obj = json.loads(blob)
                if "name" not in obj or "arguments" not in obj:
                    tag_errors.append((i, r, "missing name/arguments"))
            except json.JSONDecodeError as e:
                tag_errors.append((i, r, f"invalid JSON: {e}"))
    elif r == "answer":
        if not re.search(r'<answer>.+</answer>', c, re.DOTALL):
            tag_errors.append((i, r, "empty <answer>"))
    elif r == "tool_output":
        if not re.search(r'<tool_response>.+</tool_response>', c, re.DOTALL):
            tag_errors.append((i, r, "empty <tool_response>"))
if tag_errors:
    print(f"!! TAG ERRORS: {len(tag_errors)}")
    for pos, role, err in tag_errors[:5]:
        print(f"  [{pos}] {role}: {err}")
else:
    print("✓ Content tags: all valid")

# ── Validation 3: structure checks ───────────────────────────────────────────
checks = []
if roles[0] != "system":
    checks.append("first role is not system")
if len(roles) < 2 or roles[1] != "user":
    checks.append("second role is not user")
if roles[-1] != "answer":
    checks.append(f"last role is {roles[-1]}, expected answer")
if any(roles[i] == roles[i+1] for i in range(len(roles)-1)):
    dupes = [(i, roles[i]) for i in range(len(roles)-1) if roles[i] == roles[i+1]]
    checks.append(f"consecutive same-role at {dupes[0]}")
if checks:
    print(f"!! STRUCTURE ISSUES: {len(checks)}")
    for c in checks:
        print(f"  {c}")
else:
    print("✓ Structure: system→user→...→answer, no consecutive duplicates")

# ── Print turns (long content truncated) ─────────────────────────────────────
print(f"\n{'='*70}")
print(f"FULL CONVERSATION ({len(msgs)} turns)")
print(f"{'='*70}\n")
for i, m in enumerate(msgs):
    content = m["content"]
    if m["role"] == "system":
        content = content[:200] + "..."
    elif len(content) > 300:
        content = content[:300] + "..."
    print(f"[{i}] {m['role']}:\n{content}\n")
```
## Language Distribution
| Language | Rows |
|---|---|
| `en` | ~95% |
| `zh` | ~5% |
Provided by:
AmanPriyanshu



