AmanPriyanshu/tool-reasoning-sft-RESEARCH-explorations

Name: AmanPriyanshu/tool-reasoning-sft-RESEARCH-explorations
Creator: AmanPriyanshu
Published: 2026-04-02 20:08:26
License: apache-2.0

Source: Hugging Face. Updated: 2026-04-02. Indexed: 2026-04-12.

Download link:

https://hf-mirror.com/datasets/AmanPriyanshu/tool-reasoning-sft-RESEARCH-explorations


Description:

---
task_categories:
- text-generation
language:
- en
tags:
- reasoning
- tool-calling
- agentic
- multi-turn
- code-exploration
- multi-step-reasoning
license: apache-2.0
size_categories:
- 100K<n<1M
---

# Explorations Trajectories: Cleaned & Stripped

149,025 multi-turn code exploration agent trajectories converted into a strict reasoning + tool-call format with validated FSM transitions.

## Origin

Derived from [AmanPriyanshu/random-small-github-repositories](https://huggingface.co/datasets/AmanPriyanshu/random-small-github-repositories) and [AmanPriyanshu/random-python-github-repositories](https://huggingface.co/datasets/AmanPriyanshu/random-python-github-repositories).

Each trajectory is a search session in which an agent navigates a GitHub repository using terminal commands to locate a target file. The agent reasons about project structure, runs `grep`/`ls`/`find`/`cat` commands, and submits ranked file recommendations. Trajectories were filtered to the goldilocks zone (7–15 turns with successful file discovery) and expanded across 3 rounds per seed.

## Format

Each row contains a structured multi-turn conversation with explicit reasoning traces and validated tool calls.

### Message Roles

| Role | Content |
|---|---|
| `system` | Tool-use protocol + JSON tool schemas + code exploration agent instructions |
| `user` | Target file request, or synthetic rejection asking the agent to keep searching |
| `reasoning` | `<think>…</think>`: model's step-by-step reasoning |
| `tool_call` | `<tool_call>{"name": "...", "arguments": {...}}</tool_call>`: function invocation |
| `tool_output` | `<tool_response>…</tool_response>`: tool execution result |
| `answer` | `<answer>…</answer>`: ranked file recommendation |

### Trajectory Structure

```
system → user → reasoning → [tool_call → tool_output → reasoning →]* answer
```

For trajectories that needed bridging (14.2%), mid-trajectory file submissions are converted into multi-turn rejection loops:

```
... → reasoning → answer → user(rejection) → reasoning(retry) → tool_call → ... → answer
```

Trajectories range from 4 to 58 turns, with an average of ~32 messages per row.

## Schema

Single Parquet file with zstd compression.

| Column | Type | Description |
|---|---|---|
| `messages` | string | Converted conversation (JSON list of `{role, content}`) |
| `repo_id` | string | Anonymized repository identifier |
| `dataset` | string | Source split (`small_repos` or `py_repos`) |
| `alpha_hash` | string | Hash identifier for the repository |
| `seed_group_idx` | int64 | Seed group index for trajectory generation |
| `seed_file_options` | string | JSON list of candidate target files |
| `seed_file_selected` | string | The actual target file the agent must find |
| `naming_style` | string | How the target was described (`semantic` or `direct`) |
| `found_in_turns` | int64 | Number of turns the agent took to find the file |
| `trajectory_round` | int64 | Which of the 3 generation rounds (1, 2, or 3) |
| `needed_to_bridge` | bool | Whether synthetic rejection messages were inserted |

## Tools

2 tools available per trajectory:

| Tool | Description |
|---|---|
| `terminal` | Execute a terminal command on a Linux machine (with optional `max_chars` truncation) |
| `submit_recommended_files` | Submit a ranked list of file paths as the agent's recommendation |

## Conversion Details

- Source trajectories use `type`/`text`/`name`/`arguments`/`output` fields; conversion maps these to the canonical 6-role FSM format
- `reasoning` outputs → `<think>…</think>` messages; consecutive reasoning blocks are merged
- `function_call` outputs (terminal) → `<tool_call>` + `<tool_response>` pairs
- `function_call` outputs (submit_recommended_files) → `<answer>` with ranked file list
- `message` outputs (text-based submissions) → `<answer>` with raw content
- Mid-trajectory submissions (agent submits files, then keeps exploring) → `answer → user(synthetic rejection) → reasoning(retry)` bridging with 12-variation template pools
- Bridge reasoning and tail reasoning are drawn from 12 domain-appropriate templates each
- 100% conversion rate on all 149,025 rows; zero FSM violations, zero content-tag failures
- Three validation layers: FSM transition check + content-tag non-empty regex check + tool_call JSON schema check

## Distribution

| Split | Rows | % |
|---|---|---|
| `small_repos` | 87,591 | 58.8% |
| `py_repos` | 61,434 | 41.2% |

| Naming Style | Rows | % |
|---|---|---|
| `semantic` | 95,652 | 64.2% |
| `direct` | 53,373 | 35.8% |

| Bridged | Rows | % |
|---|---|---|
| `needed_to_bridge=False` | 127,790 | 85.8% |
| `needed_to_bridge=True` | 21,235 | 14.2% |

## Usage

```py
import json, random, re
from datasets import load_dataset

# Allowed role transitions (the FSM described under "Trajectory Structure")
VALID_NEXT = {
    "system": {"user"},
    "user": {"reasoning"},
    "reasoning": {"tool_call", "answer"},
    "tool_call": {"tool_output"},
    "tool_output": {"reasoning"},
    "answer": {"user"},
}

ds = load_dataset("AmanPriyanshu/tool-reasoning-sft-RESEARCH-explorations", split="train")
print(f"Loaded: {len(ds):,} rows\n")

# Pick a random row and decode its JSON-encoded conversation
idx = random.randint(0, len(ds) - 1)
row = ds[idx]
msgs = json.loads(row["messages"])
roles = [m["role"] for m in msgs]
tc = sum(1 for r in roles if r == "tool_call")

print(f"Row {idx} | repo={row['repo_id']} | target={row['seed_file_selected']}")
print(f"  {len(msgs)} turns | {tc} tool_calls | bridged={row['needed_to_bridge']}")
print(f"  Roles: {' -> '.join(roles[:15])}{'...' if len(roles)>15 else ''}\n")

# Validation 1: FSM transitions
bad = [(j, roles[j], roles[j+1]) for j in range(len(roles)-1)
       if roles[j+1] not in VALID_NEXT.get(roles[j], set())]
if bad:
    print(f"!! FSM VIOLATIONS: {len(bad)}")
    for pos, a, b in bad[:5]:
        print(f"  [{pos}] {a} -> {b}")
else:
    print("✓ FSM transitions: all valid")

# Validation 2: content tags (each tagged role must wrap non-empty content)
tag_ok = True
for t in msgs:
    r, c = t["role"], t["content"]
    if r == "reasoning" and not re.search(r'<think>.+</think>', c, re.DOTALL):
        tag_ok = False
    elif r == "tool_call" and not re.search(r'<tool_call>.+</tool_call>', c, re.DOTALL):
        tag_ok = False
    elif r == "answer" and not re.search(r'<answer>.+</answer>', c, re.DOTALL):
        tag_ok = False
    elif r == "tool_output" and not re.search(r'<tool_response>.+</tool_response>', c, re.DOTALL):
        tag_ok = False
print(f"{'✓' if tag_ok else '!!'} Content tags: {'all valid' if tag_ok else 'errors found'}")

# Print sample turns (system prompt and long messages are truncated)
print(f"\n{'='*70}")
for i, m in enumerate(msgs[:10]):
    content = m["content"]
    if m["role"] == "system":
        content = content[:150] + "..."
    elif len(content) > 200:
        content = content[:200] + "..."
    print(f"[{i}] {m['role']}:\n{content}\n")
if len(msgs) > 10:
    print(f"... ({len(msgs) - 10} more turns)")
```
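The usage snippet checks FSM transitions and content tags, but not the `tool_call` JSON check that the conversion details also mention. The following is a minimal sketch of what such a check might look like, not the author's actual validator. The helper names `check_tool_call` and `check_row` are hypothetical, the set of known tools is taken from the Tools table, and the only structural assumption is that `name` is a string and `arguments` is a JSON object, as the Message Roles table suggests. The `command` key in the example call is likewise an assumption about the `terminal` tool's arguments.

```python
import json
import re

# Assumption: tool names come from the Tools table; the real schema may be stricter.
KNOWN_TOOLS = {"terminal", "submit_recommended_files"}
TOOL_CALL_RE = re.compile(r"<tool_call>(.+)</tool_call>", re.DOTALL)


def check_tool_call(content: str) -> list[str]:
    """Return the problems found in one tool_call message (empty list if none)."""
    match = TOOL_CALL_RE.search(content)
    if match is None:
        return ["missing <tool_call> tags"]
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]

    problems = []
    name = payload.get("name")
    if not isinstance(name, str):
        problems.append("'name' is missing or not a string")
    elif name not in KNOWN_TOOLS:
        problems.append(f"unknown tool: {name}")
    if not isinstance(payload.get("arguments"), dict):
        problems.append("'arguments' is missing or not an object")
    return problems


def check_row(msgs: list[dict]) -> dict[int, list[str]]:
    """Map message index -> problems, for tool_call messages that fail the check."""
    failures = {}
    for i, m in enumerate(msgs):
        if m["role"] == "tool_call":
            problems = check_tool_call(m["content"])
            if problems:
                failures[i] = problems
    return failures


# Hypothetical example call; "command" is an assumed argument name.
example = '<tool_call>{"name": "terminal", "arguments": {"command": "ls"}}</tool_call>'
print(check_tool_call(example))
```

With a decoded row, `check_row(json.loads(row["messages"]))` would return an empty dict when every `tool_call` message parses and names a known tool.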

Provided by:

AmanPriyanshu
