qwen3.5-27b-cli-reasoning-3632x
Source: ModelScope community · Updated 2026-05-16 · Indexed 2026-03-07
Download link:
https://modelscope.cn/datasets/LocoreMind/qwen3.5-27b-cli-reasoning-3632x
Description:
# Qwen3.5-27B CLI Reasoning 3632x
A synthetic reasoning dataset for CLI/terminal command assistance, distilled from **Qwen3.5-27B** with thinking mode enabled.
Each sample contains a realistic user scenario describing a terminal task, paired with the model's reasoning chain (`<think>`) and a structured JSON answer (`command` + `description`).
## Dataset Summary
| Field | Value |
|---|---|
| **Source model** | Qwen3.5-27B (DashScope API) |
| **Samples** | 3,632 |
| **Thinking mode** | Enabled (budget: 4096 tokens) |
| **Source data** | [b-mc2/cli-commands-explained](https://huggingface.co/datasets/b-mc2/cli-commands-explained) (16K CLI commands) |
| **Command coverage** | 3,034 unique command prefixes (100% of source) |
| **Quality filter** | JSON parseable + `bash -n` syntax check passed |
| **License** | CC-BY-4.0 |
## Creation Pipeline
This dataset was built using a three-phase distillation pipeline:
1. **Phase 1 — Prompt Generation** (non-thinking mode): Each raw CLI command entry (`code`, `title`, `description`, `date`) was fed to Qwen3.5-27B to generate a realistic, scenario-based user prompt with environment context (`Current time`, `Current OS`, `Shell`). The model infers OS/Shell from the command itself.
2. **Phase 2 — Thinking Distillation**: The generated prompts were sent back to Qwen3.5-27B with thinking mode enabled. The model reasons about the user's problem and outputs a structured JSON answer.
3. **Phase 3 — Quality Filtering**: Responses were validated for JSON parseability and command syntax (`bash -n`). 89 samples with broken JSON escaping or truncated multi-line commands were filtered out (2.4% rejection rate).
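The Phase 3 checks can be sketched roughly as follows. This is a minimal illustration, not the authors' actual filtering code: the helper names `bash_syntax_ok` and `passes_quality_filter` are hypothetical, and it assumes the JSON answer has already been extracted from the response and that `bash` is available on the `PATH`.

```python
import json
import subprocess


def bash_syntax_ok(command: str) -> bool:
    """Return True if `bash -n` parses the command without a syntax error."""
    # -n makes bash read and parse its input without executing anything.
    result = subprocess.run(
        ["bash", "-n"],
        input=command,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def passes_quality_filter(answer_json: str) -> bool:
    """Hypothetical filter: the answer must parse as JSON and pass `bash -n`."""
    try:
        answer = json.loads(answer_json)
    except json.JSONDecodeError:
        return False  # broken escaping or truncated output
    if not isinstance(answer, dict):
        return False
    command = answer.get("command")
    if not isinstance(command, str):
        return False
    return bash_syntax_ok(command)
```

Note that `bash -n` only checks syntax; it does not confirm that the command exists or does what the description says.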
### Stratified Sampling
3,721 samples were selected from the 16K source dataset using stratified sampling:
- All high-vote commands (votes ≥ 11): 797 classic commands
- One representative per unique command prefix: 2,924 additional commands
- 100% coverage of all 3,034 command prefixes in the source
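
The selection rule above might look like the sketch below. Several details are assumptions made for illustration: the record fields `code` and `votes`, the definition of a command prefix as the first whitespace-separated token, and the choice of the first record seen as each prefix's representative.

```python
def command_prefix(code: str) -> str:
    """Assumed definition: the first whitespace-separated token of the command."""
    tokens = code.split()
    return tokens[0] if tokens else ""


def stratified_sample(records, vote_threshold=11):
    """Keep every high-vote record, then one record per still-uncovered prefix."""
    selected = [r for r in records if r["votes"] >= vote_threshold]
    covered = {command_prefix(r["code"]) for r in selected}
    for record in records:
        prefix = command_prefix(record["code"])
        if prefix not in covered:
            selected.append(record)  # first record seen represents this prefix
            covered.add(prefix)
    return selected


# Toy records, invented for this example.
records = [
    {"code": "sudo !!", "votes": 50},
    {"code": "ls -la", "votes": 2},
    {"code": "ls -lh", "votes": 1},
    {"code": "find . -name x", "votes": 3},
    {"code": "sudo reboot", "votes": 1},
]
sample = stratified_sample(records)
print([r["code"] for r in sample])  # ['sudo !!', 'ls -la', 'find . -name x']
```

Because high-vote records already cover some prefixes, the second pass adds fewer records than there are prefixes, which is consistent with 2,924 additional commands covering 3,034 prefixes.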
## Data Format
```json
{
  "messages": [
    {
      "role": "system",
      "content": ""
    },
    {
      "role": "user",
      "content": "Current time: 2009-01-26 10:26:48\nCurrent OS: Linux\nShell: Bash\n\nI just tried to install a package but got a \"permission denied\" error because I forgot to run it with sudo. Is there a quick way to re-run that exact same command as root without having to type the whole thing out again?"
    },
    {
      "role": "assistant",
      "content": "<think>The user ran a command without sudo and got a permission denied error. They want to quickly re-run the same command but with sudo. In bash, there's a quick way to do this using `sudo !!`...</think>\n```json\n{\n \"command\": \"sudo !!\",\n \"description\": \"This uses bash's history expansion feature where '!!' represents the last command you executed. Prepending 'sudo' allows you to immediately re-run that exact command with elevated privileges without retyping it.\"\n}\n```"
    }
  ]
}
```
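
To work with the two parts of an assistant message separately, the reasoning and the JSON answer can be split apart. The sketch below assumes every assistant message follows the layout shown above (one `<think>` block followed by a fenced JSON object); `split_assistant_content` is a hypothetical helper, and the sample string is shortened for illustration.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
ANSWER_RE = re.compile(FENCE + r"json\s*(.*?)\s*" + FENCE, re.DOTALL)


def split_assistant_content(content: str):
    """Return (reasoning, answer_dict) from one assistant message."""
    think = THINK_RE.search(content)
    reasoning = think.group(1).strip() if think else ""
    # Only look for the fenced answer after the reasoning block.
    rest = content[think.end():] if think else content
    fence = ANSWER_RE.search(rest)
    answer_text = fence.group(1) if fence else rest.strip()
    return reasoning, json.loads(answer_text)


# Shortened sample in the same layout as the record above.
content = (
    "<think>Use history expansion.</think>\n"
    + FENCE
    + 'json\n{\n "command": "sudo !!",\n "description": "Re-run the last command as root."\n}\n'
    + FENCE
)
reasoning, answer = split_assistant_content(content)
print(answer["command"])  # sudo !!
```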
## Statistics
| Metric (characters) | Min | Avg | Max |
|--------|-----|-----|-----|
| User prompt length | 243 | 411 | 622 |
| Reasoning (`<think>`) length | 272 | 1,920 | 16,135 |
| Answer length | 100 | 411 | 1,958 |
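
Figures of this kind can be recomputed with a few lines of Python. The sketch below assumes the lengths are character counts (Python's `len` counts Unicode code points) and that the average is rounded to the nearest integer; `length_stats` is a hypothetical helper and the input strings are toy values.

```python
from statistics import mean


def length_stats(texts):
    """Return (min, rounded average, max) of the character lengths of `texts`."""
    lengths = [len(text) for text in texts]
    return min(lengths), round(mean(lengths)), max(lengths)


# Toy strings with lengths 2, 3, and 6.
print(length_stats(["ls", "pwd", "whoami"]))  # (2, 4, 6)
```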
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("LocoreMind/qwen3.5-27b-cli-reasoning-3632x")
```
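
Once a record is loaded, its user message can be split into the environment header and the scenario text. This is a minimal sketch, assuming every user prompt follows the layout in the Data Format example (`Key: value` lines, a blank line, then the scenario); `parse_user_prompt` is a hypothetical helper and the sample string is shortened.

```python
def parse_user_prompt(content: str):
    """Split a user message into (environment dict, scenario text)."""
    header, _, scenario = content.partition("\n\n")
    env = {}
    for line in header.splitlines():
        # Split on the first ": " only, so timestamps keep their colons.
        key, sep, value = line.partition(": ")
        if sep:
            env[key] = value
    return env, scenario.strip()


user_content = (
    "Current time: 2009-01-26 10:26:48\n"
    "Current OS: Linux\n"
    "Shell: Bash\n\n"
    "I just tried to install a package but got a permission error."
)
env, scenario = parse_user_prompt(user_content)
print(env["Shell"])  # Bash
```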
## Command Prefix Coverage (Top 10)
`find`(1079) · `for`(600) · `echo`(579) · `sudo`(451) · `curl`(364) · `ls`(325) · `cat`(304) · `grep`(254) · `git`(251) · `sed`(241)
*Counts refer to the full source dataset; the sampled subset covers all prefixes with at least one representative.*
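
Prefix counts like these can be tallied with `collections.Counter`. The sketch assumes a prefix is the first whitespace-separated token of a command, which this card does not state explicitly; the commands below are toy values.

```python
from collections import Counter


def prefix_counts(commands):
    """Count commands by their first whitespace-separated token (assumed prefix)."""
    return Counter(cmd.split()[0] for cmd in commands if cmd.strip())


commands = [
    "find . -name '*.py'",
    "find / -type f",
    "for i in 1 2 3; do echo $i; done",
    "echo hi",
]
counts = prefix_counts(commands)
print(counts.most_common(1))  # [('find', 2)]
```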
## Intended Use
- Fine-tuning smaller models for CLI assistance and terminal command generation
- Training reasoning capabilities for shell/DevOps tasks
- Building agentic tool-calling systems that operate in terminal environments
## Limitations
- Commands are biased toward Linux/Bash; macOS and other shells are underrepresented
- The model occasionally suggests modernized alternatives instead of the exact original command (e.g., `python3 -m http.server` instead of `python -m SimpleHTTPServer`)
- Complex commands with heavy quoting/escaping were filtered out, slightly underrepresenting `awk`/`sed`/`perl` one-liners
Provider: maas
Created: 2026-02-27



