rl-rag/dr-tulu-sft-qwen35-tools
收藏Hugging Face2026-03-27 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/rl-rag/dr-tulu-sft-qwen35-tools
下载链接
链接失效反馈官方服务:
资源简介:
# DR Tulu SFT Data — Qwen3.5 Tool Format
Converted from [rl-rag/dr-tulu-sft-unified](https://huggingface.co/datasets/rl-rag/dr-tulu-sft-unified)
to Qwen3.5 multi-turn tool-calling format for LLaMA-Factory SFT training.
## Data
- **`exact_answer_multiturn.json`**: 2,868 exact-answer examples in LLaMA-Factory sharegpt format
- Filtered to `type=exact_answer` only
- Tool calls in JSON format inside `<tool_call>` tags (LLaMA-Factory converts to Qwen3.5 XML during tokenization)
- Tool outputs preserved from original DR Tulu data (Serper search results, page content, Semantic Scholar snippets)
- System prompt and tool definitions imported from [elastic-serving](https://github.com/RulinShao/elastic-inference) Qwen3Adapter
## Tool Name Mapping
| DR Tulu Original | Qwen3.5 / elastic-serving |
|---|---|
| `serper_google_webpage_search` | `web_search` |
| `serper_fetch_webpage_content` | `open_url` |
| `semantic_scholar_snippet_search` | `paper_search` |
## Roles and Masking
| Role | Trained on? |
|------|-------------|
| `system` | No (masked) |
| `human` | No (masked) |
| `observation` | No (masked) |
| `function_call` | **Yes** |
| `gpt` | **Yes** |
## Citation Format
Final answers use `<cite id="ID1,ID2">claim text</cite>` for citations
and `\boxed{answer}` for short factual answers.
## Usage with LLaMA-Factory
Register in `dataset_info.json`:
```json
{
"drtulu_qwen35": {
"file_name": "exact_answer_multiturn.json",
"formatting": "sharegpt",
"columns": {"messages": "conversations", "tools": "tools"},
"tags": {
"role_tag": "from", "content_tag": "value",
"user_tag": "human", "assistant_tag": "gpt",
"observation_tag": "observation", "function_tag": "function_call",
"system_tag": "system"
}
}
}
```
Then in your YAML config:
```yaml
dataset: drtulu_qwen35
template: qwen3 # or qwen3_5_text for Qwen3.5 models
```
提供机构:
rl-rag



