Agent Tool Use Multi-Turn Dialogue Dataset
ModelScope community · Updated 2026-05-21 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/rockingdingo/Agent-Tool-Use-Dialogue-Open-Dataset
Resource overview:
# Open Agent RL Dataset: High Quality AI Agent | Tool Use & Function Calls | Reinforcement Learning Datasets
[Github](https://github.com/AI-Agent-Hub/AI-Agent-Marketplace)|[Huggingface](https://huggingface.co/datasets/DeepNLP/AI-Agent-Marketplace-Index)|[Pypi](https://pypi.org/project/ai-agent-marketplace/) | [Open Source AI Agent Marketplace DeepNLP](https://www.deepnlp.org/store/ai-agent)|[Agent RL Dataset](https://www.deepnlp.org/store/dataset)
The DeepNLP website provides Agent and RL datasets built from **high-quality, genuine requests from online users**, to help LLM foundation training, SFT, and post-training produce models that are more capable at function calling, tool use, and planning. The datasets are collected and sampled from user requests on our various clients (Web/App/Mini App), the [Open OneKey Agent Router](https://www.deepnlp.org/agent/onekey-mcp-router), and the [Open OneKey MCP Router](https://www.deepnlp.org/agent/onekey-mcp-router). Some datasets cost [credit](https://www.deepnlp.org/workspace/billing) to access; you can easily earn more credit through activities such as commenting, joining discussions, and uploading your own datasets to the community.
We have released sampled examples on Hugging Face. If you find them useful, please visit the Dataset tab of our AI store and select [Agent RL Dataset](https://www.deepnlp.org/store/dataset).
| Dataset Name | Description | User Feedback | Example Dataset Download | Full DataSet Download |
|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---- |-------|-----------------------------------------------------------------------------------------------------------------|
| Tool Use Multi-Turn Dialogue | Multi-turn tool use dialogues in list-of-messages format. Useful for AI Search, Deep Research, Map, Financial Data, etc. | YES | 50 instances, [Download](https://www.deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-tool-use-dialogue-open-dataset-example) | 1k, [Download](https://www.deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-tool-use-dialogue-open-dataset) |
**Disclaimer**: Private and personal information has been marked and filtered out.
<img src="https://raw.githubusercontent.com/aiagenta2z/ai-agent-marketplace/refs/heads/main/docs/ai_agent_marketplace_distribution.png" style="height:600px;" alt="AI Agent Marketplace Category">
## 1. Dataset Features
**Genuine Users' Queries**: Most of the high quality datasets are collected from query logs of our live AI Agents, such as [MCP Tool Use Agent](https://agent.deepnlp.org/agent/mcp_tool_use), [Open OneKey Agent Router](https://www.deepnlp.org/agent/onekey-mcp-router) and [Open OneKey MCP Router](https://www.deepnlp.org/agent/onekey-mcp-router).
**Function Call and MCP Servers Support**: The datasets cover a wide range of MCP servers from the Open MCP Marketplace and Playgrounds.
**Users' Actions and Human Feedback**: Users' actual feedback is crucial for improving the training of AI agents. We collect users' genuine actions, such as **ACCEPT/REJECT** when confirming function call results, **Upvote/Downvote** on final responses, and other feedback on clickable elements.
**Various Domains and Tasks**: We cover 40+ categories of AI agent tool use scenarios, ranging from information seeking (AI search, map search, etc.) to autonomous agent tasks such as browser use, computer use, data analysis, and creating Excel spreadsheets and PowerPoint presentations.
**Example AI Agent Dataset Dialogues**
| Domain | Related MCP Server| Demo |
| ---- | ---- | ---- |
| Office File Agent | Excel Spreadsheet, Powerpoint, PDF, etc | [Example](https://agent.deepnlp.org/agent/mcp_tool_use/share/ee640008-6bc1-4c3a-832b-2557f985b540) [MCP]() |
| AI Search/Deep Research | Bing/Google Custom/Perplexity/Tavily/Firecrawl | [Demo](https://agent.deepnlp.org/agent/mcp_tool_use?server=tavily-ai/tavily-mcp) [MCP]() |
| Map Trip Planning | Amap(Gaode), BaiduMap, etc. | [Example](https://agent.deepnlp.org/agent/mcp_tool_use/share/8ab0b25c-b72d-4cae-9c86-a852df8c6541) [MCP](https://agent.deepnlp.org/agent/mcp_tool_use?server=amap-mcp/amap-mcp-%E9%AB%98%E5%BE%B7%E5%9C%B0%E5%9B%BE-mcp) [Use MCP]() |
| Browser Usage | Playwright, Puppeteer, etc. | [Demo](https://agent.deepnlp.org/agent/mcp_tool_use?server=puppeteer/puppeteer) [MCP]() |
| Chart,Graph,Image | everart,mcp-server-charts(AntV),canva-mcp | [Demo](https://agent.deepnlp.org/agent/mcp_tool_use/share/93d94694-e673-49d3-b805-820c4ef842bd) [MCP]() |
## 2. Dataset Introduction
We mainly provide the following types of AI agent datasets, as lists of messages in JSON format together with scalar data such as rewards.
| Dataset Name | Description | User Feedback | Example Dataset Download | Full DataSet Download |
|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ---- |-------|-----------------------------------------------------------------------------------------------------------------|
| Tool Use Multi-Turn Dialogue | Multi-turn tool use dialogues in list-of-messages format. Useful for AI Search, Deep Research, Map, Financial Data, etc. | YES | 50 instances, [Download](https://www.deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-tool-use-dialogue-open-dataset-example) | 1k, [Download](https://www.deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-tool-use-dialogue-open-dataset) |
| Function Calling Tool Use | Each instance takes **messages** and **available tools** as input and outputs the chosen **tool_call**, indicating which tool to use and with which arguments. The data is collected from calls to SOTA LLMs such as Qwen, Kimi, etc. | No | 50 instances, [Download](https://deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-function-calling-open-dataset-example) | 1k, [Download](https://deepnlp.org/store/ai-agent/ai-agent/pub-deepnlp/agent-function-calling-open-dataset) |
| Reinforcement Learning | Sessions of multi-turn dialogue between user and assistant, with rewards derived from the users' feedback in that session, such as confirmation clicks (Accept/Reject) and Upvote/Downvote on responses. | YES | 50 instances, [Download](https://deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-reinforcement-learning-open-dataset-example) | 1k, [Download](https://deepnlp.org/store/dataset/dataset/pub-deepnlp/agent-reinforcement-learning-open-dataset) |
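The Reinforcement Learning row describes rewards derived from user feedback. As a rough sketch of how such feedback events might be folded into one scalar per session: the event names follow the feedback types named in the table, but the numeric values, the flat list of events, and the `session_reward` helper are all assumptions made for illustration, since the dataset's actual reward schema is not shown here.

```python
# Hypothetical reward values; the dataset's real scoring is not documented here.
FEEDBACK_REWARD = {
    "ACCEPT": 1.0,
    "REJECT": -1.0,
    "UPVOTE": 1.0,
    "DOWNVOTE": -1.0,
}


def session_reward(feedback_events):
    """Sum the reward of every recognised feedback event; unknown events count as 0."""
    return sum(FEEDBACK_REWARD.get(event.upper(), 0.0) for event in feedback_events)


# Made-up feedback for one session, including an event type that is ignored.
example_feedback = ["ACCEPT", "Upvote", "REJECT", "scroll"]
total = session_reward(example_feedback)
print(total)
```

A plain sum is only one possible aggregation; a real training pipeline might instead weight the final-response vote more heavily or keep per-turn rewards separate.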
### Dataset 1 Tool Use Multi-Turn Dialogue Dataset
**Dataset Description**
| KEY | Type | Description |
| ---- |---------------------|-----------------------------------------------------------------------------------------------------------------------------|
| trace_id | String | Identifies each unique user request or API call |
| session_id | String | Identifies a dialogue session. A session consists of multiple turns, and every user input produces a new trace_id |
| messages | List of JSON objects | Dialogue messages |
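Because one `session_id` spans several `trace_id` values, a common first step is to regroup records by session. The sketch below assumes each record is a dict with exactly the three keys from the table; the sample values and the `group_by_session` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records shaped like the table above (all values are made up).
records = [
    {"trace_id": "t-1", "session_id": "s-1",
     "messages": [{"role": "user", "content": "Hi"}]},
    {"trace_id": "t-2", "session_id": "s-1",
     "messages": [{"role": "user", "content": "And tomorrow?"}]},
    {"trace_id": "t-3", "session_id": "s-2",
     "messages": [{"role": "user", "content": "Hello"}]},
]


def group_by_session(rows):
    """Map each session_id to the trace_ids recorded under it, in input order."""
    sessions = defaultdict(list)
    for row in rows:
        sessions[row["session_id"]].append(row["trace_id"])
    return dict(sessions)


sessions = group_by_session(records)
print(sessions)
```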
This data instance shows a multi-turn dialogue in which a user calls the Google Maps **get_weather** tool to check the current weather in San Francisco. A dialogue can contain the following kinds of content:
```
User: query, original question that user asks,
User: available_tools, List of Json that user provides to LLM,
Assistant: message, content.type='tool_use', LLM output which tool to use and its parameters,
User: message, content.type='tool_result', Users' actual function call running results.
```
```
[
{
"role": "user",
"content": "What is the weather like in San Francisco?"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "I need to use get_weather, and the user wants SF, which is likely San Francisco, CA."
},
{
"type": "tool_use",
"id": "toolu_01A09q90qw90lq917835lq9",
"name": "get_weather",
"input": {
"location": "San Francisco, CA",
"unit": "celsius"
}
}
]
},
{
"role": "user",
"content": [
{
"type": "tool_result",
"tool_use_id": "toolu_01A09q90qw90lq917835lq9",
"content": "15 degrees"
}
]
}
]
```
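In this format, a `tool_result` block refers back to its call through `tool_use_id`. A minimal sketch of pairing each call with its result, using the example dialogue above as input; the `pair_tool_calls` helper is a hypothetical name and not part of the dataset or any SDK.

```python
def pair_tool_calls(messages):
    """Return (tool name, input, result) for each tool_use with a matching tool_result."""
    calls = {}
    pairs = []
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            continue  # plain-text turns carry no tool blocks
        for block in content:
            if block.get("type") == "tool_use":
                calls[block["id"]] = block
            elif block.get("type") == "tool_result":
                call = calls.get(block["tool_use_id"])
                if call is not None:
                    pairs.append((call["name"], call["input"], block["content"]))
    return pairs


dialogue = [
    {"role": "user", "content": "What is the weather like in San Francisco?"},
    {"role": "assistant", "content": [
        {"type": "text",
         "text": "I need to use get_weather, and the user wants SF, "
                 "which is likely San Francisco, CA."},
        {"type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9",
         "name": "get_weather",
         "input": {"location": "San Francisco, CA", "unit": "celsius"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result",
         "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
         "content": "15 degrees"},
    ]},
]

pairs = pair_tool_calls(dialogue)
print(pairs)
```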
Note that the function call format differs between models. We mainly collect data in the OpenAI and Anthropic function calling formats.
Both are supported; see each vendor's official documentation for the differences.
**Multi-modal and Files**: Images and raw file descriptions, such as file paths, are also attached as context variables.
<img src="https://raw.githubusercontent.com/AI-Agent-Hub/mcp-marketplace/refs/heads/main/app/mcp_tool_use/docs/office_excel_use_agent.jpg" style="height:600px;" alt="Excel Spreadsheet Usage">
**OpenAI/Qwen/etc Function Call Formats**
```
{
"tool_call": {
"id": "call_d6f4ed29ce614390b99a05",
"function": {
"arguments": "{\"url\": \"https://www.stackoverflow.com\", \"browserType\": \"chromium\"}",
"name": "playwright_navigate"
},
"type": "function",
"index": 0
}
}
```
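In the example above, `arguments` is a JSON-encoded string rather than a nested object, so it needs a second decoding step before use. A minimal sketch, assuming records follow exactly the layout shown; `decode_tool_call` is a hypothetical helper name.

```python
import json

# The record from the example above, written as a Python dict.
record = {
    "tool_call": {
        "id": "call_d6f4ed29ce614390b99a05",
        "function": {
            "arguments": "{\"url\": \"https://www.stackoverflow.com\", \"browserType\": \"chromium\"}",
            "name": "playwright_navigate",
        },
        "type": "function",
        "index": 0,
    }
}


def decode_tool_call(rec):
    """Return the tool name and its arguments decoded from the JSON string."""
    function = rec["tool_call"]["function"]
    return function["name"], json.loads(function["arguments"])


name, arguments = decode_tool_call(record)
print(name, arguments)
```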
**Anthropic Tool Use Formats**
```
{
"type": "tool_use",
"id": "toolu_01A09q90qw90lq917835lq9",
"name": "get_weather",
"input": {
"location": "San Francisco, CA",
"unit": "celsius"
}
}
```
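When a training pipeline expects a single layout, the two formats shown in this section can be mapped onto each other. The sketch below rewrites a `tool_use` block in the `tool_call` layout shown earlier; the `tool_use_to_tool_call` helper and the choice to keep the original `id` unchanged are assumptions for illustration, not something the dataset prescribes.

```python
import json


def tool_use_to_tool_call(block, index=0):
    """Re-express a tool_use block in the tool_call layout shown earlier."""
    if block.get("type") != "tool_use":
        raise ValueError("expected a block with type 'tool_use'")
    return {
        "tool_call": {
            "id": block["id"],  # kept verbatim; no new id is generated
            "function": {
                # this layout stores arguments as a JSON-encoded string
                "arguments": json.dumps(block["input"]),
                "name": block["name"],
            },
            "type": "function",
            "index": index,
        }
    }


tool_use = {
    "type": "tool_use",
    "id": "toolu_01A09q90qw90lq917835lq9",
    "name": "get_weather",
    "input": {"location": "San Francisco, CA", "unit": "celsius"},
}

converted = tool_use_to_tool_call(tool_use)
print(json.dumps(converted, indent=2))
```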
Provider: maas
Created: 2025-10-13
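Summary
Background overview
This is a multi-turn dialogue dataset focused on agent tool use. It uses a list-of-messages JSON format and suits scenarios such as AI search, deep research, map planning, and financial data. Its core features: it is collected from genuine user queries and AI agent logs to ensure data quality; it covers multiple tool call formats (such as OpenAI and Anthropic); and it includes user feedback (such as accept/reject and upvote/downvote) to support reinforcement learning. The dataset provides 50 example instances and 1k full instances, and can be used to train and optimize large language models for function calling and tool use.
The content above was collected and summarized by 遇见数据集.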