AISA-Framework/AISA-AR-FunctionCall-Reasoning
Source: Hugging Face | Updated: 2026-03-04 | Indexed: 2026-04-05
Download link:
https://hf-mirror.com/datasets/AISA-Framework/AISA-AR-FunctionCall-Reasoning
Resource description:
---
language:
- ar
license: apache-2.0
tags:
- function-calling
- tool-use
- agentic
- arabic
- reasoning
- think
- llm-training
- agentic-ai
- agents
- structured-output
- chain-of-thought
pretty_name: AISA-AR-FunctionCall-Think
size_categories:
- 10K<n<100K
task_categories:
- text-generation
task_ids:
- language-modeling
---
# AISA-AR-FunctionCall-Think
**Arabic Reasoning-Augmented Function Calling Dataset**
`AISA-AR-FunctionCall-Think` is a reasoning-augmented subset of the [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) dataset designed for training models that perform **explicit reasoning before tool invocation**.
Each target output places a structured reasoning trace inside a `<think>` block before the tool call. Models trained on this format learn to produce interpretable decision steps before executing structured API actions.
This dataset supports research on **reasoning-aware tool calling in Arabic agentic AI systems**.
---
## Dataset Overview
Each example in the dataset contains:
- Arabic user request
- Tool schema definitions
- Reasoning trace (`<think>` block)
- Structured tool call
- Argument annotations
- Metadata labels
**Model output format:**
```
<think>
reasoning about tool selection
</think>
<start_function_call>
call:tool_name{arguments}
<end_function_call>
```
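The layout above can be split into its parts with a single regular expression. The following is a minimal sketch, not an official parser: the names `OUTPUT_RE`, `ParsedOutput`, and `parse_output` are hypothetical, and the pattern assumes exactly one `<think>` block followed by one function call.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: a <think> block, then one call wrapped in
# <start_function_call> ... <end_function_call>.
OUTPUT_RE = re.compile(
    r"<think>\s*(?P<think>.*?)\s*</think>\s*"
    r"<start_function_call>\s*call:(?P<tool>[^\s{]+)\{(?P<args>.*)\}\s*"
    r"<end_function_call>",
    re.DOTALL,
)


class ParsedOutput(NamedTuple):
    think: str      # reasoning trace without the surrounding tags
    tool: str       # tool name after "call:"
    raw_args: str   # unparsed text between the braces


def parse_output(text: str) -> Optional[ParsedOutput]:
    """Return the three parts of a model output, or None if it does not match."""
    match = OUTPUT_RE.search(text)
    if match is None:
        return None
    return ParsedOutput(match.group("think"), match.group("tool"), match.group("args"))
```

Returning `None` instead of raising keeps malformed generations easy to count during evaluation.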
---
## Dataset Statistics
| Property | Value |
|---|---|
| Dataset size | ~12,000 samples |
| Dialect coverage | 5 Arabic dialects |
| Domains | 8 real-world domains |
| Tools | 27 structured tools |
This dataset is derived from the larger **AISA-AR-FunctionCall** corpus.
---
## Dialect Coverage
| Dialect |
|---|
| Modern Standard Arabic (MSA) |
| Gulf Arabic |
| Egyptian Arabic |
| Levantine Arabic |
| Maghrebi Arabic |
---
## Domains
| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |
---
## Example Sample
**User query:**
```
ما حالة الطقس في الرياض اليوم؟
```
**Expected model output:**
```
<think>
المستخدم يريد معرفة حالة الطقس في مدينة الرياض.
الأداة المناسبة هي get_weather.
</think>
<start_function_call>
call:get_weather{city:<escape>الرياض<escape>,days:1}
<end_function_call>
```
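The argument string in this example can be turned into a Python dictionary. The sketch below assumes, based only on the sample shown, that arguments are comma-separated `key:value` pairs, that string values are wrapped in `<escape>` markers, and that other values are bare tokens. Nested objects and lists are not handled. The names `PAIR_RE`, `coerce`, and `parse_arguments` are hypothetical.

```python
import re
from typing import Any, Dict

ESC = "<escape>"

# key:<escape>string<escape>  or  key:bare_value, separated by commas.
PAIR_RE = re.compile(
    r"(?P<key>\w+):(?:"
    + re.escape(ESC) + r"(?P<str>.*?)" + re.escape(ESC)
    + r"|(?P<bare>[^,]*))(?:,|$)",
    re.DOTALL,
)


def coerce(token: str) -> Any:
    """Best-effort conversion of a bare token (assumed: bool, int, float, else str)."""
    token = token.strip()
    if token in ("true", "false"):
        return token == "true"
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        return token


def parse_arguments(raw: str) -> Dict[str, Any]:
    """Parse the text between the braces of a call into a dictionary."""
    args: Dict[str, Any] = {}
    for match in PAIR_RE.finditer(raw):
        if match.group("str") is not None:
            args[match.group("key")] = match.group("str")  # escaped string, kept as-is
        else:
            args[match.group("key")] = coerce(match.group("bare"))
    return args
```

For the sample above, `parse_arguments("city:<escape>الرياض<escape>,days:1")` yields a dictionary with the string `city` and the integer `days`.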
---
## Data Format
Each example in the dataset contains the following fields:
| Field | Description |
|---|---|
| `messages` | Conversation messages (developer system prompt + user query) |
| `tools` | Tool schema definitions available for the query |
| `think` | Reasoning trace explaining tool selection |
| `tool_called` | Ground truth tool name |
| `arguments` | Structured argument dictionary |
| `domain` | Task domain (e.g., weather, banking) |
| `dialect` | Arabic dialect group |
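As an illustration of how these fields relate to the model output format, the sketch below assembles a target string from `think`, `tool_called`, and `arguments`. It assumes `arguments` is already a Python dictionary and that values serialize as in the card's example (strings wrapped in `<escape>`, numbers bare). The helpers `render_value` and `render_target` are hypothetical and may not match how the authors built their training targets.

```python
from typing import Any, Dict


def render_value(value: Any) -> str:
    """Serialize one argument value (assumed rules, inferred from the card's example)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return f"<escape>{value}<escape>"
    return str(value)


def render_target(record: Dict[str, Any]) -> str:
    """Build the <think> + function-call string from one dataset record."""
    args = ",".join(
        f"{key}:{render_value(value)}" for key, value in record["arguments"].items()
    )
    return (
        f"<think>\n{record['think']}\n</think>\n"
        f"<start_function_call>\n"
        f"call:{record['tool_called']}{{{args}}}\n"
        f"<end_function_call>"
    )
```

If `arguments` is stored as a JSON string in a given release, decode it with `json.loads` before calling `render_target`.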
---
## Dataset Construction
The dataset was generated through a **reasoning augmentation pipeline** applied to the base AISA-AR-FunctionCall dataset.
**Pipeline steps:**
1. Select structured tool-calling examples from the base corpus
2. Generate reasoning traces explaining tool selection decisions
3. Insert reasoning inside `<think>` blocks
4. Preserve structured tool-call supervision from original annotations
5. Validate reasoning-tool alignment
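The card does not describe how step 5 is implemented. One simple heuristic, shown here purely as an assumption, is to check that the reasoning trace is non-empty, that the labeled tool is among the available tools, and that the trace names that tool. The function `alignment_issues` is hypothetical, and it takes plain tool names to avoid assuming a schema layout for `tools`.

```python
from typing import Any, Dict, Iterable, List


def alignment_issues(record: Dict[str, Any], tool_names: Iterable[str]) -> List[str]:
    """Return a list of problems found in one record; an empty list means it passed."""
    issues: List[str] = []
    think = (record.get("think") or "").strip()
    tool = record.get("tool_called")

    if not think:
        issues.append("empty reasoning trace")
    if tool not in set(tool_names):
        issues.append("tool_called is not among the available tools")
    elif think and tool not in think:
        # Heuristic only: a trace may justify a tool without naming it.
        issues.append("reasoning trace does not mention the called tool")
    return issues
```

A string-match check like this catches obvious mismatches only; it cannot judge whether the reasoning is actually sound.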
---
## Intended Use
This dataset is designed for:
- Reasoning-aware tool-calling model training
- Interpretable Arabic AI agents
- Arabic reasoning supervision research
- Structured decision modeling
- Agent alignment experiments
### Out-of-Scope Uses
- General Arabic NLP tasks (classification, summarization, translation)
- Production deployment without validation of reasoning quality
- Safety-critical systems
---
## Known Limitations
- Reasoning traces are short and may not cover multi-step reasoning chains
- Some queries require deeper semantic interpretation than current traces provide
- `<think>` blocks increase output length, which may affect latency in production
- Standard function-call validators may flag outputs as parse failures because the `<think>` block precedes the function-call marker; this is a format difference, not a structural error
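
One way to work around the validator limitation is to remove the reasoning block before validation. The snippet below is a minimal sketch that assumes a single leading `<think>...</think>` block; `THINK_RE` and `strip_think` are hypothetical names, not part of any released tooling.

```python
import re

# Matches the first <think>...</think> block plus any whitespace after it.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(output: str) -> str:
    """Drop the reasoning block so only the function call remains."""
    return THINK_RE.sub("", output, count=1).lstrip()
```

The stripped string can then be passed to a validator that expects the output to begin with `<start_function_call>`.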
---
## Related Resources
| Resource | Link |
|---|---|
| Base dataset | [AISA-AR-FunctionCall](https://huggingface.co/datasets/AISA-Framework/AISA-AR-FunctionCall) |
| Reasoning model | [AISA-AR-FunctionCall-Think](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-Think) |
| Production model | [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) |
| Full collection | [AISA Arabic FunctionCall](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models) |
---
## License
[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
---
Provider:
AISA-Framework



