AISA-Framework/AISA-AR-FunctionCall
Source: Hugging Face · Updated 2026-03-15 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/AISA-Framework/AISA-AR-FunctionCall
---
language:
- ar
license: apache-2.0
tags:
- function-calling
- tool-use
- agentic
- arabic
- llm-training
- agentic-ai
- agents
- structured-output
pretty_name: AISA-AR-FunctionCall
size_categories:
- 10K<n<100K
task_categories:
- text-generation
task_ids:
- language-modeling
---
# AISA-AR-FunctionCall
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/628f7a71dd993507cfcbe587/PzKodJNvt9RkR-Q3agKHT.png" width="700"/>
</p>
**Arabic Structured Function Calling Dataset**
`AISA-AR-FunctionCall` is a large-scale Arabic dataset designed for training language models to convert natural language into structured executable tool calls.
The dataset enables research and development of **Arabic agentic AI systems** capable of invoking APIs, tools, and external services.
It is part of the **AISA (Agentic AI Systems Architecture)** initiative.
---
## Dataset Overview
The dataset contains **structured tool-calling examples in Arabic** across multiple dialects and real-world domains.
Each sample includes:
- Arabic user query
- Tool schema definitions
- Expected tool call
- Structured arguments
- Metadata annotations
The dataset supports training models to generate outputs in the **FunctionGemma structured tool-calling format**.
---
## Dataset Statistics
| Property | Value |
|---|---|
| Total samples | 50,810 |
| Training samples | 41,104 |
| Validation samples | 4,568 |
| Test samples | 5,079 |
| Tools | 27 |
| Domains | 8 |
| Dialect groups | 5 |
---
## Arabic Dialects
The dataset includes five Arabic dialect groups, enabling training of models that are robust to linguistic variation across the Arab world:
| Dialect |
|---|
| Modern Standard Arabic (MSA) |
| Gulf Arabic |
| Egyptian Arabic |
| Levantine Arabic |
| Maghrebi Arabic |
---
## Domains
The dataset covers eight real-world task domains, selected to represent typical tool-based AI assistant tasks:
| Domain |
|---|
| Travel |
| Utilities |
| Islamic services |
| Weather |
| Healthcare |
| Banking & finance |
| E-commerce |
| Government services |
---
## Tool Schema
Each tool is defined using a structured schema including function name, description, parameter types, and required arguments.
**Example tool schema:**
```json
{
"name": "get_weather",
"description": "الحصول على حالة الطقس",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string"},
"days": {"type": "integer"}
},
"required": ["city"]
}
}
```
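A schema like the one above can be used to check candidate arguments before a call is executed. The following is a minimal sketch, assuming the schema follows the JSON Schema subset shown in the example; `validate_arguments` and `JSON_TYPES` are hypothetical names introduced here, not part of the dataset or any library.

```python
# Hypothetical helper: checks required keys, basic types, and optional enums.
# Only the JSON Schema subset visible in the card's example is handled.
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

WEATHER_SCHEMA = {
    "name": "get_weather",
    "description": "الحصول على حالة الطقس",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}, "days": {"type": "integer"}},
        "required": ["city"],
    },
}


def validate_arguments(schema, arguments):
    """Return a list of problems; an empty list means the arguments pass."""
    params = schema.get("parameters", {})
    properties = params.get("properties", {})
    errors = []
    for name in params.get("required", []):
        if name not in arguments:
            errors.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in properties:
            errors.append(f"unknown argument: {name}")
            continue
        spec = properties[name]
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_stray_bool = isinstance(value, bool) and type_name != "boolean"
        if expected is not None and (not isinstance(value, expected) or is_stray_bool):
            errors.append(f"wrong type for {name}: expected {type_name}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"value for {name} not in enum")
    return errors


print(validate_arguments(WEATHER_SCHEMA, {"city": "الرياض", "days": 1}))
print(validate_arguments(WEATHER_SCHEMA, {"days": "1"}))
```

The first call should report no problems, while the second should flag the missing `city` and the string passed for `days`.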
---
## Example Sample
**User request:**
```
ما حالة الطقس في الرياض اليوم؟
```
**Expected model output:**
```
<start_function_call>
call:get_weather{city:<escape>الرياض<escape>,days:1}
<end_function_call>
```
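To execute or score an output like the one above, it first has to be parsed back into a tool name and an argument dictionary. The sketch below infers the grammar from this single example only: string values are wrapped in `<escape>` markers and other values are bare literals separated by commas. Nested objects and lists are not handled, and `parse_function_call` is a hypothetical helper, not an official parser.

```python
import re

# Patterns inferred from the single example in this card (an assumption).
CALL_RE = re.compile(
    r"<start_function_call>\s*call:(\w+)\{(.*)\}\s*<end_function_call>", re.DOTALL
)
ARG_RE = re.compile(r"(\w+):(?:<escape>(.*?)<escape>|([^,]+))", re.DOTALL)


def _coerce(raw):
    """Turn a bare literal into bool, int, or float when possible."""
    raw = raw.strip()
    if raw in ("true", "false"):
        return raw == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def parse_function_call(text):
    """Return (tool_name, arguments), or None if no call is present."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, body = match.groups()
    arguments = {}
    for arg in ARG_RE.finditer(body):
        key, escaped, bare = arg.groups()
        # Escaped values stay strings; bare values are coerced to a literal.
        arguments[key] = escaped if escaped is not None else _coerce(bare)
    return name, arguments


sample = (
    "<start_function_call>\n"
    "call:get_weather{city:<escape>الرياض<escape>,days:1}\n"
    "<end_function_call>"
)
print(parse_function_call(sample))
```

Returning `None` when no call is found lets the caller treat plain-text replies separately from tool invocations.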
---
## Data Format
Each example in the dataset contains the following fields:
| Field | Description |
|---|---|
| `messages` | Conversation messages (developer system prompt + user query) |
| `tools` | Tool schema definitions available for the query |
| `requires_function` | Boolean: whether a tool should be invoked |
| `tool_called` | Ground truth tool name |
| `arguments` | Structured argument dictionary |
| `domain` | Task domain (e.g., weather, banking) |
| `dialect` | Arabic dialect group |
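The fields above are enough to rebuild the target string for a positive sample. The sketch below is one possible rendering, assuming `arguments` is a plain dictionary, string values are wrapped in `<escape>` markers, and the output layout matches the example in this card. The record shown is made up for illustration, including the `dialect` value, and `render_function_call` is a hypothetical helper. How negative samples (`requires_function` is false) are rendered is not specified here, so the function returns `None` for them.

```python
def render_function_call(record):
    """Build the target string from `tool_called` and `arguments`."""
    if not record["requires_function"]:
        return None  # target text for negative samples is not specified here
    parts = []
    for key, value in record["arguments"].items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            rendered = f"<escape>{value}<escape>"
        else:
            rendered = str(value)
        parts.append(f"{key}:{rendered}")
    call = "call:" + record["tool_called"] + "{" + ",".join(parts) + "}"
    return "\n".join(["<start_function_call>", call, "<end_function_call>"])


# Illustrative record; field names follow the table, values are invented.
record = {
    "requires_function": True,
    "tool_called": "get_weather",
    "arguments": {"city": "الرياض", "days": 1},
    "domain": "weather",
    "dialect": "MSA",
}
print(render_function_call(record))
```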
---
## Data Cleaning and Repair
The dataset was constructed through a **data-centric restructuring pipeline**. Major repair steps included:
- Structural auditing of all samples
- Enum constraint correction
- Normalization of argument values
- Tool schema consolidation
- Tool pruning (36 → 27 tools)
- Removal of duplicated tool definitions
- Prompt-length reduction via tool sampling
These steps significantly improved training stability for structured function calling.
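Two of these steps, removing duplicated tool definitions and reducing prompt length via tool sampling, can be sketched as follows. The card does not describe the actual sampling procedure, so this is only one plausible approach: keep the ground-truth tool and fill the remaining slots with randomly chosen distractors. The function names, the `max_tools` default, and the seeding are all assumptions.

```python
import random


def dedupe_tools(tools):
    """Keep the first definition seen for each tool name."""
    unique = {}
    for tool in tools:
        unique.setdefault(tool["name"], tool)
    return list(unique.values())


def sample_tools(tools, tool_called, max_tools=5, seed=0):
    """Return at most `max_tools` tools, always including the gold tool."""
    rng = random.Random(seed)
    unique = dedupe_tools(tools)
    gold = [t for t in unique if t["name"] == tool_called]
    distractors = [t for t in unique if t["name"] != tool_called]
    k = max(0, min(len(distractors), max_tools - len(gold)))
    chosen = gold + rng.sample(distractors, k)
    rng.shuffle(chosen)  # avoid always placing the gold tool first
    return chosen


# Hypothetical tool list with one duplicated definition.
tools = [{"name": f"tool_{i}"} for i in range(8)]
tools += [{"name": "get_weather"}, {"name": "get_weather"}]
picked = sample_tools(tools, "get_weather", max_tools=5, seed=1)
print([t["name"] for t in picked])
```

Shuffling the result keeps the position of the correct tool from becoming a shortcut the model could learn.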
### Key Issues Resolved
Initial experiments with the raw dataset revealed several structural problems:
| Issue | Status |
|---|---|
| Silent outputs for negative samples | Fixed |
| Enum validation errors | Fixed |
| Duplicated tool definitions | Removed |
| Prompt truncation from large tool sets | Resolved via tool sampling |
| Schema inconsistencies | Normalized |
After repair, the dataset became **schema-consistent and training-ready**.
---
## Intended Use
This dataset is designed for:
- Arabic tool-calling model training
- Agentic AI research
- Structured LLM evaluation
- Multilingual tool invocation research
- Arabic AI assistant development
### Out-of-Scope Uses
- General Arabic NLP tasks (sentiment, classification, summarization)
- Safety-critical decision systems without additional validation
---
## Limitations
Remaining challenges include:
- Semantic ambiguity in some cross-domain queries
- Overlapping tool descriptions (e.g., weather vs. air quality)
- Domain-specific terminology variation across dialects
Future versions will include additional tools and reasoning annotations.
---
## Related Models
Models trained on this dataset:
| Model | Description |
|---|---|
| [AISA-AR-FunctionCall-FT](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-FT) | Production fine-tuned model |
| [AISA-AR-FunctionCall-Think](https://huggingface.co/AISA-Framework/AISA-AR-FunctionCall-Think) | Reasoning-augmented variant |
---
## AISA Framework
This dataset is part of the **AISA** initiative for building reliable multilingual agentic AI systems.
Model & dataset collection: [AISA-Framework/aisa-arabic-functioncall-datasets-and-models](https://huggingface.co/collections/AISA-Framework/aisa-arabic-functioncall-datasets-and-models)
---
## Acknowledgment
We would like to thank **Hesham Haroon** for providing the original dataset:
🔗 https://huggingface.co/datasets/HeshamHaroon/Arabic_Function_Calling
That original dataset served as the foundation for our work. We adapted and transformed its data into a **mobile-action style format**, which was then used to train **FunctionGemma-based Arabic function-calling models**.
## License
[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
Provider: AISA-Framework



