BFCL-Hi
ModelScope community listing (updated 2025-12-04, indexed 2025-11-03)
Download link:
https://modelscope.cn/datasets/nv-community/BFCL-Hi
## Dataset Description:
The BFCL-Hi (Hindi BFCL) dataset evaluates the function-calling capability of large language models (LLMs) when questions are asked in Hindi. It is a translation of the English BFCL dataset, produced with GCP (Google Cloud Platform) translation. The original question-function-answer pairs, which span various domains and multiple languages, were curated in English.
This dataset is ready for commercial/non-commercial use.
The evaluation steps are described [here](https://huggingface.co/datasets/nvidia/BFCL-Hi/blob/main/EVAL.md).
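
To make the idea of scoring a function call concrete, here is a minimal sketch in Python. It is not the procedure from EVAL.md and does not use the dataset's real schema: the field names (`question`, `functions`, `expected_call`), the `get_weather` function, and the exact-match rule are all assumptions chosen for illustration. A real evaluation would query an LLM with the Hindi question and the function definitions, then compare its output with the reference answer.

```python
import json

# Hypothetical sample: field names are illustrative, not the dataset's actual schema.
sample = {
    "question": "दिल्ली में आज का मौसम कैसा है?",  # "How is the weather in Delhi today?"
    "functions": [
        {
            "name": "get_weather",  # hypothetical tool definition
            "parameters": {"city": "string", "unit": "string"},
        }
    ],
    "expected_call": {
        "name": "get_weather",
        "arguments": {"city": "Delhi", "unit": "celsius"},
    },
}


def call_matches(predicted: dict, expected: dict) -> bool:
    """Return True when the function name and all arguments match exactly."""
    return (
        predicted.get("name") == expected["name"]
        and predicted.get("arguments") == expected["arguments"]
    )


def score(model_output: str, expected: dict) -> bool:
    """Parse the model's JSON output and compare it with the expected call."""
    try:
        predicted = json.loads(model_output)
    except json.JSONDecodeError:
        return False  # unparseable output counts as a failed call
    return isinstance(predicted, dict) and call_matches(predicted, expected)


# Stand-in for a model response; a real run would query an LLM with the question.
model_output = '{"name": "get_weather", "arguments": {"city": "Delhi", "unit": "celsius"}}'
print(score(model_output, sample["expected_call"]))
```

Exact matching is the simplest possible rule. It is used here only to keep the sketch short; an actual harness may accept several valid argument values or compare calls structurally.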
## Dataset Owner:
NVIDIA Corporation
## Dataset Creation Date:
April 2025
## License/Terms of Use:
This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0). Additional Information: Apache 2.0 License.
## Intended Usage:
Evaluate an LLM's ability to call functions and tools when queries are asked in Hindi.
## Dataset Characterization
Data Collection Method<br>
* Synthetic<br>
Labeling Method<br>
* Not Applicable <br>
## Dataset Format
Text
## Dataset Quantification
6.9 MB of prompt-label pairs, comprising 2,251 individual samples.
## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
## Citing
If you find our work helpful, please consider citing our paper:
```
@article{kamath2025benchmarking,
title={Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis},
author={Kamath, Anusha and Singla, Kanishk and Paul, Rakesh and Joshi, Raviraj and Vaidya, Utkarsh and Chauhan, Sanjay Singh and Wartikar, Niranjan},
journal={arXiv preprint arXiv:2508.19831},
year={2025}
}
```
Provider: maas
Created: 2025-10-09



