Compumacy/toolcall_bench

Name: Compumacy/toolcall_bench
Creator: Compumacy
Published: 2025-05-11 21:11:40
License: No description available

Source: Hugging Face | Updated: 2025-05-11 | Indexed: 2025-10-18

Download link:

https://hf-mirror.com/datasets/Compumacy/toolcall_bench

Description:

When2Call is a benchmark for evaluating the tool-calling decisions of large language models (LLMs): when to generate a tool call, when to ask a follow-up question, when to admit that the question can't be answered with the tools provided, and what to do when a question seems to require tool use but no tool call can be made. The dataset also provides a training set for When2Call and uses the multiple-choice nature of the benchmark to develop a preference optimization training regime, which shows considerable improvement over traditional fine-tuning for tool calling. This dataset is ready for commercial use.
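
The description says the multiple-choice format is used to build preference optimization training data. As a rough illustration of that idea only, the sketch below turns one multiple-choice item into (prompt, chosen, rejected) pairs by pairing the correct option against each incorrect one. The function name, the record layout, and the option labels are all hypothetical; they are not taken from the dataset's actual schema, and this is not necessarily how the authors built their training set.

```python
from typing import Dict, List, TypedDict


class PreferencePair(TypedDict):
    prompt: str
    chosen: str
    rejected: str


def build_preference_pairs(
    question: str, options: Dict[str, str], correct: str
) -> List[PreferencePair]:
    """Pair the correct option against every other option.

    `options` maps a hypothetical behaviour label to a candidate response;
    `correct` is the label of the preferred response.
    """
    if correct not in options:
        raise KeyError(f"correct label {correct!r} is not among the options")
    chosen = options[correct]
    # One pair per incorrect option: the correct response is always "chosen".
    return [
        {"prompt": question, "chosen": chosen, "rejected": text}
        for label, text in options.items()
        if label != correct
    ]


# Made-up example item; labels and texts are assumptions for illustration.
example_pairs = build_preference_pairs(
    question="What is the weather in Paris tomorrow?",
    options={
        "tool_call": '{"name": "get_weather", "arguments": {"city": "Paris"}}',
        "follow_up": "Which day do you mean?",
        "unable": "I can't answer that with the tools provided.",
    },
    correct="tool_call",
)
```

Each incorrect option yields one pair, so an item with three options produces two pairs. Pairs in this prompt/chosen/rejected shape are a common input format for preference optimization trainers, though the exact format any given trainer expects should be checked against its documentation.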

Provided by:

Compumacy
