ibm-research/fc-reward-bench
Source: Hugging Face. Updated: 2025-09-22. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/ibm-research/fc-reward-bench
Overview:
fc-reward-bench is a benchmark designed to evaluate reward model performance in function-calling tasks. It features 1,500 unique user inputs derived from the single-turn splits of the BFCL-v3 dataset. Each input is paired with both correct and incorrect function calls. Correct calls are sourced directly from BFCL, while incorrect calls are generated by 25 permissively licensed models.
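
Because every input comes with a correct and an incorrect call, one natural way to score a reward model on data of this shape is pairwise accuracy: the fraction of inputs for which the model scores the correct call strictly higher than the incorrect one. The sketch below illustrates that idea only. The `PreferencePair` record, its field names, the `toy_score` function, and the three sample pairs are all hypothetical stand-ins; they are not the dataset's actual schema, contents, or official evaluation protocol.

```python
from typing import Callable, Iterable, NamedTuple


class PreferencePair(NamedTuple):
    # Hypothetical record layout; the real dataset's columns may differ.
    user_input: str
    correct_call: str
    incorrect_call: str


def pairwise_accuracy(
    pairs: Iterable[PreferencePair],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where the correct call gets a strictly higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs to evaluate")
    wins = sum(
        1
        for p in pairs
        if score(p.user_input, p.correct_call) > score(p.user_input, p.incorrect_call)
    )
    return wins / len(pairs)


def toy_score(user_input: str, call: str) -> float:
    # Placeholder for a real reward model: counts how many words of the
    # user input appear (case-insensitively) somewhere in the call string.
    words = [w.strip("?.,!").lower() for w in user_input.split()]
    return float(sum(1 for w in words if w and w in call.lower()))


# Made-up examples for illustration only (not taken from the dataset).
pairs = [
    PreferencePair("Weather in Paris?", 'get_weather(city="Paris")', 'get_weather(city="London")'),
    PreferencePair("Add 2 and 3", "add(a=2, b=3)", "add(a=2, b=5)"),
    PreferencePair("List files", 'ls(path=".")', 'ls(path="/tmp")'),
]

accuracy = pairwise_accuracy(pairs, toy_score)
print(f"pairwise accuracy: {accuracy:.3f}")
```

Ties count as failures here because the comparison is strict; whether that is the right convention depends on the evaluation setup, so treat it as an assumption of this sketch.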
Provider:
ibm-research



