T-Eval
收藏OpenXLab2026-04-18 收录
下载链接:
https://openxlab.org.cn/datasets/OpenDataLab/T-Eval
下载链接
链接失效反馈官方服务:
资源简介:
T-Eval disentangles the tool utilization evaluation into several sub-domains along model capabilities, facilitating the inner understanding of both holistic and isolated competency of LLMs. We conduct extensive experiments on T-Eval and in-depth analysis of various LLMs. T-Eval not only exhibits consistency with the outcome-oriented evaluation but also provides a more fine-grained analysis of the capabilities of LLMs, providing a new perspective in LLM evaluation on tool-utilization ability.
提供机构:
OpenDataLab
创建时间:
2024-05-14



