Self-Knowledge Tasks
arXiv, indexed 2025-09-30
Download link:
https://github.com/knowledge-verse-ai/LLM-Self_Knowledge_Eval
Resource description:
This dataset contains 450 feasible tasks and 450 infeasible tasks generated by LLMs, 900 tasks in total, designed to evaluate the consistency of feasibility boundaries and the types of self-knowledge. It also includes the prompts used for task generation, examples of feasible and infeasible tasks, and the results of several high-performing models. The tasks are balanced across the distinct self-knowledge types. The goal of the evaluation is to assess how well LLMs know their own task feasibility boundaries.
Provider:
Knowledge Verse AI



