TIGER-Lab/StructEval
Source: Hugging Face. Updated: 2025-09-23. Indexed: 2025-05-31
Download link:
https://hf-mirror.com/datasets/TIGER-Lab/StructEval
Description:
StructEval is a benchmark dataset for evaluating how well large language models (LLMs) generate and convert structured outputs. It covers 18 formats and 44 task types, and supports two kinds of task: generating a format from a natural language prompt, and converting one format to another. Each example includes a task identifier, a natural language query, feature requirements written in English, the input and output format types, a complete example input query, and evaluation metrics, among other fields.
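As a rough illustration of the per-example fields listed above, the record could be modeled as follows. This is only a sketch: the class name, every field name, the "Text" marker for natural-language input, and the sample values are hypothetical and are not taken from the dataset's actual schema, which should be checked on the dataset page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructEvalExample:
    """Hypothetical record layout; field names are illustrative assumptions."""
    task_id: str                 # task identifier
    query: str                   # natural language query
    feature_requirements: str    # feature requirements, written in English
    input_type: str              # input format type
    output_type: str             # output format type
    full_query: str              # complete example input query
    metrics: List[str] = field(default_factory=list)  # evaluation metrics


# Assumption: natural-language input is marked "Text". The real marker may differ.
NATURAL_LANGUAGE = "Text"


def is_conversion(example: StructEvalExample) -> bool:
    """Return True for format-to-format conversion, False for generation
    from a natural language prompt (under the marker assumption above)."""
    return example.input_type != NATURAL_LANGUAGE


# Made-up sample values, for illustration only.
sample = StructEvalExample(
    task_id="example-0001",
    query="Convert this JSON object to YAML.",
    feature_requirements="Output must keep every key of the input.",
    input_type="JSON",
    output_type="YAML",
    full_query='Convert this JSON object to YAML: {"a": 1}',
    metrics=["syntax_check"],
)
print(is_conversion(sample))
```

Separating the input and output format types in this way makes it easy to split the examples into the two task families the description mentions.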
Provided by:
TIGER-Lab



