NLG Failure Annotations
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/tunde-ajayi/llm-as-a-qualitative-judge
Description:
This dataset contains 297 instances of annotated failure cases in outputs generated across a range of natural language generation (NLG) tasks. Each instance records the user input, the ground-truth response, the generated response, and task-specific metrics. The dataset is intended for error analysis: identifying error types in NLG outputs and clustering them into recurring issues.
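As a rough illustration of how one such instance might be represented and grouped for error analysis, here is a minimal Python sketch. The class name `FailureAnnotation`, every field name, the `issue_type` label, and the example records are assumptions made for illustration; they are not taken from the dataset's actual schema, which should be checked in the repository.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FailureAnnotation:
    """One annotated failure case. All field names here are hypothetical."""
    user_input: str
    reference_response: str    # ground-truth response
    generated_response: str    # system output under analysis
    metrics: Dict[str, float] = field(default_factory=dict)  # task-specific metrics
    issue_type: str = ""       # assumed free-text error label


def group_by_issue(records: List[FailureAnnotation]) -> Dict[str, List[FailureAnnotation]]:
    """Group records by exact issue label (a simple stand-in for clustering)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.issue_type].append(rec)
    return dict(groups)


# Made-up records, not drawn from the dataset.
examples = [
    FailureAnnotation("Translate 'chat' to English", "cat", "chat",
                      {"exact_match": 0.0}, "untranslated output"),
    FailureAnnotation("What is 2 + 2?", "4", "5",
                      {"exact_match": 0.0}, "wrong answer"),
    FailureAnnotation("Translate 'chien' to English", "dog", "chien",
                      {"exact_match": 0.0}, "untranslated output"),
]

groups = group_by_issue(examples)
for issue, recs in groups.items():
    print(issue, len(recs))
```

Grouping by exact label match is only the simplest possible baseline; clustering free-text error descriptions in practice would need some notion of similarity between labels.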
Provided by:
Authors of the paper



