EVOUNA

Name: EVOUNA
Creator: Authors of the paper
License: No description available

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/wangcunxiang/QA-Eval

Description:

EVOUNA is designed to evaluate how accurately AI-generated answers in open-domain question answering match the standard reference answers. Its instances were filtered to make the evaluation more reliable, and the authors carried out the manual annotation themselves to ensure data quality. The dataset contains 3,610 instances in the NQ subset and 500 instances in the TQ subset. Its task is to evaluate question-answering evaluation itself (QA-Eval).
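As a rough illustration of the QA-Eval task described above, the sketch below scores a simple automatic evaluator (normalized substring matching) against human correctness labels. The record fields (`generated`, `golden`, `human_label`) and the sample records are hypothetical and do not reflect the dataset's actual schema; lexical matching is only one possible evaluator, chosen here for brevity.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def lexical_match(generated: str, golden_answers: list[str]) -> bool:
    """Judge an answer correct if any normalized reference answer appears in it."""
    norm = normalize(generated)
    return any(normalize(g) in norm for g in golden_answers)


def evaluator_accuracy(records: list[dict]) -> float:
    """Fraction of records where the automatic verdict equals the human label."""
    agree = sum(
        lexical_match(r["generated"], r["golden"]) == r["human_label"]
        for r in records
    )
    return agree / len(records)


# Hypothetical records; field names are illustrative, not the dataset's schema.
records = [
    {"generated": "The capital of France is Paris.", "golden": ["Paris"], "human_label": True},
    {"generated": "It was written by Jane Austen.", "golden": ["Charlotte Bronte"], "human_label": False},
    {"generated": "The USA.", "golden": ["United States"], "human_label": True},
]

accuracy = evaluator_accuracy(records)
```

In this toy sample the third record is one where a human would accept the answer but substring matching would not, which is the kind of evaluator error a human-annotated benchmark is meant to expose.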

Provided by:

Authors of the paper
