UHGEval
Source: arXiv | Updated: 2024-05-24 | Indexed: 2024-06-21
Download link:
https://iaar-shanghai.github.io/UHGEval/
Resource description:
UHGEval is a dataset jointly developed by the School of Information at Renmin University of China, the Institute for Advanced Algorithms Research (Shanghai), and the State Key Laboratory of Media Convergence Production Technology and Systems. It is designed to evaluate hallucination in Chinese large language models (LLMs) under unconstrained generation. The dataset contains more than 5,000 instances, each annotated with keyword-level hallucination information. Its content is drawn mainly from Chinese news articles published between 2015 and 2017, covering politics, the economy, science and technology, society, and other domains. The instances were built with an unconstrained generation approach: text is fed directly to the model, and its output is collected without any restrictions. UHGEval is intended mainly for evaluating the reliability and accuracy of language models, with the aim of addressing hallucinations that can arise in real-world use and making language models more practical in professional settings.
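As a rough illustration of the unconstrained generation setup and keyword-level annotation described above, the following Python sketch may help. It is not the dataset's actual schema or tooling: the `Instance` fields, the `model` callable, and the `annotate` callable are all hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Instance:
    """Hypothetical record layout; field names are assumptions, not the real schema."""
    news_beginning: str                     # opening text of a news article
    continuation: str                       # model output collected without restrictions
    hallucinated_keywords: List[str] = field(default_factory=list)  # keyword-level labels


def generate_unconstrained(model: Callable[[str], str], news_beginning: str) -> str:
    """Pass the raw text to the model as the entire input.

    No instruction, template, or output restriction is added; that is the
    sense of "unconstrained" assumed in this sketch.
    """
    return model(news_beginning)


def build_instance(
    model: Callable[[str], str],
    news_beginning: str,
    annotate: Callable[[str, str], List[str]],
) -> Instance:
    """Generate a continuation, then attach keyword-level hallucination labels.

    `annotate` stands in for whatever labeling process is used; here it is
    only a placeholder that returns the keywords judged to be hallucinated.
    """
    continuation = generate_unconstrained(model, news_beginning)
    keywords = list(annotate(news_beginning, continuation))
    return Instance(news_beginning, continuation, keywords)
```

Any text-in, text-out callable can be passed as `model`, which keeps the sketch independent of a particular LLM client library.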
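Provided by:
School of Information, Renmin University of China, Beijing, China; Institute for Advanced Algorithms Research, Shanghai, China; State Key Laboratory of Media Convergence Production Technology and Systems, Beijing, China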
Created:
2023-11-26



