RepLiQA: A Likely Question-Answer Dataset for Benchmarking

超神经 (HyperAI) · Updated 2024-06-27 · Indexed 2024-06-29

Download link:

https://hyper.ai/cn/datasets/32617


Resource overview:

RepLiQA is an evaluation dataset of "context-question-answer" triplets. Each context is a non-factual but natural-looking document about fictional entities (for example, people or places) that do not exist in reality. RepLiQA was created by humans to test how well large language models (LLMs) can find and use information in a document they are given. Unlike existing question-answering datasets, RepLiQA's non-factual content means a model's score is not confounded by facts the LLM memorized from its training data, so the benchmark measures the model's ability to use the provided context with more confidence.
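The triplet structure described above can be illustrated with a minimal sketch of a context-grounded evaluation loop. Everything here is an assumption made for illustration: the field names `context`, `question`, and `answer`, the prompt wording, the exact-match metric, and the toy record about a made-up town are not taken from the dataset itself, and `answer_fn` stands in for whatever LLM call is being evaluated.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Triplet:
    """One context-question-answer record (field names are hypothetical)."""
    context: str
    question: str
    answer: str


def build_prompt(item: Triplet) -> str:
    # Instruct the model to answer only from the supplied document.
    return (
        "Answer using only the document below.\n\n"
        f"Document:\n{item.context}\n\n"
        f"Question: {item.question}\nAnswer:"
    )


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so formatting differences are ignored.
    return " ".join(text.lower().split())


def exact_match_accuracy(
    items: Iterable[Triplet], answer_fn: Callable[[str], str]
) -> float:
    # answer_fn is a placeholder for any model call: prompt in, answer text out.
    records = list(items)
    if not records:
        return 0.0
    hits = sum(
        normalize(answer_fn(build_prompt(r))) == normalize(r.answer)
        for r in records
    )
    return hits / len(records)


# Invented toy record about a fictional town, for illustration only.
sample = [
    Triplet(
        context="The town of Veldmoor opened its first library in 2031.",
        question="When did Veldmoor open its first library?",
        answer="2031",
    )
]


def toy_model(prompt: str) -> str:
    # Stand-in "model": answers correctly only when the document is in the prompt.
    return "2031" if "Veldmoor" in prompt else "unknown"


context_score = exact_match_accuracy(sample, toy_model)
no_answer_score = exact_match_accuracy(sample, lambda prompt: "unknown")
```

Because the entities are fictional, a model that ignores the document has nothing memorized to fall back on, which is the property the comparison between `context_score` and `no_answer_score` is meant to suggest. A real evaluation would likely need a more forgiving metric than exact match.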

Created:

2024-06-26

Collected summary

Dataset introduction

Background and challenges

Background overview

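RepLiQA is a human-created dataset for evaluating large language models. It consists of non-factual documents and their corresponding question-answer pairs. Because the documents describe fictional entities, the dataset rules out interference from model memorization and focuses on testing a model's ability to use contextual information. It covers 17 different topics.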

The content above was collected and summarized by 遇见数据集.