CRAG Benchmark

Name: CRAG Benchmark
Creator: KDD Cup 2024 Organizers
License: Not specified

Source: arXiv (indexed 2025-09-30)

Download link:

https://github.com/USTCAGI/CRAG-in-KDD-Cup2024

Overview:

The CRAG benchmark addresses a limitation of existing question answering benchmarks: they do not capture the diverse and dynamic challenges that retrieval-augmented generation (RAG) systems face. It provides a comprehensive framework for evaluating RAG system performance. The benchmark supports model-based automatic evaluation, with a scoring system that combines rule matching and GPT-4-based assessment. It contains 1,371 public test cases.
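To make the idea of combining rule matching with a model-based judge concrete, here is a minimal sketch in Python. It is not the benchmark's official scorer: the function names, the "I don't know" handling, and the score values (1, 0, -1) are illustrative assumptions, and the judge is a plain callable standing in for a GPT-4 call.

```python
from typing import Callable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def score_answer(
    prediction: str,
    reference: str,
    judge: Callable[[str, str], bool],
) -> int:
    """Score one answer: rules first, then a model-based judge as fallback.

    The score values below are assumptions made for illustration only.
    """
    pred, ref = normalize(prediction), normalize(reference)
    if pred == ref:
        return 1  # rule match: exact agreement after normalization
    if pred in {"", "i don't know"}:
        return 0  # abstention is treated as missing, not as wrong
    # No rule applied, so defer to the judge (hypothetical LLM stand-in).
    return 1 if judge(prediction, reference) else -1


def stub_judge(prediction: str, reference: str) -> bool:
    """Placeholder judge: accepts answers that contain the reference text."""
    return normalize(reference) in normalize(prediction)


examples = [
    ("Paris", "paris"),
    ("I don't know", "Paris"),
    ("It is Paris.", "Paris"),
    ("London", "Paris"),
]
scores = [score_answer(p, r, stub_judge) for p, r in examples]
print(scores)  # [1, 0, 1, -1]
```

Running the cheap rule check first means the more expensive judge is consulted only for answers the rules cannot settle. In practice `stub_judge` would be replaced by a function that sends the question, prediction, and reference to a language model.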

Provided by:

KDD Cup 2024 Organizers
