aizip/Rag-Eval-Dataset-6k

Name: aizip/Rag-Eval-Dataset-6k
Creator: aizip
Published: 2025-04-18 22:50:23
License: not specified

Source: Hugging Face. Updated 2025-04-18. Indexed 2025-07-05.

Download link:

https://hf-mirror.com/datasets/aizip/Rag-Eval-Dataset-6k

Description:

RED6k is a dataset of roughly 6,000 samples across 10 domains, created by Aizip to evaluate language models as summarizers in retrieval-augmented generation (RAG) systems. It focuses on the challenges that arise when Small Language Models (SLMs) are used in local RAG deployments. Each sample is a JSON object with the following fields: question, contexts, answer, difficulty, num_contexts, and Answerable. The Answerable flag sets the expected model behavior: when true, the model should generate a response based on the provided context; when false, the model should refuse to answer and may offer clarifying questions to help refine the query. The dataset is intended for benchmarking RAG system performance, fine-tuning SLMs for improved RAG capabilities, and evaluating models' ability to recognize their knowledge boundaries.
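
As a rough illustration of the record layout and the Answerable rule described above, here is a minimal Python sketch. The field names are taken from the description; the example record, its value types (a list of strings for contexts, a string for difficulty), and the helper names `missing_fields`, `expected_behavior`, and `behavior_matches` are assumptions made for illustration and are not part of the dataset or any official tooling.

```python
import json

# Field names as listed in the dataset description.
REQUIRED_FIELDS = (
    "question", "contexts", "answer",
    "difficulty", "num_contexts", "Answerable",
)

# Hypothetical record: the keys follow the description, but the values
# and their types are invented here and may differ from the real data.
SAMPLE_JSON = """
{
  "question": "What is the warranty period for the device?",
  "contexts": ["The device ships with a two-year limited warranty."],
  "answer": "The device has a two-year limited warranty.",
  "difficulty": "easy",
  "num_contexts": 1,
  "Answerable": true
}
"""


def missing_fields(sample):
    """Return the expected field names that are absent from a sample."""
    return [name for name in REQUIRED_FIELDS if name not in sample]


def expected_behavior(sample):
    """Map the Answerable flag to the behavior the description expects."""
    return "answer_from_context" if sample["Answerable"] else "refuse"


def behavior_matches(sample, model_refused):
    """True when the model refused exactly on the unanswerable samples."""
    return model_refused == (not sample["Answerable"])


sample = json.loads(SAMPLE_JSON)
unanswerable = dict(sample, Answerable=False)  # same record, flag flipped

print(missing_fields(sample))
print(expected_behavior(sample), expected_behavior(unanswerable))
```

How a refusal is detected in a model's output (the `model_refused` argument) is left open here; a real evaluation would need its own judge or heuristic for that step.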

Provider:

aizip
