aizip/Rag-Eval-Dataset-6k
Source: Hugging Face. Updated 2025-04-18; indexed 2025-07-05.
Download link:
https://hf-mirror.com/datasets/aizip/Rag-Eval-Dataset-6k
Description:
RED6k is a dataset of approximately 6,000 samples across 10 domains, created by Aizip to evaluate language models as summarizers in retrieval-augmented generation (RAG) systems. It focuses on the challenges that arise when Small Language Models (SLMs) are used in local RAG deployments. Each sample is a JSON object with the following fields: question, contexts, answer, difficulty, num_contexts, and Answerable. The Answerable flag sets the expected model behavior: when it is true, the model should generate a response based on the provided context; when it is false, the model should refuse to answer and may offer clarifying questions to help refine the query. The dataset is suited to benchmarking RAG system performance, fine-tuning SLMs for improved RAG capabilities, and evaluating models' ability to recognize their knowledge boundaries.
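The sample layout and the Answerable rule described above can be sketched in Python. This is a minimal illustration, not the dataset's official tooling: the six field names come from the description, but the value types (contexts as a list of strings, Answerable as a JSON boolean), the helper name expected_behavior, and the example record are all assumptions made for this sketch.

```python
import json

# Field names are taken from the dataset description.
# The value types used below are assumptions, not confirmed by the source.
REQUIRED_FIELDS = (
    "question", "contexts", "answer",
    "difficulty", "num_contexts", "Answerable",
)


def expected_behavior(sample: dict) -> str:
    """Return "answer" when the model should respond from the contexts,
    or "refuse" when it should decline (hypothetical helper)."""
    missing = [name for name in REQUIRED_FIELDS if name not in sample]
    if missing:
        raise KeyError(f"sample is missing fields: {missing}")
    # Assumes Answerable is a JSON boolean; a string-valued flag
    # would need explicit parsing instead of a truthiness check.
    return "answer" if sample["Answerable"] else "refuse"


# Hypothetical record, invented for illustration only (not from RED6k).
raw = """
{
  "question": "What is the warranty period for the device?",
  "contexts": ["The device ships with a two-year limited warranty."],
  "answer": "The device has a two-year limited warranty.",
  "difficulty": "easy",
  "num_contexts": 1,
  "Answerable": true
}
"""

sample = json.loads(raw)
behavior = expected_behavior(sample)
```

An evaluation harness could compare this expected behavior with what the model actually did, scoring a refusal as correct only when Answerable is false.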
Provider:
aizip



