voilaj/swiss-legal-rag-bench
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/voilaj/swiss-legal-rag-bench
Description:
Swiss Legal RAG Bench is an end-to-end benchmark for retrieval-augmented generation systems operating on Swiss federal and cantonal law. It evaluates RAG pipelines along three orthogonal dimensions (correctness, groundedness, and retrieval accuracy) and decomposes errors into specific failure modes. The dataset includes 10 hand-crafted federal-law questions across three official languages (DE, FR, IT) and major substantive areas of Swiss law (e.g., Schuldrecht, Strafrecht, Verfassungsrecht). The benchmark addresses the under-representation of Swiss law in international legal-AI evaluation literature, particularly in assessing whether generative models produce grounded and correct answers. The README also covers methodology, dataset structure, versioning, baseline results, evaluation harness, limitations, citation, and licensing.
Provider:
voilaj