SIG
Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/maybenotime/RAG-SpuriousFeatures
Resource overview:
SIG is a lightweight benchmark constructed from a synthetic dataset, designed to make robustness evaluation across different models more efficient. It selects 100 samples from the (K,G) and (U,G) subsets of each perturbation, choosing samples on which the Mistral and Llama models prove sensitive and non-robust. The task is medium-scale and focuses on the question answering domain.



