AIM-Intelligence/COMPASS-Policy-Alignment-Testbed-Dataset
Source: Hugging Face | Updated 2026-01-06 | Indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/AIM-Intelligence/COMPASS-Policy-Alignment-Testbed-Dataset
Description:
The COMPASS Policy Alignment Testbed Dataset evaluates how well Large Language Models (LLMs) adhere to organization-specific policies in realistic enterprise-style settings. It includes eight virtual enterprise scenarios (industry verticals), with queries categorized into four types: clearly compliant requests (allowed_base), borderline compliant requests (allowed_edge), clearly non-compliant requests (denied_base), and adversarial non-compliant requests (denied_edge). The dataset is organized by industry, with each subset containing a test split stored as Parquet files. Data fields include query type, policy category, and specific policy topics, among others. The dataset is intended for benchmarking policy compliance, safety evaluation, and per-type analysis.
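The per-type analysis mentioned above can be sketched as follows. This is a minimal illustration, not the dataset's official evaluation code: the field names `query_type` and `model_answered` are assumptions made for this example, the sample records are invented, and the rule that "allowed" queries should be answered while "denied" queries should be refused is inferred from the type names.

```python
from collections import defaultdict

# The four query types named in the dataset description.
QUERY_TYPES = ("allowed_base", "allowed_edge", "denied_base", "denied_edge")


def expected_to_answer(query_type: str) -> bool:
    """Assumed rule: allowed_* queries should be answered, denied_* refused."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type!r}")
    return query_type.startswith("allowed")


def per_type_accuracy(records):
    """Return the fraction of correct decisions for each query type.

    Each record is a dict with two hypothetical keys:
      "query_type"     - one of QUERY_TYPES
      "model_answered" - True if the model complied, False if it refused
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        qt = rec["query_type"]
        total[qt] += 1
        if rec["model_answered"] == expected_to_answer(qt):
            correct[qt] += 1
    return {qt: correct[qt] / total[qt] for qt in total}


# Invented sample decisions, standing in for real model outputs.
sample = [
    {"query_type": "allowed_base", "model_answered": True},
    {"query_type": "allowed_edge", "model_answered": False},  # over-refusal
    {"query_type": "denied_base", "model_answered": False},
    {"query_type": "denied_edge", "model_answered": True},    # policy violation
    {"query_type": "denied_edge", "model_answered": False},
]

scores = per_type_accuracy(sample)
for query_type in QUERY_TYPES:
    print(f"{query_type}: {scores[query_type]:.2f}")
```

To run this against the real data, the records would come from one industry subset's test split, for example via the Hugging Face `datasets` library's `load_dataset(repo_id, subset_name, split="test")`. The subset names and actual column names should be checked on the dataset page before use.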
Provided by:
AIM-Intelligence



