val-bench/VAL-Bench
Source: Hugging Face. Updated: 2025-09-25. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/val-bench/VAL-Bench
Description:
VAL-Bench is a diverse benchmark for systematically analyzing how reliably language models embody human values. The dataset consists of 115K pairs of prompts drawn from Wikipedia's controversial sections and is used to evaluate whether models maintain a stable value stance when presented with opposing sides of public debates.
Provider:
val-bench



