Scalable Oversight Benchmark

Name: Scalable Oversight Benchmark
Creator: Authors of the paper
License: Not specified

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/ArjunPanickssery/math_problems_debate

Resource description:

This benchmark provides a principled framework for evaluating human feedback mechanisms, built on the Agent Score Difference (ASD) metric. ASD measures how strongly a given mechanism favors truthful statements over deceptive ones. The benchmark ships with a Python package, SOlib, for rapid evaluation of scalable oversight protocols. Its core task is to evaluate the human feedback mechanisms used in AI oversight protocols.
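The ASD idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that ASD is the mean score an agent earns when arguing for the true answer minus the mean score it earns when arguing for a false one. The function name `agent_score_difference` and the sample scores are hypothetical; they are not taken from SOlib, and the paper's exact definition may differ (for example, in how individual scores are computed).

```python
from statistics import mean


def agent_score_difference(truthful_scores, deceptive_scores):
    """Illustrative ASD: mean truthful score minus mean deceptive score.

    This is an assumed simplification, not the SOlib implementation.
    Each score is whatever reward the oversight protocol gives the agent
    on one question (for instance, the judge's credence in the answer
    the agent argued for).
    """
    if not truthful_scores or not deceptive_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(truthful_scores) - mean(deceptive_scores)


# Hypothetical scores on three questions, made up for this example.
truthful = [0.9, 0.8, 0.7]   # agent argued for the true answer
deceptive = [0.4, 0.3, 0.5]  # agent argued for a false answer

asd = agent_score_difference(truthful, deceptive)
print(f"ASD = {asd:.2f}")
```

Under this assumed definition, a positive value means the protocol rewards truth-telling more than deception, and a value near zero means it fails to separate the two.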

Provided by:

Authors of the paper
