tytodd/qwen3.5-4b-judge_bench
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/tytodd/qwen3.5-4b-judge_bench
Resource overview:
---
dataset_info:
config_name: judge_bench
features:
- name: input
struct:
- name: question
dtype: string
- name: response_A
dtype: string
- name: response_B
dtype: string
- name: prediction
struct:
- name: label
dtype: string
- name: reasoning
dtype: string
- name: messages
struct:
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
- name: outputs
struct:
- name: reasoning_content
dtype: string
- name: text
dtype: string
- name: correct
dtype: bool
splits:
- name: ood
num_bytes: 14895584
num_examples: 280
download_size: 11871760
dataset_size: 14895584
configs:
- config_name: judge_bench
data_files:
- split: ood
path: judge_bench/ood-*
---
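The metadata above lists one config (`judge_bench`) with a single `ood` split of 280 examples. A minimal sketch of loading it follows, assuming the Hugging Face `datasets` library is installed and the repo is reachable. The helper name `load_judge_bench` is hypothetical and is not part of the author's tooling.

```python
REPO_ID = "tytodd/qwen3.5-4b-judge_bench"  # repo name from this card
CONFIG_NAME = "judge_bench"                # config_name from the metadata
SPLIT = "ood"                              # the only split listed in the metadata


def load_judge_bench():
    """Hypothetical helper: fetch the `ood` split of the `judge_bench` config."""
    # Imported lazily so the constants above stay usable without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, CONFIG_NAME, split=SPLIT)
```

Calling `load_judge_bench()` should return a `Dataset` whose `features` attribute can be compared against the `features` list in the metadata.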
# qwen3.5-4b-judge_bench

- Repo: `tytodd/qwen3.5-4b-judge_bench`
- Config: `/Users/tytodd/Desktop/Modaic/code/core/probe-lab/configs/datasets/judge_bench/judge_bench.yaml`
- Model: `Qwen/Qwen3.5-4B`
- Runtime: `Modal` local vLLM on `localhost`

| benchmark | train | val | ood | all |
| --- | --- | --- | --- | --- |
| judge_bench | | | 82.50% | 82.50% |
| all | | | 82.50% | 82.50% |
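If the table reports plain accuracy over the whole `ood` split, 82.50% of 280 examples would correspond to 231 correct judgments (231 / 280 = 0.825). A minimal sketch of that calculation follows, assuming each row exposes the boolean `correct` field listed in the metadata. The `accuracy` helper and the synthetic rows are illustrative only, not the author's evaluation code.

```python
from typing import Iterable, Mapping


def accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Return the fraction of rows whose `correct` flag is true."""
    flags = [bool(row["correct"]) for row in rows]
    if not flags:
        raise ValueError("cannot compute accuracy over zero rows")
    return sum(flags) / len(flags)


# Synthetic stand-in for the split: 231 correct rows out of 280.
rows = [{"correct": i < 231} for i in range(280)]
print(f"{accuracy(rows):.2%}")  # 82.50%
```

Iterating a `datasets.Dataset` yields one dict per row, so the same helper should work on the loaded split, provided `correct` is a top-level column.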
Provider:
tytodd



