tytodd/qwen3.5-4b-judge_bench

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/tytodd/qwen3.5-4b-judge_bench
Description:
---
dataset_info:
  config_name: judge_bench
  features:
  - name: input
    struct:
    - name: question
      dtype: string
    - name: response_A
      dtype: string
    - name: response_B
      dtype: string
  - name: prediction
    struct:
    - name: label
      dtype: string
    - name: reasoning
      dtype: string
  - name: messages
    struct:
    - name: messages
      list:
      - name: content
        dtype: string
      - name: role
        dtype: string
  - name: outputs
    struct:
    - name: reasoning_content
      dtype: string
    - name: text
      dtype: string
  - name: correct
    dtype: bool
  splits:
  - name: ood
    num_bytes: 14895584
    num_examples: 280
  download_size: 11871760
  dataset_size: 14895584
configs:
- config_name: judge_bench
  data_files:
  - split: ood
    path: judge_bench/ood-*
---

# qwen3.5-4b-judge_bench

- Repo: `tytodd/qwen3.5-4b-judge_bench`
- Config: `/Users/tytodd/Desktop/Modaic/code/core/probe-lab/configs/datasets/judge_bench/judge_bench.yaml`
- Model: `Qwen/Qwen3.5-4B`
- Runtime: `Modal` local vLLM on `localhost`

| benchmark | train | val | ood | all |
| --- | --- | --- | --- | --- |
| judge_bench | | | 82.50% | 82.50% |
| all | | | 82.50% | 82.50% |
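
The card reports 82.50% on the 280-example `ood` split but does not say how that figure is computed. The sketch below assumes it is the mean of the boolean `correct` field and that `correct` is a top-level column; both are assumptions, not statements from the card. Under that reading, 82.50% of 280 corresponds to 231 correct examples, a count inferred here from the table rather than listed in the source. The helper names `accuracy` and `load_ood_split` are illustrative only.

```python
from typing import Iterable, Mapping


def accuracy(records: Iterable[Mapping[str, object]]) -> float:
    """Return the fraction of records whose `correct` flag is true."""
    flags = [bool(r["correct"]) for r in records]
    if not flags:
        raise ValueError("no records to score")
    return sum(flags) / len(flags)


def load_ood_split():
    """Load the `ood` split from the Hub.

    Needs `pip install datasets` and network access, so it is not
    called automatically here.
    """
    from datasets import load_dataset  # lazy import keeps the rest offline

    return load_dataset(
        "tytodd/qwen3.5-4b-judge_bench", "judge_bench", split="ood"
    )


# Offline stand-in: 231 correct out of 280 (count inferred from the
# reported 82.50%, not taken from the card itself).
stand_in = [{"correct": i < 231} for i in range(280)]
print(f"{accuracy(stand_in):.2%}")  # prints 82.50%
```

With network access, `accuracy(load_ood_split())` would run the same check against the published rows; if `correct` turns out to be nested inside another field, the lookup in `accuracy` would need adjusting.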
Provider:
tytodd