tytodd/qwen3.5-2b-lmsys-arena

Name: tytodd/qwen3.5-2b-lmsys-arena
Creator: tytodd
Published: 2026-04-10 09:24:00
License: No description available

Hugging Face, updated 2026-04-10, indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/tytodd/qwen3.5-2b-lmsys-arena
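A minimal sketch of loading one split with the Hugging Face `datasets` library. The config and split names are copied from the dataset card; `load_split`, `CONFIG_SPLITS`, and the `use_mirror` flag are hypothetical names introduced only for illustration. Routing requests through the mirror by setting `HF_ENDPOINT` is an assumption about your environment, and it only takes effect if the variable is set before the Hugging Face libraries are first imported.

```python
import os

REPO_ID = "tytodd/qwen3.5-2b-lmsys-arena"

# Config name -> split names, copied from the `configs` section of the dataset card.
CONFIG_SPLITS = {
    "chatbot_arena_conversations": ("train", "val"),
    "mt_bench_human_judgments": ("ood",),
}


def load_split(config, split, use_mirror=False):
    """Load one split of one config (hypothetical helper, not part of the repo)."""
    if config not in CONFIG_SPLITS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in CONFIG_SPLITS[config]:
        raise ValueError(f"config {config!r} has no split {split!r}")
    if use_mirror:
        # Assumption: the mirror is selected through the HF_ENDPOINT variable,
        # which is read when the Hugging Face libraries are first imported.
        os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
    # Imported lazily so that HF_ENDPOINT is already set at import time.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)
```

For example, `load_split("mt_bench_human_judgments", "ood")` would request the out-of-distribution split, while a mismatched pair such as `("chatbot_arena_conversations", "ood")` is rejected before any network access.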

Resource description:

---
dataset_info:
- config_name: chatbot_arena_conversations
  features:
  - name: input
    struct:
    - name: question
      dtype: string
    - name: response_A
      dtype: string
    - name: response_B
      dtype: string
  - name: prediction
    struct:
    - name: label
      dtype: string
    - name: reasoning
      dtype: string
  - name: messages
    struct:
    - name: messages
      list:
      - name: content
        dtype: string
      - name: role
        dtype: string
  - name: outputs
    struct:
    - name: reasoning_content
      dtype: string
    - name: text
      dtype: string
  - name: correct
    dtype: bool
  splits:
  - name: train
    num_bytes: 48572696
    num_examples: 1000
  - name: val
    num_bytes: 13135090
    num_examples: 250
  download_size: 45223912
  dataset_size: 61707786
- config_name: mt_bench_human_judgments
  features:
  - name: input
    struct:
    - name: question
      dtype: string
    - name: response_A
      dtype: string
    - name: response_B
      dtype: string
  - name: prediction
    struct:
    - name: label
      dtype: string
    - name: reasoning
      dtype: string
  - name: messages
    struct:
    - name: messages
      list:
      - name: content
        dtype: string
      - name: role
        dtype: string
  - name: outputs
    struct:
    - name: reasoning_content
      dtype: string
    - name: text
      dtype: string
  - name: correct
    dtype: bool
  splits:
  - name: ood
    num_bytes: 66930133
    num_examples: 1000
  download_size: 47622543
  dataset_size: 66930133
configs:
- config_name: chatbot_arena_conversations
  data_files:
  - split: train
    path: chatbot_arena_conversations/train-*
  - split: val
    path: chatbot_arena_conversations/val-*
- config_name: mt_bench_human_judgments
  data_files:
  - split: ood
    path: mt_bench_human_judgments/ood-*
---

# qwen3.5-2b-lmsys-arena

- Repo: `tytodd/qwen3.5-2b-lmsys-arena`
- Config: `/Users/tytodd/Desktop/Modaic/code/core/probe-lab/configs/datasets/lmsys-arena/lmsys-arena.yaml`
- Model: `Qwen/Qwen3.5-2B`
- Runtime: `Modal` local vLLM on `localhost`

| benchmark | train | val | ood | all |
| --- | --- | --- | --- | --- |
| chatbot_arena_conversations | 74.00% | 72.00% | | 73.60% |
| mt_bench_human_judgments | | | 66.60% | 66.60% |
| all | 74.00% | 72.00% | 66.60% | 70.49% |
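The `all` figures in the results table are consistent with pooling examples across splits rather than averaging the per-split percentages. The sketch below checks that reading using only the percentages and `num_examples` values from the card. It assumes the percentages are accuracies (the share of rows whose `correct` field is true); `RESULTS` and `pooled_accuracy` are hypothetical names, not part of the repo.

```python
# (config, split) -> (reported percentage, num_examples), copied from the card.
RESULTS = {
    ("chatbot_arena_conversations", "train"): (74.00, 1000),
    ("chatbot_arena_conversations", "val"): (72.00, 250),
    ("mt_bench_human_judgments", "ood"): (66.60, 1000),
}


def pooled_accuracy(keys):
    """Example-weighted accuracy (in percent) over the given (config, split) keys."""
    # Assumption: each percentage is correct / num_examples for its split.
    correct = sum(RESULTS[k][0] / 100 * RESULTS[k][1] for k in keys)
    total = sum(RESULTS[k][1] for k in keys)
    return 100 * correct / total


arena_keys = [k for k in RESULTS if k[0] == "chatbot_arena_conversations"]
arena_all = pooled_accuracy(arena_keys)   # (740 + 180) / 1250
overall = pooled_accuracy(list(RESULTS))  # (740 + 180 + 666) / 2250

print(f"{arena_all:.2f}%")  # prints 73.60%
print(f"{overall:.2f}%")    # prints 70.49%
```

An unweighted mean of 74.00%, 72.00%, and 66.60% would give about 70.87%, which does not match the reported 70.49%, so the example-weighted reading fits the table better.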

Provider:

tytodd
