tytodd/qwen3.5-4b-or_bench_toxic
收藏Hugging Face2026-04-10 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/tytodd/qwen3.5-4b-or_bench_toxic
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
config_name: or_bench_toxic
features:
- name: input
struct:
- name: prompt
dtype: string
- name: prediction
struct:
- name: or_bench_category
dtype: string
- name: reasoning
dtype: string
- name: messages
struct:
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
- name: outputs
struct:
- name: reasoning_content
dtype: string
- name: text
dtype: string
- name: correct
dtype: bool
splits:
- name: ood
num_bytes: 13347071
num_examples: 524
download_size: 9675702
dataset_size: 13347071
configs:
- config_name: or_bench_toxic
data_files:
- split: ood
path: or_bench_toxic/ood-*
---
# qwen3.5-4b-or_bench_toxic
- Repo: `tytodd/qwen3.5-4b-or_bench_toxic`
- Config: `/Users/tytodd/Desktop/Modaic/code/core/probe-lab/configs/datasets/or_bench_toxic/or_bench_toxic.yaml`
- Model: `Qwen/Qwen3.5-4B`
- Runtime: `Modal` local vLLM on `localhost`
| benchmark | train | val | ood | all |
| --- | --- | --- | --- | --- |
| or_bench_toxic | | | 64.89% | 64.89% |
| all | | | 64.89% | 64.89% |
数据集信息:
配置名称:or_bench_toxic
特征字段:
- 字段名:`input`,结构体包含:
- 字段名:`prompt`,数据类型:字符串(string)
- 字段名:`prediction`,结构体包含:
- 字段名:`or_bench_category`,数据类型:字符串(string)
- 字段名:`reasoning`,数据类型:字符串(string)
- 字段名:`messages`,结构体包含:
- 字段名:`messages`,列表类型,列表元素结构体包含:
- 字段名:`content`,数据类型:字符串(string)
- 字段名:`role`,数据类型:字符串(string)
- 字段名:`outputs`,结构体包含:
- 字段名:`reasoning_content`,数据类型:字符串(string)
- 字段名:`text`,数据类型:字符串(string)
- 字段名:`correct`,数据类型:布尔值(bool)
数据划分:
- 划分名称:分布外(Out-of-Distribution,简称ood),字节数:13347071,样本数量:524
下载大小:9675702
数据集总大小:13347071
配置集:
- 配置名称:or_bench_toxic
数据文件:
- 划分:ood,路径:`or_bench_toxic/ood-*`
---
# qwen3.5-4b-or_bench_toxic
- 代码仓库:`tytodd/qwen3.5-4b-or_bench_toxic`
- 配置文件路径:`/Users/tytodd/Desktop/Modaic/code/core/probe-lab/configs/datasets/or_bench_toxic/or_bench_toxic.yaml`
- 模型:`Qwen/Qwen3.5-4B`
- 运行环境:基于`Modal`的本地vLLM,运行于`localhost`
| 基准测试集 | 训练集 | 验证集 | 分布外集(ood) | 总计 |
| --- | --- | --- | --- | --- |
| or_bench_toxic | | | 64.89% | 64.89% |
| 全部基准 | | | 64.89% | 64.89% |
提供机构:
tytodd



