pipizhao/SkillRouter-Eval-Core

Name: pipizhao/SkillRouter-Eval-Core
Creator: pipizhao
Published: 2026-04-24 06:13:05
License: No description provided

Source: Hugging Face | Updated: 2026-04-24 | Indexed: 2026-04-26

Download link:

https://hf-mirror.com/datasets/pipizhao/SkillRouter-Eval-Core

Resource overview:

The SkillRouter Eval Core dataset is a public evaluation benchmark for skill routing. It contains 87 benchmark task descriptions (tasks.jsonl), ground-truth skill IDs with relevance labels (relevance.json), and two candidate skill pools of differing difficulty (easy/ and hard/, containing 78,361 and 79,141 candidate skills respectively). A manifest.json file records the file layout and metadata. The dataset is meant to be placed in a local data/eval_core directory, where the SkillRouter evaluation scripts expect it. The benchmark has 75 scored tasks (24 single-skill and 51 multi-skill), and multi-skill tasks are scored with graded nDCG based on the relevance labels. Data sources include benchflow-ai/skillsbench and majiayu000/claude-skill-registry.
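As a rough illustration of graded nDCG scoring, the sketch below computes nDCG@k for one task. It is not the SkillRouter evaluation script. The function names, the cutoff `k`, the linear gain (relevance grade divided by log2 of rank + 1), the example skill IDs, and the assumption that relevance labels reduce to a skill-ID-to-grade mapping are all hypothetical; the actual schema of relevance.json and the exact nDCG variant used by the benchmark are not stated here.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2 rank discount (linear gain)."""
    # enumerate() is 0-based, so rank 1 gets a discount of log2(2) = 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked_ids: List[str], relevance: Dict[str, float], k: int = 10) -> float:
    """Graded nDCG@k for one task.

    ranked_ids: candidate skill IDs in the order the router returned them
                (assumed to contain no duplicates).
    relevance:  hypothetical mapping of ground-truth skill ID -> grade;
                skills missing from the mapping count as grade 0.
    """
    gains = [relevance.get(skill_id, 0.0) for skill_id in ranked_ids[:k]]
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    if ideal == 0:
        return 0.0  # no relevant skills labeled for this task
    return dcg(gains) / ideal


# Made-up example: two relevant skills with grades 2 and 1.
example_relevance = {"skill-a": 2.0, "skill-b": 1.0}
perfect = ndcg_at_k(["skill-a", "skill-b", "skill-x"], example_relevance)
imperfect = ndcg_at_k(["skill-b", "skill-x", "skill-a"], example_relevance)
print(f"perfect={perfect:.4f} imperfect={imperfect:.4f}")
```

A ranking that places the higher-graded skill first scores 1.0, while pushing it down the list lowers the score. Other nDCG variants (for example an exponential gain of 2^grade - 1) would give different numbers for the same ranking.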

Provider:

pipizhao
