anonymous-ed-benchmark/skillret-benchmark

Name: anonymous-ed-benchmark/skillret-benchmark
Creator: anonymous-ed-benchmark
Published: 2026-04-29 06:39:42
License: not specified

Hugging Face | Updated 2026-04-29 | Indexed 2026-05-03

Download link:

https://hf-mirror.com/datasets/anonymous-ed-benchmark/skillret-benchmark

Description:

SkillRet is a retrieval benchmark for matching natural-language user requests to agent skills. Each retrieval document is a full agent skill, represented by its name, short description, and full Markdown skill body. Each query describes a realistic user request that requires one or more relevant skills. The benchmark is built from public agent skills indexed from GitHub and contains synthetic train and evaluation queries generated through a self-instruct-style pipeline. The release includes a full skill library, train/evaluation skill splits, query files, binary relevance labels, and a two-level taxonomy. The dataset is organized into three subsets, each with train and test splits: skills, queries, and qrels. The skills subset contains skills used for training and evaluation, the queries subset contains synthetic training and evaluation queries, and the qrels subset provides binary relevance labels. The dataset also includes a full skill library and a taxonomy definition file.
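To make the skills / queries / qrels layout concrete, here is a minimal sketch of how binary relevance labels could be used to score a retriever. It runs on hand-made toy records rather than the released files. The field names (`skill_id`, `query_id`, `name`, `description`, `body`, `relevance`) are assumptions for illustration and may not match the actual schema. The token-overlap ranker is only a stand-in for a real retrieval model, and Recall@k is one common choice of metric, not necessarily the one the benchmark reports.

```python
import re
from collections import defaultdict

# Toy records. All IDs, field names, and contents are hypothetical.
skills = [
    {"skill_id": "s1", "name": "pdf-extract",
     "description": "Extract text and tables from PDF files",
     "body": "# PDF Extract\nParse PDF documents and export tables as CSV."},
    {"skill_id": "s2", "name": "git-changelog",
     "description": "Generate a changelog from git commit history",
     "body": "# Git Changelog\nSummarize commits between two tags."},
    {"skill_id": "s3", "name": "csv-plot",
     "description": "Plot charts from CSV data",
     "body": "# CSV Plot\nRead a CSV file and draw line or bar charts."},
]
queries = [
    {"query_id": "q1",
     "text": "Pull the tables out of this PDF report and plot them as a bar chart"},
    {"query_id": "q2",
     "text": "Write a changelog from the git commits since the last tag"},
]
# Binary relevance labels: one row per (query, relevant skill) pair.
qrels = [
    {"query_id": "q1", "skill_id": "s1", "relevance": 1},
    {"query_id": "q1", "skill_id": "s3", "relevance": 1},
    {"query_id": "q2", "skill_id": "s2", "relevance": 1},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_skills(query_text, skills):
    """Rank skills by token overlap with the query (stand-in retriever).

    Each skill document is its name, description, and Markdown body joined
    together. Ties are broken by skill_id so the ranking is deterministic.
    """
    q_tokens = tokenize(query_text)
    scored = []
    for s in skills:
        doc = " ".join([s["name"], s["description"], s["body"]])
        scored.append((len(q_tokens & tokenize(doc)), s["skill_id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [skill_id for _, skill_id in scored]


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant skills that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mean_recall_at_k(skills, queries, qrels, k):
    """Average Recall@k over all queries, using the binary qrels."""
    relevant = defaultdict(set)
    for row in qrels:
        if row["relevance"] > 0:
            relevant[row["query_id"]].add(row["skill_id"])
    scores = [
        recall_at_k(rank_skills(q["text"], skills), relevant[q["query_id"]], k)
        for q in queries
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for k in (1, 2):
        print(f"Recall@{k}: {mean_recall_at_k(skills, queries, qrels, k):.2f}")
```

On this toy data the first query needs two skills, so only half of its relevant set can appear at rank 1. Mean Recall@1 comes out at 0.75 and mean Recall@2 at 1.0. This shows why multi-skill queries are usually scored at cutoffs larger than 1.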

Provider:

anonymous-ed-benchmark
