withmartian/routerbench

Name: withmartian/routerbench
Creator: withmartian
Published: 2024-03-27 07:27:17
License: not specified

Source: Hugging Face. Updated 2024-03-27. Indexed 2024-06-22.

Download link:

https://hf-mirror.com/datasets/withmartian/routerbench

Resource description:

---
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- code
pretty_name: RouterBench
size_categories:
- 10K<n<100K
---

RouterBench is a dataset comprising over 30,000 prompts and the responses from 11 different LLMs, with the prompts taken from standard benchmarks such as MBPP, GSM-8k, Winogrande, Hellaswag, MMLU, MT-Bench, and more. Each record includes the prompt, the model response, the estimated cost of that response, and a performance score indicating whether the model answered correctly. All prompts have a correct answer that the LLM generation is compared against.

These datasets are designed to be used with Martian's [routerbench](https://github.com/withmartian/alt-routing-methods/tree/public-productionize) package for training and evaluating various model routing methods. There are two versions of the dataset: one with 5-shot generation and one with 0-shot results. Both can be used with the `routerbench` package individually or in combination.
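To make the record layout concrete, here is a minimal sketch of how per-response cost and score could be used to compare a single-model baseline against an "oracle" router that always picks the best response per prompt. It does not use the `routerbench` package or its API. The `Record` class, its field names, the model names, and the toy rows are all hypothetical and invented for illustration; they are not taken from the dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One (prompt, model) row. Field names are hypothetical, not the dataset's schema."""
    prompt_id: str  # assumed identifier for the prompt
    model: str      # which LLM produced the response
    cost: float     # estimated cost of the response
    score: float    # performance score; 1.0 means the answer was correct


# Toy rows invented for illustration only; not real RouterBench data.
RECORDS = [
    Record("q1", "model_a", 0.002, 1.0),
    Record("q1", "model_b", 0.020, 1.0),
    Record("q2", "model_a", 0.003, 0.0),
    Record("q2", "model_b", 0.025, 1.0),
    Record("q3", "model_a", 0.001, 0.0),
    Record("q3", "model_b", 0.015, 0.0),
]


def single_model_summary(records):
    """Return {model: (mean score, total cost)} if every prompt went to that model."""
    totals = defaultdict(lambda: {"score": 0.0, "cost": 0.0, "n": 0})
    for r in records:
        t = totals[r.model]
        t["score"] += r.score
        t["cost"] += r.cost
        t["n"] += 1
    return {m: (t["score"] / t["n"], t["cost"]) for m, t in totals.items()}


def oracle_route(records):
    """Per prompt, keep the highest-scoring response, breaking ties by lower cost."""
    best = {}
    for r in records:
        cur = best.get(r.prompt_id)
        if cur is None or (r.score, -r.cost) > (cur.score, -cur.cost):
            best[r.prompt_id] = r
    return best


summary = single_model_summary(RECORDS)
choices = oracle_route(RECORDS)
oracle_accuracy = sum(r.score for r in choices.values()) / len(choices)
oracle_cost = sum(r.cost for r in choices.values())

for model, (accuracy, cost) in sorted(summary.items()):
    print(f"{model}: accuracy={accuracy:.3f} total_cost={cost:.3f}")
print(f"oracle: accuracy={oracle_accuracy:.3f} total_cost={oracle_cost:.3f}")
```

On this toy data the oracle matches the accuracy of the more expensive model at a lower total cost, which is the kind of cost and quality trade-off that a routing method is evaluated on.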

Provided by:

withmartian

Summary of original information

RouterBench dataset overview

Task categories

Text generation
Question answering

Language

English

Dataset name

RouterBench

Dataset size

10K<n<100K

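Dataset description

The RouterBench dataset contains over 30,000 prompts and their responses from 11 different large language models (LLMs). The prompts come from standard benchmarks such as MBPP, GSM-8k, Winogrande, Hellaswag, MMLU, and MT-Bench. Each record includes the prompt, the model response, the estimated cost of that response, and a performance score indicating whether the model answered correctly. All prompts have a correct answer that the LLM generation is compared against. The datasets are intended for use with Martian's routerbench package for training and evaluating model routing methods.

Dataset versions

5-shot generation version
0-shot results version

Both versions can be used with the routerbench package individually or in combination.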
