NightShade9x9/lmsys-chat-1m-benchmark

Name: NightShade9x9/lmsys-chat-1m-benchmark
Creator: NightShade9x9
Published: 2024-12-17 06:46:38
License: not specified

Source: Hugging Face. Updated 2024-12-17; indexed 2024-12-21.

Download link:

https://hf-mirror.com/datasets/NightShade9x9/lmsys-chat-1m-benchmark

Description:

This dataset divides `lmsys/lmsys-chat-1m` into two splits, `trainval` and `benchmark`. The `benchmark` split contains 790 samples (some repeated) covering a range of prompt lengths. It is intended for benchmarking LLM inference performance on hardware by measuring statistics such as time to first token (TTFT) and throughput. Each prompt was tokenized with the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` tokenizer and placed by token count into one of 100 equal-sized bins. Within each bin, 10 samples are randomly selected, with replacement if the bin holds fewer than 10 samples.
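The binning and sampling procedure described above can be sketched roughly as follows. This is an illustration only, not the creator's actual script: the function name `sample_benchmark`, the use of equal-width bins over the observed length range, the skipping of empty bins, and the fixed seed are all assumptions. (Skipping empty bins would be consistent with 790 = 79 × 10 samples, but the description does not say so.) The token counts are taken as input; in practice they might be computed with something like `len(tokenizer(prompt)["input_ids"])` using the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` tokenizer.

```python
import numpy as np


def sample_benchmark(token_lengths, n_bins=100, per_bin=10, seed=0):
    """Return indices into token_lengths picked by length-stratified sampling.

    Hypothetical helper: the name, defaults, and binning details are
    assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    lengths = np.asarray(token_lengths)

    # Assumption: "equal sized" means equal-width bins over the observed range.
    edges = np.linspace(lengths.min(), lengths.max(), n_bins + 1)

    # Map each prompt to a bin id in [0, n_bins - 1]; the clip puts the
    # longest prompt into the last bin instead of one past it.
    bin_ids = np.clip(np.digitize(lengths, edges) - 1, 0, n_bins - 1)

    selected = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_ids == b)
        if members.size == 0:
            continue  # assumption: empty bins contribute no samples
        # Sample with replacement only when the bin is too small.
        replace = members.size < per_bin
        picks = rng.choice(members, size=per_bin, replace=replace)
        selected.extend(picks.tolist())
    return selected


if __name__ == "__main__":
    # Toy token counts standing in for real tokenizer output.
    toy_lengths = [5, 7, 9, 120, 130, 800, 2048]
    chosen = sample_benchmark(toy_lengths, n_bins=4, per_bin=3)
    print(len(chosen), "prompt indices selected")
```

Under these assumptions, a bin with fewer than `per_bin` prompts yields repeated indices, which matches the "some repeated" note in the description.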

Provided by:

NightShade9x9
