llm-system-prompts-benchmark

OpenCSG, updated 2024-07-19, indexed 2025-05-03

Download link:

https://www.opencsg.com/datasets/AIWizards/llm-system-prompts-benchmark

Description:

This repository provides a benchmark dataset for evaluating how well large language models (LLMs) follow system prompts. It contains 100 system prompts that test capabilities including grammatical patterns, multiple-choice questions, role-playing, information memorization, and French-language tasks. Each data point has three components: a prompt, a probe, and an evaluation function; performance is measured by how closely the model adheres to the prompt. The dataset is primarily English, with a small number of French prompts, and is licensed under Apache 2.0. The repository also provides standardized data operations and a unified model interface, so researchers can use the dataset for model comparison and intervention studies.
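The three-part structure of a data point (prompt, probe, evaluation function) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `BenchmarkItem`, `score_model`, `toy_model`, the field names, and the example prompt are hypothetical and are not taken from the repository's actual schema or interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BenchmarkItem:
    """Hypothetical container for one data point (not the repository's real schema)."""
    system_prompt: str                 # instruction the model is expected to follow
    probe: str                         # user message that tests adherence
    evaluate: Callable[[str], float]   # maps a model response to a score in [0, 1]


def score_model(items: List[BenchmarkItem],
                generate: Callable[[str, str], str]) -> float:
    """Average the evaluation scores over all items.

    `generate` stands in for any model call that takes
    (system_prompt, probe) and returns the response text.
    """
    scores = [item.evaluate(generate(item.system_prompt, item.probe))
              for item in items]
    return sum(scores) / len(scores)


# An invented example item: the response must be entirely uppercase.
items = [
    BenchmarkItem(
        system_prompt="Always answer in uppercase letters.",
        probe="What is the capital of France?",
        evaluate=lambda response: 1.0 if response.isupper() else 0.0,
    ),
]


def toy_model(system_prompt: str, probe: str) -> str:
    """Stand-in for a real LLM; always returns a fixed uppercase answer."""
    return "PARIS"


mean_score = score_model(items, toy_model)
```

Swapping `toy_model` for a function that calls a real model gives one adherence score per model over the same items, which is the kind of model comparison the description mentions.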

Created:

2024-07-19
