bigcode/bigcodereward-experiment-results
Source: Hugging Face. Updated 2025-10-13. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/bigcode/bigcodereward-experiment-results
Description:
The dataset comprises multiple subsets, each provided in two versions: one with instruction execution and one without. Each record includes the chat session ID, the judge model, a flag indicating whether the instruction was executed, the judgment result, the output types of the two models being compared, the instruction text, detailed scores, and the judge messages. The dataset can be used to evaluate how models perform at executing instructions.
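As a rough illustration of how records with these features might be analyzed, the sketch below tallies judgment results separately for the executed and non-executed versions. The field names (`chat_session_id`, `judge_model`, `has_execution`, `judgment`) and the sample values are assumptions inferred from the feature list, not confirmed column names; check the dataset card before relying on them.

```python
from collections import Counter, defaultdict

# Hypothetical records: field names and values are illustrative assumptions
# based on the feature list in the description, not the real schema.
SAMPLE_RECORDS = [
    {"chat_session_id": "s1", "judge_model": "judge-x", "has_execution": True, "judgment": "A"},
    {"chat_session_id": "s1", "judge_model": "judge-x", "has_execution": False, "judgment": "B"},
    {"chat_session_id": "s2", "judge_model": "judge-x", "has_execution": True, "judgment": "A"},
    {"chat_session_id": "s2", "judge_model": "judge-x", "has_execution": False, "judgment": "tie"},
]


def tally_judgments(records):
    """Count judgment results separately for each execution setting."""
    counts = defaultdict(Counter)
    for row in records:
        counts[row["has_execution"]][row["judgment"]] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {flag: dict(counter) for flag, counter in counts.items()}


if __name__ == "__main__":
    print(tally_judgments(SAMPLE_RECORDS))
```

Grouping by the execution flag first keeps the two versions of each subset comparable side by side; the same function would work on rows loaded from the actual dataset once the real column names are substituted.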
Provider:
bigcode



