gallifantjack/pminervini_NQ_Swap_sub_answer
Source: Hugging Face. Updated 2024-11-25; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/gallifantjack/pminervini_NQ_Swap_sub_answer
Description:
This dataset contains evaluation results for pminervini_NQ_Swap, using sub_answer as the label column. It includes the raw samples from the evaluation together with metadata such as the model name, input column, and scores, which helps in understanding model performance across tasks and datasets. Each record has the following features: a unique identifier for the sample, the user query/content, the assistant response, the expected output, the score of the assistant's response, an explanation of the score, the input column, the label column, the name of the model used in evaluation, and the name of the original dataset. The dataset can be used for evaluating model robustness, assessing potential biases in model responses, and monitoring and analyzing model performance.
Provider:
gallifantjack
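
Because each record carries a model name and a score, a natural first step in performance monitoring is to average the scores per model. The following is a minimal sketch of that aggregation, not the provider's own tooling. The field names `model_name` and `score` are assumptions (the description lists these features but does not give the exact column names), and the sample rows are made up for illustration.

```python
from collections import defaultdict


def mean_score_by_model(rows, model_key="model_name", score_key="score"):
    """Average the per-sample scores for each model name.

    `model_key` and `score_key` are assumed column names; adjust them
    to match the actual schema of the dataset.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[model_key]] += float(row[score_key])
        counts[row[model_key]] += 1
    return {model: totals[model] / counts[model] for model in totals}


# Made-up records for illustration only; not taken from the dataset.
sample_rows = [
    {"model_name": "model-a", "score": 1.0},
    {"model_name": "model-a", "score": 0.0},
    {"model_name": "model-b", "score": 1.0},
]

summary = mean_score_by_model(sample_rows)
for model, mean in sorted(summary.items()):
    print(f"{model}: {mean:.2f}")
```

The function accepts any iterable of dict-like rows, so records loaded with the Hugging Face `datasets` library (for example via `load_dataset("gallifantjack/pminervini_NQ_Swap_sub_answer")`) could be passed in directly once the real column and split names have been checked.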



