gallifantjack/boolq_N_A
Source: Hugging Face. Updated 2024-12-11. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/gallifantjack/boolq_N_A
Description:
This dataset contains evaluation results for boolq with the label column N_A, including model performance metrics and samples. It holds the original samples from the evaluation process along with metadata such as model names, input columns, and scores, which helps in understanding model performance across tasks and datasets. The features are a unique identifier for the sample, the user query/content, the assistant response, the expected output, the score of the assistant's response, an explanation of the score, the input column, the label column, the model name, and the original dataset name. The dataset can be used to evaluate model robustness across tasks, to assess potential biases in model responses, and to monitor and analyze model performance.
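As a rough illustration of the performance-monitoring use mentioned above, the sketch below averages the response score per model over a list of records. The field names `model_name` and `score` are assumptions based on the feature descriptions, not confirmed column names, and the rows are placeholders rather than real samples from the dataset.

```python
from collections import defaultdict


def mean_score_by_model(records, model_key="model_name", score_key="score"):
    """Return the average score per model.

    `model_key` and `score_key` are hypothetical field names; adjust them
    to match the actual columns of the dataset.
    """
    totals = defaultdict(lambda: [0.0, 0])  # model -> [sum of scores, row count]
    for row in records:
        entry = totals[row[model_key]]
        entry[0] += float(row[score_key])
        entry[1] += 1
    return {model: total / count for model, (total, count) in totals.items()}


# Placeholder rows for illustration only, not taken from the dataset.
toy_records = [
    {"model_name": "model-a", "score": 1.0},
    {"model_name": "model-a", "score": 0.0},
    {"model_name": "model-b", "score": 1.0},
]

summary = mean_score_by_model(toy_records)
print(summary)
```

The same function could be applied to rows loaded from the download link, once the real column names have been checked against the dataset's schema.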
Provider:
gallifantjack



