strategy-scope/res_output_gpt41mini_single_strategy_prompt-20260407_130036

Name: strategy-scope/res_output_gpt41mini_single_strategy_prompt-20260407_130036
Creator: strategy-scope
Published: 2026-04-07 17:03:03
License: No description available

Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/strategy-scope/res_output_gpt41mini_single_strategy_prompt-20260407_130036

Description:

---
tags:
- strategy-scope
- CREATE
- evaluation
---

# res_output_gpt41mini_single_strategy_prompt-20260407_130036

Evaluation results for `strategy-scope/output_gpt41mini_single_strategy_prompt`.

## Aggregate Statistics

| Metric | Value |
|--------|-------|
| Instances | 165 |
| Avg paths/instance | 12.2 |
| Avg path length | 2.9 |
| Avg valid/instance | 11.2 |
| Avg valid & factual/instance | 4.7 |
| Avg factuality | 0.7232 |
| Avg strength | 2.8135 |
| Avg pairwise distance (ft=0.0) | 0.7808 |
| Avg pairwise distance (ft=1.0) | 0.6767 |
| Avg utility (ft=0.0) | 16.8886 |
| Avg utility (ft=1.0) | 8.6947 |

## Parameters

- **Eval model:** gpt-4o-mini
- **Patience:** 0.9
- **Total eval calls:** 4036
- **Timestamp:** 20260407_130036
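The aggregate figures above are easier to compare once they are expressed as ratios. The following is a minimal sketch, not part of the dataset card: the dictionary keys and the `derived_ratios` helper are hypothetical names chosen for illustration, and the values are copied by hand from the table and parameter list. Because the card reports per-instance averages only, each result is a ratio of averages, which may differ from the average of per-instance ratios. The card does not define `ft`, so the two utility rows are compared as labelled.

```python
# Aggregate values copied by hand from the card above (key names are hypothetical).
stats = {
    "instances": 165,
    "avg_paths": 12.2,
    "avg_valid": 11.2,
    "avg_valid_factual": 4.7,
    "utility_ft0": 16.8886,
    "utility_ft1": 8.6947,
    "total_eval_calls": 4036,
}


def derived_ratios(s):
    """Return ratios of the reported averages (not averages of per-instance ratios)."""
    return {
        # Share of generated paths judged valid.
        "valid_share": s["avg_valid"] / s["avg_paths"],
        # Share of generated paths judged both valid and factual.
        "valid_factual_share": s["avg_valid_factual"] / s["avg_paths"],
        # Among valid paths, the share that is also factual.
        "factual_given_valid": s["avg_valid_factual"] / s["avg_valid"],
        # Utility at ft=1.0 relative to utility at ft=0.0.
        "utility_ft1_over_ft0": s["utility_ft1"] / s["utility_ft0"],
        # Evaluation-model calls spent per instance.
        "eval_calls_per_instance": s["total_eval_calls"] / s["instances"],
    }


ratios = derived_ratios(stats)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

Read this way, roughly nine in ten generated paths are judged valid, but well under half are both valid and factual, and the evaluation used about 24 calls per instance.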

Provider:

strategy-scope
