alon-albalak/claude-sonnet-4-5-noveltybench-comprehensive-summary
Source: Hugging Face · Updated 2025-12-17 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/alon-albalak/claude-sonnet-4-5-noveltybench-comprehensive-summary
Resource description:
---
dataset_info:
  features:
  - name: mean_distinct
    dtype: float64
  - name: mean_utility
    dtype: float64
  - name: total_instances
    dtype: int64
  - name: total_completions
    dtype: int64
  - name: mean_partition_score
    dtype: float64
  - name: std_partition_score
    dtype: float64
  - name: mean_intra_diversity
    dtype: float64
  - name: std_intra_diversity
    dtype: float64
  - name: median_intra_diversity
    dtype: float64
  - name: min_intra_diversity
    dtype: float64
  - name: max_intra_diversity
    dtype: float64
  - name: mean_group_size
    dtype: float64
  - name: total_groups
    dtype: int64
  - name: total_pairs_computed
    dtype: int64
  - name: mean_reward
    dtype: float64
  - name: std_reward
    dtype: float64
  - name: median_reward
    dtype: float64
  - name: min_mean_reward
    dtype: float64
  - name: max_mean_reward
    dtype: float64
  - name: global_mean_reward
    dtype: float64
  - name: global_std_reward
    dtype: float64
  - name: global_min_reward
    dtype: float64
  - name: global_max_reward
    dtype: float64
  - name: mean_judge_score
    dtype: float64
  - name: std_judge_score
    dtype: float64
  - name: median_judge_score
    dtype: float64
  - name: min_mean_judge_score
    dtype: float64
  - name: max_mean_judge_score
    dtype: float64
  - name: global_mean_judge_score
    dtype: float64
  - name: global_std_judge_score
    dtype: float64
  - name: global_min_judge_score
    dtype: float64
  - name: global_max_judge_score
    dtype: float64
  - name: score_parsing_success_rate
    dtype: float64
  - name: total_valid_scores
    dtype: int64
  splits:
  - name: train
    num_bytes: 272
    num_examples: 1
  download_size: 16637
  dataset_size: 272
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
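The metadata above declares a single `train` split with one row, so reading the summary amounts to loading that split and taking its first record. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable; the helper names `load_summary_row` and `mismatched_columns` are illustrative and not part of the dataset. The type check only compares each value against the dtype declared in the card.

```python
from typing import Any

# Repository ID taken from the card; everything else here is illustrative.
REPO_ID = "alon-albalak/claude-sonnet-4-5-noveltybench-comprehensive-summary"

# Columns the card declares as int64; every other column is float64.
INT_COLUMNS = frozenset({
    "total_instances",
    "total_completions",
    "total_groups",
    "total_pairs_computed",
    "total_valid_scores",
})


def load_summary_row(repo_id: str = REPO_ID) -> dict[str, Any]:
    """Download the `train` split and return its single row as a dict."""
    # Imported lazily so the helper below works without `datasets` installed.
    from datasets import load_dataset

    ds = load_dataset(repo_id, split="train")
    return ds[0]


def mismatched_columns(row: dict[str, Any]) -> list[str]:
    """List columns whose Python type disagrees with the declared dtype."""
    bad = []
    for name, value in row.items():
        expected = int if name in INT_COLUMNS else float
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            bad.append(name)
    return bad


if __name__ == "__main__":
    row = load_summary_row()
    print(len(row), "columns; mismatches:", mismatched_columns(row))
```

When the default endpoint is unreachable, setting the `HF_ENDPOINT` environment variable to the mirror host before running the script is one option.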
Dataset information:
Features:
- mean_distinct (float64): mean distinctness
- mean_utility (float64): mean utility
- total_instances (int64): total number of instances
- total_completions (int64): total number of completions
- mean_partition_score (float64): mean partition score
- std_partition_score (float64): standard deviation of the partition score
- mean_intra_diversity (float64): mean intra-group diversity
- std_intra_diversity (float64): standard deviation of intra-group diversity
- median_intra_diversity (float64): median intra-group diversity
- min_intra_diversity (float64): minimum intra-group diversity
- max_intra_diversity (float64): maximum intra-group diversity
- mean_group_size (float64): mean group size
- total_groups (int64): total number of groups
- total_pairs_computed (int64): total number of pairs computed
- mean_reward (float64): mean reward
- std_reward (float64): standard deviation of the reward
- median_reward (float64): median reward
- min_mean_reward (float64): minimum mean reward
- max_mean_reward (float64): maximum mean reward
- global_mean_reward (float64): global mean reward
- global_std_reward (float64): global standard deviation of the reward
- global_min_reward (float64): global minimum reward
- global_max_reward (float64): global maximum reward
- mean_judge_score (float64): mean judge score
- std_judge_score (float64): standard deviation of the judge score
- median_judge_score (float64): median judge score
- min_mean_judge_score (float64): minimum mean judge score
- max_mean_judge_score (float64): maximum mean judge score
- global_mean_judge_score (float64): global mean judge score
- global_std_judge_score (float64): global standard deviation of the judge score
- global_min_judge_score (float64): global minimum judge score
- global_max_judge_score (float64): global maximum judge score
- score_parsing_success_rate (float64): score parsing success rate
- total_valid_scores (int64): total number of valid scores
Splits:
- train: 272 bytes, 1 example
Download size: 16637 bytes
Dataset size: 272 bytes
Configurations:
- Config name: default, data files:
  - Split: train, path: data/train-*
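With 34 columns in a single row, it can help to bucket the columns by the metric they describe before printing a report. The sketch below groups column names by substring; the family labels and the helpers `family_of` and `group_columns` are assumptions made for illustration, and the grouping reflects only the column names, not any documented meaning of the metrics.

```python
from collections import defaultdict
from typing import Iterable

# Hypothetical family labels keyed by the substring that identifies them.
# The first matching substring wins.
FAMILIES = (
    ("partition_score", "partition score"),
    ("intra_diversity", "intra-group diversity"),
    ("judge_score", "judge score"),
    ("reward", "reward"),
)


def family_of(column: str) -> str:
    """Return the family label for a column name, or "other" if none matches."""
    for needle, label in FAMILIES:
        if needle in column:
            return label
    return "other"


def group_columns(columns: Iterable[str]) -> dict[str, list[str]]:
    """Bucket column names by family, preserving their input order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for column in columns:
        groups[family_of(column)].append(column)
    return dict(groups)


if __name__ == "__main__":
    sample = ["mean_reward", "global_std_judge_score", "total_groups"]
    for label, names in group_columns(sample).items():
        print(f"{label}: {', '.join(names)}")
```

Counts and rates such as `total_groups` or `score_parsing_success_rate` fall into the "other" bucket under this naming scheme.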
Provider:
alon-albalak



