Sqwish/R2-Bench-GLM-Backfilled-v1

Source: Hugging Face. Updated 2026-04-24. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Sqwish/R2-Bench-GLM-Backfilled-v1
Description:
This dataset is derived from JiaqiXue/R2-Bench. It contains correctness_score values recovered from judge_raw, plus targeted re-judging of the rows that recovery left unresolved, and is intended for evaluating the performance of GLM models. The files are located in the data/zai-org/ directory and include evaluation files for GLM-4.5-Air and GLM-4.6. According to the report file, the dataset has 929,040 rows in total: 929,027 recovered from judge_raw, 13 re-judged, and 0 unresolved.
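The row counts quoted from the report can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the reconcile helper are hypothetical names chosen for this example, not the field names used in the actual report file. Only the four numbers come from the description.

```python
# Counts as quoted in the description; the key names are assumptions
# made for this sketch, not the report file's real schema.
report = {
    "total_rows": 929_040,
    "recovered_from_judge_raw": 929_027,
    "rejudged": 13,
    "unresolved": 0,
}


def reconcile(report: dict) -> dict:
    """Check that the three outcome buckets account for every row."""
    accounted = (
        report["recovered_from_judge_raw"]
        + report["rejudged"]
        + report["unresolved"]
    )
    return {
        "accounted_rows": accounted,
        "consistent": accounted == report["total_rows"],
        # Fraction of all rows that needed a fresh judgment.
        "rejudged_share": report["rejudged"] / report["total_rows"],
    }


summary = reconcile(report)
print("buckets sum to total:", summary["consistent"])
print(f"re-judged share: {summary['rejudged_share']:.6%}")
```

The recovered and re-judged buckets together cover all 929,040 rows, so nearly every score comes from recovery and only 13 rows depend on a new judgment.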
Provider:
Sqwish