
1337xyz1337xyz/sciknoweval-v2-hard-autogradable-512-2026-04-28

Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/1337xyz1337xyz/sciknoweval-v2-hard-autogradable-512-2026-04-28
Resource description:
---
license: other
task_categories:
- question-answering
language:
- en
tags:
- sciknoweval
- scientific-reasoning
- plan-crl
- benchmark-subset
pretty_name: SciKnowEval v2 Hard Autogradable 512 - 2026-04-28
---

# SciKnowEval v2 Hard Autogradable 512 - 2026-04-28

A 512-example sanity subset sampled from `hicai-zju/SciKnowEval` (`v2`, `test`) for Plan-CRL scientific reasoning evals.

Selection seed: `20260428`.

Filtering and balancing:

- excludes `L1`
- keeps `L2`, `L3`, `L4`
- keeps autogradable types: `mcq-4-choices`, `mcq-2-choices`, `true_or_false`, `filling`
- requires `answerKey` or `answer`
- balances domains at 128 examples each: Biology, Chemistry, Material, Physics
- per domain: 32 L2, 48 L3, 48 L4

Useful fields for manual inspection:

- `question`
- `choices`
- `target`
- `selection_domain`
- `selection_level`
- `selection_type`
- `details.task`
- `details.subtask`

Extra files:

- `metadata/selection_summary.json`
- `metadata/selection_manifest.csv`
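
The filtering and balancing rules above can be sketched in a few lines of Python. This is only an illustration, not the script that produced the subset, which is not shown on this page. It assumes each record is a plain dict that already carries `selection_domain`, `selection_level`, and `selection_type`, and that levels are stored as the strings `L2`, `L3`, `L4`. The function names `expected_counts`, `select_subset`, and `quota_mismatches` are hypothetical.

```python
import random
from collections import Counter

# Values taken from the dataset card.
DOMAINS = ("Biology", "Chemistry", "Material", "Physics")
LEVEL_QUOTAS = {"L2": 32, "L3": 48, "L4": 48}  # 128 per domain, L1 excluded
AUTOGRADABLE_TYPES = {"mcq-4-choices", "mcq-2-choices", "true_or_false", "filling"}
SEED = 20260428


def expected_counts():
    """Quota for every (domain, level) cell; the cells sum to 4 * 128 = 512."""
    return {(d, lvl): n for d in DOMAINS for lvl, n in LEVEL_QUOTAS.items()}


def select_subset(rows, seed=SEED):
    """Filter rows, then draw a seeded sample that fills each quota exactly.

    Assumption: rows are dicts keyed like the published subset; the raw
    benchmark may name these fields differently.
    """
    rng = random.Random(seed)
    quotas = expected_counts()
    buckets = {key: [] for key in quotas}
    for row in rows:
        key = (row["selection_domain"], row["selection_level"])
        if key not in buckets:
            continue  # drops L1 and any domain outside the four kept
        if row["selection_type"] not in AUTOGRADABLE_TYPES:
            continue  # not automatically gradable
        if not (row.get("answerKey") or row.get("answer")):
            continue  # no reference answer to grade against
        buckets[key].append(row)

    subset = []
    for key, quota in quotas.items():
        if len(buckets[key]) < quota:
            raise ValueError(f"not enough candidates for {key}")
        subset.extend(rng.sample(buckets[key], quota))
    return subset


def quota_mismatches(rows):
    """Return {(domain, level): (found, expected)} for cells that are off."""
    found = Counter((r["selection_domain"], r["selection_level"]) for r in rows)
    expected = expected_counts()
    return {
        key: (found.get(key, 0), expected.get(key, 0))
        for key in set(found) | set(expected)
        if found.get(key, 0) != expected.get(key, 0)
    }
```

`quota_mismatches` can also be run on the published rows themselves as a sanity check: an empty dict means the 32/48/48 split holds in all four domains.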
Provider:
1337xyz1337xyz