1337xyz1337xyz/sciknoweval-v2-hard-autogradable-512-2026-04-28
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/1337xyz1337xyz/sciknoweval-v2-hard-autogradable-512-2026-04-28
Resource description:
---
license: other
task_categories:
- question-answering
language:
- en
tags:
- sciknoweval
- scientific-reasoning
- plan-crl
- benchmark-subset
pretty_name: SciKnowEval v2 Hard Autogradable 512 - 2026-04-28
---
# SciKnowEval v2 Hard Autogradable 512 - 2026-04-28
A 512-example sanity subset sampled from `hicai-zju/SciKnowEval` (`v2`, `test`) for Plan-CRL scientific reasoning evals.
Selection seed: `20260428`.
Filtering and balancing:
- excludes `L1`
- keeps `L2`, `L3`, `L4`
- keeps autogradable types: `mcq-4-choices`, `mcq-2-choices`, `true_or_false`, `filling`
- requires `answerKey` or `answer`
- balances domains at 128 examples each: Biology, Chemistry, Material, Physics
- per domain: 32 L2, 48 L3, 48 L4
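The filtering and balancing rules above can be read as a stratified draw over (domain, level) cells. The following is a minimal sketch of that idea, not the script that produced this subset. It assumes each upstream record has been flattened into a dict with the hypothetical keys `domain`, `level`, `type`, `answerKey`, and `answer`, and it uses Python's `random.Random` seeded with the selection seed, which will not necessarily reproduce the published selection.

```python
import random
from collections import defaultdict

SEED = 20260428  # selection seed stated in the card
DOMAINS = ("Biology", "Chemistry", "Material", "Physics")
LEVEL_QUOTA = {"L2": 32, "L3": 48, "L4": 48}  # per-domain quotas; L1 is excluded
KEEP_TYPES = {"mcq-4-choices", "mcq-2-choices", "true_or_false", "filling"}

# Arithmetic check on the card's numbers: 32 + 48 + 48 = 128 per domain, 4 * 128 = 512.
assert sum(LEVEL_QUOTA.values()) == 128
assert len(DOMAINS) * sum(LEVEL_QUOTA.values()) == 512


def select_subset(records, seed=SEED):
    """Filter records, then draw a fixed quota from every (domain, level) cell.

    The record keys used here are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for rec in records:
        if rec.get("domain") not in DOMAINS:
            continue
        if rec.get("level") not in LEVEL_QUOTA:  # drops L1 and unknown levels
            continue
        if rec.get("type") not in KEEP_TYPES:  # keeps autogradable types only
            continue
        if not (rec.get("answerKey") or rec.get("answer")):  # needs a gold answer
            continue
        cells[(rec["domain"], rec["level"])].append(rec)

    selected = []
    for domain in DOMAINS:
        for level, quota in LEVEL_QUOTA.items():
            pool = cells[(domain, level)]
            if len(pool) < quota:
                raise ValueError(f"only {len(pool)} candidates for {domain}/{level}")
            selected.extend(rng.sample(pool, quota))
    return selected
```

Sampling per cell rather than over the whole filtered pool is what guarantees the 32/48/48 split in every domain, whatever the upstream distribution looks like.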
Useful fields for manual inspection:
- `question`
- `choices`
- `target`
- `selection_domain`
- `selection_level`
- `selection_type`
- `details.task`
- `details.subtask`
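For a quick manual check, the fields listed above can be tallied with a few lines of Python. This is a sketch under assumptions: the card does not say whether `details.task` is stored as a flat column or as a nested `details` object, so the hypothetical helper `get_field` tries both, and the two example rows are made up for illustration rather than taken from the dataset.

```python
from collections import Counter


def get_field(row, dotted):
    """Read a field such as 'details.task' from a flat or a nested row."""
    if dotted in row:  # flat column whose name contains a dot
        return row[dotted]
    value = row
    for part in dotted.split("."):  # otherwise walk nested dicts
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def summarize(rows):
    """Count rows per (domain, level) cell, per question type, and per task."""
    cells = Counter(
        (get_field(r, "selection_domain"), get_field(r, "selection_level"))
        for r in rows
    )
    types = Counter(get_field(r, "selection_type") for r in rows)
    tasks = Counter(get_field(r, "details.task") for r in rows)
    return cells, types, tasks


# Made-up rows for illustration only; values are placeholders.
example_rows = [
    {
        "question": "Placeholder question A",
        "choices": ["yes", "no"],
        "target": "yes",
        "selection_domain": "Physics",
        "selection_level": "L3",
        "selection_type": "mcq-2-choices",
        "details": {"task": "hypothetical_task", "subtask": "hypothetical_subtask"},
    },
    {
        "question": "Placeholder question B",
        "choices": [],
        "target": "True",
        "selection_domain": "Biology",
        "selection_level": "L2",
        "selection_type": "true_or_false",
        "details.task": "hypothetical_task",
        "details.subtask": "hypothetical_subtask",
    },
]

cells, types, tasks = summarize(example_rows)
```

On the real data, `rows` could be any iterable of dicts, for instance a split loaded with the `datasets` library. The card does not state the split names, so those would need to be checked first. A full run should show 12 cells whose counts match the per-domain quotas.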
Extra files:
- `metadata/selection_summary.json`
- `metadata/selection_manifest.csv`
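The two metadata files can be fetched and parsed with standard tooling. The sketch below assumes the `huggingface_hub` package is installed and makes no assumption about the manifest's column names or the summary's keys, since the card does not document them. The helper names `fetch`, `read_manifest`, and `read_summary` are hypothetical.

```python
import csv
import json

REPO_ID = "1337xyz1337xyz/sciknoweval-v2-hard-autogradable-512-2026-04-28"


def fetch(filename, repo_id=REPO_ID):
    """Download one file from the dataset repo and return its local path."""
    # Third-party dependency, imported lazily so the parsers work without it.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, repo_type="dataset")


def read_manifest(path):
    """Parse the CSV manifest into a list of dicts keyed by its header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def read_summary(path):
    """Parse the JSON summary into plain Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A typical call would be `read_manifest(fetch("metadata/selection_manifest.csv"))`. If the manifest lists one row per selected example, which is an assumption, its length should be 512.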
Provider:
1337xyz1337xyz



