dac-research/extra_evals_v1
Source: Hugging Face | Updated: 2026-04-22 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/dac-research/extra_evals_v1
Description:
extra_evals_v1 is a curated evaluation dataset for long-document tasks that combines several benchmarks: LongBench v2, LooGLE, RULER, and ZeroScrolls. It is organized into task subsets, each with its own configuration and a test split. For every subset, the dataset structure documents the columns, token-length statistics, and whether LLM-judge scoring is feasible. The subsets cover diverse domains and task types.
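Because each subset is described as a separate configuration with a test split, loading them might look like the sketch below. This is only an illustration, not an official loader: it assumes the repository can be read with the Hugging Face `datasets` library and that every configuration exposes a split literally named "test". The helper `load_test_splits` and its injectable parameters are hypothetical names introduced here; no subset names are hard-coded, since this page does not list them.

```python
from typing import Callable, Dict, Optional, Sequence

REPO_ID = "dac-research/extra_evals_v1"  # repository name taken from this page


def load_test_splits(
    repo_id: str = REPO_ID,
    list_configs: Optional[Callable[[str], Sequence[str]]] = None,
    load: Optional[Callable[..., object]] = None,
) -> Dict[str, object]:
    """Return {config_name: test split} for every configuration of repo_id.

    list_configs and load are hypothetical injection points; when omitted,
    the corresponding functions from the `datasets` library are used.
    """
    if list_configs is None or load is None:
        # Imported lazily so the helper can be exercised without the library.
        import datasets

        list_configs = list_configs or datasets.get_dataset_config_names
        load = load or datasets.load_dataset
    # Assumption: each configuration has a split named "test".
    return {
        name: load(repo_id, name, split="test")
        for name in list_configs(repo_id)
    }
```

Calling `load_test_splits()` with no arguments requires the `datasets` package and network access, and downloads every subset, which may be large for long-document data.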
Provided by:
dac-research



