EvoEval
Source: arXiv. Indexed 2025-09-30.
Download link:
https://evo-eval.github.io
Description:
EvoEval is a program synthesis benchmark suite created by evolving existing benchmarks into different target domains. It covers five transformation categories: difficulty escalation, creativity, nuance, composition, and tool use. The five categories together contain 500 program synthesis tasks.



