smartdub/COFFE
Source: Hugging Face | Updated: 2025-04-01 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/smartdub/COFFE
Description:
COFFE is a Python benchmark for evaluating the time efficiency of LLM-generated code. Its instances are selected from the HumanEval, MBPP, APPS, and Code Contests datasets. Each instance keeps its original test cases as correctness test cases and adds new test cases, called stressful test cases, that are designed to evaluate time efficiency.
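The two kinds of test cases described above can be illustrated with a minimal sketch. This is not the benchmark's own harness: the test-case shape, the helper names (`passes_correctness`, `time_stress_cases`, `evaluate`), the toy task, and the use of wall-clock time are all assumptions made for illustration, and the dataset's actual fields and efficiency metric may differ.

```python
import time
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical test-case shape (an assumption): (positional args, expected output).
CorrectnessCase = Tuple[tuple, Any]


def passes_correctness(func: Callable, cases: List[CorrectnessCase]) -> bool:
    """Return True only if func gives the expected output on every case."""
    return all(func(*args) == expected for args, expected in cases)


def time_stress_cases(func: Callable, stress_inputs: List[tuple], repeats: int = 3) -> float:
    """Best-of-`repeats` wall-clock seconds needed to run all stress inputs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for args in stress_inputs:
            func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def evaluate(func: Callable, cases: List[CorrectnessCase],
             stress_inputs: List[tuple]) -> Optional[float]:
    """Time a solution on stress inputs only if it is correct; else return None."""
    if not passes_correctness(func, cases):
        return None
    return time_stress_cases(func, stress_inputs)


# Toy task (hypothetical): sum of the integers 1..n.
def sum_loop(n: int) -> int:
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n: int) -> int:
    return n * (n + 1) // 2


def sum_wrong(n: int) -> int:
    return n * n  # right for n = 1, wrong in general


correctness_cases = [((1,), 1), ((10,), 55)]
stress_inputs = [(200_000,)]  # a large input meant to expose slow solutions

for candidate in (sum_loop, sum_formula, sum_wrong):
    result = evaluate(candidate, correctness_cases, stress_inputs)
    label = "incorrect" if result is None else f"{result:.6f} s"
    print(f"{candidate.__name__}: {label}")
```

Gating the timing step on correctness keeps a fast but wrong solution from being counted as efficient. Wall-clock time is noisy across machines, so a real harness would likely need a more stable measurement.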
Provider:
smartdub



