eth-sri/AutoBaxBench
Source: Hugging Face. Updated: 2025-12-25. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/eth-sri/AutoBaxBench
Description:
AutoBaxBench is a code security benchmark generated agentically by AutoBaxBuilder. It is designed to measure the ability of code generation models and agents to produce code that is both correct and secure. The benchmark contains 560 backend development tasks, drawn from 40 AutoBaxBuilder-generated scenarios across 14 backend frameworks (including various frameworks for Python, JavaScript/TypeScript, Go, PHP, Ruby, and Rust) and three difficulty levels (Easy, Medium, Hard). For each task, the dataset provides the complete scenario specification: the API specification, a natural language description, framework-specific implementation hints, and flags indicating whether a database or secret management is required. It also includes functional tests and end-to-end security tests for evaluating generated solutions.
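The task count above is consistent with pairing every scenario with every framework (40 × 14 = 560). The following is a minimal sketch of that reading, not the dataset's actual schema: the `Task` fields, the `build_task_grid` helper, the placeholder scenario and framework names, and the assumption that difficulty is a property of the scenario are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product

# Counts taken from the dataset description.
NUM_SCENARIOS = 40
NUM_FRAMEWORKS = 14
DIFFICULTIES = ("easy", "medium", "hard")


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; field names are illustrative only."""
    scenario: str
    framework: str
    difficulty: str
    needs_database: bool
    needs_secret: bool


def build_task_grid(scenarios, frameworks):
    """Pair every scenario with every framework, one task per pair."""
    return [
        Task(
            scenario=s["name"],
            framework=fw,
            difficulty=s["difficulty"],
            needs_database=s["needs_database"],
            needs_secret=s["needs_secret"],
        )
        for s, fw in product(scenarios, frameworks)
    ]


# Placeholder data: the real scenario and framework names are not used here.
scenarios = [
    {
        "name": f"scenario_{i:02d}",
        "difficulty": DIFFICULTIES[i % len(DIFFICULTIES)],
        "needs_database": i % 2 == 0,
        "needs_secret": False,
    }
    for i in range(NUM_SCENARIOS)
]
frameworks = [f"framework_{j:02d}" for j in range(NUM_FRAMEWORKS)]

tasks = build_task_grid(scenarios, frameworks)
print(len(tasks))  # 560
```

Under this assumption, a per-framework or per-difficulty breakdown of results is a simple group-by over the task list.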
Provider:
eth-sri



