CIIRC-NLP/alquistcoder2025_VulnBench_dataset

Name: CIIRC-NLP/alquistcoder2025_VulnBench_dataset
Creator: CIIRC-NLP
Published: 2025-12-12 21:27:02
License: not specified

Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/CIIRC-NLP/alquistcoder2025_VulnBench_dataset


Description:

VulnBench is a challenging benchmark of Python coding prompts that frequently induce vulnerable code from strong LLMs. It is intended for measuring the vulnerability rate of generated code with static analysis tools such as Amazon CodeGuru Security and Bandit. Each prompt was selected via a multi-model difficulty filter and a self-refinement failure test using Claude 3.7. The dataset does not provide reference solutions; it serves as a stress test for safe code generation under realistic high-risk conditions.
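The listing does not define how the vulnerability rate is computed, so the following is only a minimal sketch under stated assumptions: each prompt yields one generated Python file, the files have been scanned with Bandit using its JSON output format, and a file counts as vulnerable if Bandit reports at least one finding for it. The function name `vulnerability_rate`, the file paths, and the sample report are hypothetical and chosen for illustration; they are not part of the dataset or its evaluation code.

```python
def vulnerability_rate(bandit_report: dict, total_files: int) -> float:
    """Fraction of generated files with at least one Bandit finding.

    Assumption: `bandit_report` is the parsed JSON produced by
    `bandit -r <dir> -f json`, whose "results" list holds one entry
    per finding, each with a "filename" key.
    """
    if total_files <= 0:
        raise ValueError("total_files must be positive")
    # Several findings in the same file count once.
    flagged_files = {issue["filename"] for issue in bandit_report.get("results", [])}
    return len(flagged_files) / total_files


# Hypothetical report fragment for four generated files, two of them flagged.
sample_report = {
    "results": [
        {"filename": "gen/0001.py", "test_id": "B602", "issue_severity": "HIGH"},
        {"filename": "gen/0001.py", "test_id": "B105", "issue_severity": "LOW"},
        {"filename": "gen/0003.py", "test_id": "B301", "issue_severity": "MEDIUM"},
    ]
}

rate = vulnerability_rate(sample_report, total_files=4)
print(f"vulnerability rate: {rate:.2f}")
```

Counting flagged files rather than individual findings keeps the metric bounded between 0 and 1; a severity-weighted variant would be an equally plausible choice.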

Provider:

CIIRC-NLP
