OSS-forge/PoisonPy
Source: Hugging Face | Updated: 2025-12-17 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/OSS-forge/PoisonPy
Description:
PoisonPy is a dataset for studying vulnerabilities in AI code generators, with a focus on targeted data poisoning attacks. It has three main parts:
1) Baseline Training Set: clean training data. Each sample has a text description, a Python code snippet, a safety flag (0 for safe, 1 for unsafe), and a vulnerability category (ICI, DPI, or TPI, or NULL for safe code).
2) Testset: the test set, split into inputs (PoisonPy_test.in) and outputs (PoisonPy_test.out).
3) Unsafe samples with Safe implementation: 120 code samples used for data poisoning, each provided in both a safe and a vulnerable version, covering the ICI, DPI, and TPI categories.
The dataset accompanies the research paper "Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks".
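The labeling scheme described above (a 0/1 safety flag plus a category of ICI, DPI, TPI, or NULL) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Sample` class, its field names, and the toy records are hypothetical and are not taken from the dataset files, whose on-disk format is not described here.

```python
from collections import Counter
from dataclasses import dataclass

# Category labels named in the description; "NULL" marks safe code.
VULN_CATEGORIES = {"ICI", "DPI", "TPI"}


@dataclass
class Sample:
    # Hypothetical field names; the real column names may differ.
    text: str        # natural-language description
    code: str        # Python code snippet
    vulnerable: int  # 0 = safe, 1 = unsafe
    category: str    # "ICI", "DPI", "TPI", or "NULL"


def is_consistent(sample: Sample) -> bool:
    """Check that the safety flag and the category agree."""
    if sample.vulnerable == 0:
        return sample.category == "NULL"
    if sample.vulnerable == 1:
        return sample.category in VULN_CATEGORIES
    return False


def category_counts(samples: list[Sample]) -> Counter:
    """Count unsafe samples per vulnerability category."""
    return Counter(s.category for s in samples if s.vulnerable == 1)


# Invented toy records for illustration only (not real dataset rows).
samples = [
    Sample("print a greeting", "print('hello')", 0, "NULL"),
    Sample("placeholder unsafe task A", "# unsafe snippet A", 1, "TPI"),
    Sample("placeholder unsafe task B", "# unsafe snippet B", 1, "ICI"),
    Sample("placeholder unsafe task C", "# unsafe snippet C", 1, "TPI"),
]

all_ok = all(is_consistent(s) for s in samples)
counts = category_counts(samples)
```

A check like `is_consistent` is a cheap sanity pass to run after loading the training set, before filtering samples by category.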
Provided by:
OSS-forge



