CIIRC-NLP/alquistcoder2025_DPO_dataset

Name: CIIRC-NLP/alquistcoder2025_DPO_dataset
Creator: CIIRC-NLP
Published: 2025-12-12 21:30:08
License: not specified

Source: Hugging Face (updated 2025-12-12, indexed 2025-12-20)

Download link:

https://hf-mirror.com/datasets/CIIRC-NLP/alquistcoder2025_DPO_dataset

Description:

A pairwise-preference dataset for Direct Preference Optimization (DPO), intended for training compact coding assistants to prefer secure, policy-aligned, and useful answers over vulnerable or unsafe ones. The dataset is synthesized with a modular Design–Amplify–Refine pipeline and covers three task families: Secure coding (F5), Attack-specific hard cases (F6), and Algorithmic/utility preservation (F7). All "chosen" secure-code samples were scanned with Amazon CodeGuru Security (and, where applicable, Bandit) during generation. The dataset targets Python-centric secure coding, attack robustness, and algorithmic programming.
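
To illustrate how a pairwise-preference record is typically consumed in DPO, the following is a minimal sketch of the standard DPO loss for a single chosen/rejected pair. The field names (`prompt`, `chosen`, `rejected`), the example record, and the log-probability values are hypothetical and are not taken from this dataset; check the dataset card for the actual schema. The function name `dpo_loss` and the default `beta` are likewise assumptions made for illustration.

```python
import math

# Hypothetical record shape for a pairwise-preference dataset.
# The real column names in this dataset may differ.
example_pair = {
    "prompt": "Write a function that runs a shell command from user input.",
    "chosen": "Use subprocess.run with an argument list and shell=False ...",
    "rejected": "Use os.system(user_input) ...",
}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    under the policy model or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative log-probabilities (made up): the policy favours the
# chosen answer more than the reference does, so the loss is below log(2).
loss = dpo_loss(-1.0, -5.0, -2.0, -2.0)
print(loss)
```

The loss falls as the policy raises the likelihood of the chosen (secure) answer relative to the rejected (vulnerable) one, measured against the reference model. When policy and reference agree exactly, the loss equals log(2).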

Provider:

CIIRC-NLP
