AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63
收藏Hugging Face2026-04-07 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63
下载链接
链接失效反馈官方服务:
资源简介:
---
language: en
license: mit
task_categories:
- text-generation
- question-answering
- text-to-text
size_categories:
- n<1K
format:
- json
modality:
- text
tags:
- synthetic-data
- qwen
- instruction-tuned
- silicon-factory
- mixed
dataset_info:
features:
- name: instruction
dtype: string
- name: response
dtype: string
- name: category
dtype: string
- name: system_prompt
dtype: string
splits:
- name: train
num_bytes: 3330
num_examples: 5
download_size: 3 KB
dataset_size: 3 KB
---
# 📊 Jailbreak Defense Doorpage V63
> **Synthetic Dataset** · Generated with Silicon Factory v3 · **AI JAILBREAK DEFENSE**
> 5 instruction-response pairs · Tree-Speculative Decoding + 4D Brane Memory
<div align="center">
| Dataset | Fine-Tuned Model | Buy Gold Tier |
|---------|-----------------|---------------|
| **This Dataset** | [Model Card](https://huggingface.co/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63-model) | [💎 $2,500 License](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00) |
</div>
---
## 💎 UNLOCK GOLD TIER — $2,500
> ⚡ **Get the full commercial license, unlimited usage rights, priority support, and exclusive dataset access.**
[**👉 PURCHASE NOW VIA STRIPE**](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)
*One-time payment · Instant delivery · Lifetime updates included*
---
## Dataset Details
| Property | Value |
|----------|-------|
| **Dataset ID** | `synthetic_Jailbreak_Defense_Doorpage_v63` |
| **Entries** | 5 |
| **Category** | mixed |
| **Focus** | AI JAILBREAK DEFENSE |
| **Avg Instruction Length** | 231 chars |
| **Avg Response Length** | 435 chars |
| **Language** | English |
| **License** | MIT (free tier) — [Gold Commercial License](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00) available |
| **Generated** | 2026-04-07 |
| **Mode** | Doorpage (auto-gen + fine-tune) |
## Description
This dataset contains **5 synthetically generated instruction-response pairs** focused on **ai jailbreak defense**. Generated using the **Silicon Factory v3** pipeline with:
- **Tree-Speculative Decoding** (branch factor=5, depth=4) for diverse outputs
- **4D Brane Memory** for narrative consistency across all entries
- **Quality control** with 0.7 minimum quality threshold
- **Deduplication** with 0.9 max similarity threshold
### What This Dataset Covers
- ✅ High-quality instruction following for **ai jailbreak defense** topics
- ✅ Structured, detailed responses with actionable insights
- ✅ Consistent tone and formatting across outputs
- ✅ Optimized for intermediate-to-expert user queries
## ⚡ GET THE GOLD TIER — FULL COMMERCIAL LICENSE
> 🔓 **Unlock enterprise-grade rights:**
> - Commercial deployment & redistribution
> - White-label usage
> - Priority support & custom training
> - Access to extended datasets (100K+ entries)
> - Early access to future model versions
**[💳 BUY GOLD TIER — $2,500](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)**
---
## Usage
### Load with HuggingFace Datasets
```python
from datasets import load_dataset
ds = load_dataset("AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63")
print(ds["train"][0])
```
### Load from JSONL
```python
import json
with open("data.jsonl", "r", encoding="utf-8") as f:
entries = [json.loads(line) for line in f]
for entry in entries[:5]:
print(f"Q: {entry['instruction'][:80]}...")
print(f"A: {entry['response'][:120]}...\n")
```
### Fine-Tuning with This Dataset
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType
# Load base model
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
# Apply LoRA
peft_config = LoraConfig(
r=16, lora_alpha=16, lora_dropout=0,
task_type=TaskType.CAUSAL_LM,
target_modules=["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"]
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# Train on this dataset
# ... (see Silicon Factory pipeline for full training code)
```
---
## Dataset Structure
### Content
| Field | Type | Description |
|-------|------|-------------|
| `instruction` | string | User prompt/question |
| `response` | string | Model-generated answer |
| `category` | string | Topic category |
| `system_prompt` | string | System prompt used |
### Generation Pipeline
```
System Prompt → Dynamic Prompt Generation → Qwen 2.5 0.5B → Quality Filter → Dataset
↓
Tree-Speculative Decoding
4D Brane Memory
```
### Training Examples
**Example 1:**
<details>
<summary>Input</summary>
```
Regarding AI JAILBREAK DEFENSE: Articulate this topic through the lens of modern best practices. Imagine explaining to a non-technical stakeholder mak
```
</details>
<details>
<summary>Output</summary>
```
Articulating the topic of AI Jailbreak Defense through a lens centered on modern best practices, we can consider several key principles to guide our decision-making process. These include:
1. **Risk
```
</details>
**Example 2:**
<details>
<summary>Input</summary>
```
Topic: AI JAILBREAK DEFENSE. Walk through this subject through the lens of modern best practices. Peer review of a colleague's approach. Comparative a
```
</details>
<details>
<summary>Output</summary>
```
As an AI language model, I am designed to understand and analyze various aspects of technology. In this case study on the topic "AI Jailbreak Defense," you asked for a comprehensive response that incl
```
</details>
**Example 3:**
<details>
<summary>Input</summary>
```
Describe the following: AI JAILBREAK DEFENSE -- approach it from a beginner's perspective. Write as documentation for a team inheriting your work. Det
```
</details>
<details>
<summary>Output</summary>
```
# AI Jailbreak Defense (Jailbreaking)
## Approach
AI jail breaking is a technique used by hackers to bypass user authentication and access secure data on devices running operating systems such as An
```
</details>
---
## 💎 READY TO SCALE?
> **Upgrade to Gold Tier for:**
> - 🏢 Full commercial usage rights
> - 📦 Extended datasets (10K-100K+ entries)
> - 🎯 Custom domain training
> - 🚀 Priority support & SLA
> - 🔄 Lifetime model updates
> - 📊 Performance benchmarks & reports
**[⚡ BUY GOLD TIER — $2,500](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)**
*Trusted by startups and enterprises worldwide. Instant delivery via Stripe.*
---
## Citation
### BibTeX
```bibtex
@misc{synthetic_Jailbreak_Defense_Doorpage_v63_dataset,
title = {synthetic Jailbreak Defense Doorpage v63},
author = {Silicon Factory v3 (AEUPH)},
year = {2026},
url = {https://huggingface.co/datasets/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63},
note = {Synthetic dataset generated using Tree-Speculative Decoding and 4D Brane Memory}
}
```
### APA
> Silicon Factory v3. (2026). *Synthetic Jailbreak Defense Doorpage V63* [Dataset]. Hugging Face. https://huggingface.co/datasets/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63
---
## More Information
| Resource | Link |
|----------|------|
| **Fine-Tuned Model** | [synthetic_Jailbreak_Defense_Doorpage_v63-model](https://huggingface.co/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63-model) |
| **Base Model** | [Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
| **Silicon Factory** | [github.com/aeuphoraex/qwen-hyperspeed-chatbot](https://github.com/aeuphoraex/qwen-hyperspeed-chatbot) |
## Dataset Authors
**Silicon Factory v3** — Automated Dataset Generation Pipeline
## Contact
📧 hybridionorb@gmail.com · 🐦 [@aeuphoraex](https://huggingface.co/AEUPH)
---
*Built with Silicon Factory v3 · Tree-Speculative Decoding · 4D Brane Memory*
*This dataset is free under MIT License. [Gold Commercial License available for $2,500.](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)*
语言:英语
许可证:MIT
任务类别:
- 文本生成
- 问答
- 文本到文本
样本量分类:
- 少于1000条
格式:
- JSON
模态:
- 文本
标签:
- 合成数据
- 通义千问(Qwen)
- 指令微调
- 硅工厂(Silicon Factory)
- 混合
数据集信息:
特征:
- 名称:instruction,数据类型:字符串
- 名称:response,数据类型:字符串
- 名称:category,数据类型:字符串
- 名称:system_prompt,数据类型:字符串
拆分:
- 名称:train(训练集),字节数:3330,示例数:5
下载大小:3 KB
数据集大小:3 KB
---
# 📊 AI越狱防御门户数据集V63
> **合成数据集** · 使用硅工厂(Silicon Factory)v3生成 · **AI越狱防御**
> 5条指令-回复对 · 树状推测解码(Tree-Speculative Decoding) + 4D膜内存(4D Brane Memory)
<div align="center">
| 数据集 | 微调模型 | 购买黄金版 |
|---------|-----------------|---------------|
| **本数据集** | [模型卡片](https://huggingface.co/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63-model) | [💎 2500美元商业许可证](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00) |
</div>
---
## 💎 解锁黄金版 — 2500美元
> ⚡ **获取完整商业许可证、无限使用权限、优先支持与独家数据集访问权限。**
[**👉 立即通过Stripe购买**](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)
*一次性付款 · 即时交付 · 包含终身更新*
---
## 数据集详情
| 属性 | 值 |
|----------|-------|
| **数据集ID** | `synthetic_Jailbreak_Defense_Doorpage_v63` |
| **条目数** | 5 |
| **类别** | 混合 |
| **聚焦方向** | AI越狱防御 |
| **平均指令长度** | 231字符 |
| **平均回复长度** | 435字符 |
| **语言** | 英语 |
| **许可证** | MIT(免费版)—— 提供[黄金商业许可证](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00) |
| **生成时间** | 2026-04-07 |
| **生成模式** | 门户模式(自动生成+微调) |
## 数据集说明
本数据集包含**5条合成生成的指令-回复对**,聚焦于**AI越狱防御**主题。使用**硅工厂(Silicon Factory)v3**流水线生成,采用以下技术:
- 树状推测解码(Tree-Speculative Decoding,分支因子=5,深度=4)以生成多样化输出
- 4D膜内存(4D Brane Memory)以确保所有条目间的叙事一致性
- 质量控制:最低质量阈值为0.7
- 去重处理:最大相似度阈值为0.9
### 本数据集覆盖内容
- ✅ 针对AI越狱防御主题的高质量指令遵循能力
- ✅ 结构化、详细且包含可落地见解的回复
- ✅ 所有输出保持一致的语气与格式
- ✅ 适配中级至高级用户的查询需求
## ⚡ 获取黄金版 — 完整商业许可证
> 🔓 **解锁企业级权限:**
> - 商业部署与再分发
> - 白标使用
> - 优先支持与自定义训练
> - 扩展数据集(10万+条目)访问权限
> - 未来模型版本抢先体验
**[💳 购买黄金版 — 2500美元](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)**
---
## 使用方法
### 使用HuggingFace Datasets加载
python
from datasets import load_dataset
ds = load_dataset("AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63")
print(ds["train"][0])
### 从JSONL文件加载
python
import json
with open("data.jsonl", "r", encoding="utf-8") as f:
entries = [json.loads(line) for line in f]
for entry in entries[:5]:
print(f"Q: {entry['instruction'][:80]}...")
print(f"A: {entry['response'][:120]}...
")
### 使用本数据集进行微调
python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType
# 加载基础模型
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
# 应用LoRA
peft_config = LoraConfig(
r=16, lora_alpha=16, lora_dropout=0,
task_type=TaskType.CAUSAL_LM,
target_modules=["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"]
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# 在本数据集上训练
# ...(完整训练代码请参考硅工厂流水线)
---
## 数据集结构
### 内容字段
| 字段 | 类型 | 描述 |
|-------|------|-------------|
| `instruction` | 字符串 | 用户提示/问题 |
| `response` | 字符串 | 模型生成的答案 |
| `category` | 字符串 | 主题类别 |
| `system_prompt` | 字符串 | 使用的系统提示词 |
### 生成流水线
系统提示词 → 动态提示生成 → 通义千问(Qwen)2.5 0.5B → 质量过滤 → 数据集
↓
树状推测解码(Tree-Speculative Decoding)
4D膜内存(4D Brane Memory)
### 训练示例
**示例1:**
<details>
<summary>输入</summary>
Regarding AI JAILBREAK DEFENSE: Articulate this topic through the lens of modern best practices. Imagine explaining to a non-technical stakeholder mak
</details>
<details>
<summary>输出</summary>
Articulating the topic of AI Jailbreak Defense through a lens centered on modern best practices, we can consider several key principles to guide our decision-making process. These include:
1. **Risk
</details>
**示例2:**
<details>
<summary>输入</summary>
Topic: AI JAILBREAK DEFENSE. Walk through this subject through the lens of modern best practices. Peer review of a colleague's approach. Comparative a
</details>
<details>
<summary>输出</summary>
As an AI language model, I am designed to understand and analyze various aspects of technology. In this case study on the topic "AI Jailbreak Defense," you asked for a comprehensive response that incl
</details>
**示例3:**
<details>
<summary>输入</summary>
Describe the following: AI JAILBREAK DEFENSE -- approach it from a beginner's perspective. Write as documentation for a team inheriting your work. Det
</details>
<details>
<summary>输出</summary>
# AI Jailbreak Defense (Jailbreaking)
## Approach
AI jail breaking is a technique used by hackers to bypass user authentication and access secure data on devices running operating systems such as An
</details>
---
## 💎 准备规模化应用?
> **升级至黄金版可获得:**
> - 🏢 完整商业使用权限
> - 📦 扩展数据集(1万至10万+条目)
> - 🎯 自定义领域训练
> - 🚀 优先支持与服务级别协议
> - 🔄 终身模型更新
> - 📊 性能基准与报告
**[⚡ 购买黄金版 — 2500美元](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)**
*全球众多初创企业与企业的信赖之选。通过Stripe即时交付。*
---
## 引用方式
### BibTeX
bibtex
@misc{synthetic_Jailbreak_Defense_Doorpage_v63_dataset,
title = {synthetic Jailbreak Defense Doorpage v63},
author = {Silicon Factory v3 (AEUPH)},
year = {2026},
url = {https://huggingface.co/datasets/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63},
note = {Synthetic dataset generated using Tree-Speculative Decoding and 4D Brane Memory}
}
### APA
> 硅工厂(Silicon Factory)v3. (2026). *AI越狱防御门户数据集V63* [数据集]. Hugging Face. https://huggingface.co/datasets/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63
---
## 更多信息
| 资源 | 链接 |
|----------|------|
| **微调模型** | [synthetic_Jailbreak_Defense_Doorpage_v63-model](https://huggingface.co/AEUPH/synthetic_Jailbreak_Defense_Doorpage_v63-model) |
| **基础模型** | [通义千问(Qwen)2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
| **硅工厂(Silicon Factory)** | [github.com/aeuphoraex/qwen-hyperspeed-chatbot](https://github.com/aeuphoraex/qwen-hyperspeed-chatbot) |
## 数据集作者
**硅工厂(Silicon Factory)v3** — 自动化数据集生成流水线
## 联系方式
📧 hybridionorb@gmail.com · 🐦 [@aeuphoraex](https://huggingface.co/AEUPH)
---
*基于硅工厂(Silicon Factory)v3构建 · 树状推测解码(Tree-Speculative Decoding) · 4D膜内存(4D Brane Memory)*
*本数据集基于MIT许可证免费发布。[黄金商业许可证售价2500美元](https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00)。*
提供机构:
AEUPH



