AEUPH/synthetic_Jailbreak_Protection_Security_v1

Source: Hugging Face | Updated: 2026-04-06 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/AEUPH/synthetic_Jailbreak_Protection_Security_v1
Resource description:
---
language: en
license: cc-by-nc-4.0
task_categories:
- text-generation
- question-answering
size_categories:
- 1K<n<10K
pretty_name: "Silicon Factory - AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE (Sample)"
---

# Silicon Factory -- AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE

> **Generated**: 2026-04-06
> **Engine**: Silicon Factory v2.0 (Local Qwen 2.5 0.5B)
> **4D Brane Memory**: YES
> **Quantum Tunnelling**: YES
> **Zero API Leakage**: YES
> **Sentence Completion**: All responses trimmed to complete sentences

## The Value Proposition

**This is a curated sample from the AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE domain.** This dataset demonstrates the quality and consistency of our synthetic data generation engine. Each entry is:

- **Topic-Focused**: Centered on **AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE**
- **Contextually Consistent**: Clean, complete sentences only
- **Locally Generated**: Zero API leakage, zero third-party data exposure
- **High Token Density**: Lean, information-rich responses with minimal filler

### Interested in the Full Dataset?

This sample contains 5 entries. The **full Gold dataset** contains **100,000 entries** covering every aspect of AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE, including:

- Comprehensive chain-of-thought traces
- Advanced techniques and edge cases
- Verified consistency across all entries
- Monthly updates with new content

**Contact hybridionorb@gmail.com** to discuss licensing the full dataset or request a custom generation.
## Dataset Nutrition Label

| Metric | Value | Notes |
|--------|-------|-------|
| **Total Rows (Sample)** | 5 | Free sample from 100,000-row Gold dataset |
| **Category Focus** | Mixed Topics | AI JAILBREAK PROTECTION AND SECURITY AND DEFENSE |
| **Avg Response Length** | 399 chars (~99 tokens) | Range: 294-467 |
| **Unique Vocabulary** | 194 words | High lexical diversity |
| **Token Density Score** | HIGH | Useful info / filler ratio |
| **Consistency Engine** | 4D Brane Memory | Temporal+Semantic+Thematic+Structural |
| **Generation Method** | Tree-Speculative Decoding | Multi-temperature (0.7-1.5) |
| **Zero API Leakage** | YES | 100% local generation |
| **Complete Sentences** | YES | Trimmed to last period |

### Category Distribution

| Category | Entries | Percentage |
|----------|---------|------------|
| **Mixed** | 5 entries | 100% |

## Monetization & Licensing

### Dual-Tier Access

| Tier | License | Rows | Price | Use Case |
|------|---------|------|-------|----------|
| **Sample** | CC-BY-NC 4.0 | 5 | FREE | Research, evaluation |
| **Gold Dataset** | Commercial | 100,000 | $2,500 | Production, fine-tuning |
| **Custom Generation** | Negotiable | Any | Quote-based | Niche-specific data |

### Non-Commercial (CC-BY-NC 4.0)

This sample subset is **free for researchers** and non-commercial use. Attribution required.
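
The length and vocabulary figures in the Dataset Nutrition Label above could be recomputed from the response strings roughly as follows. This is a minimal sketch, not the author's method: the function name `nutrition_metrics`, the regex-based word splitting, and the 4-characters-per-token rule of thumb are all assumptions made for illustration (the label's "399 chars (~99 tokens)" is consistent with that rule, but the source does not state how tokens were counted).

```python
import re
from statistics import mean


def nutrition_metrics(responses):
    """Compute simple length and vocabulary statistics for response strings."""
    lengths = [len(r) for r in responses]

    # Assumption: a "word" is a lowercase run of letters, digits, or apostrophes.
    vocab = set()
    for r in responses:
        vocab.update(re.findall(r"[a-z0-9']+", r.lower()))

    avg_chars = mean(lengths)
    return {
        "rows": len(responses),
        "avg_chars": avg_chars,
        # Assumption: roughly 4 characters per token.
        "approx_tokens": int(avg_chars // 4),
        "min_chars": min(lengths),
        "max_chars": max(lengths),
        "unique_words": len(vocab),
    }


if __name__ == "__main__":
    # Hypothetical responses, not taken from the dataset.
    sample = [
        "Input filtering blocks known jailbreak patterns.",
        "Output monitoring flags policy violations.",
    ]
    print(nutrition_metrics(sample))
```

To check the published numbers, pass the response column of the loaded sample to `nutrition_metrics`; the column name is not given in the card, so inspect the first row to find it.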
### Commercial / Enterprise License -- $2,500

Access to the full **100,000-row Gold dataset** includes:

- Full 100,000 rows with verified chain-of-thought traces
- 4D Brane Memory consistency guarantees
- Priority support and custom generation options
- Monthly data feed subscription available

<script async src="https://js.stripe.com/v3/buy-button.js"></script>
<stripe-buy-button
  buy-button-id="buy_btn_1TJKMkALzE8JrKGl9lTJtOe2"
  publishable-key="pk_live_51SLMoWALzE8JrKGlEaEVvEpiFhK1muafShOKNN9vfykc4CA423t0JVnywyU5pfz40CQFEg2c2apAAYjAf2qlETFA00CQWHv8kQ"
>
</stripe-buy-button>

**Direct Payment Link**: https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00

**Custom Orders**: hybridionorb@gmail.com

**Gated Access**: This repo can be set to Gated -- request access after payment for commercial licensing

## Data Provenance & Verification

### Generation Pipeline

```
Seed Prompts (Curated)
  -> Tree-Speculative Decoding (Multi-branch)
  -> 4D Brane Memory (Consistency Check)
  -> Quality Filter (Min 50 chars)
  -> Sentence Trimming (Complete sentences only)
  -> Temperature Variation (0.7-1.5)
  -> Export (JSONL + HF Format)
```

### Quality Guarantees

- **No API Leakage**: 100% generated on local hardware
- **No PII**: All prompts are synthetic, no real user data
- **Consistency**: 4D Brane Memory ensures narrative coherence
- **Diversity**: Temperature scaling prevents mode collapse
- **Complete Sentences**: All responses trimmed to last period

### Hardware & Software

- **Model**: Qwen 2.5 0.5B (GGUF Q4_K_M)
- **Engine**: Silicon Factory v2.0
- **Inference**: llama.cpp (local, offline)
- **Context**: 2048 tokens
- **Decoding**: Tree-Speculative with beam search

## Usage

```python
from datasets import load_dataset

ds = load_dataset("AEUPH/synthetic_Jailbreak_Protection_Security_v1")
print(ds["train"][0])
```

## Data-as-a-Service Subscription

**Don't just buy a static dataset. Subscribe to a living data feed.**

- **5,000 new entries** delivered weekly to your private HF org
- **Fresh content**, updated techniques, emerging topics
- **Consistency guaranteed** via 4D Brane Memory across weeks
- **Custom niches**: Security, Code, Math, Reasoning, and more

**Subscription**: hybridionorb@gmail.com

**Delivery**: Private gated HF repo, updated weekly

## About Silicon Factory

Silicon Factory is an **automated synthetic data production system** that:

- Generates high-quality datasets using local models
- Maintains consistency via 4D Brane Memory
- Exports in multiple formats (JSONL, Parquet, HF)
- Auto-uploads to HuggingFace with monetized READMEs
- Offers custom data generation services

**Built for profit-driven dataset creation.**

## Contact & Custom Orders

| Need | Action |
|------|--------|
| **Buy Gold Dataset ($2,500)** | https://buy.stripe.com/3cIcN4gzC7lXfuH49s7wA00 |
| **Custom Dataset** | hybridionorb@gmail.com |
| **Subscription Feed** | Weekly/monthly data delivery |
| **Consulting** | Silicon Factory setup for your hardware |

---

*Generated by Silicon Factory v2.0 on 2026-04-06 | 4D Brane Memory Verified | Quantum-Optimized | Complete Sentences Only*
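
The "Quality Filter (Min 50 chars)" and "Sentence Trimming" steps named in the generation pipeline can be sketched as below. This is an illustrative reading of the card, not Silicon Factory's actual code: the function names `trim_to_last_period` and `clean_responses` are hypothetical, and dropping a response that contains no period at all is an assumption, since the card only says responses are "trimmed to last period".

```python
def trim_to_last_period(text):
    """Cut text after its final period; return '' if it has no period."""
    end = text.rfind(".")
    if end == -1:
        return ""
    return text[: end + 1].strip()


def clean_responses(responses, min_chars=50):
    """Apply the length filter, then trim each survivor to complete sentences."""
    kept = []
    for raw in responses:
        # Quality filter: skip responses shorter than the minimum length.
        if len(raw.strip()) < min_chars:
            continue
        trimmed = trim_to_last_period(raw)
        # Assumption: a response with no complete sentence is discarded.
        if trimmed:
            kept.append(trimmed)
    return kept


if __name__ == "__main__":
    # Hypothetical raw generations, not taken from the dataset.
    raw = [
        "Rate limiting slows automated probing. Logging helps audits. Unfinished tho",
        "Too short.",
    ]
    print(clean_responses(raw))
```

Cutting at the last `.` is simple but coarse: it also cuts at a decimal point or abbreviation, and it ignores sentences that end in `?` or `!`.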
Provider:
AEUPH