haiderkamal23/allaM-offsec-arabic-chat-v2
收藏Hugging Face2025-12-10 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/haiderkamal23/allaM-offsec-arabic-chat-v2
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
task_categories:
- question-answering
- text-generation
language:
- ar
- en
tags:
- security
- offensive-security
- penetration-testing
- cybersecurity
- arabic
size_categories:
- 10K<n<100K
---
# Arabic Offensive Security Chat Dataset v2
**High-quality category-aware bilingual Arabic/English dataset for offensive security assistants.**
## What's New in v2
- ✅ **Category-aware responses:** Different response structures for web vulns, DeFi, reconnaissance tools, social engineering, etc.
- ✅ **No generic templates:** Each category has specialized analysis framework
- ✅ **No verbatim copying:** Responses analyze and transform the input, not repeat it
- ✅ **Semantic accuracy:** Tools (nmap, recon-ng) described as tools, not vulnerabilities
- ✅ **Real attack chain reasoning:** Detailed multi-stage attack scenarios
## Dataset Details
- **Training examples:** 18,412
- **Validation examples:** 2,000
- **Total:** 20,412
- **Languages:** Arabic (analysis) + English (technical terms)
- **Format:** ChatML (`messages` field)
- **Source:** Filtered WNT3D/Ultimate-Offensive-Red-Team
## Categories Covered
- **general:** 8086 examples
- **web_vuln:** 3753 examples
- **defi_crypto:** 2042 examples
- **recon_tools:** 1890 examples
- **malware_ransomware:** 1814 examples
- **api_protocol:** 994 examples
- **infra_network:** 885 examples
- **cloud_container:** 858 examples
- **social_engineering:** 90 examples
## Response Structures
### Web Vulnerabilities
- Vulnerability type identification
- Technical analysis
- Multi-stage attack chain
- OWASP/CWE/MITRE mapping
- Specific mitigations (not generic)
- Detection techniques
### DeFi/Smart Contracts
- Blockchain-specific attack vectors
- Reentrancy, flash loans, oracle manipulation
- On-chain impact analysis
- Solidity-specific mitigations
- Audit tools and formal verification
### Reconnaissance Tools
- Tool descriptions (not vulnerabilities!)
- Role in attack chain
- Detection from defender perspective
- Network monitoring strategies
### Social Engineering
- Attack psychology and techniques
- Delivery mechanisms
- User awareness as primary defense
- Email/communication security
### Malware/Ransomware
- Full kill chain (10+ stages)
- Lateral movement tactics
- Double extortion analysis
- Backup and recovery strategies
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("haiderkamal23/allaM-offsec-arabic-chat-v2")
```
## License
Apache 2.0 - For authorized security research and testing only.
## Training
Designed for QLoRA fine-tuning of:
- `humain-ai/ALLaM-7B-Instruct-preview`
- Or similar Arabic LLMs
Target model: `haiderkamal23/ALLaM-7B-OffSec-Arabic-v2`
提供机构:
haiderkamal23



