A Novel Dataset for Multiclass Detection and Classification of Darknet Traffic (SafeSurf Darknet 2025)
Source: Mendeley Data. Indexed: 2026-04-18
Download link:
https://data.mendeley.com/datasets/kcrnj6z4rm
Resource description:
📦 Dataset Title
SafeSurf Darknet 2025: A Multi-layer Behavioral Dataset for Darknet Traffic Detection and Classification
📘 Dataset Description
SafeSurf Darknet 2025 is a richly labeled dataset that captures network traffic across various anonymizing technologies and VPNs. Unlike traditional datasets labeled by ports or protocols, this dataset organizes traffic by behavioral context, enabling advanced research in:
Darknet behavior classification
Encrypted traffic analysis
Intrusion and anomaly detection systems (IDS/ADS)
🧪 Labeling and Data Collection Methodology
All traffic was manually generated and labeled in a controlled environment. Sessions were classified based on the known context of user activity (e.g., watching YouTube over Tor = Video Streaming).
Key aspects:
Manual labeling only – no heuristic or automated labeling was used
Behavior isolation through dedicated setups and time-aligned collection
High-confidence ground-truth annotations
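The source does not describe the tooling behind time-aligned collection. As a rough illustration only, the sketch below shows one way flows could be matched to manually logged sessions by timestamp. The `Session` record, the `label_flow` helper, and the example time windows are all hypothetical and are not taken from the dataset or its publications.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Session(NamedTuple):
    """Hypothetical record of one manually logged capture session."""
    start: float      # session start, seconds since capture began
    end: float        # session end (exclusive)
    technology: str   # e.g. "Tor"
    behavior: str     # e.g. "Video Streaming"


def label_flow(flow_start: float, sessions: List[Session]) -> Optional[Session]:
    """Return the session whose window contains flow_start, or None.

    Assumes sessions are sorted by start time and do not overlap.
    """
    starts = [s.start for s in sessions]
    # Index of the last session that starts at or before the flow.
    i = bisect_right(starts, flow_start) - 1
    if i >= 0 and flow_start < sessions[i].end:
        return sessions[i]
    return None


# Made-up session log for demonstration only.
sessions = [
    Session(0.0, 600.0, "Tor", "Video Streaming"),
    Session(900.0, 1500.0, "I2P", "Browsing"),
]

match = label_flow(120.0, sessions)
print(match.behavior if match else "unlabeled")
```

A flow that starts outside every logged window returns `None`, so it can be dropped or reviewed by hand instead of receiving a guessed label.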
🔐 Privacy-Preserving Technologies Covered
Tor
Freenet
I2P
ZeroNet
VPN
🎯 Behavioral Classes Captured
Browsing
Email
Chatting
Voice over IP (VoIP)
File Transfer (FTP)
Audio Streaming
Video Streaming
Peer-to-Peer (P2P) Sharing
Normal (non-darknet traffic)
Note: Some behaviors (e.g., VoIP) are not captured across all technologies due to service limitations.
🧩 Dataset Structure
Layer 1 – Binary Labeling
Normal: 360,358 samples
Darknet: 91,404 samples
Layer 2 – Technology-Specific Classification
Freenet: 26,284 samples
ZeroNet: 25,499 samples
I2P: 22,958 samples
Tor: 12,546 samples
VPN: 4,117 samples
Layer 3 – Behavioral Labeling
Browsing: 33,586
FTP: 20,214
Video Streaming: 9,559
P2P Sharing: 9,392
Email: 7,873
Audio Streaming: 5,953
Chatting: 3,489
VoIP: 1,338
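The counts above can be checked for consistency with a few lines of arithmetic: the Layer 2 and Layer 3 counts each sum to the Layer 1 Darknet total of 91,404, and darknet flows make up roughly 20% of all samples. The sketch below reproduces that check and derives inverse-frequency class weights, a common way to compensate for imbalance. The weighting scheme is an illustrative assumption, not something the dataset authors prescribe.

```python
# Sample counts copied from the Dataset Structure listing.
LAYER1 = {"Normal": 360_358, "Darknet": 91_404}
LAYER2 = {"Freenet": 26_284, "ZeroNet": 25_499, "I2P": 22_958,
          "Tor": 12_546, "VPN": 4_117}
LAYER3 = {"Browsing": 33_586, "FTP": 20_214, "Video Streaming": 9_559,
          "P2P Sharing": 9_392, "Email": 7_873, "Audio Streaming": 5_953,
          "Chatting": 3_489, "VoIP": 1_338}


def shares(counts):
    """Fraction of samples in each class."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def balanced_weights(counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}


# Layers 2 and 3 both partition the darknet samples of Layer 1.
assert sum(LAYER2.values()) == LAYER1["Darknet"]
assert sum(LAYER3.values()) == LAYER1["Darknet"]

for name, layer in (("Layer 1", LAYER1), ("Layer 2", LAYER2), ("Layer 3", LAYER3)):
    print(name)
    weights = balanced_weights(layer)
    for label, share in shares(layer).items():
        print(f"  {label:<16}{layer[label]:>8,}{share:>8.1%}{weights[label]:>8.2f}")
```

Rare classes such as VoIP receive the largest weights, which is worth keeping in mind when choosing evaluation metrics for Layer 3.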
📄 Data Format and Features
The data is provided in CSV format. Each row represents a single network flow, labeled with its associated behavior. Feature columns include:
Timestamps
Flow duration
Packet and byte counts
Inter-arrival time metrics
TCP/UDP header statistics
Directional and flow-based indicators
This structure supports a wide range of machine learning and network security research.
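As a minimal sketch of working with a flow-per-row CSV, the example below counts flows per behavior label using only the Python standard library. The column names (`flow_duration`, `total_packets`, `total_bytes`, `behavior`) and the sample rows are hypothetical placeholders; check the header of the actual files before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical label column name; the real CSV header may differ.
LABEL_COLUMN = "behavior"


def count_labels(handle, label_column=LABEL_COLUMN):
    """Count rows per label in a CSV file object with a header row."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or label_column not in reader.fieldnames:
        raise KeyError(f"column {label_column!r} not found in CSV header")
    return Counter(row[label_column] for row in reader)


# Synthetic rows for demonstration only; not taken from the dataset.
SAMPLE = "\n".join([
    "flow_duration,total_packets,total_bytes,behavior",
    "1.25,10,4200,Browsing",
    "30.50,480,912000,Video Streaming",
    "0.80,6,1800,Browsing",
    "12.00,95,64000,Email",
])

counts = count_labels(io.StringIO(SAMPLE))
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

To run it on a real file, pass an open file object instead of the in-memory sample, for example `open(path, newline="")`, where `path` points to one of the downloaded CSV files.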
💡 Use Cases
Ideal for research and development in:
Behavior-based IDS/ADS
Real-time encrypted traffic detection
Behavioral profiling across anonymizing technologies
Multi-class classification under behavioral and technological variance
📚 Citation and Licensing
Please cite the dataset in your publications and respect the licensing terms included in the dataset repository.
📁 Access the dataset here:
🔗 Mendeley Data – SafeSurf Darknet 2025: https://data.mendeley.com/datasets/kcrnj6z4rm
📄 Related Publications:
🔗 https://www.preprints.org/manuscript/202507.1926/v1
🔗 https://ieeexplore.ieee.org/abstract/document/11073091
Created:
2025-07-24



