Synthetic Network Traffic Dataset for Anomaly Detection Using Machine Learning in SDN Environments
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/4pnwdgt7b7
Description:
This dataset contains 10,005 synthetic network flow records designed to support research in network anomaly detection, particularly in Software-Defined Networking (SDN) environments. The core hypothesis behind the dataset's creation is that statistical analysis of flow-level features—such as connection duration, packet count, byte transmission, protocol usage, and port behavior—can effectively distinguish between normal and malicious traffic patterns.
The dataset simulates realistic traffic scenarios, representing both benign network flows and various types of malicious activity, including DDoS attacks, port scanning, data exfiltration, brute-force authentication attempts, and protocol misuse. Each record includes temporal metadata, IP addresses, port numbers, protocol types, connection duration, packet counts, and byte statistics.
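The record layout described above can be sketched as a typed structure. This is a minimal illustration, not the dataset's actual schema: every column name (timestamp, src_ip, dst_ip, src_port, dst_port, protocol, duration, packet_count, bytes_sent, bytes_received, label) is a hypothetical placeholder, and the two sample rows are invented for the example. Check the header row of the downloaded file before reusing any of it.

```python
import csv
import io
import ipaddress
from dataclasses import dataclass
from datetime import datetime


# NOTE: all field names are assumptions made for illustration; the real
# file may use different column names or a different timestamp format.
@dataclass
class FlowRecord:
    timestamp: datetime
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    duration: float       # seconds
    packet_count: int
    bytes_sent: int
    bytes_received: int
    label: int            # 0 = normal, 1 = anomalous


def parse_row(row: dict) -> FlowRecord:
    """Convert one CSV row (a dict of strings) into a typed FlowRecord."""
    label = int(row["label"])
    if label not in (0, 1):
        raise ValueError(f"unexpected label: {label}")
    return FlowRecord(
        timestamp=datetime.fromisoformat(row["timestamp"]),
        # ip_address() raises ValueError on malformed addresses.
        src_ip=str(ipaddress.ip_address(row["src_ip"])),
        dst_ip=str(ipaddress.ip_address(row["dst_ip"])),
        src_port=int(row["src_port"]),
        dst_port=int(row["dst_port"]),
        protocol=row["protocol"].upper(),
        duration=float(row["duration"]),
        packet_count=int(row["packet_count"]),
        bytes_sent=int(row["bytes_sent"]),
        bytes_received=int(row["bytes_received"]),
        label=label,
    )


def load_flows(handle) -> list:
    """Read every row from an open CSV file-like object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Invented sample rows, used only to exercise the parser.
SAMPLE_CSV = """timestamp,src_ip,dst_ip,src_port,dst_port,protocol,duration,packet_count,bytes_sent,bytes_received,label
2025-01-01T12:00:00,192.168.1.10,10.0.0.5,51512,443,TCP,1.25,42,5120,6144,0
2025-01-01T12:00:01,192.168.1.77,10.0.0.5,40000,22,TCP,0.05,2,120,0,1
"""

flows = load_flows(io.StringIO(SAMPLE_CSV))
print(len(flows), "flows parsed")
```

For the real file, the same load_flows function could be given an open file handle in place of the in-memory sample, once the column names have been adjusted.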
The integration of machine learning methods, particularly the Random Forest algorithm, into Software-Defined Networks (SDN) enables the creation of an efficient and adaptive system for detecting DDoS attacks. Such a system ensures high accuracy in network traffic classification and timely anomaly detection, and minimizes the impact of false positives on network performance.
Characteristics of Normal Traffic:
Primarily TCP traffic (70%) with standard HTTP/HTTPS, SSH, and DNS communications
Connection durations typically range from 0.1 to 5.0 seconds
Packet counts range between 5–100 per connection
Balanced byte transmission patterns reflecting typical client-server interactions
Use of standard service ports (80, 443, 22, 53, etc.)
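To make the ranges above concrete, the following sketch draws synthetic "normal" flows that satisfy them. It is not the generator used to build the dataset. The function name make_normal_flow, the dictionary keys, the choice of UDP for the non-TCP 30%, the byte range, and the reading of "balanced" as a received/sent ratio between 0.5 and 2.0 are all assumptions made for illustration.

```python
import random

# Ports listed in the description; the "etc." in the source is not expanded.
STANDARD_PORTS = [80, 443, 22, 53]


def make_normal_flow(rng: random.Random) -> dict:
    """Draw one benign-looking flow using the ranges from the description."""
    # 70% TCP per the description; treating the rest as UDP is an assumption.
    protocol = "TCP" if rng.random() < 0.70 else "UDP"
    # Byte volume range is an assumption; the source gives no numbers.
    bytes_sent = rng.randint(500, 50_000)
    # "Balanced" is interpreted here as a received/sent ratio in [0.5, 2.0].
    bytes_received = int(bytes_sent * rng.uniform(0.5, 2.0))
    return {
        "protocol": protocol,
        # Simplification: port is sampled independently of protocol.
        "dst_port": rng.choice(STANDARD_PORTS),
        "duration": round(rng.uniform(0.1, 5.0), 3),   # seconds
        "packets": rng.randint(5, 100),
        "bytes_sent": bytes_sent,
        "bytes_received": bytes_received,
        "label": 0,
    }


rng = random.Random(42)  # fixed seed so runs are repeatable
sample = [make_normal_flow(rng) for _ in range(5)]
for flow in sample:
    print(flow)
```

Passing an explicit random.Random instance keeps the sketch reproducible and avoids touching the global random state.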
Characteristics of Anomalous Traffic:
Very short connection durations (0.01–0.1s), minimal packet counts (1–5), and low byte transmissions typical of port scanning
Extremely high packet counts (1,000–10,000) with disproportionate ratios of sent to received bytes
Extended connection durations (10–60s) and high outbound byte transmission (100KB–1MB) during DDoS or data exfiltration attacks
Repeated connections to authentication ports (SSH-22, RDP-3389), moderate duration and packet count during brute-force attacks
Use of uncommon protocols (GRE, ESP, AH) with non-standard port combinations indicating protocol anomalies
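The anomaly signatures listed above can be written down as simple threshold rules. This is a heuristic sketch for exploring the data, not the labeling procedure used by the dataset authors. The flag names, the dictionary keys, the REPEAT_THRESHOLD of 5 connections, and the reading of 100KB as 100,000 bytes are assumptions. The sent-to-received byte ratio is not checked, and a flow sitting exactly on the 0.1 s / 5 packet boundary shared with normal traffic will be flagged as a scan.

```python
from collections import Counter

AUTH_PORTS = {22, 3389}                  # SSH and RDP, from the description
UNCOMMON_PROTOCOLS = {"GRE", "ESP", "AH"}
REPEAT_THRESHOLD = 5                     # assumption: "repeated" means >= 5
HIGH_OUTBOUND_BYTES = 100_000            # assumption: 100KB as decimal bytes


def flag_flows(flows: list) -> list:
    """Return, for each flow, the list of rule names it triggers."""
    # Count connections per (source, auth port) so repetition can be detected.
    attempts = Counter(
        (f["src_ip"], f["dst_port"]) for f in flows if f["dst_port"] in AUTH_PORTS
    )
    results = []
    for f in flows:
        flags = []
        if f["duration"] <= 0.1 and f["packets"] <= 5:
            flags.append("port_scan")
        if f["packets"] >= 1000:
            flags.append("packet_flood")
        if f["duration"] >= 10 and f["bytes_sent"] >= HIGH_OUTBOUND_BYTES:
            flags.append("long_high_volume")   # DDoS or exfiltration pattern
        if (
            f["dst_port"] in AUTH_PORTS
            and attempts[(f["src_ip"], f["dst_port"])] >= REPEAT_THRESHOLD
        ):
            flags.append("repeated_auth")      # brute-force pattern
        if f["protocol"] in UNCOMMON_PROTOCOLS:
            flags.append("uncommon_protocol")
        results.append(flags)
    return results


def flow(src_ip, dst_port, protocol, duration, packets, bytes_sent):
    """Small helper for building example flows (illustrative only)."""
    return {
        "src_ip": src_ip, "dst_port": dst_port, "protocol": protocol,
        "duration": duration, "packets": packets, "bytes_sent": bytes_sent,
    }


examples = [
    flow("10.0.0.1", 8080, "TCP", 0.05, 2, 120),        # scan-like
    flow("10.0.0.2", 80, "UDP", 2.0, 5000, 400_000),    # flood-like
    flow("10.0.0.3", 443, "TCP", 30.0, 800, 500_000),   # exfiltration-like
    flow("10.0.0.4", 0, "GRE", 1.0, 20, 4000),          # protocol anomaly
    flow("10.0.0.5", 443, "TCP", 1.0, 20, 4000),        # ordinary
]
for f, flags in zip(examples, flag_flows(examples)):
    print(f["src_ip"], flags)
```

Rules like these are mainly useful as a sanity check on the features or as a baseline against which a trained classifier can be compared.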
Dataset Applications:
Training supervised machine learning models for anomaly detection in network traffic
Evaluating classification algorithm performance on imbalanced network security datasets
Testing feature engineering techniques for traffic analysis
Educational use in cybersecurity and network monitoring courses
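For the imbalanced-evaluation use case, accuracy alone can be misleading, so precision, recall, and F1 on the anomalous class (label 1) are usually reported as well. The sketch below computes them in plain Python. The 90/10 class split and both prediction vectors are invented for the example and say nothing about the actual class balance of this dataset.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating label 1 (anomalous) as positive."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def scores(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented example: 90 normal flows followed by 10 anomalous flows.
y_true = [0] * 90 + [1] * 10

# A classifier that always predicts "normal" looks accurate but finds nothing.
always_normal = [0] * 100
print(scores(y_true, always_normal))

# A classifier with 4 false alarms that catches 8 of the 10 anomalies.
better = [1] * 4 + [0] * 86 + [1] * 8 + [0] * 2
print(scores(y_true, better))
```

The first predictor reaches 0.9 accuracy with zero recall, which is why recall and F1 on label 1 are the more informative numbers when anomalies are rare.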
Label Interpretation:
Label 0 represents normal, benign network traffic
Label 1 represents anomalous, potentially malicious network traffic
Created:
2025-06-13



