5G Core GTP-U Attack Dataset

Name: 5G Core GTP-U Attack Dataset
Creator: Suranga Prasad Wengappuli Arachchige
Published: 2025-08-25 09:33:53
License: No description available

Source: DataCite Commons. Updated: 2025-08-25. Indexed: 2026-05-04.

Download link: https://etsin.fairdata.fi/dataset/5dad463d-0306-455f-b47a-bef771b819fc


Description:

Introduction

The 5G Core GTP-U Attack Dataset (5G-CGAD) is a dataset developed to address the critical scarcity of publicly available data for security research in 5G networks. While 5G enables unprecedented capabilities such as ultra-reliable low latency communications (URLLC), enhanced mobile broadband (eMBB), and massive machine-type communications (mMTC), it also introduces new vulnerabilities due to its cloud-native, virtualized, and software-driven architecture. This dataset focuses on the GPRS Tunneling Protocol for the User Plane (GTP-U), which operates on the N3 interface between the gNodeB and the User Plane Function (UPF). As a protocol designed under the assumption of a trusted backhaul, GTP-U lacks authentication and integrity protections, making it a high-value target for adversaries. Attacks exploiting GTP-U can result in denial-of-service, session hijacking, traffic redirection, and privacy violations.

The dataset provides both benign traffic (representing real-world applications such as streaming, browsing, and file downloads) and malicious traffic (five classes of simulated attacks). Data is available in multiple formats (PCAP and CSV with over 80 extracted features), ensuring adaptability for diverse research purposes.

Testbed Design

The dataset was generated using a realistic cloud-native 5G testbed:

Infrastructure: Kubernetes cluster with two Lenovo ThinkStation P3 servers (Ubuntu 22.04).
Core Network: Open-source Open5GS for AMF, SMF, UPF, and other core functions.
Radio Access Network (RAN): UERANSIM to emulate 30 UEs and 3 gNodeBs.
Traffic Capture: Sidecar container inside the UPF pod for real-time packet sniffing.
Monitoring: Prometheus and Grafana for system observability.
Analysis: Independent ML-based anomaly detection to validate realism.

This setup ensured scalability, reproducibility, and real-time traffic monitoring.
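
To make the Introduction's point about missing authentication concrete, the sketch below parses the 8-byte mandatory GTP-U header carried in a UDP payload. The field layout (flags, message type, length, TEID) follows the GTP-U specification, 3GPP TS 29.281; the names `GtpuHeader` and `parse_gtpu_header` and the sample bytes are hypothetical and are not taken from the dataset or its tooling. Nothing in these fields carries a signature or integrity check, which is why a sender who knows or guesses a valid TEID can inject traffic into a tunnel.

```python
import struct
from typing import NamedTuple

GTPU_UDP_PORT = 2152   # registered UDP port for GTP-U
GTPU_MSG_GPDU = 0xFF   # G-PDU: message type that carries encapsulated user traffic


class GtpuHeader(NamedTuple):
    version: int
    protocol_type: int
    has_ext_header: bool
    has_sequence: bool
    has_npdu: bool
    message_type: int
    length: int   # bytes following the 8-byte mandatory header
    teid: int     # Tunnel Endpoint Identifier


def parse_gtpu_header(payload: bytes) -> GtpuHeader:
    """Parse the 8-byte mandatory GTP-U header from a UDP payload."""
    if len(payload) < 8:
        raise ValueError("GTP-U header needs at least 8 bytes")
    # Network byte order: 1-byte flags, 1-byte type, 2-byte length, 4-byte TEID.
    flags, msg_type, length, teid = struct.unpack("!BBHI", payload[:8])
    return GtpuHeader(
        version=flags >> 5,
        protocol_type=(flags >> 4) & 1,
        has_ext_header=bool(flags & 0x04),
        has_sequence=bool(flags & 0x02),
        has_npdu=bool(flags & 0x01),
        message_type=msg_type,
        length=length,
        teid=teid,
    )


# Made-up example packet: flags 0x30 (version 1, PT 1, no optional fields),
# G-PDU, 4-byte inner payload, TEID 0x0000002A.
sample = bytes([0x30, 0xFF, 0x00, 0x04, 0x00, 0x00, 0x00, 0x2A]) + b"\x00" * 4
header = parse_gtpu_header(sample)
print(header)
```

The TEID is a plain 32-bit integer with no accompanying secret, which is the property the TEID brute-force scenario relies on.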

Benign Traffic Generation

To approximate real-world 5G usage, three benign traffic categories were simulated:

Video Streaming: 20 UEs using iperf3, with randomized throughput (5–50 Mbps) and session durations.
Web Browsing: 10 UEs using curl to fetch content from 12 randomized websites.
File Downloading: Additional browsing users download files of varying sizes.

This diversity captures the variability of session lengths, packet sizes, protocols, and throughput, which is essential for distinguishing attacks from legitimate usage.

Attack Scenarios

The dataset includes five attack categories, each carefully simulated and labeled:

GTP Encapsulation Attack: Nested GTP-U packets injected to exploit UPF vulnerabilities.
Malformed GTP Attack: Variants with invalid headers, corrupted checksums, oversized fields, and unsupported message types.
DDoS Attack (ICMP/UDP Floods): Attacks from compromised UEs targeting the UPF.
Intra-UPF UE DoS Attack: Malicious UE floods another UE within the same UPF, using SYN floods, UDP floods, ICMP floods, HTTP floods, and fragmentation-based amplification.
GTP-U TEID Brute-Force Attack: Adversary guesses Tunnel Endpoint Identifiers (TEIDs) to discover active sessions or disrupt connectivity.

These attacks were repeated multiple times over several days to ensure diversity and statistical richness.

Data Processing Pipeline

Captured data underwent a structured pipeline:

Packet Capture (PCAP): Full traffic including GTP-U headers.
GTP-U Header Removal (via STRIPE tool): Preserving only relevant fields when appropriate.
Flow Generation (via CICFlowMeter): Conversion into flow-based CSV with 84 statistical features.
Labeling: Mapping to one of six classes: BENIGN, GTP-ENCAPSULATION, GTP-MALFORMED, DDOS, INTRA-UPF-DOS, GTP-BRUTEFORCE.
Feature Selection: Redundancy reduction using Pearson correlation and ANOVA.
Normalization: Features standardized (zero mean, unit variance).
Class Balancing (SMOTE): To mitigate skew (e.g., large number of Intra-UPF DoS flows vs. fewer TEID brute-force flows).
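
The last three pipeline steps can be illustrated with a small standard-library sketch: pruning features by Pearson correlation, standardizing to zero mean and unit variance, and generating synthetic minority samples by SMOTE-style interpolation. This is a simplified stand-in and not the authors' code: the description does not say which implementation was used, the ANOVA step is omitted, the 0.95 threshold is an assumption, and the feature names and values are invented. In practice, libraries such as scikit-learn (`StandardScaler`) and imbalanced-learn (`SMOTE`) provide these steps.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) <= threshold for other in kept.values()):
            kept[name] = column
    return kept


def standardize(column):
    """Scale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - mean) / std for x in column]


def smote_like(minority, n_new, k=2, seed=0):
    """Interpolate between a minority row and one of its k nearest minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Invented toy flows; total_bytes is exactly 100 x total_packets, so it is redundant.
features = {
    "flow_duration": [1.0, 2.0, 3.0, 4.0],
    "total_packets": [10.0, 20.0, 15.0, 40.0],
    "total_bytes": [1000.0, 2000.0, 1500.0, 4000.0],
}
kept = drop_correlated(features)
scaled = {name: standardize(column) for name, column in kept.items()}

# Invented minority-class rows (e.g. a rare attack class) in two scaled features.
minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
synthetic = smote_like(minority, n_new=5)
print(sorted(kept), len(synthetic))
```

Because each synthetic row lies on a segment between two real minority rows, oversampling of this kind should be applied only to the training split so that interpolated samples do not leak into evaluation data.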

Provider: Suranga Prasad Wengappuli Arachchige
Created: 2025-08-25
