Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set"

Name: Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set"
Creator: TU Wien
Published: 2025-03-31 19:53:23
License: No description provided

DataCite Commons; updated 2025-03-31; indexed 2025-04-16

Download link:

https://researchdata.tuwien.ac.at/doi/10.48436/ghdc6-45k78

Description:

Note: If you use this dataset, please cite the following paper:

Brenner, B., Fabini, J., Offermanns, M., Semper, S., & Zseby, T. (2024). Malware communication in smart factories: A network traffic data set. Computer Networks, 255, 110804.

or in BibTeX:

@article{brenner2024malware,
  title={Malware communication in smart factories: A network traffic data set},
  author={Brenner, Bernhard and Fabini, Joachim and Offermanns, Magnus and Semper, Sabrina and Zseby, Tanja},
  journal={Computer Networks},
  volume={255},
  pages={110804},
  year={2024},
  publisher={Elsevier}
}

Context and methodology

Machine learning-based intrusion detection requires suitable and realistic data sets for training and testing. However, data sets that originate from real networks are rare. Network data is considered privacy-sensitive, and the purposeful introduction of malicious traffic is usually not possible. In this paper, we introduce a labeled data set captured at a smart factory located in Vienna, Austria, during normal operation and during penetration tests with different attack types.

The data set contains 173 GB of PCAP files, representing 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic. The captured malicious traffic originated from a professional penetration tester who performed two types of attacks:
(a) Aggressive attacks that are easier to detect.
(b) Stealthy attacks that are harder to detect.

Our data set includes the raw PCAP files and extracted flow data. Labels for packets and flows indicate whether they originated from a specific attack or from benign communication. We describe the methodology for creating the dataset, conduct an analysis of the data, and provide detailed information about the recorded traffic itself. The dataset is freely available to support reproducible research and the comparability of results in the area of intrusion detection in industrial networks.
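Because the captures total 173 GB, it is usually preferable to stream packets rather than load a whole file into memory. The following is a minimal sketch of a streaming reader, assuming the files use the classic libpcap container (not pcapng); readme.txt lists the software the authors recommend, so this is only one possible approach. The function name iter_pcap_records is hypothetical and not part of the dataset.

```python
import struct
from typing import BinaryIO, Iterator, Tuple

PCAP_GLOBAL_HEADER_LEN = 24   # fixed size of the classic pcap file header
PCAP_RECORD_HEADER_LEN = 16   # fixed size of each per-packet header

# Magic bytes as they appear on disk -> (struct byte order, timestamp divisor)
_MAGIC = {
    b"\xd4\xc3\xb2\xa1": ("<", 1e6),  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": (">", 1e6),  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": ("<", 1e9),  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": (">", 1e9),  # big-endian, nanosecond timestamps
}


def iter_pcap_records(stream: BinaryIO) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_in_seconds, captured_bytes) for each packet record."""
    header = stream.read(PCAP_GLOBAL_HEADER_LEN)
    if len(header) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("truncated pcap global header")
    try:
        endian, divisor = _MAGIC[header[:4]]
    except KeyError:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")

    record = struct.Struct(endian + "IIII")  # ts_sec, ts_frac, incl_len, orig_len
    while True:
        raw = stream.read(PCAP_RECORD_HEADER_LEN)
        if len(raw) < PCAP_RECORD_HEADER_LEN:
            return  # clean end of file (or a truncated trailing header)
        ts_sec, ts_frac, incl_len, _orig_len = record.unpack(raw)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final record; stop rather than yield partial data
        yield ts_sec + ts_frac / divisor, data
```

A caller would open a capture in binary mode, for example open("capture.pcap", "rb") with a placeholder path, and iterate over the generator; only one packet is held in memory at a time.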
Technical details

readme.txt: Information about the data collection, format, necessary software and versions to access it.
license.txt: Licensing information.
a_day1, a_day2, s_day1, s_day2, tf_a, and tf_s: Main dataset, where files starting with "tf" are training files containing only benign, operational data. All other files are attack files containing both operational data and attack data.
images.zip: Contains descriptive images about the data.
extractions.zip: Contains extracted packets and flows in both labeled and unlabeled form.
a_day_tuesday_dos.zip: An additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.
list_of_extracted_features: A complete list of features we extracted from the PCAP files. All flow files contain these features.
list_of_identified_protocols.csv: A complete list of all protocols that we could identify within the PCAP files provided.
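The naming rule for the main dataset (names starting with "tf" are benign-only training data, everything else is attack data) can be applied mechanically when preparing a train/test split. The sketch below uses only the six names listed above; the helper classify_capture is hypothetical, and the names may appear with file extensions or as directories in the actual download.

```python
from typing import Dict, List

# Main-dataset names exactly as listed in the description.
MAIN_FILES = ["a_day1", "a_day2", "s_day1", "s_day2", "tf_a", "tf_s"]


def classify_capture(name: str) -> str:
    """Return 'training' for names starting with 'tf', otherwise 'attack'."""
    return "training" if name.startswith("tf") else "attack"


# Group the names by role, preserving the listed order.
split: Dict[str, List[str]] = {"training": [], "attack": []}
for name in MAIN_FILES:
    split[classify_capture(name)].append(name)

print(split)
```

Note that a_day_tuesday_dos.zip is described as unlabeled, so it is deliberately left out of this split.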

Provider:

TU Wien

Created:

2025-03-31
