Machine Learning Topic on Intrusion Detection

Name: Machine Learning Topic on Intrusion Detection
Creator: Mendeley Data
License: No description provided

Indexed from doi.org on 2025-03-23

Download link:

http://doi.org/10.17632/nrxz4sfj73.1

Description:

Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) are crucial tools for defending against sophisticated and constantly evolving network threats. Anomaly-based intrusion detection techniques struggle to obtain consistent and accurate performance evaluations because dependable test and validation datasets are lacking. Our assessment of the eleven datasets that have been available since 1998 shows that most of them are outdated and unreliable. Some exhibit limited traffic diversity and volume, while others fail to cover the full range of known attacks. Many anonymize packet payload data, which prevents them from accurately reflecting current trends, and some lack an adequate feature set and metadata.

The CICIDS2017 dataset contains benign traffic and the most recent common attacks, closely resembling real-world data (PCAPs). It also includes the results of the network traffic analysis performed with CICFlowMeter: flows labeled by time stamp, source and destination IPs, source and destination ports, protocols, and attack type. These results are stored in CSV files. Definitions of the extracted features are also provided.

Our main focus in constructing this dataset was to provide background traffic that closely resembles real-life scenarios. We used our proposed B-Profile system (Sharafaldin et al., 2016) to analyze and characterize the abstract behavior of human interactions and to generate naturalistic benign background traffic. For this dataset, we constructed the abstract behavior of 25 users based on the HTTP, HTTPS, FTP, SSH, and email protocols.

Data collection started at 9 a.m. on Monday, July 3, 2017, and ended at 5 p.m. on Friday, July 7, 2017, a total of 5 days. Monday is the normal day and contains only benign traffic. The implemented attacks are Brute Force FTP, Brute Force SSH, Denial of Service (DoS), Heartbleed, Web Attack, Infiltration, Botnet, and Distributed Denial of Service (DDoS). They were executed in both the morning and the afternoon on Tuesday, Wednesday, Thursday, and Friday.

In our latest dataset evaluation framework (Gharib et al., 2016), we identified eleven criteria that are necessary for building a reliable benchmark dataset. None of the prior IDS datasets covers all eleven criteria.
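Since the labeled flows are distributed as CSV files, a first look at the class balance could be sketched as follows. This is a minimal illustration only: the column names (`timestamp`, `src_ip`, `label`, and so on), the label strings, and the sample rows are assumptions made for the example and are not taken from the dataset's actual header or contents. Check the real CSV header before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. Column names, addresses, and label strings are
# illustrative assumptions, not the dataset's actual header or contents.
SAMPLE_CSV = """timestamp,src_ip,dst_ip,src_port,dst_port,protocol,label
2017-07-03 09:05:00,192.168.10.5,192.168.10.3,49152,80,6,BENIGN
2017-07-04 09:30:00,172.16.0.1,192.168.10.50,52100,21,6,FTP-BruteForce
2017-07-04 14:10:00,172.16.0.1,192.168.10.50,52200,22,6,SSH-BruteForce
2017-07-04 14:12:00,192.168.10.8,192.168.10.3,49200,443,6,BENIGN
"""


def count_labels(csv_file, label_column="label"):
    """Count how many flows carry each label in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip() for row in reader)


def attack_fraction(counts, benign_label="BENIGN"):
    """Return the share of flows whose label differs from the benign label."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts.get(benign_label, 0)) / total


counts = count_labels(io.StringIO(SAMPLE_CSV))
print(dict(counts))
print(attack_fraction(counts))
```

To run the same tally on a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`, and set `label_column` and `benign_label` to whatever the file actually uses.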

Provider:

Mendeley Data
