UGR'16 Tensor Time-Series Dataset

Source: DataCite Commons. Updated: 2022-03-30. Indexed: 2025-04-16.
Download link:
https://ieee-dataport.org/documents/ugr16-tensor-time-series-dataset
Description:
This paper presents an online anomaly detection system capable of handling the operational network traffic of large networks (such as an ISP). We also aim for effective, practical anomaly diagnosis that yields actionable intelligence and enables an automated response. To achieve these objectives, we use the following approach: (1) We model the network status as a stream of tensors in which each cell models a time series in the network. (2) We detect anomalous tensors at each time step by using an unsupervised tensor representation learning model. (3) We produce actionable intelligence by diagnosing the anomaly detection results and identifying the abnormal time series that are the most likely causes of each anomaly in the tensor. (4) We further analyze the traffic corresponding to the anomalous time series with an innovative method that extracts and isolates the attack traffic. (5) We provide solutions for the challenges of anomaly detection on streaming data, such as large volume, high velocity, seasonality, and concept drift. We apply our approach to the complete test set of the UGR data to show its practicality and effectiveness. Our results on the complete UGR dataset show high detection and isolation rates for the labeled attacks. Beyond detecting and isolating most of the labeled attack traffic, we also identify many organic attack activities in the UGR data (traffic labeled as background in the dataset) and report some of them. Our analysis shows that the isolated background traffic represents interesting and potentially malicious behaviour and can provide invaluable insight for cyber-threat researchers.
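Steps (1) to (3) of the description can be illustrated with a small sketch. It is not the authors' method: the flow fields (`timestamp`, `src_group`, `protocol`, `byte_count`), the window length, and the threshold are hypothetical, and a per-cell median/MAD score stands in for the unsupervised tensor representation learning model, which the description does not specify. The sketch only shows the data shape (one tensor per time step, one time series per cell) and how a flagged time step can be traced back to the cells responsible.

```python
from collections import defaultdict
from statistics import median


def build_tensor_stream(flows, window_seconds=60):
    """Aggregate flow records into one sparse tensor per time window.

    Each flow is (timestamp, src_group, protocol, byte_count). These fields
    are illustrative assumptions, not the dataset's actual schema.
    Returns {window_index: {(src_group, protocol): total_bytes}}.
    Windows that contain no flows at all are simply absent.
    """
    stream = defaultdict(lambda: defaultdict(float))
    for timestamp, src_group, protocol, byte_count in flows:
        window = int(timestamp // window_seconds)
        stream[window][(src_group, protocol)] += byte_count
    return {w: dict(cells) for w, cells in sorted(stream.items())}


def robust_scores(stream):
    """Score each cell at each time step with a median/MAD deviation.

    Stand-in for a learned model: a cell scores high when its value is far
    from the median of its own time series.
    """
    windows = sorted(stream)
    cells = sorted({cell for tensor in stream.values() for cell in tensor})
    scores = {w: {} for w in windows}
    for cell in cells:
        series = [stream[w].get(cell, 0.0) for w in windows]
        med = median(series)
        # Fall back to 1.0 when the series is constant, to avoid dividing by zero.
        mad = median(abs(x - med) for x in series) or 1.0
        for w, x in zip(windows, series):
            scores[w][cell] = abs(x - med) / mad
    return scores


def diagnose(scores, threshold=5.0, top_k=3):
    """Return {window: [cells]} for windows with at least one cell above threshold."""
    report = {}
    for w, cell_scores in scores.items():
        ranked = sorted(cell_scores.items(), key=lambda kv: kv[1], reverse=True)
        culprits = [cell for cell, s in ranked[:top_k] if s > threshold]
        if culprits:
            report[w] = culprits
    return report


# Synthetic example: steady traffic in two cells, plus one burst in window 7.
flows = []
for w in range(10):
    flows.append((w * 60 + 1, "A", "tcp", 100 + w % 3))
    flows.append((w * 60 + 2, "B", "udp", 50 + w % 2))
flows.append((7 * 60 + 30, "B", "udp", 5000))

report = diagnose(robust_scores(build_tensor_stream(flows)))
print(report)  # {7: [('B', 'udp')]}
```

This batch version recomputes statistics over the whole series; an online system like the one described would have to update its model incrementally and account for seasonality and concept drift, which the sketch does not attempt.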

Provider:
IEEE DataPort
Created:
2022-03-30