Median-difference window subseries score for contextual anomaly on time series
Source: Mendeley Data. Updated 2024-01-31. Indexed 2024-06-27.
Download link:
http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/CU.the.2016.195
Description:
Anomaly detection on time series is an active topic in data mining. The aim is to find data points that differ from the majority, called anomalies. This thesis proposes a novel anomaly score, the Median-Difference Window subseries Score (MDWS), together with its algorithm and a recommended window length, for detecting contextual anomalies in time series data. The score is the value of the middle point of the current window minus the median of all data points in that window. The MDWS algorithm is implemented by updating the median of the current window subseries, which maintains linear time complexity. Two anomaly thresholds are derived from the interquartile range (IQR) rule. Experimental results show that MDWS achieves the highest performance on both synthetic data and real-world benchmark datasets from Yahoo! and Numenta when compared with other existing anomaly detection methods. MDWS is also faster than the other algorithms on the large dataset.
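The scoring rule described above can be illustrated with a minimal Python sketch. It is not the thesis implementation, and it makes several assumptions that the description does not state. The function names `mdws_scores`, `iqr_bounds`, and `detect` are hypothetical. The window is assumed to be odd-length and centered on the scored point. The two thresholds are assumed to be the conventional IQR fences with multiplier `k = 1.5`. For simplicity the sketch recomputes the median for every window, so it does not reproduce the linear-time median update that the thesis describes.

```python
from statistics import median, quantiles


def mdws_scores(series, window):
    """Middle-window value minus window median, for each fully covered point.

    Points too close to either end to have a full centered window get None.
    Assumption: `window` is an odd integer >= 3.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    scores = [None] * len(series)
    for i in range(half, len(series) - half):
        current = series[i - half : i + half + 1]
        scores[i] = series[i] - median(current)
    return scores


def iqr_bounds(values, k=1.5):
    """Lower and upper fences from the interquartile range rule.

    Assumption: k = 1.5 is the conventional multiplier, not a value
    taken from the thesis.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def detect(series, window, k=1.5):
    """Indices whose score falls outside the two IQR fences."""
    scores = mdws_scores(series, window)
    valid = [s for s in scores if s is not None]
    lower, upper = iqr_bounds(valid, k)
    return [
        i
        for i, s in enumerate(scores)
        if s is not None and (s < lower or s > upper)
    ]


if __name__ == "__main__":
    data = [10, 12, 11, 13, 12, 40, 11, 13, 12, 10, 11]
    print(detect(data, window=3))  # [5]
```

In the toy series, the spike of 40 at index 5 scores 28 against a window median of 12, well outside the fences of -4 and 4, while its neighbors stay inside them because the median is barely affected by a single extreme value.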
Created:
2024-01-31



