Improving Spectral Clustering using the Asymptotic Value of the Normalised Cut
Taylor & Francis Group | Updated: 2019-12-23 | Indexed: 2026-04-16
Download link:
https://tandf.figshare.com/articles/Improving_Spectral_Clustering_using_the_Asymptotic_Value_of_the_Normalised_Cut/7862810/1
Description:
Spectral clustering is a popular and versatile clustering method based on a relaxation of the normalised graph cut objective. Despite its popularity, selecting the number of clusters and tuning the important scaling parameter remain challenging problems in practical applications of spectral clustering. Popular heuristics have been proposed, but corresponding theoretical results are scarce. In this paper we investigate the asymptotic value of the normalised cut for an increasing sample assumed to arise from an underlying probability distribution. Based on this, we find strong connections between spectral and density clustering. This enables us to provide recommendations for selecting the number of clusters and setting the scaling parameter in a data driven manner. An algorithm inspired by these recommendations is proposed, which we have found to exhibit strong performance in a range of applied domains. An R implementation of the algorithm is available from https://github.com/DavidHofmeyr/spuds.
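For readers unfamiliar with the objective named in the description, the following is a minimal sketch of the normalised cut and its standard spectral relaxation for two clusters. It is not the algorithm from the paper or the spuds package. The Gaussian affinity, the interpretation of the scaling parameter as the kernel bandwidth `sigma`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def gaussian_affinity(X, sigma):
    """Gaussian similarity matrix; sigma plays the role of the scaling parameter (assumed)."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def normalised_cut(W, labels):
    """Sum over clusters of cut(C, complement of C) / vol(C)."""
    labels = np.asarray(labels)
    deg = W.sum(axis=1)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        cut = W[inside][:, ~inside].sum()
        total += cut / deg[inside].sum()
    return total


def spectral_bipartition(W):
    """Two-way split from the second eigenvector of the normalised Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalised Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues returned in ascending order
    # Map back to the generalised eigenvector and threshold at zero.
    relaxed = d_inv_sqrt * vecs[:, 1]
    return (relaxed > 0).astype(int)


# Illustrative data: two Gaussian blobs (synthetic, not from the paper).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(30, 2)),
    rng.normal(0.0, 0.3, size=(30, 2)) + np.array([4.0, 0.0]),
])
W = gaussian_affinity(X, sigma=1.0)
labels = spectral_bipartition(W)
print("normalised cut of spectral split:", normalised_cut(W, labels))
```

In this sketch both the number of clusters (two) and `sigma` are fixed by hand; choosing them in a data-driven way is the problem the paper addresses.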
Created:
2019-03-19



