
Improving Spectral Clustering using the Asymptotic Value of the Normalised Cut

Taylor & Francis Group | Updated 2019-12-23 | Indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/Improving_Spectral_Clustering_using_the_Asymptotic_Value_of_the_Normalised_Cut/7862810/1
Resource description:
Spectral clustering is a popular and versatile clustering method based on a relaxation of the normalised graph cut objective. Despite its popularity, selecting the number of clusters and tuning the important scaling parameter remain challenging problems in practical applications of spectral clustering. Popular heuristics have been proposed, but corresponding theoretical results are scarce. In this paper we investigate the asymptotic value of the normalised cut for an increasing sample assumed to arise from an underlying probability distribution. Based on this, we find strong connections between spectral and density clustering. This enables us to provide recommendations for selecting the number of clusters and setting the scaling parameter in a data-driven manner. An algorithm inspired by these recommendations is proposed, which we have found to exhibit strong performance in a range of applied domains. An R implementation of the algorithm is available from https://github.com/DavidHofmeyr/spuds.
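
For readers unfamiliar with the objective named in the description, the following is a minimal sketch of a two-way spectral split and of the normalised cut value it tries to minimise. It is not the authors' algorithm (for that, see the R implementation linked above). The function names `normalised_cut` and `spectral_bipartition`, the Gaussian similarity, and the fixed scaling parameter `sigma` are assumptions made for illustration only.

```python
import numpy as np


def normalised_cut(W, labels):
    """Sum over clusters of cut(C, complement of C) / vol(C)."""
    degrees = W.sum(axis=1)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        cut = W[inside][:, ~inside].sum()   # similarity leaving the cluster
        total += cut / degrees[inside].sum()
    return total


def spectral_bipartition(X, sigma):
    """Two-way split from the relaxed normalised cut (illustrative only)."""
    # Gaussian similarity; sigma is the scaling parameter (assumed fixed here).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalised Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; take the second eigenvector
    # and map it back to the random-walk form before thresholding at zero.
    _, vecs = np.linalg.eigh(L_sym)
    relaxed = d_inv_sqrt * vecs[:, 1]
    labels = (relaxed > 0).astype(int)
    return labels, W


# Toy data: two well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(20, 2)),
    rng.normal(loc=3.0, scale=0.3, size=(20, 2)),
])
labels, W = spectral_bipartition(X, sigma=1.0)
print(labels, normalised_cut(W, labels))
```

In this sketch both `sigma` and the number of clusters (two) are chosen by hand, which is exactly the tuning problem the description says the paper addresses in a data-driven way.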

Created: 2019-03-19