Dataset for: Some Remarks on the $R^2$ for Clustering
Source: Figshare. Updated: 2018-05-14. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Dataset_for_Some_Remarks_on_the_R_sup_2_sup_for_Clustering/6124508
Description:
A common descriptive statistic in cluster analysis is the $R^2$, which measures the overall proportion of variance explained by the cluster means. This note highlights properties of the $R^2$ for clustering. In particular, we show that the $R^2$ can generally be artificially inflated by linearly transforming the data, either by "stretching" or by projecting. Also, the $R^2$ for clustering will often be a poor measure of clustering quality in high-dimensional settings. We also investigate the $R^2$ for clustering under misspecified models. Several simulation illustrations are provided that highlight weaknesses of the clustering $R^2$, especially in high-dimensional settings. A functional data example is given showing that the $R^2$ for clustering can vary dramatically depending on how the curves are estimated.
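To make the "stretching" claim concrete, here is a minimal sketch in Python. It assumes the usual definition of the clustering $R^2$ as one minus the ratio of the within-cluster sum of squares to the total sum of squares. The function name `clustering_r2`, the synthetic two-cluster data, and the stretch factor are illustrative assumptions and are not taken from the dataset or the accompanying note. The cluster labels are held fixed here for simplicity, whereas in practice one would typically re-cluster after transforming the data.

```python
import numpy as np


def clustering_r2(X, labels):
    """Proportion of total variance explained by the cluster means.

    Assumed definition: 1 - (within-cluster SS) / (total SS).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Total sum of squares around the grand mean.
    total_ss = np.sum((X - X.mean(axis=0)) ** 2)
    # Within-cluster sum of squares around each cluster mean.
    within_ss = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        within_ss += np.sum((members - members.mean(axis=0)) ** 2)
    return 1.0 - within_ss / total_ss


# Synthetic data (hypothetical, not the Figshare dataset): two clusters
# separated along the first coordinate, pure noise in the second.
rng = np.random.default_rng(0)
n = 200
labels = np.repeat([0, 1], n)
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
X = centers[labels] + rng.normal(size=(2 * n, 2))

# "Stretch" the data: scale only the coordinate that separates the clusters.
stretch = np.diag([10.0, 1.0])
X_stretched = X @ stretch

r2_original = clustering_r2(X, labels)
r2_stretched = clustering_r2(X_stretched, labels)
print(f"R^2 original:  {r2_original:.3f}")
print(f"R^2 stretched: {r2_stretched:.3f}")
```

The grouping is identical in both calls, yet the stretched data yields a larger $R^2$, because the noise-only coordinate contributes proportionally less to both sums of squares. Scaling all coordinates by the same constant, by contrast, leaves the statistic unchanged.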
Created:
2018-05-14



