Deep Clustering Evaluation: How to Validate Internal Clustering Validation Measures

Source: DataCite Commons. Updated: 2026-03-16. Indexed: 2026-04-25.
Download link:
https://tandf.figshare.com/articles/dataset/Deep_Clustering_Evaluation_How_to_Validate_Internal_Clustering_Validation_Measures/30809735
Description:
Deep clustering partitions complex high-dimensional data using deep neural networks. It projects the data into lower-dimensional embeddings before partitioning, which raises unique evaluation challenges. Traditional clustering validation measures, designed for low-dimensional spaces, are problematic for deep clustering for two reasons: (a) the curse of dimensionality when they are applied to the high-dimensional input data, and (b) unreliable comparison of clustering results when they are applied to embedded data from different embedding spaces, owing to variations in training procedures and model parameter settings. This article addresses these unresolved and often overlooked challenges in evaluating clustering within deep learning. We propose a systematic evaluation framework for internal clustering validation measures that: (a) theoretically establishes why traditional measures are ineffective when applied to input data or across disparate embedding spaces paired with partitioning outcomes; (b) identifies embedding spaces that support reliable evaluation by detecting groups with high agreement in ranking partitioning outcomes; and (c) develops a stable and robust scoring scheme by weighting index values computed across these identified embedding spaces. Experiments show that the new framework aligns better with external measures, effectively reducing the misguidance caused by improper use of internal validation measures in deep clustering evaluation. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
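Steps (b) and (c) of the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Spearman rank correlation as the agreement measure, the 0.8 threshold, the brute-force group search, and the agreement-based weights are all assumptions made for illustration. The input `scores[e][p]` is assumed to hold an internal index value (higher is better) for partitioning outcome `p` computed in embedding space `e`.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Average ranks (1 = smallest value); tied values share a rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return 0.0 if va == 0 or vb == 0 else cov / (va * vb) ** 0.5


def agreeing_group(scores, threshold=0.8):
    """Largest set of embedding spaces whose pairwise rank agreement
    is at least `threshold` (assumed criterion; exhaustive search,
    so only practical for a handful of spaces)."""
    n = len(scores)
    for size in range(n, 1, -1):
        for group in combinations(range(n), size):
            if all(spearman(scores[i], scores[j]) >= threshold
                   for i, j in combinations(group, 2)):
                return list(group)
    return []


def combined_scores(scores, group):
    """Weighted average of index values over the agreeing spaces.
    Each weight is the space's mean agreement with the rest of the
    group (an assumed weighting, not taken from the article)."""
    if len(group) < 2:
        raise ValueError("need at least two agreeing embedding spaces")
    weights = {
        i: mean(spearman(scores[i], scores[j]) for j in group if j != i)
        for i in group
    }
    total = sum(weights.values())
    return [
        sum(weights[i] * scores[i][p] for i in group) / total
        for p in range(len(scores[0]))
    ]
```

For example, if two of three embedding spaces rank four partitioning outcomes identically and the third ranks them in reverse, `agreeing_group` keeps the first two and `combined_scores` averages only their index values. Averaging raw index values assumes they are on comparable scales within the agreeing group; ranks could be averaged instead if that does not hold.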
Provider:
Taylor & Francis
Created:
2025-12-05