Hierarchical Fusion Multi-Instance Learning for Weakly Supervised Pathological Image Classification

China Scientific Data, updated 2026-04-16, indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.11999/JEIT250726

Resource description:

Objective: Cancer mortality in China continues to rise, and pathological image classification has become central to diagnosis. Pathological images have a multilevel structure, yet many existing methods focus only on the highest resolution or use simple feature concatenation for multi-scale fusion. These strategies do not make effective use of hierarchical information. In addition, most approaches rely on random pseudo-bag division to handle high-resolution images. Because cancerous regions in positive slides are sparse, random sampling often produces incorrect pseudo-labels and low signal-to-noise ratios, which reduce classification accuracy. This study proposes a Hierarchical Fusion Multi-Instance Learning (HFMIL) method that integrates multilevel feature fusion with a pseudo-bag division strategy based on an attention evaluation function to improve accuracy and interpretability in pathological image classification.

Methods: A weakly supervised multilevel classification method is proposed to use the hierarchical characteristics of pathological images and improve cancer image classification performance. The method has three main steps. First, multilevel features are extracted. Blank regions are removed, low-resolution images are divided into patches, and these patches are indexed to their corresponding high-resolution regions. Semantic features capture low-resolution tissue structure and high-resolution cellular detail. Second, pseudo-bags are constructed using an attention-based evaluation function. Class activation mapping is used to compute patch-level scores. Patches are ranked, and high-scoring ones are selected as potential positive samples. Low-scoring patches are discarded to maintain pseudo-label relevance. High-resolution pseudo-bags are then generated using index mapping, which reduces incorrect pseudo-labels and improves the signal-to-noise ratio. Third, a two-stage classification model is developed. Low-resolution pseudo-bags are aggregated with a gated attention mechanism for preliminary classification. A cross-attention mechanism then fuses the most informative low-resolution features with their corresponding high-resolution features. The fused representation is concatenated with aggregated high-resolution pseudo-bags to form an image-level feature vector for final prediction. Training uses a two-stage loss that combines low-resolution and overall cross-entropy losses. Experiments on three pathological image datasets confirm the effectiveness of the method in weakly supervised settings.

Results and Discussions: The proposed method is compared with several recent weakly supervised classification approaches, including ABMIL, CLAM, TransMIL, and DTFD, using three pathological image datasets: the publicly available Camelyon16 and TCGA-LUNG datasets and a private skin cancer dataset, NBU-Skin. The results show clear performance gains. On Camelyon16, the method achieves 88.3% accuracy and an AUC of 0.979 (Table 2). On TCGA-LUNG, accuracy reaches 86.0% and AUC 0.931 (Table 2), exceeding the comparative methods. On the NBU-Skin dataset, accuracy reaches 90.5% and AUC 0.976 for multiclass tasks (Table 2). Ablation studies further examine the necessity of the multilevel feature fusion and pseudo-bag division modules. The combination of these modules improves classification performance. On the skin cancer dataset, removing the pseudo-bag division module reduces accuracy from 93.8% to 90.7%, and removing the multilevel feature fusion module reduces accuracy further to 80.0% (Table 3). These results confirm that each component contributes to the effectiveness of the method.

Conclusions: A weakly supervised pathological image classification method that integrates multilevel feature fusion and an attention-based pseudo-bag division strategy is proposed. The method uses hierarchical information effectively and reduces errors caused by incorrect pseudo-labels and low signal-to-noise ratios. Experiments show consistent improvements in accuracy and AUC across three datasets. The main contributions are: (1) a multilevel feature extraction and fusion strategy that uses a cross-attention mechanism to combine features across scales; (2) an attention-based pseudo-bag division method that identifies potential positive regions and improves pseudo-label correctness through a top-k strategy while reducing background noise; and (3) superior performance compared with recent weakly supervised classifiers. Future work may include optimizing cross-level attention mechanisms, extending the framework to prognosis prediction or lesion segmentation, and developing more efficient feature extraction and fusion modules for broader clinical use.
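
The score-ranked pseudo-bag division and index mapping described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `build_pseudo_bags`, the parameters `top_k` and `num_bags`, the round-robin assignment of retained patches to bags, and the dictionary `low_to_high` (low-resolution patch index to high-resolution patch indices) are all assumptions made for illustration. The abstract does not specify how the retained patches are distributed among pseudo-bags, and the scores are taken as given rather than computed from class activation mapping.

```python
from typing import Dict, List, Sequence


def build_pseudo_bags(
    scores: Sequence[float],
    low_to_high: Dict[int, List[int]],
    top_k: int,
    num_bags: int,
) -> Dict[str, List[List[int]]]:
    """Split the top-k scored low-resolution patches into pseudo-bags.

    Hypothetical helper: `scores[i]` is the patch-level score of
    low-resolution patch i, and `low_to_high[i]` lists the indices of the
    high-resolution patches that cover the same tissue region.
    """
    if top_k <= 0 or num_bags <= 0:
        raise ValueError("top_k and num_bags must be positive")

    # Rank low-resolution patches by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Keep the k highest-scoring patches; the rest are discarded.
    kept = ranked[:top_k]

    # Assumed assignment rule: deal the kept patches out round-robin so that
    # every pseudo-bag receives some of the highest-scoring patches.
    low_bags = [kept[b::num_bags] for b in range(num_bags)]
    low_bags = [bag for bag in low_bags if bag]  # drop empty bags

    # Index mapping: expand each low-resolution patch into its
    # corresponding high-resolution patches.
    high_bags = [
        [h for i in bag for h in low_to_high.get(i, [])] for bag in low_bags
    ]
    return {"low": low_bags, "high": high_bags}


if __name__ == "__main__":
    # Six low-resolution patches; each maps to four high-resolution patches.
    example_scores = [0.1, 0.9, 0.4, 0.8, 0.05, 0.7]
    example_map = {i: [4 * i + j for j in range(4)] for i in range(6)}
    bags = build_pseudo_bags(example_scores, example_map, top_k=4, num_bags=2)
    print(bags["low"])   # [[1, 5], [3, 2]]
    print(bags["high"])  # [[4, 5, 6, 7, 20, 21, 22, 23], [12, 13, 14, 15, 8, 9, 10, 11]]
```

In this toy example the two lowest-scoring patches (indices 0 and 4) never enter a pseudo-bag, which is the intended effect of discarding low-scoring patches: every pseudo-bag of a positive slide is more likely to contain tumor tissue than under random division.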

Created:

2026-04-16
