Privacy-Preserving Federated Weakly-Supervised Learning for Cancer Subtyping on Histopathology Images

中国科学数据2026-04-16 更新2026-04-25 收录

下载链接：

https://www.sciengine.com/AA/doi/10.11999/JEIT250842

下载链接

链接失效反馈

官方服务：

资源简介：

ObjectiveData-driven deep learning methods are widely applied to cancer subtyping, yet their performance depends on large training datasets with fine-grained annotations. For gigapixel Whole Slide Images (WSI), such annotations are labor-intensive and costly. Clinical data are typically stored in isolated data silos, and sharing procedures raise privacy concerns. Federated Learning (FL) enables a global model to be trained from data distributed across multiple medical centers without transmitting local data. However, in conventional FL, substantial heterogeneity across centers reduces the performance and stability of the global model.MethodsA privacy-preserving FL method is proposed for gigapixel WSI in computational pathology. Weakly supervised attention-based Multiple Instance Learning (MIL) is integrated with differential privacy to support training when only slide-level labels are available. Within each client, a multi-scale attention-based MIL method is used to conduct local training on histopathology WSIs, reducing the need for costly pixel-level annotation through a weakly supervised setting. During the federated update, local differential privacy is applied to limit the risk of sensitive information leakage. Random noise drawn from a Gaussian or Laplace distribution is added to model parameters after each client’s local training. Furthermore, a federated adaptive reweighting strategy is introduced to address the heterogeneity of pathological images across clients by dynamically balancing the influence of local data quantity and quality on each client’s aggregation weight.Results and DiscussionsThe proposed FL framework is evaluated on two clinical diagnostic tasks: Non-small Cell Lung Cancer (NSCLC) histologic subtyping and Breast Invasive Carcinoma (BRCA) histologic subtyping. As shown in (Table 1, Table 2, and Fig. 4), the proposed FL method (Ours with DP and Ours w/o DP) achieves higher accuracy and stronger generalization than localized models and other FL approaches. Its classification performance remains competitive even when compared with the centralized model (Fig. 3). These results indicate that privacy-preserving FL is a feasible and effective strategy for multicenter histopathology images and may reduce the performance degradation typically caused by data heterogeneity across centers. When the magnitude of added noise is controlled within a limited range, stable classification can also be achieved (Table 3). The two main components, the multiscale representation attention network and the federated adaptive reweighting strategy, each contribute to consistent performance improvement (Table 4). In addition, the proposed FL method maintains stable classification performance across different hyperparameter settings (Table 5, Table 6), confirming its robustness.ConclusionsThe proposed FL method addresses two central challenges in multicenter computational pathology: the presence of data silos and concerns over privacy. It also alleviates the performance degradation caused by inter-center data heterogeneity. As balancing model accuracy with privacy protection remains a key challenge, future work focuses on developing methods that preserve privacy while sustaining stable classification performance.

创建时间：

2026-04-16

5,000+

优质数据集

54 个

任务类型

进入经典数据集