Open Access

ARTICLE

FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer

Sheyun Zhang, Ruichun Gu*, Chaofeng Li, Zhijian Dong, Hefei Wang
School of Digital and Intelligence Industry, Inner Mongolia University of Science and Technology, Baotou, China
* Corresponding Author: Ruichun Gu.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.081065

Received 22 February 2026; Accepted 24 April 2026; Published online 03 June 2026

Abstract

Federated learning (FL) enables collaborative model training without sharing raw data. However, in real-world applications, clients often exhibit statistical heterogeneity, missing classes, and long-tailed distributions, which can substantially degrade the generalization performance of conventional parameter aggregation and some personalization approaches. Moreover, distillation or alignment-based methods may suffer from unstable supervision and difficult optimization under highly heterogeneous settings. To this end, this paper proposes a novel method called FKD-RTM (Heterogeneous Federated Knowledge Distillation Based on Residual-Enhanced Tree-to-MLP Knowledge Transfer). The key idea is to decouple local teaching from globally aggregatable student learning: we introduce a Gradient Boosting Decision Tree (GBDT) as a local teacher at each client, providing more reliable soft supervision based on shared feature representations for a Multi-Layer Perceptron (MLP) student that supports efficient global aggregation and adaptation. To further correct the prediction bias left after first-stage distillation, we introduce a residual enhancement mechanism. It learns complementary knowledge in the pre-normalization score domain and enables second-stage corrective learning. In addition, FKD-RTM performs partial parameter fine-tuning of the feature extractor and student model for personalized local adaptation. Personalized updates are excluded from global aggregation to avoid contaminating the global model. Experiments on multiple datasets, including CIFAR-100, demonstrate that the proposed FKD-RTM method consistently improves accuracy and generalization under diverse complex data settings and achieves a better trade-off between global and personalized performance.
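The abstract gives no equations, so the following is only a minimal NumPy sketch of one plausible reading of the three ingredients it names: soft-label distillation from a teacher to a student, a residual computed in the pre-normalization (logit) score domain, and aggregation restricted to shared parameters. Everything here is an assumption made for illustration: the function names are hypothetical, the teacher probabilities stand in for the outputs of a client's GBDT, the residual is taken as the centered gap between teacher log-probabilities and student logits, and the aggregation uses FedAvg-style sample-size weighting. None of this is confirmed as the authors' formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_probs, eps=1e-12):
    """Mean KL(teacher || student) over a batch (assumed first-stage loss)."""
    p = np.clip(teacher_probs, eps, 1.0)
    log_q = np.log(softmax(student_logits) + eps)
    return float(np.mean(np.sum(p * (np.log(p) - log_q), axis=1)))

def residual_targets(student_logits, teacher_probs, eps=1e-12):
    """Gap between teacher and student in the logit domain (assumed form).

    Logits are only defined up to a per-sample shift, so each row of the
    residual is centered to have zero mean.
    """
    teacher_scores = np.log(np.clip(teacher_probs, eps, 1.0))
    r = teacher_scores - student_logits
    return r - r.mean(axis=1, keepdims=True)

def aggregate_shared(client_params, client_sizes):
    """Sample-size-weighted average of the shared student parameters.

    Personalized (locally fine-tuned) parameters are simply never passed
    in, which is how this sketch keeps them out of the global model.
    """
    w = np.asarray(client_sizes, dtype=float)
    w = w / w.sum()
    return sum(wi * p for wi, p in zip(w, client_params))

# Toy batch: 2 samples, 3 classes. Teacher probabilities are made up.
teacher = np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]])
student = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

residual = residual_targets(student, teacher)
corrected = student + residual  # second-stage correction in score space
print(distillation_loss(student, teacher), distillation_loss(corrected, teacher))
```

In this toy setting the corrected logits reproduce the teacher distribution exactly, because the residual is applied in full. A second-stage learner trained to predict the residual from features would only approximate it, so the correction would be partial in practice.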

Keywords

Federated learning; complex data; heterogeneous distillation; GBDT; residual enhancement