FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer

Sheyun Zhang; Ruichun Gu; Chaofeng Li; Zhijian Dong; Hefei Wang

doi:10.32604/cmc.2026.081065

Open Access

ARTICLE

Sheyun Zhang, Ruichun Gu^*, Chaofeng Li, Zhijian Dong, Hefei Wang

School of Digital and Intelligence Industry, Inner Mongolia University of Science and Technology, Baotou, China

* Corresponding Author: Ruichun Gu.

Computers, Materials & Continua 2026, 88(2), 97 https://doi.org/10.32604/cmc.2026.081065

Received 22 February 2026; Accepted 24 April 2026; Issue published 15 June 2026

Abstract

Federated learning (FL) enables collaborative model training without sharing raw data. In real-world applications, however, clients often exhibit statistical heterogeneity, missing classes, and long-tailed distributions, which can substantially degrade the generalization performance of conventional parameter aggregation and of some personalization approaches. Moreover, distillation- or alignment-based methods may suffer from unstable supervision and difficult optimization under highly heterogeneous settings. To address these issues, this paper proposes FKD-RTM (Heterogeneous Federated Knowledge Distillation Based on Residual-Enhanced Tree-to-MLP Knowledge Transfer). The key idea is to decouple local teaching from globally aggregatable student learning: a Gradient Boosting Decision Tree (GBDT) serves as a local teacher at each client and, operating on shared feature representations, provides more reliable soft supervision for a Multi-Layer Perceptron (MLP) student that supports efficient global aggregation and adaptation. To correct the prediction bias that remains after first-stage distillation, a residual enhancement mechanism learns complementary knowledge in the pre-normalization score domain and enables second-stage corrective learning. In addition, FKD-RTM partially fine-tunes the parameters of the feature extractor and the student model for personalized local adaptation; these personalized updates are excluded from global aggregation so that they do not contaminate the global model. Experiments on multiple datasets, including CIFAR-100, demonstrate that FKD-RTM consistently improves accuracy and generalization across diverse complex data settings and achieves a better trade-off between global and personalized performance.
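
The abstract gives no formulas, so the following is only a minimal sketch of the three ideas it names, not the authors' implementation. It assumes that the GBDT teacher is represented solely by its per-class output scores, that stage-one distillation uses a temperature-softened KL divergence, that the residual is the exact teacher-student gap in the score domain (in the paper it is learned), and that model parameters are plain scalars. All function names, the temperature, and the `strength` parameter are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over pre-normalization scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """Stage 1 (assumed form): student matches the teacher's softened output."""
    p_teacher = softmax(teacher_scores, temperature)
    p_student = softmax(student_scores, temperature)
    return kl_divergence(p_teacher, p_student)


def centre(scores):
    """Subtract the mean; softmax is unchanged by a constant shift."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def score_residual(teacher_scores, student_scores):
    """Gap left after stage 1, measured before normalization."""
    return [t - s for t, s in zip(centre(teacher_scores), centre(student_scores))]


def apply_residual(student_scores, residual, strength=1.0):
    """Stage 2 (assumed form): add the correction in the score domain."""
    return [s + strength * r for s, r in zip(student_scores, residual)]


def aggregate_shared(client_params, client_sizes, shared_keys):
    """Sample-weighted mean over shared keys only; other keys stay local."""
    total = sum(client_sizes)
    return {
        key: sum(p[key] * n for p, n in zip(client_params, client_sizes)) / total
        for key in shared_keys
    }


# Toy scores for one sample with three classes (made-up numbers).
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]

loss_before = distillation_loss(teacher, student)
corrected = apply_residual(student, score_residual(teacher, student))
loss_after = distillation_loss(teacher, corrected)
print("loss before correction:", loss_before)
print("loss after correction:", loss_after)

# Two clients; "head" stands in for a personalized parameter kept off the server.
clients = [{"w": 1.0, "head": 10.0}, {"w": 3.0, "head": -10.0}]
global_shared = aggregate_shared(clients, client_sizes=[1, 3], shared_keys=["w"])
print(global_shared)
```

With the full-strength exact residual, the corrected scores differ from the teacher's only by a constant, so the distillation loss drops to numerical zero. A learned residual would close this gap only partially. The aggregation step returns only the shared key, which mirrors the abstract's statement that personalized updates are excluded from global aggregation.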

Keywords

Federated learning; complex data; heterogeneous distillation; GBDT; residual enhancement

Cite This Article

APA Style

Zhang, S., Gu, R., Li, C., Dong, Z., Wang, H. (2026). FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer. Computers, Materials & Continua, 88(2), 97. https://doi.org/10.32604/cmc.2026.081065

Vancouver Style

Zhang S, Gu R, Li C, Dong Z, Wang H. FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer. Comput Mater Contin. 2026;88(2):97. https://doi.org/10.32604/cmc.2026.081065

IEEE Style

S. Zhang, R. Gu, C. Li, Z. Dong, and H. Wang, “FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer,” Comput. Mater. Contin., vol. 88, no. 2, pp. 97, 2026. https://doi.org/10.32604/cmc.2026.081065

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
