FKD-RTM: Heterogeneous Federated Knowledge Distillation Method Based on Residual-Enhanced Tree-to-MLP Transfer

Sheyun Zhang, Ruichun Gu*, Chaofeng Li, Zhijian Dong, Hefei Wang
School of Digital and Intelligence Industry, Inner Mongolia University of Science and Technology, Baotou, China
* Corresponding Author: Ruichun Gu.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.081065

Received 22 February 2026; Accepted 24 April 2026; Published online 03 June 2026

Abstract

Federated learning (FL) enables collaborative model training without sharing raw data. However, in real-world applications, clients often exhibit statistical heterogeneity, missing classes, and long-tailed distributions, which can substantially degrade the generalization performance of conventional parameter aggregation and some personalization approaches. Moreover, distillation or alignment-based methods may suffer from unstable supervision and difficult optimization under highly heterogeneous settings. To this end, this paper proposes a novel method called FKD-RTM (Heterogeneous Federated Knowledge Distillation Based on Residual-Enhanced Tree-to-MLP Knowledge Transfer). The key idea is to decouple local teaching from globally aggregatable student learning: we introduce a Gradient Boosting Decision Tree (GBDT) as a local teacher at each client, providing more reliable soft supervision based on shared feature representations for a Multi-Layer Perceptron (MLP) student that supports efficient global aggregation and adaptation. To further correct the prediction bias left after first-stage distillation, we introduce a residual enhancement mechanism. It learns complementary knowledge in the pre-normalization score domain and enables second-stage corrective learning. In addition, FKD-RTM performs partial parameter fine-tuning of the feature extractor and student model for personalized local adaptation. Personalized updates are excluded from global aggregation to avoid contaminating the global model. Experiments on multiple datasets, including CIFAR-100, demonstrate that the proposed FKD-RTM method consistently improves accuracy and generalization under diverse complex data settings and achieves a better trade-off between global and personalized performance.
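The three mechanisms named in the abstract (soft supervision from a tree teacher, residual correction in the pre-normalization score domain, and aggregation that skips personalized parameters) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract gives no formulas, so the KL-based distillation loss, the centered log-probability mapping used to form residuals, and all function and parameter names (`distillation_loss`, `residual_targets`, `aggregate_shared`, `personalized_keys`) are assumptions made for illustration. The GBDT teacher is represented only by the class probabilities it would output.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of raw (pre-normalization) scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_probs, student_scores, temperature=1.0):
    """KL(teacher || student) for one sample (assumed first-stage loss).

    teacher_probs stands in for the soft labels a local GBDT would produce.
    """
    student_probs = softmax(student_scores, temperature)
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0.0
    )


def residual_targets(teacher_probs, student_scores, eps=1e-12):
    """Teacher-student gap in the score domain (assumed residual definition).

    Softmax ignores a constant shift, so both score vectors are centered
    before subtracting. Adding the result to student_scores reproduces the
    teacher distribution after softmax.
    """
    teacher_scores = [math.log(max(t, eps)) for t in teacher_probs]
    t_mean = sum(teacher_scores) / len(teacher_scores)
    s_mean = sum(student_scores) / len(student_scores)
    return [
        (t - t_mean) - (s - s_mean)
        for t, s in zip(teacher_scores, student_scores)
    ]


def aggregate_shared(client_params, client_sizes, personalized_keys):
    """Sample-weighted average of shared parameters only.

    Keys listed in personalized_keys are left out of the returned global
    model, so local fine-tuning never reaches the server average.
    """
    total = sum(client_sizes)
    shared = [k for k in client_params[0] if k not in personalized_keys]
    return {
        k: [
            sum(n * p[k][i] for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(client_params[0][k]))
        ]
        for k in shared
    }


if __name__ == "__main__":
    teacher = [0.7, 0.2, 0.1]   # hypothetical GBDT soft labels
    student = [0.5, 0.4, 0.1]   # hypothetical MLP scores after stage one
    residual = residual_targets(teacher, student)
    corrected = [s + r for s, r in zip(student, residual)]
    print(distillation_loss(teacher, student))
    print(distillation_loss(teacher, corrected))
```

In this sketch the second-stage target is an additive correction to the student's scores, which is one plausible reading of "corrective learning in the pre-normalization score domain". A real system would train a model to predict the residual rather than compute it directly from the teacher.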

Keywords

Federated learning; complex data; heterogeneous distillation; GBDT; residual enhancement


