Shan Jiang1, Wenxin You2, Haoran Zhang3, Shichang Xuan3,*, Jiaxing Shen4
CMC-Computers, Materials & Continua, Vol.88, No.2, 2026, DOI:10.32604/cmc.2026.079321
15 June 2026
Abstract: Large Language Models (LLMs) have transformed natural language understanding and generation, yet adapting them to domain-specific and privacy-sensitive data remains challenging under centralized training. Federated Learning (FL) offers a promising alternative by enabling collaborative LLM training without sharing raw data. However, combining FL with LLMs introduces new challenges, including model size, device heterogeneity, non-IID data, and alignment requirements. This survey offers a structured overview of the federated LLM ecosystem. We present a comprehensive taxonomy encompassing system architectures, advanced data strategies for addressing heterogeneity, and retrieval-augmented generation in federated contexts. Additionally, ...