Open Access

ARTICLE

EdgeST-Fusion: A Cross-Modal Federated Learning and Graph Transformer Framework for Multimodal Spatiotemporal Data Analytics in Smart City Consumer Electronics

Mohammed M. Alenazi*
Faculty of Computers and Information Technology, Department of Computer Engineering, University of Tabuk, Tabuk, Saudi Arabia
* Corresponding Author: Mohammed M. Alenazi.
(This article belongs to the Special Issue: Integrating Computing Technology of Cloud-Fog-Edge Environments and its Application)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.075966

Received 11 November 2025; Accepted 05 January 2026; Published online 26 January 2026

Abstract

Multimodal spatiotemporal data from smart city consumer electronics present critical challenges, including cross-modal temporal misalignment, unreliable data quality, limited joint modeling of spatial and temporal dependencies, and weak resilience to adversarial updates. To address these limitations, EdgeST-Fusion is introduced as a cross-modal federated graph transformer framework for context-aware smart city analytics. The architecture integrates cross-modal embedding networks for modality alignment, graph transformer encoders for spatial dependency modeling, temporal self-attention for dynamic pattern learning, and adaptive anomaly detection to ensure data quality and security during aggregation. A privacy-preserving federated learning protocol with differential privacy guarantees enables collaborative model training without centralizing sensitive data. The framework employs data-quality-aware weighted aggregation to enhance robustness against noisy and malicious client updates. Experimental evaluation on the GeoLife, PeMS-Bay, and SmartHome+ datasets demonstrates that EdgeST-Fusion achieves a 21.8% improvement in prediction accuracy, a 35.7% reduction in communication overhead, and a 29.4% enhancement in security resilience compared to recent baselines. Real-world deployment across three smart city testbeds validates practical viability, with 90.0% average accuracy and sub-250 ms inference latency. The proposed framework remains feasible for deployment on heterogeneous and resource-constrained consumer electronics devices while maintaining strong privacy guarantees and scalability for large-scale urban environments.
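
The data-quality-aware weighted aggregation mentioned above can be illustrated with a minimal sketch. The abstract does not give the actual aggregation rule, so everything here is an assumption made for illustration: the function names `clip_update` and `aggregate`, the per-client `quality_scores`, the L2 clipping bound `max_norm`, and the Gaussian noise scale `noise_std`. In particular, `noise_std` is not calibrated to any privacy budget, so the sketch shows only the general shape of a clipped, noised, quality-weighted average and carries no formal differential privacy guarantee.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def aggregate(updates, quality_scores, max_norm=1.0, noise_std=0.0, rng=None):
    """Quality-weighted average of clipped client updates, plus Gaussian noise.

    updates        : list of equal-length lists of floats (one per client)
    quality_scores : hypothetical non-negative data-quality weights
    max_norm       : assumed L2 clipping bound applied to every update
    noise_std      : assumed noise scale (not tied to a privacy budget)
    """
    if not updates or len(updates) != len(quality_scores):
        raise ValueError("need one quality score per client update")
    total = sum(quality_scores)
    if total <= 0:
        raise ValueError("at least one quality score must be positive")

    rng = rng or random.Random()
    clipped = [clip_update(u, max_norm) for u in updates]

    result = []
    for j in range(len(clipped[0])):
        # Clients with low quality scores contribute proportionally less.
        avg = sum(q * u[j] for q, u in zip(quality_scores, clipped)) / total
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        result.append(avg + noise)
    return result
```

In this sketch, an update flagged as anomalous could be given a quality score of 0, which removes it from the average entirely, while clipping limits how far any single remaining client can shift the result.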

Keywords

Federated learning; graph transformer; spatiotemporal analytics; consumer electronics; smart cities; cross-modal fusion; edge computing; privacy preservation