TY  - EJOU
AU  - Zhang, Yuechuan
AU  - Zhang, Mingshu
AU  - Wei, Bin
AU  - Jin, Hongyu
AU  - Wang, Yaxuan

TI  - SparseMoE-MFN: A Sparse Attention and Mixture-of-Experts Framework for Multimodal Fake News Detection on Social Media
T2  - Computers, Materials & Continua

PY  - 2026
VL  - 87
IS  - 2
SN  - 1546-2226

AB  - Detecting fake news in multimodal and multilingual social media environments is challenging due to inherent noise, inter-modal imbalance, computational bottlenecks, and semantic ambiguity. To address these issues, we propose SparseMoE-MFN, a novel unified framework that integrates sparse attention with a sparse-activated Mixture-of-Experts (MoE) architecture. This framework aims to enhance the efficiency, inferential depth, and interpretability of multimodal fake news detection. SparseMoE-MFN leverages LLaVA-v1.6-Mistral-7B-HF for efficient visual encoding and Qwen/Qwen2-7B for text processing. The sparse attention module adaptively filters irrelevant tokens and focuses on key regions, reducing computational costs and noise. The sparse MoE module dynamically routes inputs to specialized experts (visual, language, cross-modal alignment) based on content heterogeneity. This expert specialization design boosts computational efficiency and semantic adaptability, enabling precise processing of complex content and improving performance on ambiguous categories. Evaluated on the large-scale, multilingual MR² dataset, SparseMoE-MFN achieves state-of-the-art performance. It obtains an accuracy of 86.7% and a macro-averaged F1 score of 0.859, outperforming strong baselines like MiniGPT-4 by 3.4% and 3.2%, respectively. Notably, it shows significant advantages in the “unverified” category. Furthermore, SparseMoE-MFN demonstrates superior computational efficiency, with an average inference latency of 89.1 ms and 95.4 GFLOPs, substantially lower than existing models. Ablation studies and visualization analyses confirm the effectiveness of both sparse attention and sparse MoE components in improving accuracy, generalization, and efficiency.
KW  - Fake news detection; multimodal; sparse attention; mixture-of-experts; interpretability; computational efficiency

DO  - 10.32604/cmc.2026.073996