HMA-DER: A Hierarchical Attention and Expert Routing Framework for Accurate Gastrointestinal Disease Diagnosis
Sara Tehsin1, Inzamam Mashood Nasir1,*, Wiem Abdelbaki2, Fadwa Alrowais3, Khalid A. Alattas4, Sultan Almutairi5, Radwa Marzouk6
1 Faculty of Informatics, Kaunas University of Technology, Kaunas, 51368, Lithuania
2 College of Engineering and Technology, American University of the Middle East, Egaila, 54200, Kuwait
3 Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
4 Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, 23890, Saudi Arabia
5 Department of Computer Science, Applied College, Shaqra University, Shaqra, 15526, Saudi Arabia
6 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
* Corresponding Author: Inzamam Mashood Nasir. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.074416
Received 10 October 2025; Accepted 17 November 2025; Published online 09 December 2025
Abstract
Objective: Deep learning is increasingly used in gastrointestinal (GI) endoscopy computer-aided diagnosis for polyp segmentation and multi-class disease detection. Real-world deployment requires high accuracy, clinically relevant explanations, strong calibration, domain generalization, and efficiency. Current Convolutional Neural Network (CNN) and transformer models trade off boundary precision against global context, generate attention maps that fail to align with expert reasoning, degrade under cross-center shifts, and are poorly calibrated, which diminishes clinical trust.
Methods: HMA-DER is a hierarchical multi-attention architecture that uses dilation-enhanced residual blocks and an explainability-aware Cognitive Alignment Score (CAS) regularizer to directly align attribution maps with expert reasoning signals. The framework also includes components that improve robustness and an evaluation protocol covering accuracy, macro-averaged F1 score, Area Under the Receiver Operating Characteristic Curve (AUROC), calibration (Expected Calibration Error (ECE), Brier score), explainability (CAS, insertion/deletion AUC), cross-dataset transfer, and throughput.
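As an illustration of the two calibration metrics named above, the following minimal sketch computes a top-label ECE with equal-width confidence bins and a binary Brier score. It is not the authors' evaluation code: the function names, the choice of 10 bins, and the binary form of the Brier score are assumptions made for this example, since the abstract does not specify them.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumption: the bin count is not stated in the abstract
) -> float:
    """Top-label ECE: weighted mean of |accuracy - confidence| over bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin sample count, summed confidence, and number of correct predictions.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # Equal-width bins; a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += int(hit)

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        gap = abs(hits / count - conf_sum / count)
        ece += (count / n) * gap
    return ece


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Binary Brier score: mean squared gap between probability and 0/1 label."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)
```

Lower values are better for both metrics. ECE measures the gap between stated confidence and observed accuracy, while the Brier score also penalizes predictions that are confidently wrong.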
Results: HMA-DER achieves Dice Similarity Coefficient scores of 89.5% and 86.0% on Kvasir-SEG and CVC-ClinicDB, exceeding the strongest baseline by 1.9 and 1.7 points, respectively. On HyperKvasir and GastroVision, it achieves 86.4% and 85.3% macro-F1 and 94.0% and 93.4% AUROC, improving on the baseline by +1.4/+1.6 macro-F1 and +1.2/+1.1 AUROC. The ablation study shows that hierarchical attention gives the largest gain (+3.0), followed by CAS regularization (+2–3), dilation (+1.5–2.0), and residual connections (+2–3). Cross-dataset validation demonstrates competitive zero-shot transfer (e.g., KS→CVC Dice 82.7%), whereas multi-dataset training narrows the domain gap, yielding an 88.1% primary-metric average. With mixed-precision inference, HMA-DER processes 155 images per second while maintaining calibration.
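For readers unfamiliar with the segmentation metric reported above, the Dice Similarity Coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions for this example, not details taken from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice Similarity Coefficient for two flattened binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total
```

For example, a prediction that overlaps one pixel with the ground truth, with two foreground pixels in each mask, scores 2·1/(2+2) = 0.5.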
Conclusion: HMA-DER balances accuracy, explainability, robustness, and efficiency, supporting reliable GI computer-aided diagnosis in real-world clinical settings.
Keywords
Gastrointestinal image analysis; polyp segmentation; multi-attention deep learning; explainable AI; cognitive alignment score; cross-dataset generalization