A Comparative Benchmark of Deep Learning Architectures for AI-Assisted Breast Cancer Detection in Mammography Using the MammosighTR Dataset: A Nationwide Turkish Screening Study (2016–2022)
Nuh Azginoglu*
Department of Computer Engineering, Faculty of Engineering, Architecture and Design, Kayseri University, Kayseri, 38280, Türkiye
* Corresponding Author: Nuh Azginoglu. Email:
(This article belongs to the Special Issue: Advanced Image Segmentation and Object Detection: Innovations, Challenges, and Applications)
Computer Modeling in Engineering & Sciences https://doi.org/10.32604/cmes.2026.075834
Received 09 November 2025; Accepted 05 January 2026; Published online 19 January 2026
Abstract
Breast cancer screening programs rely heavily on mammography for early detection; however, diagnostic performance is strongly affected by inter-reader variability, breast density, and the limitations of conventional computer-aided detection systems. Recent advances in deep learning have enabled more robust and scalable solutions for large-scale screening, yet a systematic comparison of modern object detection architectures on nationally representative datasets remains limited. This study presents a comprehensive quantitative comparison of prominent deep learning–based object detection architectures for Artificial Intelligence-assisted mammography analysis using the MammosighTR dataset, developed within the Turkish National Breast Cancer Screening Program. The dataset comprises 12,740 patient cases collected between 2016 and 2022, annotated with BI-RADS categories, breast density levels, and lesion localization labels. A total of 31 models were evaluated, including One-Stage, Two-Stage, and Transformer-based architectures, under a unified experimental framework at both patient and breast levels. The results demonstrate that Two-Stage architectures consistently outperform One-Stage models, achieving approximately 2%–4% higher Macro F1-Scores and more balanced precision–recall trade-offs, with Double-Head R-CNN and Dynamic R-CNN yielding the highest overall performance (Macro F1 ≈ 0.84–0.86). This advantage is primarily attributed to the region proposal mechanism and improved class balance inherent to Two-Stage designs. One-Stage detectors exhibited higher sensitivity and faster inference, reaching Recall values above 0.88, but experienced minor reductions in Precision and overall accuracy (≈1%–2%) compared with Two-Stage models. Among Transformer-based architectures, Deformable DEtection TRansformer demonstrated strong robustness and consistency across datasets, achieving Macro F1-Scores comparable to CNN-based detectors (≈0.83–0.85) while exhibiting minimal performance degradation under distributional shifts. Breast density–based analysis revealed increased misclassification rates in medium-density categories (types B and C), whereas Transformer-based architectures maintained more stable performance in high-density type D tissue. These findings quantitatively confirm that both architectural design and tissue characteristics play a decisive role in diagnostic accuracy. Overall, the study provides a reproducible benchmark and highlights the potential of hybrid approaches that combine the accuracy of Two-Stage detectors with the contextual modeling capability of Transformer architectures for clinically reliable breast cancer screening systems.
Keywords
Deep learning; mammography; breast cancer detection; object detection; BI-RADS classification
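For readers unfamiliar with the headline metric in the abstract, the Macro F1-Score is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many cases it contains. The following is a minimal illustrative sketch in Python, not the evaluation code used in the study; the function name `macro_f1` and the class labels in the usage example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch only)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    # Count true positives, false positives, and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the case was not p
            fn[t] += 1  # true class t was missed

    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, not the class grouping used in the paper.
truth = ["normal", "normal", "benign", "malignant"]
preds = ["normal", "benign", "benign", "malignant"]
score = macro_f1(truth, preds)
```

Because the average is unweighted, a detector that performs poorly on a rare class is penalized as heavily as one that performs poorly on a common class, which is why the metric suits class-imbalanced screening data.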