Open Access
ARTICLE
A Lightweight YOLOv11 Framework for Multi-Class Retinal Disease Classification
1 Faculty of Computing, Riphah International University, Islamabad, Pakistan
2 Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Republic of Korea
3 Department of Computer Science and Engineering, Inha University, Incheon, Republic of Korea
* Corresponding Authors: Junaid Rashid; Jungeun Kim
Computer Modeling in Engineering & Sciences 2026, 147(3), 44 https://doi.org/10.32604/cmes.2026.081617
Received 05 March 2026; Accepted 21 May 2026; Issue published 30 June 2026
Abstract
Early detection of diabetic retinopathy (DR), media haze (MH), optic disc cupping (ODC), and glaucoma is crucial for preventing vision loss. However, timely diagnosis is often constrained by limited specialist availability and high diagnostic costs. This study proposes a You Only Look Once (YOLO)-based deep learning (DL) framework for the automated classification of fundus images into disease-specific categories. We unified diverse annotations from the Retinal Fundus Multi-Disease image Dataset (RFMiD), RFMiD2.0, and the DR Fundus Image Dataset (DR-FID) by standardizing annotation files and class labels. A custom filtering module was used to isolate single-pathology cases, and dataset issues such as missing or corrupted files were identified and resolved. To handle class imbalance, we applied oversampling and undersampling methods. The dataset was re-engineered for lightweight, accurate classification with YOLOv11, utilizing offline preprocessing tailored for retinal images. The dataset design leverages YOLOv11’s multi-class classification framework to achieve high performance on resource-constrained devices. This tailored approach outperforms preparing datasets solely through cloud-based platforms like Roboflow. The proposed model uses a lightweight YOLOv11 architecture, resulting in faster inference and lower memory requirements than conventional Convolutional Neural Networks (CNNs), such as Residual Networks (ResNets) or Visual Geometry Group (VGG) networks. Delivering high accuracy with minimal resource use, the model shows no signs of divergence or overfitting. Confusion matrices and class-wise metrics confirm consistent performance. The proposed framework achieves improved performance, with 94.78% accuracy, 96.12% specificity, 79.61% precision, 83.61% recall, and an 81.14% F1-score, demonstrating strong generalization to the internal held-out test set.
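The abstract reports accuracy, specificity, precision, recall, and F1-score derived from confusion matrices, but does not state the averaging scheme. As a minimal sketch, assuming one-vs-rest metrics that are macro-averaged over classes (an assumption, not necessarily the authors' procedure), these quantities can be computed from a multi-class confusion matrix as follows. The function name `macro_metrics` and the toy matrix are hypothetical and are not taken from the paper.

```python
def macro_metrics(cm):
    """Macro-averaged one-vs-rest metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. The averaging scheme is an assumption made for
    illustration; the abstract does not specify one.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # another class, predicted as k
        tn = total - tp - fn - fp
        accuracy = (tp + tn) / total
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((accuracy, specificity, precision, recall, f1))
    names = ("accuracy", "specificity", "precision", "recall", "f1")
    return {name: sum(m[i] for m in per_class) / n for i, name in enumerate(names)}


# Hypothetical 3-class confusion matrix (toy numbers, not results from the paper).
toy_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
metrics = macro_metrics(toy_cm)
```

Under this scheme, true negatives dominate each one-vs-rest count, so accuracy and specificity tend to be noticeably higher than precision and recall. The reported figures show the same pattern, though that alone does not confirm which averaging the authors used.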
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

