Enhancement of Medical Imaging Technique for Diabetic Retinopathy: Realistic Synthetic Image Generation Using GenAI
1 Department of Information Technology, Marwadi University, Rajkot, 360003, India
2 Engineering Cluster, Singapore Institute of Technology, Singapore, 828608, Singapore
3 Electronics and Communication Engineering Department, Sapthagiri NPS University, Bangalore, 560057, India
4 School of Computer Science and Artificial Intelligence, SR University, Warangal, 506371, India
5 Electronics Telecommunication Engineering, J D College of Engineering Management, Nagpur, 441501, India
6 College of Computer Science, King Khalid University, Abha, 61421, Saudi Arabia
* Corresponding Author: Naim Ahmad. Email:
Computer Modeling in Engineering & Sciences 2025, 145(3), 4107-4127. https://doi.org/10.32604/cmes.2025.073387
Received 17 September 2025; Accepted 11 November 2025; Issue published 23 December 2025
Abstract
A phase-aware cross-modal framework is presented that synthesizes UWF_FA from non-invasive UWF_RI for diabetic retinopathy (DR) stratification. A curated cohort of 1198 patients (2915 UWF_RI and 17,854 UWF_FA images) with strict registration quality supports training across three angiographic phases (initial, mid, final). The generator is based on a modified pix2pixHD with an added Gradient Variance Loss to better preserve microvasculature, and is evaluated using MAE, PSNR, SSIM, and MS-SSIM on held-out pairs. Quantitatively, the mid phase achieves the lowest MAE (98.76), and fusing the synthetic phases with a Swin Transformer classifier improves downstream DR stratification over a UWF_RI-only baseline (Set D: AUC 0.910, accuracy 0.829).
Keywords
1 Introduction
As a common and serious complication of diabetes mellitus, diabetic retinopathy (DR) is one of the leading causes of vision loss and blindness globally. According to the International Diabetes Federation, over 100 million people are currently affected by DR, and this number is expected to rise with the global diabetes burden, making early detection and management a public health priority. Prevalence is higher among middle-aged and older adults than in younger populations [1]. People with diabetes often notice a gradual blurring of vision as the disease progresses; left untreated, this can lead to permanent visual impairment. DR manifests in a variety of ways, but one of the most important is the development of microvascular abnormalities, which can severely impair retinal function [2]. Depending on the degree of damage, DR is divided into two main forms: non-proliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR). Early and accurate diagnosis of these stages is crucial for effective treatment and the prevention of vision loss.
Medical imaging procedures such as fundus photography and Optical Coherence Tomography (OCT) form the backbone of the traditional diagnostic approach, but they must be interpreted by extensively trained ophthalmologists. Because retinal photographs vary widely in quality and complexity, manual screening is both laborious and error-prone. Moreover, fluorescein angiography (FA), the gold standard for visualizing retinal vasculature, requires intravenous dye injection, which can cause side effects such as nausea, skin discoloration, and, in rare cases, severe allergic reactions; these risks limit routine use, especially in large-scale screening programs. In recent years, machine learning (ML) techniques such as Bayesian classifiers, K-Means clustering, Probabilistic Neural Networks (PNN), and Support Vector Machines (SVM) have enabled automated DR diagnosis from raw retinal images [3]. These methods exploit features derived from blood vessels, haemorrhages, and exudates. In a comparative study using 350 fundus images (100 for training, 250 for testing), SVM achieved the highest classification accuracy (97.6%), outperforming the Bayesian classifier (94.4%) and PNN (89.6%) [4]. A further evaluation on 130 images from the DIARETDB0 dataset gave additional evidence of SVM's strong performance (95.38%) [5]. Despite these advances, limited dataset diversity, class imbalance, and data scarcity can impair the efficacy of deep learning and machine learning models. With this issue in mind, generative artificial intelligence (GenAI), particularly GANs and diffusion models, has attracted considerable attention. These models can generate synthetic images that closely resemble real patient data [6]. Such synthetic images can enhance model generalization, increase DR diagnosis accuracy, and enrich training datasets, mitigating major problems of data availability, privacy, and imbalance [7], and thereby improving DR diagnostics overall.
Additionally, common image preprocessing techniques such as denoising, edge detection, and contrast enhancement can miss small retinal features associated with DR. Modern DR classification methods are based on deep learning models such as Convolutional Neural Networks (CNNs); a key limitation of these models is their reliance on massive annotated datasets, which constrains scalability [8]. The ability to generate varied and realistic retinal images across all stages of DR has led to the emergence of GenAI models, including conditional GANs (cGANs), CycleGANs, and diffusion models, as transformative tools [9], improving the training of AI-powered diagnostic systems [10]. Generative models not only enhance model training but also allow disease progression to be modelled, enabling researchers to build prediction models for DR staging and severity assessment and opening new avenues for personalised and preventive treatment [11]. Furthermore, synthetic data helps allay privacy concerns, enabling greater data sharing and collaboration among ophthalmic researchers while preserving patient confidentiality [12].
In this work, we propose a GAN-based cross-modal translation framework that synthesizes ultra-widefield fluorescein angiography (UWF_FA) directly from ultra-widefield retinal images (UWF_RI). To the best of our knowledge, this is the first study to generate multi-phase synthetic UWF_FA (initial, mid, and final angiographic phases) and to validate both perceptual realism and clinical utility using quantitative metrics (MAE, PSNR, SSIM) and ophthalmologist Turing tests, while integrating the synthetic phases with a Swin Transformer classifier for improved DR stratification. An overview of the proposed pipeline is shown in Fig. 1. By eliminating the risks of dye injection and preserving angiographic cues (nonperfusion, leakage, peripheral vasculature), our approach offers a scalable, non-invasive pathway to safer DR screening and triage.

Figure 1: Block diagram of the proposed cross-modal framework. UWF_RI = ultra-widefield retinal imaging; UWF_FA = ultra-widefield fluorescein angiography. The framework performs preprocessing and registration of UWF_RI, phase-aware GAN synthesis of UWF_FA (initial, mid, final), and fusion with a Swin Transformer classifier for DR stratification
Overall, DR screening might be improved by combining generative models with traditional and modern AI-based diagnostic techniques, leading to more accessible, efficient, and accurate screening. Our study addresses a critical gap by delivering high-fidelity, phase-aware synthetic angiography from non-invasive inputs, demonstrating measurable gains in downstream DR grading and pointing toward broader clinical deployment.
2 Related Work
Research on automated diabetic retinopathy (DR) analysis has evolved from classical feature engineering on colour fundus photographs to powerful deep models that segment vessels and lesions, grade disease severity, and increasingly aim to recreate angiography-like information from non-invasive inputs [13]. Ultra-widefield (UWF) modalities and angiography (FA, OCTA) capture nonperfusion, leakage, and peripheral pathology that standard views can miss, which is clinically important for risk stratification and treatment decisions [14]. This has motivated cross-modal synthesis (fundus or UWF retinal images to FA-like outputs) to support safer, scalable screening and triage while preserving clinically salient vascular cues [15].
2.1 Traditional DR Classification Approaches
Early automated systems extracted hand-crafted descriptors (vessel maps, microaneurysms, haemorrhages, and exudates) from colour fundus images and applied SVMs, kNN, or Bayesian models to estimate DR severity. These pipelines required heavy preprocessing, were sensitive to illumination and device shifts, and often struggled on heterogeneous cohorts [16]. To alleviate data scarcity, works explored contrast enhancement and GAN-based augmentation to stabilise prognostication [17], but such strategies still operated on fundus-only inputs and lacked direct angiographic signals unless dye-based FA or OCTA was acquired [18]. Robust learning under imperfect labels (loss correction and noise-aware training) was also studied [19], and objective image-quality metrics such as SSIM, PSNR, MSE, and FSIM became standard for screening inputs and evaluating restorations [20].
2.2 Deep Learning for DR Screening and Grading
Modern screening is dominated by convolutional and transformer-based architectures, spanning segmentation (vessel/lesion localisation) and end-to-end grading [21]. Lesion-aware and multi-loss designs improved microvasculature delineation and robustness [22], while hybrid pipelines with vision transformers strengthened representation learning for synthesis-and-prediction workflows [23]. Transfer learning and representation pretraining trends (including vision-language supervision) further enhanced feature reuse across cohorts [24]. Persistent challenges include large annotation demands, class imbalance, and domain shift [25]. Diffusion models recently emerged as strong priors for denoising and augmentation on retinal imagery, improving fidelity and training stability [26]. Nevertheless, most pipelines are fundus-only and cannot directly model nonperfusion or leakage patterns that are best visualised on FA or OCTA; studies using UWF-FA demonstrate grading value but still require invasive dye [27].
2.3 Cross-Modal Translation and Generative Models for Retinal Imaging
Cross-modal retinal synthesis aims to recover angiography-like information from non-invasive images. Foundational conditional adversarial frameworks established paired image-to-image translation, and ophthalmic variants adapted architectures, objectives, and priors to encode retinal structure [28]. UWAT-GAN introduced an ultra-wide-angle, multi-scale generator to map UWF-RI to UWF-FA, but relied on only a few dozen matched pairs, limiting generalisation and phase modelling [29]. UWAFA-GAN increased data scale and integrated attention and registration enhancement to sharpen vasculature fidelity; however, it produced a single venous-phase frame and did not quantify downstream DR stratification gains. Other directions focused on targeted synthesis or controllability: lesion-centric DR-LL-GAN, Wasserstein-based retinal synthesis, controllable StyleGAN variants, and high-fidelity semantic manipulation. Some explored grading via the generative prior itself, but comprehensive multi-phase UWF-FA generation coupled with an integrated classifier and validated on downstream clinical tasks remains uncommon [30].
2.4 Generative AI in Broader Medical Imaging
Beyond ophthalmology, adversarial and diffusion paradigms have matured for translation and restoration. Unsupervised adversarial diffusion enabled cross-modality mapping without dense pairing [31], while image-to-image diffusion provided stable, controllable synthesis [32]. Disentangled and regularised GANs promoted factorised latent structure and interpretable controls [33]; surveys chart diffusion advantages in likelihood training, diversity, and perceptual quality [34]. At the same time, the community has cautioned against overreliance on single realism metrics, advocating multi-metric assessment [35] and task-based validation principles directly relevant to retinal synthesis and evaluation [36]. Recent hybrid generative-classification frameworks have further demonstrated the synergy between image synthesis and diagnostic modeling in diabetic retinopathy [37]. Gencer et al. (2025) combined GAN-based augmentation with denoising autoencoders and an EfficientNet-B0 classifier, achieving 99% accuracy, recall, and specificity on a high-resolution OCT dataset [38].
2.5 Clinical Context and Motivation
UWF imaging expands coverage to the periphery, where clinically significant lesions and nonperfusion can alter staging and management. Widefield OCTA studies link nonperfusion areas with DR severity and progression, underscoring the value of angiographic signals beyond colour fundus. Deep-learning pipelines trained directly on UWF-FA demonstrate strong grading potential, yet FA requires dye injection, with workflow implications and rare adverse reactions. These factors motivate non-invasive synthesis that preserves diagnostic utility while improving accessibility [39].
In summary, existing retinal synthesis studies often train on limited paired data, restrict outputs to a single angiographic phase (Table 1), or focus on narrow fields and lesion slices; few quantify downstream diagnostic gains with an integrated classifier. Registration errors and domain shift further hinder generalisation at UWF resolution [40]. Our work addresses these gaps with a large paired UWF-RI and UWF-FA corpus, a phase-aware generator that synthesises initial, mid, and final FA frames from a single UWF-RI input, and an integrated Swin Transformer classifier; we evaluate both multi-metric image quality and downstream DR stratification, linking image realism to clinical performance while avoiding dye-related risks.
3 Materials and Methods
This section details the datasets, acquisition context, splits, preprocessing, training protocols, and evaluation procedures used in our study. Algorithmic design and formulas are deferred to Section 4. A step-by-step view of the experimental workflow is summarized in Algorithm 1.
While Table 1 qualitatively summarizes prior GAN-based retinal synthesis studies, quantitative cross-model benchmarking was also performed to contextualize our framework. Metrics such as SSIM, PSNR, and AUC were collated from published baselines (UWAT-GAN, UWAFA-GAN, DR-LL-GAN) where available and compared against our proposed phase-aware model. Our approach achieved notably higher SSIM (0.85 vs. 0.79–0.83) and PSNR than these published baselines.

3.1 Datasets and Acquisition Context
We used two datasets that play complementary roles in development and external validation. The primary ultra-widefield (UWF) cohort, comprising 1198 patients (2915 UWF_RI and 17,854 UWF_FA images), supported generator training and fidelity evaluation, while Messidor-2 provided external labels for downstream DR stratification.
• Primary development dataset (ODIR). For training and tuning the cross-modal generator, we used 3800 ultra-widefield retinal images (UWF_RI) and 1100 ultra-widefield fluorescein angiography (UWF_FA) images. Images were captured at a native resolution of 3500 × 3500 pixels.
• External evaluation dataset (Messidor-2). For downstream DR stratification, we used Messidor-2, labeled on the international clinical DR scale: 0 (no DR), 1 (mild NPDR), 2 (moderate NPDR), 3 (severe NPDR), 4 (PDR) [49].
Angiographic timing was organized in three phases for UWF_FA: initial (20–50 s post-injection), mid (1–4 min), and final (>4 min). Unless noted, training emphasized post-venous frames, while validation and testing covered all phases.
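For illustration, the phase binning can be applied directly to per-frame timestamps; a minimal sketch in Python (the function name and frame records are hypothetical, while the thresholds follow the definitions above):

```python
def fa_phase(seconds_post_injection: float) -> str:
    """Map a UWF_FA frame's acquisition time to an angiographic phase.

    Thresholds follow the study definition: initial 20-50 s,
    mid 1-4 min, final > 4 min post-injection.
    """
    if 20 <= seconds_post_injection <= 50:
        return "initial"
    if 60 <= seconds_post_injection <= 240:
        return "mid"
    if seconds_post_injection > 240:
        return "final"
    return "unassigned"  # e.g., transit frames between 50 and 60 s
```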
3.2 Inclusion, Exclusion, and Privacy
Frames with severe eyelid/iris occlusion, motion blur, or illumination artifacts were excluded during quality review. For paired UWF_RI–UWF_FA sequences (same eye and visit), only pairs passing stringent registration (see Section 4) were retained. Pairs with Dice below 0.6 were discarded, removing approximately 7.3% of candidates. All data were de-identified and used solely for research purposes.
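A minimal sketch of the Dice-based pair filter, assuming binary field-of-view masks for the RI image and the warped FA frame are already available (mask extraction itself is not shown):

```python
import numpy as np

def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice overlap between two boolean masks of equal shape."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * intersection / denom if denom > 0 else 0.0

def keep_pair(ri_mask: np.ndarray, warped_fa_mask: np.ndarray,
              threshold: float = 0.6) -> bool:
    """Retain a UWF_RI-UWF_FA pair only if registration overlap is adequate."""
    return dice_coefficient(ri_mask, warped_fa_mask) >= threshold
```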
3.3 Data Partitions and Sampling
Patient-level splitting (3:1:1 train:validation:test) was used to avoid leakage while maintaining DR class balance and FA phase balance. The same policy was applied to Messidor-2. All test results reflect a single held-out set with no overlap with training.
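Patient-level partitioning can be realized, for example, with scikit-learn's GroupShuffleSplit, which keeps all images of one patient in a single partition; a sketch under the 3:1:1 ratio applied in two stages (the exact tooling is an assumption, the paper does not name a library):

```python
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(image_ids, patient_ids, seed=42):
    """3:1:1 train/val/test split with no patient appearing in two splits."""
    # Stage 1: hold out 40% of patients, then halve the holdout into val/test.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.4, random_state=seed)
    train_idx, rest_idx = next(outer.split(image_ids, groups=patient_ids))
    rest_groups = [patient_ids[i] for i in rest_idx]
    inner = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=seed)
    val_rel, test_rel = next(inner.split(rest_idx, groups=rest_groups))
    return train_idx, rest_idx[val_rel], rest_idx[test_rel]
```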
3.4 Preprocessing and Normalization
Preprocessing included mask-based cropping to suppress periocular artifacts, resolution harmonization, intensity normalization to [0, 1], and optional histogram matching for cross-session consistency. For generation, UWF_RI and UWF_FA were resized to 1080 × 1080 pixels.
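A sketch of the per-image preprocessing described above, using scikit-image and assuming single-channel frames (mask-based cropping is applied beforehand and not shown; the reference image for matching is an assumption):

```python
import numpy as np
from skimage.exposure import match_histograms
from skimage.transform import resize

def preprocess(img: np.ndarray, reference: np.ndarray | None = None,
               size: int = 1080) -> np.ndarray:
    """Resize, optionally histogram-match, and intensity-normalize to [0, 1]."""
    out = resize(img, (size, size), preserve_range=True).astype(np.float32)
    if reference is not None:  # cross-session consistency
        out = match_histograms(out, reference)
    lo, hi = out.min(), out.max()
    return (out - lo) / (hi - lo + 1e-8)
```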
3.5 Training Protocols: Generation
Training was performed on three phase-specific generators (initial, mid, final) to produce UWF_FA from UWF_RI. Data augmentation included random resized cropping (scale 0.4–3.0), horizontal/vertical flips, and random rotations within 0–30 degrees. Each generator trained for 45 epochs (batch size 6, learning rate 0.00015).
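Expressed in torchvision, the augmentation policy might look as follows (a sketch; for paired RI/FA supervision the same random parameters would need to be applied to both images, e.g., via the functional API):

```python
import torchvision.transforms as T

# Training augmentations mirroring the protocol above: random resized crop
# over a wide scale range, flips, and small rotations.
train_augment = T.Compose([
    T.RandomResizedCrop(1080, scale=(0.4, 3.0)),
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),
    T.RandomRotation(degrees=30),
])
```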
3.6 Training Protocols: Classification
To quantify downstream value of synthetic angiography, a Swin Transformer with an MLP head was trained on four compositions:
• Set A: Real UWF_RI only (baseline).
• Set B: UWF_RI plus synthetic initial-phase UWF_FA.
• Set C: UWF_RI plus synthetic initial and mid phases.
• Set D: UWF_RI plus synthetic initial, mid, and final phases.
All runs used identical splits, ImageNet initialization, and class balancing. Swin features were reduced to 1024-dimensional embeddings and passed to a fully connected layer with softmax. We used the Adam optimizer for all classification runs.
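A minimal sketch of the classifier under stated assumptions: a timm Swin backbone whose pooled features are 1024-dimensional (matching the embedding size above) and simple average fusion of the per-image embeddings. The paper specifies only a shared backbone, 1024-dimensional embeddings, and a fully connected softmax layer, so the backbone variant and fusion rule here are illustrative:

```python
import timm
import torch
import torch.nn as nn

class SwinFusionClassifier(nn.Module):
    """Shared Swin backbone over RI plus synthetic FA phases, fused for DR grading."""
    def __init__(self, num_classes: int = 5):
        super().__init__()
        # swin_base pooled features are 1024-dim, matching the embedding size above.
        self.backbone = timm.create_model(
            "swin_base_patch4_window7_224", pretrained=True, num_classes=0)
        self.head = nn.Linear(1024, num_classes)

    def forward(self, images: list[torch.Tensor]) -> torch.Tensor:
        # images: [RI] plus zero to three synthetic FA phases (Sets A-D).
        embeddings = torch.stack([self.backbone(x) for x in images], dim=0)
        fused = embeddings.mean(dim=0)   # simple average fusion (assumption)
        return self.head(fused)          # logits; softmax applied in the loss
```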
3.7 Loss Functions
In addition to the adversarial and pixel reconstruction objectives, a Gradient Variance Loss (GVL) term encourages local edge consistency by matching the variance of local image gradients between synthesized and real angiograms. Its formal definition and the symbols involved are given in Section 4.2.3.
3.8 Evaluation Protocols and Endpoints
For synthesis, we report MAE, PSNR, SSIM, and MS-SSIM between generated and ground-truth UWF_FA on the held-out test set. For DR classification, we report AUC, APR, F1, sensitivity, specificity, and accuracy, indicating statistical significance relative to Set A.
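The per-pair fidelity endpoints can be computed with scikit-image on the 8-bit scale (see Section 5.3); a sketch for grayscale frames (MS-SSIM requires an additional dependency such as pytorch-msssim and is omitted here):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fidelity_metrics(real: np.ndarray, fake: np.ndarray) -> dict:
    """MAE, PSNR, and SSIM between real and synthesized UWF_FA (uint8 arrays)."""
    diff = real.astype(np.float64) - fake.astype(np.float64)
    return {
        "mae": float(np.abs(diff).mean()),
        "psnr": peak_signal_noise_ratio(real, fake, data_range=255),
        "ssim": structural_similarity(real, fake, data_range=255),
    }
```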
3.9 Expert Reading and Turing-Style Assessment
This subjective assessment followed a double-blind, within-subject design in which experts were unaware of image origin (synthetic vs. real). Two ophthalmologists independently rated 50 test pairs across FA phases on a 5-point quality scale (1 best, 5 worst). Inter-rater reliability was assessed with Cohen's weighted kappa; the resulting kappa of 0.74 indicates substantial agreement. In total, 50 unique UWF-FA test pairs were presented per reader across the three angiographic phases (initial, mid, final), with real and synthesized samples balanced in equal proportion. A Turing-style task then asked readers to label FA images as real or synthetic: each ophthalmologist reviewed 25 randomly sampled images (12 synthetic, 13 real) under blinded conditions, and a third adjudicator resolved disagreements. Images were classified as "synthetic" if they exhibited atypical vessel branching, inconsistent dye diffusion, or unnatural texture patterns not characteristic of genuine angiograms. These predefined visual cues, established prior to reading, ensured consistent interpretation criteria across experts.
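The weighted kappa can be reproduced with scikit-learn; a sketch with hypothetical rating vectors (whether linear or quadratic weights were used is not stated, quadratic is assumed here):

```python
from sklearn.metrics import cohen_kappa_score

# Five-point quality ratings (1 best, 5 worst) from the two readers (hypothetical).
reader_a = [1, 2, 2, 3, 1, 4, 2, 2]
reader_b = [1, 2, 3, 3, 1, 3, 2, 1]

# Quadratic weighting penalizes large disagreements more than adjacent ones.
kappa = cohen_kappa_score(reader_a, reader_b, weights="quadratic")
print(f"weighted kappa = {kappa:.2f}")
```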
4 Proposed Methodology
A phase-aware cross-modal translation framework is proposed to synthesize ultra-widefield fluorescein angiography (UWF_FA) from ultra-widefield retinal images (UWF_RI), followed by an integrated classifier for diabetic retinopathy (DR) stratification. The pipeline comprises: (i) robust trans-modal registration to align paired UWF_RI–UWF_FA; (ii) three phase-specific generators for initial, mid, and final angiographic frames; (iii) an adversarial reconstruction objective augmented with a gradient variance loss to sharpen vascular detail; and (iv) a Swin Transformer head that fuses real UWF_RI with synthetic UWF_FA phases for downstream grading. See Fig. 2 for an overview.

Figure 2: Proposed model workflow: registration of paired UWF_RI – UWF_FA, phase-aware generators for initial/mid/final UWF_FA, and a Swin Transformer classifier that fuses real UWF_RI with synthetic UWF_FA
4.1 Trans-Modal Registration
Accurate alignment between paired UWF_RI and UWF_FA images is a prerequisite for pixel-level supervision. Keypoint-based matching in the spirit of accelerated nonlinear scale-space features [51], combined with RANSAC-based robust homography estimation [52], aligns each FA frame to its RI counterpart; pairs whose field-of-view masks overlap with a Dice score below 0.6 after warping are discarded (Section 3.2).
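A sketch of such a keypoint-plus-RANSAC registration step using OpenCV (the ratio-test threshold and reprojection tolerance are assumptions; the paper's exact registration pipeline may differ):

```python
import cv2
import numpy as np

def register_fa_to_ri(ri_gray: np.ndarray, fa_gray: np.ndarray) -> np.ndarray:
    """Estimate a homography warping the FA frame onto the RI image."""
    akaze = cv2.AKAZE_create()
    kp1, des1 = akaze.detectAndCompute(fa_gray, None)
    kp2, des2 = akaze.detectAndCompute(ri_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # robust to outlier matches
    return cv2.warpPerspective(fa_gray, H, ri_gray.shape[::-1])
```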
4.2 Phase-Aware Image-to-Image Translation
Separate generators ($G_{\text{init}}$, $G_{\text{mid}}$, $G_{\text{fin}}$) are trained for the three angiographic phases; each maps a UWF_RI input $x$ to a phase-specific synthetic frame $\hat{y}_p = G_p(x)$, supervised by the registered pairs from Section 4.1. The backbone follows the modified pix2pixHD design trained under the protocols of Section 3.5.
4.2.1 Adversarial Objective (LSGAN Form)
For phase $p \in \{\text{init}, \text{mid}, \text{fin}\}$, the generator's least-squares adversarial term is
$$\mathcal{L}_{\text{adv}}(G_p) = \mathbb{E}_{x}\big[(D_p(x, G_p(x)) - 1)^2\big],$$
where the conditional discriminator $D_p$ scores RI–FA pairs; its own objective is given in Section 4.2.4.
4.2.2 Pixel Reconstruction Objective
Absolute deviations are penalized to preserve tone and coarse structure:
$$\mathcal{L}_{\text{pix}}(G_p) = \mathbb{E}_{(x,\, y_p)}\big[\lVert y_p - G_p(x) \rVert_1\big].$$
4.2.3 Gradient Variance Loss for Vascular Fidelity
To emphasize high-frequency retinal structures (vessel trunks, capillary beds, microaneurysm borders), we augment supervision with a gradient variance loss ($\mathcal{L}_{\text{GV}}$) [53]:
$$\mathcal{L}_{\text{GV}}(G_p) = \mathbb{E}_{(x,\, y_p)}\big[\lVert \mathrm{Var}(\nabla_h G_p(x)) - \mathrm{Var}(\nabla_h y_p) \rVert_2 + \lVert \mathrm{Var}(\nabla_v G_p(x)) - \mathrm{Var}(\nabla_v y_p) \rVert_2\big],$$
with $\nabla_h$ and $\nabla_v$ denoting horizontal and vertical image gradients and $\mathrm{Var}(\cdot)$ the map of local patch-wise variances.
Intuitively, $\mathcal{L}_{\text{GV}}$ penalizes mismatches in local gradient variability: over-smoothed vasculature has low gradient variance, so the generator is explicitly rewarded for reproducing the sharp edge statistics of real angiograms.
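A PyTorch sketch of this loss consistent with the formulation above (Sobel gradients and an 8-pixel patch size are assumptions in the spirit of [53]):

```python
import torch
import torch.nn.functional as F

def gradient_variance_loss(fake: torch.Tensor, real: torch.Tensor,
                           patch: int = 8) -> torch.Tensor:
    """Match local variance of horizontal/vertical gradients between images.

    fake, real: (B, 1, H, W) tensors with H and W divisible by `patch`.
    """
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=fake.device).view(1, 1, 3, 3)   # Sobel, horizontal
    ky = kx.transpose(2, 3)                                   # Sobel, vertical

    def grad_var(img: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
        g = F.conv2d(img, kernel, padding=1)
        # Patch-wise variance: E[g^2] - (E[g])^2 over non-overlapping patches.
        mean = F.avg_pool2d(g, patch)
        mean_sq = F.avg_pool2d(g * g, patch)
        return mean_sq - mean * mean

    loss_h = torch.norm(grad_var(fake, kx) - grad_var(real, kx), p=2)
    loss_v = torch.norm(grad_var(fake, ky) - grad_var(real, ky), p=2)
    return loss_h + loss_v
```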
4.2.4 Full Generator Objective
For each phase $p$, the generator minimizes the composite objective
$$\mathcal{L}(G_p) = \mathcal{L}_{\text{adv}}(G_p) + \lambda_{\text{pix}}\,\mathcal{L}_{\text{pix}}(G_p) + \lambda_{\text{GV}}\,\mathcal{L}_{\text{GV}}(G_p),$$
while the discriminator minimizes the least-squares objective
$$\mathcal{L}(D_p) = \mathbb{E}_{(x,\, y_p)}\big[(D_p(x, y_p) - 1)^2\big] + \mathbb{E}_{x}\big[D_p(x, G_p(x))^2\big].$$
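For concreteness, a sketch of the per-iteration loss computations (the λ weights are illustrative assumptions, not reported values; gradient_variance_loss refers to the Section 4.2.3 sketch):

```python
import torch
import torch.nn.functional as F

LAMBDA_PIX, LAMBDA_GV = 100.0, 10.0  # illustrative weights (assumption)

def generator_step(G, D, x, y):
    """LSGAN generator term plus L1 and gradient-variance supervision."""
    fake = G(x)
    pred_fake = D(x, fake)
    adv = F.mse_loss(pred_fake, torch.ones_like(pred_fake))  # push D(x, fake) -> 1
    pix = F.l1_loss(fake, y)                                 # Section 4.2.2
    gv = gradient_variance_loss(fake, y)                     # Section 4.2.3 sketch
    return adv + LAMBDA_PIX * pix + LAMBDA_GV * gv

def discriminator_step(G, D, x, y):
    """LSGAN discriminator: real pairs -> 1, synthetic pairs -> 0."""
    with torch.no_grad():
        fake = G(x)
    pred_real, pred_fake = D(x, y), D(x, fake)
    return (F.mse_loss(pred_real, torch.ones_like(pred_real))
            + F.mse_loss(pred_fake, torch.zeros_like(pred_fake)))
```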
4.3 Integrated Classifier for DR Stratification
To quantify the diagnostic utility of synthetic angiography, the framework integrates a Swin Transformer classifier that fuses real UWF_RI with zero to three synthetic UWF_FA phases (Sets A–D in Section 3). Each available image (RI or FA phase) is fed through a shared Swin backbone $f(\cdot)$, producing a 1024-dimensional embedding; the embeddings are fused and passed to a fully connected softmax layer,
and the cross-entropy
$$\mathcal{L}_{\text{CE}} = -\sum_{c=0}^{4} y_c \log p_c$$
is minimized over DR classes $c \in \{0, \dots, 4\}$ on the international clinical scale (Section 3.1).
At test time, a single UWF_RI image is mapped by the three phase-specific generators to produce ($\hat{y}_{\text{init}}$, $\hat{y}_{\text{mid}}$, $\hat{y}_{\text{fin}}$), which are fused with the RI input for dye-free grading.
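At inference the whole pipeline is dye-free; a sketch assuming the phase generators and the SwinFusionClassifier from the earlier sketches:

```python
import torch

@torch.no_grad()
def grade_from_ri(ri: torch.Tensor, g_init, g_mid, g_fin, classifier):
    """Dye-free DR grading: synthesize three FA phases, then fuse with RI."""
    fa_phases = [g(ri) for g in (g_init, g_mid, g_fin)]
    logits = classifier([ri] + fa_phases)    # Set-D composition
    return torch.softmax(logits, dim=-1)     # probabilities over 5 DR classes
```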
4.4 Implementation Details
Generators and discriminators follow common residual and multi-scale design choices for high-resolution translation. Training uses the protocols in Section 3 (phase-specific models; 45 epochs; batch size 6; learning rate 0.00015; augmentations: random resized crop, flips, rotations). The classifier employs ImageNet initialization, Adam optimization, and the class-balancing policy of Section 3.6.
5 Results
5.1 Cohort Curation and Phase Distribution
After quality screening and registration filtering, the final cohort comprised 1198 patients contributing 2915 UWF_RI images and 17,854 UWF_FA images. On average, each subject contributed approximately two UWF_RI frames and fifteen UWF_FA frames collected at a single clinical visit, ensuring phase consistency for paired supervision. Demographically, 689 patients (57.5%) were male, with a median age of 56.34 years (range: 44.12–66.89 years). By angiographic stage, the dataset contained 1927 initial, 9892 mid, and 6035 final phase pairings, reflecting distinct circulation states exploited by our phase-aware generators. The overall workflow for data preparation, model training, and evaluation is summarized in Fig. 3.

Figure 3: Proposed workflow for data curation and model evaluation
5.2 Preprocessing Consistency
To enhance reproducibility across splits, all images underwent the same preprocessing sequence: normalization, optional histogram matching, and resizing to 512 × 512 pixels for classification.
5.3 Image Quality and Fidelity Metrics
Per-phase reconstruction quality is reported using MAE, PSNR, SSIM, and MS-SSIM on the held-out test pairs (Table 2). All fidelity metrics (MAE, PSNR, SSIM, MS-SSIM) were computed on the 8-bit [0–255] intensity scale prior to normalization, which explains MAE magnitudes near 100 despite normalized preprocessing. Differences reported in Table 2 are statistically significant.


Figure 4: Mean Absolute Error (MAE, computed on 8-bit scale) across angiographic phases for synthesized UWF-FA; lower values indicate better fidelity

Figure 5: Peak signal-to-noise ratio (PSNR, in dB) across angiographic phases for synthesized UWF-FA; higher values indicate better fidelity

Figure 6: Structural Similarity (SSIM) and Multi-Scale SSIM (MS-SSIM) for synthesized UWF-FA by phase; range [0, 1], higher scores denote closer perceptual similarity to real FA images
The mid stage attained the lowest MAE (Table 2), indicating the most faithful pixel-level reconstruction. A slight PSNR decrease from initial to final phase (Fig. 5) suggests modest noise or contrast dispersion later in circulation, yet absolute PSNR values remain high. SSIM values are stable (Fig. 6), while MS-SSIM dips in mid phase and rebounds in final, indicating transient loss of fine multi-scale structures during mid-phase contrast dynamics.
5.4 Expert Visual Assessment and Turing-Style Realism
Two ophthalmologists achieved substantial inter-rater agreement (Cohen's weighted κ = 0.74; Table 3) and, in the Turing-style task, frequently judged synthetic images to be real, supporting the perceptual realism of the generated angiograms (Fig. 7).


Figure 7: Inter-rater agreement (Cohen's weighted κ) between the two ophthalmologist readers
5.5 Downstream DR Stratification with Synthetic FA
Diagnostic utility was quantified by training a Swin-based classifier on four dataset compositions (Sets A–D; see Section 3). Performance metrics (AUC, APR, F1, sensitivity, specificity, accuracy) are reported in Table 4 with visual summaries in Figs. 8 and 9. Adding synthetic FA improved all task metrics relative to baseline (Set A), with full multi-phase integration (Set D) achieving the best overall performance (AUC 0.910, APR 0.792, accuracy 0.829). Significance tests indicate these gains are unlikely due to chance (Table 4). The monotonic rise in sensitivity from Sets A to D suggests that angiographic cues—nonperfusion, leakage, peripheral vasculature—provide complementary information to UWF_RI.


Figure 8: Sensitivity (Sens), Specificity (Spec), and Accuracy (Acc) across Sets A–D for synthetic UWF-FA integration

Figure 9: Area Under Curve (AUC), Average Precision Rate (APR), and F1-score across Sets A–D showing consistent improvement with multi-phase synthetic inputs
5.6 Qualitative Comparisons across Phases and Pathologies
Fig. 10 illustrates side-by-side comparisons of real and synthesized UWF_FA across initial, mid, and final phases for representative conditions (retinal macroaneurysm, normal retina, DR, and retinal vein occlusion). The synthesized images preserve macrovessel topology and lesion boundaries while reproducing phase-dependent contrast filling, supporting both perceptual realism and clinical interpretability.

Figure 10: Qualitative gallery: real vs. synthetic UWF_FA across initial, mid, and final phases for representative conditions (retinal macroaneurysm, normal retina, DR, and retinal vein occlusion)
Across quantitative metrics, expert review, and downstream classification, the proposed phase-aware synthesis yields high-fidelity UWF_FA that enhances DR stratification when combined with UWF_RI. Mid-phase reconstructions exhibit the lowest MAE; multi-scale structure is transiently reduced during mid-phase but recovers in final-phase views; and multi-phase synthetic integration delivers the strongest end-task performance.
6 Discussion
We presented a phase-aware cross-modal translation framework that synthesizes ultra-widefield fluorescein angiography (UWF_FA) from non-invasive ultra-widefield retinal images (UWF_RI) and demonstrated its clinical utility by integrating synthetic angiography into downstream diabetic retinopathy (DR) stratification. Across quantitative fidelity metrics, expert visual reading, and task performance, the multi-phase design produced high-quality angiograms that improved classification when fused with UWF_RI (see Section 5, Tables 2 and 4, and Fig. 10).
6.1 Comparison with Prior Work
Our study advances beyond earlier UWF translation approaches by scaling paired data, enforcing strict registration quality, and learning phase-specific generators. Prior systems such as UWAT-GAN focused on single venous-phase synthesis with limited paired samples, which constrains generalization and fails to model temporal angiographic dynamics. In contrast, we produce initial, mid, and final phases under a unified training and evaluation protocol, enabling richer depiction of nonperfusion and leakage patterns. This broader temporal coverage and the integration with a modern classifier strengthen the link between perceptual realism and clinical task performance.
6.2 Clinical Relevance of Non-Invasive Multi-Phase FA
UWF imaging captures peripheral pathology that can shift disease staging and management. Angiographic information remains the gold standard for visualizing capillary dropout and leakage but requires dye injection with workflow burden and rare adverse reactions. Our results show that synthetic UWF_FA can complement UWF_RI to improve DR stratification (Set D vs. Set A in Table 4), suggesting a practical path to safer, scalable screening and triage where invasive FA is unavailable or contraindicated.
6.3 Interpreting Quantitative Trends
The mid phase achieved the lowest MAE (Table 2), indicating the most faithful pixel-level reconstruction during peak circulation contrast. The slight PSNR decline from initial to final phase (Fig. 5) likely reflects dispersion and noise accumulation later in the sequence, yet absolute values remain high. SSIM is stable across phases (Fig. 6); the MS-SSIM dip in the mid phase suggests transient loss of very fine multi-scale structure during dynamic filling, which is restored by the final phase. These patterns are consistent with the known temporal physiology of FA and the trade-offs between global fidelity and micro-structure preservation at high resolution.
6.4 Expert Reading and Turing-Style Realism
Two ophthalmologists provided consistent quality ratings with substantial agreement (weighted κ = 0.74). In the Turing-style task, readers frequently classified synthetic images as real, indicating that the generated angiograms are perceptually convincing; the predefined visual cues established before reading (vessel branching, dye diffusion, texture) kept interpretation criteria consistent across experts (Section 3.9).
6.5 Downstream Utility: Why Multi-Phase Helps
Integrating synthetic FA incrementally improved all classification metrics relative to a UWF_RI-only baseline, with the full multi-phase setting (Set D) yielding the best AUC, APR, sensitivity, and accuracy (Table 4 and Figs. 8 and 9). The monotonic gain in sensitivity from Sets A to D indicates that angiographic cues provide complementary information to color imaging, particularly for detecting nonperfusion and peripheral vascular pathology not fully captured by UWF_RI. This supports the design choice of phase-specific generators and late-fusion classification.
6.6 Limitations and Threats to Validity
First, despite stringent registration (Section 4), peripheral distortion and residual artifacts can degrade synthesis near the edges of UWF fields, potentially affecting lesion depiction. Second, although we curated large paired data, domain shift across devices and sites may persist; prospective multi-center validation is needed. Third, Messidor-2 served as external data for classification with synthetic FA derived from UWF_RI, which may not fully reflect clinical deployment where heterogeneous imaging protocols are common. Fourth, our evaluation focused on DR; generalization to other retinal diseases (e.g., RVO, RAM) is suggested by qualitative examples (Fig. 10) but requires dedicated studies. Finally, reader studies included a limited number of experts; expanding to larger, geographically diverse panels will improve confidence intervals and reduce annotation bias.
6.7 Future Directions
Several directions can further raise clinical readiness. (i) Data and domain expansion: incorporate additional vendors, populations, and pathologies, with patient-level temporal follow-up to study progression. (ii) Robust learning: add domain adaptation and self-supervised pretraining to mitigate shift, and uncertainty estimation to flag unreliable synthesis. (iii) Architectural advances: explore diffusion or hybrid adversarial–diffusion objectives for sharper microvasculature while retaining stability. (iv) Lesion-aware evaluation: augment global metrics with vessel-wise and lesion-level endpoints, calibration, and decision-curve analysis to connect image quality with patient benefit. (v) Prospective trials: assess impact on referral decisions and treatment planning, compare against dye-based FA when ethically feasible, and evaluate operational benefits in teleophthalmology workflows.
In summary, phase-aware synthesis of UWF_FA from UWF_RI yields high-fidelity, perceptually convincing angiograms and confers measurable gains in DR stratification when integrated with a transformer-based classifier. By reducing reliance on dye injection while preserving angiographic insight, the approach offers a scalable path to safer, more accessible screening. Continued work on domain robustness, lesion-aware validation, and prospective clinical studies will be key to realizing translational impact at scale.
7 Conclusion
A phase-aware cross-modal framework was presented that synthesizes ultra-widefield fluorescein angiography (UWF_FA) from non-invasive UWF_RI and demonstrated clinical utility for diabetic retinopathy (DR) stratification. The approach combines strict trans-modal registration, three phase-specific generators (initial, mid, final), and a composite objective augmented with a gradient variance loss to better preserve vascular detail. When fused with a transformer-based classifier, multi-phase synthetic angiography consistently outperformed a UWF_RI-only baseline, with the full-phase setting (Set D) achieving the strongest end-task metrics (Table 4 and Figs. 8 and 9).
From a fidelity standpoint, mid-phase synthesis attained the lowest MAE, while SSIM remained stable and MS-SSIM exhibited a transient mid-phase dip consistent with dynamic contrast filling; expert review showed substantial agreement and frequent real-synthetic confusions, supporting perceptual realism (Tables 2 and 3 and Figs. 4–7). Limitations include residual peripheral distortions, potential domain shift across devices and centers, and a modest reader cohort; broader disease coverage beyond DR also warrants formal evaluation.
Future work will target robustness via domain adaptation and self/weak supervision, hybrid adversarial-diffusion objectives for sharper microvasculature, uncertainty calibration to flag unreliable synthesis, lesion- and vessel-level endpoints with decision-curve analysis, and prospective multi-center studies to quantify impact on referral and treatment workflows. In sum, phase-aware UWF_FA synthesis offers a scalable, dye-free pathway to enrich screening and triage, improving DR stratification while mitigating the risks of invasive angiography, and holds promise for safe, accessible retinal care at scale.
Acknowledgement: The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/417/46.
Funding Statement: The work was funded by the Deanship of Research and Graduate Studies at King Khalid University through Large Research Project under grant number RGP2/417/46.
Author Contributions: The authors confirm contribution to the paper as follows: major contribution to the writing of the article, model building, data extraction and main writing: Damodharan Palaniappan, Tan Kuan Tak and K. Vijayan; overall design and execution: Damodharan Palaniappan, Tan Kuan Tak, K. Vijayan, Balajee Maram, Pravin R Kshirsagar and Naim Ahmad; technical support in data processing and analysis: Damodharan Palaniappan, Tan Kuan Tak and K. Vijayan. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: The data that support the findings of this study are available from the Corresponding Author, Naim Ahmad, upon reasonable request.
Ethics Approval: This study was conducted in accordance with the Declaration of Helsinki and approved by the institutional review board. All 1198 patient images were fully de-identified before analysis. Public datasets (ODIR, Messidor-2) are open-access and anonymized; therefore, no additional consent was required.
Conflicts of Interest: The authors declare no conflicts of interest to report regarding the present study.
References
1. Chatterjee S, Fruhling A, Kotiadis K, Gartner D. Towards new frontiers of healthcare systems research using artificial intelligence and generative AI. Health Syst. 2024;13(4):263–73. doi:10.1080/20476965.2024.2402128. [Google Scholar] [PubMed] [CrossRef]
2. Bajenaru L, Tomescu M, Grigorovici-Toganel I. Leveraging generative Artificial Intelligence for advanced healthcare solutions. Rom J Inf Technol Autom Control. 2024;34(3):149–64. [Google Scholar]
3. Bennani T. Advancing healthcare with generativeAI: a multifaceted approach to reliable medical information and innovation [Ph.D. thesis]. Cambridge, MA, USA: Massachusetts Institute of Technology; 2024. [Google Scholar]
4. Sai S, Gaur A, Sai R, Chamola V, Guizani M, Rodrigues JJPC. Generative AI for transformative healthcare: a comprehensive study of emerging models, applications, case studies, and limitations. IEEE Access. 2024;12:31078–106. doi:10.1109/access.2024.3367715. [Google Scholar] [CrossRef]
5. Yim D, Khuntia J, Parameswaran V, Meyers A. Preliminary evidence of the use of generative AI in health care clinical services: systematic narrative review. JMIR Med Inform. 2024;12(1):e52073. doi:10.2196/52073. [Google Scholar] [PubMed] [CrossRef]
6. Ling Ong JC, Michael C, Ng N, Elangovan K, Ting Tan NY, Jin L, et al. Generative AI and large language models in reducing medication related harm and adverse drug events—a scoping review. MedRxiv. 2024. doi:10.1101/2024.09.13.24313606. [Google Scholar] [CrossRef]
7. Teo ZL, Quek CWN, Wong JLY, Ting DSW. Cybersecurity in the generative artificial intelligence era. Asia Pac J Ophthalmol. 2024;13(4):100091. doi:10.1016/j.apjo.2024.100091. [Google Scholar] [PubMed] [CrossRef]
8. Chen R, Zhang W, Liu B, Chen X, Xu P, Liu S, et al. EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis. arXiv:2411.10004. 2024. [Google Scholar]
9. Ahmed T, Choudhury S. An integrated approach to AI-generated content in e-health. arXiv:2501.16348. 2025. [Google Scholar]
10. Gupta M, Gupta S, Palanisamy G, Nisha JS, Goutham V, Kumar SA, et al. A comprehensive survey on detection of ocular and non-ocular diseases using color fundus images. IEEE Access. 2024;12:194296–321. doi:10.1109/access.2024.3517700. [Google Scholar] [CrossRef]
11. Mochan A, Farinha J, Bailey G, Rodriguez L, Zanca F, Pólvora A, et al. Imaging the Future-Horizon scanning for emerging technologies and breakthrough innovations in the field of medical imaging and AI. Luxembourg: Publications Office of the European Union; 2024. [Google Scholar]
12. Casu M, Guarnera L, Caponnetto P, Battiato S. GenAI mirage: the impostor bias and the deepfake detection challenge in the era of artificial illusions. Forensic Sci Int Digit Investig. 2024;50(3):301795. doi:10.1016/j.fsidi.2024.301795. [Google Scholar] [CrossRef]
13. Kawai K, Murakami T, Mori Y, Ishihara K, Dodo Y, Terada N, et al. Clinically significant nonperfusion areas on widefield OCT angiography in diabetic retinopathy. Ophthalmol Sci. 2023;3(1):100241. doi:10.1016/j.xops.2022.100241. [Google Scholar] [PubMed] [CrossRef]
14. Ashraf M, Shokrollahi S, Salongcay RP, Aiello LP, Silva PS. Diabetic retinopathy and ultrawide field imaging. Semin Ophthalmol. 2020;35(1):56–65. doi:10.1080/08820538.2020.1729818. [Google Scholar] [PubMed] [CrossRef]
15. Wang X, Ji Z, Ma X, Zhang Z, Yi Z, Zheng H, et al. Automated grading of diabetic retinopathy with ultra-widefield fluorescein angiography and deep learning. J Diabetes Res. 2021;2021(1):2611250. doi:10.1155/2021/2611250. [Google Scholar] [PubMed] [CrossRef]
16. Akram MU, Khalid S, Tariq A, Khan SA, Azam F. Detection and classification of retinal lesions for grading of diabetic retinopathy. Comput Biol Med. 2014;45(2):161–71. doi:10.1016/j.compbiomed.2013.11.014. [Google Scholar] [PubMed] [CrossRef]
17. Alwakid G, Gouda W, Humayun M. Enhancement of diabetic retinopathy prognostication using deep learning, CLAHE, and ESRGAN. Diagnostics. 2023;13(14):2375. doi:10.3390/diagnostics13142375. [Google Scholar] [PubMed] [CrossRef]
18. Khan MH, Mungloo-Dilmohamud Z, Jhumka K, Mungloo NZ, Pena-Reyes C. Investigating on data augmentation and generative adversarial networks (GANs) for diabetic retinopathy. In: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME); 2022 Nov 16–18; Maldives. Piscataway, NJ, USA: IEEE. p. 1–5. [Google Scholar]
19. Patrini G, Rozza A, Krishna Menon A, Nock R, Qu L. Making deep neural networks robust to label noise: a loss correction approach. In: Proceedings of the 2017 IEEE conference on computer vision and pattern recognition; 2017 Jul 21–26; Honolulu, HI, USA. Piscataway, NJ, USA: IEEE. p. 1944–52. [Google Scholar]
20. Dgani Y, Greenspan H, Goldberger J. Training a neural network based on unreliable human annotation of medical images. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). Piscataway, NJ, USA: IEEE; 2018. p. 39–42. [Google Scholar]
21. Mehralian M, Karasfi B. RDCGAN: unsupervised representation learning with regularized deep convolutional generative adversarial networks. In: 2018 9th Conference on Artificial Intelligence and Robotics and 2nd Asia-Pacific International Symposium. Piscataway, NJ, USA: IEEE; 2018. p. 31–8. [Google Scholar]
22. Park KB, Choi SH, Lee JY. M-GAN: retinal blood vessel segmentation by balancing losses through stacked deep fully convolutional networks. IEEE Access. 2020;8:146308–22. doi:10.1109/access.2020.3015108. [Google Scholar] [CrossRef]
23. Kamran SA, Hossain KF, Tavakkoli A, Zuckerbrod SL, Baker SA. Vtgan: semi-supervised retinal image synthesis and disease prediction using vision transformers. In: Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision; 2021 Oct 10–17; Montreal, QC, Canada. Piscataway, NJ, USA: IEEE. p. 3235–45. [Google Scholar]
24. Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, et al. Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning. Westminster, UK: PMLR; 2021. p. 8748–63. [Google Scholar]
25. Kong L, Lian C, Huang D, Li Z, Hu Y, Zhou Q. Breaking the dilemma of medical image-to-image translation. Adv Neural Inf Process Syst. 2021;34:1964–78. [Google Scholar]
26. Nichol AQ, Dhariwal P. Improved denoising diffusion probabilistic models. In: International Conference on Machine Learning. Westminster, UK: PMLR; 2021. p. 8162–71. [Google Scholar]
27. Nichol A, Dhariwal P, Ramesh A, Shyam P, Mishkin P, McGrew B, et al. Glide: towards photorealistic image generation and editing with text-guided diffusion models. arXiv:2112.10741. 2021. [Google Scholar]
28. Krichen M. Generative adversarial networks. In: 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT). Piscataway, NJ, USA: IEEE; 2023. p. 1–7. [Google Scholar]
29. Isola P, Zhu JY, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE; 2017. p. 1125–34. [Google Scholar]
30. Poles I, D’arnese E, Cellamare LG, Santambrogio MD, Yi D. Repurposing the image generative potential: exploiting GANs to grade diabetic retinopathy. In: Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE; 2024. p. 2305–14. [Google Scholar]
31. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P. Infogan: interpretable representation learning by information maximizing generative adversarial nets. In: NIPS’16: Proceedings of the 30th International Conference on Neural Information Processing Systems; 2016 Dec 5–10; Barcelona Spain. Red Hook, NY, USA: Curran Associates Inc. p. 2180–8. [Google Scholar]
32. Yang L, Zhang Z, Song Y, Hong S, Xu R, Zhao Y, et al. Diffusion models: a comprehensive survey of methods and applications. ACM Comput Surv. 2023;56(4):1–39. doi:10.1145/3626235. [Google Scholar] [CrossRef]
33. Saharia C, Chan W, Chang H, Lee C, Ho J, Salimans T, et al. Palette: image-to-image diffusion models. In: ACM SIGGRAPH 2022 Conference Proceedings. New York, NY, USA: ACM; 2022. p. 1–10. [Google Scholar]
34. Ozbey M, Dalmaz O, Dar SUH, Bedel HA, Ozturk S, Gungor A, et al. Unsupervised medical image translation with adversarial diffusion models. IEEE Trans Med Imaging. 2023;42(12):3524–39. doi:10.1109/tmi.2023.3290149. [Google Scholar] [PubMed] [CrossRef]
35. Binkowski M, Sutherland DJ, Arbel M, Gretton A. Demystifying mmd gans. arXiv:1801.01401. 2018. [Google Scholar]
36. Chong MJ, Forsyth D. Effectively unbiased fid and inception score and where to find them. In: Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020 Jun 13–19; Seattle, WA, USA. Piscataway, NJ, USA: IEEE. p. 6070–9. [Google Scholar]
37. Sara U, Akter M, Uddin MS. Image quality assessment through FSIM, SSIM, MSE and PSNR—a comparative study. J Comput Commun. 2019;7(3):8–18. doi:10.4236/jcc.2019.73002. [Google Scholar] [CrossRef]
38. Gencer K, Gencer G, Ceran TH, Bilir AE, Doǧan M. Photodiagnosis with deep learning: a GAN and autoencoder-based approach for diabetic retinopathy detection. Photodiagn Photodyn Ther. 2025;53(22):104552. doi:10.1016/j.pdpdt.2025.104552. [Google Scholar] [PubMed] [CrossRef]
39. Kalisapudi SSA, Raj VD, Vanam S, Anne JC. Synthesizing realistic ARMD fundus images using generative adversarial networks (GANs). In: International Conference on Intelligent Computing and Communication. Cham, Switzerland: Springer; 2022. p. 587–99. [Google Scholar]
40. Alghamdi M, Abdel-Mottaleb M. Retinal image augmentation using composed GANs. Eng Technol Appl Sci Res. 2024;14(6):18525–31. doi:10.48084/etasr.8964. [Google Scholar] [CrossRef]
41. Fang Z, Chen Z, Wei P, Li W, Zhang S, Elazab A, et al. UWAT-GAN: fundus fluorescein angiography synthesis via ultra-wide-angle transformation multi-scale GAN. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham, Switzerland: Springer; 2023. p. 745–55. [Google Scholar]
42. Ge R, Fang Z, Wei P, Chen Z, Jiang H, Elazab A, et al. UWAFA-GAN: ultra-wide-angle fluorescein angiography transformation via multi-scale generation and registration enhancement. IEEE J Biomed Health Inform. 2024;28(8):4820–9. doi:10.1109/jbhi.2024.3394597. [Google Scholar] [PubMed] [CrossRef]
43. Abbood SH, Abdull Hamed HN, Mohd Rahim MS, Alaidi AHM, Salim ALRikabi HTH. DR-LL Gan: diabetic retinopathy lesions synthesis using generative adversarial network. Int J Online Biomed Eng. 2022;18(3):151–63. doi:10.3991/ijoe.v18i03.28005. [Google Scholar] [CrossRef]
44. Das S, Walia P. Enhancing early diabetic retinopathy detection through synthetic DR1 image generation: a StyleGAN3 approach. arXiv:2501.00954. 2025. [Google Scholar]
45. Anaya-Sanchez H, Altamirano-Robles L, Diaz-Hernandez R, Zapotecas-Martinez S. WGAN-GP for synthetic retinal image generation: enhancing sensor-based medical imaging for classification models. Sensors. 2024;25(1):167. doi:10.3390/s25010167. [Google Scholar] [PubMed] [CrossRef]
46. Pakdelmoez S, Omidikia S, Seyyedsalehi SA, Seyyedsalehi SZ. Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy. arXiv:2409.07422. 2024. [Google Scholar]
47. Liu J, Xu S, He P, Wu S, Luo X, Deng Y, et al. VSG-GAN: a high-fidelity image synthesis method with semantic manipulation in retinal fundus image. Biophys J. 2024;123(17):2815–29. doi:10.1016/j.bpj.2024.02.019. [Google Scholar] [PubMed] [CrossRef]
48. Pour AM, Seyedarabi H, Jahromi SHA, Javadzadeh A. Automatic detection and monitoring of diabetic retinopathy using efficient convolutional neural networks and contrast limited adaptive histogram equalization. IEEE Access. 2020;8:136668–73. doi:10.1109/access.2020.3005044. [Google Scholar] [CrossRef]
49. Kaur J, Kaur P. UNIConv: an enhanced U-Net based InceptionV3 convolutional model for DR semantic segmentation in retinal fundus images. Concurr Comput Pract Exp. 2022;34(21):e7138. doi:10.1002/cpe.7138. [Google Scholar] [CrossRef]
50. Shi D, He S, Yang J, Zheng Y, He M. One-shot retinal artery and vein segmentation via cross-modality pretraining. Ophthalmol Sci. 2024;4(2):100363. doi:10.1016/j.xops.2023.100363. [Google Scholar] [PubMed] [CrossRef]
51. Alcantarilla PF, Nuevo J, Bartoli A. Fast explicit diffusion for accelerated features in nonlinear scale spaces. In: Proceedings of the British Machine Vision Conference (BMVC); 2011 Aug 29–Sep 2; Dundee, UK. Durham, UK: BMVA Press; 2011. p. 1–11. [Google Scholar]
52. Fischler MA, Bolles RC. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM. 1981;24(6):381–95. doi:10.1145/358669.358692. [Google Scholar] [CrossRef]
53. Abrahamyan L, Truong AM, Philips W, Deligiannis N. Gradient variance loss for structure-enhanced image super-resolution. In: ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Piscataway, NJ, USA: IEEE; 2022. p. 3219–23. [Google Scholar]
Copyright © 2025 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

