Search Results (13)
  • Open Access

    ARTICLE

    CAFE-GAN: CLIP-Projected GAN with Attention-Aware Generation and Multi-Scale Discrimination

    Xuanhong Wang1, Hongyu Guo1, Jiazhen Li1, Mingchen Wang1, Xian Wang1, Yijun Zhang2,*

    CMC-Computers, Materials & Continua, Vol.86, No.1, pp. 1-19, 2026, DOI:10.32604/cmc.2025.069482 - 10 November 2025

    Abstract Over the past decade, large-scale pre-trained autoregressive and diffusion models have rejuvenated the field of text-guided image generation. However, these models require enormous datasets and parameters, and their multi-step generation processes are often inefficient and difficult to control. To address these challenges, we propose CAFE-GAN, a CLIP-Projected GAN with Attention-Aware Generation and Multi-Scale Discrimination, which incorporates a pre-trained CLIP model along with several key architectural innovations. First, we embed a coordinate attention mechanism into the generator to capture long-range dependencies and enhance feature representation. Second, we introduce a trainable linear projection layer after the CLIP text…
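    The coordinate attention step mentioned in this abstract can be illustrated with a minimal PyTorch sketch. The block below follows the generic coordinate-attention design (pooling along each spatial axis and producing per-axis gates); the channel count and reduction ratio are illustrative assumptions, not CAFE-GAN's actual configuration.

```python
# Minimal sketch of a coordinate attention block (illustrative only).
# Pools features along H and W separately, mixes them, and produces
# per-axis attention maps that reweight the input feature map.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # keep H, squeeze W
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # keep W, squeeze H
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                        # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)    # (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w

x = torch.randn(2, 64, 32, 32)
print(CoordinateAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```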

  • Open Access

    ARTICLE

    Integrating Speech-to-Text for Image Generation Using Generative Adversarial Networks

    Smita Mahajan1, Shilpa Gite1,2, Biswajeet Pradhan3,*, Abdullah Alamri4, Shaunak Inamdar5, Deva Shriyansh5, Akshat Ashish Shah5, Shruti Agarwal5

    CMES-Computer Modeling in Engineering & Sciences, Vol.143, No.2, pp. 2001-2026, 2025, DOI:10.32604/cmes.2025.058456 - 30 May 2025

    Abstract The development of generative architectures has resulted in numerous novel deep-learning models that generate images using text inputs. However, humans naturally use speech for visualization prompts. Therefore, this paper proposes an architecture that integrates speech prompts as input to an image-generation Generative Adversarial Network (GAN) model, leveraging Speech-to-Text translation along with the CLIP + VQGAN model. The proposed method involves translating speech prompts into text, which is then used by the Contrastive Language-Image Pretraining (CLIP) + Vector Quantized Generative Adversarial Network (VQGAN) model to generate images. This paper outlines the steps required to implement such a…
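    The speech-to-image pipeline outlined here can be sketched at a high level as follows. The `transcribe`, `clip_model`, and `vqgan` objects are placeholders standing in for whatever speech-to-text, CLIP, and VQGAN implementations are used; their interfaces are assumptions for illustration, not the paper's code.

```python
# High-level sketch of the speech -> text -> image pipeline described above.
# `transcribe`, `clip_model`, and `vqgan` are hypothetical placeholders; the
# optimization loop below is the standard CLIP-guided VQGAN idea: adjust a
# latent so the decoded image matches the text embedding.
import torch

def generate_from_speech(audio, transcribe, clip_model, vqgan,
                         steps: int = 300, lr: float = 0.1):
    prompt = transcribe(audio)                      # speech-to-text stage
    text_emb = clip_model.encode_text(prompt)       # fixed target embedding
    z = vqgan.random_latent().requires_grad_(True)  # latent to optimize
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        image = vqgan.decode(z)                     # candidate image
        img_emb = clip_model.encode_image(image)
        # maximize cosine similarity between image and text embeddings
        loss = -torch.cosine_similarity(img_emb, text_emb, dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vqgan.decode(z).detach(), prompt
```

    In this sketch only the latent `z` is optimized; the CLIP and VQGAN weights stay frozen, so the text embedding acts as a fixed target that steers decoding.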

  • Open Access

    ARTICLE

    Frequency-Quantized Variational Autoencoder Based on 2D-FFT for Enhanced Image Reconstruction and Generation

    Jianxin Feng1,2,*, Xiaoyao Liu1,2

    CMC-Computers, Materials & Continua, Vol.83, No.2, pp. 2087-2107, 2025, DOI:10.32604/cmc.2025.060252 - 16 April 2025

    Abstract As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks due to their ease of embedding and representative capacity. However, existing VQ-VAEs often perform quantization in the spatial domain, ignoring global structural information and potentially suffering from codebook collapse and information coupling issues. This paper proposes a frequency-quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components…
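    The idea of quantizing feature maps in the frequency domain can be sketched as below. The codebook lookup is a generic nearest-neighbour vector quantization over (real, imaginary) pairs with a straight-through estimator; the actual FQ-VAE layer sizes and quantization rule may differ.

```python
# Illustrative sketch: 2D-FFT of feature maps, nearest-codebook quantization
# of each frequency component, then inverse FFT back to the spatial domain.
import torch
import torch.nn as nn

class FrequencyQuantizer(nn.Module):
    def __init__(self, num_codes: int = 512):
        super().__init__()
        # each code is a 2-D vector: (real part, imaginary part)
        self.codebook = nn.Parameter(torch.randn(num_codes, 2) * 0.1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        spec = torch.fft.fft2(feats)                         # complex spectrum
        flat = torch.stack([spec.real, spec.imag], dim=-1).reshape(-1, 2)
        # nearest codebook entry for every frequency component
        dists = (flat.unsqueeze(1) - self.codebook.unsqueeze(0)).pow(2).sum(-1)
        idx = dists.argmin(dim=-1)
        q = self.codebook[idx].reshape(*spec.shape, 2)
        q_spec = torch.complex(q[..., 0], q[..., 1])
        # straight-through estimator so gradients can flow to the encoder
        q_spec = spec + (q_spec - spec).detach()
        return torch.fft.ifft2(q_spec).real                  # back to spatial domain

feats = torch.randn(1, 8, 16, 16)
print(FrequencyQuantizer()(feats).shape)  # torch.Size([1, 8, 16, 16])
```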

  • Open Access

    ARTICLE

    A Perspective-Aware Cyclist Image Generation Method for Perception Development of Autonomous Vehicles

    Beike Yu1, Dafang Wang1,*, Xing Cui2, Bowen Yang1

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 2687-2702, 2025, DOI:10.32604/cmc.2024.059594 - 17 February 2025

    Abstract Realistic urban scene generation has been extensively studied for the development of autonomous vehicles. However, the research has primarily focused on the synthesis of vehicles and pedestrians, while the generation of cyclists is rarely presented due to its complexity. This paper proposes a perspective-aware and realistic cyclist generation method via object retrieval. Images, semantic maps, and depth labels of objects are first collected from existing datasets and categorized by class and perspective, with the perspective calculated by an algorithm newly designed according to imaging principles. During scene generation, objects with the desired class and perspective…
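    As a rough illustration of assigning a perspective category from imaging geometry, the snippet below bins an object's viewing angle under a pinhole camera model. This is not the paper's algorithm (which the abstract does not detail); the angle thresholds and binning scheme are assumptions for illustration only.

```python
# Generic perspective binning from camera geometry (pinhole model).
# NOT the paper's method; thresholds and labels are illustrative assumptions.
import math

def perspective_bin(obj_xyz, obj_yaw):
    """obj_xyz: object centre in camera coordinates (x right, z forward);
    obj_yaw: object heading in the camera's ground plane, in radians."""
    ray_azimuth = math.atan2(obj_xyz[0], obj_xyz[2])   # direction camera -> object
    rel = (obj_yaw - ray_azimuth + math.pi) % (2 * math.pi) - math.pi
    deg = abs(math.degrees(rel))
    if deg < 45:
        return "rear"        # object heading away from the camera
    if deg > 135:
        return "front"       # object heading towards the camera
    return "side"

print(perspective_bin((2.0, 0.0, 10.0), math.radians(5)))  # 'rear'
```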

  • Open Access

    ARTICLE

    HRAM-VITON: High-Resolution Virtual Try-On with Attention Mechanism

    Yue Chen1, Xiaoman Liang1,2,*, Mugang Lin1,2, Fachao Zhang1, Huihuang Zhao1,2

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 2753-2768, 2025, DOI:10.32604/cmc.2024.059530 - 17 February 2025

    Abstract The objective of image-based virtual try-on is to seamlessly integrate clothing onto a target image, generating a realistic representation of the character in the specified attire. However, existing virtual try-on methods frequently encounter challenges, including misalignment between the body and clothing, noticeable artifacts, and the loss of intricate garment details. To overcome these challenges, we introduce a two-stage high-resolution virtual try-on framework that integrates an attention mechanism, comprising a garment warping stage and an image generation stage. During the garment warping stage, we incorporate a channel attention mechanism to effectively retain the critical features of…
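    A channel attention mechanism of the kind used in the garment warping stage can be sketched as a squeeze-and-excitation style block, as below. The reduction ratio and placement are assumptions; the paper's exact module may differ.

```python
# Minimal sketch of a channel attention (squeeze-and-excitation style) block.
# Global pooling summarizes each channel, a small MLP produces per-channel
# gates, and the input features are reweighted accordingly.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global context
        self.fc = nn.Sequential(                       # excitation: per-channel gate
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                                   # reweight garment features

feat = torch.randn(2, 128, 64, 48)
print(ChannelAttention(128)(feat).shape)  # torch.Size([2, 128, 64, 48])
```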

  • Open Access

    ARTICLE

    Evaluation of Modern Generative Networks for EchoCG Image Generation

    Sabina Rakhmetulayeva1,*, Zhandos Zhanabekov2, Aigerim Bolshibayeva3

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4503-4523, 2024, DOI:10.32604/cmc.2024.057974 - 19 December 2024

    Abstract The applications of machine learning (ML) in the medical domain are often hindered by the limited availability of high-quality data. To address this challenge, we explore the synthetic generation of echocardiography images (echoCG) using state-of-the-art generative models. We conduct a comprehensive evaluation of three prominent methods: Cycle-consistent generative adversarial network (CycleGAN), Contrastive Unpaired Translation (CUT), and Stable Diffusion 1.5 with Low-Rank Adaptation (LoRA). Our research presents the data generation methodology, image samples, and evaluation strategy, followed by an extensive user study involving licensed cardiologists and surgeons who assess the perceived quality and medical soundness of…

  • Open Access

    ARTICLE

    An Enhanced GAN for Image Generation

    Chunwei Tian1,2,3,4, Haoyang Gao2,3, Pengwei Wang2, Bob Zhang1,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 105-118, 2024, DOI:10.32604/cmc.2024.052097 - 18 July 2024

    Abstract Generative adversarial networks (GANs), which train a generator against a discriminator in an adversarial game, have been widely applied in image generation. However, this adversarial interplay may reduce the robustness of the obtained GANs in image generation under varying scenes. Enhancing the relation of hierarchical information in a generation network and enlarging the differences between network architectures can exploit more structural information to improve the generation effect for image generation. In this paper, we propose an enhanced GAN that improves the generator for image generation (EIGGAN). EIGGAN applies spatial attention to the generator to extract salient information and enhance the truthfulness…
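    The spatial attention applied to the generator can be sketched as below. The pooling choices and kernel size follow a common spatial-attention design and are illustrative assumptions rather than EIGGAN's exact configuration.

```python
# Sketch of a spatial attention block that highlights salient image regions:
# channel-wise average and max maps are fused by a convolution into a single
# spatial gate that reweights the feature map.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)          # (n, 1, h, w)
        max_map = x.amax(dim=1, keepdim=True)          # (n, 1, h, w)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                # emphasize salient locations

x = torch.randn(2, 64, 32, 32)
print(SpatialAttention()(x).shape)  # torch.Size([2, 64, 32, 32])
```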

  • Open Access

    ARTICLE

    An Interactive Collaborative Creation System for Shadow Puppets Based on Smooth Generative Adversarial Networks

    Cheng Yang1,2, Miaojia Lou2,*, Xiaoyu Chen1,2, Zixuan Ren1

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 4107-4126, 2024, DOI:10.32604/cmc.2024.049183 - 20 June 2024

    Abstract Chinese shadow puppetry has been recognized as a world intangible cultural heritage. However, it faces substantial challenges in its preservation and advancement due to the intricate and labor-intensive nature of crafting shadow puppets. To ensure the inheritance and development of this cultural heritage, it is imperative to enable traditional art to flourish in the digital era. This paper presents an Interactive Collaborative Creation System for shadow puppets, designed to facilitate the creation of high-quality shadow puppet images with greater ease. The system comprises four key functions: image contour extraction, intelligent reference recommendation, generation network, and…
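    The image contour extraction function listed among the system's components can be sketched with standard OpenCV operations, as below. The thresholds are illustrative and the snippet assumes OpenCV 4.x; the system's actual contour pipeline is not specified in the abstract.

```python
# Generic contour extraction sketch (not the paper's exact pipeline):
# grayscale -> blur -> Canny edges -> external contours.
import cv2
import numpy as np

def extract_contours(image_bgr: np.ndarray):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)        # suppress texture noise
    edges = cv2.Canny(blurred, 50, 150)                # edge map of the puppet
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours

img = np.zeros((256, 256, 3), dtype=np.uint8)
cv2.circle(img, (128, 128), 60, (255, 255, 255), -1)
print(len(extract_contours(img)))  # typically 1: the disc's outer edge
```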

  • Open Access

    ARTICLE

    Restoration of the JPEG Maximum Lossy Compressed Face Images with Hourglass Block-GAN

    Jongwook Si1, Sungyoung Kim2,*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 2893-2908, 2024, DOI:10.32604/cmc.2023.046081 - 26 March 2024

    Abstract In the context of high compression rates applied to Joint Photographic Experts Group (JPEG) images through lossy compression techniques, image-blocking artifacts may manifest. This necessitates the restoration of the image to its original quality. The challenge lies in regenerating significantly compressed images into a state in which they become identifiable. Therefore, this study focuses on the restoration of JPEG images subjected to substantial degradation caused by maximum lossy compression, using a Generative Adversarial Network (GAN). The generator in this network is based on the U-Net architecture. It features a new hourglass structure that preserves the characteristics…
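    A U-Net style encoder-decoder generator with skip connections, the backbone this restoration GAN builds on, can be sketched compactly as below. Channel widths and depth are illustrative; the paper's hourglass blocks add further structure not reproduced here.

```python
# Compact U-Net sketch: downsampling encoder, bottleneck, upsampling decoder
# with skip connections, mapping a compressed face image to a restored one.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)      # 64 (upsampled) + 64 (skip)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # 32 (upsampled) + 32 (skip)
        self.out = nn.Conv2d(32, 3, 1)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.out(d1))   # restored image in [0, 1]

x = torch.randn(1, 3, 128, 128)
print(TinyUNet()(x).shape)  # torch.Size([1, 3, 128, 128])
```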

  • Open Access

    ARTICLE

    A Novel Unsupervised MRI Synthetic CT Image Generation Framework with Registration Network

    Liwei Deng1, Henan Sun1, Jing Wang2, Sijuan Huang3, Xin Yang3,*

    CMC-Computers, Materials & Continua, Vol.77, No.2, pp. 2271-2287, 2023, DOI:10.32604/cmc.2023.039062 - 29 November 2023

    Abstract In recent years, radiotherapy based only on Magnetic Resonance (MR) images has become a research hotspot in radiotherapy planning. However, functional computed tomography (CT) is still needed for dose calculation in the clinic. Recent deep-learning approaches to synthesizing CT images from MR images have attracted much research interest, making radiotherapy based only on MR images possible. In this paper, we propose a novel unsupervised image synthesis framework with registration networks. This paper aims to enforce the constraints between the reconstructed image and the input image by registering the reconstructed…
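    The registration idea of warping the reconstructed image with a predicted deformation field so it can be compared against the input image can be sketched as below. The displacement field here is random for illustration; in the paper it would come from a registration network whose architecture the abstract does not detail.

```python
# Sketch of deformable warping with a displacement field, the core operation
# a registration network drives when aligning a reconstructed image to the input.
import torch
import torch.nn.functional as F

def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """image: (n, c, h, w); flow: (n, 2, h, w) displacement in pixels."""
    n, _, h, w = image.shape
    # base sampling grid in normalized [-1, 1] coordinates
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    # convert pixel displacements to normalized offsets and add to the grid
    offset = torch.stack([flow[:, 0] * 2 / (w - 1), flow[:, 1] * 2 / (h - 1)], dim=-1)
    return F.grid_sample(image, base + offset, align_corners=True)

recon = torch.randn(1, 1, 64, 64)        # reconstructed (synthetic CT) slice
flow = torch.randn(1, 2, 64, 64) * 0.5   # toy displacement field
print(warp(recon, flow).shape)           # torch.Size([1, 1, 64, 64])
```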

Displaying 1-10 of 13 results (page 1).