Search Results (13)
  • Open Access

    ARTICLE

    CAFE-GAN: CLIP-Projected GAN with Attention-Aware Generation and Multi-Scale Discrimination

    Xuanhong Wang1, Hongyu Guo1, Jiazhen Li1, Mingchen Wang1, Xian Wang1, Yijun Zhang2,*

    CMC-Computers, Materials & Continua, Vol.86, No.1, pp. 1-19, 2026, DOI:10.32604/cmc.2025.069482 - 10 November 2025

    Abstract Over the past decade, large-scale pre-trained autoregressive and diffusion models have rejuvenated the field of text-guided image generation. However, these models require enormous datasets and parameters, and their multi-step generation processes are often inefficient and difficult to control. To address these challenges, we propose CAFE-GAN, a CLIP-Projected GAN with Attention-Aware Generation and Multi-Scale Discrimination, which incorporates a pre-trained CLIP model along with several key architectural innovations. First, we embed a coordinate attention mechanism into the generator to capture long-range dependencies and enhance feature representation. Second, we introduce a trainable linear projection layer after the CLIP text…
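    The coordinate attention step mentioned in this abstract can be illustrated with a minimal PyTorch sketch. The block below follows the generic coordinate-attention design (pooling along each spatial axis and producing per-axis gates); the channel count and reduction ratio are illustrative assumptions, not CAFE-GAN's actual configuration.

```python
# Minimal sketch of a coordinate attention block (illustrative only).
# Pools features along H and W separately, mixes them, and produces
# per-axis attention maps that reweight the input feature map.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # keep H, squeeze W
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # keep W, squeeze H
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                        # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)    # (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w

x = torch.randn(2, 64, 32, 32)
print(CoordinateAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```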

  • Open Access

    ARTICLE

    Integrating Speech-to-Text for Image Generation Using Generative Adversarial Networks

    Smita Mahajan1, Shilpa Gite1,2, Biswajeet Pradhan3,*, Abdullah Alamri4, Shaunak Inamdar5, Deva Shriyansh5, Akshat Ashish Shah5, Shruti Agarwal5

    CMES-Computer Modeling in Engineering & Sciences, Vol.143, No.2, pp. 2001-2026, 2025, DOI:10.32604/cmes.2025.058456 - 30 May 2025

    Abstract The development of generative architectures has resulted in numerous novel deep-learning models that generate images using text inputs. However, humans naturally use speech for visualization prompts. Therefore, this paper proposes an architecture that integrates speech prompts as input to an image-generation Generative Adversarial Network (GAN) model, leveraging Speech-to-Text translation along with the CLIP + VQGAN model. The proposed method involves translating speech prompts into text, which is then used by the Contrastive Language-Image Pretraining (CLIP) + Vector Quantized Generative Adversarial Network (VQGAN) model to generate images. This paper outlines the steps required to implement such a…
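    The speech-to-image pipeline outlined here can be sketched at a high level as follows. The `transcribe`, `clip_model`, and `vqgan` objects are placeholders standing in for whatever speech-to-text, CLIP, and VQGAN implementations are used; their interfaces are assumptions for illustration, not the paper's code.

```python
# High-level sketch of the speech -> text -> image pipeline described above.
# `transcribe`, `clip_model`, and `vqgan` are hypothetical placeholders; the
# optimization loop below is the standard CLIP-guided VQGAN idea: adjust a
# latent so the decoded image matches the text embedding.
import torch

def generate_from_speech(audio, transcribe, clip_model, vqgan,
                         steps: int = 300, lr: float = 0.1):
    prompt = transcribe(audio)                      # speech-to-text stage
    text_emb = clip_model.encode_text(prompt)       # fixed target embedding
    z = vqgan.random_latent().requires_grad_(True)  # latent to optimize
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        image = vqgan.decode(z)                     # candidate image
        img_emb = clip_model.encode_image(image)
        # maximize cosine similarity between image and text embeddings
        loss = -torch.cosine_similarity(img_emb, text_emb, dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vqgan.decode(z).detach(), prompt
```

    In this sketch only the latent `z` is optimized; the CLIP and VQGAN weights stay frozen, so the text embedding acts as a fixed target that steers decoding.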

  • Open Access

    ARTICLE

    Frequency-Quantized Variational Autoencoder Based on 2D-FFT for Enhanced Image Reconstruction and Generation

    Jianxin Feng1,2,*, Xiaoyao Liu1,2

    CMC-Computers, Materials & Continua, Vol.83, No.2, pp. 2087-2107, 2025, DOI:10.32604/cmc.2025.060252 - 16 April 2025

    Abstract As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks due to their ease of embedding and representative capacity. However, existing VQ-VAEs often perform quantization in the spatial domain, ignoring global structural information and potentially suffering from codebook collapse and information coupling issues. This paper proposes a frequency-quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components…
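    The idea of quantizing feature maps in the frequency domain can be sketched as below. The codebook lookup is a generic nearest-neighbour vector quantization over (real, imaginary) pairs with a straight-through estimator; the actual FQ-VAE layer sizes and quantization rule may differ.

```python
# Illustrative sketch: 2D-FFT of feature maps, nearest-codebook quantization
# of each frequency component, then inverse FFT back to the spatial domain.
import torch
import torch.nn as nn

class FrequencyQuantizer(nn.Module):
    def __init__(self, num_codes: int = 512):
        super().__init__()
        # each code is a 2-D vector: (real part, imaginary part)
        self.codebook = nn.Parameter(torch.randn(num_codes, 2) * 0.1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        spec = torch.fft.fft2(feats)                         # complex spectrum
        flat = torch.stack([spec.real, spec.imag], dim=-1).reshape(-1, 2)
        # nearest codebook entry for every frequency component
        dists = (flat.unsqueeze(1) - self.codebook.unsqueeze(0)).pow(2).sum(-1)
        idx = dists.argmin(dim=-1)
        q = self.codebook[idx].reshape(*spec.shape, 2)
        q_spec = torch.complex(q[..., 0], q[..., 1])
        # straight-through estimator so gradients can flow to the encoder
        q_spec = spec + (q_spec - spec).detach()
        return torch.fft.ifft2(q_spec).real                  # back to spatial domain

feats = torch.randn(1, 8, 16, 16)
print(FrequencyQuantizer()(feats).shape)  # torch.Size([1, 8, 16, 16])
```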

  • Open Access

    ARTICLE

    A Perspective-Aware Cyclist Image Generation Method for Perception Development of Autonomous Vehicles

    Beike Yu1, Dafang Wang1,*, Xing Cui2, Bowen Yang1

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 2687-2702, 2025, DOI:10.32604/cmc.2024.059594 - 17 February 2025

    Abstract Realistic urban scene generation has been extensively studied for the development of autonomous vehicles. However, the research has primarily focused on the synthesis of vehicles and pedestrians, while the generation of cyclists is rarely presented due to its complexity. This paper proposes a perspective-aware and realistic cyclist generation method via object retrieval. Images, semantic maps, and depth labels of objects are first collected from existing datasets and categorized by class and perspective, with the perspective calculated by an algorithm newly designed according to imaging principles. During scene generation, objects with the desired class and perspective…
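    As a rough illustration of assigning a perspective category from imaging geometry, the snippet below bins an object's viewing angle under a pinhole camera model. This is not the paper's algorithm (which the abstract does not detail); the angle thresholds and binning scheme are assumptions for illustration only.

```python
# Generic perspective binning from camera geometry (pinhole model).
# NOT the paper's method; thresholds and labels are illustrative assumptions.
import math

def perspective_bin(obj_xyz, obj_yaw):
    """obj_xyz: object centre in camera coordinates (x right, z forward);
    obj_yaw: object heading in the camera's ground plane, in radians."""
    ray_azimuth = math.atan2(obj_xyz[0], obj_xyz[2])   # direction camera -> object
    rel = (obj_yaw - ray_azimuth + math.pi) % (2 * math.pi) - math.pi
    deg = abs(math.degrees(rel))
    if deg < 45:
        return "rear"        # object heading away from the camera
    if deg > 135:
        return "front"       # object heading towards the camera
    return "side"

print(perspective_bin((2.0, 0.0, 10.0), math.radians(5)))  # 'rear'
```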

  • Open Access

    ARTICLE

    HRAM-VITON: High-Resolution Virtual Try-On with Attention Mechanism

    Yue Chen1, Xiaoman Liang1,2,*, Mugang Lin1,2, Fachao Zhang1, Huihuang Zhao1,2

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 2753-2768, 2025, DOI:10.32604/cmc.2024.059530 - 17 February 2025

    Abstract The objective of image-based virtual try-on is to seamlessly integrate clothing onto a target image, generating a realistic representation of the character in the specified attire. However, existing virtual try-on methods frequently encounter challenges, including misalignment between the body and clothing, noticeable artifacts, and the loss of intricate garment details. To overcome these challenges, we introduce a two-stage high-resolution virtual try-on framework that integrates an attention mechanism, comprising a garment warping stage and an image generation stage. During the garment warping stage, we incorporate a channel attention mechanism to effectively retain the critical features of…
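    A channel attention mechanism of the kind used in the garment warping stage can be sketched as a squeeze-and-excitation style block, as below. The reduction ratio and placement are assumptions; the paper's exact module may differ.

```python
# Minimal sketch of a channel attention (squeeze-and-excitation style) block.
# Global pooling summarizes each channel, a small MLP produces per-channel
# gates, and the input features are reweighted accordingly.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global context
        self.fc = nn.Sequential(                       # excitation: per-channel gate
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                                   # reweight garment features

feat = torch.randn(2, 128, 64, 48)
print(ChannelAttention(128)(feat).shape)  # torch.Size([2, 128, 64, 48])
```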

  • Open Access

    ARTICLE

    Evaluation of Modern Generative Networks for EchoCG Image Generation

    Sabina Rakhmetulayeva1,*, Zhandos Zhanabekov2, Aigerim Bolshibayeva3

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4503-4523, 2024, DOI:10.32604/cmc.2024.057974 - 19 December 2024

    Abstract The applications of machine learning (ML) in the medical domain are often hindered by the limited availability of high-quality data. To address this challenge, we explore the synthetic generation of echocardiography images (echoCG) using state-of-the-art generative models. We conduct a comprehensive evaluation of three prominent methods: Cycle-consistent generative adversarial network (CycleGAN), Contrastive Unpaired Translation (CUT), and Stable Diffusion 1.5 with Low-Rank Adaptation (LoRA). Our research presents the data generation methodology, image samples, and evaluation strategy, followed by an extensive user study involving licensed cardiologists and surgeons who assess the perceived quality and medical soundness of…

  • Open Access

    ARTICLE

    An Enhanced GAN for Image Generation

    Chunwei Tian1,2,3,4, Haoyang Gao2,3, Pengwei Wang2, Bob Zhang1,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 105-118, 2024, DOI:10.32604/cmc.2024.052097 - 18 July 2024

    Abstract Generative adversarial networks (GANs), which train a generator against a discriminator in an adversarial game, have been widely applied in image generation. However, this adversarial interplay may reduce the robustness of the obtained GANs in image generation under varying scenes. Enhancing the relation of hierarchical information in a generation network and enlarging the differences between network architectures can exploit more structural information to improve the generation effect for image generation. In this paper, we propose an enhanced GAN that improves the generator for image generation (EIGGAN). EIGGAN applies spatial attention to the generator to extract salient information and enhance the truthfulness…
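    The spatial attention applied to the generator can be sketched as below. The pooling choices and kernel size follow a common spatial-attention design and are illustrative assumptions rather than EIGGAN's exact configuration.

```python
# Sketch of a spatial attention block that highlights salient image regions:
# channel-wise average and max maps are fused by a convolution into a single
# spatial gate that reweights the feature map.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)          # (n, 1, h, w)
        max_map = x.amax(dim=1, keepdim=True)          # (n, 1, h, w)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                # emphasize salient locations

x = torch.randn(2, 64, 32, 32)
print(SpatialAttention()(x).shape)  # torch.Size([2, 64, 32, 32])
```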

  • Open Access

    ARTICLE

    An Interactive Collaborative Creation System for Shadow Puppets Based on Smooth Generative Adversarial Networks

    Cheng Yang1,2, Miaojia Lou2,*, Xiaoyu Chen1,2, Zixuan Ren1

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 4107-4126, 2024, DOI:10.32604/cmc.2024.049183 - 20 June 2024

    Abstract Chinese shadow puppetry has been recognized as a world intangible cultural heritage. However, it faces substantial challenges in its preservation and advancement due to the intricate and labor-intensive nature of crafting shadow puppets. To ensure the inheritance and development of this cultural heritage, it is imperative to enable traditional art to flourish in the digital era. This paper presents an Interactive Collaborative Creation System for shadow puppets, designed to facilitate the creation of high-quality shadow puppet images with greater ease. The system comprises four key functions: image contour extraction, intelligent reference recommendation, generation network, and…
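    The image contour extraction function listed among the system's components can be sketched with standard OpenCV operations, as below. The thresholds are illustrative and the snippet assumes OpenCV 4.x; the system's actual contour pipeline is not specified in the abstract.

```python
# Generic contour extraction sketch (not the paper's exact pipeline):
# grayscale -> blur -> Canny edges -> external contours.
import cv2
import numpy as np

def extract_contours(image_bgr: np.ndarray):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)        # suppress texture noise
    edges = cv2.Canny(blurred, 50, 150)                # edge map of the puppet
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours

img = np.zeros((256, 256, 3), dtype=np.uint8)
cv2.circle(img, (128, 128), 60, (255, 255, 255), -1)
print(len(extract_contours(img)))  # typically 1: the disc's outer edge
```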

  • Open Access

    ARTICLE

    Restoration of the JPEG Maximum Lossy Compressed Face Images with Hourglass Block-GAN

    Jongwook Si1, Sungyoung Kim2,*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 2893-2908, 2024, DOI:10.32604/cmc.2023.046081 - 26 March 2024

    Abstract In the context of high compression rates applied to Joint Photographic Experts Group (JPEG) images through lossy compression techniques, image-blocking artifacts may manifest. This necessitates the restoration of the image to its original quality. The challenge lies in regenerating significantly compressed images into a state in which they become identifiable. Therefore, this study focuses on the restoration of JPEG images subjected to substantial degradation caused by maximum lossy compression, using a Generative Adversarial Network (GAN). The generator in this network is based on the U-Net architecture. It features a new hourglass structure that preserves the characteristics…
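    A U-Net style encoder-decoder generator with skip connections, the backbone this restoration GAN builds on, can be sketched compactly as below. Channel widths and depth are illustrative; the paper's hourglass blocks add further structure not reproduced here.

```python
# Compact U-Net sketch: downsampling encoder, bottleneck, upsampling decoder
# with skip connections, mapping a compressed face image to a restored one.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)      # 64 (upsampled) + 64 (skip)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # 32 (upsampled) + 32 (skip)
        self.out = nn.Conv2d(32, 3, 1)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.out(d1))   # restored image in [0, 1]

x = torch.randn(1, 3, 128, 128)
print(TinyUNet()(x).shape)  # torch.Size([1, 3, 128, 128])
```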

  • Open Access

    ARTICLE

    A Novel Unsupervised MRI Synthetic CT Image Generation Framework with Registration Network

    Liwei Deng1, Henan Sun1, Jing Wang2, Sijuan Huang3, Xin Yang3,*

    CMC-Computers, Materials & Continua, Vol.77, No.2, pp. 2271-2287, 2023, DOI:10.32604/cmc.2023.039062 - 29 November 2023

    Abstract In recent years, radiotherapy based only on Magnetic Resonance (MR) images has become a research hotspot in radiotherapy planning. However, functional computed tomography (CT) is still needed for dose calculation in the clinic. Recent deep-learning approaches to synthesizing CT images from MR images have attracted much research interest, making radiotherapy based only on MR images possible. In this paper, we propose a novel unsupervised image synthesis framework with registration networks. This paper aims to enforce the constraints between the reconstructed image and the input image by registering the reconstructed…
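    The registration idea of warping the reconstructed image with a predicted deformation field so it can be compared against the input image can be sketched as below. The displacement field here is random for illustration; in the paper it would come from a registration network whose architecture the abstract does not detail.

```python
# Sketch of deformable warping with a displacement field, the core operation
# a registration network drives when aligning a reconstructed image to the input.
import torch
import torch.nn.functional as F

def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """image: (n, c, h, w); flow: (n, 2, h, w) displacement in pixels."""
    n, _, h, w = image.shape
    # base sampling grid in normalized [-1, 1] coordinates
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    # convert pixel displacements to normalized offsets and add to the grid
    offset = torch.stack([flow[:, 0] * 2 / (w - 1), flow[:, 1] * 2 / (h - 1)], dim=-1)
    return F.grid_sample(image, base + offset, align_corners=True)

recon = torch.randn(1, 1, 64, 64)        # reconstructed (synthetic CT) slice
flow = torch.randn(1, 2, 64, 64) * 0.5   # toy displacement field
print(warp(recon, flow).shape)           # torch.Size([1, 1, 64, 64])
```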

Displaying 1-10 of 13 results (page 1).