Open Access

ARTICLE

PolyDiffusion: A Multi-Objective Optimized Contour-to-Image Diffusion Framework

Yuzhen Liu1,2, Jiasheng Yin1,2, Yixuan Chen1,2, Jin Wang1,2, Xiaolan Zhou1,2, Xiaoliang Wang1,2,*

1 School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, China
2 Sanya Research Institute, Hunan University of Science and Technology, Sanya, 572024, China

* Corresponding Author: Xiaoliang Wang. Email: email

Computers, Materials & Continua 2025, 85(2), 3965-3980. https://doi.org/10.32604/cmc.2025.068500

Abstract

Multi-instance image generation remains a challenging task in the field of computer vision. While existing diffusion models demonstrate impressive fidelity in image generation, they often struggle with precisely controlling each object’s shape, pose, and size. Methods like layout-to-image and mask-to-image provide spatial guidance but frequently suffer from object shape distortion, overlaps, and poor consistency, particularly in complex scenes with multiple objects. To address these issues, we introduce PolyDiffusion, a contour-based diffusion framework that encodes each object’s contour as a boundary-coordinate sequence, decoupling object shapes and positions. This approach allows for better control over object geometry and spatial positioning, which is critical for achieving high-quality multi-instance generation. We formulate the training process as a multi-objective optimization problem, balancing three key objectives: a denoising diffusion loss to maintain overall image fidelity, a cross-attention contour alignment loss to ensure precise shape adherence, and a reward-guided denoising objective that minimizes the Fréchet distance to real images. In addition, the Object Space-Aware Attention module fuses contour tokens with visual features, while a prior-guided fusion mechanism utilizes inter-object spatial relationships and class semantics to enhance consistency across multiple objects. Experimental results on benchmark datasets such as COCO-Stuff and VOC-2012 demonstrate that PolyDiffusion significantly outperforms existing layout-to-image and mask-to-image methods, achieving notable improvements in both image quality and instance-level segmentation accuracy. The implementation of PolyDiffusion is available at (accessed on 06 August 2025).
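The abstract describes encoding each object's contour as a boundary-coordinate sequence that decouples shape from position. The sketch below illustrates one plausible form of such an encoding; the function name, the bounding-box-based normalization, and all details are illustrative assumptions, not the paper's actual implementation.

```python
def encode_contour(vertices, image_size):
    """Encode a polygon contour as a position plus a shape sequence.

    Hypothetical sketch: position is the bounding-box corner normalized
    by image size; shape is the vertex sequence normalized by the
    bounding-box extent, so the same shape pattern can appear at any
    location or scale.
    """
    w, h = image_size
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Position: top-left of the bounding box, in image-relative coordinates.
    x0, y0 = min(xs), min(ys)
    position = (x0 / w, y0 / h)
    # Shape: vertices relative to the bounding box, normalized by its extent.
    bw = max(max(xs) - x0, 1e-8)
    bh = max(max(ys) - y0, 1e-8)
    shape = []
    for x, y in vertices:
        shape.extend([(x - x0) / bw, (y - y0) / bh])
    return position, shape
```

Under this toy scheme, a triangle with vertices (10, 20), (30, 20), (30, 60) in a 100×100 image yields position (0.1, 0.2) and shape sequence [0, 0, 1, 0, 1, 1], so translating the object changes only the position token while the shape sequence stays fixed.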

Keywords

Diffusion models; multi-object generation; multi-objective optimization; contour-to-image

Cite This Article

APA Style
Liu, Y., Yin, J., Chen, Y., Wang, J., Zhou, X. et al. (2025). PolyDiffusion: A Multi-Objective Optimized Contour-to-Image Diffusion Framework. Computers, Materials & Continua, 85(2), 3965–3980. https://doi.org/10.32604/cmc.2025.068500
Vancouver Style
Liu Y, Yin J, Chen Y, Wang J, Zhou X, Wang X. PolyDiffusion: A Multi-Objective Optimized Contour-to-Image Diffusion Framework. Comput Mater Contin. 2025;85(2):3965–3980. https://doi.org/10.32604/cmc.2025.068500
IEEE Style
Y. Liu, J. Yin, Y. Chen, J. Wang, X. Zhou, and X. Wang, “PolyDiffusion: A Multi-Objective Optimized Contour-to-Image Diffusion Framework,” Comput. Mater. Contin., vol. 85, no. 2, pp. 3965–3980, 2025. https://doi.org/10.32604/cmc.2025.068500



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.