Yuzhen Liu1,2, Jiasheng Yin1,2, Yixuan Chen1,2, Jin Wang1,2, Xiaolan Zhou1,2, Xiaoliang Wang1,2,*
CMC-Computers, Materials & Continua, Vol. 85, No. 2, pp. 3965-3980, 2025, DOI:10.32604/cmc.2025.068500. Published 23 September 2025.
Abstract: Multi-instance image generation remains a challenging task in computer vision. While existing diffusion models achieve impressive fidelity, they often struggle to precisely control each object's shape, pose, and size. Layout-to-image and mask-to-image methods provide spatial guidance but frequently suffer from object shape distortion, overlaps, and poor consistency, particularly in complex scenes with multiple objects. To address these issues, we introduce PolyDiffusion, a contour-based diffusion framework that encodes each object's contour as a boundary-coordinate sequence, decoupling object shape from position. This approach allows finer control over object geometry…
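The abstract does not specify how the boundary-coordinate encoding is computed; as a minimal sketch, assuming each contour is a simple polygon resampled to a fixed number of evenly spaced boundary points normalized by image size (the function name, point count, and normalization scheme below are all hypothetical, not the paper's actual encoding):

```python
import math

def encode_contour(vertices, num_points=16, image_size=512):
    """Hypothetical sketch: resample a polygon boundary into a
    fixed-length sequence of (x, y) coordinates normalized to [0, 1]."""
    # Close the polygon and compute each edge's length.
    pts = list(vertices) + [vertices[0]]
    seg_lens = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    perimeter = sum(seg_lens)
    # Walk the boundary, emitting num_points evenly spaced samples.
    seq, target, acc, i = [], 0.0, 0.0, 0
    step = perimeter / num_points
    for _ in range(num_points):
        while i < len(seg_lens) and acc + seg_lens[i] < target:
            acc += seg_lens[i]
            i += 1
        t = (target - acc) / seg_lens[i] if seg_lens[i] > 0 else 0.0
        x = pts[i][0] + t * (pts[i + 1][0] - pts[i][0])
        y = pts[i][1] + t * (pts[i + 1][1] - pts[i][1])
        seq.extend([x / image_size, y / image_size])
        target += step
    return seq

# Example: an axis-aligned 100x100 square in a 512x512 image.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
seq = encode_contour(square, num_points=8, image_size=512)
```

A fixed-length, normalized sequence like this decouples shape (the coordinate pattern) from position and scale (recoverable by translating/scaling the sequence), which matches the decoupling the abstract describes.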