TY  - EJOU
AU  - Xie, Qing
AU  - Yu, Ruiyun
TI  - Image Style Transfer for Exhibition Hall Design Based on Multimodal Semantic-Enhanced Algorithm
T2  - Computers, Materials & Continua
PY  - 2025
VL  - 84
IS  - 1
SN  - 1546-2226
AB  - Although existing style transfer techniques have made significant progress in image generation, several challenges remain in exhibition hall design. Existing style transfer methods focus mainly on transforming single-dimensional features and neglect the deep integration of content and style features required in exhibition hall design. They are also deficient in detail retention, particularly in accurately capturing and reproducing local textures and details while preserving the structure of the content image. Moreover, point-based attention mechanisms tend to ignore the complexity and diversity of image features in multi-dimensional space, causing misalignment between features in different semantic regions and producing inconsistent stylistic features across content areas. In this context, this paper proposes a semantic-enhanced multimodal style transfer algorithm tailored to exhibition hall design. The proposed approach leverages a multimodal encoder architecture to integrate information from text, source images, and style images, using separate encoder modules for each modality to capture shallow, deep, and semantic features. A novel Style Transfer Convolution (STConv) kernel, based on the Visual Geometry Group (VGG) 19 network, is introduced to improve feature extraction in style transfer. Additionally, an enhanced Transformer encoder is incorporated to capture contextual semantic information within images, while the CLIP model is employed for text processing.
A hybrid attention module is designed to capture style features precisely, and multimodal feature fusion is achieved via a diffusion model that generates exhibition hall design images aligned with stylistic requirements. Quantitative experiments show that, compared with state-of-the-art algorithms, the proposed method achieves significant performance improvements on both the Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) metrics. For example, on the ExpoArchive dataset, the proposed method attains an FID of 87.9 and a KID of 1.98, clearly outperforming the other methods.
KW  - Exhibition hall design
KW  - style transfer
KW  - multimodal fusion
KW  - semantic enhancement
KW  - diffusion model
DO  - 10.32604/cmc.2025.062712
ER  -