Make Fashion Real: Texture-preserving Rendered-to-Real Translation with Diffusion Models

Abstract

Bridging the longstanding gap between rendered (computer graphics) imagery and the real world remains a significant challenge. In the fashion domain in particular, existing methods still struggle to produce realistic and consistent results. In this paper, we introduce a novel diffusion-based framework that translates rendered images into realistic counterparts while preserving texture details. Our approach comprises two primary stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG). In the DKI stage, rendered-to-real domain-specific knowledge is injected into a pretrained text-to-image (T2I) diffusion model through positive domain finetuning and negative domain embedding, enhancing its capability to generate realistic images. The RIG stage then employs Texture-preserving Attention Control (TAC), specifically designed to maintain fine-grained clothing textures during image generation. Additionally, we introduce a new dataset named SynFashion, featuring high-resolution digital clothing with diverse textures, which will be made publicly available for rendered fashion applications. Extensive experimental results demonstrate the superiority and effectiveness of our method for rendered-to-real image translation.
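
The sketch below is a minimal illustration of the two-stage flow described above, assuming a Stable-Diffusion-style backbone accessed through Hugging Face diffusers. The checkpoint name, prompt text, file paths, embedding token, and strength value are illustrative assumptions rather than the paper's released artifacts, and the paper's Texture-preserving Attention Control is only indicated by a comment, since its exact attention-sharing mechanism is not a public API.

```python
# Minimal sketch of the DKI + RIG flow, under the assumptions stated above.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# DKI stage (offline): the pretrained T2I model is finetuned on real fashion
# photos (positive domain), and a "rendered-look" token is learned as a
# negative domain embedding. Here we simply assume such a finetuned checkpoint
# and textual-inversion file already exist (hypothetical names/paths).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # base T2I model (assumption)
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)
pipe.load_textual_inversion(
    "negative_domain_embedding.bin",   # hypothetical learned negative embedding
    token="<rendered>",
)

# RIG stage (inference): translate a rendered garment image toward the real
# domain. TAC would hook into the U-Net's attention layers here to keep the
# clothing texture aligned with the rendered input; that control is omitted.
rendered = Image.open("rendered_garment.png").convert("RGB")  # hypothetical input
result = pipe(
    prompt="a photo of the garment with realistic fabric texture",
    negative_prompt="<rendered>",  # negative domain embedding as negative prompt
    image=rendered,
    strength=0.5,                  # retain structure and texture from the rendering
    guidance_scale=7.5,
).images[0]
result.save("real_garment.png")
```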

Publication
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems (NeurIPS)
Huamin Wang