
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Boyuan Jiang, Xiaobin Hu, Donghao Luo, Qingdong He, Chengming Xu, Jinlong Peng, Jiangning Zhang, Chengjie Wang, Yunsheng Wu, Yanwei Fu
Abstract

Although image-based virtual try-on has made considerable progress, emerging approaches still encounter challenges in producing high-fidelity and robust fitting images across diverse scenarios. These methods often struggle with issues such as texture-aware maintenance and size-aware fitting, which hinder their overall effectiveness. To address these limitations, we propose a novel garment perception enhancement technique, termed FitDiT, designed for high-fidelity virtual try-on using Diffusion Transformers (DiT), which allocate more parameters and attention to high-resolution features. First, to improve texture-aware maintenance, we introduce a garment texture extractor that incorporates garment-prior evolution to fine-tune garment features, better capturing rich details such as stripes, patterns, and text. Additionally, we introduce frequency-domain learning by customizing a frequency distance loss to enhance high-frequency garment details. To tackle the size-aware fitting issue, we employ a dilated-relaxed mask strategy that adapts to the correct garment length, preventing the generation of garments that fill the entire mask area during cross-category try-on. Equipped with the above designs, FitDiT surpasses all baselines in both qualitative and quantitative evaluations. It excels in producing well-fitting garments with photorealistic and intricate details, while also achieving a competitive inference time of 4.57 seconds for a single 1024x768 image after DiT structure slimming, outperforming existing methods.
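The frequency distance loss mentioned above can be illustrated with a minimal sketch: compare the 2D Fourier spectra of the generated and ground-truth garment images so that mismatches in high-frequency content (stripes, patterns, text) are penalized directly. The exact formulation in the paper (spectral weighting, channel handling, log-scaling) is not given in this abstract, so the function below is only an assumed, simplified variant using an L1 distance between FFT magnitudes.

```python
import numpy as np

def frequency_distance_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Assumed sketch of a frequency-domain loss, not the paper's exact form.

    Computes the mean L1 distance between the 2D FFT magnitude spectra of
    two single-channel float images of the same shape.
    """
    # Transform both images into the frequency domain.
    pred_spectrum = np.abs(np.fft.fft2(pred))
    target_spectrum = np.abs(np.fft.fft2(target))
    # Differences in spectral magnitude emphasize fine texture detail
    # that a purely pixel-space loss can underweight.
    return float(np.mean(np.abs(pred_spectrum - target_spectrum)))
```

In a training loop this term would typically be added, with a small weight, to the usual pixel- or latent-space diffusion objective.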
