
Multimodal Unsupervised Image-to-Image Translation

Xun Huang Ming-Yu Liu Serge Belongie Jan Kautz

Abstract

Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any pairs of corresponding images. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image. Code and pretrained models are available at https://github.com/nvlabs/MUNIT.
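
The content/style decomposition and the translation step described in the abstract can be illustrated with a short, runnable PyTorch sketch. All module architectures, names (ContentEncoder, Decoder, STYLE_DIM), and hyperparameters below are simplified stand-ins chosen for illustration, not the paper's actual networks; the official implementation is at https://github.com/nvlabs/MUNIT.

    import torch
    import torch.nn as nn

    STYLE_DIM = 8  # low-dimensional style code (dimension is an assumption)

    class ContentEncoder(nn.Module):
        """Image -> spatial content code (intended to be domain-invariant)."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 64, 7, padding=3), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            )
        def forward(self, x):
            return self.net(x)

    class Decoder(nn.Module):
        """(content code, style code) -> image in the decoder's domain.

        For simplicity, the style code is broadcast spatially and concatenated
        with the content code; the paper instead injects style through
        AdaIN parameters of the decoder.
        """
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.ConvTranspose2d(128 + STYLE_DIM, 64, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(64, 3, 7, padding=3), nn.Tanh(),
            )
        def forward(self, content, style):
            b, _, h, w = content.shape
            style_map = style.view(b, STYLE_DIM, 1, 1).expand(b, STYLE_DIM, h, w)
            return self.net(torch.cat([content, style_map], dim=1))

    # Translate one source-domain image into several target-domain images.
    enc_content_a = ContentEncoder()        # content encoder for source domain A
    dec_b = Decoder()                       # decoder for target domain B

    x_a = torch.randn(1, 3, 256, 256)       # placeholder source image
    c_a = enc_content_a(x_a)                # extract the content code once
    for _ in range(4):
        s_b = torch.randn(1, STYLE_DIM)     # sample a style code from B's Gaussian prior
        x_ab = dec_b(c_a, s_b)              # recombine content with a new style
        print(x_ab.shape)                   # torch.Size([1, 3, 256, 256])

Because each sampled style code yields a different output for the same content code, the translation is multimodal rather than a deterministic one-to-one mapping; replacing the random sample with the style code encoded from a user-provided example image gives example-guided translation.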

