8 months ago

Diffusion Model

Video Processing

Method/Architecture

Computer Vision

Hur Junhwa ; Herrmann Charles ; Saxena Saurabh ; Kontkanen Janne ; Lai Wei-Sheng ; Shih Yichang ; Rubinstein Michael ; Fleet David J. ; Sun Deqing

Abstract

Despite recent progress, existing frame interpolation methods still struggle with processing extremely high-resolution input and handling challenging cases such as repetitive textures, thin objects, and large motion. To address these issues, we introduce a patch-based cascaded pixel diffusion model for high-resolution frame interpolation, HiFI, that excels in these scenarios while achieving competitive performance on standard benchmarks. Cascades, which generate a series of images from low to high resolution, can help significantly with large or complex motion that requires both global context for a coarse solution and detailed context for high-resolution output. However, contrary to prior work on cascaded diffusion models, which performs diffusion at increasingly large resolutions, we use a single model that always performs diffusion at the same resolution and upsamples by processing patches of the inputs and the prior solution. At inference time, this drastically reduces memory usage and allows a single model to solve both frame interpolation (the base model's task) and spatial up-sampling, which saves training cost as well. HiFI excels at high-resolution images and complex repeated textures that require global context, achieving comparable or state-of-the-art performance on various benchmarks (Vimeo, Xiph, X-Test, and SEPE-8K). We further introduce a new dataset, LaMoR, that focuses on particularly challenging cases, on which HiFI significantly outperforms other baselines. Please visit our project page for video results: https://hifi-diffusion.github.io
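The control flow of a patch-based cascade, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 2x step between levels, the non-overlapping tiling, and above all `placeholder_model` (a simple blend standing in for the diffusion sampler) are assumptions made for illustration. The sketch only shows that the model always receives inputs of one fixed size, first the whole downsampled image and then patches of the inputs together with the upsampled prior solution.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) image by an integer factor."""
    if factor == 1:
        return img
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample2x(img):
    """Nearest-neighbour 2x upsampling of an (H, W, C) image."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def placeholder_model(patch0, patch1, prior):
    """Hypothetical stand-in for the diffusion sampler (assumption, not HiFI).

    It always sees fixed-size inputs: two frame patches and, above the
    coarsest level, the matching patch of the upsampled prior solution.
    """
    blend = 0.5 * (patch0 + patch1)
    return blend if prior is None else 0.5 * (blend + prior)

def cascade_interpolate(frame0, frame1, patch=64, model=placeholder_model):
    """Run a coarse-to-fine cascade in which the model resolution never changes.

    Assumes square frames whose side is `patch` times a power of two.
    Returns the interpolated frame and the number of model calls.
    """
    h, w, _ = frame0.shape
    ratio = h // patch
    assert h == w and h % patch == 0 and (ratio & (ratio - 1)) == 0

    # Coarsest level: the whole image fits in one patch, giving global context.
    factor = ratio
    result = model(downsample(frame0, factor), downsample(frame1, factor), None)
    calls = 1

    # Finer levels: upsample the prior solution, then refine it patch by patch.
    while factor > 1:
        factor //= 2
        lo0, lo1 = downsample(frame0, factor), downsample(frame1, factor)
        prior = upsample2x(result)
        result = np.empty_like(prior)
        for y in range(0, prior.shape[0], patch):
            for x in range(0, prior.shape[1], patch):
                s = (slice(y, y + patch), slice(x, x + patch))
                result[s] = model(lo0[s], lo1[s], prior[s])
                calls += 1
    return result, calls
```

Because every call handles a `patch` x `patch` input, peak memory per call in this sketch does not depend on the output resolution; only the number of calls grows with it.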

High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion