Memory-Augmented Non-Local Attention for Video Super-Resolution

In this paper, we propose a novel video super-resolution method that aims at generating high-fidelity high-resolution (HR) videos from low-resolution (LR) ones. Previous methods predominantly leverage temporally neighboring frames to assist the super-resolution of the current frame. These methods achieve limited performance because they struggle with spatial frame alignment and because similar LR neighbor frames provide little additional useful information. In contrast, we devise a cross-frame non-local attention mechanism that enables video super-resolution without frame alignment, making it more robust to large motions in the video. In addition, to acquire information beyond neighboring frames, we design a novel memory-augmented attention module that memorizes general video details during super-resolution training. Experimental results indicate that our method achieves superior performance on large-motion videos compared to state-of-the-art methods, without aligning frames. Our source code will be released.
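The two mechanisms described above can be sketched roughly as follows. This is a minimal NumPy illustration of (1) cross-frame non-local attention, where every position of the current frame attends to all positions of all neighbor frames without any motion estimation, and (2) attention over a learned memory bank. The function names, projection matrices, and feature shapes here are our own illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_frame_nonlocal_attention(current, neighbors, d=16, rng=None):
    """Each position in `current` attends to every position of every
    neighbor frame -- no explicit alignment or optical flow is needed.

    current:   (H*W, C) flattened features of the frame to super-resolve
    neighbors: (T, H*W, C) flattened features of T neighbor frames
    Returns:   (H*W, d) features aggregated from all neighbor positions
    """
    rng = rng or np.random.default_rng(0)
    C = current.shape[-1]
    # stand-ins for learned query/key/value projections (random here)
    Wq, Wk, Wv = (rng.standard_normal((C, d)) * 0.1 for _ in range(3))
    q = current @ Wq                         # (H*W, d)
    kv = neighbors.reshape(-1, C)            # (T*H*W, C) -- all positions
    k, v = kv @ Wk, kv @ Wv
    attn = softmax(q @ k.T / np.sqrt(d))     # (H*W, T*H*W) similarity map
    return attn @ v                          # weighted sum over all frames

def memory_augmented_attention(query, mem_k, mem_v):
    """Attend to a global memory bank (keys/values learned during
    training) to recover details absent from the LR neighbor frames."""
    attn = softmax(query @ mem_k.T / np.sqrt(mem_k.shape[-1]))
    return attn @ mem_v
```

Because the attention weights compare the query against every neighbor position, correspondences under large motion are found by feature similarity rather than by spatial alignment, which is the robustness property the abstract claims.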