Audio Super Resolution using Neural Networks

We introduce a new audio processing technique that increases the sampling rate of signals such as speech or music using deep convolutional neural networks. Our model is trained on pairs of low- and high-quality audio examples; at test time, it predicts missing samples within a low-resolution signal in an interpolation process similar to image super-resolution. Our method is simple and does not involve specialized audio processing techniques; in our experiments, it outperforms baselines on standard speech and music benchmarks at upscaling ratios of 2x, 4x, and 6x. The method has practical applications in telephony, compression, and text-to-speech generation; it demonstrates the effectiveness of feed-forward convolutional architectures on an audio generation task.
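
To make the problem setup concrete, the following minimal sketch builds one low/high-resolution training pair and fills in the missing samples with plain linear interpolation. It is an illustration under stated assumptions, not the model described above: the helper names (`make_pair`, `linear_upsample`, `mse`) are hypothetical, the low-resolution signal is produced by naive decimation without an anti-aliasing filter, and linear interpolation stands in for the learned convolutional network and is not necessarily one of the baselines used in the experiments.

```python
import math

def make_pair(high_res, ratio):
    """Return (low_res, high_res); the low-resolution input keeps every
    `ratio`-th sample. Assumption: naive decimation, no anti-aliasing."""
    return high_res[::ratio], high_res

def linear_upsample(low_res, ratio, length):
    """Naive stand-in for the learned model: predict the missing samples
    by linear interpolation between neighbouring low-resolution samples."""
    out = []
    last = len(low_res) - 1
    for n in range(length):
        pos = n / ratio                 # position on the low-resolution grid
        i = min(int(pos), last)         # left neighbour
        j = min(i + 1, last)            # right neighbour, clamped at the end
        frac = pos - i
        out.append((1 - frac) * low_res[i] + frac * low_res[j])
    return out

def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical example signal: a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
ratio = 4  # one of the upscaling ratios mentioned above
high = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]

low, target = make_pair(high, ratio)
estimate = linear_upsample(low, ratio, len(target))
error = mse(estimate, target)
print(f"{len(low)} -> {len(estimate)} samples, MSE = {error:.6f}")
```

In a learned approach, a network trained on many such pairs would take the place of `linear_upsample`, with an error such as `mse` between its output and the high-resolution target serving as one possible training objective.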