EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use
Schlüter, Jan ; Gutenbrunner, Gerald

Abstract
In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy Normalization (PCEN), has shown promising results, but is computationally expensive. With inhomogeneous convolution kernel sizes and strides, and by replacing PCEN with better parallelizable operations, we can reach similar results more efficiently. In experiments on six audio classification tasks, our frontend matches the accuracy of LEAF at 3% of the cost, but both fail to consistently outperform a fixed mel filterbank. The quest for learnable audio frontends is not solved.
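
To illustrate why PCEN is hard to parallelize, the following is a minimal sketch of the standard PCEN formulation for a single channel, not the replacement proposed in this paper. The function name `pcen` and the default parameter values are assumptions chosen for illustration. The smoother `m` depends on its own previous value, so the loop over time frames has to run sequentially.

```python
def pcen(energy, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-12):
    """Standard PCEN for one channel (illustrative sketch).

    energy: list of non-negative filterbank energies over time.
    s:      smoothing coefficient of the first-order recursive smoother.
    alpha:  strength of the gain normalization.
    delta, r: offset and exponent of the root compression.
    Default values are illustrative assumptions, not taken from the paper.
    """
    out = []
    # Initialise the smoother with the first frame (an assumption).
    m = energy[0] if energy else 0.0
    for e in energy:
        # Recursive smoother: each step needs the previous m,
        # which prevents parallelizing over time.
        m = (1.0 - s) * m + s * e
        # Adaptive gain control followed by root compression.
        out.append((e / (eps + m) ** alpha + delta) ** r - delta ** r)
    return out
```

For a constant input the smoother stays at the input level, so with the defaults above `pcen([1.0, 1.0, 1.0])` gives approximately `3 ** 0.5 - 2 ** 0.5` for every frame.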