HyperAIHyperAI

Command Palette

Search for a command to run...

BAT: Boundary aware transducer for memory-efficient and low-latency ASR

Keyu An Xian Shi Shiliang Zhang

Abstract

Recently, recurrent neural network transducer (RNN-T) gains increasing popularity due to its natural streaming capability as well as superior performance. Nevertheless, RNN-T training requires large time and computation resources as RNN-T loss calculation is slow and consumes a lot of memory. Another limitation of RNN-T is that it tends to access more contexts for better performance, thus leading to higher emission latency in streaming ASR. In this paper we propose boundary-aware transducer (BAT) for memory-efficient and low-latency ASR. In BAT, the lattice for RNN-T loss computation is reduced to a restricted region selected by the alignment from continuous integrate-and-fire (CIF), which is jointly optimized with the RNN-T model. Extensive experiments demonstrate that compared to RNN-T, BAT reduces time and memory consumption significantly in training, and achieves good CER-latency trade-offs in inference for streaming ASR.


Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing

HyperAI Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp