
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition

Liu, Chenyu ; Pan, Jia ; Hu, Jinshui ; Yin, Baocai ; Yin, Bing ; Chen, Mingjun ; Liu, Cong ; Du, Jun ; Liu, Qingfeng

Abstract

Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall language context, limiting information utilization beyond the current decoding step; 2) error accumulation during AR decoding; and 3) slow decoding speed. To tackle these problems, this paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER. NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD). Initially, the VAT tokenizes visible symbols and local relations at a coarse level. Subsequently, the PGD refines all tokens and establishes connectivities in parallel, leveraging comprehensive visual and linguistic contexts. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate that NAMER not only outperforms the current state-of-the-art (SOTA) methods on ExpRate by 1.93%/2.35%/1.49%/0.62%, but also achieves speedups of 13.7x in decoding time and 6.7x in overall FPS, proving the effectiveness and efficiency of NAMER.
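
The abstract reports results in ExpRate without defining it. In HMER work, ExpRate is conventionally the percentage of test expressions whose predicted symbol sequence matches the ground truth exactly. The following is a minimal sketch of that conventional definition, not the authors' evaluation code; the function name `exp_rate` and the whitespace-based tokenization are assumptions made for illustration.

```python
def exp_rate(predictions, references):
    """Percentage of expressions whose predicted tokens match the reference exactly.

    Assumption: each expression is a LaTeX string whose symbols are
    separated by whitespace, so comparison is done on the split tokens.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # An expression counts as correct only if every token matches in order.
    correct = sum(
        pred.split() == ref.split()
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Hypothetical example data: the second prediction has one wrong symbol.
preds = ["x ^ 2 + 1", "\\frac { a } { b }", "a + b"]
refs = ["x ^ 2 + 1", "\\frac { a } { c }", "a + b"]
score = exp_rate(preds, refs)
```

Because a single wrong symbol makes the whole expression count as an error, ExpRate is a strict metric.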