Bilateral Reference for High-Resolution Dichotomous Image Segmentation

We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef). The LM aids object localization using global semantic information. Within the RM, we use BiRef for the reconstruction process, where hierarchical patches of images provide the source reference and gradient maps serve as the target reference. These components collaborate to generate the final predicted maps. We also introduce auxiliary gradient supervision to enhance focus on regions with finer details. Furthermore, we outline practical training strategies tailored for DIS to improve map quality and the training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks, showing that BiRefNet outperforms task-specific cutting-edge methods across all benchmarks. Our code is available at https://github.com/ZhengPeng7/BiRefNet.
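To make the two references in BiRef concrete, the following is a minimal sketch of how hierarchical image patches (source reference) and a gradient map (target reference) could be computed from a grayscale image. It is an illustration under stated assumptions, not the BiRefNet implementation: the function names, the clamped central-difference gradient, and the grid sizes (1, 2, 4) are hypothetical choices, since the abstract does not specify the gradient operator or the patch hierarchy.

```python
from math import hypot


def gradient_map(img):
    """Gradient magnitude of a 2-D grayscale image given as a list of rows.

    Assumption: clamped central differences stand in for whatever
    gradient operator the actual method uses.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbour indices are clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = hypot(gx, gy) / 2.0
    return out


def split_into_patches(img, grid):
    """Split an image into grid x grid non-overlapping patches (row-major)."""
    h, w = len(img), len(img[0])
    if h % grid or w % grid:
        raise ValueError("image size must be divisible by the grid size")
    ph, pw = h // grid, w // grid
    return [
        [row[c * pw:(c + 1) * pw] for row in img[r * ph:(r + 1) * ph]]
        for r in range(grid)
        for c in range(grid)
    ]


def hierarchical_patches(img, grids=(1, 2, 4)):
    """Patches at several scales; the grid sizes are illustrative only."""
    return {g: split_into_patches(img, g) for g in grids}


# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
target_reference = gradient_map(image)          # responds only near the edge
source_reference = hierarchical_patches(image)  # 1, 4, and 16 patches
```

In this toy example the gradient map is non-zero only in the two columns adjacent to the step edge, which is the kind of fine-detail region that gradient supervision is meant to emphasize.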