Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots

Precise and rapid delineation of sharp boundaries and robust semantics is essential for numerous downstream robotic tasks, such as robot grasping and manipulation, real-time semantic mapping, and online sensor calibration performed on edge computing units. Although boundary detection and semantic segmentation are complementary tasks, most studies focus on lightweight models for semantic segmentation and overlook the critical role of boundary detection. In this work, we introduce Mobile-Seed, a lightweight dual-task framework tailored for simultaneous semantic segmentation and boundary detection. Our framework features a two-stream encoder, an active fusion decoder (AFD), and a dual-task regularization approach. The encoder is divided into two pathways: one captures category-aware semantic information, while the other discerns boundaries from multi-scale features. The AFD module dynamically adapts the fusion of semantic and boundary information by learning channel-wise relationships, allowing precise weight assignment to each channel. Furthermore, we introduce a regularization loss to mitigate the conflict between dual-task learning and deep diversity supervision. Compared to existing methods, Mobile-Seed offers a lightweight framework that simultaneously improves semantic segmentation performance and accurately locates object boundaries. Experiments on the Cityscapes dataset show that Mobile-Seed improves over the state-of-the-art (SOTA) baseline by 2.2 percentage points (pp) in mIoU and 4.2 pp in mF-score, while maintaining an online inference speed of 23.9 frames per second (FPS) on 1024x2048-resolution input with an RTX 2080 Ti GPU. Additional experiments on the CamVid and PASCAL Context datasets confirm our method's generalizability. Code and additional results are publicly available at https://whu-usi3dv.github.io/Mobile-Seed/.
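
To make the channel-wise fusion idea concrete, below is a minimal PyTorch sketch in the spirit of the AFD described above. It is an illustration only: the abstract states that AFD learns channel-wise relationships to weight the semantic and boundary streams, so a squeeze-and-excitation style gate is assumed here; the class name `ChannelFusion`, the `reduction` parameter, and the exact gating design are hypothetical and may differ from the paper's actual module.

```python
# Minimal sketch of channel-wise fusion of two feature streams (assumed design,
# not the paper's exact AFD). A global pooling step summarizes each channel of
# the concatenated streams; a small bottleneck MLP then predicts a per-channel
# weight for each stream, and the fused output is the weighted sum.
import torch
import torch.nn as nn


class ChannelFusion(nn.Module):
    """Fuse semantic and boundary feature maps with learned channel weights."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: (N, 2C, H, W) -> (N, 2C, 1, 1)
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
        )

    def forward(self, sem: torch.Tensor, bnd: torch.Tensor) -> torch.Tensor:
        # sem, bnd: (N, C, H, W) features from the semantic and boundary streams.
        x = torch.cat([sem, bnd], dim=1)           # (N, 2C, H, W)
        w = torch.sigmoid(self.mlp(self.pool(x)))  # (N, 2C, 1, 1) channel gates
        w_sem, w_bnd = torch.chunk(w, 2, dim=1)    # one gate per stream
        return w_sem * sem + w_bnd * bnd           # channel-wise weighted fusion


if __name__ == "__main__":
    fuse = ChannelFusion(channels=64)
    sem = torch.randn(1, 64, 128, 256)
    bnd = torch.randn(1, 64, 128, 256)
    print(fuse(sem, bnd).shape)  # torch.Size([1, 64, 128, 256])
```

Because the gates are computed from both streams jointly, each output channel can lean on semantic evidence, boundary evidence, or a mix of the two, which matches the abstract's description of dynamically adapting the fusion per channel.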