SATS: Self-Attention Transfer for Continual Semantic Segmentation

Continually learning to segment more and more types of image regions is a desired capability for many intelligent systems. However, such continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning. While multiple knowledge distillation strategies originally designed for continual classification have been well adapted to continual semantic segmentation, they consider transferring old knowledge only through the outputs from one or more layers of a deep fully convolutional network. Different from existing solutions, this study proposes to transfer a new type of knowledge-relevant information, namely the relationships between elements (e.g., pixels or small local regions) within each image, which can capture both within-class and between-class knowledge. This relationship information can be effectively obtained from the self-attention maps of a Transformer-style segmentation model. Considering that pixels belonging to the same class in each image often share similar visual properties, a class-specific region pooling is applied to provide more efficient relationship information for knowledge transfer. Extensive evaluations on multiple public benchmarks show that the proposed self-attention transfer method effectively alleviates catastrophic forgetting, and that its flexible combination with one or more widely adopted distillation strategies significantly outperforms state-of-the-art solutions.
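
The sketch below illustrates the described mechanism under stated assumptions; it is not the authors' released implementation. It pools Transformer token features per class within each image (the class-specific region pooling) and distills the resulting class-to-token relationship maps, formed as a softmax over scaled dot products in the style of self-attention, from the frozen old model into the new one. The function names (`class_region_pool`, `sats_loss`) and the exact form of the relationship maps are illustrative assumptions; the paper itself transfers the self-attention maps produced inside the Transformer blocks.

```python
# Minimal sketch of self-attention-style relationship transfer with
# class-specific region pooling. Assumptions: token features of shape
# (B, N, D) and per-token labels in [0, num_classes); ignore-region
# handling is omitted for brevity.
import torch
import torch.nn.functional as F


def class_region_pool(tokens, labels, num_classes):
    """Average token features per class in each image.

    tokens: (B, N, D) token embeddings from a Transformer encoder.
    labels: (B, N) class index per token (e.g., a downsampled label map).
    Returns pooled class descriptors (B, C, D) and a validity mask (B, C)
    marking classes that actually appear in each image.
    """
    onehot = F.one_hot(labels, num_classes).float()          # (B, N, C)
    counts = onehot.sum(dim=1)                               # (B, C)
    pooled = torch.einsum('bnc,bnd->bcd', onehot, tokens)
    pooled = pooled / counts.clamp(min=1.0).unsqueeze(-1)    # (B, C, D)
    return pooled, counts > 0


def relationship_map(tokens, num_classes, labels):
    """Class-to-token relationship map: softmax of scaled dot products,
    analogous to one row block of a self-attention map."""
    pooled, valid = class_region_pool(tokens, labels, num_classes)
    scores = torch.einsum('bcd,bnd->bcn', pooled, tokens)
    scores = scores / tokens.size(-1) ** 0.5
    return torch.softmax(scores, dim=-1), valid              # (B, C, N)


def sats_loss(feat_new, feat_old, labels, num_classes):
    """L2 distillation between old and new relationship maps; the old
    model is frozen, so its maps are computed without gradients."""
    with torch.no_grad():
        attn_old, valid = relationship_map(feat_old, num_classes, labels)
    attn_new, _ = relationship_map(feat_new, num_classes, labels)
    mask = valid.float().unsqueeze(-1)                       # (B, C, 1)
    return ((attn_new - attn_old) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)


# Usage sketch: feat_old comes from the frozen model of the previous
# incremental step, feat_new from the model being trained.
# loss = task_loss + lambda_sats * sats_loss(feat_new, feat_old, labels, C)
```

In this sketch the pooling step is what makes the transfer efficient: instead of matching the full N-by-N pairwise relationship matrix, each image contributes only C class descriptors, so the distilled map has shape (C, N) while still encoding both within-class and between-class relationships.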