Disentangling Hate in Online Memes

Hateful and offensive content detection has been extensively explored in a single modality such as text. However, such toxic information could also be communicated via multimodal content such as online memes. Therefore, detecting multimodal hateful content has recently garnered much attention in academic and industry research communities. This paper aims to contribute to this emerging research topic by proposing DisMultiHate, a novel framework that performs the classification of multimodal hateful content. Specifically, DisMultiHate is designed to disentangle target entities in multimodal memes to improve hateful content classification and explainability. We conduct extensive experiments on two publicly available hateful and offensive memes datasets. Our experimental results show that DisMultiHate outperforms state-of-the-art unimodal and multimodal baselines in the hateful meme classification task. Empirical case studies are also conducted to demonstrate DisMultiHate's ability to disentangle target entities in memes and ultimately showcase DisMultiHate's explainability in the multimodal hateful content classification task.