Multimodal Forgery Detection Method R-MFDN

R-MFDN (Reference-assisted Multimodal Forgery Detection Network) is a multimodal forgery detection method proposed in 2024 by Fudan University, China Electronics Jinxin, and the Shanghai Intelligent Visual Computing Collaborative Innovation Center. The method uses rich identity information to mine cross-modal inconsistencies for forgery detection. R-MFDN consists of three main parts: a multimodal feature extraction module, a feature information fusion module, and a forgery identification module. It combines video encoding, audio encoding, and a temporal Transformer model to extract and fuse features, which are then used for forgery identification.

The innovation of this method is that it goes beyond single-modality forgery detection: a cross-modal contrastive learning loss and an identity-driven contrastive learning loss enhance the model's sensitivity to forged content. The method has shown strong detection capability in multimodal deepfake scenarios, especially in identity forgery scenarios such as AI face swapping and voice changing.
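To make the idea of a cross-modal contrastive loss more concrete, here is a minimal sketch in plain Python. It is a generic InfoNCE-style loss, not the exact formulation used by R-MFDN: the function names, the temperature value, the video-to-audio direction, and the toy embeddings are all assumptions made for illustration. The loss is small when each video embedding is closest to its own audio embedding and large when the pairing is inconsistent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_modal_info_nce(video_embs, audio_embs, temperature=0.1):
    """Mean InfoNCE loss (video-to-audio direction only).

    video_embs[i] and audio_embs[i] are treated as the matching pair;
    every other audio embedding in the batch acts as a negative.
    Hypothetical helper, not taken from the R-MFDN code.
    """
    n = len(video_embs)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(video_embs[i], audio_embs[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true partner.
        total += log_sum - logits[i]
    return total / n


# Toy embeddings (assumed values): two clips, two dimensions.
video = [[1.0, 0.0], [0.0, 1.0]]
audio_consistent = [[1.0, 0.0], [0.0, 1.0]]    # audio matches its video
audio_inconsistent = [[0.0, 1.0], [1.0, 0.0]]  # audio swapped between clips

loss_consistent = cross_modal_info_nce(video, audio_consistent)
loss_inconsistent = cross_modal_info_nce(video, audio_inconsistent)
```

In this toy setup the swapped pairing yields a much larger loss than the consistent one, which is the kind of signal a detector can exploit when the audio and visual streams of a forged video disagree.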

The related paper, "Identity-Driven Multimedia Forgery Detection via Reference Assistance", was accepted by ACM MultiMedia 2024, a top international conference in the field of multimedia, and was given as an oral presentation at the conference. The study also built IDForge, a high-quality AI face-swapping and voice-changing dataset, which can be obtained by application.