Multimodal Text And Image Classification On 1

Results

Performance results of various models on this benchmark

Model Name	Accuracy (%)	Paper Title	Repository
Early Fusion (Bert + InceptionV3)	92.5	Image and Text fusion for UPMC Food-101 using BERT and CNNs	-
Late Fusion (Bert + InceptionV3)	84.59	Image and Text fusion for UPMC Food-101 using BERT and CNNs	-