Multimodal (MobileNetV2) | 92.2% | 12M | Multimodal Side-Tuning for Document Classification | |
Transfer Learning from AlexNet, VGG-16, GoogLeNet and ResNet50 | 90.97% | - | Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification | |
AlexNet + spatial pyramid pooling + image resizing | 90.94% | - | Analysis of Convolutional Neural Networks for Document Image Classification | - |