SCST training, w/ rep. penalty | 10.58 | 30.63 | 17.86 | Training for Diversity in Image Paragraph Captioning | |
Diverse and Coherent Paragraph Generation from Images | 9.43 | 20.93 | 18.62 | Diverse and Coherent Paragraph Generation from Images | - |
RTT-GAN (Semi + Fully) | 9.21 | 20.36 | 18.39 | Recurrent Topic-Transition GAN for Visual Paragraph Generation | - |
Depth-aware Attention Model (DAM) | 6.7 | 17.3 | 13.9 | Look Deeper See Richer: Depth-aware Image Paragraph Captioning | - |
Regions-Hierarchical (ours) | 8.69 | 13.52 | 15.95 | A Hierarchical Approach for Generating Descriptive Image Paragraphs | |