Who's Waldo Image Captioning Dataset
Date
3 years ago
Publish URL
License
Other
Categories

Who's Waldo is a benchmark dataset for human-centric visual grounding. It contains 270k image-text pairs, each with automatically generated annotations that align the people mentioned in the text with their corresponding visual regions in the image.
The dataset is constructed from freely licensed images and descriptions from Wikimedia Commons.
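
To make the alignment idea concrete, here is a minimal sketch of how one image-text pair with person-to-region links might be represented. The class names, field names, and the example caption and coordinates are all hypothetical illustrations, not the dataset's actual file format or schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Box:
    """Hypothetical person region in pixel coordinates (top-left x, y, width, height)."""
    x: float
    y: float
    w: float
    h: float


@dataclass
class Sample:
    """Hypothetical record for one image-text pair (not the real schema)."""
    caption: str
    mentions: List[Tuple[int, int]]  # character spans of person mentions in the caption
    boxes: List[Box]                 # detected person regions in the image
    alignment: Dict[int, int] = field(default_factory=dict)  # mention index -> box index


def grounded_mentions(sample: Sample) -> List[Tuple[str, Box]]:
    """Return (mention text, box) pairs for every mention that has an aligned region."""
    pairs = []
    for mention_idx, box_idx in sorted(sample.alignment.items()):
        start, end = sample.mentions[mention_idx]
        pairs.append((sample.caption[start:end], sample.boxes[box_idx]))
    return pairs


# Invented example: two people mentioned, two regions, mentions linked in crossed order.
sample = Sample(
    caption="Alice shakes hands with Bob.",
    mentions=[(0, 5), (24, 27)],
    boxes=[Box(10, 20, 50, 120), Box(200, 25, 55, 118)],
    alignment={0: 1, 1: 0},
)

for name, box in grounded_mentions(sample):
    print(name, box)
```

Keeping the alignment as a separate mapping, rather than storing a box on each mention, leaves room for mentions with no matching region and regions with no mention, which an automatic annotation process is likely to produce.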