CxC Image Captioning Dataset

CxC (Crisscrossed Captions) is an image captioning dataset containing 247,315 human-labeled annotations.
It extends the development and test splits of the MS-COCO dataset with semantic similarity ratings for image-text, text-text, and image-image pairs.
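
To make the three kinds of rated pairs concrete, here is a minimal Python sketch of how such annotations could be held in memory and summarized per pair type. The record layout, the field names, the pair-type labels, and the sample IDs and ratings are all hypothetical and chosen for illustration only; they are not the dataset's actual file format or contents.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Hypothetical labels for the three pair types named in the description.
PAIR_TYPES = ("image-text", "text-text", "image-image")


@dataclass(frozen=True)
class RatedPair:
    """One similarity annotation (illustrative layout, not the real schema)."""
    pair_type: str   # one of PAIR_TYPES
    left_id: str     # ID of the first item (image or caption)
    right_id: str    # ID of the second item (image or caption)
    rating: float    # semantic similarity rating for the pair

    def __post_init__(self):
        if self.pair_type not in PAIR_TYPES:
            raise ValueError(f"unknown pair type: {self.pair_type!r}")


def mean_rating_by_type(pairs):
    """Group ratings by pair type and return the mean rating of each group."""
    grouped = defaultdict(list)
    for pair in pairs:
        grouped[pair.pair_type].append(pair.rating)
    return {pair_type: mean(ratings) for pair_type, ratings in grouped.items()}


# Made-up sample records, only to exercise the code.
sample = [
    RatedPair("image-text", "img_1", "cap_1", 4.0),
    RatedPair("image-text", "img_1", "cap_2", 2.0),
    RatedPair("text-text", "cap_1", "cap_2", 3.5),
    RatedPair("image-image", "img_1", "img_2", 1.0),
]

summary = mean_rating_by_type(sample)
print(summary)
```

Keeping the pair type as an explicit field lets all three kinds of comparison share one record type, so the same grouping code handles each of them.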