HyperAI

Cops-Ref Object Reference Understanding Dataset

Date

2 years ago

Organization

The University of Hong Kong

Publish URL

github.com

License

Other

Cops-Ref stands for Compositional Referring Expression Comprehension. It is a visual reasoning image dataset for object reference understanding, that is, identifying the object in an image that a text description refers to. The dataset contains 75,299 real images, 148,712 text descriptions, and 1,307,885 candidate regions.

The dataset has two main features. The first is a new text generation engine that combines reasoning logic with visual features to generate text descriptions of varying complexity. The second is a new test setting that introduces semantically similar images as distractors during testing.