Date

5 years ago

Organization

Publish URL

ai.google.com

Paper URL

www.aclweb.org

Tags

Object Detection

The dataset was released by Google in 2018 and contains about 3.3 million image-caption pairs. The team built an automatic pipeline that extracts, filters, and processes candidate image-text pairs from billions of web pages. The dataset is divided into training, validation, and test sets. The training set consists of 3,318,333 image URL/caption pairs, with a vocabulary of 51,201 token types and an average of 10.3 tokens per caption. The validation set consists of 15,840 image URL/caption pairs. In addition, the team provides machine-generated image labels for 2,007,528 of the image URL/caption pairs in the training set.

Related paper: Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning
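The statistics quoted above (number of pairs, vocabulary size, average tokens per caption) can be recomputed from a local copy of the data. The following is a minimal sketch, not the authors' pipeline. It assumes a tab-separated file with the caption in the first column and the image URL in the second, and it assumes simple whitespace tokenization. The function name `caption_stats` and the sample rows are hypothetical, so the numbers it produces on the real data may differ from the published figures.

```python
import csv
import io
from collections import Counter


def caption_stats(tsv_file):
    """Return (pair_count, vocabulary_size, mean_tokens_per_caption).

    Assumes each row is "caption<TAB>image_url" and that captions
    are tokenized by splitting on whitespace.
    """
    vocab = Counter()
    pairs = 0
    total_tokens = 0
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip malformed lines with no URL column
        tokens = row[0].split()
        vocab.update(tokens)
        pairs += 1
        total_tokens += len(tokens)
    mean = total_tokens / pairs if pairs else 0.0
    return pairs, len(vocab), mean


# Tiny in-memory sample standing in for the real file (hypothetical rows).
sample = io.StringIO(
    "a dog runs on the beach\thttp://example.com/1.jpg\n"
    "a cat sleeps on the sofa\thttp://example.com/2.jpg\n"
)
pairs, vocab_size, mean_len = caption_stats(sample)
print(pairs, vocab_size, mean_len)
```

To run this on a downloaded split, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points at the local TSV file.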

Conceptual Captions Dataset (CC12M)

Related Datasets

Groundsource Global Flood Events Dataset

Vehicles OpenImages Vehicle Image Dataset

Hand Gestures Labbled Gesture Car Game Dataset

X-ray Contraband Detection Dataset
