HyperAI

[Image Segmentation Dataset Roundup] ByteDance Releases COCONut, Accepted to CVPR 2024; Try Segment Anything Now!


With the continued development of computer vision, image segmentation has proven valuable in many fields, and new image segmentation datasets have appeared in quick succession in recent years. Last month, ByteDance released "COCONut", the first large-scale panoptic segmentation dataset, giving research in this field a fresh boost.

This week, HyperAI has compiled and analyzed 10 high-quality image segmentation datasets to better promote the progress of related research.

In addition, the popular project "Segment Anything" on GitHub is also available in the "Tutorials" section of hyper.ai's official website! Come and experience the world of image segmentation!

Run online: https://go.hyper.ai/4GUjy

1. COCONut Large-Scale Image Segmentation Dataset

Publishing Agency: ByteDance

Release time: 2024

Estimated size: 2.27 GB

Download address: https://go.hyper.ai/D1XHs

COCONut is the first large-scale, manually annotated panoptic segmentation dataset, released by ByteDance. It contains about 383K images and 5.18 million manually annotated panoptic segmentation masks. The work was accepted to CVPR 2024.
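To make "panoptic segmentation mask" concrete, here is a minimal sketch of how such masks are commonly stored. It assumes the COCO panoptic convention, in which each pixel of an RGB PNG encodes a segment id as R + 256*G + 256^2*B. That COCONut files follow this convention is an assumption here, and the helper names and toy pixel values are hypothetical.

```python
from collections import Counter

def rgb_to_segment_id(r, g, b):
    """Decode one pixel of a COCO-style panoptic PNG into a segment id."""
    return r + 256 * g + 256 * 256 * b

def segment_id_to_rgb(segment_id):
    """Encode a segment id back into an (R, G, B) triple."""
    r = segment_id % 256
    g = (segment_id // 256) % 256
    b = (segment_id // (256 * 256)) % 256
    return r, g, b

def pixels_per_segment(rgb_image):
    """Count pixels per segment id in an image given as rows of (R, G, B) tuples."""
    counts = Counter()
    for row in rgb_image:
        for r, g, b in row:
            counts[rgb_to_segment_id(r, g, b)] += 1
    return counts

if __name__ == "__main__":
    # Toy 2x2 "image" (made-up values): two pixels of one segment,
    # one unlabeled pixel (id 0), and one pixel of another segment.
    toy = [
        [(10, 1, 0), (10, 1, 0)],
        [(0, 0, 0), (7, 0, 0)],
    ]
    print(pixels_per_segment(toy))
```

In this scheme every pixel belongs to exactly one segment, which is what distinguishes a panoptic mask from overlapping instance masks.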

2. Pascal Panoptic Parts Panoptic Segmentation Dataset

Publishing Agency: Eindhoven University of Technology

Release time: 2021

Estimated size: 157.78 MB

Download address: https://go.hyper.ai/KD9NU

This dataset provides annotations for the part-aware panoptic segmentation task on the PASCAL VOC 2010 dataset. The related work was accepted to CVPR 2021.

3. PASCAL-5i Few-Shot Image Segmentation Dataset

Publishing Agency: Georgia Institute of Technology

Release time: 2020

Estimated size: 112.42 MB

Download address: https://go.hyper.ai/oNGRX

PASCAL-5i is a dataset for evaluating few-shot image segmentation. It is divided into 4 parts (folds), each containing 5 categories, for a total of 20 categories.
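The 4-by-5 split can be sketched as follows. This is a minimal illustration that assumes the 20 PASCAL VOC classes are numbered 1 to 20 and assigned to folds in contiguous blocks, as is commonly done; the text above only states that there are 4 parts of 5 categories each, so the exact assignment and the function names are assumptions.

```python
from typing import List

NUM_CLASSES = 20                               # PASCAL VOC object classes
NUM_FOLDS = 4                                  # parts of PASCAL-5i
CLASSES_PER_FOLD = NUM_CLASSES // NUM_FOLDS    # 5 classes per part

def fold_classes(fold: int) -> List[int]:
    """Class ids held out for evaluation in the given fold (0-based).

    Assumes contiguous assignment: fold 0 -> 1..5, fold 1 -> 6..10, and so on.
    """
    if not 0 <= fold < NUM_FOLDS:
        raise ValueError("fold must be in 0..%d" % (NUM_FOLDS - 1))
    start = fold * CLASSES_PER_FOLD + 1
    return list(range(start, start + CLASSES_PER_FOLD))

def train_classes(fold: int) -> List[int]:
    """Class ids available for training when `fold` is held out."""
    held_out = set(fold_classes(fold))
    return [c for c in range(1, NUM_CLASSES + 1) if c not in held_out]

if __name__ == "__main__":
    for fold in range(NUM_FOLDS):
        print(fold, fold_classes(fold), len(train_classes(fold)))
```

Holding out one fold at a time means the model is always evaluated on categories it never saw during training, which is the point of a few-shot benchmark.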

4. SUN09 Image Segmentation Dataset

Publishing Agency: Massachusetts Institute of Technology

Release time: 2010

Estimated size: 8.15 GB

Download address: https://go.hyper.ai/PWjWo

The SUN09 dataset consists of 12,000 annotated images covering more than 200 object categories. It contains natural, indoor, and outdoor images. Each image contains an average of 7 different annotated objects, and each object occupies on average 5% of the image area. The dataset was presented at IEEE CVPR 2010.

5. PASCAL VOC 2011 Image Segmentation Dataset

Publishing Agency: University of Leeds

Release time: 2011

Estimated size: 1.7 GB

Download address: https://go.hyper.ai/bXb4O

PASCAL VOC 2011 is an image segmentation dataset. The training set contains 2,223 images with 5,034 objects; the test set contains 1,111 images with 2,028 objects. In total, there are more than 5,000 accurately segmented objects for training.

6. PhraseCut Language-Based Image Segmentation Dataset

Publishing Agency: University of Massachusetts Amherst

Release time: 2020

Download address: https://go.hyper.ai/bvzRm

The PhraseCut dataset contains 77,262 images and 345,486 phrase-region pairs. It is built on the Visual Genome dataset: existing annotations are used to generate a set of challenging referring phrases, and the regions corresponding to these phrases are manually annotated.

7. MPI3D 3D Image Disentanglement Dataset

Publishing Agency: Max Planck Institute for Intelligent Systems

Release time: 2019

Download address: https://go.hyper.ai/JfmOO

MPI3D, named after the Max Planck Institute for Intelligent Systems, consists of more than 1 million images of physical 3D objects. The images vary along seven factors, such as the color, shape, size, and position of the objects. This dataset can be used to test representation learning algorithms in simulated and real environments. The related work was accepted to NeurIPS 2019.

8. CryoNuSeg Instance Segmentation Dataset

Publishing Agency: Medical University of Vienna

Release time: 2023

Estimated size: 160 MB

Download address: https://go.hyper.ai/Ybpbg

CryoNuSeg is a dataset for nuclei instance segmentation in frozen-section, H&E-stained tissue images. It contains images from 10 human organs, each with a fixed size of 512×512 pixels, and provides 3 sets of manual annotations so that intra-observer and inter-observer variability can be measured.
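One simple way to quantify how closely two annotations agree is the Dice coefficient between their masks. The sketch below is an illustration only: it treats each annotation as a binary foreground mask (a simplification of instance segmentation), the toy masks are made up, and it is not presented as the metric used by the CryoNuSeg authors.

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Returns 1.0 for identical masks (including two empty masks)
    and 0.0 when the foregrounds do not overlap at all.
    """
    flat_a = [p for row in mask_a for p in row]
    flat_b = [p for row in mask_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total

if __name__ == "__main__":
    # Toy 2x3 masks from two hypothetical annotators:
    # 3 foreground pixels each, 2 of them shared.
    annotator_1 = [[0, 1, 1],
                   [0, 1, 0]]
    annotator_2 = [[0, 1, 0],
                   [0, 1, 1]]
    print(dice(annotator_1, annotator_2))
```

Comparing two annotators gives an inter-observer score, while comparing two passes by the same annotator gives an intra-observer score.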

9. TrashCan Instance Segmentation Dataset

Publishing Agency: University of Minnesota (hosted on the University Digital Conservancy)

Release time: 2020

Estimated size: 18.3 GB

Download address: https://go.hyper.ai/dxw78

TrashCan is an instance segmentation dataset of underwater trash. It consists of 7,212 annotated images showing underwater trash, remotely operated vehicles (ROVs), and seafloor flora and fauna. The annotations are provided as instance segmentation masks, and the images are sourced from the J-EDI dataset.

10. FSS-1000 Few-Shot Image Segmentation Dataset

Publishing Agency: The Hong Kong University of Science and Technology

Release time: 2019

Estimated size: 7.56 GB

Download address: https://go.hyper.ai/eTDiv

FSS-1000 is a few-shot segmentation dataset containing 1,000 classes. It explores training a model to complete image recognition tasks using only 5 manually annotated images. The dataset contains many objects that have never appeared or been annotated in previous datasets, such as small everyday objects, merchandise, cartoon characters, and logos.

Segment Anything Tutorial

Segment Anything Model (SAM) is a computer vision model that generates high-quality segmentation masks from input prompts such as points or boxes, and can also generate masks for all objects in an image. The model was trained on a dataset of 11 million images and 1.1 billion masks, and shows strong zero-shot performance on a variety of segmentation tasks.

Run online: https://go.hyper.ai/4GUjy

These are the 10 image segmentation and classification datasets compiled by HyperAI. If you have resources that you would like to see included on the hyper.ai official website, you are welcome to leave a message or submit a contribution!

About HyperAI

HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:

* Provided domestic accelerated download nodes for 1200+ public datasets

* Included 300+ classic and popular online tutorials

* Interpreted 100+ AI4Science paper cases

* Supported search for 500+ related terms

* Hosted the first complete Chinese documentation for Apache TVM in China

Visit the official website to start your learning journey:

https://hyper.ai