[Summary of Image Segmentation Datasets] ByteDance Releases COCONut, Accepted to CVPR 2024; Try Segment Anything Now!

As computer vision technology continues to develop, image segmentation has shown significant application value in many fields, and a wide range of image segmentation datasets has appeared in recent years. Last month, ByteDance released "COCONut", the first large-scale panoptic image segmentation dataset, injecting fresh momentum into research in this field.
This week, HyperAI has compiled 10 high-quality image segmentation datasets to help advance related research.
In addition, the popular GitHub project "Segment Anything" is now available in the "Tutorials" section of the hyper.ai official website. Come and try image segmentation for yourself!
Run online: https://go.hyper.ai/4GUjy
1. COCONut Large-Scale Image Segmentation Dataset
Publishing Agency: ByteDance
Release time: 2024
Estimated size: 2.27 GB
Download address: https://go.hyper.ai/D1XHs
COCONut, released by ByteDance, is the first large-scale, manually annotated panoptic image segmentation dataset. It contains about 383K images and 5.18 million manually annotated panoptic segmentation masks. The work was accepted to CVPR 2024.
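Panoptic masks are often stored as RGB PNG files in which each pixel's color encodes a segment id. The sketch below assumes the COCO panoptic convention (id = R + 256·G + 256²·B); whether COCONut's files follow exactly this layout is an assumption to check against the dataset's own documentation, and `tiny_mask` is made-up illustrative data.

```python
from collections import Counter


def rgb_to_id(r, g, b):
    """Decode a segment id from an RGB color (COCO panoptic convention, assumed here)."""
    return r + 256 * g + 256 * 256 * b


def segment_areas(pixels):
    """Count pixels per segment id in a 2D grid of (R, G, B) tuples."""
    return Counter(rgb_to_id(*px) for row in pixels for px in row)


# Hypothetical 2x3 mask containing two segments (ids 1 and 2).
tiny_mask = [
    [(1, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(1, 0, 0), (2, 0, 0), (2, 0, 0)],
]
areas = segment_areas(tiny_mask)
print(dict(areas))
```

In a COCO-style layout, a companion JSON file would then map each segment id to its category, which is what makes the annotation panoptic rather than purely instance-level.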
2. Pascal Panoptic Parts Panoptic Segmentation Dataset
Publishing Agency: Eindhoven University of Technology
Release time: 2021
Estimated size: 157.78 MB
Download address: https://go.hyper.ai/KD9NU
This dataset provides annotations for the part-aware panoptic segmentation task on the PASCAL VOC 2010 dataset. The related work was accepted to CVPR 2021.
3. PASCAL-5i Few-Shot Image Segmentation Dataset
Publishing Agency: Georgia Institute of Technology
Release time: 2020
Estimated size: 112.42 MB
Download address: https://go.hyper.ai/oNGRX
PASCAL-5i is a dataset for evaluating few-shot image segmentation. It is divided into 4 parts (folds), each containing 5 categories, for a total of 20 categories.
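The 4 × 5 split can be sketched in a few lines. This assumes the commonly used convention in which fold i holds classes 5i+1 through 5i+5 in PASCAL VOC's class order and the remaining 15 classes are used for training; the function names are illustrative, not part of any official toolkit.

```python
NUM_CLASSES = 20
NUM_FOLDS = 4
CLASSES_PER_FOLD = NUM_CLASSES // NUM_FOLDS  # 5


def fold_classes(fold):
    """Return the 1-based class ids held out for evaluation in the given fold (0 to 3)."""
    if not 0 <= fold < NUM_FOLDS:
        raise ValueError(f"fold must be in [0, {NUM_FOLDS - 1}], got {fold}")
    start = fold * CLASSES_PER_FOLD + 1
    return list(range(start, start + CLASSES_PER_FOLD))


def train_classes(fold):
    """Return the class ids that remain available for training in the given fold."""
    held_out = set(fold_classes(fold))
    return [c for c in range(1, NUM_CLASSES + 1) if c not in held_out]


for fold in range(NUM_FOLDS):
    print(fold, fold_classes(fold))
```

Evaluating on each fold in turn while training on the other three is what lets the benchmark measure segmentation of categories the model has not seen during training.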
4. SUN09 Image Segmentation Dataset
Publishing Agency: Massachusetts Institute of Technology
Release time: 2010
Estimated size: 8.15 GB
Download address: https://go.hyper.ai/PWjWo
The SUN09 dataset consists of 12,000 annotated images covering more than 200 object categories, including natural, indoor, and outdoor scenes. Each image contains an average of 7 different annotated objects, and each object occupies an average of 5% of the image area. The dataset was published at IEEE CVPR 2010.
5. PASCAL VOC 2011 Image Segmentation Dataset
Publishing Agency: University of Leeds
Release time: 2011
Estimated size: 1.7 GB
Download address: https://go.hyper.ai/bXb4O
PASCAL VOC 2011 is an image segmentation dataset. The training set contains 2,223 images with 5,034 objects; the test set contains 1,111 images with 2,028 objects. In total, more than 5,000 accurately segmented objects are available for training.
6. PhraseCut Language-Based Image Segmentation Dataset
Publishing Agency: University of Massachusetts Amherst
Release time: 2020
Download address: https://go.hyper.ai/bvzRm
The PhraseCut dataset contains 77,262 images and 345,486 phrase-region pairs. It is built on the Visual Genome dataset: existing annotations are used to generate a set of challenging referring phrases, and the regions corresponding to those phrases are manually annotated.
7. MPI3D 3D Image Disentanglement Dataset
Publishing Agency: Max Planck Institute for Intelligent Systems
Release time: 2019
Download address: https://go.hyper.ai/JfmOO
MPI3D, named after the Max Planck Institute (MPI) where it was created, consists of more than 1 million images of physical 3D objects. The images vary along seven factors, such as the color, shape, size, and position of the objects. The dataset can be used to test representation learning algorithms in both simulated and real-world settings. The related work was accepted to NeurIPS 2019.
8. CryoNuSeg Instance Segmentation Dataset
Publishing Agency: Medical University of Vienna
Release time: 2023
Estimated size: 160 MB
Download address: https://go.hyper.ai/Ybpbg
CryoNuSeg is a dataset for nuclei instance segmentation in frozen-section, H&E-stained tissue images. It contains images from 10 human organs, each with a fixed size of 512×512 pixels, and provides 3 sets of manual annotations so that intra-observer and inter-observer variability can be measured.
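One simple way to quantify agreement between two annotation sets is the Dice overlap of their foreground masks. The sketch below is a simplification under stated assumptions: it treats each annotation as a binary nuclei-versus-background mask rather than per-instance labels, it is not necessarily the metric the dataset's authors used, and the two tiny masks are made-up data.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as equally sized 2D lists of 0/1."""
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")

    intersection = sum(
        1
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
        if a and b
    )
    size_a = sum(1 for row in mask_a for v in row if v)
    size_b = sum(1 for row in mask_b for v in row if v)

    if size_a + size_b == 0:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical annotations of the same 2x3 patch by two annotators.
annotator_1 = [[0, 1, 1], [0, 1, 0]]
annotator_2 = [[0, 1, 0], [0, 1, 1]]
print(round(dice(annotator_1, annotator_2), 3))
```

Comparing two annotations from the same person would give an intra-observer score, while comparing annotations from different people would give an inter-observer score.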
9. TrashCan Instance Segmentation Dataset
Publishing Agency: University of Minnesota (hosted on the University Digital Conservancy)
Release time: 2020
Estimated size: 18.3 GB
Download address: https://go.hyper.ai/dxw78
TrashCan is an instance segmentation dataset of underwater trash. It consists of 7,212 annotated images showing various kinds of underwater trash, remotely operated vehicles (ROVs), and seafloor flora and fauna. The annotations are provided in instance segmentation format, and the images are sourced from the J-EDI dataset.
10. FSS-1000 Few-Shot Image Segmentation Dataset
Publishing Agency: The Hong Kong University of Science and Technology
Release time: 2019
Estimated size: 7.56 GB
Download address: https://go.hyper.ai/eTDiv
FSS-1000 is a few-shot segmentation dataset containing 1,000 classes. It explores training a model to complete image recognition tasks using only 5 manually annotated images. The dataset contains a large number of objects that have never appeared or been annotated in previous datasets, such as small everyday objects, merchandise, cartoon characters, and logos.
Segment Anything Tutorial

The Segment Anything Model (SAM) is a computer vision model that generates high-quality segmentation masks from input prompts such as points or boxes, and can also generate masks for all objects in an image. The model was trained on a dataset of 11 million images and 1.1 billion masks, and it shows strong zero-shot performance on a variety of segmentation tasks, which is what lets it "segment anything".
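Prompting SAM with a single point can be sketched as follows, using the open-source `segment_anything` package. The function name, the `checkpoint_path` argument, and the choice of the `vit_b` model are illustrative assumptions; running it requires NumPy, PyTorch, the package itself, and a downloaded checkpoint file.

```python
def segment_with_point(image_rgb, checkpoint_path, x, y, model_type="vit_b"):
    """Return the highest-scoring SAM mask for one foreground click at (x, y).

    image_rgb: HxWx3 uint8 RGB NumPy array.
    checkpoint_path: local path to a downloaded SAM checkpoint (hypothetical here).
    """
    # Imported lazily so this file loads even when the heavy dependencies are absent.
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # computes the image embedding once

    # Label 1 marks a foreground point; label 0 would mark a background point.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True,  # return several candidate masks
    )
    return masks[int(np.argmax(scores))]
```

A box prompt works the same way: pass `box=np.array([x0, y0, x1, y1])` to `predict` instead of, or together with, the point arguments.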
Run online: https://go.hyper.ai/4GUjy
Those are the 10 image segmentation datasets compiled by HyperAI. If you have resources you would like to see included on the hyper.ai official website, you are welcome to leave a message or send us a submission!
About HyperAI
HyperAI (hyper.ai) is the leading artificial intelligence and high-performance computing community in China. We are committed to becoming the infrastructure for data science in China and to providing rich, high-quality public resources for domestic developers. So far, we have:
* Provided domestic accelerated download nodes for 1,200+ public datasets
* Included 300+ classic and popular online tutorials
* Interpreted 100+ AI4Science paper cases
* Supported search for 500+ related terms
* Hosted the first complete Chinese documentation for Apache TVM in China
Visit the official website, hyper.ai, to start your learning journey.