HyperAI

9 Machine Learning Datasets You Can’t Miss

2 years ago
Information
Jiaxin Sun

Content Overview: This issue rounds up 9 datasets recently updated on HyperAI's official website, covering three areas: face recognition, pose estimation, and autonomous driving.
Keywords: face recognition, pose estimation, autonomous driving
This article was first published on the WeChat official account: HyperAI

Recently, HyperAI's official website has updated 300+ high-quality public data sets, covering data modalities such as images, videos, audio, RGB-D, etc.

This article summarizes 9 representative data sets for you to download and use as needed.

Direct access to the HyperAI dataset portal:

Face Recognition

Face recognition is one of the applications of computer vision. Training data for face recognition that is large in volume, stable in quality, and free of "impurities" makes a high-quality resource for research.

VGG-Face2 face recognition dataset

The VGG-Face2 dataset is a face image dataset.

The images in the dataset all come from Google Image Search. The people in the dataset vary greatly in pose, age, race, and occupation.

VGG-Face2 Dataset

Publishing Agency: University of Oxford

Quantity included: 3.31 million images

Data format: images

Data size: 37.49 GB

Release time: 2017

Download address: hyper.ai/datasets/5711

Helen face dataset

The Helen dataset consists of 2,330 face images of 400×400 pixels, split into 2,000 training images and 330 test images. The main components of each face carry highly accurate, detailed, and consistent annotations.

Helen Dataset

Publishing Agency: University of Illinois

Quantity included: 2,330 face images of 400×400 pixels

Data format: images

Data size: 1.02 GB

Release time: 2012

Download address: hyper.ai/datasets/16552

FairFace face dataset

FairFace is a more racially balanced dataset of face images. The dataset contains 108,501 images from 7 different ethnic groups (White, Black, Indian, East Asian, Southeast Asian, Middle Eastern, and Latino).

FairFace Dataset

Publishing Agency: University of California, Los Angeles

Quantity included: 108,501 images

Data format: images

Data size: 2.49 GB

Release time: 2020

Download address: hyper.ai/datasets/17876

Human Pose Estimation

Pose estimation uses a geometric model to represent the structure and shape of an object. Current difficulties include complex backgrounds and a shortage of samples showing complex poses.

MPI-INF-3DHP 3D human pose estimation dataset

MPI-INF-3DHP is a 3D human pose estimation dataset with images in both indoor and outdoor environments. The dataset contains more than 1.3 million images, recording 8 types of activities of 8 participants from 14 camera angles.

MPI-INF-3DHP Dataset

Publishing Agency: Saarland University

Quantity included: More than 1.3 million images

Data format: video

Data size: 21.77 GB

Release time: 2016

Download address: hyper.ai/datasets/17262

HandNet hand pose dataset

The HandNet hand pose dataset contains depth maps of 10 participants' hands deforming non-rigidly in front of a RealSense RGB-D camera. The dataset contains a total of 214,971 depth maps: 202,198 for training, 10,000 for testing, and 2,773 for validation.

HandNet Dataset

Publishing Agency: Technion - Israel Institute of Technology

Quantity included: 214,971 images

Data format: images

Data size: 12.85 GB

Release time: 2015

Download address: hyper.ai/datasets/19801
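
As a quick sanity check on the HandNet split quoted above, the short Python sketch below adds up the three subsets and works out each one's share of the total. The dictionary keys and the helper name `split_shares` are illustrative choices, not part of the dataset's own tooling.

```python
# Split sizes as quoted in the HandNet description above.
splits = {"train": 202_198, "test": 10_000, "validation": 2_773}


def split_shares(sizes):
    """Return each split's share of the total, as a percentage rounded to 2 places."""
    n = sum(sizes.values())
    return {name: round(100 * count / n, 2) for name, count in sizes.items()}


total = sum(splits.values())
shares = split_shares(splits)

print(f"total depth maps: {total:,}")
for name, pct in shares.items():
    print(f"{name:>10}: {pct}%")
```

The three subsets add up to the stated total of 214,971 depth maps, with the training set making up the large majority.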

3DPW Pose Dataset

3DPW stands for 3D Poses in the Wild. It is the first outdoor dataset with accurate 3D poses and can be used for pose estimation research. The dataset includes 60 video sequences, 3D human body scans, and 3D human models.

3DPW Dataset

Publishing Agency: Leibniz University Hannover

Quantity included: 60 video sequences

Data format: video

Data size: 4.55 GB

Release time: 2018

Download address: hyper.ai/datasets/16463

Self-Driving

Artificial intelligence can play the role of the driver in autonomous driving. By collecting, analyzing, and processing information about conditions on the road, it can carry out driving operations in place of a human.

Comma.ai Autonomous Driving Video Dataset

The Comma.ai dataset is a video dataset for autonomous driving. It contains 10 videos recorded at 20 Hz, totaling 7.25 hours. The dataset also includes measurements such as car speed, acceleration, steering angle, GPS coordinates, and gyroscope angle.

Comma.ai Dataset

Publishing Agency: Comma.ai

Quantity included: 7.25 hours of video

Data format: video

Data size: 44.96 GB

Release time: 2016

Download address: hyper.ai/datasets/5200
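
To get a feel for the scale of the Comma.ai videos, the figures above (7.25 hours, 10 videos, 20 Hz) can be turned into an approximate frame count. This is only a rough sketch: it assumes the 20 Hz rate holds for the whole recording, and the per-video average treats the 10 videos as if they were equally long, which the source does not state.

```python
HOURS_OF_VIDEO = 7.25   # total duration quoted above
FRAME_RATE_HZ = 20      # recording rate quoted above
NUM_VIDEOS = 10         # number of videos quoted above


def frame_count(hours, rate_hz):
    """Approximate number of frames in `hours` of video at a constant `rate_hz`."""
    return round(hours * 3600 * rate_hz)


total_frames = frame_count(HOURS_OF_VIDEO, FRAME_RATE_HZ)
# Average only; individual videos may be longer or shorter.
avg_minutes_per_video = HOURS_OF_VIDEO * 60 / NUM_VIDEOS

print(f"approx. total frames: {total_frames:,}")
print(f"average video length: {avg_minutes_per_video} minutes")
```

Under these assumptions the dataset holds roughly half a million frames, which helps when budgeting storage and decoding time.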

Argoverse Autonomous Driving Dataset

The Argoverse dataset consists of two parts: 3D Tracking and Motion Forecasting.

The Argoverse 3D Tracking dataset contains 3D tracking annotations for 113 scenes. Each clip is 15 to 30 seconds long, and the clips together contain 11,319 tracked objects. Every clip in the training and test sets is annotated with all objects within five meters of the drivable area, each presented as a 3D bounding box. The dataset can be used in fields such as autonomous driving.

The Argoverse Motion Forecasting dataset is designed for motion prediction models. It contains 327,793 scenes, each lasting 5 seconds, with a 2D bird's-eye-view trajectory of each tracked object sampled at 10 Hz. The scenes were drawn from more than 1,000 hours of street driving, and the dataset can be used for research in areas such as autonomous driving.

Argoverse Dataset

Publishing Agency: ARGO AI

Quantity included: More than 300,000 scenes

Data size: 260.38 GB

Release time: 2019

Download address: hyper.ai/datasets/8858
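
The Motion Forecasting figures quoted above (327,793 scenes of 5 seconds each, sampled at 10 Hz) translate into concrete sizes with a little arithmetic. The sketch below assumes uniform sampling and non-overlapping scenes; the variable names are illustrative only.

```python
NUM_SCENES = 327_793    # scene count quoted above
SCENE_SECONDS = 5       # duration of each scene
SAMPLE_RATE_HZ = 10     # trajectory sampling rate

# Samples in one full-length trajectory, assuming uniform sampling.
samples_per_track = SCENE_SECONDS * SAMPLE_RATE_HZ

# Combined duration of all scenes, assuming they do not overlap.
total_scene_hours = NUM_SCENES * SCENE_SECONDS / 3600

print(f"samples per full trajectory: {samples_per_track}")
print(f"combined scene duration: {total_scene_hours:.1f} hours")
```

Under these assumptions a full trajectory has 50 samples, and the scenes add up to about 455 hours selected from the 1,000+ hours of driving mentioned above.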

Talk2Car Autonomous Driving Dataset

The Talk2Car dataset is an object referral dataset. It contains commands for self-driving cars written in natural language, which means that passengers can give commands to a self-driving car by speaking.

The Talk2Car dataset builds on the nuScenes dataset and includes a broad set of sensor modalities, namely semantic maps, GPS, LiDAR, RADAR, and 360° RGB images with 3D bounding box annotations.

Talk2Car Dataset

Publishing Agency: KU Leuven, Belgium

Data format: images

Data size: 1.65 GB

Release time: 2019

Download address: hyper.ai/datasets/18926
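
Before downloading, it may help to know how much disk space the nine datasets need in total. The Python sketch below simply adds up the "Data size" values listed in this article; the dictionary is a hand-made summary for illustration, and the actual space needed after unpacking may differ.

```python
# "Data size" values in GB, copied from the nine entries in this article.
sizes_gb = {
    "VGG-Face2": 37.49,
    "Helen": 1.02,
    "FairFace": 2.49,
    "MPI-INF-3DHP": 21.77,
    "HandNet": 12.85,
    "3DPW": 4.55,
    "Comma.ai": 44.96,
    "Argoverse": 260.38,
    "Talk2Car": 1.65,
}

total_gb = round(sum(sizes_gb.values()), 2)
largest = max(sizes_gb, key=sizes_gb.get)
largest_share = round(100 * sizes_gb[largest] / total_gb, 1)

print(f"total download size: {total_gb} GB")
print(f"largest dataset: {largest} ({largest_share}% of the total)")
```

Argoverse alone accounts for about two thirds of the total, so it is worth fetching separately if disk space is tight.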

To search or download the datasets, visit the following links:

-- over--