Towards Good Practices for Deep 3D Hand Pose Estimation
Guo Hengkai, Wang Guijin, Chen Xinghao, Zhang Cairong
Abstract
3D hand pose estimation from a single depth image is an important and challenging problem for human-computer interaction. Recently, deep convolutional networks (ConvNets) with sophisticated designs have been employed to address it, but the improvement over traditional random-forest-based methods is not so apparent. To identify good practices and improve the performance of hand pose estimation, we propose a tree-structured Region Ensemble Network (REN) for direct 3D coordinate regression. It first partitions the last convolutional outputs of the ConvNet into several grid regions. The results from separate fully-connected (FC) regressors on each region are then integrated by another FC layer to perform the estimation. By exploiting several training strategies, including data augmentation and smooth L1 loss, the proposed REN significantly improves the performance of the ConvNet in localizing hand joints. The experimental results demonstrate that our approach achieves the best performance among state-of-the-art algorithms on three public hand pose datasets. We also evaluate our method on fingertip detection and human pose datasets and obtain state-of-the-art accuracy.
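The region-ensemble idea and the smooth L1 loss described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. This is not the authors' implementation. The 2 x 2 grid, the layer sizes, the ReLU after each branch, the random initialization, and all function names (split_into_grid, region_ensemble_forward, and so on) are assumptions made for illustration, and the smooth L1 form used here (quadratic below an absolute error of 1, linear above) is the commonly used definition rather than one stated in the abstract.

```python
import random


def smooth_l1(pred, target):
    """Mean smooth L1 loss: 0.5*d^2 if |d| < 1, else |d| - 0.5 (assumed form)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def split_into_grid(fmap, rows, cols):
    """Split a C x H x W feature map (nested lists) into rows*cols flat regions."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    rh, cw = height // rows, width // cols
    regions = []
    for r in range(rows):
        for c in range(cols):
            regions.append([fmap[ch][y][x]
                            for ch in range(channels)
                            for y in range(r * rh, (r + 1) * rh)
                            for x in range(c * cw, (c + 1) * cw)])
    return regions


def init_fc(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases for one FC layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Fully-connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def region_ensemble_forward(fmap, branch_params, fusion_params, rows=2, cols=2):
    """One FC branch per grid region, concatenated and fused by a final FC layer."""
    fused_input = []
    for region, (w, b) in zip(split_into_grid(fmap, rows, cols), branch_params):
        hidden = linear(region, w, b)
        fused_input.extend(max(0.0, h) for h in hidden)  # ReLU (assumed)
    w, b = fusion_params
    return linear(fused_input, w, b)


# Toy sizes, chosen only for illustration.
rng = random.Random(0)
C, H, W, hidden_units, num_joints = 2, 4, 4, 8, 3
fmap = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
region_size = C * (H // 2) * (W // 2)
branch_params = [init_fc(region_size, hidden_units, rng) for _ in range(4)]
fusion_params = init_fc(4 * hidden_units, 3 * num_joints, rng)

pose = region_ensemble_forward(fmap, branch_params, fusion_params)
loss = smooth_l1(pose, [0.0] * len(pose))
```

Here pose holds 3 * num_joints values, one (x, y, z) triple per joint, which matches the direct 3D coordinate regression described above. In a trained network the branch and fusion weights would be learned jointly by minimizing the loss over the training set.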