Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Document Type: Original Article

Authors

Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran

Abstract

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems model hand postures using mathematical algorithms, and convolutional neural networks have so far provided the best results in hand posture recognition. In this paper, we propose a new method for estimating the hand skeletal posture using deep convolutional neural networks. To simplify the proposed method and make it more practical, depth information is ignored, so only plain color images of hands are used as inputs to the system. The proposed method is evaluated on two high-diversity datasets, Mixamo and RWTH, which contain 43,986 and 1,160 color images, respectively; 74% of these images are used as the training set and the remaining 26% as the evaluation set. The experiments show that the proposed method outperforms state-of-the-art methods in both hand posture recognition and sign language detection.
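The pipeline described above, regressing a skeletal hand pose directly from a plain color image with a deep CNN and evaluating on a 74%/26% split, can be sketched as follows. This is a minimal illustrative sketch, not the architecture proposed in the paper; the 21-keypoint hand skeleton, the layer sizes, and the 128x128 input resolution are assumptions made only for the example.

```python
# Minimal sketch of a CNN-based hand skeletal pose regressor on RGB images.
# Assumptions (not from the paper): 21 keypoints, small conv backbone,
# random tensors standing in for the Mixamo/RWTH images and annotations.
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader, random_split

NUM_KEYPOINTS = 21  # assumed joint count: wrist + 4 joints per finger

class HandPoseCNN(nn.Module):
    def __init__(self, num_keypoints: int = NUM_KEYPOINTS):
        super().__init__()
        # Feature extractor: stacked conv blocks over 3-channel color input.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regression head: one (x, y) coordinate pair per keypoint.
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(128, 2 * num_keypoints))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x)).view(-1, NUM_KEYPOINTS, 2)

# Placeholder dataset: color images plus normalized keypoint coordinates.
images = torch.randn(100, 3, 128, 128)
keypoints = torch.rand(100, NUM_KEYPOINTS, 2)
dataset = TensorDataset(images, keypoints)

# 74% training / 26% evaluation split, mirroring the protocol in the abstract.
n_train = int(0.74 * len(dataset))
train_set, eval_set = random_split(dataset, [n_train, len(dataset) - n_train])

model = HandPoseCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# One pass over the training split; a real experiment would run many epochs.
for imgs, kps in DataLoader(train_set, batch_size=16, shuffle=True):
    optimizer.zero_grad()
    loss = criterion(model(imgs), kps)
    loss.backward()
    optimizer.step()
```

In practice the random tensors would be replaced by the Mixamo and RWTH images with their keypoint annotations, and a deeper backbone would be needed to approach the accuracy reported in the paper.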

Keywords


References

1. Hosseini, S.M., Nasrabadi, A., Nouri, P. and Farsi, H., “A novel
human gait recognition system”, International Journal of
Computer and Electrical Engineering, Vol. 2, No. 6, (2010), 1043–
1049.  
2. Ohn-Bar, E. and Trivedi, M. M., “Hand gesture recognition in real time for automotive interfaces: A multimodal vision-based approach and evaluations”, IEEE Transactions on Intelligent Transportation Systems, Vol. 15, No. 6, (2014), 2368–2377.
3. Wu, D., Pigou, L., Kindermans, P.J., Le, N.D.H., Shao, L., Dambre,
J. and Odobez, J. M., “Deep dynamic neural networks for
multimodal gesture segmentation and recognition”, IEEE
Transactions on Pattern Analysis and Machine Intelligence, Vol.
38, No. 8, (2016), 1583–1597.  
4. Pradipa, R. and Kavitha, S., “Hand gesture recognition–analysis of
various techniques, methods and their algorithms”, International
Journal of Innovative Research in Science, Engineering and
Technology, Vol. 3, No. 3, (2014), 2003–2010.  
5. Dinh, D.L., Kim, J.T. and Kim, T. S., “Hand gesture recognition and
interface via a depth imaging sensor for smart home appliances”,
Energy Procedia, Vol. 62, No. 62, (2014), 576–582.  
6. Zimmermann, C. and Brox, T., “Learning to estimate 3D hand pose from single RGB images”, In Proceedings of the IEEE International Conference on Computer Vision, IEEE, (2017), 4903–4911.
7. Cardoso, T., Delgado, J. and Barata, J., “Hand gesture recognition
towards enhancing accessibility”, Procedia Computer Science, Vol.
67, (2015), 419–429.  
8. Sharma, R.P. and Verma, G. K., “Human computer interaction using
hand gesture”, Procedia Computer Science, Vol. 54, (2015), 721–
727.  
9. Tang, A., Lu, K., Wang, Y., Huang, J. and Li, H., “A real-time hand
posture recognition system using deep neural networks”, ACM
Transactions on Intelligent Systems and Technology, Vol. 6, No.
2, (2015), 1–23.  
10. Farsi, H. and Mohamadzadeh, S., “Combining Hadamard matrix,
discrete wavelet transform and DCT features based on PCA and
KNN for image retrieval”, Majlesi Journal of Electrical
Engineering, Vol. 7, No. 1, (2013), 9–15.  
11. Dondi, P., Lombardi, L. and Porta, M., “Development of gesture-based human–computer interaction applications by fusion of depth and colour video streams”, IET Computer Vision, Vol. 8, No. 6, (2014), 568–578.
12. Husain, F., Gandhi, S., Nijhawan, T., Agarwal, V., Khatun, S. and Parveen, S., “Gesture Recognition System Using Matlab: A Literature Review”, International Journal of Scientific Research and Management Studies, Vol. 2, No. 11, (2016), 425–432.
13. Khan, R.Z. and Ibraheem, N. A., “Hand gesture recognition: a
literature review”, International Journal of Artificial Intelligence
& Applications, Vol. 3, No. 4, (2012), 161–174.  
14. Hosseini, S.M., Farsi, H. and Yazdi, H. S., “Best clustering around
the color images”, International Journal of Computer and
Electrical Engineering, Vol. 1, No. 1, (2009), 20–25.  
15. Afifi, M., “11K Hands: gender recognition and biometric
identification using a large dataset of hand images”, Multimedia
Tools and Applications, Vol. 78, No. 15, (2019), 20835–20854.  
16. Bambach, S., Lee, S., Crandall, D.J. and Yu, C., “Lending a hand:
Detecting hands and recognizing activities in complex egocentric
interactions”, In Proceedings of the IEEE International Conference
on Computer Vision, IEEE, (2015), 1949–1957.  
17. Sezavar, A., Farsi, H. and Mohamadzadeh, S., “Content-based image
retrieval by combining convolutional neural networks and sparse
representation”, Multimedia Tools and Applications, Vol. 78, No.
15, (2019), 20895–20912.
18. John, V., Umetsu, M., Boyali, A., Mita, S., Imanishi, M., Sanma, N.
and Shibata, S., “Real-time hand posture and gesture-based touchless
automotive user interface using deep learning”, In 2017 IEEE
Intelligent Vehicles Symposium (IV), IEEE, (2017), 869–874.  
19. Farsi, H., “Improvement of minimum tracking in minimum statistics
noise estimation method”, Signal Processing: An International
Journal, Vol. 4, No. 1, (2010), 17–23.  
20. Wei, S.E., Ramakrishna, V., Kanade, T. and Sheikh, Y., “Convolutional pose machines”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2016), 4724–4732.
21. Zhao, R., Wang, Y. and Martinez, A. M., “A simple, fast and highly-accurate algorithm to recover 3D shape from 2D landmarks on a single image”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 40, No. 12, (2017), 3059–3066.
22. Lee, S., Bambach, S., Crandall, D.J., Franchak, J.M. and Yu, C.,
“This hand is my hand: A probabilistic approach to hand
disambiguation in egocentric video”, In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition
Workshops, (2014), 543–550.  
23. Mohammadzadeh, S. and Farsi, H., “Image retrieval using color-texture features extracted from Gabor-Walsh wavelet pyramid”, Journal of Information Systems and Telecommunication, Vol. 2, No. 15, (2014), 31–40.
24. Farsi, H., Mozaffarian, M.A. and Rahmani, H., “Improving voice activity detection used in ITU-T G.729B”, In Proceedings of the 3rd WSEAS International Conference on Circuits, Systems, Signal and Telecommunications, Ningbo, China, (2009), 11–15.
25. Molchanov, P., Gupta, S., Kim, K. and Kautz, J., “Hand gesture recognition with 3D convolutional neural networks”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, (2015), 1–7.
26. Mixamo database, Available at: http://www.mixamo.com. 
27. RWTH German fingerspelling database, Available at: http://www-i6.informatik.rwth-aachen.de/~dreuw/fingerspelling.php.