Deep Multi-task Convolutional Neural Networks for Efficient Classification of Face Attributes

Document Type: Original Article


Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran


Facial feature recognition is an important topic in computer vision with numerous applications. The human face plays a significant role in social interaction and personology, and facial features can reveal valuable information such as identity, age, gender, and emotion. This paper presents a technique for estimating age, smile, and gender from facial images. A multi-task deep learning (MT-DL) framework is proposed that estimates these three attributes simultaneously with high accuracy. The proposed approach also aims to reduce the number of trainable network parameters while combining features from different layers to increase overall accuracy. Experiments show that the proposed method outperforms recent techniques on all three accuracy criteria. Moreover, multi-task learning (MTL) improves accuracy by 1.55% in the smile task, 2.04% in the gender task, and 3.52% in the age task, even for tasks with less available data, by exploiting tasks with more available data. Furthermore, the number of trainable parameters in the MTL mode, which estimates all three tasks simultaneously, increases by only about 40% compared to the single-task mode. The proposed method was evaluated on the IMDB-WIKI and GENKI-4K datasets and achieved accuracy comparable to state-of-the-art methods in smile detection, age estimation, and gender classification.
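The parameter-saving argument can be illustrated with a small counting sketch. It assumes hard parameter sharing, that is, one shared convolutional trunk feeding three small task heads, which is a common MTL layout and not necessarily the exact architecture used in the paper. All layer sizes, the 101 age classes, and the helper names (`conv_params`, `dense_params`, `head_params`, `relative_increase`) are hypothetical and chosen only to make the arithmetic concrete; they will not reproduce the reported figure of about 40%.

```python
# Illustrative parameter counting for a shared-trunk multi-task network.
# Every size below is a hypothetical assumption, not a value from the paper.

def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def head_params(feat_dim, hidden, n_classes):
    """A small task head: one hidden layer followed by the output layer."""
    return dense_params(feat_dim, hidden) + dense_params(hidden, n_classes)

# Hypothetical shared trunk: three 3x3 convolutions on an RGB input.
trunk = conv_params(3, 3, 16) + conv_params(3, 16, 32) + conv_params(3, 32, 64)

# Hypothetical heads on a 64-dimensional pooled feature vector.
heads = {
    "smile": head_params(64, 128, 2),
    "gender": head_params(64, 128, 2),
    "age": head_params(64, 128, 101),  # assumed 101 age classes (0 to 100)
}

single_task = trunk + heads["smile"]           # one network for one task
multi_task = trunk + sum(heads.values())       # one trunk shared by all heads
separate = 3 * trunk + sum(heads.values())     # three independent networks

def relative_increase(new, base):
    """Fractional growth of `new` over `base`."""
    return (new - base) / base

print(f"single-task parameters: {single_task}")
print(f"multi-task parameters:  {multi_task}")
print(f"three separate networks: {separate}")
print(f"MTL growth over single-task: {relative_increase(multi_task, single_task):.0%}")
```

Because the trunk is counted once in `multi_task` but three times in `separate`, the multi-task total always stays below the cost of three independent networks. How close the growth over a single-task network comes to a given percentage depends on how large the trunk is relative to the heads.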

