Predicting the Empirical Distribution of Video Quality Scores Using Recurrent Neural Networks

Document Type: Original Article

Authors

Electrical Engineering Department, Sharif University of Technology, Tehran, Iran

Abstract

Video quality assessment is a routine task in the broadcasting industry. Given the length and the sheer number of video files, automated quality assessment is the only practical option. While it is common to measure the quality of a video at the compression stage by comparing it against the raw footage, no reference video is available at later stages of the distribution chain. A no-reference (blind) video quality assessment (NR-VQA) technique is therefore essential. Current NR-VQA methods predict only the mean opinion score (MOS) and provide no further information about the distribution of viewers' scores, even though this distribution is informative for evaluating quality of experience (QoE). In this paper, we propose a method for predicting the empirical distribution of human opinion scores in video quality assessment. To this end, we extract frame-level features and feed them to a recurrent neural network (RNN); the distribution of opinion scores is then predicted in the last layer of the RNN. Experiments on the KoNViD-1k dataset show that the means of the predicted distributions are comparable to or better than the scores produced by previous methods.
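As a rough illustration of the pipeline described above, the following PyTorch sketch maps a sequence of pre-extracted frame-level feature vectors to a histogram over opinion-score bins. The feature dimension, the bidirectional LSTM, the layer sizes, and the five-bin 1-5 score scale are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class ScoreDistributionRNN(nn.Module):
    """Sketch: map per-frame features to a histogram over opinion-score bins."""
    def __init__(self, feat_dim=36, hidden_dim=64, n_bins=5):
        super().__init__()
        # Bidirectional LSTM over the sequence of frame-level features
        # (feat_dim=36 is a hypothetical per-frame feature size).
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True,
                           bidirectional=True)
        # Last layer predicts the distribution of opinion scores;
        # softmax makes the output a valid probability vector.
        self.head = nn.Sequential(nn.Linear(2 * hidden_dim, n_bins),
                                  nn.Softmax(dim=-1))

    def forward(self, frame_feats):
        # frame_feats: (batch, n_frames, feat_dim)
        _, (h_n, _) = self.rnn(frame_feats)
        # Concatenate the final hidden states of both directions.
        h = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.head(h)  # (batch, n_bins)

model = ScoreDistributionRNN()
feats = torch.randn(1, 240, 36)       # hypothetical 240-frame clip
dist = model(feats)                   # predicted score histogram
bins = torch.linspace(1.0, 5.0, 5)    # assumed 1-5 opinion-score scale
mos = (dist * bins).sum(dim=-1)       # expectation of the distribution ~ MOS

Training such a model would minimize a distance between the predicted and empirical score histograms (e.g., cross-entropy), and a MOS estimate falls out as the expectation of the predicted distribution, as in the last line above.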

Keywords


 