Convolutional Gating Network for Object Tracking

Document Type: Original Article


Faculty of Electrical Engineering, Damghan University, Damghan, Semnan, Iran


Object tracking across multiple cameras is a popular research topic in security and surveillance systems, especially when the targets are people. Occlusion, however, remains one of the main challenges for the tracking process. This paper proposes a cooperative, multiple-camera tracking method to overcome the occlusion problem. It presents a new model for combining convolutional neural networks (CNNs) that allows the proposed method to learn features with high discriminative power and geometrical independence. In the training phase, the CNNs are first pre-trained on each of the camera views, and a convolutional gating network (CGN) is simultaneously pre-trained to produce a weight for each CNN output. The CNNs are then transferred to the tracking task, where their pre-trained parameters are re-trained using data from the tracking phase. The weights obtained from the CGN are used to fuse the features learnt by the CNNs, and the resulting weighted combination of features is used to represent the objects. Finally, a particle filter is used to track the objects. The experimental results show the efficiency of the proposed method.
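
The fusion step described above, in which the CGN weights combine the per-view CNN features into one object representation, can be illustrated with a minimal sketch. The abstract does not state how the gating weights are normalised or how large the feature vectors are, so the softmax normalisation, the function names `softmax` and `fuse_features`, and the toy two-dimensional features below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Turn raw gating scores into positive weights that sum to 1.

    Assumption: the source only says the CGN produces one weight per CNN
    output; softmax is one common way to normalise such weights.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(view_features, gate_scores):
    """Weighted combination of per-camera feature vectors.

    view_features: one feature vector (list of floats) per camera view,
                   standing in for the output of that view's CNN.
    gate_scores:   one raw score per view, standing in for the CGN output.
    Returns the normalised weights and the fused feature vector.
    """
    if len(view_features) != len(gate_scores):
        raise ValueError("need exactly one gate score per camera view")
    weights = softmax(gate_scores)
    fused = [0.0] * len(view_features[0])
    for weight, features in zip(weights, view_features):
        for i, value in enumerate(features):
            fused[i] += weight * value
    return weights, fused


# Hypothetical example: two camera views with equal gating scores.
weights, fused = fuse_features([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal gating scores, both views contribute equally to the fused representation. Lowering the score of one view, for example an occluded one, shifts the fused features toward the other view, which is the intuition behind using gating weights to handle occlusion.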


