Semantic Segmentation of Aerial Imagery: A Novel Approach Leveraging Hierarchical Multi-scale Features and Channel-based Attention for Drone Applications

Document Type: Original Article

Authors

Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran

Abstract

Semantic segmentation of drone imagery is a challenging computer vision task, mainly because of the inherent complexities of aerial imagery. This paper presents a methodology for drone semantic segmentation and evaluates it on the ICG dataset. The proposed method combines hierarchical multi-scale feature extraction with an efficient channel-based attention Atrous Spatial Pyramid Pooling (ASPP) module to address the challenges of this domain. Its performance is compared with that of several state-of-the-art models. The experimental results show that the proposed method outperforms these existing approaches in terms of Dice, mIoU, and accuracy, achieving scores of 86.51%, 76.23%, and 91.74%, respectively. These results indicate its potential for accurate and efficient segmentation of aerial imagery and contribute to drone-based applications such as surveillance, object tracking, and environmental monitoring, where precise semantic segmentation is crucial.
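For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how Dice, mean IoU, and pixel accuracy are commonly computed from a class confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `segmentation_scores`, the per-class averaging, and the rule of skipping classes that never occur are assumptions made for illustration, and the abstract does not state which averaging convention was used.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_scores(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (mean Dice, mean IoU, pixel accuracy) for a confusion matrix.

    Assumption: classes absent from both ground truth and prediction are
    skipped when averaging; published evaluations differ on this point.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(cm[k][k] for k in range(n))

    dice, iou = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k pixels that were missed
        fp = sum(cm[r][k] for r in range(n)) - tp  # other pixels labelled as k
        if tp + fp + fn == 0:
            continue                               # class never occurs: skip it
        dice.append(2 * tp / (2 * tp + fp + fn))
        iou.append(tp / (tp + fp + fn))

    return sum(dice) / len(dice), sum(iou) / len(iou), correct / total


# Toy example with flattened label maps for three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
mean_dice, mean_iou, accuracy = segmentation_scores(
    confusion_matrix(y_true, y_pred, num_classes=3)
)
print(f"Dice={mean_dice:.4f}  mIoU={mean_iou:.4f}  accuracy={accuracy:.4f}")
```

For a single class, Dice is always at least as large as IoU because it counts the true positives twice, which is consistent with the reported Dice score (86.51%) exceeding the reported mIoU (76.23%).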



