Hybrid Attention Asynchronous Cascade Network for Salient Object Detection
Abstract
1. Introduction and Background
- A lightweight hybrid attention module (LHAM) is designed to improve feature recognition in salient object detection. Compared with a self-attention module, it leaves the quality of feature extraction almost unchanged while substantially reducing the amount of computation.
- A parallel dilated convolution (PDC) module is designed to adaptively extract multiscale information from the samples, so that it handles scale changes better.
- To effectively fuse the outputs of the cascaded LHAM and PDC modules, an improved bi-directional asynchronous propagation strategy is adopted. It captures contextual information at different scales, thereby improving detection performance.
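The parallel dilated convolution idea in the second contribution above can be illustrated with a minimal sketch. This is not the authors' PDC module: it assumes a 1D signal, a single 3-tap kernel shared by all branches, zero padding, and summation as the fusion step, all chosen only to keep the example short. The function names `dilated_conv1d` and `parallel_dilated_conv` are hypothetical.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-length 1D dilated convolution (cross-correlation) with zero padding."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def parallel_dilated_conv(signal, kernel, rates):
    """Run one branch per dilation rate and fuse the branches by summation.

    Summation is an illustrative assumption; a real module might concatenate
    the branches or weight them adaptively instead.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) for vals in zip(*branches)]


if __name__ == "__main__":
    # A unit impulse makes the sampling pattern of each branch visible.
    impulse = [0.0] * 15
    impulse[7] = 1.0
    print(parallel_dilated_conv(impulse, [1.0, 1.0, 1.0], rates=(1, 3, 5, 7)))
```

With a unit impulse as input, each branch responds at a distance equal to its dilation rate from the impulse, so the fused output shows how several rates together cover several scales with the same small kernel.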
2. Proposed Method
2.1. Network Framework
2.2. Parallel Dilated Convolution
2.3. Lightweight Hybrid Attention Module
2.4. Asynchronous Cascading Strategy
3. Experiment
3.1. Experimental Setup
3.2. Ablation Experiment
3.2.1. Effectiveness Experiment of PDC
3.2.2. Effectiveness Experiments of LHAM and BACS
3.3. Comparison with State-of-the-Art
3.3.1. Quantitative Comparison
3.3.2. Qualitative Comparison
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Cheng, X.; Li, E.; Fu, Z. Residual Attention Siamese RPN for Visual Tracking. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Nanjing, China, 16–18 October 2020.
- Mehta, D.; Skliar, A.; Yahia, H.B.; Borse, S.; Porikli, F.M. Simple and Efficient Architectures for Semantic Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 2627–2635.
- Zhou, Z.; Pei, X.; Li, X.; Wang, H.; Zheng, F.; He, Z. Saliency-associated object tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 9866–9875.
- Qin, X.; Zhang, Z.; Huang, C.; Dehghan, M.; Zaiane, O.R.; Jägersand, M. U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection. Pattern Recognit. 2020, 106, 107404.
- Yang, C.; Zhang, L. Saliency detection via graph-based manifold ranking. IEEE Trans. Multim. 2020, 22, 885–896.
- Borji, A.; Cheng, M.; Jiang, H.; Li, J. Salient object detection: A survey. Comput. Vis. Media 2014, 5, 117–150.
- Li, X.; Lu, H.; Zhang, L. Saliency detection via dense and sparse reconstruction. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2976–2983.
- Sun, L.; Chen, Z.; Wu, Q.J.; Zhao, H.; He, W.; Yan, X. AMPNet: Average- and Max-Pool Networks for Salient Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 4321–4333.
- Yun, Y.; Lin, W. SelfReformer: Self-Refined Network with Transformer for Salient Object Detection. arXiv 2022, arXiv:2205.11283.
- Zhang, P.; Wang, D.; Lu, H.; Wang, H.; Ruan, X. Amulet: Aggregating multi-level convolutional features for salient object detection. In Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 202–211.
- He, J.; Zhang, S.; Yang, M.; Shan, Y.; Huang, T. BDCN: Bi-directional cascade network for perceptual edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 100–113.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4 December 2017.
- Guo, M.H.; Liu, Z.N.; Mu, T.J.; Hu, S.M. Beyond self-attention: External attention using two linear layers for visual tasks. arXiv 2021, arXiv:2105.02358.
- Yang, H.; Chen, R.; Deng, D. Multiscale Balanced-Attention Interactive Network for Salient Object Detection. Mathematics 2022, 10, 512.
- Wang, L.; Lu, H.; Wang, Y.; Feng, M.; Wang, D.; Yin, B.; Ruan, X. Learning to detect salient objects with image-level supervision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 136–145.
- Yan, Q.; Xu, L.; Shi, J.; Jia, J. Hierarchical saliency detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 1155–1162.
- Li, Y.; Hou, X.; Koch, C.; Rehg, J.M.; Yuille, A.L. The secrets of salient object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 280–287.
- Li, G.; Yu, Y. Visual saliency based on multiscale deep features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 July 2015; pp. 5455–5463.
- Yang, C.; Zhang, L.; Lu, H.; Ruan, X.; Yang, M.H. Saliency detection via graph-based manifold ranking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3166–3173.
- Movahedi, V.; Elder, J.H. Design and perceptual validation of performance measures for salient object segmentation. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Wor, San Francisco, CA, USA, 13–18 June 2010; pp. 49–56.
- Achanta, R.; Hemami, S.; Estrada, F.; Susstrunk, S. Frequency-tuned salient region detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 1597–1604.
- Perazzi, F.; Krähenbühl, P.; Pritch, Y.; Hornung, A. Saliency filters: Contrast based filtering for salient region detection. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 733–740.
- Fan, D.P.; Cheng, M.M.; Liu, Y.; Li, T.; Borji, A. Structure-measure: A new way to evaluate foreground maps. In Proceedings of the IEEE International Conference on Computer Vision, Honolulu, HI, USA, 21–26 July 2017; pp. 4548–4557.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 27–30 June 2016; pp. 770–778.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Wang, L.; Lu, H.; Ruan, X.; Yang, M.H. Deep Networks for Saliency Detection via Local Estimation and Global Search. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3183–3192.
- Zhao, R.; Ouyang, W.; Li, H.; Wang, X. Saliency Detection by Multi-context Deep Learning. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1265–1274.
- Lee, G.; Tai, Y.W.; Kim, J. Deep saliency with encoded low level distance map and high level features. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 27–30 June 2016; pp. 660–668.
- Li, G.; Yu, Y. Deep contrast learning for salient object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 27–30 June 2016; pp. 478–487.
- Hou, Q.B.; Cheng, M.M.; Hu, X.W.; Borji, A.; Tu, Z.; Torr, P. Deeply Supervised Salient Object Detection with Short Connections. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 815–828.
- Li, J.; Pan, Z.; Liu, Q.; Cui, Y.; Sun, Y. Complementarity-Aware Attention Network for Salient Object Detection. IEEE Trans. Cybern. 2020, 52, 873–886.
- Luo, Z.; Mishra, A.; Achkar, A.; Cui, Y.; Sun, Y. Non-local deep features for salient object detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6609–6617.
- Zhang, P.; Wang, D.; Lu, H.; Wang, H.; Yin, B. Learning uncertain convolutional features for accurate saliency detection. In Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 212–221.
- Wang, L.; Wang, L.; Lu, H.; Zhang, P.; Ruan, X. Salient object detection with recurrent fully convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1734–1746.
- Qin, Y.; Lu, H.; Xu, Y.; Wang, H. Saliency detection via cellular automata. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 110–119.
- Islam, M.A.; Kalash, M. Revisiting salient object detection: Simultaneous detection, ranking, and subitizing of multiple salient objects. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7142–7150.
Model | F-Measure ↑ | MAE ↓ | S-Measure ↑ |
---|---|---|---|
ResNet | 0.709 | 0.077 | 0.777 |
ResNet + PDC | 0.739 | 0.075 | 0.801 |
ResNet + LHAM | 0.745 | 0.071 | 0.806 |
ResNet + BACS | 0.741 | 0.072 | 0.803 |
ResNet + PDC + LHAM | 0.782 | 0.058 | 0.828 |
ResNet + PDC + BACS | 0.759 | 0.065 | 0.814 |
ResNet + LHAM + BACS | 0.785 | 0.057 | 0.830 |
ResNet + PDC + LHAM + BACS | 0.800 | 0.053 | 0.840 |
k | F-Measure ↑ | MAE ↓ | S-Measure ↑ |
---|---|---|---|
2 | 0.435 | 0.282 | 0.548 |
3 | 0.442 | 0.279 | 0.551 |
4 | 0.448 | 0.270 | 0.558 |
5 | 0.447 | 0.270 | 0.556 |
Dilation rates | F-Measure ↑ | MAE ↓ | S-Measure ↑ |
---|---|---|---|
{1,1,1,1} | 0.448 | 0.270 | 0.558 |
{1,2,3,4} | 0.482 | 0.217 | 0.598 |
{1,2,4,5} | 0.502 | 0.212 | 0.611 |
{1,3,5,7} | 0.512 | 0.200 | 0.619 |
{1,4,5,6} | 0.506 | 0.217 | 0.610 |
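To make the rate sets in the table above more concrete, the short sketch below computes the span covered by each branch. It assumes that the bracketed sets are the per-branch dilation rates of the PDC module and that every branch uses a 3×3 kernel; the kernel size is not stated here, so treat it as an illustrative assumption. The helper `effective_kernel_size` is a hypothetical name, and it uses the standard relation k_eff = k + (k − 1)(d − 1) for a kernel of size k with dilation rate d.

```python
def effective_kernel_size(kernel_size, rate):
    """Span, in pixels, covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


if __name__ == "__main__":
    # Rate sets copied from the table; the 3x3 kernel size is an assumption.
    rate_sets = [(1, 1, 1, 1), (1, 2, 3, 4), (1, 2, 4, 5), (1, 3, 5, 7), (1, 4, 5, 6)]
    for rates in rate_sets:
        spans = [effective_kernel_size(3, r) for r in rates]
        print(rates, "->", spans)
```

Under these assumptions, the set {1,1,1,1} gives four branches that all cover the same 3-pixel span, whereas {1,3,5,7} spreads the branches over spans of 3, 7, 11, and 15 pixels.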
Model | DUTS-TE F ↑ | DUTS-TE M ↓ | DUTS-TE S ↑ | ECSSD F ↑ | ECSSD M ↓ | ECSSD S ↑ | PASCAL-S F ↑ | PASCAL-S M ↓ | PASCAL-S S ↑ | HKU-IS F ↑ | HKU-IS M ↓ | HKU-IS S ↑ | DUT-OMRON F ↑ | DUT-OMRON M ↓ | DUT-OMRON S ↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ours | 0.800 | 0.053 | 0.840 | 0.900 | 0.051 | 0.890 | 0.800 | 0.085 | 0.813 | 0.900 | 0.039 | 0.890 | 0.725 | 0.058 | 0.804 |
MBI [14] | 0.809 | 0.058 | 0.824 | 0.909 | 0.058 | 0.886 | 0.821 | 0.092 | 0.806 | 0.901 | 0.042 | 0.880 | 0.763 | 0.069 | 0.781 |
CANet [31] | 0.796 | 0.056 | 0.840 | 0.907 | 0.049 | 0.898 | 0.832 | 0.120 | 0.790 | 0.897 | 0.040 | 0.895 | 0.719 | 0.071 | 0.795 |
NLDF [32] | 0.813 | 0.065 | 0.816 | 0.905 | 0.063 | 0.875 | 0.822 | 0.098 | 0.805 | 0.902 | 0.048 | 0.878 | 0.753 | 0.080 | 0.771 |
Amulet [10] | 0.773 | 0.075 | 0.796 | 0.911 | 0.062 | 0.849 | 0.862 | 0.092 | 0.820 | 0.889 | 0.052 | 0.886 | 0.737 | 0.083 | 0.771 |
DCL [29] | 0.782 | 0.088 | 0.795 | 0.891 | 0.088 | 0.863 | 0.804 | 0.124 | 0.791 | 0.885 | 0.072 | 0.861 | 0.739 | 0.097 | 0.764 |
UCF [33] | 0.771 | 0.116 | 0.777 | 0.908 | 0.080 | 0.884 | 0.820 | 0.127 | 0.806 | 0.888 | 0.073 | 0.874 | 0.735 | 0.131 | 0.748 |
DSS [30] | 0.813 | 0.065 | 0.812 | 0.906 | 0.064 | 0.882 | 0.821 | 0.101 | 0.796 | 0.900 | 0.050 | 0.878 | 0.760 | 0.074 | 0.765 |
ELD [28] | 0.747 | 0.092 | 0.749 | 0.865 | 0.082 | 0.839 | 0.772 | 0.122 | 0.757 | 0.843 | 0.072 | 0.823 | 0.738 | 0.093 | 0.743 |
RFCN [34] | 0.784 | 0.091 | 0.791 | 0.898 | 0.097 | 0.860 | 0.827 | 0.118 | 0.793 | 0.895 | 0.079 | 0.859 | 0.747 | 0.095 | 0.774 |
BSCA [35] | 0.597 | 0.197 | 0.630 | 0.758 | 0.183 | 0.725 | 0.666 | 0.224 | 0.633 | 0.723 | 0.174 | 0.700 | 0.616 | 0.191 | 0.652 |
MDF [18] | 0.729 | 0.093 | 0.732 | 0.832 | 0.105 | 0.776 | 0.763 | 0.143 | 0.694 | 0.860 | 0.129 | 0.810 | 0.694 | 0.092 | 0.720 |
RSD [36] | 0.757 | 0.161 | 0.724 | 0.845 | 0.173 | 0.788 | 0.864 | 0.155 | 0.805 | 0.843 | 0.156 | 0.787 | 0.633 | 0.178 | 0.644 |
LEGS [26] | 0.655 | 0.138 | - | 0.827 | 0.118 | 0.787 | 0.756 | 0.157 | 0.682 | 0.770 | 0.118 | - | 0.669 | 0.133 | - |
MCDL [27] | 0.461 | 0.276 | 0.545 | 0.837 | 0.101 | 0.803 | 0.741 | 0.143 | 0.721 | 0.808 | 0.092 | 0.786 | 0.701 | 0.089 | 0.752 |
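Two of the metrics reported in the tables above, MAE and F-measure, can be sketched as follows. This is a minimal illustration under stated assumptions and not the evaluation code used for the paper: saliency maps are flattened to lists of values in [0, 1], the ground truth is binary, the prediction is binarized at a fixed threshold of 0.5, and β² = 0.3 as is conventional in salient object detection. The thresholding protocol actually used for the reported numbers is not given here. The function names are hypothetical, and the S-measure is omitted because it needs region-level and object-level structure terms that do not fit a short sketch.

```python
def mean_absolute_error(pred, gt):
    """Average absolute difference between a saliency map and its ground truth."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted harmonic mean of precision and recall after binarizing `pred`.

    The fixed threshold is an assumption made for this sketch.
    """
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 1)
    if true_pos == 0:
        return 0.0  # no overlap: precision and recall are both zero
    precision = true_pos / sum(binary)
    recall = true_pos / sum(gt)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    prediction = [0.9, 0.8, 0.2, 0.6]  # toy 4-pixel saliency map
    ground_truth = [1, 1, 0, 0]
    print(mean_absolute_error(prediction, ground_truth))
    print(f_measure(prediction, ground_truth))
```

Lower MAE is better and higher F-measure is better, which matches the ↓ and ↑ arrows in the table headers.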
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, H.; Chen, Y.; Chen, R.; Liu, S. Hybrid Attention Asynchronous Cascade Network for Salient Object Detection. Mathematics 2023, 11, 1389. https://doi.org/10.3390/math11061389