Multi-Scale Global Contrast CNN for Salient Object Detection
Abstract
1. Introduction
2. Related Work
2.1. Contrast Based Models
2.2. CNN Based Models
3. Multi-Scale Global Contrast CNN
3.1. Formulation
3.2. Global Contrast Learning
3.3. Multi-Scale Global Contrast Network
4. Experiments
4.1. Datasets
- ECSSD [37] is a challenging dataset of 1000 natural images with semantically meaningful but structurally complex content.
- HKU-IS [27] is composed of 4447 complex images, each of which contains many disconnected objects with diverse spatial distributions. The similar appearance of foreground and background makes it especially challenging.
- PASCAL-S [38] contains 850 images, together with eye-fixation records and roughly pixel-wise, non-binary salient object annotations.
- DUT-OMRON [39] consists of 5168 images with diverse variations and complex backgrounds, each of which has pixel-level ground-truth annotations.
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Comparison with the State-of-the-Art
4.5. Ablation Study
5. Conclusions and Future Work
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Liu, Z.; Shi, R.; Shen, L.; Xue, Y.; Ngan, K.; Zhang, Z. Unsupervised salient object segmentation based on kernel density estimation and two-phase graph cut. IEEE Trans. Multimed. 2012, 14, 1275–1289.
- Achanta, R.; Süsstrunk, S. Saliency detection for content-aware image resizing. In Proceedings of the 2009 IEEE International Conference on Image Processing, Cairo, Egypt, 7–10 November 2009; pp. 1005–1008.
- Karpathy, A.; Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 3128–3137.
- Fan, D.; Wang, W.; Cheng, M.; Shen, J. Shifting More Attention to Video Salient Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 8554–8564.
- Wang, W.; Shen, J.; Shao, L. Video Salient Object Detection via Fully Convolutional Networks. IEEE Trans. Image Process. 2018, 27, 38–49.
- Zhao, J.; Cao, Y.; Fan, D.; Cheng, M.; Li, X.; Zhang, L. Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 3927–3936.
- Liu, Z.; Zhang, W.; Zhao, P. A cross-modal adaptive gated fusion generative adversarial network for RGB-D salient object detection. Neurocomputing 2020, 387, 210–220.
- Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 1254–1259.
- Treisman, A.; Gelade, G. A feature-integration theory of attention. Cogn. Psychol. 1980, 12, 97–136.
- Perazzi, F.; Krähenbühl, P.; Pritch, Y.; Hornung, A. Saliency filters: Contrast based filtering for salient region detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2012, Providence, RI, USA, 16–21 June 2012; pp. 733–740.
- Achanta, R.; Hemami, S.; Estrada, F.; Süsstrunk, S. Frequency-tuned salient region detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009, Miami, FL, USA, 20–25 June 2009; pp. 1597–1604.
- Cheng, M.; Mitra, N.; Huang, X.; Torr, P.; Hu, S. Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 569–582.
- Sun, X.; Huang, Z.; Yin, H.; Shen, H. An integrated model for effective saliency prediction. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI 2017), San Francisco, CA, USA, 4–9 February 2017; pp. 274–281.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 24–27 June 2014; pp. 580–587.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Zhao, R.; Ouyang, W.; Li, H.; Wang, X. Saliency detection by multi-context deep learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 1265–1274.
- Li, G.; Yu, Y. Deep contrast learning for salient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 478–487.
- Zhang, P.; Wang, D.; Lu, H.; Wang, H.; Yin, B. Learning uncertain convolutional features for accurate saliency detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 212–221.
- Borji, A.; Cheng, M.; Jiang, H.; Li, J. Salient object detection: A benchmark. IEEE Trans. Image Process. 2015, 24, 5706–5722.
- Einhäuser, W.; König, P. Does luminance-contrast contribute to a saliency map for overt visual attention? Eur. J. Neurosci. 2003, 17, 1089–1097.
- Harel, J.; Koch, C.; Perona, P. Graph-based visual saliency. In Proceedings of the Advances in Neural Information Processing Systems 20 (NIPS 2007), Vancouver, BC, Canada, 3–6 December 2007; pp. 545–552.
- Klein, D.; Frintrop, S. Center-surround divergence of feature statistics for salient object detection. In Proceedings of the 2011 IEEE International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2214–2219.
- Liu, T.; Yuan, Z.; Sun, J.; Wang, J.; Zheng, N.; Tang, X.; Shum, H. Learning to detect a salient object. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 353–367.
- Jiang, H.; Wang, J.; Yuan, Z.; Wu, Y.; Zheng, N.; Li, S. Salient object detection: A discriminative regional feature integration approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2013, Portland, OR, USA, 23–28 June 2013; pp. 2083–2090.
- Cheng, M.; Warrell, J.; Lin, W.; Zheng, S.; Vineet, V.; Crook, N. Efficient salient region detection with soft image abstraction. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 1529–1536.
- Vig, E.; Dorr, M.; Cox, D. Large-scale optimization of hierarchical features for saliency prediction in natural images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 24–27 June 2014; pp. 2798–2805.
- Li, G.; Yu, Y. Visual saliency detection based on multiscale deep CNN features. IEEE Trans. Image Process. 2016, 25, 5012–5024.
- Wang, L.; Lu, H.; Ruan, X.; Yang, M. Deep networks for saliency detection via local estimation and global search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 3183–3192.
- He, S.; Lau, R.W.; Liu, W.; Huang, Z.; Yang, Q. SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection. Int. J. Comput. Vis. 2015, 115, 330–344.
- Ren, Q.; Hu, R. Multi-scale deep encoder-decoder network for salient object detection. Neurocomputing 2018, 316, 95–104.
- Li, X.; Yang, F.; Cheng, H.; Chen, J.; Guo, Y.; Chen, L. Multi-Scale Cascade Network for Salient Object Detection. In Proceedings of the 2017 ACM International Conference on Multimedia, Silicon Valley, CA, USA, 23–27 October 2017; pp. 439–447.
- Li, Z.; Lang, C.; Chen, Y.; Liew, J.; Feng, J. Deep Reasoning with Multi-scale Context for Salient Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Nguyen, T.; Liu, L. Salient object detection with semantic priors. arXiv 2017, arXiv:1705.08207.
- Xie, S.; Tu, Z. Holistically-nested edge detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 1395–1403.
- Lin, T.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
- Yan, Q.; Xu, L.; Shi, J.; Jia, J. Hierarchical saliency detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2013, Portland, OR, USA, 23–28 June 2013; pp. 1155–1162.
- Li, Y.; Hou, X.; Koch, C.; Rehg, J.M.; Yuille, A.L. The secrets of salient object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 24–27 June 2014; pp. 280–287.
- Yang, C.; Zhang, L.; Lu, H.; Ruan, X.; Yang, M. Saliency detection via graph-based manifold ranking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2013, Portland, OR, USA, 23–28 June 2013; pp. 3166–3173.
- Peng, H.; Li, B.; Ling, H.; Hu, W.; Xiong, W.; Maybank, S.J. Salient object detection via structured matrix decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 818–832.
- Margolin, R.; Zelnik-Manor, L.; Tal, A. How to Evaluate Foreground Maps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 24–27 June 2014; pp. 248–255.
- Fan, D.; Cheng, M.; Liu, Y.; Li, T.; Borji, A. Structure-Measure: A New Way to Evaluate Foreground Maps. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4558–4567.
- Fan, D.; Gong, C.; Cao, Y.; Ren, B.; Cheng, M.; Borji, A. Enhanced-alignment Measure for Binary Foreground Map Evaluation. In Proceedings of the IJCAI 2018, Stockholm, Sweden, 13–19 July 2018; pp. 698–704.
- Fu, K.; Zhao, Q.; Gu, I.Y.; Yang, J. Deepside: A general deep framework for salient object detection. Neurocomputing 2019, 356, 69–82.
- Su, J.; Li, J.; Zhang, Y.; Xia, C.; Tian, Y. Selectivity or Invariance: Boundary-aware Salient Object Detection. In Proceedings of the 2019 IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019.
- Zhao, T.; Wu, X. Pyramid Feature Attention Network for Saliency Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019; pp. 3085–3094.
- Zhang, P.; Lu, H.; Shen, C. Troy: Give Attention to Saliency and for Saliency. arXiv 2018, arXiv:1808.02373.
- Zhang, P.; Lu, H.; Shen, C. HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection. arXiv 2018, arXiv:1804.05142.
- Piao, Y.; Ji, W.; Li, J.; Zhang, M.; Lu, H. Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection. In Proceedings of the 2019 IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 7254–7263.
- Wei, J.; Wang, S.; Huang, Q. F3Net: Fusion, Feedback and Focus for Salient Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019, Long Beach, CA, USA, 16–20 June 2019.
- Paszke, A.; Gross, S.; Chintala, S.; Chanan, G. PyTorch: Tensors and Dynamic Neural Networks in Python with Strong GPU Acceleration. 2017. Available online: https://github.com/pytorch/pytorch (accessed on 6 May 2020).
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Lee, G.; Tai, Y.; Kim, J. Deep saliency with encoded low level distance map and high level features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 660–668.
- Zhu, W.; Liang, S.; Wei, Y.; Sun, J. Saliency optimization from robust background detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 24–27 June 2014; pp. 2814–2821.
- Tu, W.; He, S.; Yang, Q.; Chien, S. Real-time salient object detection with a minimum spanning tree. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 2334–2342.
- Zhang, J.; Sclaroff, S.; Lin, Z.; Shen, X.; Price, B.; Mech, R. Minimum barrier salient object detection at 80 fps. In Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1404–1412.
- Wang, W.; Lai, Q.; Fu, H.; Shen, J.; Ling, H. Salient Object Detection in the Deep Learning Era: An In-Depth Survey. arXiv 2019, arXiv:1904.09146.
- Fan, D.; Cheng, M.; Liu, J.; Gao, S.; Hou, Q.; Borji, A. Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground. In Proceedings of the 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, 8–14 September 2018; pp. 196–212.
- Wu, Z.; Su, L.; Huang, Q. Stacked Cross Refinement Network for Edge-Aware Salient Object Detection. In Proceedings of the 2019 IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 7264–7273.
Scale | Layer 1 | Layer 2 | Layer 3 | Layer 4 | Layer 5 |
---|---|---|---|---|---|
Scale-1 | (320, 320)/(3×3) | (320, 256)/(1×1) | (256, 256)/(3×3) | (256, 1)/(1×1) | - |
Scale-2 | (384, 384)/(3×3) | (384, 256)/(1×1) | (256, 256)/(3×3) | (256, 1)/(1×1) | - |
Scale-3 | (512, 512)/(3×3) | (512, 256)/(1×1) | (256, 256)/(3×3) | (256, 1)/(1×1) | - |
Scale-4 | (768, 768)/(3×3) | (768, 512)/(3×3) | (512, 256)/(1×1) | (256, 256)/(3×3) | (256, 1)/(1×1) |
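If each cell of the table above is read as (input channels, output channels)/(kernel size), the size of every scale branch can be tallied in a few lines of Python. This is only a sketch under that reading: the cell interpretation, the one-bias-per-output-channel convention, and the names `BRANCHES`, `conv_params`, `branch_params` and `PARAM_COUNTS` are illustrative assumptions, not details confirmed by the paper.

```python
# Layer specs transcribed from the table as (in_channels, out_channels, kernel_size).
# Assumption: a cell "(a, b)/(k×k)" denotes a k×k convolution from a to b channels.
BRANCHES = {
    "Scale-1": [(320, 320, 3), (320, 256, 1), (256, 256, 3), (256, 1, 1)],
    "Scale-2": [(384, 384, 3), (384, 256, 1), (256, 256, 3), (256, 1, 1)],
    "Scale-3": [(512, 512, 3), (512, 256, 1), (256, 256, 3), (256, 1, 1)],
    "Scale-4": [(768, 768, 3), (768, 512, 3), (512, 256, 1), (256, 256, 3), (256, 1, 1)],
}


def conv_params(in_ch, out_ch, k, bias=True):
    """Learnable parameters of one k×k convolution (the bias term is an assumption)."""
    return in_ch * out_ch * k * k + (out_ch if bias else 0)


def branch_params(layers):
    """Total learnable parameters of one scale branch."""
    return sum(conv_params(i, o, k) for i, o, k in layers)


def channels_chain(layers):
    """True if every layer's input width equals the previous layer's output width."""
    return all(a[1] == b[0] for a, b in zip(layers, layers[1:]))


PARAM_COUNTS = {name: branch_params(layers) for name, layers in BRANCHES.items()}

if __name__ == "__main__":
    for name, count in PARAM_COUNTS.items():
        print(f"{name}: {count:,} parameters, consistent={channels_chain(BRANCHES[name])}")
```

Under this reading, every branch ends in a single-channel 1×1 layer, which is consistent with each scale producing its own one-channel saliency prediction.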
Datasets | ECSSD [37] | | | | | HKU-IS [27] | | | | | PASCAL-S [38] | | | | | DUT-OMRON [39] | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Methods | Max↑ | MAE↓ | ↑ | ↑ | ↑ | Max↑ | MAE↓ | ↑ | ↑ | ↑ | Max↑ | MAE↓ | ↑ | ↑ | ↑ | Max↑ | MAE↓ | ↑ | ↑ | ↑ |
DRFI [24] | 0.782 | 0.170 | 0.462 | 0.720 | 0.763 | 0.777 | 0.145 | 0.528 | 0.727 | 0.832 | 0.694 | 0.201 | 0.469 | 0.648 | 0.745 | 0.664 | 0.150 | 0.326 | 0.697 | 0.793 |
RBD [54] | 0.716 | 0.171 | 0.430 | 0.695 | 0.697 | 0.723 | 0.142 | 0.488 | 0.683 | 0.736 | 0.659 | 0.197 | 0.429 | 0.617 | 0.670 | 0.630 | 0.144 | 0.397 | 0.668 | 0.721 |
MB+ [56] | 0.739 | 0.171 | 0.428 | 0.607 | 0.691 | 0.728 | 0.150 | 0.491 | 0.534 | 0.643 | 0.680 | 0.193 | 0.453 | 0.714 | 0.814 | 0.624 | 0.168 | 0.386 | 0.579 | 0.693 |
MST [55] | 0.731 | 0.149 | 0.445 | 0.601 | 0.686 | 0.722 | 0.168 | 0.485 | 0.693 | 0.753 | 0.670 | 0.187 | 0.458 | 0.636 | 0.715 | 0.600 | 0.149 | 0.313 | 0.653 | 0.688 |
SMD [40] | 0.760 | 0.173 | 0.453 | 0.716 | 0.745 | 0.743 | 0.156 | 0.502 | 0.702 | 0.796 | 0.690 | 0.201 | 0.463 | 0.645 | 0.737 | 0.624 | 0.166 | 0.385 | 0.686 | 0.716 |
LEGS [28] | 0.827 | 0.118 | 0.805 | 0.786 | 0.872 | 0.767 | 0.192 | 0.736 | 0.743 | 0.931 | 0.759 | 0.155 | – | 0.728 | – | 0.670 | 0.204 | 0.631 | 0.713 | – |
MCDL [16] | 0.837 | 0.110 | 0.816 | 0.803 | 0.889 | 0.808 | 0.091 | 0.768 | 0.786 | 0.927 | 0.743 | 0.146 | 0.787 | 0.721 | 0.706 | 0.702 | 0.088 | 0.670 | 0.752 | 0.670 |
DCL [17] | 0.887 | 0.072 | 0.838 | 0.868 | 0.916 | 0.880 | 0.058 | 0.841 | 0.877 | 0.931 | 0.808 | 0.110 | 0.733 | 0.785 | 0.849 | 0.717 | 0.094 | 0.639 | 0.771 | 0.826 |
MDF [27] | 0.834 | 0.105 | 0.810 | 0.776 | 0.886 | 0.814 | 0.112 | 0.754 | 0.810 | 0.872 | 0.768 | 0.150 | 0.704 | 0.696 | 0.794 | 0.694 | 0.092 | 0.643 | 0.721 | 0.820 |
ELD [53] | 0.866 | 0.079 | 0.786 | 0.838 | 0.910 | 0.839 | 0.073 | 0.780 | 0.821 | 0.910 | 0.771 | 0.126 | 0.669 | 0.761 | 0.818 | 0.700 | 0.092 | 0.596 | 0.751 | 0.797 |
Scale-1 | 0.783 | 0.165 | 0.744 | 0.732 | 0.765 | 0.782 | 0.143 | 0.673 | 0.729 | 0.834 | 0.693 | 0.184 | 0.682 | 0.707 | 0.752 | 0.672 | 0.157 | 0.494 | 0.701 | 0.762 |
Scale-2 | 0.807 | 0.143 | 0.766 | 0.763 | 0.803 | 0.807 | 0.124 | 0.715 | 0.752 | 0.858 | 0.734 | 0.153 | 0.704 | 0.725 | 0.762 | 0.688 | 0.135 | 0.558 | 0.713 | 0.794 |
Scale-3 | 0.823 | 0.128 | 0.783 | 0.791 | 0.824 | 0.825 | 0.106 | 0.753 | 0.790 | 0.892 | 0.757 | 0.138 | 0.718 | 0.743 | 0.778 | 0.693 | 0.117 | 0.583 | 0.721 | 0.807 |
Scale-4 | 0.835 | 0.104 | 0.811 | 0.803 | 0.865 | 0.834 | 0.091 | 0.776 | 0.819 | 0.913 | 0.765 | 0.127 | 0.730 | 0.754 | 0.794 | 0.699 | 0.105 | 0.623 | 0.739 | 0.815 |
MGCC(Proposed) | 0.891 | 0.066 | 0.847 | 0.887 | 0.931 | 0.878 | 0.057 | 0.838 | 0.886 | 0.955 | 0.808 | 0.100 | 0.796 | 0.793 | 0.857 | 0.726 | 0.074 | 0.698 | 0.793 | 0.838 |
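For readers who want to reproduce numbers of the kind reported in the first two columns per dataset, the following is a minimal sketch of mean absolute error and maximum F-measure. It assumes that "MAE" is the mean absolute error and "Max" the maximum F-measure over binarization thresholds, that β² = 0.3 (a common convention in the saliency literature), and that maps are given as flat lists of values in [0, 1]. The function names and the threshold sweep are hypothetical choices for illustration; the exact evaluation protocol of the paper is not stated in this section.

```python
def mae(saliency, gt):
    """Mean absolute error between a saliency map and its ground truth (flat lists)."""
    return sum(abs(s - g) for s, g in zip(saliency, gt)) / len(gt)


def f_measure(saliency, gt, threshold, beta2=0.3):
    """F-measure after binarizing the saliency map at `threshold`.

    beta2 = 0.3 is an assumed, commonly used weighting that favors precision.
    """
    pred = [1 if s >= threshold else 0 for s in saliency]
    tp = sum(p * g for p, g in zip(pred, gt))
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / sum(pred)
    recall = tp / sum(gt)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


def max_f_measure(saliency, gt, steps=256):
    """Largest F-measure over `steps` evenly spaced thresholds in [0, 1]."""
    return max(f_measure(saliency, gt, t / (steps - 1)) for t in range(steps))


if __name__ == "__main__":
    # Toy 4-pixel example: two salient pixels, two background pixels.
    saliency = [0.9, 0.8, 0.2, 0.1]
    gt = [1, 1, 0, 0]
    print("MAE:", mae(saliency, gt))
    print("max F-measure:", max_f_measure(saliency, gt))
```

Lower MAE and higher maximum F-measure indicate better agreement with the ground truth, matching the ↓ and ↑ markers in the table header.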
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Feng, W.; Li, X.; Gao, G.; Chen, X.; Liu, Q. Multi-Scale Global Contrast CNN for Salient Object Detection. Sensors 2020, 20, 2656. https://doi.org/10.3390/s20092656