Superpixel Image Classification with Graph Convolutional Neural Networks Based on Learnable Positional Embedding
Abstract
1. Introduction
2. Related Work
2.1. Superpixel Segmentation Algorithms
2.2. Advances in Graph Convolution Definition
2.3. Superpixel Image Classification Using Graph Convolutional Neural Networks
2.4. Graph Positional Embedding Methods
3. Proposed Method
3.1. Notation
3.2. Standard Message-Passing Process
3.3. Learnable Positional Embedding Methods
3.4. ArcFace Loss Function
4. Experiments
4.1. Datasets
4.2. Experiment Details
4.3. Results Details
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Preprocess | Model | F-MNIST (28 × 28) | CIFAR10 (32 × 32) | SVHN (32 × 32) | CAR196 (224 × 224) | CUB200 (64 × 64) | CUB200 (128 × 128) | EuroSAT (64 × 64) |
|---|---|---|---|---|---|---|---|---|
| SLIC-75 | GCN-1Layer | 87.542 | 46.497 | 72.682 | 51.498 | 58.477 | 58.070 | 66.487 |
| | GCN-2Layers | 89.794 | 57.219 | 73.454 | 53.842 | 62.813 | 63.466 | 67.921 |
| | GAT-1Head | 89.688 | 57.693 | 79.211 | 58.903 | 65.813 | 67.544 | 71.608 |
| | GAT-2Heads | 90.874 | 61.103 | 80.726 | 59.582 | 67.560 | 69.620 | 74.863 |
| | ChebNet (K = 25) | 79.167 | 33.297 | 52.045 | 27.663 | 47.093 | 45.647 | 54.236 |
| | IMGCN-LPE | | | | | | | |
| | GCN-1Layer | 89.183 | 50.563 | 74.133 | 55.712 | 61.916 | 64.866 | 65.736 |
| | GCN-2Layers | 90.561 | 60.464 | 75.486 | 57.808 | 65.848 | 67.212 | 66.388 |
| | GAT-1Head | 90.618 | 66.480 | 80.511 | 61.438 | 68.223 | 69.838 | 74.758 |
| | GAT-2Heads | 92.013 | 68.632 | 81.986 | 62.729 | 70.192 | 72.101 | 76.373 |
| | ChebNet (K = 25) | 91.21 | 73.086 | 80.714 | 60.861 | 70.484 | 69.637 | 75.482 |
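
The per-dataset difference between the rows listed under IMGCN-LPE and the rows above them can be read off the table with simple subtraction. The following is a minimal sketch, assuming the values are accuracies in percent and that the upper five rows are the same models without IMGCN-LPE; `PLAIN`, `LPE`, `gains`, and `largest_gain` are hypothetical names introduced only for illustration.

```python
# Values copied from the table above; assumed to be accuracies in percent.
DATASETS = ["F-MNIST", "CIFAR10", "SVHN", "CAR196",
            "CUB200-64", "CUB200-128", "EuroSAT"]

# Rows listed before the IMGCN-LPE group (assumed: models without IMGCN-LPE).
PLAIN = {
    "GCN-1Layer": [87.542, 46.497, 72.682, 51.498, 58.477, 58.070, 66.487],
    "GCN-2Layers": [89.794, 57.219, 73.454, 53.842, 62.813, 63.466, 67.921],
    "GAT-1Head": [89.688, 57.693, 79.211, 58.903, 65.813, 67.544, 71.608],
    "GAT-2Heads": [90.874, 61.103, 80.726, 59.582, 67.560, 69.620, 74.863],
    "ChebNet (K = 25)": [79.167, 33.297, 52.045, 27.663, 47.093, 45.647, 54.236],
}

# Rows listed under the IMGCN-LPE group.
LPE = {
    "GCN-1Layer": [89.183, 50.563, 74.133, 55.712, 61.916, 64.866, 65.736],
    "GCN-2Layers": [90.561, 60.464, 75.486, 57.808, 65.848, 67.212, 66.388],
    "GAT-1Head": [90.618, 66.480, 80.511, 61.438, 68.223, 69.838, 74.758],
    "GAT-2Heads": [92.013, 68.632, 81.986, 62.729, 70.192, 72.101, 76.373],
    "ChebNet (K = 25)": [91.21, 73.086, 80.714, 60.861, 70.484, 69.637, 75.482],
}


def gains(model):
    """Return {dataset: IMGCN-LPE value minus upper-row value} for one model."""
    return {
        d: round(l - p, 3)
        for d, p, l in zip(DATASETS, PLAIN[model], LPE[model])
    }


def largest_gain(model):
    """Return the dataset with the largest difference and that difference."""
    g = gains(model)
    best = max(g, key=g.get)
    return best, g[best]


if __name__ == "__main__":
    for model in PLAIN:
        dataset, value = largest_gain(model)
        print(f"{model}: largest difference on {dataset} ({value:+.3f})")
```

A negative entry means the upper row is higher for that dataset, so the sign should be checked before reading a value as an improvement.
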

| Method | Model | F-MNIST (28 × 28) | CIFAR10 (32 × 32) | SVHN (32 × 32) | CAR196 (224 × 224) | CUB200 (64 × 64) | CUB200 (128 × 128) | EuroSAT (64 × 64) |
|---|---|---|---|---|---|---|---|---|
| SLIC-75 | RAG-GAT | 83.07 | 45.93 | 80.72 | - | - | - | - |
| | DISCO-GCN | 90.02 | 70.01 | - | - | - | - | - |
| | HGNN | - | 70.61 | - | - | - | - | 75.22 |
| | GCN-2Layers | - | - | - | 53.84 | 62.81 | 63.49 | - |
| | GAT-2Heads | - | - | - | 59.58 | 67.56 | 69.62 | - |
| | IMGCN-LPE | | | | | | | |
| | Baseline models | 92.01 | 73.09 | 81.99 | 62.73 | 70.48 | 72.10 | 76.37 |

| Preprocess | Model | F-MNIST (28 × 28) | CIFAR10 (32 × 32) | SVHN (32 × 32) | CAR196 (224 × 224) | CUB200 (64 × 64) | CUB200 (128 × 128) | EuroSAT (64 × 64) |
|---|---|---|---|---|---|---|---|---|
| SLIC-75 | IMGCN-LPE | | | | | | | |
| | GCN-1Layer | 89.18 → 89.32 | 50.56 → 55.85 | 74.13 → 75.38 | 55.71 → 55.79 | 61.92 → 63.01 | 65.74 → 65.93 | 65.74 → 66.04 |
| | GCN-2Layers | 90.56 → 90.98 | 60.46 → 60.84 | 75.47 → 76.61 | 57.61 → 57.88 | 65.85 → 65.92 | 67.21 → 67.74 | 66.39 → 66.77 |
| | GAT-1Head | 90.62 → 90.90 | 66.46 → 67.28 | 80.51 → 81.17 | 61.44 → 61.76 | 68.22 → 69.06 | 69.90 → 72.45 | 74.76 → 74.96 |
| | GAT-2Heads | 92.01 → 92.25 | 68.63 → 69.71 | 81.99 → 82.52 | 62.73 → 63.46 | 70.19 → 70.62 | 72.10 → 73.85 | 76.37 → 76.96 |
| | ChebNet (K = 25) | 91.21 → 91.63 | 73.09 → 73.22 | 80.71 → 82.02 | 60.86 → 61.20 | 70.48 → 72.99 | 69.64 → 71.26 | 75.48 → 76.61 |

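
Each cell in the table above holds a pair of values written as "left → right". As a rough sketch, the change within each cell can be extracted by splitting on the arrow; the helper names `parse_pair` and `differences` are hypothetical, and the code makes no assumption about what the two values represent beyond their being numbers.

```python
def parse_pair(cell):
    """Split a 'left → right' cell into two floats."""
    left, right = cell.split("→")
    return float(left), float(right)


def differences(row):
    """Right-minus-left difference for every cell in a row, to two decimals."""
    return [round(right - left, 2) for left, right in map(parse_pair, row)]


# ChebNet (K = 25) row copied from the table above.
CHEBNET_ROW = [
    "91.21 → 91.63", "73.09 → 73.22", "80.71 → 82.02", "60.86 → 61.20",
    "70.48 → 72.99", "69.64 → 71.26", "75.48 → 76.61",
]

if __name__ == "__main__":
    diffs = differences(CHEBNET_ROW)
    print(diffs)
    print(f"mean difference: {sum(diffs) / len(diffs):.2f}")
```

The same two helpers apply unchanged to any other row of the table, since every data cell uses the same "left → right" layout.
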
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bae, J.-H.; Yu, G.-H.; Lee, J.-H.; Vu, D.T.; Anh, L.H.; Kim, H.-G.; Kim, J.-Y. Superpixel Image Classification with Graph Convolutional Neural Networks Based on Learnable Positional Embedding. Appl. Sci. 2022, 12, 9176. https://doi.org/10.3390/app12189176