Advances in Computer Vision and Semantic Segmentation, 2nd Edition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 10 March 2025 | Viewed by 4730

Special Issue Editors


Guest Editor
Department of Computer Science, College of Science, Swansea University, Singleton Park, Swansea SA2 8PP, UK
Interests: visual analytics; machine learning; digital geometry processing; pattern recognition and vision; multi-dimensional data analysis; information retrieval and indexing

Guest Editor
Department of Computer Science, Durham University, Durham DH1 3LE, UK
Interests: computer graphics; geometric modelling and processing; collaborative virtual environments; visual aesthetics; educational techno

Guest Editor
Department of Computer Science, College of Science, Swansea University, Singleton Park, Swansea SA2 8PP, UK
Interests: computer vision; image processing; machine learning; medical image analysis

Guest Editor
School of Computer Science, University of Birmingham, Edgbaston Birmingham B15 2TT, UK
Interests: computer vision; machine learning; medical imaging

Special Issue Information

Dear Colleagues,

Semantic segmentation is a core problem for many applications, such as image manipulation, facial segmentation, healthcare, security and surveillance, medical imaging and diagnosis, aerial and satellite image surveying and processing, city 3D modeling, and scene understanding. It is also an important building block in more complex systems, including autonomous cars, drones, and human-centric robots.

Recent advances in deep learning techniques (e.g., CNNs, FCNs, U-Net, graph LSTMs, spatial pyramids, attention mechanisms, and transformers) have driven substantial improvements in semantic segmentation, raising both speed and accuracy and inspiring progress in related areas such as instance and panoptic segmentation.

This Special Issue welcomes research papers on semantic segmentation, on the related tasks of instance and panoptic segmentation, and on advanced computer vision applications that build on semantic segmentation. Topics of interest include multimodal segmentation (e.g., referring image segmentation), salient object detection and segmentation, 3D semantic segmentation (point clouds and meshes), video semantic segmentation, and many others. Papers focusing on new data (e.g., hyperspectral data, MRI, CT, point clouds, and meshes) and on new deep architectures, techniques, and learning strategies (e.g., weakly supervised/unsupervised semantic segmentation, zero/few-shot learning, domain adaptation, real-time processing, contextual information, transfer learning, reinforcement learning, and the critical issue of acquiring training data) are all welcome.

Dr. Gary KL Tam
Dr. Frederick W. B. Li
Prof. Dr. Xianghua Xie
Dr. Jianbo Jiao
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • semantic segmentation
  • instance segmentation
  • panoptic segmentation
  • multimodal segmentation
  • referring image segmentation
  • salient object detection and segmentation
  • 3D semantic segmentation
  • video semantic segmentation
  • weakly supervised semantic segmentation
  • unsupervised semantic segmentation
  • advanced machine learning segmentation techniques
  • medical semantic segmentation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

18 pages, 1473 KiB  
Article
Semantic Segmentation Network for Unstructured Rural Roads Based on Improved SPPM and Fused Multiscale Features
by Xinyu Cao, Yongqiang Tian, Zhixin Yao, Yunjie Zhao and Taihong Zhang
Appl. Sci. 2024, 14(19), 8739; https://doi.org/10.3390/app14198739 - 27 Sep 2024
Viewed by 771
Abstract
Semantic segmentation of rural roads presents unique challenges due to the unstructured nature of these environments, including irregular road boundaries, mixed surfaces, and diverse obstacles. In this study, we propose an enhanced PP-LiteSeg model specifically designed for rural road segmentation, incorporating a novel Strip Pooling Simple Pyramid Module (SP-SPPM) and a Bottleneck Unified Attention Fusion Module (B-UAFM). These modules improve the model’s ability to capture both global and local features, addressing the complexity of rural roads. To validate the effectiveness of our model, we constructed the Rural Roads Dataset (RRD), which includes a diverse set of rural scenes from different regions and environmental conditions. Experimental results demonstrate that our model significantly outperforms baseline models such as UNet, BiSeNetv1, and BiSeNetv2, achieving higher accuracy in terms of mean intersection over union (MIoU), Kappa coefficient, and Dice coefficient. Our approach enhances segmentation performance in complex rural road environments, providing practical applications for autonomous navigation, infrastructure maintenance, and smart agriculture.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)
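
The abstract above reports mean intersection over union (MIoU), the Kappa coefficient, and the Dice coefficient. As a reading aid, here is a minimal Python sketch of how these three metrics are commonly computed from a pixel-level confusion matrix. It is not taken from the article: the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, labels are assumed to be flattened lists of integer class ids, and skipping classes that are absent from both maps is an assumption about averaging.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (mIoU, mean Dice, Cohen's kappa) for a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[c]) for c in range(n)]                       # ground-truth pixels per class
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # predicted pixels per class

    ious, dices = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        if tp + fp + fn == 0:
            continue  # assumption: a class absent from both maps is skipped
        ious.append(tp / (tp + fp + fn))
        dices.append(2 * tp / (2 * tp + fp + fn))

    observed = sum(cm[c][c] for c in range(n)) / total               # pixel accuracy
    expected = sum(row_sums[c] * col_sums[c] for c in range(n)) / total ** 2
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return sum(ious) / len(ious), sum(dices) / len(dices), kappa


# Toy example: four pixels, two classes (0 = background, 1 = road).
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
miou, mean_dice, kappa = segmentation_metrics(cm)
```

Per class, Dice and IoU are related by Dice = 2·IoU / (1 + IoU), so the two always rank a single class the same way. Kappa additionally discounts the agreement expected by chance.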

26 pages, 1895 KiB  
Article
Enhanced Ischemic Stroke Lesion Segmentation in MRI Using Attention U-Net with Generalized Dice Focal Loss
by Beatriz P. Garcia-Salgado, Jose A. Almaraz-Damian, Oscar Cervantes-Chavarria, Volodymyr Ponomaryov, Rogelio Reyes-Reyes, Clara Cruz-Ramos and Sergiy Sadovnychiy
Appl. Sci. 2024, 14(18), 8183; https://doi.org/10.3390/app14188183 - 11 Sep 2024
Viewed by 996
Abstract
Ischemic stroke lesion segmentation in MRI images presents significant challenges, particularly due to class imbalance between foreground and background pixels. Several approaches have been developed to achieve higher F1-Scores in stroke lesion segmentation under this challenge. These strategies include convolutional neural networks (CNN) and models with a large number of parameters, which can only be trained on specialized computational architectures that are explicitly oriented to data processing. This paper proposes a lightweight model based on the U-Net architecture that incorporates an attention module and the Generalized Dice Focal loss function to enhance the segmentation accuracy in the class imbalance environment, characteristic of stroke lesions in MRI images. This study also analyzes the segmentation performance according to the pixel size of stroke lesions, giving insights into the loss function behavior using the public ISLES 2015 and ISLES 2022 MRI datasets. The proposed model can effectively segment small stroke lesions with F1-Scores over 0.7, particularly in FLAIR, DWI, and T2 sequences. Furthermore, the model shows reasonable convergence with its 7.9 million parameters at 200 epochs, making it suitable for practical implementation on mid and high-end general-purpose graphics processing units.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)
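
The abstract above names a Generalized Dice Focal loss without giving its formula. The following is a minimal pure-Python sketch of one common way to combine the two terms, assuming the widely used definitions: a generalized Dice loss with class weights inversely proportional to the squared class volume, plus a focal loss with focusing parameter `gamma`. The function names, the mixing weight `lam`, and the default values are hypothetical, and the article's exact formulation may differ.

```python
import math


def generalized_dice_loss(probs, targets, eps=1e-7):
    """probs, targets: one list per class, each holding one value per pixel."""
    numerator = 0.0
    denominator = 0.0
    for p_c, t_c in zip(probs, targets):
        weight = 1.0 / (sum(t_c) ** 2 + eps)  # rare classes receive larger weights
        numerator += weight * sum(p * t for p, t in zip(p_c, t_c))
        denominator += weight * sum(p + t for p, t in zip(p_c, t_c))
    return 1.0 - 2.0 * numerator / (denominator + eps)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean over pixels of -(1 - p_t)^gamma * log(p_t), with p_t the true-class probability."""
    n_pixels = len(probs[0])
    total = 0.0
    for i in range(n_pixels):
        p_t = sum(p_c[i] * t_c[i] for p_c, t_c in zip(probs, targets))
        modulator = max(0.0, 1.0 - p_t) ** gamma  # down-weights easy, well-classified pixels
        total += -modulator * math.log(p_t + eps)
    return total / n_pixels


def generalized_dice_focal_loss(probs, targets, lam=1.0, gamma=2.0):
    """Sum of the two terms; lam is a hypothetical mixing weight."""
    return generalized_dice_loss(probs, targets) + lam * focal_loss(probs, targets, gamma)


# Toy example: four pixels, class 0 = background, class 1 = lesion (one-hot targets).
targets = [[1, 1, 1, 0], [0, 0, 0, 1]]
confident = [[0.9, 0.9, 0.9, 0.1], [0.1, 0.1, 0.1, 0.9]]
uncertain = [[0.6, 0.6, 0.6, 0.6], [0.4, 0.4, 0.4, 0.4]]
loss_confident = generalized_dice_focal_loss(confident, targets)
loss_uncertain = generalized_dice_focal_loss(uncertain, targets)
```

In this sketch the Dice term addresses class imbalance at the region level, while the focal term concentrates the gradient on hard pixels. That division of labor is the usual motivation for combining them on small foreground structures.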

17 pages, 2933 KiB  
Article
Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection
by Qinsheng Du, Yingxu Bian, Jianyu Wu, Shiyan Zhang and Jian Zhao
Appl. Sci. 2024, 14(17), 7440; https://doi.org/10.3390/app14177440 - 23 Aug 2024
Viewed by 655
Abstract
The salient object detection (SOD) task aims to automatically detect the most prominent areas observed by the human eye in an image. Since RGB images and depth images contain different information, how to effectively integrate cross-modal features in the RGB-D SOD task remains a major challenge. Therefore, this paper proposes a cross-modal adaptive interaction network (CMANet) for the RGB-D salient object detection task, which consists of a cross-modal feature integration module (CMF) and an adaptive feature fusion module (AFFM). These modules are designed to integrate and enhance multi-scale features from both modalities, improve the effect of integrating cross-modal complementary information of RGB and depth images, enhance feature information, and generate richer and more representative feature maps. Extensive experiments were conducted on four RGB-D datasets to verify the effectiveness of CMANet. Compared with 17 RGB-D SOD methods, our model accurately detects salient regions in images and achieves state-of-the-art performance across four evaluation metrics.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)

14 pages, 5108 KiB  
Article
Soldering Defect Segmentation Method for PCB on Improved UNet
by Zhongke Li and Xiaofang Liu
Appl. Sci. 2024, 14(16), 7370; https://doi.org/10.3390/app14167370 - 21 Aug 2024
Viewed by 660
Abstract
Despite being indispensable devices in the electronic manufacturing industry, printed circuit boards (PCBs) may develop various soldering defects in the production process, which seriously affect the product’s quality. Due to the substantial background interference in the soldering defect image and the small and irregular shapes of the defects, the accurate segmentation of soldering defects is a challenging task. To address this issue, a method to improve the encoder–decoder network structure of UNet is proposed for PCB soldering defect segmentation. To enhance the feature extraction capabilities of the encoder and focus more on deeper features, VGG16 is employed as the network encoder. Moreover, a hybrid attention module called the DHAM, which combines channel attention and dynamic spatial attention, is proposed to reduce the background interference in images and direct the model’s focus more toward defect areas. Additionally, based on GSConv, the RGSM is introduced and applied in the decoder to enhance the model’s feature fusion capabilities and improve the segmentation accuracy. The experiments demonstrate that the proposed method can effectively improve the segmentation accuracy for PCB soldering defects, achieving an mIoU of 81.74% and mPA of 87.33%, while maintaining a relatively low number of model parameters at only 22.13 M and achieving an FPS of 30.16, thus meeting the real-time detection speed requirements.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)

14 pages, 2038 KiB  
Article
An Efficient Semantic Segmentation Method for Remote-Sensing Imagery Using Improved Coordinate Attention
by Yan Huo, Shuang Gang, Liang Dong and Chao Guan
Appl. Sci. 2024, 14(10), 4075; https://doi.org/10.3390/app14104075 - 10 May 2024
Cited by 1 | Viewed by 995
Abstract
Semantic segmentation stands as a prominent domain within remote sensing that is currently garnering significant attention. This paper introduces a semantic segmentation model based on the TransUNet architecture with improved coordinate attention for remote-sensing imagery. It is composed of an encoding stage and a decoding stage. Notably, an improved coordinate attention module is employed that integrates two pooling methods to generate weights. Subsequently, the feature map undergoes reweighting to accentuate foreground information and suppress background information. To address the issue of time complexity, this paper introduces an improvement to the transformer model by sparsifying the attention matrix. This reduces the computing expense of calculating attention, making the model more efficient. Additionally, the paper uses a combined loss function that is designed to enhance the training performance of the model. Experiments conducted on three public datasets demonstrate the efficiency of the proposed method, and the results indicate that it delivers outstanding performance on semantic segmentation tasks for remote-sensing images.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)