Search Results (38)

Search Parameters:
Keywords = grab cut segmentation

26 pages, 5591 KB  
Article
Design and Development of a Precision Spraying Control System for Orchards Based on Machine Vision Detection
by Yu Luo, Xiaoli He, Hanwen Shi, Simon X. Yang, Lepeng Song and Ping Li
Sensors 2025, 25(12), 3799; https://doi.org/10.3390/s25123799 - 18 Jun 2025
Cited by 1 | Viewed by 593
Abstract
Precision spraying technology has attracted increasing attention in orchard production management. Traditional chemical pesticide application relies on subjective judgment, leading to fluctuations in pesticide usage, low application efficiency, and environmental pollution. This study proposes a machine vision-based precision spraying control system for orchards. First, a canopy leaf wall area calculation method was developed based on a multi-iteration GrabCut image segmentation algorithm, and a spray volume calculation model was established. Next, a fuzzy adaptive control algorithm based on an extended state observer (ESO) was proposed, along with the design of flow and pressure controllers. Finally, the precision spraying system’s performance tests were conducted in laboratory and field environments. The indoor experiments consisted of three test sets, each involving six citrus trees, totaling eighteen trees arranged in two staggered rows, with an interrow spacing of 3.4 m and an intra-row spacing of 2.5 m; the nozzle was positioned approximately 1.3 m from the canopy surface. Similarly, the field experiments included three test sets, each selecting eight citrus trees, totaling twenty-four trees, with an average height of approximately 1.5 m and a row spacing of 3 m, representing a typical orchard environment for performance validation. Experimental results demonstrated that the system reduced spray volume by 59.73% compared to continuous spraying, by 30.24% compared to PID control, and by 19.19% compared to traditional fuzzy control; meanwhile, the pesticide utilization efficiency increased by 61.42%, 26.8%, and 19.54%, respectively. The findings of this study provide a novel technical approach to improving agricultural production efficiency, enhancing fruit quality, reducing pesticide use, and promoting environmental protection, demonstrating significant application value. Full article
(This article belongs to the Section Sensing and Imaging)
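The canopy "leaf wall area" step lends itself to a short sketch. Below is a minimal illustration using OpenCV's stock GrabCut with a detector-supplied bounding rectangle; the iteration count, pixel-to-area scale, dose coefficient, and file name are hypothetical placeholders, not the paper's calibrated model:

```python
import cv2
import numpy as np

def leaf_wall_area(img_bgr, rect, iters=5, m2_per_pixel=1e-5):
    """Segment the canopy with GrabCut and return its area in square metres."""
    mask = np.zeros(img_bgr.shape[:2], np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, rect, bgd, fgd, iters, cv2.GC_INIT_WITH_RECT)
    canopy = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))  # definite + probable fg
    return canopy.sum() * m2_per_pixel

def spray_volume(area_m2, dose_l_per_m2=0.05):
    """Map leaf wall area to a target spray volume (litres); linear model assumed."""
    return area_m2 * dose_l_per_m2

img = cv2.imread("tree.jpg")  # hypothetical input frame
area = leaf_wall_area(img, rect=(50, 30, 400, 500))
print(f"canopy area ~{area:.2f} m^2 -> spray {spray_volume(area):.2f} L")
```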

15 pages, 3326 KB  
Article
Comparison of Image Preprocessing Strategies for Convolutional Neural Network-Based Growth Stage Classification of Butterhead Lettuce in Industrial Plant Factories
by Jung-Sun Gloria Kim, Soo Chung, Myungjin Ko, Jihoon Song and Soo Hyun Shin
Appl. Sci. 2025, 15(11), 6278; https://doi.org/10.3390/app15116278 - 3 Jun 2025
Viewed by 810
Abstract
The increasing need for scalable and efficient crop monitoring systems in industrial plant factories calls for image-based deep learning models that are both accurate and robust to domain variability. This study investigates the feasibility of CNN-based growth stage classification of butterhead lettuce (Lactuca sativa L.) using two data types: raw images and images processed through GrabCut–Watershed segmentation. A ResNet50-based transfer learning model was trained and evaluated on each dataset, and cross-domain performance was assessed to understand generalization capability. Models trained and tested within the same domain achieved high accuracy (Model 1: 99.65%; Model 2: 97.75%). However, cross-domain evaluations revealed asymmetric performance degradation—Model 1-CDE (trained on raw images, tested on preprocessed images) achieved 82.77% accuracy, while Model 2-CDE (trained on preprocessed images, tested on raw images) dropped to 34.15%. Although GrabCut–Watershed offered clearer visual inputs, it limited the model’s ability to generalize due to reduced contextual richness and oversimplification. In terms of inference efficiency, Model 2 recorded the fastest model-only inference time (0.037 s/image), but this excluded the segmentation step. In contrast, Model 1 achieved 0.055 s/image without any additional preprocessing, making it more viable for real-time deployment. Notably, Model 1-CDE combined the fastest inference speed (0.040 s/image) with stable cross-domain performance, while Model 2-CDE was both the slowest (0.053 s/image) and least accurate. Grad-CAM visualizations further confirmed that raw image-trained models consistently attended to meaningful plant structures, whereas segmentation-trained models often failed to localize correctly in cross-domain tests. These findings demonstrate that training with raw images yields more robust, generalizable, and deployable models. The study highlights the importance of domain consistency and preprocessing trade-offs in vision-based agricultural systems and lays the groundwork for lightweight, real-time AI applications in smart farming. Full article
(This article belongs to the Special Issue Applications of Image Processing Technology in Agriculture)
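As a rough illustration of the GrabCut–Watershed preprocessing the study compares against, the sketch below chains OpenCV's GrabCut with a distance-transform-seeded watershed; the exact chaining and thresholds used in the paper are assumptions here:

```python
import cv2
import numpy as np

def grabcut_watershed(img_bgr, rect):
    """Coarse GrabCut foreground, refined with a marker-based watershed."""
    mask = np.zeros(img_bgr.shape[:2], np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg = np.where(np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)), 255, 0).astype(np.uint8)

    # Watershed markers: sure foreground from the distance transform,
    # sure background from a dilation of the GrabCut foreground.
    dist = cv2.distanceTransform(fg, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
    sure_fg = sure_fg.astype(np.uint8)
    sure_bg = cv2.dilate(fg, np.ones((9, 9), np.uint8))
    unknown = cv2.subtract(sure_bg, sure_fg)
    _, markers = cv2.connectedComponents(sure_fg)
    markers += 1                    # reserve label 1 for background
    markers[unknown == 255] = 0     # let watershed decide the uncertain band
    markers = cv2.watershed(img_bgr, markers)

    out = img_bgr.copy()
    out[markers <= 1] = 0  # suppress background and boundary pixels
    return out
```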

20 pages, 9041 KB  
Article
D2-SPDM: Faster R-CNN-Based Defect Detection and Surface Pixel Defect Mapping with Label Enhancement in Steel Manufacturing Processes
by Taewook Wi, Minyeol Yang, Suyeon Park and Jongpil Jeong
Appl. Sci. 2024, 14(21), 9836; https://doi.org/10.3390/app14219836 - 28 Oct 2024
Cited by 1 | Viewed by 2762
Abstract
The steel manufacturing process is inherently continuous, meaning that if defects are not effectively detected in the initial stages, they may propagate through subsequent stages, resulting in high costs for corrections in the final product. Therefore, detecting surface defects and obtaining segmentation information is critical in the steel manufacturing industry to ensure product quality and enhance production efficiency. Specifically, segmentation information is essential for accurately understanding the shape and extent of defects, providing the necessary details for subsequent processes to address these defects. However, the time-consuming and costly process of generating segmentation annotations poses a significant barrier to practical industrial applications. This paper proposes a cost-efficient segmentation labeling framework that combines deep learning-based anomaly detection and label enhancement to address these challenges in the steel manufacturing process. Using ResNet-50, defects are classified, and faster region convolutional neural networks (faster R-CNNs) are employed to identify defect types and generate bounding boxes indicating the defect locations. Subsequently, recursive learning is performed using the GrabCut algorithm and the DeepLabv3+ model based on the generated bounding boxes, significantly reducing annotation costs by generating segmentation labels. The proposed framework effectively detects defects and accurately defines them, even in randomly collected images from the steel manufacturing process, contributing to both quality control and cost reduction. This study presents a novel approach for improving the quality of the steel manufacturing process and is expected to enhance overall efficiency in the steel manufacturing industry. Full article
(This article belongs to the Special Issue Deep Learning in Object Detection)
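The label-enhancement idea, turning detector bounding boxes into coarse segmentation labels, can be sketched with stock OpenCV GrabCut (the recursive DeepLabv3+ refinement stage described in the paper is omitted):

```python
import cv2
import numpy as np

def boxes_to_pseudo_labels(img_bgr, boxes, iters=5):
    """Turn detector boxes (x, y, w, h) into a coarse instance label map:
    0 = background, 1..N = defect instances (uint8 caps this at 255)."""
    label = np.zeros(img_bgr.shape[:2], np.uint8)
    for idx, (x, y, w, h) in enumerate(boxes, start=1):
        mask = np.zeros(img_bgr.shape[:2], np.uint8)
        bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
        cv2.grabCut(img_bgr, mask, (x, y, w, h), bgd, fgd, iters,
                    cv2.GC_INIT_WITH_RECT)
        label[np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))] = idx
    return label
```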

17 pages, 6135 KB  
Article
Research on Improved Image Segmentation Algorithm Based on GrabCut
by Shangzhen Pang, Tzer Hwai Gilbert Thio, Fei Lu Siaw, Mingju Chen and Yule Xia
Electronics 2024, 13(20), 4068; https://doi.org/10.3390/electronics13204068 - 16 Oct 2024
Cited by 5 | Viewed by 2120
Abstract
The classic interactive image segmentation algorithm GrabCut achieves segmentation through iterative optimization. However, GrabCut requires multiple iterations, resulting in slower performance. Moreover, relying solely on a rectangular bounding box can sometimes lead to inaccuracies, especially when dealing with complex shapes or intricate object boundaries. To address these issues in GrabCut, an improvement is introduced by incorporating appearance overlap terms to optimize the segmentation energy function, thereby achieving optimal segmentation results in a single iteration. This enhancement significantly reduces computational costs while improving the overall segmentation speed without compromising accuracy. Additionally, users can directly provide seed points on the image to more accurately indicate foreground and background regions, rather than relying solely on a bounding box. This interactive approach not only enhances the algorithm’s ability to accurately segment complex objects but also simplifies the user experience. We evaluate the experimental results through qualitative and quantitative analysis. In qualitative analysis, improvements in segmentation accuracy are visibly demonstrated through segmented images and residual segmentation results. In quantitative analysis, the improved algorithm outperforms the GrabCut and min_cut algorithms in processing speed. When dealing with scenes where complex objects or foreground objects are very similar to the background, the improved algorithm displays more stable segmentation results. Full article
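Seed-point interaction of the kind described here maps naturally onto OpenCV's mask-initialized GrabCut, as in the sketch below; note that stock OpenCV does not implement the paper's appearance-overlap energy term or single-iteration solver, so this shows only the interaction style:

```python
import cv2
import numpy as np

def grabcut_with_seeds(img_bgr, rect, fg_points, bg_points, iters=1):
    """GrabCut initialized with a box plus hard foreground/background seeds."""
    mask = np.full(img_bgr.shape[:2], cv2.GC_PR_BGD, np.uint8)
    x, y, w, h = rect
    mask[y:y + h, x:x + w] = cv2.GC_PR_FGD       # probable fg inside the box
    for px, py in fg_points:
        cv2.circle(mask, (px, py), 5, cv2.GC_FGD, -1)  # hard foreground seeds
    for px, py in bg_points:
        cv2.circle(mask, (px, py), 5, cv2.GC_BGD, -1)  # hard background seeds
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, None, bgd, fgd, iters, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```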

19 pages, 3615 KB  
Article
EnNet: Enhanced Interactive Information Network with Zero-Order Optimization
by Yingzhao Shao, Yanxin Chen, Pengfei Yang and Fei Cheng
Sensors 2024, 24(19), 6361; https://doi.org/10.3390/s24196361 - 30 Sep 2024
Viewed by 968
Abstract
Interactive image segmentation greatly accelerates the generation of high-quality annotated image datasets, which are the pillars of the applications of deep learning. However, these methods suffer from the insignificance of interaction information and excessively high optimization costs, resulting in unexpected segmentation outcomes and increased computational burden. To address these issues, this paper focuses on interactive information mining from the network architecture and optimization procedure. In terms of network architecture, the issues above arise from two sources: the weak representation of interactive regions in each layer, and interaction information that is progressively weakened by the network hierarchy. Therefore, the paper proposes a network called EnNet. The network addresses these two issues by employing attention mechanisms to integrate user interaction information across the entire image and by incorporating interaction information twice in a coarse-to-fine design. In terms of optimization, this paper proposes a method of using zero-order optimization during the first four iterations of training. This approach can reduce computational overhead with only a minimal reduction in accuracy. The experimental results on the GrabCut, Berkeley, DAVIS, and SBD datasets validate the effectiveness of the proposed method, with our approach achieving an average NOC@90 that surpasses RITM by 0.35. Full article
(This article belongs to the Section Sensing and Imaging)
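For intuition, a generic two-point zero-order gradient estimator in toy NumPy form; EnNet applies such estimates only during the first four training iterations, and this sketch is not the paper's implementation:

```python
import numpy as np

def zo_gradient(loss_fn, params, sigma=1e-3, n_samples=8):
    """Two-point zero-order gradient estimate: probe the loss along random
    Gaussian directions instead of backpropagating."""
    grad = np.zeros_like(params)
    for _ in range(n_samples):
        u = np.random.randn(*params.shape)
        delta = loss_fn(params + sigma * u) - loss_fn(params - sigma * u)
        grad += delta / (2 * sigma) * u
    return grad / n_samples

# Toy usage: minimise a quadratic without analytic gradients.
loss = lambda p: float(np.sum((p - 3.0) ** 2))
p = np.zeros(4)
for _ in range(200):
    p -= 0.05 * zo_gradient(loss, p)
print(p)  # approaches [3, 3, 3, 3]
```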

15 pages, 16224 KB  
Article
Lightweight Machine Learning Method for Real-Time Espresso Analysis
by Jintak Choi, Seungeun Lee, Kyungtae Kang and Hyojoong Suh
Electronics 2024, 13(4), 800; https://doi.org/10.3390/electronics13040800 - 19 Feb 2024
Cited by 2 | Viewed by 2827
Abstract
Coffee crema plays a crucial role in assessing the quality of espresso. In recent years, in response to rising labor costs, an aging population, remote security/authentication needs, civic awareness, and the growing preference for non-face-to-face interactions, robot cafes have emerged. While some people seek sentiment and premium coffee, many others desire quick and affordable options. To align with these trends, lightweight artificial intelligence algorithms are needed for easy and quick decision making, as well as for monitoring the extraction process in these automated cafes. However, the application of these technologies to actual coffee machines has been limited. In this study, we propose an innovative real-time coffee crema control system that integrates lightweight machine learning algorithms. We employ the GrabCut algorithm to segment the crema region from the rest of the image and use a clustering algorithm to determine the optimal brewing conditions for each cup of espresso based on the characteristics of the extracted crema. Our results demonstrate that our approach can accurately analyze coffee crema in real time. This research proposes a promising direction by leveraging computer vision and machine learning technologies to enhance the efficiency and consistency of coffee brewing. Such an approach enables the prediction of component replacement timing in coffee machines, such as the replacement of water filters, and provides administrators with proactive, before-service maintenance. This could lead to the development of fully automated artificial intelligence coffee-making systems in the future. Full article
(This article belongs to the Special Issue Software Analysis, Quality, and Security)
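A compressed sketch of the crema pipeline: GrabCut isolates the crema region and a clustering step groups shots by simple colour/area features. The feature set, cluster count, crop rectangle, and file names below are illustrative assumptions, not the paper's choices:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def crema_features(img_bgr, rect):
    """Segment the crema with GrabCut and summarise it as a feature vector."""
    mask = np.zeros(img_bgr.shape[:2], np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, rect, bgd, fgd, 3, cv2.GC_INIT_WITH_RECT)
    crema = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))
    area_ratio = crema.mean()                 # fraction of frame that is crema
    mean_bgr = img_bgr[crema].mean(axis=0)    # average crema colour
    return np.concatenate(([area_ratio], mean_bgr))

# Cluster many shots into quality groups (hypothetical image set).
feats = np.stack([crema_features(cv2.imread(f"shot_{i}.jpg"), (80, 80, 300, 300))
                  for i in range(20)])
groups = KMeans(n_clusters=3, n_init=10).fit_predict(feats)
```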

22 pages, 6549 KB  
Article
Intelligent Segmentation and Change Detection of Dams Based on UAV Remote Sensing Images
by Haimeng Zhao, Xiaojian Yin, Anran Li, Huimin Zhang, Danqing Pan, Jinjin Pan, Jianfang Zhu, Mingchun Wang, Shanlin Sun and Qiang Wang
Remote Sens. 2023, 15(23), 5526; https://doi.org/10.3390/rs15235526 - 27 Nov 2023
Cited by 3 | Viewed by 1941
Abstract
Guilin is situated in the southern part of China with abundant rainfall. There are 137 reservoirs, which are widely used for irrigation, flood control, water supply and power generation. However, there has been a lack of systematic and full-coverage remote sensing monitoring of reservoir dams for a long time. According to the latest public literature, high-resolution unmanned aerial vehicle (UAV) remote sensing has not been used to detect changes on the reservoir dams of Guilin. In this paper, an intelligent segmentation change detection method is proposed to complete the detection of dam change based on multitemporal high-resolution UAV remote sensing data. Firstly, an enhanced GrabCut that fuses the linear spectral clustering (LSC) superpixel mapping matrix and the Sobel edge operator is proposed to extract the features of reservoir dams. The edge operator is introduced into GrabCut to redefine the new energy function’s smooth item, which makes the segmentation results of enhanced GrabCut more robust and accurate. Then, through image registration, the multitemporal dam extraction results are unified to the same coordinate system to complete the difference operation, and finally the dam change results are obtained. The experimental results of two representative reservoir dams in Guilin show that the proposed method can achieve a very high accuracy of change detection, which is an important reference for related research. Full article
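The registration-and-difference step can be sketched with standard OpenCV tools; the enhanced GrabCut itself (LSC superpixel fusion and the Sobel-modified smoothness term) is not reproduced here, and ORB-plus-homography stands in for whatever registration the authors used:

```python
import cv2
import numpy as np

def register(img_t0, img_t1):
    """Align the t1 image to t0 with ORB features and a RANSAC homography."""
    g0 = cv2.cvtColor(img_t0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img_t1, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(2000)
    k0, d0 = orb.detectAndCompute(g0, None)
    k1, d1 = orb.detectAndCompute(g1, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d0, d1)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k0[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return cv2.warpPerspective(img_t1, H, img_t0.shape[1::-1])

def change_map(dam_mask_t0, dam_mask_t1_registered):
    """Pixelwise difference of two binary dam masks = changed pixels."""
    return cv2.absdiff(dam_mask_t0, dam_mask_t1_registered)
```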

15 pages, 5820 KB  
Article
Visual Ranging Based on Object Detection Bounding Box Optimization
by Zhou Shi, Zhongguo Li, Sai Che, Miaowei Gao and Hongchuan Tang
Appl. Sci. 2023, 13(19), 10578; https://doi.org/10.3390/app131910578 - 22 Sep 2023
Cited by 1 | Viewed by 2320
Abstract
Faster and more accurate ranging can be achieved by combining deep learning-based object detection with conventional visual ranging. However, changes in scene, uneven lighting, fuzzy object boundaries and other factors may cause the detection bounding box not to fit the object. The pixel spacing between the detection bounding box and the object can cause ranging errors. To reduce this pixel spacing, increase the degree of fit between the detection bounding box and the object, and improve ranging accuracy, an object detection bounding box optimization method is proposed. Two evaluation indicators, WOV and HOV, are also proposed to evaluate the results of bounding box optimization. The experimental results show that the pixel width of the bounding box is optimized by 1.19~19.24% and the pixel height by 0~12.14%. At the same time, ranging experiments demonstrate that the optimization of the bounding box improves ranging accuracy. In addition, few practical monocular ranging techniques can determine the distance to an object of unknown size. Therefore, a similar-triangle ranging technique based on height difference is proposed to measure the distance to objects of unknown size. A ranging experiment carried out with the optimized detection bounding box shows that the relative ranging error within 6 m is between 0.7% and 2.47%, allowing for precise distance measurement. Full article
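A textbook similar-triangle range estimate from a bounding box, assuming a camera at a known height above flat ground; this illustrates the class of geometry involved, not the paper's exact height-difference derivation:

```python
def ground_plane_range(y_bottom_px, f_px, cy_px, cam_height_m):
    """Distance to the object's ground contact point.

    Similar triangles: (y_bottom - cy) / f = cam_height / Z
    => Z = f * cam_height / (y_bottom - cy)
    """
    dy = y_bottom_px - cy_px
    if dy <= 0:
        raise ValueError("contact point must lie below the principal point")
    return f_px * cam_height_m / dy

# e.g. focal length 1400 px, principal point row 540, camera 1.2 m high,
# bottom edge of the optimized box at row 820 (all values hypothetical):
print(ground_plane_range(820, 1400, 540, 1.2))  # ~6.0 m
```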

17 pages, 2892 KB  
Article
An H-GrabCut Image Segmentation Algorithm for Indoor Pedestrian Background Removal
by Xuchao Huang, Shigang Wang, Xueshan Gao, Dingji Luo, Weiye Xu, Huiqing Pang and Ming Zhou
Sensors 2023, 23(18), 7937; https://doi.org/10.3390/s23187937 - 16 Sep 2023
Cited by 2 | Viewed by 2430
Abstract
In the context of predicting pedestrian trajectories for indoor mobile robots, it is crucial to accurately measure the distance between indoor pedestrians and robots. This study addresses this requirement by extracting pedestrians as regions of interest and mitigating issues related to inaccurate depth camera distance measurements and illumination conditions. To tackle these challenges, we focus on an improved version of the H-GrabCut image segmentation algorithm, which segments indoor pedestrians in four steps. First, we leverage the YOLO-V5 object recognition algorithm to construct detection nodes. Next, we propose an enhanced BIL-MSRCR algorithm to enhance the edge details of pedestrians. Finally, we optimize the clustering features of the GrabCut algorithm by incorporating two-dimensional entropy, UV component distance, and LBP texture feature values. The experimental results demonstrate that our algorithm achieves a segmentation accuracy of 97.13% on both the INRIA dataset and real-world tests, outperforming alternative methods in terms of sensitivity, missegmentation rate, and intersection-over-union metrics. These experiments confirm the feasibility and practicality of our approach. The findings will be utilized in the preliminary processing of indoor mobile robot pedestrian trajectory prediction and enable path planning based on the predicted results. Full article
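One of the named clustering features, the LBP texture code, is simple enough to sketch directly; this is plain 8-neighbour LBP, since the abstract does not specify the exact variant used:

```python
import numpy as np

def lbp_8neighbour(gray):
    """Basic 8-neighbour local binary pattern codes for a grayscale image."""
    g = gray.astype(np.int16)
    c = g[1:-1, 1:-1]  # centre pixels (borders dropped)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= np.where(nb >= c, 1 << bit, 0).astype(np.uint8)
    return code
```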

17 pages, 6568 KB  
Article
Vehicle Target Recognition Method Based on Visible and Infrared Image Fusion Using Bayesian Inference
by Jie Wu and Xiaoqian Zhang
Appl. Sci. 2023, 13(14), 8334; https://doi.org/10.3390/app13148334 - 19 Jul 2023
Cited by 2 | Viewed by 1712
Abstract
The accuracy of single-mode optical imaging systems for vehicle target recognition is limited by external ambient illumination or imaging equipment resolution. In this paper, a vehicle target recognition method based on visible and infrared image fusion using Bayesian inference is proposed. Based on saliency area detection combining the Spectral Residual (SR) and EdgeBox algorithms, the target area is marked on the visible light image, and the maximum between-class variance method and the GrabCut algorithm are used to segment vehicle targets in the marked images. A morphological opening filter is employed to extract vehicle target features from the infrared image, and the fusion result of the visible light and infrared images is obtained by introducing the Intersection Over Union (IOU). The IOU serves as the parameter of Bayesian inference: its class and attribute are defined and substituted into the established naive Bayesian classification model, and the class probability is calculated to determine whether the vehicle target is recognized. Experiments under different test conditions were carried out, with the following results: image target recognition accuracy reached 77% when the vehicle target was not occluded and 74% when the vehicle target was partially occluded. The results verify that the proposed method can recognize vehicle targets in different scenarios. Full article
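The fusion-and-inference step pairs an IOU computation with a naive Bayes classifier, roughly as below; the training data here is synthetic and for shape only:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# IOU between the visible-light and infrared detections feeds a naive Bayes
# classifier; the training samples below are synthetic placeholders.
X_train = np.array([[0.82], [0.75], [0.10], [0.05]])  # IOU features
y_train = np.array([1, 1, 0, 0])                       # 1 = vehicle present
clf = GaussianNB().fit(X_train, y_train)
print(clf.predict_proba([[iou((10, 10, 110, 60), (20, 15, 120, 70))]]))
```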

16 pages, 5012 KB  
Article
In-Water Fish Body-Length Measurement System Based on Stereo Vision
by Minggang Zhou, Pingfeng Shen, Hao Zhu and Yang Shen
Sensors 2023, 23(14), 6325; https://doi.org/10.3390/s23146325 - 12 Jul 2023
Cited by 9 | Viewed by 3923
Abstract
Fish body length is an essential monitoring parameter in aquaculture engineering. However, traditional manual measurement methods have been found to be inefficient and harmful to fish. To overcome these shortcomings, this paper proposes a non-contact measurement method that utilizes binocular stereo vision to accurately measure the body length of fish underwater. Binocular cameras capture RGB and depth images to acquire the RGB-D data of the fish, and then the RGB images are selectively segmented using the contrast-adaptive Grab Cut algorithm. To determine the state of the fish, a skeleton extraction algorithm is employed to handle fish with curved bodies. The errors caused by the refraction of water are then analyzed and corrected. Finally, the best measurement points from the RGB image are extracted and converted into 3D spatial coordinates to calculate the length of the fish, for which measurement software was developed. The experimental results indicate that the mean relative percentage error for fish-length measurement is 0.9%. This paper presents a method that meets the accuracy requirements for measurement in aquaculture while also being convenient for implementation and application. Full article
(This article belongs to the Section Smart Agriculture)
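Once the head and tail measurement points are found, length recovery is straightforward pinhole back-projection, sketched below; the refraction correction is reduced here to a crude scale factor, whereas the paper analyzes and corrects it properly:

```python
import numpy as np

def pixel_to_3d(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a pixel with its depth into camera coordinates (pinhole)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

def fish_length(head_px, tail_px, depth, intrinsics, refraction_scale=1.0):
    """Euclidean distance between the two measurement points; a flat-port
    refraction correction can be folded into refraction_scale (placeholder)."""
    fx, fy, cx, cy = intrinsics
    p1 = pixel_to_3d(*head_px, depth[head_px[1], head_px[0]], fx, fy, cx, cy)
    p2 = pixel_to_3d(*tail_px, depth[tail_px[1], tail_px[0]], fx, fy, cx, cy)
    return float(np.linalg.norm(p2 - p1)) * refraction_scale
```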

18 pages, 5401 KB  
Article
Green Sweet Pepper Fruit and Peduncle Detection Using Mask R-CNN in Greenhouses
by Jesús Dassaef López-Barrios, Jesús Arturo Escobedo Cabello, Alfonso Gómez-Espinosa and Luis-Enrique Montoya-Cavero
Appl. Sci. 2023, 13(10), 6296; https://doi.org/10.3390/app13106296 - 21 May 2023
Cited by 26 | Viewed by 4945
Abstract
In this paper, a mask region-based convolutional neural network (Mask R-CNN) is used to improve the performance of machine vision in the challenging task of detecting peduncles and fruits of green sweet peppers (Capsicum annuum L.) in greenhouses. One of the most complicated stages of the sweet pepper harvesting process is to achieve a precise cut of the peduncle or stem because this type of specialty crop cannot be grabbed and pulled by the fruit since the integrity and value of the product are compromised. Therefore, accurate peduncle detection becomes vital for the autonomous harvesting of sweet peppers. ResNet-101 combined with the feature pyramid network (FPN) architecture (ResNet-101 + FPN) is adopted as the backbone network for feature extraction and object representation enhancement at multiple scales. Mask images of fruits and peduncles are generated, focused on green sweet pepper, which is the most complex color variety due to its resemblance to the background. In addition to bounding boxes, Mask R-CNN provides binary masks as a result of instance segmentation, which would help improve the localization process in 3D space, the next phase of the autonomous harvesting process of sweet peppers, since it isolates the pixels belonging to the object and demarcates its boundaries. The prediction results of 1148 fruits on 100 test images showed a precision rate of 84.53%. The prediction results of 265 peduncles showed a precision rate of 71.78%. The mean average precision rate with an intersection over union at 50 percent (mAP@IoU=50) for model-wide instance segmentation was 72.64%. The average detection time for sweet pepper fruit and peduncle using high-resolution images was 1.18 s. The experimental results show that the proposed implementation manages to segment the peduncle and fruit of the green sweet pepper in real-time in an unmodified production environment under occlusion, overlap, and light variation conditions with effectiveness not previously reported for simultaneous 2D detection models of peduncles and fruits of green sweet pepper. Full article
(This article belongs to the Section Robotics and Automation)
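For orientation, a minimal torchvision inference sketch of the same model family; note that torchvision ships a ResNet-50+FPN backbone with generic COCO weights, whereas the paper uses ResNet-101+FPN trained on pepper and peduncle masks, and the image path is hypothetical:

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

# Off-the-shelf COCO weights stand in for the paper's custom-trained model.
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
img = convert_image_dtype(read_image("greenhouse.jpg"), torch.float)
with torch.no_grad():
    out = model([img])[0]
keep = out["scores"] > 0.5
boxes, masks = out["boxes"][keep], out["masks"][keep]  # masks: N x 1 x H x W
print(f"{keep.sum().item()} instances above threshold")
```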

41 pages, 10917 KB  
Review
Review of GrabCut in Image Processing
by Zhaobin Wang, Yongke Lv, Runliang Wu and Yaonan Zhang
Mathematics 2023, 11(8), 1965; https://doi.org/10.3390/math11081965 - 21 Apr 2023
Cited by 26 | Viewed by 6603
Abstract
As an image-segmentation method based on graph theory, GrabCut has attracted increasing research attention owing to its simple operation and excellent segmentation quality. In order to clarify the research status of GrabCut, we begin with the original GrabCut model, review the new or important improved algorithms based on GrabCut in recent years, and classify them in terms of pre-processing based on superpixels, saliency maps, energy function modification, non-interactive improvement and other improved algorithms. The application status of GrabCut in various fields is also reviewed. We also experiment with some classical improved algorithms, including GrabCut, LazySnapping, OneCut, Saliency Cuts, DenseCut and Deep GrabCut, and objectively analyze the experimental results using five evaluation indicators to verify the performance of GrabCut. Finally, we point out some existing problems and propose directions for future work. Full article
(This article belongs to the Special Issue Mathematical Methods for Pattern Recognition)
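Comparisons like those in the review typically reduce to a handful of region-overlap indicators; below is a small scoring helper in that spirit, with the caveat that the review's exact five indicators may differ:

```python
import numpy as np

def segmentation_scores(pred, gt):
    """Common region-based indicators for comparing segmentation variants."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    iou = tp / max(tp + fp + fn, 1)
    error_rate = (fp + fn) / pred.size
    return {"precision": precision, "recall": recall,
            "f1": f1, "iou": iou, "error_rate": error_rate}
```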

19 pages, 6331 KB  
Article
Object Detection Based on the GrabCut Method for Automatic Mask Generation
by Hao Wu, Yulong Liu, Xiangrong Xu and Yukun Gao
Micromachines 2022, 13(12), 2095; https://doi.org/10.3390/mi13122095 - 28 Nov 2022
Cited by 2 | Viewed by 2187
Abstract
Mask R-CNN-based object detection is typically very time-consuming and laborious because the target object masks required for training must be obtained manually. Therefore, in order to generate the image masks automatically, we propose a GrabCut-based automated mask generation method for object detection. The proposed method consists of two stages. The first stage uses GrabCut’s interactive image segmentation method to generate the mask. The second stage is based on the object detection network of Mask R-CNN, which uses the mask from the previous stage together with the original input image and the associated label information for training. The Mask R-CNN model then automatically detects the relevant objects during testing. In experiments with three objects from the Berkeley Instance Recognition Dataset, this method achieved a mean average precision (mAP) of over 95% for segmentation. The proposed method is simple and highly efficient in obtaining the mask of a segmented target object. Full article
(This article belongs to the Special Issue Machine Learning Techniques on IoT Applications)
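Stage one can be sketched as writing out image/mask training pairs derived from GrabCut; the rectangle, label, and file layout below are illustrative assumptions, not the paper's format:

```python
import cv2
import numpy as np

def save_training_pair(img_path, rect, label, out_stem):
    """Stage-one output: a GrabCut-derived binary mask plus its class label,
    to be consumed as a training pair by the stage-two Mask R-CNN."""
    img = cv2.imread(img_path)
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
    cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    binary = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8) * 255
    cv2.imwrite(f"{out_stem}_mask.png", binary)
    with open(f"{out_stem}_label.txt", "w") as f:
        f.write(label)

save_training_pair("object_01.jpg", (40, 40, 320, 240), "mug", "train/object_01")
```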

13 pages, 2872 KB  
Article
Interactive Image Segmentation Based on Feature-Aware Attention
by Jinsheng Sun, Xiaojuan Ban, Bing Han, Xueyuan Yang and Chao Yao
Symmetry 2022, 14(11), 2396; https://doi.org/10.3390/sym14112396 - 12 Nov 2022
Viewed by 3220
Abstract
Interactive segmentation is a technique for picking objects of interest in images according to users’ input interactions. Some recent works take the users’ interactive input to guide the deep neural network training, where the users’ click information is utilized as weak-supervised information. However, limited by the learning capability of the model, this structure does not accurately represent the user’s interaction intention. In this work, we propose a multi-click interactive segmentation solution for employing human intention to refine the segmentation results. We propose a coarse segmentation network to extract semantic information and generate rough results. Then, we designed a feature-aware attention module according to the symmetry of user intention and image semantic information. Finally, we establish a refinement module to combine the feature-aware results with coarse masks to generate precise intentional segmentation. Furthermore, the feature-aware module is trained as a plug-and-play tool, which can be embedded into most deep image segmentation models for exploiting users’ click information in the training process. We conduct experiments on five common datasets (SBD, GrabCut, DAVIS, Berkeley, MS COCO) and the results prove our attention module can improve the performance of image segmentation networks. Full article
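A common way to feed click interactions into such a network is to encode them as extra guidance channels; a minimal distance-map encoding is sketched below (the paper's feature-aware attention module itself is more elaborate and not shown):

```python
import numpy as np

def click_maps(h, w, pos_clicks, neg_clicks, radius=40.0):
    """Encode positive/negative user clicks as two distance-based guidance
    maps that can be concatenated with the RGB channels as network input."""
    yy, xx = np.mgrid[0:h, 0:w]

    def dist_map(clicks):
        m = np.full((h, w), np.inf)
        for cx, cy in clicks:
            m = np.minimum(m, np.hypot(xx - cx, yy - cy))
        return np.exp(-m / radius) if clicks else np.zeros((h, w))

    return np.stack([dist_map(pos_clicks), dist_map(neg_clicks)])

# e.g. one foreground click and one background click on a 480x640 image:
guidance = click_maps(480, 640, [(320, 240)], [(50, 60)])
```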