Application of Machine Vision and Deep Learning Technology

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 June 2024) | Viewed by 46193

Special Issue Editors


Guest Editor
School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Interests: machine vision; optical engineering; deep learning

Guest Editor
Department of Electronic Information Engineering, Zhengzhou University, Zhengzhou 450000, China
Interests: image processing; visual inspection; photoelectric measurement

Guest Editor
School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Interests: photoelectric measurement; optical information processing; optical precision instruments; digital image processing; manufacturing systems; quality engineering

Special Issue Information

Dear Colleagues,

Machine vision is a rapidly developing branch of artificial intelligence that aims to use machines, rather than human eyes, to measure and judge. In recent years, with the continuous development of deep learning, its end-to-end learning concept and outstanding data analysis ability have helped machine vision technology achieve higher accuracy in image classification, target recognition, and semantic segmentation, increasing its use in security, driverless cars, smart homes, medical imaging, and other fields. This Special Issue discusses recent efforts and advances in machine vision and deep learning.

Dr. Junhui Huang
Dr. Qi Xue
Prof. Dr. Zhao Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine vision
  • deep learning
  • 3D sensing
  • optical imaging and measurement
  • super-resolution imaging
  • image processing
  • artificial intelligence and photonic neural network
  • target recognition
  • neural network and optimization
  • semantic segmentation and understanding
  • hardware, algorithms, and techniques relating to machine vision, e.g., automatic optical inspection, industrial product testing, driverless cars, character recognition, and tracking and positioning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (17 papers)


Research


15 pages, 3397 KiB  
Article
Multi-UAV Area Coverage Track Planning Based on the Voronoi Graph and Attention Mechanism
by Jubo Wang and Ruixin Wang
Appl. Sci. 2024, 14(17), 7844; https://doi.org/10.3390/app14177844 - 4 Sep 2024
Viewed by 574
Abstract
Drone area coverage primarily involves using unmanned aerial vehicles (UAVs) for extensive monitoring, surveying, communication, and other tasks over specific regions. The significance and value of this technology are multifaceted. Firstly, UAVs can rapidly and efficiently reach remote or inaccessible areas to perform tasks such as terrain mapping, disaster monitoring, or search and rescue, significantly enhancing response speed and execution efficiency. Secondly, drone area coverage in agricultural monitoring, forestry conservation, and urban planning offers high-precision data support, aiding scientists and decision-makers in making more accurate judgments and decisions. Additionally, drones can serve as temporary communication base stations in areas with poor communication, ensuring the transfer of crucial information. Drone area coverage technology is vital in improving work efficiency, reducing costs, and strengthening decision support. This paper aims to solve the optimization problem of multi-UAV area coverage flight path planning to enhance system efficiency and task execution capability. For multi-center optimization problems, a region decomposition method based on the Voronoi graph is designed, transforming the multi-UAV area coverage problem into single-UAV area coverage problems, greatly simplifying the complexity and computational process. For the single-UAV area coverage problem and its corresponding area, this paper designs a convolutional neural network with a channel and spatial attention mechanism (CSAM) to enhance feature fusion capability, enabling the model to focus on core features for solving single-UAV path selection and ultimately generating the optimal path. Simulation results demonstrate that the proposed method achieves excellent performance.
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
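The Voronoi decomposition step described above can be sketched in a few lines: assign each grid cell of the survey area to its nearest UAV start point, yielding one sub-region per UAV. This is an illustrative toy only; the grid size, seed positions, and function names are assumptions, not the paper's code:

```python
# Hypothetical sketch: partition a survey area among UAVs by assigning each
# grid cell to its nearest UAV start point -- the discrete analogue of a
# Voronoi decomposition of the coverage area.

def voronoi_partition(cells, seeds):
    """Map each cell (x, y) to the index of the nearest seed (Euclidean)."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return {c: min(range(len(seeds)), key=lambda i: dist2(c, seeds[i]))
            for c in cells}

# Example: a 6x6 grid split among three UAVs (positions invented).
grid = [(x, y) for x in range(6) for y in range(6)]
uav_positions = [(0, 0), (5, 0), (2, 5)]
regions = voronoi_partition(grid, uav_positions)
```

Each resulting sub-region can then be handed to the single-UAV path planner independently.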

18 pages, 21505 KiB  
Article
Correction Compensation and Adaptive Cost Aggregation for Deep Laparoscopic Stereo Matching
by Jian Zhang, Bo Yang, Xuanchi Zhao and Yi Shi
Appl. Sci. 2024, 14(14), 6176; https://doi.org/10.3390/app14146176 - 16 Jul 2024
Viewed by 572
Abstract
Perception of digitized depth is a prerequisite for enabling the intelligence of three-dimensional (3D) laparoscopic systems. In this context, stereo matching of laparoscopic stereoscopic images presents a promising solution. However, the current research in this field still faces challenges. First, the acquisition of accurate depth labels in a laparoscopic environment proves to be a difficult task. Second, errors in the correction of laparoscopic images are prevalent. Finally, laparoscopic image registration suffers from ill-posed regions such as specular highlights and textureless areas. In this paper, we make significant contributions by developing (1) a correction compensation module to overcome correction errors; (2) an adaptive cost aggregation module to improve prediction performance in ill-posed regions; (3) a novel self-supervised stereo matching framework based on these two modules. Specifically, our framework rectifies features and images based on learned pixel offsets, and performs differentiated aggregation on cost volumes based on their value. The experimental results demonstrate the effectiveness of the proposed modules. On the SCARED dataset, our model reduces the mean depth error by 12.6% compared to the baseline model and outperforms the state-of-the-art unsupervised methods and well-generalized models.
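For intuition about the cost volumes the paper aggregates, here is a minimal classical block-matching baseline on two 1-D scanlines: every candidate disparity d gets a sum-of-absolute-differences (SAD) cost, and the cheapest one wins. The scanlines and window size are invented; the paper's learned, adaptively aggregated costs replace exactly this hand-crafted step:

```python
# Toy 1-D block matching: per-pixel disparity by minimising SAD cost.

def sad_disparity(left, right, max_disp, win=1):
    """Per-pixel disparity for two 1-D scanlines (lists of intensities)."""
    n = len(left)
    disps = []
    for x in range(n):
        best, best_d = float("inf"), 0
        for d in range(min(max_disp, x) + 1):
            # SAD cost over a small window, with indices clamped at borders.
            cost = sum(abs(left[min(n - 1, max(0, x + k))] -
                           right[min(n - 1, max(0, x - d + k))])
                       for k in range(-win, win + 1))
            if cost < best:
                best, best_d = cost, d
        disps.append(best_d)
    return disps

# Right scanline is the left one shifted by 2 pixels (synthetic example).
left = [5, 5, 10, 20, 30, 40, 50, 60]
right = [10, 20, 30, 40, 50, 60, 60, 60]
disps = sad_disparity(left, right, max_disp=3)
```

In the interior of the scanline the recovered disparity is the true shift of 2.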

18 pages, 3419 KiB  
Article
Deep Learning-Based Super-Resolution Reconstruction and Segmentation of Photoacoustic Images
by Yufei Jiang, Ruonan He, Yi Chen, Jing Zhang, Yuyang Lei, Shengxian Yan and Hui Cao
Appl. Sci. 2024, 14(12), 5331; https://doi.org/10.3390/app14125331 - 20 Jun 2024
Viewed by 951
Abstract
Photoacoustic imaging (PAI) is an emerging imaging technique that offers real-time, non-invasive, and radiation-free measurements of optical tissue properties. However, image quality degradation due to factors such as non-ideal signal detection hampers its clinical applicability. To address this challenge, this paper proposes an algorithm for super-resolution reconstruction and segmentation based on deep learning. The proposed enhanced deep super-resolution minimalistic network (EDSR-M) not only mitigates the shortcomings of the original algorithm regarding computational complexity and parameter count but also employs residual learning and attention mechanisms to extract image features and enhance image details, thereby achieving high-quality reconstruction of PAI. DeepLabV3+ is used to segment the images before and after reconstruction to verify the network reconstruction performance. The experimental results demonstrate average improvements of 19.76% in peak signal-to-noise ratio (PSNR) and 4.80% in structural similarity index (SSIM) for the reconstructed images compared to those of their pre-reconstructed counterparts. Additionally, the mean accuracy, mean intersection over union (IoU), and mean boundary F1 score (BFScore) for segmentation showed enhancements of 8.27%, 6.20%, and 6.28%, respectively. The proposed algorithm enhances the detail and texture features of PAI and makes the restored image structure more complete.
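The PSNR metric behind the reported 19.76% improvement is easy to state concretely. A minimal sketch, where the eight-pixel "images" are invented purely to exercise the formula:

```python
import math

# PSNR = 10 * log10(MAX^2 / MSE); higher is better, infinite for identical images.

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, img)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

reference = [52, 55, 61, 59, 79, 61, 76, 61]
degraded  = [54, 55, 60, 58, 80, 62, 75, 60]
```

A "19.76% improvement in PSNR" then means the reconstructed image's PSNR against ground truth rose by that fraction relative to the degraded input's.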

14 pages, 2302 KiB  
Article
A Multi-Scale Attention Fusion Network for Retinal Vessel Segmentation
by Shubin Wang, Yuanyuan Chen and Zhang Yi
Appl. Sci. 2024, 14(7), 2955; https://doi.org/10.3390/app14072955 - 31 Mar 2024
Cited by 3 | Viewed by 1175
Abstract
The structure and function of retinal vessels play a crucial role in diagnosing and treating various ocular and systemic diseases. Therefore, the accurate segmentation of retinal vessels is of paramount importance to assist a clinical diagnosis. U-Net has been highly praised for its outstanding performance in the field of medical image segmentation. However, with the increase in network depth, multiple pooling operations may lead to the loss of crucial information. Additionally, the insufficient processing of local context features by skip connections can affect the accurate segmentation of retinal vessels. To address these problems, we proposed a novel model for retinal vessel segmentation. The proposed model is implemented based on the U-Net architecture, with the addition of two blocks, namely, an MsFE block and an MsAF block, between the encoder and decoder at each layer of the U-Net backbone. The MsFE block extracts low-level features from different scales, while the MsAF block performs feature fusion across various scales. Finally, the output of the MsAF block replaces the skip connection in the U-Net backbone. Experimental evaluations on the DRIVE, CHASE_DB1, and STARE datasets demonstrated that MsAF-UNet exhibited excellent segmentation performance compared with state-of-the-art methods.
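The kind of attention-weighted fusion the MsAF block performs can be illustrated with a squeeze-and-excite-style toy: pool each channel to a scalar, squash it through a sigmoid, and rescale the channel by that weight. This is a schematic stand-in under assumed shapes, not the authors' implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feature_maps):
    """feature_maps: list of channels, each a flat list of activations."""
    # Squeeze: global average pool each channel to one scalar weight in (0, 1).
    weights = [sigmoid(sum(ch) / len(ch)) for ch in feature_maps]
    # Excite: rescale every activation in the channel by its weight.
    return [[w * v for v in ch] for w, ch in zip(weights, feature_maps)]

# Two invented 3-element channels: one strongly active, one suppressed.
features = [[0.5, 1.0, 1.5], [-2.0, -1.0, -3.0]]
attended = channel_attention(features)
```

Channels with higher average activation are passed through nearly unchanged, while weak channels are attenuated before fusion.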

23 pages, 21756 KiB  
Article
Segmenting Urban Scene Imagery in Real Time Using an Efficient UNet-like Transformer
by Haiqing Xu, Mingyang Yu, Fangliang Zhou and Hongling Yin
Appl. Sci. 2024, 14(5), 1986; https://doi.org/10.3390/app14051986 - 28 Feb 2024
Cited by 1 | Viewed by 1095
Abstract
Semantic segmentation of high-resolution remote sensing urban images is widely used in many fields, such as environmental protection, urban management, and sustainable development. For many years, convolutional neural networks (CNNs) have been a prevalent method in the field, but the convolution operations are deficient in modeling global information due to their local nature. In recent years, the Transformer-based methods have demonstrated their advantages in many domains due to the powerful ability to model global information, such as semantic segmentation, instance segmentation, and object detection. Despite the above advantages, Transformer-based architectures tend to incur significant computational costs, limiting the model’s real-time application potential. To address this problem, we propose a U-shaped network with Transformer as the decoder and CNN as the encoder to segment remote sensing urban scene images. For efficient segmentation, we design a window-based, multi-head, focused linear self-attention (WMFSA) mechanism and further propose the global–local information modeling module (GLIM), which can capture both global and local contexts through a dual-branch structure. Experimenting on four challenging datasets, we demonstrate that our model not only achieves a higher segmentation accuracy compared with other methods but also can obtain competitive speeds to enhance the model’s real-time application potential. Specifically, the mIoU of our method is 68.2% and 52.8% on the UAVid and LoveDA datasets, respectively, while the speed is 114 FPS, with a 1024 × 1024 input on a single 3090 GPU.

12 pages, 22729 KiB  
Article
nmODE-Unet: A Novel Network for Semantic Segmentation of Medical Images
by Shubin Wang, Yuanyuan Chen and Zhang Yi
Appl. Sci. 2024, 14(1), 411; https://doi.org/10.3390/app14010411 - 2 Jan 2024
Cited by 4 | Viewed by 1447
Abstract
Diabetic retinopathy is a prevalent eye disease that poses a potential risk of blindness. Nevertheless, due to the small size of diabetic retinopathy lesions and the high interclass similarity in terms of location, color, and shape among different lesions, the segmentation task is highly challenging. To address these issues, we proposed a novel framework named nmODE-Unet, which is based on the nmODE (neural memory Ordinary Differential Equation) block and U-net backbone. In nmODE-Unet, the shallow features serve as input to the nmODE block, and the output of the nmODE block is fused with the corresponding deep features. Extensive experiments were conducted on the IDRiD dataset, e_ophtha dataset, and the LGG segmentation dataset, and the results demonstrate that, in comparison to other competing models, nmODE-Unet showcases a superior performance.

12 pages, 2615 KiB  
Article
Automatic Recognition of Blood Cell Images with Dense Distributions Based on a Faster Region-Based Convolutional Neural Network
by Yun Liu, Yumeng Liu, Menglu Chen, Haoxing Xue, Xiaoqiang Wu, Linqi Shui, Junhong Xing, Xian Wang, Hequn Li and Mingxing Jiao
Appl. Sci. 2023, 13(22), 12412; https://doi.org/10.3390/app132212412 - 16 Nov 2023
Viewed by 931
Abstract
In modern clinical medicine, the important information of red blood cells, such as shape and number, is applied to detect blood diseases. However, the automatic recognition problem of single cells and adherent cells always exists in a densely distributed medical scene, which is difficult to solve for both the traditional detection algorithms with lower recognition rates and the conventional networks with weaker feature extraction capabilities. In this paper, an automatic recognition method of adherent blood cells with dense distribution is proposed. Based on the Faster R-CNN, the balanced feature pyramid structure, deformable convolution network, and efficient pyramid split attention mechanism are adopted to automatically recognize the blood cells under the conditions of dense distribution, extrusion deformation, adhesion, and overlap. In addition, the region-of-interest Align (RoI Align) algorithm also contributes to improving the accuracy of the recognition results. The experimental results show that the mean average precision of cell detection is 0.895, which is 24.5% higher than that of the original network model. Compared with the one-stage mainstream networks, the presented network has a stronger feature extraction capability. The proposed method is suitable for identifying single cells and adherent cells with dense distribution in the actual medical scene.
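The region-of-interest Align step mentioned above reads feature values at fractional coordinates via bilinear interpolation rather than rounding to the nearest cell. A minimal sketch of that sampling, where the 2×2 feature grid and query points are illustrative:

```python
# Bilinear sampling of a 2-D feature grid at non-integer coordinates,
# the core operation behind RoI Align's sub-pixel pooling.

def bilinear(grid, x, y):
    """Sample grid[row][col] at fractional (x=col, y=row) coordinates."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bot = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bot * fy

feature = [[0.0, 10.0],
           [20.0, 30.0]]
```

Sampling at the exact center of this grid blends all four values equally.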

16 pages, 3003 KiB  
Article
A Visual-Based Approach for Driver’s Environment Perception and Quantification in Different Weather Conditions
by Longxi Luo, Minghao Liu, Jiahao Mei, Yu Chen and Luzheng Bi
Appl. Sci. 2023, 13(22), 12176; https://doi.org/10.3390/app132212176 - 9 Nov 2023
Cited by 1 | Viewed by 1517
Abstract
The decision-making behavior of drivers during the driving process is influenced by various factors, including road conditions, traffic situations, weather conditions, and so on. However, our understanding and quantification of the driving environment are still very limited, which not only increases the risk of driving but also hinders the deployment of autonomous vehicles. To address this issue, this study attempts to transform drivers’ visual perception into machine vision perception. Specifically, the study provides a detailed decomposition of the elements constituting weather and proposes three environmental quantification indicators: visibility brightness, visibility clarity, and visibility obstruction rate. These indicators help us to describe and quantify the driving environment more accurately. Based on these indicators, a visual-based environmental quantification method is further proposed to better understand and interpret the driving environment. Additionally, based on drivers’ visual perception, this study extensively analyzes the impact of environmental factors on driver behavior. A cognitive assessment model is established to evaluate drivers’ cognitive abilities in different environments. The effectiveness and accuracy of the model are validated through driver simulation experiments, thereby establishing a communication bridge between the driving environment and driver behavior. This research achievement enables us to better understand the decision-making behavior of drivers in specific environments and provides some references for the development of intelligent driving technology.

19 pages, 2158 KiB  
Article
U2-Net: A Very-Deep Convolutional Neural Network for Detecting Distracted Drivers
by Nawaf O. Alsrehin, Mohit Gupta, Izzat Alsmadi and Saif Addeen Alrababah
Appl. Sci. 2023, 13(21), 11898; https://doi.org/10.3390/app132111898 - 31 Oct 2023
Cited by 2 | Viewed by 2046
Abstract
In recent years, the number of deaths and injuries resulting from traffic accidents has been increasing dramatically all over the world due to distracted drivers. Thus, a key element in developing intelligent vehicles and safe roads is monitoring driver behaviors. In this paper, we modify and extend the U-net convolutional neural network so that it provides deep layers to represent image features and yields more precise classification results. This forms the basis of a very deep convolutional neural network, called U2-net, for detecting distracted drivers. The U2-net model has two paths (contracting and expanding) in addition to a fully connected dense layer. The contracting path is used to extract the context around the objects to provide better object representation, while the symmetric expanding path enables precise localization. The motivation behind this model is that it provides precise object features for better object representation and classification. We used two public datasets, MI-AUC and State Farm, to evaluate the U2-net model in detecting distracted driving. The accuracy of U2-net on MI-AUC and State Farm is 98.34% and 99.64%, respectively. These evaluation results show higher accuracy than achieved by many other state-of-the-art methods.

17 pages, 4439 KiB  
Article
Exploring the ViDiDetect Tool for Automated Defect Detection in Manufacturing with Machine Vision
by Mateusz Dziubek, Jacek Rysiński and Daniel Jancarczyk
Appl. Sci. 2023, 13(19), 11098; https://doi.org/10.3390/app131911098 - 9 Oct 2023
Cited by 2 | Viewed by 1966
Abstract
Automated monitoring of cutting tool wear is of paramount importance in the manufacturing industry, as it directly impacts production efficiency and product quality. Traditional manual inspection methods are time-consuming and prone to human error, necessitating the adoption of more advanced techniques. This study explores the application of ViDiDetect, a deep learning-based defect detection solution, in the context of machine vision for assessing cutting tool wear. By capturing high-resolution images of machining tools and analyzing wear patterns, machine vision systems offer a non-contact and non-destructive approach to tool wear assessment, enabling continuous monitoring without disrupting the machining process. In this research, a smart camera and an illuminator were utilized to capture images of a car suspension knuckle’s machined surface, with a focus on detecting burrs, chips, and tool wear. The study also employed a mask to narrow the region of interest and enhance classification accuracy. This investigation demonstrates the potential of machine vision and ViDiDetect in automating cutting tool wear assessment, ultimately enhancing manufacturing processes’ efficiency and product quality. The project is at the implementation stage in one of the automotive production plants located in southern Poland.

15 pages, 1482 KiB  
Article
LightSeg: Local Spatial Perception Convolution for Real-Time Semantic Segmentation
by Xiaochun Lei, Jiaming Liang, Zhaoting Gong and Zetao Jiang
Appl. Sci. 2023, 13(14), 8130; https://doi.org/10.3390/app13148130 - 12 Jul 2023
Cited by 1 | Viewed by 1365
Abstract
Semantic segmentation is increasingly being applied on mobile devices due to advancements in mobile chipsets, particularly in low-power consumption scenarios. However, the lightweight design of mobile devices poses limitations on the receptive field, which is crucial for dense prediction problems. Existing approaches have attempted to balance lightweight designs and high accuracy by downsampling features in the backbone. However, this downsampling may result in the loss of local details at each network stage. To address this challenge, this paper presents a compact and efficient convolutional neural network (CNN) for real-time applications, built around our proposed local spatial perception convolution (LSPConv). Furthermore, the effectiveness of our architecture is demonstrated on the Cityscapes dataset. The results show that our model achieves an impressive balance between accuracy and inference speed. Specifically, our LightSeg, which does not rely on ImageNet pretraining, achieves an mIoU of 76.1 at a speed of 61 FPS on the Cityscapes validation set, utilizing an RTX 2080Ti GPU with mixed precision. Additionally, it achieves a speed of 115.7 FPS on the Jetson NX with int8 precision.
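The mIoU figure quoted above (76.1 on the Cityscapes validation set) follows the standard definition: per-class intersection over union, averaged over the classes present. A toy computation with invented labels:

```python
# Mean intersection-over-union (mIoU) from flat prediction/ground-truth lists.

def mean_iou(pred, truth, num_classes):
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Eight pixels, three classes; one class-1 pixel is mislabeled as class 2.
truth = [0, 0, 1, 1, 1, 2, 2, 2]
pred  = [0, 0, 1, 1, 2, 2, 2, 2]
miou = mean_iou(pred, truth, 3)
```

Here class 0 scores 1.0, class 1 scores 2/3, and class 2 scores 3/4, so the mean lands at roughly 0.806.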

17 pages, 5162 KiB  
Article
Semantic Segmentation of Packaged and Unpackaged Fresh-Cut Apples Using Deep Learning
by Udith Krishnan Vadakkum Vadukkal, Michela Palumbo and Giovanni Attolico
Appl. Sci. 2023, 13(12), 6969; https://doi.org/10.3390/app13126969 - 9 Jun 2023
Cited by 2 | Viewed by 1964
Abstract
Computer vision systems are often used in industrial quality control to offer fast, objective, non-destructive, and contactless evaluation of fruit. The senescence of fresh-cut apples is strongly related to the browning of the pulp rather than to the properties of the peel. This work addresses the identification and selection of pulp inside images of fresh-cut apples, both packaged and unpackaged; this is a critical step towards a computer vision system that is able to evaluate their quality and internal properties. A DeepLabV3+-based convolutional neural network model (CNN) has been developed for this semantic segmentation task. It has proved to be robust with respect to the similarity of colours between the peel and pulp. Its ability to separate the pulp from the peel and background has been verified on four varieties of apples: Granny Smith (greenish peel), Golden (yellowish peel), Fuji, and Pink Lady (reddish peel). The semantic segmentation achieved an accuracy greater than 99% on all these varieties. The developed approach was able to isolate regions significantly affected by the browning process on both packaged and unpackaged pieces: on these areas, the colour analysis will be studied to evaluate internal quality and senescence of packaged and unpackaged products.

20 pages, 6660 KiB  
Article
Real-Time Defect Detection for Metal Components: A Fusion of Enhanced Canny–Devernay and YOLOv6 Algorithms
by Hongjun Wang, Xiujin Xu, Yuping Liu, Deda Lu, Bingqiang Liang and Yunchao Tang
Appl. Sci. 2023, 13(12), 6898; https://doi.org/10.3390/app13126898 - 7 Jun 2023
Cited by 6 | Viewed by 2843
Abstract
Due to the presence of numerous surface defects, the inadequate contrast between defective and non-defective regions, and the resemblance between noise and subtle defects, edge detection poses a significant challenge in dimensional error detection, leading to increased dimensional measurement inaccuracies. These issues serve [...] Read more.
Due to the presence of numerous surface defects, the low contrast between defective and non-defective regions, and the resemblance between noise and subtle defects, edge detection poses a significant challenge in dimensional error detection and increases dimensional measurement inaccuracies. These issues are major bottlenecks in the automatic detection of high-precision metal parts. To address these challenges, this research applies the YOLOv6 deep learning network to metal lock body parts for the rapid and accurate detection of surface flaws in metal workpieces, together with an enhanced Canny–Devernay sub-pixel edge detection algorithm to measure the size of the lock core bead hole. The methodology is as follows: the data set for surface defect detection is annotated using the labeling software labelImg and then used to train the YOLOv6 model to obtain the model weights. For size measurement, the region of interest (ROI) containing the lock cylinder bead hole is first extracted. Gaussian filtering is applied to the ROI, followed by sub-pixel edge detection using the improved Canny–Devernay algorithm. Finally, the edges are fitted using the least squares method to determine the radius of the fitted circle, and the measured value is obtained through size conversion. In the experiments, the YOLOv6 method detects surface defects in the lock body workpiece with a mean Average Precision (mAP) of 0.911, and the size of the lock core bead hole measured with the improved Canny–Devernay sub-pixel edge detection shows an average error of less than 0.03 mm. These findings demonstrate a practical method for applying machine vision to the automatic detection of metal parts.
This is accomplished through the exploration of identification methods and size-measurement techniques for common defects found in metal parts. The study thus establishes a valuable framework for effectively applying machine vision to metal parts inspection and defect detection. Full article
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
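The final steps of the pipeline above, fitting sub-pixel edge points to a circle by least squares and reading off the radius, can be sketched as follows. This is not the authors' code: it is a minimal numpy sketch using the standard Kasa least-squares circle fit on synthetic edge points, with the function name and test geometry chosen purely for illustration.

```python
import numpy as np

def fit_circle_least_squares(points):
    """Fit a circle to 2-D edge points with the Kasa least-squares method.

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c for the centre (a, b) and
    radius sqrt(c + a^2 + b^2).
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x ** 2 + y ** 2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt(c + a ** 2 + b ** 2)
    return (a, b), radius

# Synthetic sub-pixel edge points on a circle of radius 5 centred at (10, 20)
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
edge = np.column_stack([10 + 5 * np.cos(theta), 20 + 5 * np.sin(theta)])
(cx, cy), r = fit_circle_least_squares(edge)
print(round(cx, 3), round(cy, 3), round(r, 3))  # → 10.0 20.0 5.0
```

With real data, the edge points would come from the Canny–Devernay detector, and the fitted radius would then be converted to millimetres via the calibrated pixel size.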

14 pages, 4371 KiB  
Article
Lossless Compression of Large Aperture Static Imaging Spectrometer Data
by Lu Yu, Hongbo Li, Jing Li and Wei Li
Appl. Sci. 2023, 13(9), 5632; https://doi.org/10.3390/app13095632 - 3 May 2023
Cited by 1 | Viewed by 1273
Abstract
The large-aperture static imaging spectrometer (LASIS) is an interference spectrometer with high device stability, high throughput, a wide spectral range, and high spectral resolution. Each frame of the original data cube acquired by the LASIS shows the scene image superimposed with interference fringes, which is distinctly different from traditional hyperspectral images. For this new type of data, a lossless compression scheme is presented that combines a novel data rearrangement method with CCSDS-123, the lossless multispectral and hyperspectral image compression standard. In the rearrangement step, the LASIS data cube is reorganized so that the interference information overlapped on the image can be separated, and the result is then processed with the CCSDS-123 standard. Several experiments are conducted to evaluate the rearrangement method and to examine the impact of different CCSDS-123 parameter settings on LASIS data. The experimental results indicate that the proposed scheme achieves a 32.9% higher compression ratio than traditional rearrangement methods. Moreover, a suitable parameter combination for compressing LASIS data is presented, yielding a 19.6% improvement over the default settings suggested by the standard. Full article
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
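The rearrangement idea can be illustrated with a toy sketch. The exact LASIS geometry is not given in the abstract, so the following assumes a simplified pushbroom model in which a ground line g is imaged at OPD row r of frame g + r; the function name, array shapes, and scan model are illustrative assumptions, not the paper's actual scheme.

```python
import numpy as np

def rearrange_lasis(frames):
    """Rearrange LASIS pushbroom frames into an interferogram cube.

    frames: array of shape (T, R, C) -- T frames, each with R rows
    (OPD steps along the scan direction) by C cross-track pixels.
    Under the assumed scan model, ground line g appears at OPD row r
    in frame g + r, so gathering frames[g + r, r, :] over all r
    separates the scene from the fixed fringe pattern.
    """
    T, R, C = frames.shape
    n_lines = T - R + 1  # ground lines observed at every OPD step
    cube = np.empty((n_lines, R, C), dtype=frames.dtype)
    for g in range(n_lines):
        for r in range(R):
            cube[g, r, :] = frames[g + r, r, :]
    return cube

# Toy scene: separable product of a scene value per ground line and a
# fixed fringe modulation per OPD row.
T, R, C = 6, 3, 2
base = np.arange(T - R + 1) + 1.0          # scene values [1, 2, 3, 4]
fringe = np.array([1.0, 2.0, 0.5])         # fringe factor per OPD row
frames = np.zeros((T, R, C))
for t in range(T):
    for r in range(R):
        if 0 <= t - r < T - R + 1:
            frames[t, r, :] = base[t - r] * fringe[r]
cube = rearrange_lasis(frames)
```

After rearrangement, each band of the cube varies smoothly with the scene, which is what allows the CCSDS-123 predictor to work well on it.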

18 pages, 13737 KiB  
Article
TSDNet: A New Multiscale Texture Surface Defect Detection Model
by Min Dong, Dezhen Li, Kaixiang Li and Junpeng Xu
Appl. Sci. 2023, 13(5), 3289; https://doi.org/10.3390/app13053289 - 4 Mar 2023
Cited by 2 | Viewed by 2123
Abstract
Industrial defect detection methods based on deep learning can reduce the cost of traditional manual quality inspection, improve the accuracy and efficiency of detection, and are widely used in industrial fields. Traditional computer-vision defect detection methods rely on hand-crafted features and require large amounts of defect data, which limits their applicability. This paper proposes TSDNet, a texture surface defect detection method based on a convolutional neural network and wavelet analysis. The approach combines wavelet analysis with patch extraction, which can detect and locate many defects against a complex texture background; a patch extraction method based on random windows is proposed, which can quickly and effectively extract defective patches; and a judgment strategy based on a sliding window is proposed to improve the robustness of the CNN. The method achieves excellent detection accuracy on the DAGM 2007 dataset, a micro-surface defect database, and the KolektorSDD dataset, and can locate defects accurately. The results show that, against a complex texture background, the method obtains high defect detection accuracy with only a small amount of training data and accurately locates defect positions. Full article
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
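The two patch strategies named above, random-window extraction for training and a sliding-window judgment at test time, might look like the following numpy sketch. The thresholds, window sizes, and the `score_fn` interface (standing in for the trained CNN) are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def random_patches(image, mask, patch_size=32, n_patches=20,
                   defect_ratio=0.05, rng=None):
    """Extract training patches at random window positions.

    A patch is labelled defective (1) when the defect mask covers more
    than `defect_ratio` of its area, otherwise background (0).
    """
    rng = np.random.default_rng(rng)
    H, W = image.shape
    patches, labels = [], []
    for _ in range(n_patches):
        y = rng.integers(0, H - patch_size + 1)
        x = rng.integers(0, W - patch_size + 1)
        patches.append(image[y:y + patch_size, x:x + patch_size])
        frac = mask[y:y + patch_size, x:x + patch_size].mean()
        labels.append(int(frac > defect_ratio))
    return np.stack(patches), np.array(labels)

def sliding_window_vote(score_fn, image, patch_size=32, stride=16,
                        thresh=0.5, min_votes=2):
    """Sliding-window judgment: flag an image as defective only when at
    least `min_votes` windows score above `thresh`, which suppresses
    isolated spurious activations of the classifier."""
    H, W = image.shape
    votes = 0
    for y in range(0, H - patch_size + 1, stride):
        for x in range(0, W - patch_size + 1, stride):
            if score_fn(image[y:y + patch_size, x:x + patch_size]) > thresh:
                votes += 1
    return votes >= min_votes

# Toy demo: a bright 48x48 "defect" in an otherwise dark 64x64 texture.
img = np.zeros((64, 64))
img[:48, :48] = 1.0
patches, labels = random_patches(img, img, n_patches=10, rng=0)
flagged = sliding_window_vote(lambda p: p.mean(), img)
```

Requiring several agreeing windows is the simple voting idea behind the robustness claim; in the paper the score would come from the trained TSDNet rather than a mean-intensity stand-in.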

20 pages, 8014 KiB  
Article
Research on Automatic Error Data Recognition Method for Structured Light System Based on Residual Neural Network
by Aozhuo Ding, Qi Xue, Xulong Ding, Xiaohong Sun, Xiaonan Yang and Huiying Ye
Appl. Sci. 2023, 13(5), 2920; https://doi.org/10.3390/app13052920 - 24 Feb 2023
Viewed by 1465
Abstract
In a structured light system, the positioning accuracy of the stripe is one of the determinants of measurement accuracy. However, the quality of the structured light stripe is degraded by noise, object shape, color, etc. The positioning accuracy of a low-quality stripe center decreases, introducing large errors into the measurement results that could previously only be recognized by a human. To address this problem, this paper proposes a method that identifies data with relatively large errors in 3D measurement results by evaluating the quality of the grayscale distribution of the stripes. In this method, undegraded and degraded stripe images are captured, and a residual neural network is trained on the grayscale distributions of the two types of stripes. The captured stripes are then classified by the trained model. Finally, the data corresponding to the degraded stripes, i.e., the data with large errors, can be identified from the classification results. Experiments show that the proposed algorithm can automatically and effectively identify data with large errors. Full article
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
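A residual block applied to a 1-D stripe grayscale profile can be sketched as below. The layer sizes, the Gaussian test profile, and the use of a plain fully connected block (rather than the paper's actual convolutional ResNet) are simplifying assumptions made purely to show how the skip connection acts on a profile.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, b1, W2, b2):
    """Forward pass of one fully connected residual block:
    out = relu(x + W2 @ relu(W1 @ x + b1) + b2).
    The skip connection lets the layers model only the small deviation
    of a degraded stripe profile from its input shape."""
    return relu(x + W2 @ relu(W1 @ x + b1) + b2)

# Ideal stripe cross-section: a Gaussian grayscale profile of length 64.
n, h = 64, 32
x = np.exp(-0.5 * ((np.arange(n) - 32) / 4.0) ** 2)
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((h, n)) * 0.05, np.zeros(h)
W2, b2 = rng.standard_normal((n, h)) * 0.05, np.zeros(n)
out = residual_block(x, W1, b1, W2, b2)
```

With zero weights the block reduces to the identity on a non-negative profile, which is exactly the property that makes residual networks easy to train on small deviations such as stripe degradation.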

Review


17 pages, 1044 KiB  
Review
Quick Overview of Face Swap Deep Fakes
by Tomasz Walczyna and Zbigniew Piotrowski
Appl. Sci. 2023, 13(11), 6711; https://doi.org/10.3390/app13116711 - 31 May 2023
Cited by 5 | Viewed by 20763
Abstract
Deep Fake generation and detection technologies have both developed rapidly in recent years, with researchers in the two fields continually trying to outpace each other's achievements. These works use autoencoders, generative adversarial networks, and other algorithms to create fake content that resists detection by algorithms or the human eye. Among the ever-increasing number of emerging works, a few can be singled out whose solutions and robustness contribute significantly to the field. Despite the advancement of generative algorithms, much of the field remains open to further research. This paper briefly introduces the fundamentals of some of the latest Face Swap Deep Fake algorithms. Full article
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)
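The autoencoder-based face-swap layout mentioned in the abstract, a shared encoder with one decoder per identity, can be sketched schematically as follows. The class and method names, layer sizes, and random untrained weights are all hypothetical; the sketch only shows the routing that makes the swap possible, not a working generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    # Random untrained weight matrix, purely for the structural sketch.
    return rng.standard_normal((n_out, n_in)) * 0.1

class SharedEncoderSwap:
    """Classic face-swap layout: one shared encoder, one decoder per
    identity. Training alternates E + D_A on faces of A and E + D_B on
    faces of B; swapping routes a face of A through decoder B, so the
    pose/expression code from A is rendered with B's appearance."""

    def __init__(self, dim=64, latent=16):
        self.enc = layer(dim, latent)
        self.dec_a = layer(latent, dim)
        self.dec_b = layer(latent, dim)

    def encode(self, face):
        return np.tanh(self.enc @ face)

    def reconstruct_a(self, face):
        return self.dec_a @ self.encode(face)

    def swap_a_to_b(self, face):
        # Identity-B appearance driven by identity-A's latent code.
        return self.dec_b @ self.encode(face)

model = SharedEncoderSwap()
face = np.linspace(0.0, 1.0, 64)   # stand-in for a flattened face crop
swapped = model.swap_a_to_b(face)
```

Because the encoder is shared, it is forced to learn identity-agnostic structure (pose, lighting, expression), while each decoder learns one identity's appearance; GAN-based methods extend this scheme with adversarial losses.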
