Article

Research on the Small Target Recognition Method of Automobile Tire Marking Points Based on Improved YOLOv5s

Liaoning Provincial Key Laboratory of Intelligent Manufacturing and Industrial Robots, Shenyang University of Technology, Shenyang 110870, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(15), 8771; https://doi.org/10.3390/app13158771
Submission received: 7 June 2023 / Revised: 22 July 2023 / Accepted: 28 July 2023 / Published: 29 July 2023

Abstract

At present, the identification of tire marking points relies primarily on manual inspection, which is not only time-consuming and labor-intensive but also prone to false detections, significantly impacting enterprise efficiency. To achieve accurate recognition of tire marking points, this study proposes a small target feature recognition method for automotive tire marking points. In image pre-processing, MSRCR (Multi-Scale Retinex with Color Restoration) is introduced to enhance image features, which allows the method to adapt to different detection environments. The YOLOv5s network is improved by adding the parameter-free simAM (Similarity Attention Mechanism) attention mechanism to improve detection efficiency, adding a small target prediction head to reduce the minimum recognizable target size, and changing the loss function to improve recognition performance. mAP, precision, and recall are used as the evaluation metrics. A comparison experiment with the traditional YOLOv5s network shows that the mAP of the original network and the improved network is 0.86 and 0.955, respectively, an increase of 9.5%; the precision is 0.87 and 0.96, an improvement of 9%; and the recall is 0.84 and 0.89, an improvement of 4%. The improved YOLOv5s model yields higher confidence for small target recognition and is better suited to practical detection tasks.

1. Introduction

The quality of car tires directly impacts the safety and comfort of the vehicle. Therefore, tire marking point detection is crucial before automotive tires leave the factory. Downstream companies that use these tires require rapid, high-precision inspection of large numbers of tire marking points for efficient storage. Enterprise research shows that current practice predominantly relies on manual inspection, which is time-consuming and labor-intensive. Companies aspire to replace manual inspection with machine vision technology to achieve automated detection, offering a more efficient and accurate solution [1,2]. As early as the 1990s, domestic and foreign scholars conducted research on the identification of tire marking points. In 1999, Bridgestone of Japan improved its rotating marking point identification system by integrating the marking device and the identification device, which greatly reduced the installation space and allowed the marking points to be identified immediately after they were marked on the tire [3]. The Identity CONTROL TMI 8303.1 closed-loop control system developed by Micro-Epsilon, Germany, is used for online identification of tire marking points and can identify the color, the tire, and the location of the tire marking points. The IRIS_M, developed by SICK in Germany, uses three-dimensional images to locate the tire and two-dimensional images to identify the shape and color of the points [4]. In [5], a support vector machine approach was used to achieve shape and color recognition of tire identification points. In [6], color and shape recognition models were trained with a convolutional neural network using SGD. In [7], recognition of the marking area was achieved by a template matching method. The above methods represent most of the existing recognition methods for automotive tire marking point features. The advantages and shortcomings of the methods in the literature and of the method used in this paper are shown in Table 1.
Based on enterprise research and literature review, it is evident that most current methods for automotive tire marking point recognition rely on traditional image processing or manual recognition. However, these approaches suffer from low efficiency and accuracy in recognition. Additionally, there is limited utilization of deep learning, a powerful image feature recognition method, in automotive tire marking point recognition. To address this issue and improve the speed, accuracy, recognition diversity, and reliability of tire marking point recognition, this paper proposes a deep-learning-based method for small target recognition in automotive tire marking points. The proposed method demonstrates high accuracy in recognizing marking points in both tire industrial production environments and outdoor natural environments.
The article is structured as follows:
  • Introduction. The research background of tire marking points and its development trends are presented through company research and a literature summary.
  • Materials and methods. This section introduces the overall scheme of tire marking point recognition, the composition of the collected data, and the image pre-processing used to mitigate the influence of light, weather, and other factors.
  • Tire marking point identification network structure. An improved YOLOv5s algorithm is proposed for the problem of tire marking point recognition:
    (1) Adding an attention mechanism;
    (2) Changing the loss function;
    (3) Adding a small target prediction head.
  • Model Training and Testing. Verifies that the improved algorithm outperforms the original algorithm and confirms the rationality of each improvement through ablation experiments.
  • Conclusion. Summarizes the work and draws conclusions.

2. Materials and Methods

The steps of the tire marking point recognition algorithm are shown in Figure 1a, and the flow chart is shown in Figure 1b. First, the captured raw image is pre-processed. Second, the processed images are expanded and labeled. Finally, an improved YOLOv5s is used to train the images and obtain the model data.

2.1. Dataset Preparation

A 14 MP industrial camera was used to take the shots and construct the dataset, which consisted of two parts.
The first part of the dataset is derived from the actual inspection environment in the factory, as shown in Figure 2. In the actual inspection, the tires are transported by a transport unit to the inspection area and pushed into the warehouse when the inspection is completed.
The second part of the dataset is derived from natural environment photography, collecting images of tires in different environments and weather; part of the collection is shown in Figure 3.

2.2. Image Pre-Processing

In industrial and natural environment acquisition, many factors can lead to poor image quality, such as interference from industrial lighting, strong exposure under intense light, and low contrast on cloudy and rainy days, so image processing is needed to improve image quality [8]. Multi-Scale Retinex with Color Restoration (MSRCR) was therefore introduced to pre-process the images and enhance the differentiation between the marking points and the background; its implementation is shown in Equation (1) [9,10].
R_{\mathrm{MSRCR},i}(x,y) = C_i(x,y)\cdot R_{\mathrm{MSR},i}(x,y) \qquad (1)

R_{\mathrm{MSRCR},i}(x,y) represents the recovered image of the ith channel after applying this algorithm. MSRCR adds a color restoration factor C_i, shown in Equation (2), which adjusts the color distortion caused by contrast enhancement in local areas of the image, where \alpha is the adjustment factor, \beta is the gain constant, and I_i(x,y) is the ith channel of the original image.

C_i(x,y) = \beta\left[\log\bigl(\alpha\cdot I_i(x,y)\bigr) - \log\sum_{i=1}^{N} I_i(x,y)\right] \qquad (2)

R_{\mathrm{MSR},i}(x,y) = \sum_{n=1}^{m}\lambda_n\left[\log I_i(x,y) - \log\bigl(G_n(x,y)*I_i(x,y)\bigr)\right] \qquad (3)

In Equation (3), R_{\mathrm{MSR},i}(x,y) represents the recovered image of the ith channel after multi-scale filtering, G_n(x,y) is a single-scale Gaussian filter, \lambda_n is the weight of the nth scale, m is the number of scales, and * denotes convolution [11,12,13]. A comparison of the images before and after processing is shown in Figure 4.
The results in Figure 4 show that, after pre-processing, marking points that were previously unclear because of environmental factors such as strong light and darkness become clearly distinguishable.
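As a concrete illustration of Equations (1)–(3), the following Python sketch implements MSRCR with OpenCV and NumPy. The scale set, α, β, and the final rescaling to 8 bits are assumed tuning choices for illustration; the paper does not report its exact settings.

```python
import cv2
import numpy as np

def msrcr(image, sigmas=(15, 80, 250), alpha=125.0, beta=46.0):
    """Multi-Scale Retinex with Color Restoration, a minimal sketch of Equations (1)-(3).

    `image` is an 8-bit BGR image; `sigmas`, `alpha`, and `beta` are assumed constants
    chosen for illustration, not the values used by the authors.
    """
    img = image.astype(np.float64) + 1.0  # avoid log(0)

    # Multi-scale Retinex, Equation (3): equal weights lambda_n = 1/m for each scale
    msr = np.zeros_like(img)
    for sigma in sigmas:
        blurred = cv2.GaussianBlur(img, (0, 0), sigma)   # G_n(x, y) * I(x, y)
        msr += (np.log(img) - np.log(blurred)) / len(sigmas)

    # Color restoration factor, Equation (2)
    color = beta * (np.log(alpha * img) - np.log(img.sum(axis=2, keepdims=True)))

    # Equation (1), then rescale to the 8-bit range for display
    out = color * msr
    out = (out - out.min()) / (out.max() - out.min() + 1e-8) * 255.0
    return out.astype(np.uint8)
```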

2.3. Data Enhancement

In the deep learning training process, the number of images has a significant impact on the model: the larger the dataset, the more accurate the training results. Therefore, the pre-processed images were flipped, rotated, and randomly scaled to expand the number of samples [14]. A total of 5600 sample images were obtained to construct the dataset, of which 80% were used as the training set, 10% as the validation set, and 10% as the test set. The dataset was labeled using LabelImg and converted to txt format.
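A minimal sketch of the flip/rotate/random-scale expansion described above, using OpenCV. The rotation range and scale factors are assumed values chosen for illustration; because labeling with LabelImg is performed after expansion, bounding boxes do not need to be transformed here.

```python
import random
import cv2

def expand(image):
    """Generate flipped, rotated, and rescaled variants of one pre-processed image."""
    samples = [cv2.flip(image, 1)]                          # horizontal flip
    h, w = image.shape[:2]
    angle = random.uniform(-15, 15)                         # small random rotation (assumed range)
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    samples.append(cv2.warpAffine(image, m, (w, h)))
    scale = random.uniform(0.8, 1.2)                        # random scaling (assumed range)
    samples.append(cv2.resize(image, None, fx=scale, fy=scale))
    return samples
```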

3. Tire Marking Point Identification Network Structure

The official YOLOv5 code currently offers five models with different depths and widths: YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. In industrial applications, it is desirable to speed up recognition as much as possible, so the smaller YOLOv5s model was chosen as the training model.

3.1. YOLOv5s Model Structure

The YOLOv5s model consists of three main components: the backbone network (Backbone), the neck network (Neck), and the head network (Head).
The Backbone uses the CSPDarknet53 structure to extract feature information from the input image [15,16]. It serves as the foundation of the model.
The Neck is situated between the Head network and the Backbone network. Its primary function is to enhance feature diversity and network robustness. It achieves this by performing multi-scale feature fusion on the feature map, further improving the features’ quality and passing them to the prediction layer.
The Head network is responsible for generating the target detection results. In the original model, it comprises three detection layers, each corresponding to a set of initialized anchor values. The Head network receives feature maps of three different scales, obtained from the Neck, and conducts network predictions on these feature maps. By employing convolutional operations, it produces the final feature output [17]. The detection process involves identifying targets of three different dimensions: large, medium, and small [18]. The model diagram is presented in Figure 5.

3.2. YOLOv5s Model Improvements

Because the tire itself is large while the marking points occupy only a small percentage of the whole tire image, the traditional YOLOv5s model is not ideal for this small target detection task, so the YOLOv5s structure is improved.

3.2.1. Adding an Attention Mechanism

In the improved YOLOv5s model [19], the simAM (Similarity Attention Mechanism) attention mechanism was introduced. This mechanism is a 3D weighted attention module that derives attention weights for the feature map without requiring additional parameters. The schematic diagram of the simAM module is depicted in Figure 6. The inspiration behind this module comes from the attention mechanism observed in the human brain. In the figure, the 3D weight implementation is achieved using an energy function.
The design of the simAM module draws inspiration from the field of neurology, where neurons carrying rich information are often distinguished from other neurons by exhibiting unique firing patterns. Additionally, neurons carrying rich information tend to suppress the activity of surrounding neurons, a phenomenon known as null-field suppression in neurology. Hence, it is crucial to assign higher weights to these neurons to value their contributions.
The simAM module measures the linear separability between a target neuron and the other neurons to identify the significant ones, and defines an energy function for each neuron, Equation (4) [20]:

e_t(w_t, b_t, \mathbf{y}, x_i) = (y_t - \hat{t})^2 + \frac{1}{M-1}\sum_{i=1}^{M-1}(y_o - \hat{x}_i)^2 \qquad (4)

where t and x_i are the target neuron and the other neurons in a single channel of the input feature X \in \mathbb{R}^{C\times H\times W}; \hat{t} and \hat{x}_i are obtained from t and x_i by the linear transform in Equation (5):

\hat{t} = w_t t + b_t, \qquad \hat{x}_i = w_t x_i + b_t \qquad (5)

M = H \times W \qquad (6)

where i is the index over the spatial dimension, M = H \times W in Equation (6) is the number of neurons in the current channel, w_t is the transform weight, and b_t is the bias. Solving for the minimum of Equation (4) by binarizing the labels y_t and y_o and adding a regularization term gives a new expression for the energy function, Equation (7):

e_t(w_t, b_t, \mathbf{y}, x_i) = \frac{1}{M-1}\sum_{i=1}^{M-1}\bigl(-1 - (w_t x_i + b_t)\bigr)^2 + \bigl(1 - (w_t t + b_t)\bigr)^2 + \lambda w_t^2 \qquad (7)

Minimizing the above equation analytically gives:

w_t = -\frac{2(t - \mu_t)}{(t - \mu_t)^2 + 2\sigma_t^2 + 2\lambda} \qquad (8)

b_t = -\frac{1}{2}(t + \mu_t)\, w_t \qquad (9)

Assuming that all pixels in each channel follow the same distribution, the mean \hat{\mu} = \frac{1}{M}\sum_{i=1}^{M} x_i and the variance \hat{\sigma}^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i - \hat{\mu})^2 can be computed once over all neurons of the channel, and the minimum energy of each neuron becomes Equation (10):

e_t^* = \frac{4\,(\hat{\sigma}^2 + \lambda)}{(t - \hat{\mu})^2 + 2\hat{\sigma}^2 + 2\lambda} \qquad (10)

A smaller e_t^* means that neuron t is more distinct from the surrounding neurons and therefore more important, so it should be assigned a higher weight.

The importance of each neuron is thus given by 1/e_t^*, and the features are enhanced by scaling the input with a Sigmoid function, as in Equation (11):

\tilde{X} = \mathrm{sigmoid}\!\left(\frac{1}{E}\right) \odot X \qquad (11)

where E groups all values of e_t^* across the channel and spatial dimensions and \odot denotes element-wise multiplication.
To determine the best location for the simAM module, it was connected after the CBL, CSP_1, and CSP_3 blocks of the original network to test the network performance; it was finally decided to insert the simAM module after CSP_3, and the resulting block is named CSPS.
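For reference, a minimal PyTorch sketch of a parameter-free simAM layer following Equations (10) and (11). The value of λ (e_lambda) is an assumed default, and the class name and usage are illustrative rather than the authors' exact code.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free attention: weights each activation by sigmoid(1 / e_t*)."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda        # regularization term lambda in Eq. (10), assumed value
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w - 1                                       # M - 1 neurons besides the target
        d = (x - x.mean(dim=[2, 3], keepdim=True)).pow(2)   # (t - mu)^2 at each position
        v = d.sum(dim=[2, 3], keepdim=True) / n             # channel variance estimate
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5         # proportional to 1 / e_t*
        return x * self.act(e_inv)                          # Eq. (11): X~ = sigmoid(1/E) ⊙ X

# Example: attach after a feature map of shape (N, C, H, W)
feat = torch.randn(1, 64, 80, 80)
out = SimAM()(feat)   # same shape, attention-weighted
```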

3.2.2. Loss Function

The YOLOv5s model uses the IoU Loss, proposed in 2016, as the localization loss function. However, the original IoU Loss has several drawbacks: it cannot reflect the distance between two boxes when the predicted box and the ground-truth box do not intersect, and it cannot distinguish the relative position and size of two boxes that share the same IoU value [21]. To address these limitations, Hamid Rezatofighi introduced the GIoU (Generalized Intersection over Union) Loss at CVPR 2019 [22]. The GIoU Loss takes into account both overlapping and nonoverlapping regions, providing a more comprehensive measure of the overlap between predicted and ground-truth boxes than the original IoU Loss. DIoU and CIoU were developed subsequently, each with its own advantages and disadvantages [23]. In an effort to generalize the existing IoU loss functions, Jiabo He proposed a power transformation, applying a Box–Cox transformation to the IoU Loss [24]. By adjusting the value of the power α, the Alpha-IoU Loss makes full use of the existing IoU losses and improves target recognition accuracy. The transformation is described by Equation (12):
L_{\alpha\text{-IoU}} = \frac{1 - \mathrm{IoU}^{\alpha}}{\alpha}, \qquad \alpha > 0 \qquad (12)

Most existing IoU-based loss functions can be obtained from Equation (12) by adjusting α, and by using multiple α values the formulation can be extended to loss functions with additional penalty terms, generalized as in Equation (13):

L_{\alpha\text{-IoU}} = 1 - \mathrm{IoU}^{\alpha_1} + \mathcal{P}^{\alpha_2}\bigl(B, B^{gt}\bigr), \qquad \alpha > 0 \qquad (13)
Alpha-IoU Loss has been found to be more robust on small datasets and under noise interference, and the regression accuracy can be improved simply by adjusting the value of α. Therefore, in this paper, the localization loss function of the network is replaced by the Alpha-IoU Loss to improve the network recognition performance.
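A minimal PyTorch sketch of the basic power form in Equation (12), assuming axis-aligned boxes in (x1, y1, x2, y2) format; the choice α = 3 is a commonly cited default rather than a value stated in this paper, and the penalty term of Equation (13) is omitted for brevity.

```python
import torch

def alpha_iou_loss(pred: torch.Tensor, target: torch.Tensor,
                   alpha: float = 3.0, eps: float = 1e-7) -> torch.Tensor:
    """Alpha-IoU loss of Eq. (12): (1 - IoU^alpha) / alpha, boxes as (x1, y1, x2, y2)."""
    # Intersection area
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    # Union area
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter + eps
    iou = inter / union
    return ((1.0 - iou.clamp(min=eps).pow(alpha)) / alpha).mean()

# Example usage with two predicted / ground-truth box pairs
pred = torch.tensor([[10., 10., 50., 50.], [0., 0., 20., 20.]])
gt   = torch.tensor([[12., 12., 48., 52.], [5., 5., 25., 25.]])
print(alpha_iou_loss(pred, gt))
```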

3.2.3. Small Target Prediction Head

The default input size of the YOLOv5s network is 640 × 640. If the input image is larger than this, it is scaled down, and the resulting compression makes the already small tire marking points even harder to recognize. Therefore, a small target prediction head is added to the original network to reduce the feature loss caused by image compression; a sketch of the resulting detection grids is given below. The structure of the improved YOLOv5s network is shown in Figure 7.
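To make the effect of the extra head concrete, the following sketch compares the detection grid sizes of the standard three-head YOLOv5s with a four-head variant. The added head is assumed to operate at stride 4 (a P2-level feature map), the usual choice for small-object variants; the paper does not state the stride explicitly.

```python
# Detection grid sizes for a 640x640 input, per prediction head.
input_size = 640
original_strides = [8, 16, 32]        # standard YOLOv5s heads (P3, P4, P5)
improved_strides = [4, 8, 16, 32]     # with an assumed extra small-target head at stride 4 (P2)

print([input_size // s for s in original_strides])   # [80, 40, 20]
print([input_size // s for s in improved_strides])   # [160, 80, 40, 20] -> finer grid for small marks
```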

4. Model Training and Testing

4.1. Experimental Environment

The experimental setting for this study is shown in Table 2.

4.2. Training Results and Analysis

4.2.1. Comparison of Experimental Data

The training set was used to train YOLOv5s and the improved YOLOv5s, and the mAP, precision, and recall comparison curves were obtained. The results are shown in Figure 8, Figure 9 and Figure 10.
As the three comparison plots show, the accuracy of the improved YOLOv5s rises noticeably faster than that of the traditional YOLOv5s during the first 25 iterations; between 50 and 100 iterations the accuracy continues to rise, and after 150 iterations both models stabilize. After 300 iterations, the traditional YOLOv5s reaches an mAP of 86%, while the improved YOLOv5s reaches 95.5%, an improvement of 9.5%. At the same time, the improved YOLOv5s achieves a 9% higher precision and a 4% higher recall than the traditional YOLOv5s. The improved YOLOv5s also converges faster, and its recall does not decrease while its precision improves. Therefore, the improved YOLOv5s model is significantly better than the traditional YOLOv5s model. The visualization curves generated during the training of the improved YOLOv5s model are shown in Figure 11.

4.2.2. SimAM Module Validation

Currently, the CBAM attention mechanism is widely applied to YOLOv5 models [25]. To assess the effect of the simAM module on the model, YOLOv5s, YOLOv5s+CBAM, and the improved YOLOv5s were trained and compared in terms of precision, recall, and mean average precision; the results are shown in Table 3.
Although the CBAM attention module also improves the performance of the network, the gain is smaller than that of the simAM attention module. Moreover, unlike the simAM module, the CBAM module is not parameter-free; its additional parameters prolong the training time of the model and can easily lead to overfitting.

4.2.3. Model Visualization Comparison Experiments

To further analyze the effectiveness of the YOLOv5s model and the improved YOLOv5s model for tire marking point recognition, image samples randomly selected from the test set were run through each of the two trained models, and the results were visualized. Some of the results are shown in Figure 12.
From the results, it can be seen that the confidence levels of the improved YOLOv5s model are consistently higher than those of the traditional YOLOv5s model. For the clearer cases of Figure 12a,b, the confidence levels of the two models differ little: 0.95 and 0.92, respectively. For the smaller targets of Figure 12c,d, the confidence levels in Figure 12c are 0.68, 0.65, and 0.60, while those in Figure 12d are 0.62, 0.60, and 0.60, and the traditional YOLOv5s model misses one marking point; for small target detection, the improved YOLOv5s model gives better results. The confidence level in Figure 12e is 0.85 and in Figure 12f is 0.82; the confidence levels in Figure 12g are 0.82, 0.75, and 0.79, and in Figure 12h they are 0.75, 0.65, and 0.72.

4.2.4. Comparison of Test Results

The YOLO series and Faster R-CNN are among the more common small target detection models. For the dataset of this study, YOLOv4, YOLOv5s, Faster R-CNN, and the improved YOLOv5s were chosen for comparison experiments. The results are shown in Table 4. It can be observed that, in the task of tire marking point detection, the model proposed in this study outperforms the other models.

5. Conclusions

This paper proposes a deep-learning-based method for identifying small targets in car tire marking points, which improves the accuracy of small target marking point recognition in images; the research results of this paper are summarized as follows:
  • In image pre-processing, the MSRCR algorithm is introduced to improve the contrast of identification points in the image, which can adapt to different environments and reduce the difficulty of model training.
  • The YOLOv5s model is improved by adding a small target detection head, which improves the accuracy of the model for small target recognition.
  • The parameter-free simAM attention mechanism is added, which enhances the ability of the convolutional layers to highlight salient features and thus improves the feature information.
  • The CIoU loss function of the original network is replaced by Alpha-IoU, which is more flexible in updating the parameters through backpropagation and in reducing the loss between predicted and true boxes, thus optimizing the model and giving it better recognition results.
With the above improvements, the resulting improved YOLOv5s model achieves a 9.5% improvement in mAP compared with the original YOLOv5s, the precision is improved by 9%, the recall is improved by 4%, and the convergence speed is accelerated. Compared with other small target detection models, the model in this paper has significant advantages. Random test experiments verify the superiority of the improved YOLOv5s model for small target recognition, and the model can be applied to practical detection tasks.

Author Contributions

Z.G.: conception of the study, proposition of the theory and method, supervision; J.Y.: literature search, figures, manuscript preparation and writing; J.S.: programming, testing of existing code components; W.Z.: data collection. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Liaoning Provincial Education Department Project (Grant No. LJKZ0114) and the Yingkou Enterprise Doctoral Entrepreneurship Program Project.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chen, N.; Li, Y.; Xiang, T. Influence of Tire Cornering Property on Ride Comfort Evaluation of Vehicles on Bridge. J. Southwest Jiaotong Univ. 2016, 51, 645–653. [Google Scholar]
  2. Wang, Y.; Guo, H. Color Recognition of Tyre Marking Points Based on Support Vector Machine. J. East China Univ. Sci. Technol. Nat. Sci. Ed. 2014, 40, 520–523+532. [Google Scholar]
  3. Kokubu, T.; Kunitake, H. Mark Inspecting System. U.S. Patent 6144033A, 21 February 2001. [Google Scholar]
  4. SICK Sensor Intelligence. Powerful Image Processing: For Added Quality and Added Efficiency. Available online: https://www.sick.com/cn/en/powerful-image-processing-for-added-quality-and-added-efficiency/w/blog-powerful-image-processing-for-added-quality-and-added-efficienc/ (accessed on 13 January 2017).
  5. Wang, Y. Study on Recognition Technology of Automobile Tire Marking Points; East China University of Science and Technology: Shanghai, China, 2015. [Google Scholar]
  6. Zhang, Q. Research on the Method of Recognition Tire Marking Points Based on Machine Vision; Beijing University of Chemical Technology: Beijing, China, 2020. [Google Scholar]
  7. Zhao, L. Research and Design on Tire Surface Mark Recognition System; Shenyang University of Technology: Shenyang, China, 2012. [Google Scholar]
  8. Zhu, Q.; Li, J.; Zhu, Y.; Li, J.; Zhang, J. Adaptive Infrared Thermal Image Enhancement Based on Retinex. Microelectron. Comput. 2013, 30, 22–25. [Google Scholar]
  9. Zhu, S.; Hang, R.; Liu, Q. Underwater Object Detection Based on the Class-Weighted YOLO Net. J. Nanjing Norm. Univ. Nat. Sci. Ed. 2020, 43, 129–135. [Google Scholar]
  10. Wang, Q.; Li, S.; Qin, H.; Hao, A. Super-Resolution of Multi-Observed RGB-D Images Based on Nonlocal Regression and Total Variation. IEEE Trans. Image Process. 2016, 25, 1425–1440. [Google Scholar] [CrossRef] [PubMed]
  11. Wang, Y.; Wu, Y.; Qi, B. Study on UAV spray method of intercropping farmland based on Faster RCNN. J. Chin. Agric. Mech. 2019, 40, 76–81. [Google Scholar] [CrossRef]
  12. Jiang, Y.; Ma, Z.; Wu, C.; Zhang, Z.; Yan, W. RSAP-Net: Joint optic disc and cup segmentation with a residual spatial attention path module and MSRCR-PT pre-processing algorithm. BMC Bioinform. 2022, 23, 523. [Google Scholar]
  13. Shang, Z.J.; Zhang, H.; Zeng, C.; Le, M.; Zhao, Q. Automatic orientation method and experiment of Fructus aurantii based on machine vision. J. Chin. Agric. Mech. 2019, 40, 119–124. [Google Scholar]
  14. Zhang, J. Research on Multi-Object Tracking Method for Airport Surface; University of Electronic Science and Technology of China: Chengdu, China, 2021. [Google Scholar]
  15. Zhang, Z.; Xie, F.; Chen, J.; Li, M. Intelligent Extraction Method of Image Marker Point Features Based on Improved YOLOv5 Model. Laser J. 2023, 1–8. Available online: http://kns.cnki.net/kcms/detail/50.1085.TN.20230421.0915.004.html (accessed on 27 July 2023).
  16. Han, Z.; Wan, J.; Liu, Z.; Liu, K. Multispectral Imaging Detection Using the Ultraviolet Fluorescence Characteristics of Oil. Chin. J. Lumin. 2015, 36, 1335–1341. [Google Scholar]
  17. Chen, F.; Wang, C.; Gu, M.; Zhao, Y. Spruce Image Segmentation Algorithm Based on Fully Convolutional Networks. Trans. Chin. Soc. Agric. Mach. 2018, 49, 188–194+210. [Google Scholar]
  18. Huang, J.; Zhang, H.; Wang, L.; Zhang, Z.; Zhao, C. Improved YOLOv3 Model for miniature camera detection. Opt. Laser Technol. 2021, 142, 107133. [Google Scholar] [CrossRef]
  19. Liu, L.; Hou, D.; Hou, A.; Liang, C.; Zheng, H. Automatic driving target detection algorithm based on SimAM-YOLOv4. J. Chang. Univ. Technol. 2022, 43, 244–250. [Google Scholar]
  20. Qin, X.; Li, N.; Weng, C.; Su, D.; Li, M. Simple Attention Module based Speaker Verification with Iterative noisy label detection. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022. [Google Scholar]
  21. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  22. Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; IEEE: New York, NY, USA, 2019. [Google Scholar]
  23. Hu, X.; Liu, Y.; Zhao, Z.; Liu, J.; Yang, X.; Sun, C.; Chen, S.; Li, B.; Zhou, C. Real-time detection of uneaten feed pellets in underwater images for aquaculture using an improved YOLO-V4 network. Comput. Electron. Agric. 2021, 185, 106135. [Google Scholar] [CrossRef]
  24. He, J.; Erfani, S.; Ma, X.; Bailey, J.; Chi, Y.; Hua, X.S. Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression. Adv. Neural Inf. Process. Syst. 2021, 34, 20230–20242. [Google Scholar]
  25. Yuan, D.R.; Zhang, Y.; Tang, Y.J.; Li, B.Y.; Xie, B.L. Multiscale Residual Attention Network and Its Facial Expression Recognition Algorithm. J. Chin. Comput. Syst. 2023, 11, 1–9. [Google Scholar]
Figure 1. (a) Algorithm step diagram. (b) Flow chart.
Figure 2. Tire inspection device.
Figure 3. Images of part of the original dataset: (a,b) for industrial environment collection, (c–f) for natural environment collection.
Figure 4. Comparison images: (a,c) for the original image, (b,d) for the processed image.
Figure 5. YOLOv5s model.
Figure 6. Structure of simAM attention module.
Figure 7. Improved YOLOv5s network structure.
Figure 8. Comparison diagram of mean average precision (mAP).
Figure 9. Comparison diagram of precision (P).
Figure 10. Comparison diagram of recall (R).
Figure 11. Improved YOLOv5s model visualization curve.
Figure 12. Comparison of results: where (a,c,e,g) are improved YOLOv5s results, and (b,d,f,h) are YOLOv5s results.
Table 1. Comparison of methods.
Categories                    Advantages                    Shortcomings
Methods of the literature     Low hardware configuration    Slow, inaccurate, requires human involvement
Method of this paper          High speed and accuracy       High hardware configuration
Table 2. Experimental environment.
Category (Software/Hardware)    Version/Model
Python                          3.7
PyTorch                         1.5.1
PyCharm                         2021.1.3
CUDA                            11.0.2
cuDNN                           11.0
Operating system                Windows 11
CPU                             Intel i5-12500H
GPU                             RTX 3050
Table 3. Model comparison.
Models              Precision (P)    Recall (R)    Mean Average Precision (mAP)
YOLOv5s             0.87             0.84          0.860
YOLOv5s+CBAM        0.91             0.86          0.902
Improved YOLOv5s    0.96             0.89          0.955
Table 4. Improved YOLOv5s model compared to YOLOv5s, YOLOv4, and Faster-RCNN.
Models              Precision    Recall    Training Duration (Hour)    Detection Speed (ms/pic)
Improved YOLOv5s    0.96         0.89      3.4                         22
YOLOv5s             0.87         0.84      3.2                         19
YOLOv4              0.78         0.77      3.8                         31
Faster-RCNN         0.82         0.81      3.5                         25
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
