End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning

Ma, Zongfang; Wu, Zheping; Cao, Yonggen

doi:10.3390/electronics12010203

Open AccessArticle

End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning

by

Zongfang Ma

,

Zheping Wu

^* and

Yonggen Cao

School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China

^*

Author to whom correspondence should be addressed.

Electronics 2023, 12(1), 203; https://doi.org/10.3390/electronics12010203

Submission received: 27 November 2022 / Revised: 19 December 2022 / Accepted: 22 December 2022 / Published: 31 December 2022

(This article belongs to the Section Optoelectronics)

Download

Browse Figures

Versions Notes

Abstract

:

In the field of intelligent robot and automatic drive, the task of license plate detection and recognition (LPDR) are undertaken by mobile edge computing (MEC) chips instead of large graphics processing unit (GPU) servers. For this kind of small computing capacity MEC chip, a light LPDR network with good performance in accuracy and speed is urgently needed. Contemporary deep learning (DL) LP recognition methods use two-step (i.e., detection network and recognition network) or three-step (i.e., detection network, character segmentation method, and recognition network) strategies, which will result in loading two networks on the MEC chip and inserting many complex steps. To overcome this problem, this study presents an end-to-end light LPDR network. Firstly, this network adopts the light VGG16 structure to reduce the number of feature maps and adds channel attention at the third, fifth, and eighth layers. It can reduce the number of model parameters without losing the accuracy of prediction. Secondly, the prediction of the LP rotated angle is added, which can improve the matching between the bounding box and the LP. Thirdly, the LP part of the feature map is cropped by the relative position of detection module, and the region-of-interest (ROI) pooling and fusion are performed. Seven classifiers are then used to identify the LP characters through the third step’s fusion feature. At last, experiments show that the accuracy of the proposed network reaches 91.5 and that the speed reaches 63 fps. In the HiSilicon 3516DV300 and the Rockchip Rv1126 Mobile edge computing chips, the speed of the network has been tested for 15 fps.

Keywords:

convolutional neural network; license plate detection and recognition; end-to-end network; ROI pooling; rotation angle; mobile edge compute

1. Introduction

With the substantial increase in the number of cars per capita in China, vehicle and parking problems have begun to negatively affect the comfort, health, and safety of residents [1]. Vehicle management technologies have become more and more important. The key to vehicle management is the detection and recognition of license plates (LPs).

In the 1980s, researchers employed traditional image processing methods to build license plate recognition systems in some special scenarios, and by the 1990s, the algorithmic efficiency of traditional models had made great progress. Kanayama [1] determined that machine learning and the Sobel operator could be used to detect the edges of LPs based on the color differences between the LP and the background. Candidate boxes were then screened according to the given aspect ratio thresholds. This scheme is quite sensitive to lines, and if the background contains many linear features (e.g., the car’s trunk lid), the recognition accuracy suffers. Silva et al. used a convolutional neural network (CNN) to recognize Brazilian license plates based on the you-only-look-once (YOLO) machine learning model. Both LP and vehicle detection were achieved, including reasonable levels of character recognition [2]. Saini et al. later used a multiwavelet transform to locate LPs, and image enhancement and distortion correction were used to improve performance, reaching an accuracy level of 88.36% [3].

In recent years, Lubna and Naveed Mufti summarized many interesting LPDR methods [4]. State-of-the-art LPDR methods often consist of number plate extraction (NPE), character segmentation (CS), and character recognition (CR) (as output from the system) [5]. Wang et al. proposed a Chinese LP character recognition method based on a deep-CNN/recurrent neural network hybrid that extracts features without character segmentation [6]. Laroca et al. applied a CNN to a sliding window process to recognize Indonesian LPs [7], and Du et al. introduced saliency maps to CNN-based LP recognition [8]. DL LP recognition methods often use two-step (i.e., detection and recognition) or three-step (i.e., detection, character segmentation, and recognition) strategies [9,10,11,12,13], including SSD [4,14], YOLO [7], FAST-RCNN [15], and HC [16]. Nur-A-Alam et al. proposed a detection and recognition method using a convolutional neural network, which made a great contribution for the smart city [17]. Although the step-wise models are conceptually simple, they have some obvious shortcomings. For example, the latter step strongly depends on the speed and accuracy of the previous step. If there are any problems in the early stage, the entire algorithm may not be able to generate useful predictions. Some researchers find a new solution in removing the split step and recognizing the character directly. Then, TE2E [18] and RPnet [19] proposed the only end-to-end network, which removes extract feature twice. It helps to save much time and acquires high accuracy. Heshan Padmasiri et al. proposed a method for automated license plate recognition for resource-constrained environments, which inspires our study greatly [20].

Recently, a new kind of DL platform has been created, namely mobile edge computing. Compared to the powerful Graphics Processing Unit server with big volume, it is a nail-sized embedded chip [21]. However, it cannot bear the traditional large model network, and it has difficulty meeting the requirements of speed and accuracy. To overcome the problem, this study provides an end-to-end light LPDR method, which uses the VGG16 network to extract the image features. Then, a novel deformation-handling technique is applied in which the position of the LP and its rotation angle are detected. The LP part of the feature map is then cropped for the region-of-interest (ROI) pooling sampling and fusion. A character recognition network then completes the detection, and the position information, rotation angle, and characters of the LP are output.

This paper has five sections, including the introduction, relative work, end-to-end LP detection and recognition method, experiment verification and application scenario. In the end-to-end LP detection and recognition method, the paper mainly introduces the LPDR method, the recognition module, and how to connect these two modules.

2. Relative Work

Our network is related to the license plate detection and recognition network model. It can finish detection and recognition tasks by a unified deep learning network. Especially because it has a small number of parameters and an outstanding perform with speed and accuracy, it can load on many MEC chips.

2.1. Detection Module

LP detection methods can be roughly divided into traditional image processing and neural network. Traditional image processing LP detection methods exploit abundant edge information or background color features. Kanayama [1] proposed that machine learning and the Sobel operator could be used to detect the edges of LPs based on the color differences between the LP and the background. Wang et al. [22] exploited the cascade AdaBoost classifier and a voting mechanism to elect plate candidates.

With the development of DL, many method-based convolutional neural networks are widely applied in LP detection. FAST-RCNN [15] utilizes a region proposal network that can generate high-quality region proposals for detection and thus could detect objects more accurately and quickly. SSD [4] completely eliminates proposal generation and subsequent pixel or feature resampling stages and encapsulates all computation in a single network. YOLO [14] and its improved version frame object detection have a regression problem to spatially separated bounding boxes and associated class probabilities.

2.2. Recognition Module

The strategies of LP recognition include: (i) The two-step strategy includes LP characters segmentation and character recognition. Some researches segment the LP characters by connected component analysis and character-specific extremal regions. After obtaining the segmentation LP characters, a character recognition network can be used to predict the LP characters [16,23,24,25,26]. (ii) segmentation-free methods, which utilize the LP character features to extract the plate characters directly to avoid segmentation or deliver the LP to an optical character recognition system or a convolutional neural network to perform the recognition task [18,19,24,27,28,29,30,31].

3. End-to-End LP Detection and Recognition Method

3.1. Character Encoding

LP recognition requires characteristic encoding owing to the nature of LPs. To facilitate the loss calculation and reduce the risk of mischaracterizations, the characters must first be encoded reasonably when entering the recognition network. The encoding table of the first character is presented in Table 1. This Chinese character identifies the province.

The second character is an English letter, as shown in Table 2.

Finally, the third-to-seventh characters are encoded, as shown in Table 3.

The model applies these three transformation tables as shown in Figure 1. The first Chinese province character is encoded according to Table 1, the second municipality character is encoded according to Table 2, and the remaining characters are, respectively, encoded according to Table 3. Finally, the labels are established.

3.2. LP Position Transformation

The LP’s label comprises six parts: the abscissa and ordinate of the center point of the plate, the length and width of the prediction box, the inclination angle of the plate, and the characters. The formula is given as follows.

x_{o b j}, y_{o b j}, W_{o b j}, H_{o b j}, A n g l e_{o b j}, [c o d e_{1}, c o d e_{2}, c o d e_{3}, c o d e_{4}, c o d e_{5}, c o d e_{6}, c o d e_{7}]

(1)

The traditional datasets are labeled with the four vertex coordinates (upper-left corner:

p_{1} (x_{1,} y_{1})

; upper-right corner:

p_{2} (x_{2,} y_{2}

); lower-right corner:

p_{3} (x_{3,} y_{3})

; and lower-left corner:

p_{4} (x_{4,} y_{4}

).

W_{o b j}

and

H_{o b j}

are the width and height of the bounding box.

W

and

H

are the width and height of the input image.

A n g l e_{o b j}

is the rotation angle. Clockwise is positive, and counterclockwise is negative.

c o d e_{1}

–

c o d e_{7}

are the seven Chinese LP characters coding by Table 1, Table 2 and Table 3. Equations (2)–(7) must be converted into standardized inputs.

W_{o b j} = \frac{‖ p_{1} - p_{2} ‖}{W} = \frac{\sqrt{{(x_{1} - x_{2})}^{2} + {(y_{1} - y_{2})}^{2}}}{W}

(2)

H_{o b j} = \frac{‖ p_{3} - p_{2} ‖}{H} = \frac{\sqrt{{(x_{3} - x_{2})}^{2} + {(y_{3} - y_{2})}^{2}}}{H}

(3)

x_{o b j} = \frac{x_{1} + x_{2}}{2 W}

(4)

y_{o b j} = \frac{y_{1} + y_{2}}{2 H}

(5)

k = \frac{1}{2} (\frac{y_{1} - y_{2}}{x_{1} - x_{2}} + \frac{y_{3} - y_{4}}{x_{3} - x_{4}})

(6)

A n g l e_{o b j} = \frac{\arctan (k)}{π} + \frac{1}{2}

(7)

Respectively,

x_{o b j}

and

y_{o b j}

are the abscissa and ordinate of the center point of the license plate target after conversion.

k

is the slope of the target rotation angle.

3.3. Network Model

In this section, the detection and recognition module of the LP are introduced. As can be seen from Figure 2, the detection and recognition module are added into a unified network.

3.3.1. Detection Module

As can be seen from the top half of Figure 2a, the network consists of two steps, extracting the features and predicting the LP position. After an extensive literature review [4,22,23,27,32,33,34,35] and considerable testing, we improved the VGG16 architecture, and proposed the LP extracted features module with the attention module. The attention module helps to provide more accurate detection. The channel attention mechanism provides more grained refined features and emphasizes “what” is a semantic part from a given input [36]. Then, the three sibling fully connected (FC) layers are used to extract features and predict the parallelogram bounding box.

As can be seen from Figure 2a,b, the attention module was used after downsampling, which has two steps, the attention channel and the attention fusion. First, a MaxPooling layer was used to reduce the scale of the feature maps. Second, the two FC layers were used to squeeze the feature maps and model the correlation between the feature channels. Third, we obtained a group of c*1*1 attention channels. Fourth, the attention of the channel was used to weight every feature map channel, which can strengthen the effective channels and weaken the ineffective channels.

In order to obtain a good fit between the bounding box and the LP, the network adapts five predictions, including height, width, abscissa, ordinate of parallelogram center, and the rotation angle of the bounding box. Compared with the traditional rectangular bounding box, the parallelogram bounding box can improve the fitting between the bounding box and the LP. The loss function is Euclidean Distance.

L o s s = ‖ (x - \hat{x}) ‖ + ‖ (y - \hat{y}) ‖ + ‖ (w - \hat{w}) ‖ + ‖ (h - \hat{h}) ‖ + ‖ (θ - \hat{θ}) ‖

(8)

where

θ

represents the angle.

The full network architecture is shown in Figure 2, and the detailed detection network is presented in Table 4.

3.3.2. Recognition Module

As can be seen from Figure 2, the height, width, abscissa, ordinate, and angle of the prediction are generated from the detection module; these are converted into specific feature coordinate information via Equations (9)–(17):

x_{1} = x - \frac{w}{2}

(9)

y_{1} = y - \frac{h}{2}

(10)

x_{2} = x + \frac{w}{2}

(11)

y_{2} = y + \frac{h}{2}

(12)

x_{3} = x - \frac{w}{2}

(13)

y_{3} = y + \frac{h}{2}

(14)

x_{4} = x + \frac{w}{2}

(15)

y_{4} = y - \frac{h}{2}

(16)

{\begin{cases} x_{i} = x_{i} \cdot s c a l e_{j}, i = 1, 2, 3, 4, j = 1, 2, 3 \\ y_{i} = y_{i} \cdot s c a l e_{j}, i = 1, 2, 3, 4, j = 1, 2, 3 \end{cases}

(17)

where

(x_{1}, y_{1}), (x_{2}, y_{2}), (x_{3}, y_{3}), (x_{4}, y_{4})

are the relative coordinate points of the four points on the feature map,

w

,

h

are the relative lengths of width and height respectively,

s c a l e_{j}

is the actual size of each feature map,

(x_{i}, y_{i})

is the actual coordinate point on each feature map [37,38,39].

First, these equations are applied to the feature maps of the second, fourth, and seventh layers, which are cropped and the size of the feature map is (608 × w) × (608 × h) × 64, (304 × w) × (304 × h) × 64, and (152 × w) × (152 × h) × 64, respectively, where w and h are the relative lengths of the predicted width and height of the LP, respectively.

Second, the ROI pooling is performed on the cropped feature maps and the sizes are 8 × 16 × 64, 8 × 16 × 64, and 8 × 16 × 64, respectively. The three feature maps are spliced into fusion feature maps of 8 × 16 × 192.

Third, two convolution layers and three full-connected layers are used to predict seven characters.

As we know, it takes much time to load data into the GPU and extract the features. However, as can be seen from Figure 3, traditional LP recognition, including two-step strategy or segmentation-free methods, all have twice the data loaded on the GPU and twice the extracted features.

Therefore, this study refers the ROI pooling from FAST-RCNN [15], the multiscale features from YOLO [6], and the network architecture from SSD [14]. Through many experiments, this study proposed this recognition module, which fuses the multiscale features from the detection module through a new ROI pooling method. Then, seven classifiers are used to predict the LP characters directly.

4. Experiment Verification

The CCPD dataset [16] was used in this study, which contains 280 k LP images, including dimly lit, long-distance, inclined, and bad weather plates. In order to validate the model, this study collected 20k LP images in the province of Shaanxi. First, 300 k LP images were used to divide the test dataset (5 k), validate the dataset (5 k) and train the dataset (290 k). Then, the 5k valid dataset was used to validate the model and count the average precision (AP). Second, in order to validate the precision of the rotated LP images and the bad weather LP images, the 1k rotated LP images and the 1 k bad weather LP images were divided from the 5 k valid dataset. Third, the rotated LP image and the bad weather LP images were validated through the best model after training in the training dataset, respectively. The hardware and software used for the experiment are shown in Table 5

4.1. LP Detection

A test LP dataset was extracted from the CCPD via interpolation, including blurred, missing, weather obscured, rotated (20–50°), and vertically tilted (−10–10°) plates, as shown in Figure 4 [16].

Compared with the contemporary LP detection boxes, as shown in Figure 5b, the red prediction bounding box and green ground truth bounding box are rectangular. These boxes have a large difference between the bounding box and the LP. However, the network in this study applies a more accurate deformed parallelogram, as shown in Figure 5a.

As can be seen from Figure 5, this study can be applied to complex real scenes. For the deformed license plates, this study can generate a parallelogram with a rotating angle, which tightly wraps the license plate.

The LP recognition is correct when and only when the intersection-over-union (IoU) is greater than 0.6 and all characters in the LP number are correctly recognized. As listed in Table 6, the experiment adopted several LP detection and recognition models (i.e., SSD300, YOLOV4, FAST-RCNN and TE2E) for comparison. The comparison results include how many frames are predicted per second (FPS) for the average prediction (AP), for the prediction of the base LP datasets in the CCPD (Base), for the prediction of the rotated LP datasets (Rotate), and for the bad weather in the CCPD.

4.2. LP Character Recognition

The comparative experimental results of the LPDR method are listed in Table 6 and Table 7. It can be seen that the proposed end-to-end network is over 63 fps. In the aspect of accuracy, it can reach 91.5%. Compared with the traditional target detection, our model is much smaller.

For the license plate detection contrast test, FAST-YOLO improved the YOLO methods, which can reach 65 fps. For the average precision, traditional objection detection methods all have great performance.

For the LPDR, in order to compare the end-to-end LPDR methods with the traditional two-step methods sufficiently, we combined state-of-the-art detection models with a state-of-the-art recognition model named Holistic-CNN(HC) [16] as comparisons, including Cascade classifier + HC, SSD + HC, YOLO + HC, FAST-RCNN + HC. In order to highlight the proposed end-to-end net, PPA, Li et al., TE2E, and RPnet were added. Through conducting the comparison experiment in the valid dataset, the speed of the proposed end-to-end network can reach 63 frames per second and the average precision can reach 91.5%. Meanwhile, the number of model parameters is the minimum. Even while running on the MEC chips with two TOPS compute capacity, it can also reach 15 fps. Especially, in the rotated datasets, the network can reach 96.3%.

5. Application Scenario

The presented end-to-end DL LP detection and recognition network was applied in the chips of HiSilicon 3516DV300 and Rockchip Rv1126. This research was applied in the two chips with the network at three projects, i.e., the public toll, parking space, high monitoring robot, the curb robot, and the portable intelligent inspection robot, which have been used in many new first-tier cities in China.

As can be seen from Figure 6 and Figure 7, in the real scene, the developed portable intelligent patrol robot was equipped with the Rockchip Rv1126 chip. It can detect the license plate on the parking space in real time and identify it on a bicycle with a speed of 25 km/h.

6. Conclusions

In this paper, we fused the detection module and recognition module into an overall model and proposed a new end-to-end light LP detection and recognition network. Firstly, this study selected VGG16 network architecture as the backbone to extract the image features, which acquired great performance in SSD and ImageNet 2014 contest. Secondly, in order to improve the sensitivity to the license plate features, a channel attention mechanism was added after the pooling layers. Thirdly, in order to improve the fit between the LP and the prediction box, this study adapted the parallelogram prediction box instead of the rectangle box. The most but not least, the multiscale feature maps cropped from the detection module were used to transform into a fused feature map through the ROI pooling method. Seven special character classifiers were used to identify the seven characters of the license plate. After testing on the CCPD dataset, the precision of the proposed network can reach 91.5% and the speed can reach 63 fps. The presented network was applied in the high monitoring robot, the curb robot, and the portable intelligent inspection robot to assist in the monitoring of public parking spaces.

Author Contributions

Conceptualization, Z.M. and Z.W.; methodology, Z.W.; validation, Z.W. and Y.C.; writing—original draft preparation, Z.W. and Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (2019YFC1907105) and the Key Research and Development Project of Shaanxi Province (2020GY-186, 2020SF-367).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

Jiang, X.; Sun, K.; Ma, L.; Qu, Z.; Ren, C. Vehicle Logo Detection Method Based on Improved YOLOv4. Electron 2022, 11, 3400. [Google Scholar] [CrossRef]
Kanayama, K.; Fujikawa, Y.; Fujimoto, K.; Horino, M. Development of vehicle-license number recognition system using realtime image processing and its application to travel-time measurement. In Proceedings of the 41st IEEE Vehicular Technology Conference, St. Louis, MO, USA, 19–22 May 1991; pp. 798–804. [Google Scholar]
Akhtar, M.J.; Mahum, R.; Butt, F.S.; Amin, R.; El-Sherbeeny, A.M.; Lee, S.M.; Shaikh, S. A Robust Framework for Object Detection in a Traffic Surveillance System. Electronics 2022, 11, 3425. [Google Scholar] [CrossRef]
Saini, M.K.; Saini, S. Multiwavelet transform based license plate detection. J. Vis. Commun. Image Represent. 2017, 44, 128–138. [Google Scholar] [CrossRef]
Mufti, N.; Shah, S.A.A. Automatic number plate Recognition: A detailed survey of relevant algorithms. Sensors 2021, 21, 3028. [Google Scholar]
Lalimi, M.A.; Ghofrani, S.; McLernon, D. A vehicle license plate detection method using region and edge based methods. Comput. Electr. Eng. 2013, 39, 834–845. [Google Scholar] [CrossRef]
Wang, J.; Yang, Y.; Mao, J.; Huang, Z.; Huang, C.; Xu, W. Cnn-rnn: A unified framework for multi-label image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2285–2294. [Google Scholar]
Laroca, R.; Severo, E.; Zanlorensi, L.A.; Oliveira, L.S.; Goncalves, G.R.; Schwartz, W.R.; Menotti, D. A robust real-time automatic license plate recognition based on the yolo detector. arXiv 2018, arXiv:1802.09567. [Google Scholar]
Yu, S.; Li, B.; Zhang, Q.; Liu, C.; Meng, M.Q.-H. A novel license plate location method based on wavelet transform and EMD analysis. Pattern Recognit. 2015, 48, 114–125. [Google Scholar] [CrossRef]
Montazzolli, S.; Jung, C. Real-time brazilian license plate detection and recognition using deep convolutional neural networks. In Proceedings of the 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), Niteroi, Brazil, 17–20 October 2017; pp. 55–62. [Google Scholar]
Silva, S.M.; Jung, C.R. Real-time license plate detection and recognition using deep convolutional neural networks. J. Vis. Commun. Image Represent. 2020, 71, 102773. [Google Scholar] [CrossRef]
Jain, V.; Sasindran, Z.; Rajagopal, A.; Biswas, S.; Bharadwaj, H.S.; Ramakrishnan, K.R. Deep automatic license plate recognition system. In Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing ICVGIP’16, Hyderabad, India, 18–22 December 2016; pp. 1–8. [Google Scholar]
Song, M.K.; Sarker, M.; Kamal, M. Modeling and implementing two-stage AdaBoost for real-time vehicle license plate detection. J. Appl. Math. 2014, 2014, 697658. [Google Scholar] [CrossRef] [Green Version]
Girshick, R. FAST R-CNN. arXiv 2015, arXiv:1504.08083. [Google Scholar]
Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; pp. 21–37. [Google Scholar]
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE PAMI 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Špaňhel, J.; Sochor, J.; Juránek, R.; Herout, A.; Maršík, L.; Zemčík, P. Holistic recognition of low quality license plates by cnn using track annotated data. In Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy, 29 August–1 September 2017; pp. 1–6. [Google Scholar]
Padmasiri, H.; Shashirangana, J.; Meedeniya, D.; Rana, O.; Perera, C. Automated license plate recognition for resource-constrained environments. Sensors 2022, 22, 1434. [Google Scholar] [CrossRef] [PubMed]
Li, H.; Wang, P.; Shen, C. Towards end-to-end car license plates detection and recognition with deep neural networks. arXiv 2017, arXiv:1709.08828. [Google Scholar] [CrossRef]
Xu, Z.; Yang, W.; Meng, A.; Lu, N.; Huang, H.; Ying, C.; Huang, L. Towards end-to-end license plate detection and recognition: A large dataset and baseline. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 255–271. [Google Scholar]
Alam, N.; Ahsan, M.; Based, M.A.; Haider, J. Intelligent System for Vehicles Number Plate Detection and Recognition Using Convolutional Neural Networks. Technologies 2021, 9, 9. [Google Scholar] [CrossRef]
Du, S.; Ibrahim, M.; Shehata, M.; Badawy, W. Automatic license plate recognition (ALPR): A state-of-the-art review. IEEE Trans Circuits Syst Video Technol 2012, 23, 311–325. [Google Scholar] [CrossRef]
Zheng, D.; Zhao, Y.; Wang, J. An efficient method of license plate location. Pattern Recognit. Lett. 2005, 26, 2431–2438. [Google Scholar] [CrossRef]
Ashourian, M.; Daneshmandpoura, N.; Tehrania, O.S.; Moallemb, P. Real time implementation of a license plate location recognition system based on adaptive morphology. Int. J. Eng. 2013, 26, 1347–1356. [Google Scholar] [CrossRef]
Huang, Q.; Cai, Z.; Lan, T. A New Approach for Character Recognition of Multi-Style Vehicle License Plates. IEEE Trans Multimedia 2020, 23, 3768–3777. [Google Scholar] [CrossRef]
Wang, S.Z.; Lee, H.J. A Cascade Framework for a Real-Time Statistical Plate Recognition System. IEEE TIFS 2007, 2, 267–282. [Google Scholar] [CrossRef]
Li, H.; Wang, P.; Shen, C. Toward end-to-end car license plate detection and recognition with deep neural networks. IEEE Trans. Intell. Transp. Syst. 2019, 20, 1126–1136. [Google Scholar] [CrossRef]
Haider, S.A.; Khurshid, K. An implementable system for detection and recognition of license plates in Pakistan. In Proceedings of the 2017 International Conference on Innovations in Electrical Engineering and Computational Technologies (ICIEECT), Karachi, Pakistan, 5–7 April 2017; pp. 1–5. [Google Scholar]
Abedin, M.Z.; Nath, A.C.; Dhar, P.; Deb, K.; Hossain, M.S. License plate recognition system based on contour properties and deep learning model. In Proceedings of the 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), Dhaka, Bangladesh, 21–23 December 2017; pp. 590–593. [Google Scholar]
Dias, C.; Jagetiya, A.; Chaurasia, S. Anonymous Vehicle Detection for Secure Campuses: A Framework for License Plate Recognition using Deep Learning. In Proceedings of the 2019 2nd International Conference on Intelligent Communication and Computational Techniques (ICCT), Jaipur, India, 28–29 September 2019. [Google Scholar]
Barreto, S.C.; Lambert, J.A.; Barros Vidal, F.D. Using Synthetic Images for Deep Learning Recognition Process on Automatic License Plate Recognition. In Proceedings of the Mexican Conference on Pattern Recognition, Querétaro, Mexico, 26–29 June 2019. [Google Scholar]
Babbar, S.; Kesarwani, S.; Dewan, N.; Shangle, K.; Patel, S. A New Approach for Vehicle Number Plate Detection. In Proceedings of the 2018 Eleventh International Conference on Contemporary Computing (IC3), Noida, India, 2–4 August 2018; pp. 1–6. [Google Scholar]
Huang, Y.P.; Chen, C.H.; Chang, Y.T.; Sandnes, F.E. An intelligent strategy for checking the annual inspection status of motorcycles based on license plate recognition. Expert Syst. Appl. 2009, 36, 9260–9267. [Google Scholar] [CrossRef]
Pechiammal, B.; Renjith, J.A. An efficient approach for automatic license plate recognition system. In Proceedings of the 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM), Chennai, India, 23–24 March 2017; pp. 121–129. [Google Scholar]
Sarfraz, M.; Ahmed, M.J.; Ghazi, S.A. Saudi Arabian license plate recognition system. In Proceedings of the 2003 International Conference on Geometric Modeling and Graphics, London, UK, 16–18 July 2003; pp. 36–41. [Google Scholar]
Sferle, R.M.; Moisi, E.V. Automatic Number Plate Recognition for a Smart Service Auto. In Proceedings of the 2019 15th International Conference on Engineering of Modern Electric Systems (EMES), Oradea, Romania, 13–14 June 2019; pp. 57–60. [Google Scholar]
Wang, X.; Zhu, H. Single-Shot Object Detector Based on Attention Mechanism. In Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI 2019), Sanya, China, 20–22 December 2019. [Google Scholar]
Janssens, O.; Slavkovikj, V.; Vervisch, B.; Stockman, K.; Loccufier, M.; Verstockt, S.; Van de Walle, R.; Van Hoecke, S. Convolutional neural network based fault detection for rotating machinery. JSV 2016, 377, 331–345. [Google Scholar] [CrossRef]
Masood, S.Z.; Shu, G.; Dehghan, A.; Ortiz, E.G. License plate detection and recognition using deeply learned convolutional neural networks. arXiv 2017, arXiv:1703.07330. [Google Scholar]

Figure 1. Chinese license plate encoding process. The first encoding character identifies Chinese province. The second encoding character identifies city.

Figure 2. End-to-end light license plate detection and recognition network architecture. (a). Network structure diagram beginning with network input and ending with license plate position and character output. x and y are abscissa and ordinate. w and h are the width and height of bounding box. θ is the rotation angle of bounding box. (b). Attention mechanism (c). Region-of-interest (ROI) pooling, the red line divides LP into several regions, the red circles are the ROI centers of several regions. The white dashed line divides every small ROI region into four small regions. The maximum value within each small ROI region is calculated using bilinear interpolation. Then the maximum values of the four small ROI regions synthesize the red final maximum value.

Figure 3. Traditional LP recognition method.

Figure 4. License plate test dataset examples.

Figure 5. License plate boundary box improvements; (a) End-to-end network; (b) Contemporary networks.

Figure 6. Curb robot.

Figure 7. Portable intelligent inspection robot.

Table 1. Encoding the first Chinese character on the license plate.

first character	皖	沪	津	渝	冀	晋	蒙	辽	吉	黑	苏	浙
coded number	1	2	3	4	5	6	7	8	9	10	11	12
first character	京	闽	赣	鲁	豫	鄂	湘	粤	桂	琼	川	贵
coded number	13	14	15	16	17	18	19	20	21	22	23	24
first character	云	藏	陕	甘	青	宁	新	警	学	O
coded number	25	26	27	28	29	30	31	32	33	34

Table 2. Encoding the second character (city) in the license plate.

second character	A	B	C	D	E	F	G	H	J	K	L	M	N
coded number	1	2	3	4	5	6	7	8	9	10	11	12	13
second character	P	Q	R	S	T	U	V	W	X	Y	Z	0
coded number	14	15	16	17	18	19	20	21	22	23	24	25

Table 3. Encoding of the third-to-seventh characters in the license plate.

remaining character	A	B	C	D	E	F	G	H	J	K	L	M
coded number	1	2	3	4	5	6	7	8	9	10	11	12
remaining character	N	P	Q	R	S	T	U	V	W	X	Y	Z
coded number	13	14	15	16	17	18	19	20	21	22	23	24
remaining character	0	1	2	3	4	5	6	7	8	9
coded number	25	26	27	28	29	30	31	32	33	34

Table 4. Conventional layers of the detection network.

No.	Input	Convolution Kernel	Number of Feature Maps	Activation Function	Output
1	N × 608 × 608 × 3	3 × 3	64	LeakyRelu	n × 608 × 608 × 64
2	n × 608 × 608 × 64	3 × 3	64	LeakyRelu	n × 608 × 608 × 64
	n × 608 × 608 × 64	\	\	Pooling	n × 304 × 304 × 64
3	n × 304 × 304 × 64	3 × 3	64	LeakyRelu	n × 304 × 304 × 64
4	n × 304 × 304 × 64	3 × 3	64	LeakyRelu	n × 304 × 304 × 64
	n × 304 × 304 × 64	\	\	Pooling	n × 152 × 152 × 64
5	n × 152 × 152 × 64	3 × 3	64	LeakyRelu	n × 152 × 152 × 64
6	n × 152 × 152 × 64	3 × 3	64	LeakyRelu	n × 152 × 152 × 64
7	n × 152 × 152 × 64	3 × 3	64	LeakyRelu	n × 152 × 152 × 64
	n × 152 × 152 × 64	\	\	Pooling	n × 76 × 76 × 64
8	n × 76 × 76 × 64	3 × 3	32	LeakyRelu	n × 76 × 76 × 32
9	n × 76 × 76 × 32	3 × 3	32	LeakyRelu	n × 76 × 76 × 32
10	n × 76 × 76 × 32	3 × 3	32	LeakyRelu	n × 76 × 76 × 32
	n × 76 × 76 × 32	\	\	Pooling	n × 38 × 38 × 32
11	n × 38 × 38 × 32	3 × 3	32	LeakyRelu	n × 38 × 38 × 32
12	n × 38 × 38 × 32	3 × 3	32	LeakyRelu	n × 38 × 38 × 32
13	n × 38 × 38 × 32	3 × 3	32	LeakyRelu	n × 38 × 38 × 32
14	n × 46,208	\	\	Relu	n × 4096
15	n × 4096	\	\	Relu	n × 128
16	n × 128	\	\	sigmod	n × 5

Table 5. Hardware and software used for the experiment.

Equipment	Type
processor	Inter(R) Core(TM)i7-10700
memory	32.00GB
graphics card	GEFORCE GTX 3090 super
hard disk	SA400S37/480G
graphics architecture	CUDA10.0
operating system	Ubuntu16.04
simulation platform	Python3.6

Table 6. Comparative accuracy results of object detection.

Method	FPS	AP	Base	Rotate	Bad Weather
SSD300 [15]	40	94.4	99.1	95.6	96.1
YOLOV4-416 [7]	42	93.1	98	94	95.3
FAST-RCNN [16]	15	92.9	98.1	91.8	92.3
TE2E [19]	3	91.1	98.5	93	93.5
RPNet [20]	61	94.5	97.6	87.9	89
RCNN [30]	31	91.2	95.1	91.6	90.3
FAST YOLO [31]	65	90.3	94.5	92.1	91.3
Otsu’s threshold [32]	47	91.6	94.9	91.6	90.1
End-to-end Net	63	91.5	98.3	91	92
End-to-end Net (MEC)	32	91	97.3	90.1	90.7

Table 7. Comparative experimental results of license plate detection and character recognition.

Method	FPS	AP	Base	Rotate	Weather
SSD300+ Holistic-CNN [17]	35	94.4	99.1	95.6	94.2
YOLOV4-416+ Holistic-CNN	40	93.1	98	94	93.3
FAST-RCNN+ Holistic-CNN	12	92.9	98.1	91.8	90.2
Cascade classifier [26]+ Holistic-CNN	34	97.4	98.1	96.3	96.1
PPA [25]	29	59.9	69.7	0.5	36.1
Li et al. [27]	36	91.4	92.1	90.3	90.1
TE2E	3	94.4	97.8	95	93
RPNet	61	94.5	97.6	87.9	86.1
End-to-end Net	63	91.5	98.3	91	90.6
End-to-end Net (MEC)	15	88.9	96.3	90.1	89.7

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ma, Z.; Wu, Z.; Cao, Y. End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning. Electronics 2023, 12, 203. https://doi.org/10.3390/electronics12010203

AMA Style

Ma Z, Wu Z, Cao Y. End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning. Electronics. 2023; 12(1):203. https://doi.org/10.3390/electronics12010203

Chicago/Turabian Style

Ma, Zongfang, Zheping Wu, and Yonggen Cao. 2023. "End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning" Electronics 12, no. 1: 203. https://doi.org/10.3390/electronics12010203

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

End-to-End Light License Plate Detection and Recognition Method Based on Deep Learning

Abstract

1. Introduction

2. Relative Work

2.1. Detection Module

2.2. Recognition Module

3. End-to-End LP Detection and Recognition Method

3.1. Character Encoding

3.2. LP Position Transformation

3.3. Network Model

3.3.1. Detection Module

3.3.2. Recognition Module

4. Experiment Verification

4.1. LP Detection

4.2. LP Character Recognition

5. Application Scenario

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI