#### **1. Introduction**

Automatic vehicle license plate recognition (AVLPR) is used in a wide range of applications including automatic vehicle access control, traffic monitoring, and automatic toll and parking payment systems. Implementation of AVLPR systems is challenging due to the complexity of the natural images from which the license plates need to be extracted, and the real-time nature of the application. An AVLPR system depends on the quality of its physical components which acquire images and the algorithms that process the acquired images. In this paper, we focus on the algorithmic aspects of an AVLPR system, which include the localization of a vehicle license plate, character extraction, and character recognition. For the process of license plate localization, researchers have proposed various methods including connected component analysis (CCA) [1], morphological analysis with edge statistics [2], edge point analysis [3], color processing [4], and deep learning [5]. The rate of accuracy for these localization methods varies from 80.00% to 99.80% [3,6,7]. The methods most commonly used for the recognition stage are optical character recognition (OCR) [8,9], template matching [10,11], feature extraction and classification [12,13], and deep learning based methods [5,14].

In recent years, several countries including the United Kingdom, United States of America, Australia, China, and Canada have successfully used real-time AVLPR in intelligent transport systems [3,15,16]. However, such systems are not yet widely used in Malaysia. The road transportation department of Malaysia has authorized the use of three types of license plates. The first contains white alphanumeric characters embossed or pasted on a black background. The second type is allocated to vehicles belonging to diplomatic personnel and also contains white alphanumeric characters, but the background on these plates is red. The third type is assigned to taxi cabs and hired vehicles, and consists of black alphanumeric characters on a white plate. The characters must also satisfy some rules; for example, there are no leading zeros in a license plate sequence. Additionally, the letters "I" and "O" are excluded from the sequences due to their similarities with the numbers "1" and "0", and only military vehicles have the letter "Z" included in their license plates. The objective of this study was to implement a fast and accurate method for automatic recognition of Malaysian license plates. Additionally, this method can be easily applied to similar datasets.

The paper is organized as follows. A discussion of the existing literature in the field is given in Section 2. In Section 3, we introduce the proposed method: localizing the license plate based on a deep learning method for object detection, image feature extraction through histogram of oriented gradients (HOG), and character recognition using an artificial neural network (ANN) [17]. In Section 4, we establish the suitability of the method for real-time license plate detection through experiments, including comparisons with similar methods. The paper concludes in Section 5 with a discussion of the findings.

#### **2. Related Work**

A typical AVLPR system consists of three stages: detection or localization of the region of interest (i.e., the license plate from the image), character extraction from the license plate, and recognition of those characters [18–20].

#### *2.1. Detection or Localization*

The precision of an AVLPR system is typically influenced by the license plate detection stage. As such, many researchers have focused on license plate detection as a priority. For instance, Suryanarayana et al. [21] and Mahini et al. [22] used the Sobel gradient operator, CCA, and morphological operations to extract the license plate region. They reported 95.00% and 96.5% correct localization, respectively. To monitor highway ticketing systems, a hybrid edge-detection-based method for segmenting the vehicle license plate region was introduced by Hongliang and Changping [2], achieving 99.60% detection accuracy. According to Zheng et al. [23], if the vertical edges of the vehicle image are extracted while the edges representing the background and noise are removed, the vehicle license plate can be easily segmented from the resultant image. In their findings, the overall segmentation accuracy reported was approximately 97%. Luo et al. [24] proposed a license plate detection system for Chinese vehicles in which a single-shot multi-box detector [25] was used for detection, achieving 96.5% detection accuracy on their database.

A wavelet transform based method was applied by Hsieh et al. [26] to detect the license plate region from a complex background. They successfully localized the vehicle license plate in three steps. Firstly, they used Haar scaling [27] as a function for wavelet transformation. Secondly, they roughly localized the vehicle license plate by finding the reference line with the maximum horizontal variation in the transformed image. Finally, they localized the license plate region below the reference line by calculating the total pixel values (the region with the maximum pixel value was considered as the plate region) followed by geometric verification using metrics such as the ratio of length and width of the region. They achieved 92.40% detection accuracy on average. A feature-based hybrid method for vehicle license plate detection was introduced by Niu et al. [28]. Initially, they used color processing (blue–white pairs) for possible localization of the license plate. They then used morphological processing such as open and close operations, followed by CCA. They used geometrical features (e.g., size) to remove unnecessary small regions and finally used HOG features in a support vector machine (SVM) to detect the vehicle license plate and achieved 98.06% detection accuracy with their database.

#### *2.2. License Plate Character Segmentation*

Correct segmentation of license plate characters is important, as the majority of incorrect recognition is due to incorrect segmentation, as opposed to issues in the recognition process [29]. Several methods have been introduced for character segmentation. For instance, Arafat et al. [9] proposed a license plate character segmentation method based on CCA. The detected license plate region was converted into a binary image and eight connected components were used for character region labeling. They achieved 95.40% character segmentation accuracy. A similar method was also introduced by Tabrizi et al. [13], where they achieved 95.24% character segmentation accuracy.
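To make the eight-connectivity criterion concrete, the following minimal Python sketch labels the foreground regions of a binary image so that pixels touching horizontally, vertically, or diagonally share a label. It is illustrative only and is not the implementation of the cited authors; the function name `label_8_connected` and the breadth-first traversal are assumptions chosen for clarity.

```python
from collections import deque

import numpy as np


def label_8_connected(binary):
    """Label foreground regions of a 2-D binary image using 8-connectivity.

    Returns (labels, count): `labels` holds 0 for background and 1..count
    for each connected region.
    """
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=np.int32)
    # The eight neighbours of a pixel: every offset except (0, 0).
    neighbours = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                  if (dr, dc) != (0, 0)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c] or labels[r, c]:
                continue  # background, or already assigned to a region
            count += 1
            labels[r, c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of the region
                cr, cc = queue.popleft()
                for dr, dc in neighbours:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr, nc] and not labels[nr, nc]):
                        labels[nr, nc] = count
                        queue.append((nr, nc))
    return labels, count
```

In a plate image, each labeled region is a candidate character; with four-connectivity, strokes that touch only diagonally would be split into separate regions.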

Chai and Zuo [30] used a similar process for segmenting vehicle license plate characters. To remove unnecessary small character regions from the detected license plate, a vertical and horizontal projection method, alongside morphological operations and CCA, was used. They achieved 97.00% character segmentation accuracy. Dhar et al. [31] proposed a vehicle license plate recognition system for Bangladeshi vehicles using edge detection and deep learning. Their method for character segmentation involved a combination of edge detection, morphological operations, and analysis of segmented region properties (e.g., ratio of height and width). Although they did not mention any segmentation results, they achieved 99.6% accuracy for license plate recognition. De Gaetano Ariel et al. [32] introduced an algorithm for Argentinian license plate character segmentation. They used horizontal and vertical edge projection to extract the characters, with a 96.49% accuracy level.

#### *2.3. Recognition or Classification*

Some researchers recognized license plates using adaptive boosting in conjunction with Haar-like features and training cascade classifiers on those features [33–35]. Several researchers have used template matching to recognize the license plate text [10,11]. Feature extraction based recognition has also proven to be accurate in vehicle license plate recognition [12,13,28]. Samma et al. [12] introduced fuzzy support vector machines (FSVM) with particle swarm optimization for Malaysian vehicle license plate recognition. They extracted image features using Haar-like wavelet functions, used an FSVM for classification, and achieved 98.36% recognition accuracy. A hybrid k-nearest neighbors and support vector machine (KNN-SVM) based vehicle license plate recognition system was proposed by Tabrizi et al. [13]. They used operations such as filling, filtering, dilation, and edge detection (using the Prewitt operator [36]) for license plate localization after color to grayscale conversion. For feature extraction, they used a structural and zoning feature extraction method. Initially, a KNN was trained with all possible classes including similar and dissimilar characters (whereas the SVM was trained only on similar character samples). Once the KNN ascertained which "similar character" class the target character belonged to, the SVM performed the next stage of classification to determine the actual class. They achieved 97.03% recognition accuracy.

Thakur et al. [37] introduced an approach that used a genetic algorithm (GA) for feature extraction and a neural network (NN) for classification in order to identify characters in vehicle license plates. They achieved 97.00% classification accuracy. Jin et al. [3] introduced a solution for license plate recognition in China. They used hand-crafted features on a fuzzy classifier to obtain 92.00% recognition accuracy. Another group of researchers proposed a radial wavelet neural network for vehicle license plate recognition [38]. They achieved 99.54% recognition accuracy.

Brillantes et al. [39] utilized fuzzy logic for Filipino vehicle license plate recognition. Their method was effective in identifying license plates from different issues which contained characters of different fonts and styles. They segmented the characters using CCA along with fuzzy clustering. They then used a template matching algorithm to recognize the segmented characters. The recognition accuracy of their methods was 95.00%. Another fuzzy based license plate region segmentation method was introduced by Mukherjee et al. [40]. They used fuzzy logic to identify edges in the license plate in conjunction with other edge detection algorithms such as Canny and Sobel [41,42]. A template matching algorithm was then used to recognize the license plate text from the segmented region and achieved a recognition accuracy of 79.30%. A hybrid segmentation method combining fuzzy logic and k-means clustering was proposed by Olmí et al. [43] for vehicle license plate region extraction. They developed SVM and ANN models to perform the classification task and achieved an accuracy level of 95.30%.

#### *2.4. Recent Methods of AVLPR*

Recently, deep learning based image classification approaches have received more attention from researchers as they can learn image features on their own, in addition to performing classification [44]. Therefore, no separate hand-crafted feature extraction step is required for deep learning approaches. However, despite the advantages of using deep learning in image classification, it requires a large training image database and very high computational power. Li et al. [45] investigated a method of identifying similar characters on license plates based on convolutional neural networks (CNNs). They used CNNs as feature extractors and also as classifiers. They achieved 97.20% classification accuracy. Another deep learning method based on AlexNet [46] was introduced by Lee et al. [47] for AVLPR, where they re-trained AlexNet to perform their task on their database and achieved 95.24% correct recognition. Rizvi et al. [5] also proposed a deep learning based approach for Italian vehicle license plate recognition on a mobile platform. They utilized two deep learning models, one to detect and localize the license plate and the characters present, and another as a character classifier. They achieved 98.00% recognition accuracy with their database.

Another deep learning method called "you only look once" (YOLO) was developed for real-time object detection, and is now being used in AVLPR [48]. For example, Kessentini et al. [49] proposed a two-stage deep learning approach that first used YOLO version 2 (YOLO v2) for license plate detection [50]. Then, they used a convolutional recurrent neural network (CRNN) based segmentation-free approach for license plate character recognition. They achieved 95.31% and 99.49% character recognition accuracy in the two stages, respectively. Another YOLO based method was developed by Hendry and Chen [51] for vehicle license plate recognition in Taiwan. Here, detection and recognition of each character were carried out using a separate YOLO model, for a total of 36 YOLO models covering 36 classes. They achieved 98.22% and 78.00% accuracy for vehicle license plate detection and recognition, respectively.

Similarly, Yonetsu et al. [52] also introduced a two-stage YOLO v2 model for Japanese license plate detection. To increase accuracy, they initially detected the vehicle, followed by the detection of the license plate. In clear weather conditions, they achieved 99.00% and 87.00% accuracy for vehicle and license plate detection, respectively. A YOLO based three-stage Bangladeshi vehicle license plate detection and recognition method was implemented by Abdullah et al. [53]. Firstly, they used YOLO version 3 (YOLO v3) as their detection model [54]. In the second stage, they segmented the license plate region and character patches. Finally, they used a ResNet-20 deep learning model for the character recognition [55]. They achieved 95.00% and 92.70% accuracy for license plate detection and character recognition, respectively. Laroca et al. [56] used YOLO for license plate detection and then another method proposed by Silva and Jung [57] for character segmentation and recognition. They tested their performance on their own database (UFPR-ALPR), which is now publicly available for research purposes. They achieved 98.33% and 93.53% accuracy for vehicle license plate detection and recognition, respectively.

#### **3. Methodology**

In the proposed system, a digital camera was placed at a fixed distance and height to capture images of vehicle license plates. When a vehicle was at a predefined distance from the camera, the camera captured an image of the front of the vehicle, including the license plate. This image then went through several pre-processing steps to eliminate the unwanted background and localize the license plate region. Once this region was extracted from the original image, a character segmentation algorithm was used to segment the characters from the background of the license plate. The segmented characters were then identified using an ANN classifier trained on HOG features. Figure 1 illustrates the steps involved in the proposed AVLPR system.

**Figure 1.** Outline of the proposed system for automatic license plate recognition.

#### *3.1. Image Acquisition*

A digital camera was used as an image acquisition device. This camera was placed at a height of 0.5 m from the ground. An ideal distance to capture images of an arriving vehicle was pre-defined. To detect whether a vehicle was within this pre-defined distance threshold, we subtracted the image of the background (with no vehicles) from each frame of the obtained video. If more than 70% of the background was obscured, it was considered that a vehicle was within this threshold. To avoid unnecessary background information, the camera lens was set to 5× zoom. The speed of the vehicles when the images were captured was around 20 km/h. Camera specifications and image acquisition properties are shown in Table 1.
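The presence test described above can be sketched as follows. This is a minimal illustration, not the system's actual code: it assumes 8-bit grayscale frames, and the per-pixel difference threshold `diff_threshold` and the function name `vehicle_in_range` are hypothetical, since the text specifies only the 70% coverage criterion.

```python
import numpy as np


def vehicle_in_range(frame, background, diff_threshold=30, coverage=0.70):
    """Return True when more than `coverage` of the background is obscured.

    `frame` and `background` are 8-bit grayscale images of equal shape.
    `diff_threshold` (an assumed value) decides when a pixel counts as changed.
    """
    # Subtract in a signed type so that uint8 arithmetic cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    obscured = diff > diff_threshold
    # Fraction of pixels that differ from the empty-scene background.
    return bool(obscured.mean() > coverage)


# Usage: a synthetic frame in which 80% of the pixels differ from the background.
background = np.zeros((10, 10), dtype=np.uint8)
frame = background.copy()
frame[:8, :] = 200
print(vehicle_in_range(frame, background))  # True
```

A fixed threshold of this kind is sensitive to lighting changes, so a deployed system would likely need to refresh the background image periodically.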



#### *3.2. Detection of the License Plate*

Once the image was acquired, it was processed to detect the license plate. First, the contrast of the RGB (red, green, and blue) images was improved using histogram equalization. As the location of the license plate in the acquired images was relatively consistent, we extracted a pre-defined rectangular region of interest (ROI) from the image to be used in the next stages of processing. The region to be extracted was defined using the x and y coordinates of its upper left corner (XOffset and YOffset) and the width and height of the rectangle. This reduced the size of the image to be processed from 4608 × 3456 pixels in the original image to 2995 × 1891 pixels. Figure 2 shows the specifications of the ROI.
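A minimal sketch of these two pre-processing steps is given below. Two points are assumptions rather than details taken from the text: histogram equalization is applied to each RGB channel independently, and the offset values in the usage example are hypothetical placeholders (only the 2995 × 1891 ROI size is stated). The function names are likewise illustrative.

```python
import numpy as np


def equalize_channel(channel):
    """Histogram-equalize one 8-bit channel via its cumulative distribution."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest intensity present
    total = cdf[-1]
    if total == cdf_min:           # constant channel: nothing to stretch
        return channel.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[channel]            # map every pixel through the lookup table


def equalize_rgb(image):
    """Equalize each RGB channel independently (an assumed strategy)."""
    return np.stack([equalize_channel(image[..., k]) for k in range(3)], axis=-1)


def crop_roi(image, x_offset, y_offset, width, height):
    """Extract the rectangle whose upper left corner is (x_offset, y_offset)."""
    return image[y_offset:y_offset + height, x_offset:x_offset + width]


# Usage with a synthetic image of the stated size; the offsets are hypothetical.
image = np.random.default_rng(0).integers(0, 256, (3456, 4608, 3), dtype=np.uint8)
roi = crop_roi(equalize_rgb(image), x_offset=800, y_offset=900,
               width=2995, height=1891)
print(roi.shape)  # (1891, 2995, 3)
```

Because NumPy indexes rows first, the y range precedes the x range in the slice.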

**Figure 2.** The initial extracted region of interest, defined by the green bounding box.

To detect the license plate, the ROI extracted in the previous stage was first resized to 128 × 128 pixels. Then, based on previous studies, a deep learning based approach (YOLO v2, as discussed in [50]) was used to extract the license plate region. The network accepts 128 × 128 pixel RGB images as input and processes them in an end-to-end manner to produce a corresponding bounding box for the license plate region. This network has 25 layers: one Image Input layer, seven Convolution layers, six Batch Normalization layers, six Rectified Linear Unit (ReLU) layers, three Max Pooling layers, one YOLO v2 Transform layer, and one YOLO v2 Output layer [58]. Figure 3 shows the YOLO v2 network architecture. We used this network to detect the license plate region only, not for license plate character recognition. The motivation behind this design was to reduce the total processing time on low computational power without compromising accuracy.
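Because the detector operates on a 128 × 128 input, a predicted box is expressed in the coordinates of the resized image. The sketch below shows one way such a box could be mapped back to the 2995 × 1891 ROI. The text does not describe this step, so the mapping, the `[x, y, w, h]` box convention, and the function name `scale_box_to_roi` are all assumptions made for illustration.

```python
def scale_box_to_roi(box, roi_size=(2995, 1891), net_size=(128, 128)):
    """Map an (x, y, w, h) box from network-input pixels to ROI pixels.

    `roi_size` and `net_size` are (width, height) pairs. The horizontal and
    vertical scale factors differ because the resize does not preserve the
    aspect ratio of the ROI.
    """
    sx = roi_size[0] / net_size[0]   # 2995 / 128, horizontal scale
    sy = roi_size[1] / net_size[1]   # 1891 / 128, vertical scale
    x, y, w, h = box
    return (x * sx, y * sy, w * sx, h * sy)


# Usage: a box covering the whole network input maps to the whole ROI.
print(scale_box_to_roi((0, 0, 128, 128)))  # (0.0, 0.0, 2995.0, 1891.0)
```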

**Figure 3.** The architecture of the YOLO v2 network.

For training, we used the stochastic gradient descent with momentum (SGDM) optimizer with a momentum of 0.9, an Initial Learn Rate of 0.001, and Max Epochs of 30 [59]. We chose these values as they provided the best performance with low computational power in our experiments. To improve the network accuracy, we used training image augmentation by randomly flipping the images during the training phase. This augmentation increased the variation in the training images without increasing the number of labeled images. Note that, to keep the evaluation unbiased, we did not apply any augmentation to the test images, which were left unmodified. An example detection result is shown in Figure 4.
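When an image is flipped for detector training, its ground-truth bounding box has to be mirrored as well. The sketch below illustrates this under stated assumptions: the flip is horizontal, it is applied with probability 0.5, and boxes are stored as `(x, y, w, h)` with the origin at the upper left corner. The text says only that images were flipped randomly, so these details and the function name are hypothetical.

```python
import numpy as np


def random_horizontal_flip(image, box, rng, p=0.5):
    """Flip an image left to right with probability `p`, mirroring its box.

    `box` is (x, y, w, h) in pixels, where x is the left edge of the box.
    Returns the (possibly flipped) image and the matching box.
    """
    if rng.random() >= p:
        return image, box              # leave this sample unchanged
    x, y, w, h = box
    img_width = image.shape[1]
    flipped = image[:, ::-1]           # reverse the column order
    # The old right edge (x + w) becomes the new left edge after mirroring.
    return flipped, (img_width - x - w, y, w, h)


# Usage: augment one training sample; test images would bypass this function.
rng = np.random.default_rng(42)
image = np.zeros((128, 128, 3), dtype=np.uint8)
aug_image, aug_box = random_horizontal_flip(image, (20, 60, 40, 15), rng)
```

The vertical position and the size of the box are unaffected by a horizontal flip, so only the x coordinate changes.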

**Figure 4.** Detection result: from left to right, input image, ground truth image (red bounding box), and output image with detected license plate ROI (green bounding box).
