Article

A Bio-Inspired Retinal Model as a Prefiltering Step Applied to Letter and Number Recognition on Chilean Vehicle License Plates

by John Kern 1, Claudio Urrea 1,*, Francisco Cubillos 2 and Ricardo Navarrete 1
1 Electrical Engineering Department, Faculty of Engineering, University of Santiago of Chile, Las Sophoras 165, Estación Central, Santiago 9170020, Chile
2 Chemical Engineering and Biotechnology Department, Faculty of Engineering, University of Santiago of Chile, Las Sophoras 165, Estación Central, Santiago 9170020, Chile
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(12), 5011; https://doi.org/10.3390/app14125011
Submission received: 27 March 2024 / Revised: 6 June 2024 / Accepted: 6 June 2024 / Published: 8 June 2024

Abstract

This paper presents a novel use of a bio-inspired retina model as a scene preprocessing stage for the recognition of letters and numbers on Chilean vehicle license plates. The goal is to improve the effectiveness and ease of pattern recognition. Inspired by the responses of mammalian retinas, this retinal model reproduces both the natural adjustment of contrast and the enhancement of object contours performed by parvocellular cells. Among other contributions, this paper provides an in-depth exploration of the architecture, advantages, and limitations of the model; investigates its tuning parameters; and evaluates its performance when combined with a convolutional neural network (CNN), a spiking neural network (SNN), and an optical character recognition (OCR) algorithm, using 40 different genuine license plate images as a case study and for testing. The results obtained demonstrate a reduction in error rates for character recognition based on CNNs, SNNs, and OCR. It is concluded that this bio-inspired retina model offers a wide spectrum of potential applications to further explore, including motion detection, pattern recognition, and improvement of dynamic range in images, among others.

1. Introduction

The recognition of vehicle license plates and their alphanumeric content is a pressing necessity that has garnered considerable attention from researchers in recent decades. Various methods have emerged to address challenges such as detecting vehicle license plates in everyday environments and recognizing all their characters and digits under difficult conditions. Most methods are based on features tailored for vehicle license plates of specific regions, given the diverse design laws across countries regarding font type, number of digits, background color, text color, and even reflective properties.
In general, techniques for detection and recognition fall into three categories: (i) methods for vehicle license plate detection and character segmentation, (ii) optical character recognition (OCR), and (iii) vehicle license plate recognition. The literature indicates that techniques utilizing deep learning yield superior results [1], though they rely heavily on the volume of training data. Satisfactory results are not achievable with small datasets, and processing time significantly increases with very large databases. Therefore, it is logical to develop models with region-specific databases, despite common characters across regions.
In the case of Chile, alphanumeric vehicle license plates have been in circulation since 1985, comprising 2 letters (except for Ñ and Q) and 4 numbers, separated by the national coat of arms logo. Since 2007, the format has changed to 4 letters and 2 numbers, with the set of letters restricted to 18. Additionally, there are 10 different styles denoting vehicle types, with common styles including a white background for private vehicles, orange for taxis, yellow for public transportation, and black for police vehicles.
Despite advancements, challenges persist, such as processing time, environmental opacity, and necessary resolution per letter. Given these challenges, preprocessing images becomes crucial to increase success rates in vehicle license plate recognition. Hence, this paper explores the application of a bio-inspired parvocellular retinal model as a preprocessing stage to accentuate scene contours.
Bio-inspired retinal processing emulates mammalian visual responses, extracting aspects such as contour, depth, and motion. Applying a bio-inspired retina often improves photographs that would otherwise be poorly suited to optical analysis, making it a relevant tool to enhance the success of artificial vision processes.
The primary objective is to enhance the recognition of Chilean vehicle license plate characters, with secondary objectives including presenting key concepts of the bio-inspired retina and its mathematical modeling, and testing the retina model’s efficacy in enhancing scenes for character recognition using OCR, convolutional neural networks (CNNs), and spiking neural networks (SNNs).
This work is composed of six sections, beginning with the Introduction section and a general approach to the problem, followed by Section 2, entitled Related Work, where a bibliographic review of several works related to the application of the bio-inspired retina model is conducted. Then, in Section 3, the methodology is presented, with special emphasis on the mathematical models to be used. In Section 4, the final test results of the three experiments on 40 license plate samples under different conditions are reported. Section 5 presents a summary of the developed work and the most relevant conclusions. Finally, in Section 6, future work is presented.

2. Related Work

This section presents several articles focused on the recognition of characters within vehicle license plates, as well as research that develops bio-inspired retinal models applied in various disciplines and their impact on automated systems.
The recognition of letters and numbers on vehicle license plates can be divided into three different stages:
  • Detection and extraction of the region of interest containing the license plate, where the effort is focused on detecting vehicle license plates in a scene and ensuring that they are of readable quality.
  • Number and letter plate character segmentation, where the effort is focused on determining which part of a vehicle license plate segmented from the scene contains numbers and letters.
  • Recognition of alphanumeric characters, where, after determining the location of patterns on a vehicle license plate, two questions are answered: What letters are present on a vehicle license plate? What numbers are on a vehicle license plate? The answers must be presented in logical order [2].
Regardless of the stage, the use of deep learning-based methods has improved accuracy and the ability to recognize vehicle license plates in challenging environments. Convolutional neural networks (CNNs) have shown superior performance in this area. However, one of the main challenges associated with deep learning techniques is the need for large quantities of data, as the variability and quantity of training data influence performance [3]. The authors of [3] also highlight two main challenges in vehicle license plate detection (LPD). First, existing methods often use horizontal rectangles to enclose objects, which may not be suitable for license plates with different orientations due to rotation and perspective distortion. Second, the varying scales of license plates lead to difficulties in multi-scale detection. To address these issues, the authors propose a multi-oriented and scale-invariant license plate detection (MOSI-LPD) method based on CNNs, which outperforms existing approaches in detecting license plates with different orientations and multiple scales.
In [4], a comparison of object detection methods for vehicle license plate recognition is performed using a new dataset for Turkish license plates labeled with “car” and “license plate” classes. The tested networks include a single-shot multibox detector (SSD), a faster region-based convolutional neural network (Faster R-CNN), and region-based fully convolutional networks (R-FCNs). The results show that SSD-based solutions can better detect larger objects (cars) but have difficulties detecting smaller license plates, while Faster R-CNN achieves the highest accuracy of 97.9%.
The authors of [5] propose the edge-guided sparse attention network (EGSANet) for license plate detection, which utilizes edge contours and solves the problem of real-time detection. They also adjust parameters of the YOLOv3 network [6] to improve accuracy on a Malaysian license plate dataset from 87.75% to 99%.
Research in [7] proposes a convolutional neural network called optimal k-means clustering-based segmentation and convolutional neural networks (OKM-CNNs) for license plate detection, character segmentation, and number recognition. The proposed model employs the improved Bernsen algorithm (IBA) and connected component analysis (CCA) for license plate detection and localization, and the simulation results demonstrate effective performance compared to other methods.
In [8], an accurate and real-time automatic license plate recognition (ALPR) method called VSNet is proposed, which includes two CNNs: VertexNet for license plate detection and SCR-Net for license plate recognition. The proposed VSNet outperforms state-of-the-art methods by more than 50% relative improvement in the error rate, achieving recognition accuracy superior to 99% on the Chinese City Parking Dataset (CCPD) and the Application-Oriented License Plate (AOLP) dataset with an inference speed of 149 frames per second (FPS).
In [9], the researchers propose ALPRNet, a single neural network for the detection and recognition of mixed-style license plates. ALPRNet uses two fully convolutional one-stage object detectors to simultaneously detect and classify license plates and characters, followed by an assembly module to output the license plate strings. Experimental results show that ALPRNet achieves state-of-the-art results with a simple one-stage network.
One study [10] focuses on the detection and recognition of Chinese license plates in complex situations using datasets from the Chinese City Parking Dataset. The authors propose a two-stage license plate recognition algorithm based on YOLOv3 and the Improved License Plate Recognition Net (ILPRNET). The test results indicate that the proposed algorithm works well in diverse complex scenarios, with recognition accuracy reaching up to 99.2% on some datasets.
The presented research shares the common use of convolutional neural networks and well-known datasets such as “Caltech1999–2001” [11], “PKU-G1, G2, G3, G4, and G5”, the “Synthetic dataset”, the “Stanford cars dataset” [12], “CCPD”, “UFPR-ALPR”, and “SaudiLP” [13]. However, it is also common to find research using personal, non-public databases.
Given the works presented, the detection of vehicle license plates and the identification of their characters remains an ongoing problem with multiple approaches. Due to the high performance of CNNs, they have become the preferred technique for developing models, whether with known or proprietary datasets.
The following section presents research focused on the use of bio-inspired models for scene improvement or feature extraction to obtain better results in different applications.
It is important to note that computer vision is still an area of current research and is constantly evolving [14,15,16,17,18,19]. Different techniques can reinforce system responses to improve results, such as convolutional neural networks [20,21,22] and bio-inspired retinas, which allow the creation of filters for the visual preprocessing stage analogous to the functioning of some biological retinas. This type of filter is applicable to multiple applications such as navigational robots [23], pattern recognition, and improvements in tone mapping, among others.
In [24], the authors propose a bio-inspired system based on the retina of fish to solve the problems of underwater image degradation caused by color distortion and non-uniform light polarization. The adaptive nature of this bio-inspired model allows it to operate without knowing the previous conditions of the water. The authors conclude that their algorithm fuses complementary information on luminosity provided by the ON–OFF response of the designed retina, which differs from other fusion methods. From an implementation perspective, the application of this technique in reconnaissance aquatic robots is fascinating.
The research described in [25] found that bio-inspired retinal models allow reduced processing time and applied these techniques to unmanned or autonomous vehicles. The results support the idea that the decision-making capabilities of the robots became faster and more accurate. Furthermore, the authors argue that bio-inspired retinal models supported by a reinforcement learning method prove very reliable for decision-making.
A proposed bio-inspired convolutional neural network layer, named Q9, is presented in [26]. This layer is designed to provide an invariant response to high-contrast changes and maintain consistent performance in classification tasks, even when the convolutional neural network is trained with different levels of contrast. Additionally, the authors claim that Q9 enables faster learning of image rotation angles than a standard convolutional layer. This demonstrates that bio-inspired retinal models offer improved performance in computational tasks when training a CNN.
In summary, the use of bio-inspired models allows great versatility in decomposing an image, enabling the resolution of various pattern identification, image segmentation, and motion detection problems with low processing time, becoming a complement to CNNs.
Based on the presented research, the analysis of characters on vehicle plates using CNNs remains under study, as it is one of the preferred methods due to its high accuracy and processing time. Since there is no exclusive dataset of Chilean vehicle license plates, this work opts to use a dataset of already-segmented characters containing letters and numbers compatible with Chilean regulations. Thus, this work focuses on processing already-segmented vehicle plate images, leaving out of scope the recognition and segmentation of license plates from a scene. To validate the results, images captured in an everyday environment are used. Finally, due to the need to preprocess images before recognition, the incorporation of a bio-inspired retina model is chosen to improve the overall results of the experiments to be presented.

3. Methodology

The work to be presented consists of three experiments: the first is an alphanumeric character interpreter based on OCR, and the remaining two are interpreters based on a CNN model and an SNN model. Each experiment presented in this study includes two implementations: one incorporating a preprocessing stage utilizing a bio-inspired retinal model and another without the retinal model. This is performed with the objective of verifying the impact of bio-inspired preprocessing on the recognition of alphanumeric characters.

3.1. Analysis and Structure of the Bio-Inspired Retinal Model

3.1.1. Global Structure

The retina is a light-sensitive multilayer located at the back of the eye. Each layer of the retina is made up of neuronal cells with specialized functions that, by working in concert, can convert light signals into electrical impulses for transmission to the brain. Figure 1 illustrates the distribution of the layers that make up the retina.
The retinal pigment epithelium (RPE) attaches to the choroid via Bruch’s membrane and maintains the health of adjacent rod and cone photoreceptors through various processes. The tight junctions between individual RPE cells facilitate the formation of the blood–retinal barrier, maintaining the immune-privileged status of the eye and regulating ion exchange.
Photoreceptors, the main type of light-sensitive cell in the retina, traverse the next three layers of the retina in a highly polarized manner. The inner and outer segments of the rod- and cone-shaped photoreceptors form the photoreceptor layer adjacent to the RPE, followed by the outer limiting membrane (OLM), outer nuclear layer (ONL), and outer plexiform layer (OPL), as shown in Figure 2.
The human retina contains approximately 6 million cones and 120 million rods, with cones concentrated in the macula and fovea, providing central vision and color perception. Rods are located along the periphery of the retina and enable vision in poorly lit environments.
Via visual phototransduction, trapped light causes dissociation of retinal molecules from opsin proteins in cones and rods, leading to hyperpolarization of photoreceptors and inhibition of glutamate release. In the dark, glutamate suppresses the activity of bipolar and horizontal cells, while its absence relieves the inhibition of retinal neurons, leading to their activation. Bipolar cells amplify and transmit the electrical signal downstream to amacrine cells, while horizontal cells help optimize performance and provide feedback to photoreceptors.
The nuclei of secondary neurons—horizontal, amacrine, and bipolar cells—are located within the inner nuclear layer (INL), as shown in Figure 3. Amacrine cell dendrites synapse with retinal ganglion cell (RGC) dendrites in the inner plexiform layer (IPL). RGCs send action potentials through their axons, which form the optic nerve and carry visual messages to the brain. The internal limiting membrane (ILM) is the innermost layer of the retina, separating the layer of nerve fibers from the vitreous.
The outer and inner plexiform layers can be considered a set of biological low-pass and high-pass filters, forming a spatiotemporal filter. Processing is mostly performed in the frequency domain, facilitating the parvocellular and magnocellular pathways. The parvocellular pathway highlights contours to answer “What is it?” while the magnocellular pathway highlights movement and answers “Where is it?” Both pathways connect with the lateral geniculate nucleus (LGN) and relay information to different areas of visual processing. This study focuses on the parvocellular responses of bipolar cells, which highlight image contours and facilitate image processing.

3.1.2. Light Regulator Model

Since photoreceptors have the ability to adapt their response as a function of the surrounding luminance levels, the model of Equation (1), modified from the Michaelis–Menten model, is presented, allowing the light range to be normalized between 0 and a maximum value $V_{max}$ [28].
$$C(p) = \frac{R(p)}{R(p) + R_0(p)} \, \bigl( V_{max} + R_0(p) \bigr), \qquad R_0(p) = V_0 \, L(p) + V_{max} \, (1 - V_0) \qquad (1)$$
Thus, the light adjustment $C(p)$ of the photoreceptor $p$ is determined by the current luminosity $R(p)$ and the luminosity $L(p)$ of the neighborhood of the photoreceptor $p$. The neighborhood luminosity is calculated by applying the low-pass filter $F_h$. Here, the constant $V_0$ is a compression parameter that ranges from 0 to 1, while $V_{max}$ is the maximum value that the pixel to be processed can adopt.
The proposed model allows the brightness of the darkest pixels to be modified while keeping the lighter pixels constant or close to their original value. To demonstrate this effect, Figure 4 illustrates the response of the photoreceptor system to several values of $V_0$.
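As a concrete reading of Equation (1), the following NumPy sketch implements the light regulator with the Table 2 values $V_0 = 0.7$ and $V_{max} = 255$ as defaults. The Gaussian used to estimate the neighborhood luminance $L(p)$ is our own choice, since the paper only requires a low-pass filter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def photoreceptor_adaptation(img, v0=0.7, v_max=255.0, sigma=3.0):
    """Luminance adaptation of Eq. (1): dark pixels are brightened while
    pixels that are already bright stay close to their original value."""
    r = img.astype(np.float64)
    # L(p): local luminance of the neighborhood; the Gaussian width sigma
    # is an assumption, the paper only asks for a low-pass filter.
    l = gaussian_filter(r, sigma=sigma)
    r0 = v0 * l + v_max * (1.0 - v0)    # adaptation level R_0(p)
    c = r / (r + r0) * (v_max + r0)     # Michaelis-Menten-style compression
    return np.clip(c, 0.0, v_max)
```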

3.1.3. Outer Plexiform Layer

The synaptic behavior between neurons can be modeled as an electrical circuit, as expressed in Equation (2). In accordance with the model presented in [29], the membrane potential of a cone $c(k, t)$ at location $k$ responds to the flux of incident photons as an electrical circuit with resistance $r_c$ and capacitance $C_c$, excited by a voltage source $s(k, t)$ with an internal resistance $r_c$. The gap junctions (dendrite and axon connections) between cones are modeled by a resistance $R_c$ between position $k$ and the neighboring positions $k - 1$ and $k + 1$.
$$C_c \frac{d}{dt} c(k, t) = \frac{s(k, t) - c(k, t)}{r_c} + \frac{c(k-1, t) - c(k, t)}{R_c} + \frac{c(k+1, t) - c(k, t)}{R_c} \qquad (2)$$
The solution to Equation (2) is obtained by successively taking the Fourier transforms with respect to the discrete spatial variable $k$ and the continuous time variable $t$, resulting in the transfer function of the cone circuit presented below:
$$C(f_s, f_t) = \frac{1}{1 + 2\alpha_c \bigl( 1 - \cos(2\pi f_s) \bigr) + j 2\pi \tau_c f_t} \, S(f_s, f_t) \qquad (3)$$
where $f_s$ and $f_t$ are the spatial and temporal frequencies, respectively, $\alpha_c = r_c / R_c$ is the space constant of the gap-junction coupling, and $\tau_c = r_c C_c$ is the time constant of the cell membrane. From Equation (3), it follows that the photoreceptor layer functions as a spatiotemporal low-pass filter of non-separable variables. In the process of chromatic adaptation through negative feedback to the photoreceptors, the horizontal cells play a fundamental role in modulating the information processed by the cones; they exhibit behavior analogous to that described by Equation (3), with coefficients $\alpha_h = r_h / R_h$ and $\tau_h = r_h C_h$, as presented in the following equation:
$$F_h(f_s, f_t) = \frac{1}{1 + 2\alpha_h \bigl( 1 - \cos(2\pi f_s) \bigr) + j 2\pi \tau_h f_t} \, S(f_s, f_t) \qquad (4)$$
In summary, given that horizontal cells regulate abrupt changes in luminance, it can be concluded that they function as a low-pass filter in the frequency domain.
In this context, the study conducted in [30] introduces an additional gain $\beta$ into Equations (3) and (4), as presented in Equations (5) and (6), respectively. In these equations, $\beta_{ph}$ represents the gain of $F_{ph}$, which is typically kept at 0 but can be increased up to 1 to enhance the dynamic signal. Conversely, $\beta_h$ denotes the gain of the filter $F_h$, which is generally set at 1 but can be decreased toward 0 to extract only the contours of the scene. Consequently, low-pass filters can be considered within the OPL, which comprises horizontal cells and photoreceptors [26], enabling their behavior to be modeled with relative simplicity. The model for the photoreceptors, $F_{ph}(f_s, f_t)$, is expressed by Equation (5), whereas the model for the horizontal cells, $F_h(f_s, f_t)$, is determined by Equation (6). Both functions depend on the spatial frequency $f_s$ and the temporal frequency $f_t$.
$$F_{ph}(f_s, f_t) = \frac{1}{1 + \beta_{ph} + 2\alpha_{ph} \bigl( 1 - \cos(2\pi f_s) \bigr) + j 2\pi \tau_{ph} f_t} \qquad (5)$$
$$F_h(f_s, f_t) = \frac{1}{1 + \beta_h + 2\alpha_h \bigl( 1 - \cos(2\pi f_s) \bigr) + j 2\pi \tau_h f_t} \qquad (6)$$
Despite having the same structure, Equations (5) and (6) differ in their parameters: $\beta_{ph}$ is the gain of the filter $F_{ph}$ and is generally set at 0, while $\beta_h$ is the gain of the filter $F_h$ and is reduced toward 0 when only the contours of the scene are to be extracted.
The terms $\alpha_{ph}$ and $\alpha_h$ are spatial filtering constants: $\alpha_{ph}$ adjusts the cutoff for high frequencies, whereas $\alpha_h$ adjusts the cutoff for low frequencies. Finally, $\tau_{ph}$ and $\tau_h$ are temporal filtering constants that allow minimization of temporal noise.
Concerning horizontal cells, which either reinforce or inhibit the response of bipolar cells, it is hypothesized that the response of the $F_{ph}$ filter corresponds to that of the ON bipolar cell, while the convolution of the $F_{ph}$ and $F_h$ filters represents the response of the OFF bipolar cell. The combination of both cell types enables the acquisition of the OPL output, as expressed in Equation (7) [30]. It is important to note that the responses of the ON and OFF bipolar cells are complementary.
$$F_{OPL}(f_s, f_t) = F_{ph}(f_s, f_t) \cdot \bigl[ 1 - F_h(f_s, f_t) \bigr] \qquad (7)$$
Figure 5 shows the responses of each of the stages that make up the OPL. The spatiotemporal filter $F_{OPL}$ allows the control of spatial and temporal noise and, in turn, improves the contours. The authors in [30] argue that this property is complementary, as the generated noise creates a diffuse contour that, when combined with the original image, results in an enhanced contour. Thus, the structures and textures can be easily extracted from the scene.
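For a static image ($f_t = 0$), Equations (5)–(7) reduce to purely spatial gains that can be applied in the 2-D frequency domain, as in the following NumPy sketch. Treating the spatial frequency as radial is our reading of the 1-D equations, and the default parameter values are taken from Table 2.

```python
import numpy as np

def opl_filter(img, beta_ph=0.0, alpha_ph=1.0, beta_h=0.2, alpha_h=100.0):
    """Static (f_t = 0) version of the OPL band-pass of Eqs. (5)-(7),
    applied via the 2-D FFT of the input image."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]          # cycles/pixel, vertical
    fx = np.fft.fftfreq(w)[None, :]          # cycles/pixel, horizontal
    fs = np.sqrt(fx**2 + fy**2)              # radial spatial frequency
    f_ph = 1.0 / (1.0 + beta_ph + 2.0 * alpha_ph * (1.0 - np.cos(2.0 * np.pi * fs)))
    f_h = 1.0 / (1.0 + beta_h + 2.0 * alpha_h * (1.0 - np.cos(2.0 * np.pi * fs)))
    f_opl = f_ph * (1.0 - f_h)               # Eq. (7): band-pass contour filter
    return np.real(np.fft.ifft2(np.fft.fft2(img.astype(np.float64)) * f_opl))
```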

3.1.4. Parvocellular Response

Ganglion cells are responsible for processing information from bipolar cells before transmitting it to the lateral geniculate nucleus. In the parvocellular pathway, ganglion cells are referred to as midget cells. One approach to represent this type of neuron is by applying a logarithmic compressor, which exhibits behavior analogous to that of the applied photoreceptor model.
The compressor is expressed in Equation (8), where $C_f$ represents the compression constant and $x$ denotes the value of the pixel to be processed.
$$C_{log}(x) = C_f \cdot \log(1 + x) \qquad (8)$$
Although applying this compressor evens out the grayscale, the purpose of the parvocellular pathway is to highlight the contours. For this reason, the logarithmic compressor was implemented within the photoreceptor model of Equation (1): the neighborhood term $L(p)$ is replaced by the compressor $C_{log}(x)$, and $R(p)$ is taken as the response of the $F_{OPL}$ model. Consequently, the curve is adjusted to further enhance the contours while preserving black tones.
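Putting Sections 3.1.2–3.1.4 together, a minimal sketch of this ganglion-cell stage is given below, with $C_f = 1$ from Table 2. Rescaling the $F_{OPL}$ response into $[0, V_{max}]$ before compression is our own assumption.

```python
import numpy as np

def parvocellular_response(opl_out, c_f=1.0, v0=0.7, v_max=255.0):
    """Midget-cell stage: the Eq. (8) compressor inserted into the
    photoreceptor model of Eq. (1), as described in Section 3.1.4.
    `opl_out` is the F_OPL response, assumed already in [0, v_max]."""
    r = np.clip(opl_out, 0.0, v_max)
    c_log = c_f * np.log1p(r)               # Eq. (8): log1p(x) = log(1 + x)
    r0 = v0 * c_log + v_max * (1.0 - v0)    # L(p) replaced by C_log(x)
    return np.clip(r / (r + r0) * (v_max + r0), 0.0, v_max)
```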

3.2. Deep Learning Model

Although a large set of models is suggested in the literature, many research studies use pre-trained models adapted to their specific needs, with the demonstrated disadvantage of a slight to significant increase in processing time. Moreover, since the stated objective is alphanumeric recognition on Chilean vehicle license plates, at least 28 categories need to be detected (18 different letters and 10 numbers). Therefore, we propose the use of a simple network without prior training, which should yield an interpreter with a high percentage of accuracy and low processing time. The proposed model consists of 8 layers: the first is a 5 × 5 × 12 convolutional layer with stride 1 and no padding, followed by a ReLU activation layer and a 2 × 2 average pooling layer with a stride of 2, resulting in an image of 48 × 35 × 12. The fourth layer is a 5 × 5 × 64 convolutional layer with a stride of 1 and no padding, generating an output of 44 × 31 × 64, followed by a ReLU layer and a 2 × 2 average pooling layer with a stride of 2, generating an output of 22 × 15 × 64. After these 6 layers, a fully connected layer converts the three-dimensional volume into a one-dimensional vector of 21,120 elements feeding the 28 assigned neurons, ending with a softmax activation layer over the 28 categories. The network architecture is depicted in detail in Figure 6.
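The following PyTorch sketch reproduces the eight layers described above. The 100 × 74 single-channel input size is our inference: it is the only size for which 5 × 5 unpadded convolutions and 2 × 2 average pooling yield the stated 48 × 35 × 12, 44 × 31 × 64, 22 × 15 × 64, and 21,120-element shapes.

```python
import torch
import torch.nn as nn

# Eight-layer character classifier of Section 3.2 (sketch).
model = nn.Sequential(
    nn.Conv2d(1, 12, kernel_size=5, stride=1, padding=0),   # -> 96 x 70 x 12
    nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),                  # -> 48 x 35 x 12
    nn.Conv2d(12, 64, kernel_size=5, stride=1, padding=0),  # -> 44 x 31 x 64
    nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),                  # -> 22 x 15 x 64
    nn.Flatten(),                                           # -> 21,120 elements
    nn.Linear(22 * 15 * 64, 28),                            # 28 categories
    nn.Softmax(dim=1),
)

probs = model(torch.randn(1, 1, 100, 74))  # sanity-check the shapes
assert probs.shape == (1, 28)
```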

3.2.1. Dataset

As the objective is to detect and interpret alphanumeric characters on Chilean vehicle license plates rather than to detect license plates in a scene, a dataset suitable in format and number of elements is sought. For this reason, we use the CNN letter dataset [31], which contains 35 categories (numbers 0 to 9 and letters A to Z), with a total of 35,500 elements. Figure 7 shows an excerpt from the “0” category, displaying 20 elements at different positions, angles, and levels of focus. It should be noted that, given the focus of this work, characters that do not appear on license plates under current Chilean regulations, including the vowels and consonants such as Q and Ñ, are removed from the dataset, as sketched below. After removing the letters that will not be used, 28 categories remain, with 28,410 different images, of which 80% are used for training, 10% for testing, and 10% for validation.
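A minimal sketch of this filtering and splitting step is shown below, assuming the dataset is laid out as one folder per class; the 28 retained classes are those listed in Table 1, and the folder names and paths are hypothetical.

```python
import os
import random
import shutil

# The 28 classes valid on Chilean plates: 10 digits plus 18 consonants.
ALLOWED = set("0123456789BCDFGHJKLPRSTVWXYZ")

random.seed(0)
for cls in sorted(os.listdir("dataset")):
    if cls not in ALLOWED:
        continue                          # drop vowels, Q, Ntilde, etc.
    files = os.listdir(os.path.join("dataset", cls))
    random.shuffle(files)
    n = len(files)
    splits = {"train": files[: int(0.8 * n)],          # 80% training
              "test": files[int(0.8 * n): int(0.9 * n)],  # 10% testing
              "val": files[int(0.9 * n):]}             # 10% validation
    for split, names in splits.items():
        dst = os.path.join(split, cls)
        os.makedirs(dst, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join("dataset", cls, name), dst)
```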

3.2.2. Confusion Matrix and Performance Metrics

After training the CNN with the dataset of 28 categories, an accuracy of 99.62% is achieved. Figure 8 presents the confusion matrix of the 28 mentioned classes, and Table 1 lists the precision, recall, and F1 score obtained for each class of the trained model.
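For reference, the per-class values in Table 1 can be derived from the confusion matrix in Figure 8 with a few lines of NumPy. The sketch below assumes the common convention of rows as true classes and columns as predicted classes, which the paper does not state.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a confusion matrix
    (rows = true classes, columns = predicted classes, by assumption)."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)                     # true positives per class
    precision = tp / cm.sum(axis=0)      # TP / (TP + FP)
    recall = tp / cm.sum(axis=1)         # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```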

3.3. Spiking Neural Network

Spiking neural networks are a type of neural network inspired by the functioning of biological neural networks [32]. Unlike traditional neural networks, where information is represented by continuous values, in SNNs, information is encoded through the occurrence of discrete events called pulses or spikes. These pulses propagate throughout the neural network, transmitting information from one neuron to another [33].
The advantage of using SNNs lies in their processing time, since it is not necessary to evaluate a continuous mathematical function at all times. Instead, a set of pulses feeding a neuron is sufficient: if their accumulated effect exceeds the activation threshold, the neuron emits an activation pulse at its output.
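As an illustration of these two mechanisms, the sketch below shows one common way to turn a grayscale character image into spike trains (Bernoulli rate coding, consistent with the spike images in Figure 14, although the paper does not name its encoder) together with a single leaky integrate-and-fire update step; the decay and threshold values are our own assumptions.

```python
import numpy as np

def rate_encode(img, n_steps=25, v_max=255.0, rng=None):
    """Bernoulli rate coding: at each time step, each pixel emits a spike
    with probability proportional to its intensity."""
    rng = np.random.default_rng() if rng is None else rng
    p = np.clip(np.asarray(img, dtype=np.float64) / v_max, 0.0, 1.0)
    return (rng.random((n_steps,) + p.shape) < p).astype(np.uint8)

def lif_step(v, spikes_in, w, decay=0.9, threshold=1.0):
    """One leaky integrate-and-fire update: the membrane potential decays,
    integrates weighted input spikes, fires on crossing the threshold,
    and is reset after firing."""
    v = decay * v + w @ spikes_in
    out = (v >= threshold).astype(np.uint8)
    v = v * (1 - out)                    # reset the neurons that fired
    return v, out
```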
To facilitate the modeling, the same CNN architecture is used, adapted to spikes. Thus, the generated architecture is illustrated in Figure 9.
In this particular case, it is interesting to note that a bio-inspired retinal model as preprocessing would have an impact on the processing of an SNN since it would be expected to have highlighted edges, as well as a balance in pixel tonality, implying a higher probability of action in the region of interest.

4. Tests and Results

Three experiments are conducted to evaluate the performance of character recognition in the presence and absence of bio-inspired retinal preprocessing.
The first experiment involves extracting digits from vehicle license plates and binarizing them before applying an OCR algorithm. The second experiment utilizes the proposed CNN model for character recognition. Lastly, the third experiment employs an SNN model generated from the CNN model.
For the experiments, 40 sample images were captured using a Huawei P20 smartphone (which incorporates a 20-megapixel camera), and the samples were analyzed on a desktop computer featuring an AMD Ryzen 3700 processor, 32 GB of RAM, and an Nvidia RTX-3080 GPU.

4.1. Parameters of the Bio-Inspired Model

The parameters used in the bio-inspired retinal model were selected after applying an algorithm to determine the maximum level of coincidence. Table 2 lists the selected parameters.
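The paper does not detail this selection algorithm; one straightforward reading is an exhaustive grid search scored by the level of coincidence between recognized and ground-truth characters, as sketched below. `coincidence_score` is a hypothetical stand-in for the full retina-plus-recognizer pipeline, and the candidate values are illustrative only.

```python
from itertools import product

def coincidence_score(params):
    """Hypothetical scorer: apply the retina with `params`, run the
    recognizer over the sample plates, and return the fraction of
    characters that coincide with the ground truth."""
    raise NotImplementedError  # stands in for the OCR/CNN/SNN pipelines

grid = {"v0": [0.5, 0.6, 0.7, 0.8],
        "beta_h": [0.0, 0.2, 0.5, 1.0],
        "alpha_h": [10, 50, 100, 200]}

candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
# best = max(candidates, key=coincidence_score)  # e.g., Table 2's values
```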

4.1.1. OCR Results

As an example, Figure 10 shows the license plate to be reviewed with and without the application of a bio-inspired retina model. In this image, it is apparent that the bio-inspired retina model adjusts the luminosity and reinforces the contours of the features of interest.
The results show an improvement in the reading of the samples and present superior performance in the presence of screws or rivets that make reading difficult, and in photographs with a certain inclination. An example of the above can be seen in Figure 11, where the OCR algorithm, using detection without a retina, failed to obtain any digit or letter, whereas in the case with a retina, a license plate reading “JD GC 57” was obtained with 83% accuracy. On the other hand, for the same license plate captured from a frontal view, both approaches achieve an accuracy of 83%.
In Figure 12, an example is shown with the presence of two rivets that negatively affect the reading of the photo without a retina (0% success rate). When applying the retina, the license plate reading “JF FP 14” was obtained (another case with 83% success).
However, in Figure 13, the opposite case occurs: the result obtained without a bio-inspired retina is superior, achieving 100% success, while the retina-based approach yields 0% success. This could be attributed to the fact that the retinal model highlights edges and contours, thereby injecting more noise into the image analyzed by the OCR algorithm.
The processing time for each detection is around 50 to 60 milliseconds in the absence of the retinal model, while, when incorporating the bio-inspired retinal model, the processing time is between 55 and 63 milliseconds.

4.1.2. CNN Results

Unlike the OCR system, the nature of the CNN’s operation does not require prior binarization, unless its detection objective requires it. The trained CNN, in general, presents better results than OCR, as suggested by the literature, and with an execution time of around 150 milliseconds for processing static images.
In the case of Figure 10, where the license plate corresponds to “BJZC72”, the CNN produces only a 16% error, equivalent to one incorrect character. After applying the bio-inspired retina model, the error decreases to 0%. For Figure 11 (“JD GC 57”), the CNN model fails to detect two characters, resulting in a 33% error, which is reduced to 0% after applying the bio-inspired retina model. In the case of Figure 12, corresponding to “JF FP 14”, a 33% error is also obtained, but when applying the retina, it is only reduced to 16%. Finally, for the license plate “CTCB99”, corresponding to Figure 13, although the CNN model achieves a 0% error, the error increases to 33% after applying the retina model.

4.1.3. SNN Results

Although the SNN operates under a concept of pulses, as illustrated in Figure 14, it should be remembered that it shares the same structure as the CNN presented above. Therefore, results similar to those of the CNN are expected.
After analyzing the results, a processing time of around 50 milliseconds is observed, with a percentage of accuracy slightly lower than that of the CNN.
For the case of Figure 10, where the license plate corresponds to “BJZC72”, the SNN obtains a 33% error, equivalent to 2 incorrect characters, and after applying the bio-inspired retina model, the error decreases to 16%. In the case of Figure 11 (“JD GC 57”), the SNN model leaves 2 undetected characters, meaning a 33% error, which is reduced to 0% after applying the bio-inspired retina model. In the case of Figure 12, corresponding to “JF FP 14”, a 33% error is also obtained, but when applying the retina, it is reduced to 0%. Finally, for the license plate “CTCB99”, corresponding to Figure 13, although the error is 0% with the SNN model, after applying the retina model, the error increases to 33%. In this way, and in the cases presented, the SNN has a response capability similar to that of the CNN, as expected, but stands out with a shorter data processing time. The overall test results are presented below.

4.1.4. Global Results

As a summary of the experiment, Figure 15 illustrates the results obtained for the first 20 samples, while Figure 16 depicts the last 20 case studies. In these figures, the results obtained by the three applied techniques are compared: the error obtained by the OCR technique in the absence of the retinal model is shown in blue, and with the implementation of the bio-inspired retinal model in orange; the error rate obtained by the CNN in the absence and presence of the bio-inspired retinal model is shown in gray and yellow, respectively; and the error obtained by the SNN technique in the absence and presence of the bio-inspired retinal model is shown in light blue and green, respectively.
As the literature indicates, the deep learning method shows better accuracy when recognizing an alphanumeric character, at least for this specific task. This performance can be attributed to the data used for training the deep learning network. However, it is important to note that OCR is a powerful tool in environments with sharp images and known typography, such as advertisements, books, documents, signs, etc. When focusing attention on the comparison between SNN and CNN, it can be stated that the two of them have similar performance, and they differ in processing times, with SNN being faster than CNN as it requires less computing capacity.
Although the systems perform well in the character recognition task, it is crucial to emphasize that the introduction of bio-inspired retina preprocessing tends to improve performance in the three studied systems. This behavior particularly favors the SNN: its performance in the absence of preprocessing is slightly lower than that of the CNN, but after applying the preprocessing, it outperforms the CNN with preprocessing. It should be noted that the retina model worsened the performance of the techniques for heavily stained or dirty license plates and in the presence of large shadows crossing over the characters. This is because the retinal model highlights the contours of all elements present in the scene, thereby also highlighting undesired information.

5. Conclusions

The literature review highlighted the importance of validating results in real-world scenarios and the necessity of preprocessing images before recognition. This need led to the incorporation of bio-inspired retina models to enhance the overall experimental outcomes in the license plate recognition process. This process involved several critical stages, including the detection and extraction of the region of interest, character segmentation, and alphanumeric recognition, where an increased application of CNNs was observed due to their superior performance. However, it is essential to consider that the success of these techniques heavily depended on the availability and variability of training data.
It was expected that the bio-inspired retina model would offer an overall improvement of the images before processing them using techniques such as OCR and neural networks. This expectation was met, as the bio-inspired retina acted as a filter, accentuating specific image attributes such as contours, lighting, and motion.
In this application, utilizing the parvocellular ganglion cell response proved appropriate, and the obtained results demonstrated a reduction in the error rate. However, it is imperative to acknowledge that instances marked by image contamination, paint anomalies, or high-contrast shadows could inadvertently accentuate irrelevant elements. Consequently, the use of this model is recommended in controlled lighting environments.
Notably, parameter tuning significantly influenced the performance of the bio-inspired retina model. Consequently, meticulous tuning was performed in advance to achieve optimal results.
Regarding processing time, the OCR model took between 50 and 60 milliseconds, increasing to between 55 and 63 milliseconds when incorporating the retinal model; the CNN model had an execution time of approximately 150 milliseconds for static images, increasing to 155 milliseconds when incorporating the bio-inspired model; and the SNN, which operated under a pulse concept, showed a processing time of around 50 milliseconds, very similar to the OCR case. While the SNN was a good candidate considering processing time, it should be remembered that the CNN presented a better accuracy percentage.
In summary, the results obtained effectively addressed the challenge of reducing error rates in character detection using CNN-, SNN-, and OCR-based techniques, demonstrating that the CNN combined with the retinal model offered the best accuracy percentage, with processing times that allowed it to operate in real time.
Finally, it was concluded that the bio-inspired retinal model offers a wide range of potential applications for further exploration, including motion detection, pattern recognition, and dynamic range enhancement in images, among others.

6. Future Work

The present study focused on the identification of letters or numbers within vehicle license plates by utilizing pre-segmented images. However, extending this system to process raw, unprocessed images remains an intriguing prospect for future investigation. This would enable the application of the bio-inspired retina model in real-world scenarios, where image segmentation and license plate detection are crucial initial steps.
Another pivotal consideration for future research pertains to the integration of this bio-inspired retina model with artificial intelligence systems. Combining the model’s image enhancement capabilities with advanced machine learning techniques could potentially facilitate the recognition of numbers or letters in a more robust and efficient manner. This integration could leverage the strengths of both approaches, leading to improved accuracy and versatility in character recognition tasks.
Furthermore, exploring the applicability of the bio-inspired retina model in other domains beyond license plate recognition could unlock a wide range of potential applications. Its ability to accentuate specific image features, such as contours, lighting, and motion, could prove valuable in fields such as object detection, pattern recognition, and image enhancement for various computer vision tasks.
Additionally, investigating the model’s performance under different lighting conditions, environmental factors, and image quality levels could provide valuable insights for further refinement and optimization. This would ensure the model’s robustness and adaptability in diverse real-world scenarios.
In conclusion, the promising results obtained in this study open up future research avenues, in which the bio-inspired retina model could be integrated with advanced techniques, applied to raw image data, and explored in various domains, ultimately contributing to the advancement of computer vision and image processing technologies.

Author Contributions

Conceptualization, J.K., C.U., F.C. and R.N.; methodology, J.K., C.U., F.C. and R.N.; software, J.K., C.U., F.C. and R.N.; validation, J.K., C.U., F.C. and R.N.; formal analysis, J.K., C.U., F.C. and R.N.; investigation, J.K., C.U., F.C. and R.N.; resources, J.K., C.U., F.C. and R.N.; data curation, J.K., C.U., F.C. and R.N.; writing—original draft preparation, J.K., C.U. and R.N.; writing—review and editing, J.K., C.U., F.C. and R.N.; visualization, J.K., C.U., F.C. and R.N.; supervision, J.K., C.U. and F.C.; project administration, J.K., C.U. and F.C.; funding acquisition, J.K., C.U. and R.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

This work has been supported by the Vicerrectoría de Investigación, Innovación y Creación of the Universidad de Santiago de Chile, Chile.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations, listed in alphabetical order, are used in this manuscript:
AC: access control
ALPR: automatic license plate recognition
AOLP: Application-Oriented License Plate
ATP: adenosine triphosphate
CCA: connected component analysis
CCPD: Chinese City Parking Dataset
CNNs: convolutional neural networks
EGSA: edge-guided sparse attention
EGSANet: edge-guided sparse attention network
ELM: external limiting membrane
Faster R-CNN: faster region-based convolutional neural network
FPS: frames per second
GCL: ganglion cell layer
IBA: improved Bernsen algorithm
ILM: internal limiting membrane
INL: inner nuclear layer
IPL: inner plexiform layer
LE: law enforcement
LGN: lateral geniculate nucleus
LPD: license plate detection
MOSI-LPD: multi-oriented and scale-invariant license plate detection
OCR: optical character recognition
OKM-CNNs: optimal k-means clustering-based segmentation and convolutional neural networks
ONL: outer nuclear layer
OPL: outer plexiform layer
R-CNN: region-based convolutional neural network
R-FCN: region-based fully convolutional network
RGC: retinal ganglion cell
RP: road patrol
RPE: retinal pigment epithelium
SNNs: spiking neural networks
SSD: single-shot multibox detector
V1: primary visual cortex
V5: medial temporal area of the brain

References

  1. Khan, M.M.; Ilyas, M.U.; Khan, I.R.; Alshomrani, S.M.; Rahardja, S. License Plate Recognition Methods Employing Neural Networks. IEEE Access 2023, 11, 73613–73646. [Google Scholar] [CrossRef]
  2. Wójcik, P.; Neumann, T. A Personal Microcomputer as an Access Control Management Platform in Road Transport. Appl. Sci. 2023, 13, 9770. [Google Scholar] [CrossRef]
  3. Han, J.; Yao, J.; Zhao, J.; Tu, J.; Liu, Y. Multi-oriented and scale-invariant license plate detection based on convolutional neural networks. Sensors 2019, 19, 1175. [Google Scholar] [CrossRef]
  4. Peker, M. Comparison of Tensorflow Object Detection Networks for Licence Plate Localization. In Proceedings of the 2019 1st Global Power, Energy and Communication Conference (GPECOM), Nevsehir, Turkey, 12–15 June 2019; pp. 101–105. [Google Scholar] [CrossRef]
  5. Liang, J.; Chen, G.; Wang, Y.; Qin, H. EGSANet: Edge–guided sparse attention network for improving license plate detection in the wild. Appl. Intell. 2022, 52, 4458–4472. [Google Scholar] [CrossRef]
  6. Lee, Y.Y.; Halim, Z.A.; Wahab, M.N.A. License Plate Detection Using Convolutional Neural Network-Back to the Basic With Design of Experiments. IEEE Access 2022, 10, 22577–22585. [Google Scholar] [CrossRef]
  7. Pustokhina, I.V.; Pustokhin, D.A.; Rodrigues, J.J.P.C.; Gupta, D.; Khanna, A.; Shankar, K.; Seo, C.; Joshi, G.P. Automatic Vehicle License Plate Recognition Using Optimal K-Means with Convolutional Neural Network for Intelligent Transportation Systems. IEEE Access 2020, 8, 92907–92917. [Google Scholar] [CrossRef]
  8. Wang, Y.; Bian, Z.P.; Zhou, Y.; Chau, L.P. Rethinking and Designing a High-Performing Automatic License Plate Recognition Approach. IEEE Trans. Intell. Transp. Syst. 2022, 23, 8868–8880. [Google Scholar] [CrossRef]
  9. Huang, Q.; Cai, Z.; Lan, T. A Single Neural Network for Mixed Style License Plate Detection and Recognition. IEEE Access 2021, 9, 21777–21785. [Google Scholar] [CrossRef]
  10. Zou, Y.; Zhang, Y.; Yan, J.; Jiang, X.; Huang, T.; Fan, H.; Cui, Z. License plate detection and recognition based on YOLOv3 and ILPRNET. Signal Image Video Process. 2022, 16, 473–480. [Google Scholar] [CrossRef]
  11. Weber, M.; Perona, P. Caltech Database. 2022. Available online: https://data.caltech.edu/records/fmbpr-ezq86 (accessed on 26 March 2024).
  12. Li, J. Stanford Cars Dataset. Available online: https://www.kaggle.com/datasets/jessicali9530/stanford-cars-dataset (accessed on 26 March 2024).
  13. Ammar, A.; Koubaa, A.; Boulila, W.; Benjdira, B.; Alhabashi, Y. A Multi-Stage Deep-Learning-Based Vehicle and License Plate Recognition System with Real-Time Edge Inference. Sensors 2023, 23, 2120. [Google Scholar] [CrossRef]
  14. Arroyo, S.; García, L.; Safar, F.; Oliva, D. Urban Dual Mode Video Detection System Based on Fisheye and PTZ Cameras. IEEE Lat. Am. Trans. 2021, 19, 1537–1545. [Google Scholar] [CrossRef]
  15. Evangelista, L.G.C.; Guedes, E.B. Ensembles of Convolutional Neural Networks on Computer-Aided Pulmonary Tuberculosis Detection. IEEE Lat. Am. Trans. 2019, 17, 1954–1963. [Google Scholar] [CrossRef]
  16. Marques, L.S.; De Lima, D.A.; Tsuchida, J.E.; Fuzatto, D.C. Virtual Environment for Smart Wheelchair Simulation. IEEE Lat. Am. Trans. 2021, 19, 456–465. [Google Scholar] [CrossRef]
  17. Hernandez-Molina, E.; Ojeda-Magana, B.; Robledo-Hernandez, J.G.; Ruelas, R. Vision system prototype for inspection and monitoring with a smart camera. IEEE Lat. Am. Trans. 2020, 18, 1614–1622. [Google Scholar] [CrossRef]
  18. Machado, A.; Veras, R.; Aires, K.; Neto, L.B. A Systematic Review on Product Recognition for Aiding Visually Impaired People. IEEE Lat. Am. Trans. 2021, 19, 592–603. [Google Scholar] [CrossRef]
  19. Rosique, F.; Losilla, F.; Navarro, P.J. Using Artificial Vision for Measuring the Range of Motion. IEEE Lat. Am. Trans. 2021, 19, 1129–1136. [Google Scholar] [CrossRef]
  20. Barth, V.B.D.O.; Oliveira, R.; De Oliveira, M.A.; Nascimento, V.E. Vehicle Speed Monitoring using Convolutional Neural Networks. IEEE Lat. Am. Trans. 2019, 17, 1000–1008. [Google Scholar] [CrossRef]
  21. Khalifa, A.A.; Alayed, W.M.; Elbadawy, H.M. Real-Time Navigation Roads: Lightweight and Efficient Convolutional Neural Network (LE-CNN) for Arabic Traffic Sign Recognition in Intelligent Transportation Systems (ITS). Appl. Sci. 2024, 14, 3903. [Google Scholar] [CrossRef]
  22. Kabir, H.; Lee, H. Mask R-CNN-Based Stone Detection and Segmentation for Underground Pipeline Exploration Robots. Appl. Sci. 2024, 14, 3752. [Google Scholar] [CrossRef]
  23. Lehnert, H.; Araya, M.; Carrasco-Davis, R.; Escobar, M.J. Bio-Inspired Deep Reinforcement Learning for Autonomous Navigation of Artificial Agents. IEEE Lat. Am. Trans. 2019, 17, 2037–2044. [Google Scholar] [CrossRef]
  24. Gao, S.-B.; Zhang, M.; Zhao, Q.; Zhang, X.-S.; Li, Y.-J. Underwater Image Enhancement using Adaptive Retinal Mechanisms. IEEE Trans. Image Process. 2019, 28, 5580–5595. [Google Scholar] [CrossRef]
  25. Lehnert, H.; Mar, S. Retina-inspired Visual Module for Robot Navigation in Complex Environments. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8. [Google Scholar] [CrossRef]
  26. Moya-Sánchez, E.U.; Xambó-Descamps, S.; Sánchez, A.; Salazar-Colores, S.; Martínez-Ortega, J.; Cortés, U. A bio-inspired quaternion local phase CNN layer with contrast invariance and linear sensitivity to rotation angles. Pattern Recognit. Lett. 2020, 131, 56–62. [Google Scholar] [CrossRef]
  27. Guerrero, K. Capas de la Retina. Available online: https://quizlet.com/237940864/capas-de-la-retina-diagram/ (accessed on 5 January 2024).
  28. Urrea, C.; Kern, J.; Navarrete, R. Bioinspired Photoreceptors with Neural Network for Recognition and Classification of Sign Language Gesture. Sensors 2023, 23, 9646. [Google Scholar] [CrossRef]
  29. Hérault, J.; Durette, B. Modeling visual perception for image processing. In Computational and Ambient Intelligence: 9th International Work-Conference on Artificial Neural Networks, IWANN 2007, San Sebastián, Spain, 20–22 June 2007. Proceedings 9; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4507, pp. 662–675. [Google Scholar] [CrossRef]
  30. Benoit, A.; Caplier, A.; Herault, J. Using Human Visual System Modeling for Bio-inspired Low Level Image Processing. Comput. Vis. Image Underst. 2010, 114, 758–773. [Google Scholar] [CrossRef]
  31. Jelal, A. License Plate Digits Classification Dataset. 2021. Available online: https://www.kaggle.com/datasets/aladdinss/license-plate-digits-classification-dataset?resource=download (accessed on 26 March 2024).
  32. Eshraghian, J.K.; Ward, M.; Neftci, E.O.; Wang, X.; Lenz, G.; Dwivedi, G.; Bennamoun, M.; Jeong, D.S.; Lu, W.D. Training Spiking Neural Networks Using Lessons from Deep Learning. Proc. IEEE 2023, 111, 1016–1054. [Google Scholar] [CrossRef]
  33. Guo, W.; Yantır, H.E.; Fouda, M.E.; Eltawil, A.M.; Salama, K.N. Towards efficient neuromorphic hardware: Unsupervised adaptive neuron pruning. Electronics 2020, 9, 1059. [Google Scholar] [CrossRef]
Figure 1. Representation of human retinal anatomy [27].
Figure 2. Schematic representation of the biological connectivity utilized for modeling the outer plexiform layer.
Figure 3. Schematic representation of the biological connections for the modeling of the inner plexiform layer.
Figure 4. Photoreceptor model response for different $V_0$ values. (a) Original image. (b) Processed image with $V_0 = 0.3$. (c) Processed image with $V_0 = 0.4$. (d) Processed image with $V_0 = 0.5$. (e) Processed image with $V_0 = 0.6$. (f) Processed image with $V_0 = 0.7$. (g) Processed image with $V_0 = 0.8$. (h) Processed image with $V_0 = 0.9$. (i) Processed image with $V_0 = 1$.
Figure 5. Frequency domain response of the various stages involved in modeling the OPL. (a) Low-pass filter response $F_{ph}(f_s, f_t)$. (b) High-pass filter response $1 - F_h(f_s, f_t)$. (c) Band-pass filter response $F_{OPL}(f_s, f_t)$.
Figure 6. Structure of the neural network.
Figure 7. Collection of samples per class from the dataset.
Figure 8. Confusion Matrix.
Figure 9. Representation of the architecture of an SNN. The network consists of an input layer and a processing layer that represents the 8 layers of the CNN architecture. An example image of the digit 8 is taken from the dataset.
Figure 10. Bio-inspired retina response example. (a) Original picture. (b) Picture with bio-inspired retina model.
Figure 11. Example of response for a tilted picture. (a) Original picture. (b) Picture with bio-inspired retina model.
Figure 12. Example of bio-inspired retina result for a picture with a rivet. (a) Original picture. (b) Picture with bio-inspired retina model.
Figure 13. Example of bio-inspired retina result for a picture with dirt. (a) Original picture. (b) Picture with bio-inspired retina model.
Figure 14. Example of sample dataset used and converted to spikes. (a) Corresponds to the original input of the dataset, the letter “Y”. (b) Corresponds to the image transformed into spikes from the input image.
Figure 15. Percentage of error obtained by the OCR, CNN, and SNN techniques in the presence and absence of preprocessing by a bio-inspired retina model for the first 20 samples.
Figure 16. Percentage of error obtained by the OCR, CNN, and SNN techniques in the presence and absence of preprocessing by a bio-inspired retina model, for the last 20 samples.
Table 1. Performance metrics of the proposed model.
Class | Precision | Recall | F1 | Class | Precision | Recall | F1
0 | 0.99202128 | 0.97643979 | 0.98416887 | G | 1 | 1 | 1
1 | 0.99206349 | 0.98167539 | 0.98684211 | H | 0.98936171 | 1 | 0.99465241
2 | 0.99472296 | 0.98691099 | 0.99080158 | J | 0.99477807 | 0.9973822 | 0.99607843
3 | 0.98701299 | 0.9947644 | 0.99087353 | K | 0.99449036 | 0.99723757 | 0.99586207
4 | 0.99475066 | 0.9921466 | 0.99344692 | L | 0.99178082 | 1 | 0.99587345
5 | 0.9973545 | 0.98691099 | 0.99210526 | P | 0.98898072 | 0.99171271 | 0.99034483
6 | 0.97916667 | 0.98429319 | 0.98172324 | R | 0.9919571 | 0.99462366 | 0.99328859
7 | 0.9843342 | 0.98691099 | 0.98562092 | S | 0.99189189 | 0.98655914 | 0.98921833
8 | 0.98153034 | 0.97382199 | 0.97766097 | T | 0.98659517 | 0.98924731 | 0.98791946
9 | 0.98950131 | 0.98691099 | 0.98820446 | V | 0.98955614 | 0.9921466 | 0.99084967
B | 0.98425197 | 0.98167539 | 0.98296199 | W | 1 | 0.99723757 | 0.99861687
C | 1 | 0.99193548 | 0.99595142 | X | 0.9972067 | 0.98618785 | 0.99166667
D | 0.98087432 | 0.99171271 | 0.98626374 | Y | 0.97574124 | 1 | 0.98772169
F | 1 | 1 | 1 | Z | 0.99386503 | 1 | 0.99692308
Table 2. Parameters used in the bio-inspired retinal model.
Light Regulator | Filter $F_{ph}$ | Filter $F_h$ | Logarithmic Compressor
$V_0$ = 0.7 | $\beta_{ph}$ = 0 | $\beta_h$ = 0.2 | $C_f$ = 1
$V_{max}$ = 255 | $\alpha_{ph}$ = 1 | $\alpha_h$ = 100 |
 | $\tau_{ph}$ = 1 | $\tau_h$ = 1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

