Article

Enhancing Sustainable Transportation Infrastructure Management: A High-Accuracy, FPGA-Based System for Emergency Vehicle Classification

by Pemila Mani 1, Pongiannan Rakkiya Goundar Komarasamy 2,*, Narayanamoorthi Rajamanickam 1, Mohammad Shorfuzzaman 3 and Waleed Mohammed Abdelfattah 4,*
1 Department of Electrical and Electronic Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai 603203, India
2 Department of Computing Technology, SRM Institute of Science and Technology, Kattankulathur, Chennai 603203, India
3 Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia
4 General Subject Department, University of Business and Technology, Jeddah 23435, Saudi Arabia
* Authors to whom correspondence should be addressed.
Sustainability 2024, 16(16), 6917; https://doi.org/10.3390/su16166917
Submission received: 27 July 2024 / Revised: 7 August 2024 / Accepted: 8 August 2024 / Published: 12 August 2024
(This article belongs to the Special Issue Sustainable Transportation Infrastructure Management)

Abstract:
Traffic congestion is a prevalent problem in modern civilizations worldwide, affecting both large cities and smaller communities. Emergency vehicles tend to group tightly together in these crowded scenarios, often masking one another. For traffic surveillance systems tasked with maintaining order and enforcing laws, this poses serious difficulties. Recent developments in machine learning for image processing have significantly increased the accuracy and effectiveness of emergency vehicle classification (EVC) systems, especially when combined with specialized hardware accelerators. The widespread use of these technologies in safety and traffic management applications has led to more sustainable transportation infrastructure management. Vehicle classification has traditionally been carried out manually by specialists, a laborious and subjective procedure that depends largely on the available expertise. Furthermore, erroneous EVC might result in major operational problems, highlighting the necessity for a more dependable, precise, and effective method of classifying vehicles. Although image processing for EVC involves a variety of machine learning techniques, the process is still labor intensive and time consuming because the techniques now in use frequently fail to capture each type of vehicle appropriately. In order to improve the sustainability of transportation infrastructure management, this article places a strong emphasis on the creation of a reliable and accurate hardware system for identifying emergency vehicles in intricate contexts. The proposed system extracts ResNet50 features using a Field Programmable Gate Array (FPGA) and then optimizes them with a multi-objective genetic algorithm (MOGA). A CatBoost (CB) classifier is used to categorize vehicles based on these features. Surpassing the previous state-of-the-art accuracy of 98%, the ResNet50-MOGA-CB network achieved a classification accuracy of 99.87% for four primary categories of emergency vehicles. In tests conducted on tablets, laptops, and smartphones, it demonstrated excellent accuracy, fast classification times, and robustness for real-world applications. On average, it took 0.9 nanoseconds for every image to be classified with a 96.65% accuracy rate.

1. Introduction

According to the framework established by the Government of India, the Ministry of Road Transport and Highways has highlighted that globally, 1 million people die and 25 million suffer injuries in traffic accidents each year. In Asia alone, road traffic fatalities surged by 40% in the 18th century [1]. Unfortunately, there is a lack of awareness about the critical importance of road safety in many countries, contributing to these dangers [2]. To address these issues, road traffic monitoring technologies have been implemented to gather and analyze data on traffic conditions. This aggregated data collection aims to enhance the identification and resolution of traffic congestion, which improves road safety, plans transportation infrastructure, and provides valuable databases for commuters [3].
Vehicle monitoring is crucial due to the diverse roles vehicles play. They serve purposes such as transporting cargo and passengers, participating in sports, and responding to emergencies, which is particularly vital for saving lives during critical situations. Emergency vehicles, including ambulances, fire engines, patrol vehicles, and vehicles for dignitaries, play essential roles in society [4]. Ambulances carry advanced medical equipment and skilled professionals who provide immediate medical assistance to patients on-site, mitigating emergencies. Fire trucks are equipped with firefighting gear to extinguish fires and conduct rescue operations. Patrol vehicles maintain law and order, prevent crime, and ensure public safety. Dignitaries’ vehicles sometimes cause road closures for safety, leading to commuter delays, discomfort, and fuel waste [5].
Therefore, vehicle classification is indispensable for efficient transportation planning systems. Vehicles are categorized based on their purpose, load capacity, fuel efficiency, drivetrain, number of wheels, axles, transmission, and suspension [6]. Various methods exist for classifying vehicles based on their external and internal features. Traditional methods rely on visual classification, which can be time consuming, prone to errors, and challenging to apply when vehicles are in motion. To overcome these challenges, researchers are developing hardware-based vehicle classification systems using sensors that primarily analyze external features [7].
In 2023, the Indian government banned red beacons for dignitaries, making it harder to identify high-net-worth individual (HNWI) vehicles on busy roads [8]. This change underscores the need for effective vehicle classification systems that can differentiate emergency vehicles from others, even if they share similar colors. Identifying small vehicles in challenging conditions such as rain, fog, and occlusion presents a major obstacle for the Intelligent Transportation System (ITS) department [9]. These environmental factors obscure details and diminish visibility, posing difficulties for automated systems to precisely detect and categorize vehicles. To address these challenges, ITS mostly relies on technologies like radar, LiDAR, and sophisticated computer vision algorithms to improve detection capabilities in adverse weather and visibility conditions [10].
However, achieving consistent and reliable performance across all scenarios remains a continuous focus of research and development in autonomous and intelligent transportation systems [11]. Many researchers have proposed automated methods for vehicle classification, driven by the latest technological advancements in computer vision, image processing, and machine learning techniques [12]. These subnormal emergency vehicle classification methods were proposed mainly for two reasons: detecting small emergency vehicles on busy roads in complex environments and classifying types of emergency vehicles for ITS or road planning departments. In this study, we mainly concentrate on classifying subnormal EVs in highly complex environments; thus, we only reviewed papers relevant to classifying EVs.
The major bottleneck of the existing EVC methods is their dependence on rich features from specific vehicle images recorded at close range, usually submeter distances. Capturing recognizable vehicle images from which the precise class of transport can be determined is tedious and protracted when the primary objective is to organize an automated detection system. These methods also fail to classify vehicles from real-time traffic video [13]. Moreover, they fail when the vehicle images are recorded using various devices or at various distances. These methods work only when the input image captures a specific vehicle chosen from an eye-shot. Most of them are incapable of providing classification results at the site of investigation, as they deal with images captured at various spots while the techniques are executed on a mainframe [14]. Hence, the existing methods are unsatisfactory for practical implementation. Additional problems are their limited robustness and their time-consuming computation. Frequently used methods were evaluated on analogous data; accordingly, they failed when the images were taken using various devices. Moreover, the deep-learning-based techniques are mostly framework- and computation-heavy, making them demanding to implement on low-level computing devices such as outdated minicomputers, tablets, laptops, and mobile phones [15]. This paper proposes a network merging a pre-trained model, namely ResNet50, with MOGA and CB to classify vehicles, especially emergency vehicles. The proposed method can accurately predict the classes from vehicle images taken from a standard distance, regardless of the type of recording device. The method is fast, robust, and lightweight.
The primary contributions of this paper are outlined below:
  • The development of a network for on-site EVC from real-time videos recorded by three different devices.
  • An investigation of the deep-model MOGA-classifier-based approach for the classification of EVs.
  • An analysis of feature extraction through FPGA for EVC.
  • An ablation study to analyze the contribution of MOGA to the network.
  • An evaluation of the proposed system in terms of robustness, swiftness, and the ability to accurately identify vehicle locations, to ensure its practical utilization.

2. Related Works

In recent advancements in image processing and machine learning techniques, several methods have been implemented for EVC. These techniques are categorized as transform-based preprocessing, feature extraction, feature selection, and classification. In the existing method [12], random wheel masking by ViT, data2sec, and the YOLO model were proposed for the classification of 13 vehicle classes; that work faces the challenge of handling an imbalanced dataset with minority classes. In [14], YOLOv5 is used for vehicle detection across three datasets, but it struggles with real-time vehicle detection and blurred images. In [15,16], ResNet152 is employed for vehicle model classification using a single dataset, achieving an accuracy of 89%. Only static images are utilized in the case of an Indian vehicle dataset [17,18]. Nighttime vehicle models are recognized using GANs, but emergency vehicles are not considered under low illumination or across different classes [19,20]. Preprocessing is strengthened for vehicle identification using SVM classifiers, achieving an accuracy of 97.1%, but the classes are not identified. Machine learning, a category of artificial intelligence, is heavily dependent on computer architecture, as it learns from a database to improve performance by enhancing software [21,22].
Deep convolution and genetic algorithm methods were used to classify vehicles with 8 K images from an in-house dataset, achieving an accuracy of 99.7% [23,24]. Sensor- and image-based vehicle make and model classification was performed on three datasets [25,26]. Data processing was performed with the help of cameras, radars, and LiDAR in a machine model [27,28]. The VTID2 dataset contains nearly 2000 images of vehicle types, but it comprises only cars [29,30]. Work carried out using the ReliefF feature method required 11 elements but achieved a gain of 99.01% over two datasets [31,32]. In one paper, MEMS-based vehicle counting was performed based on model and fuel type, with a gain of less than 91% [33,34]. Deep learning techniques with fine-tuning for vehicle classification face challenges in small object size, spatial correlation, and multitask optimization [35,36]. Vehicles are identified and recognized in real time, but the techniques used failed to classify them and achieved an accuracy of only 85.8% [37,38]. Prior knowledge about vehicles and on-road tests with unconstrained vehicle movement are needed to classify vehicles; an experiment utilizing DAS eventually reached 97% accuracy [39,40]. Real-time vehicles can be classified by ensemble classifiers such as XGBoost [41], CB [42], AdaBoost, and others. Optical sensors are used for vehicle classification in some studies in the literature [43,44]. Vehicles are classified using ensemble models based on license plates, but temporary plates are excluded [45,46].
The multi-class classification of vehicles was performed using a double ensemble model, achieving 97% accuracy [47,48]. Ensemble methods and GA are used for classifying types of vehicles [49,50]. In cancer detection, gradient and GA methods are employed [51,52]. For performance prediction, ensemble and multi-objective optimization techniques are utilized [53,54]. In the case of gas consumption prediction, CB and meta-heuristic algorithms are considered [55,56]. Machine learning and optimization techniques are applied for feature selection and data processing [57,58]. Vehicles were identified using the Vivado System Generator with a parallel runtime of 1.48 ms, achieving an accuracy of 97.84% [59,60]. The EP4CE6E22C8 is used for classification of static images [61,62]. With the help of an FPGA, optimization techniques such as im2col + GEMM and the Winograd algorithm achieve a throughput of 168.72 fps [63,64]. Features were extracted by an Alveo U200 Data Center Accelerator Card and a Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit [65,66]. Finally, embedded-AI-based object detection and classification has a fault-tolerance system, but its long-term operation should be continuously monitored [67,68].
These studies are highly focused on building an optimal and practical application for emergency vehicle classification. CB and similar traditional machine learning techniques offer computational efficiency compared to complex deep models. CB can handle missing values with fast model application, although overfitting remains a challenge that can be managed. CB models support numerical, categorical, text, and embedded features [69]. They are suitable for emergency and non-emergency vehicle classification and have motivated many scientists to deploy CB for classifying vehicles using various features. However, these techniques fail when the decision boundary is complex with numerous vehicle classes. Therefore, this paper integrates FPGA-based feature extraction capabilities with CB’s ability to create a simple decision boundary for an optimal method. These features are further processed using MOGA to select suitable features for predicting classes with CB. The combination with MOGA further enhances the system’s accuracy. The features are extracted without learnable parameters from a pre-trained model; parameters are only included in the CB and MOGA modules, making the system more compact and faster in computation.

3. Methodology

In this paper, we employed emergency vehicle classification for ITS applications based on CatBoost (CB) to predict whether a vehicle crossing a toll plaza or busy road will require the allocation of specific roads. The overall framework of the proposed technique consists of four stages, as shown in Figure 1. First, preprocessing is performed by subtracting the background, evaluating image quality, normalizing the color, and resizing the images in the dataset. Second, we conduct feature extraction using an FPGA implementation. Subsequently, MOGA is used for feature selection, and finally, CB is used to classify the vehicle classes. The trained models were evaluated on a test set using evaluation metrics. The following subsections describe these primary steps in detail.
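To make the data flow concrete, the following minimal Python sketch wires the four stages together. The function names and the 2048-element feature vector are hypothetical placeholders standing in for the components described in the rest of this section, not the authors' implementation.

```python
import numpy as np
from catboost import CatBoostClassifier

# --- Hypothetical stand-ins for the stages described in this section ---
def preprocess_image(img):
    # Stage 1: background subtraction, quality check, color normalization, resizing
    return np.asarray(img, dtype=np.float32)

def extract_features_fpga(img):
    # Stage 2: stand-in for the FPGA-accelerated CNN feature extractor (Section 3.3)
    return img.reshape(-1)[:2048]

def select_features_moga(features, mask):
    # Stage 3: keep only the feature subset selected offline by MOGA (Section 3.4)
    return features[mask]

def classify_vehicle(img, mask, clf: CatBoostClassifier):
    # Stage 4: CatBoost predicts one of the four emergency-vehicle classes (Section 3.5)
    x = select_features_moga(extract_features_fpga(preprocess_image(img)), mask)
    return clf.predict([x])[0]
```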

3.1. Vehicle Dataset

In this study, with the cooperation of the Indian government, a dataset of 7538 emergency vehicle images was prepared, comprising 1500 VIP/HNWI, 2500 police, 2000 ambulance, and 1538 fire engine images. These images were collected from five distinct places in Chennai and captured using a handheld device with a 6.2-inch FHD+ Dynamic AMOLED 2X display and 120 Hz adaptive refresh rate, ANPR/LPR cameras with AI-powered firmware and 850 nm IR illumination rated for −40 °C to +55 °C, and a Hikvision network camera with a resolution of 3840 × 2160, 150 ft night vision range, 120 dB dynamic range, 16× zoom, 60 fps at 1080p, an RJ-45 Ethernet port, IP66 weather resistance, and a 256 GB SD card. Figure 2 shows sample images of the various classes. Initially, 8989 images were captured, and images with pixel values of 5 and 255 (severely under- or over-exposed) were eliminated after examining the quality of the images.
To assess blur and image quality, no-reference methods were utilized, in which edge spread and distortion were computed. The images were then examined and analyzed by an image processing expert, and those that satisfied the image quality criteria were considered for the next level of processing. This process was performed to ensure the authorized classification of each image as ground truth. In the quality evaluation, 462 images were discarded, resulting in the 7538 images of this dataset, as shown in Table 1.

3.2. Architecture for Proposed System

In this work, we evaluated four emergency vehicle categories across 12 classes, including 2- and 4-wheeler vehicles under low-illumination conditions, and propose an AI model architecture to classify them. Multi-class classification is predictably the most challenging setting, as assigning a class label to a new image among many classes is more complex than making a similar decision with fewer classes, and the computational complexity of multi-class classifiers is also greater. Another challenge is handling imbalanced data: in multi-class classification, where some classes are rare and others are common, the model may easily learn the common classes while neglecting the rare ones. Finally, selecting the optimal model is another critical task. There is no one-size-fits-all solution; the pros and cons vary among AI models depending on the available data and functions. Therefore, in this article, we conducted a comprehensive search to identify the best AI-powered framework for classifying emergency vehicles into one of the 12 classes. We undertook rigorous data collection and preparation, network tuning, feature extraction and selection, model evaluation, classifier selection, and finally, evaluation of the selected framework. To address the issue of imbalanced datasets, the classes with fewer images were oversampled to stabilize the class distribution, as sketched below.
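The oversampling step can be illustrated with a short, generic sketch; random duplication of minority-class samples is assumed here, although the paper does not specify the exact oversampling scheme.

```python
import numpy as np

def oversample_minority(X, y, seed=0):
    """Randomly duplicate samples of minority classes until every class
    has as many samples as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_out, y_out = [X], [y]
    for cls, cnt in zip(classes, counts):
        if cnt < target:
            idx = np.flatnonzero(y == cls)
            extra = rng.choice(idx, size=target - cnt, replace=True)
            X_out.append(X[extra])
            y_out.append(y[extra])
    return np.concatenate(X_out), np.concatenate(y_out)
```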
Network accuracy, time consumption, and compatibility were considered when choosing the optimal network. Using cross-validation, model accuracy was calculated. Random search was employed to explore high-dimensional hyperparameter spaces and optimize various models, with parameters selected for the best performance. In this work, we combined various CNN models with traditional ones. CNN models are computationally intensive and require a large number of images to address multi-class problems. In contrast, FPGA-based models require less training time and achieve high accuracy when trained on smaller datasets. However, the performance of ML models using FPGA depends on optimally selected features. Therefore, MOGA was used to train ML models by selecting optimal features.
The purpose of MOGA is to enhance accuracy, interpretability, computational efficiency, and robustness. In this study, we propose a design paradigm where the CNN model is accelerated by FPGA-based feature extraction, whose output is transformed using the MOGA method to handle high-dimensional datasets by selecting optimal features that balance competing objectives and generate a Pareto front. Finally, the classifier model utilizes these features to predict the output classes. Pre-trained weights are used to minimize computation and time consumption. This design paradigm reduces training time and system resources. Importantly, it achieves high accuracy and peak robustness. Figure 3 illustrates the design paradigm of the proposed system. The proposed system can classify vehicles from recorded low-illumination and suboptimal vehicle images. Once the images are recorded, their quality is tested. If the image quality meets the standard, the images undergo color correction to fix the colors in the footage, followed by normalization. After color normalization, the images are resized according to the CNN model’s requirements. The image quality is assessed using the Feature Similarity Index Measure (FSIM). In this process, feature maps are extracted and the similarity between two images is measured using Phase Congruency (PC) and Gradient Magnitude (GM). PC detects features in a way that is invariant to illumination and emphasizes points of high phase agreement in the frequency domain.
In the case of GM, a convolution mask is applied as the gradient operator. If y is the image and H_y and H_z are its horizontal and vertical gradients, respectively, the gradient magnitude is given by

GM(y) = \sqrt{H_y^2 + H_z^2}

The quality of an image is calculated from the similarity between two images. Let G1 (test) and G2 (reference) be the two images, with PC maps denoted by QD1 and QD2, respectively, extracted from each image. The PC similarity term of FSIM is then

T_{QD} = \frac{2\, QD_1\, QD_2 + U_1}{QD_1^2 + QD_2^2 + U_1}

where U_1 is a positive constant that increases stability and is calculated from the range of PC values; the GM similarity term T_H is defined analogously from the gradient magnitudes H_1 and H_2. The combined similarity is

T_M(y) = [T_{QD}(y)]^{\alpha}\, [T_H(y)]^{\beta}

where the relative constants α and β adjust the weights of the PC and GM terms. With this normalization, FSIM lies between 0 and 1. Images whose pixel values are saturated at 5 or 255 are eliminated from further processing. Following these pre-assessment processes, the FPGA-based feature extraction and the CB-MOGA stages determine the classes of the vehicles. The following subsections detail the selection of an appropriate model for feature extraction and image classification.
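As a rough illustration of this quality check, the sketch below computes the gradient-magnitude term with Sobel masks and combines it with precomputed phase-congruency maps using the similarity terms above. It is an approximation of FSIM under assumed constants, not the exact implementation used in this work.

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_magnitude(img):
    # GM(y) = sqrt(Hy^2 + Hz^2) with Sobel masks as the gradient operator
    hy = sobel(img, axis=0, mode="reflect")
    hz = sobel(img, axis=1, mode="reflect")
    return np.hypot(hy, hz)

def similarity_map(a, b, u):
    # T = (2*a*b + u) / (a^2 + b^2 + u), the FSIM-style similarity term
    return (2 * a * b + u) / (a ** 2 + b ** 2 + u)

def fsim_like(test, ref, pc_test, pc_ref, u1=0.85, u2=160.0, alpha=1.0, beta=1.0):
    """Approximate FSIM between a test and a reference image.
    pc_test / pc_ref are precomputed phase-congruency maps (assumed given)."""
    t_qd = similarity_map(pc_test, pc_ref, u1)            # PC similarity
    t_h = similarity_map(gradient_magnitude(test),
                         gradient_magnitude(ref), u2)     # GM similarity
    t_m = (t_qd ** alpha) * (t_h ** beta)
    pc_max = np.maximum(pc_test, pc_ref)                  # PC-based weighting
    return float((t_m * pc_max).sum() / pc_max.sum())
```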

3.3. Feature Extraction Using CNN Based FPGA

Vehicle image classification was performed using an FPGA board, a film camera, and a computer, as shown in Figure 4. Based on validation, the images are transferred from the computer to the FPGA, and the classification results are sent back to the computer. This approach allows for processing a larger dataset of images in a short period, compared to capturing images from portable devices. The vehicle classifier can function as a standalone visual classifier when the image is transferred to the camera input mode on the FPGA. With the interface of capturing devices, the FPGA learns the stream of data frames and stores them in RAM. During the debugging process, the CNN output is transferred to the computer. When the vehicle is identified using the standalone classifier with images from the capturing device, the results of vehicle identification in the frames are periodically stored in the OSPI flash memory chip on the FPGA board. If a standalone computer is connected to the FPGA, the classification results are transferred live via the serial port. All the results are stored in memory and sent to the standalone computer via Ethernet.
The training datasets gathered vehicle images from three portable devices on the NH road near SRM, at the Guduvanchery signal, and at toll plazas (Perungudi, Porur, and Chengalpattu). The raw images were captured at 4 frames per second from 0.5 m above the ramp, and additional images were captured at 3 fps from 0.4 m above at 1080p resolution under low-light conditions. A region of interest (ROI) is defined to detect vehicles on the road, and it determines the peak resolution of the images fed to the CNN in the experiments. The images are labeled as class or non-class according to the ROI: if a vehicle falls inside the ROI, it is labeled as a class, otherwise as a non-class. In the experimental stage, the effect of image resolution on classification accuracy was examined; lower resolutions allow the classifier to run faster. We used 80% of the images for training and 20% for testing. Training was performed with a gradient algorithm using a momentum of 0.8 and a batch size of 64 images. The CNNs were trained until they reached a minimum on the training curve. Training was carried out in MATLAB R2020b on an RTX 2060 graphics processing unit.
To validate the proposed FPGA-based CNN for vehicle classification, we used the Xilinx Zynq UltraScale+ MPSoC ZCU102 for the performance evaluation. The CNN accelerator was implemented in VHDL, and the software running on the ARM processor was written in C using the Xilinx SDK tool; 56% of the logic resources and 37% of the on-chip memory were utilized. The implementation uses 8- and 16-bit fixed-point precision. The convolutional kernel was computed at less than 60 MHz compared to other models. The 60 MHz frequency for the convolutional layer was chosen to sustain two synchronous DMA streams of feature maps into the stage and two out of memory with a 100% utilization rate. The cost-efficient configuration of the CNN models was trained on large datasets, partly drawn from the vehicle dataset. Classification accuracy was higher for the FPGA convolution network with higher DSP occupation. Training and test images were chosen randomly from the large dataset in a 4:1 ratio, so that 80% and 20% of the images from every vehicle class were allocated to the training and test datasets, respectively.
During the experimentation stage, we trained several networks with images of various resolutions. At the initial stage, we trained the CNN on the large dataset with raw images of 1060 reduced to 256; these were then downsampled 2, 4, and 8 times and compared on classification accuracy. The optimal number of feature maps in every CNN layer needed to be a multiple of 8 owing to the architecture of the CNN models. Convolutional layers, batch normalization, ReLU, and max pooling are the major components of the CNN models. The number of neurons in the output dense layer is fixed at 4, corresponding to the classes.
The number of neurons in the dense layers varies from 8 to 64. Increasing the size of the first dense layer does not boost classification accuracy; instead, it reduces performance during the training stage. We found that 5 convolutional layers were optimal for classifying the raw 254-pixel images with 92% accuracy. Adding more convolutional layers did not improve classification accuracy, so we downsampled the data 2 to 4 times to achieve 93%. Increasing the number of kernels did not improve the classification rate either, as 32 kernels performed the same as 16 per layer. Using 2 or 3 convolutional layers resulted in 83% accuracy with 8-times downsampling, indicating that reducing the image size to less than 32 pixels is too low for detecting vehicle images. Table 2 provides the performance results of the FPGA-based CNN, and Figure 4 and Figure 5 show the simulation results of the FPGA-based CNN.
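For reference, a compact model of the kind described above (five blocks of convolution, batch normalization, ReLU, and max pooling, feature-map counts in multiples of 8, and a 4-neuron output layer) could be sketched in PyTorch as follows; the layer widths and the 64 × 64 input size are illustrative assumptions rather than the exact configuration deployed on the FPGA.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # One block: convolution -> batch normalization -> ReLU -> max pooling
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class VehicleCNN(nn.Module):
    """Five convolutional blocks; feature-map counts are multiples of 8."""
    def __init__(self, num_classes=4):
        super().__init__()
        widths = [8, 16, 16, 32, 32]           # illustrative, multiples of 8
        blocks, in_ch = [], 3
        for w in widths:
            blocks.append(conv_block(in_ch, w))
            in_ch = w
        self.features = nn.Sequential(*blocks)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(in_ch, num_classes))

    def forward(self, x):
        return self.head(self.features(x))

if __name__ == "__main__":
    model = VehicleCNN(num_classes=4)
    dummy = torch.randn(1, 3, 64, 64)          # assumed 64x64 RGB input
    print(model(dummy).shape)                  # torch.Size([1, 4])
```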

3.4. Feature Selection Using Multi Objective Genetic Algorithm

Feature selection is a significant task for developing accurate and explicit classification models. To address the associated optimization issues, many researchers use Genetic Algorithms (GA) for feature selection. Typically, feature selection involves multi-objective tasks with competing objectives such as subset size, predictive power, and feature subset redundancy. Therefore, the natural choice is to use a Multi-Objective Genetic Algorithm (MOGA) to handle these challenges.
Genetic algorithms are commonly used for single-objective optimization issues. The objective of multi-objective optimization problems is to find the best trade-offs among multiple objective functions that often conflict with each other. It is difficult to achieve a single optimal solution for multi-objective optimization without interaction with the decision-maker during iterations. The best approach is to present the decision-maker with Pareto optimal solutions. To explore the entire Pareto optimal solution set, all individuals are kept in each generation. Thus, MOGA is used to explore different directions of the multi-objective functions. Initially, the selection procedure is performed by fusing all the multiple objective functions into a scalar fitness value using the weighted sum approach. The next pair of strings is then selected with various weight values. During the execution of MOGA, a random set of Pareto optimal solutions is saved and used for the next generation. MOGA first creates an initial population with a number of strings in each population.
Then, the objective function values are calculated for each created string, yielding a tentative set of Pareto optimal solutions. After this evaluation, random weights are assigned and the weighted fitness value f(x) of each string is calculated. Pairs of strings are then selected from the existing population, with selection probability

Q(x) = \frac{f(x) - f_{\min}(\Phi)}{\sum_{x \in \Phi} \left( f(x) - f_{\min}(\Phi) \right)}

where Φ denotes the current population and f_min(Φ) is its minimum fitness value. After each pair is selected, a crossover operation is applied to form a pair of new strings. Following crossover, mutation is applied with a given mutation probability. A few elite strings are removed from the population and replaced with strings from the stored set of Pareto optimal solutions. If the stopping conditions are not met, the process is repeated from the evaluation step. MOGA finally provides the set of Pareto optimal solutions to the decision-maker, and the best solutions are nominated according to the decision-maker's preferences. Thus, the genetic operations involving crossover and mutation are chosen based on the feature-handling problem.
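A toy version of this procedure is sketched below: two objectives (classification accuracy and fraction of features removed) are fused with random weights each generation, parents are drawn with the selection probability Q(x) defined above, and non-dominated feature masks are archived as the Pareto set. The population size, mutation rate, and objective definitions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_fitness(objs, weights):
    # objs: (n_individuals, n_objectives), oriented so that higher is better
    return objs @ weights

def selection_probabilities(fitness):
    # Q(x) = (f(x) - f_min) / sum_x (f(x) - f_min), as in the text above
    shifted = fitness - fitness.min()
    total = shifted.sum()
    return np.full_like(shifted, 1.0 / len(shifted)) if total == 0 else shifted / total

def moga_feature_selection(evaluate_accuracy, n_features, pop_size=30,
                           generations=40, p_mut=0.02):
    """Toy multi-objective GA: maximize accuracy, minimize feature count.
    `evaluate_accuracy(mask)` must return a score for a boolean feature mask."""
    pop = rng.random((pop_size, n_features)) < 0.5
    archive = []                                    # Pareto-optimal masks
    for _ in range(generations):
        acc = np.array([evaluate_accuracy(m) for m in pop])
        frac_removed = 1.0 - pop.mean(axis=1)       # more removed -> better
        objs = np.stack([acc, frac_removed], axis=1)
        # keep non-dominated individuals in the archive
        for i, m in enumerate(pop):
            if not any((o >= objs[i]).all() and (o > objs[i]).any() for o in objs):
                archive.append((m.copy(), objs[i].copy()))
        # random weights each generation explore different Pareto directions
        w = rng.random(2)
        w = w / w.sum()
        prob = selection_probabilities(weighted_fitness(objs, w))
        parents = pop[rng.choice(pop_size, size=pop_size, p=prob)]
        # one-point crossover and bit-flip mutation
        children = parents.copy()
        for a, b in zip(children[0::2], children[1::2]):
            cut = rng.integers(1, n_features)
            a[cut:], b[cut:] = b[cut:].copy(), a[cut:].copy()
        children ^= rng.random(children.shape) < p_mut
        pop = children
    return archive
```

In practice, evaluate_accuracy could cross-validate the CB classifier on the masked feature matrix, so that the archive directly trades accuracy against feature count.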

3.5. Deciphering CatBoost: Theoretical Principle and Innovative Classification Solutions for Modern Problems

In 2017, Yandex, along with a group of engineers, developed a model called the CatBoost (CB) algorithm. This algorithm addresses complex parameters, fluctuating data, and non-homogeneous variables using a machine learning tool based on the boosting method. This gradient boosting approach reduces the need for extensive hyperparameter tuning by minimizing data overfitting. The CB algorithm is employed as a machine learning tool for training data in vehicle classification, enhancing its performance, reliability, and autonomous handling of categorical features more effectively than other methods like KNN or ANN.
The system requires rigorous data preprocessing to translate vehicle categories into numeric codes, enabling precise analysis and accurate results. By combining gradient boosting decision trees with categorical features, this approach leverages categorical variable encoding to counteract gradient bias and predict changes in the data. The feature conversion process is accompanied by the modification of the target variable model, with weights added to individual samples to influence prediction. For example, consider the vehicle dataset D, given by

D = \{(A_l, B_l)\}_{l = 1, \dots, n}

where the feature vector of the l-th sample is

A_l = (a_{l,1}, \dots, a_{l,o})

The pairs (A_l, B_l) are independent and identically distributed according to some unknown distribution Q(·,·). The aim of training is to learn a function

G : \mathbb{R}^o \rightarrow \mathbb{R}

that minimizes the expected loss

M(G) = \mathbb{E}\,[\, M(b, G(a)) \,]

where M(·,·) is a smooth loss function and (a, b) is a test example sampled from Q independently of the training set D. Gradient boosting builds a sequence of approximations

G^{u} : \mathbb{R}^o \rightarrow \mathbb{R}, \quad u = 0, 1, \dots

in an additive manner from the previous approximation G^{u-1}:

G^{u} = G^{u-1} + \varphi\, i^{u}

where φ is a step size and i^{u} : \mathbb{R}^o \rightarrow \mathbb{R} is a base predictor chosen from a family I to minimize the loss

i^{u} = \arg\min_{i \in I} M(G^{u-1} + i) = \arg\min_{i \in I} \mathbb{E}\,[\, M(b,\, G^{u-1}(a) + i(a)) \,]

The minimization problem is approached by a Newton step using a second-order approximation of M(G^{u-1} + i) at G^{u-1}, or by taking the negative gradient step, so that i^{u}(a) approximates −g^{u}(a, b), where

g^{u}(a, b) = \left. \frac{\partial M(b, t)}{\partial t} \right|_{t = G^{u-1}(a)}

and hence

i^{u} = \arg\min_{i \in I} \mathbb{E}\,\big[\, (-g^{u}(a, b) - i(a))^{2} \,\big]

Decision trees are built by splitting on whether an attribute a_l exceeds some threshold t_h, i.e., c = \mathbb{1}\{a_l > t_h\}, where a_l is either a numerical or a binary feature; in the latter case t_h = 0.5. Each leaf of the tree is assigned a value, an estimate of b in the region for a regression task or a predicted class label in the vehicle classification problem. Hence, a decision tree i can be written as

i(a) = \sum_{k=1}^{K} c_k\, \mathbb{1}\{a \in S_k\}

where the S_k are the disjoint regions corresponding to the leaves of the tree. The CB classifier directs the comprehensive vehicle dataset through the training process. During the transformation of the vehicle image characteristics, the target variable is computed for each sample, followed by the assignment of weights and priorities.
Notably, this classifier efficiently handles minimal data features, manages missing values, and interprets categorical variables. The output of the vehicle classification model is evaluated based on its accuracy, which is used as a benchmark of the model's performance. Figure 6 and Figure 7 provide the simulation results for training and testing images of subnormal EVC under low-illumination conditions, annotated with the class names.
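Concretely, training the CB stage on the MOGA-selected feature vectors might look like the following sketch; the feature matrix here is random stand-in data and the hyperparameter values are illustrative defaults, not those tuned by the authors.

```python
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

# X: MOGA-selected feature vectors, y: emergency-vehicle labels
# (random stand-in data for illustration only)
X = np.random.rand(1000, 256)
y = np.random.choice(["ambulance", "fire_engine", "police", "vip_hnwi"], 1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                          random_state=42)

clf = CatBoostClassifier(
    iterations=500,            # number of boosted trees
    depth=6,                   # tree depth
    learning_rate=0.1,
    loss_function="MultiClass",
    verbose=False,
)
clf.fit(X_tr, y_tr, eval_set=(X_te, y_te))
print("test accuracy:", clf.score(X_te, y_te))
```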

3.6. Network-Driven Model Evaluation and Selection: A Comprehensive Framework

In this study, we conducted an extensive search to identify suitable networks for the proposed design paradigm. We tested three different CNN models and four ML models, incorporating MOGA to determine the optimal network. For FPGA-based feature extraction, the performance of the deep models VGG16, VGG19, and ResNet50 was investigated. Typically, the CNN model operates on the lower layers, specifically the convolutional and pooling layers, while the extraction is performed by the FPGA.
The convolutional base extracts the features, while the top layers of the model, which contain dense layers and dropout, predict the class from the furnished features. The chief merit of CNN models is their ability to extract reliable features for class prediction, which are hard to obtain with independent feature selection techniques. Therefore, we utilized the FPGA-accelerated convolutional-base features of the CNN models in the design paradigm. We used pre-trained weights for the convolutional base, obtained from the Roboflow Universe dataset, which allowed us to skip training the convolutional base on our data. These features are then processed by MOGA to find the optimal feature subset.
Then, these features were combined with four different ML configurations: SVM, CB, MOGA, and CB-MOGA. For each FPGA-based CNN model, we investigated the performance with MOGA and with each ML model. This resulted in 12 networks being trained and tested on our dataset to determine the best one.
We performed a hold-out test and 10-fold cross-validation with the proposed network. In the hold-out testing, 60% of the images were used for training, and 20% were used for testing and validation. In the 10-fold cross-validation, the 7538 images were divided into 10 groups. The images were randomly distributed among the groups, with each group containing 700 images, or 100 images per class. We also conducted an ablation test to investigate the role of MOGA. Additionally, we examined the time requirements for the selected network.
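The hold-out split and the 10-fold protocol can be reproduced with standard scikit-learn utilities, as in the generic sketch below (random stand-in features and labels; the 60/20/20 split and fold count follow the description above).

```python
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_score
from catboost import CatBoostClassifier

X = np.random.rand(7538, 128)                  # stand-in feature matrix
y = np.random.randint(0, 4, size=7538)         # four emergency-vehicle classes

# Hold-out: 60% train, 20% validation, 20% test
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.4, stratify=y,
                                            random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5,
                                            stratify=y_tmp, random_state=0)

clf = CatBoostClassifier(iterations=200, loss_function="MultiClass", verbose=False)
clf.fit(X_tr, y_tr, eval_set=(X_val, y_val))
print("hold-out test accuracy:", clf.score(X_te, y_te))

# 10-fold cross-validation
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(
    CatBoostClassifier(iterations=200, loss_function="MultiClass", verbose=False),
    X, y, cv=cv)
print(f"10-fold accuracy: {scores.mean():.4f} ± {scores.std():.4f}")
```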

4. Results and Analysis

In this work, we evaluated the performance of 12 distinct networks, created using 3 FPGA-based pre-trained CNNs and 4 ML models with MOGA. Each CNN model was paired with MOGA and one of the four ML models, resulting in four networks for each CNN model. Initially, we selected the best candidate network for each CNN model based on evaluation metrics such as accuracy, precision, recall, and F1-Score using the test data. Next, the best candidate networks from the three CNN models were cross-referenced based on their average accuracy in cross-validation experiments to determine the best network for the proposed architecture.
Subsequently, we compared the proposed method with existing techniques, including the pioneering methods, to validate its efficiency. We also investigated the performance of these 12 networks without MOGA-based feature selection in an ablation test to highlight the role of MOGA in the network. The ablation test revealed that incorporating MOGA enhances network accuracy. Finally, the performance of the proposed system was examined for practical applications. The results of these experiments are illustrated in the following sections.

4.1. Optimal Model Identification: A Systematic Evaluation and Selection Framework

Initially, we compared the performance of the 12 distinct MOGA-incorporated networks on the unobserved test data in terms of the evaluation metrics accuracy, precision, recall, and F1-score. This comparison revealed that the best candidates for VGG16, VGG19, and ResNet50 are VGG16-MOGA-SVM, VGG19-MOGA-CB, and ResNet50-MOGA-CB, respectively. From Figure 8, it can be seen that ResNet50-based networks performed better than the other CNN models regardless of the type of ML classifier. The VGG16 models gained their highest accuracy of 98.78% when integrated with the SVM classifier. VGG19 and ResNet50 achieved their peak accuracies of 98.98% and 99.87%, respectively, with the CB classifier. Out of the three CNN models, the best performance was achieved when CB was incorporated as the classifier of the network.
Furthermore, in our ablation testing, we also observed that the CB classifier yielded the best performance in the absence of MOGA for most of the CNN models. This demonstrates the compatibility of CB with the network. Our experimental results indicate that the CB classifier performs significantly better than other classifiers in CNN models, especially when trained with the generated optimal features. This finding was compared to results reported for light vehicle classification. Subsequently, we performed 10-fold cross-validation for the top candidate networks. For the experiment, we used 7538 images and distributed them randomly into ten groups, with each group containing 150 VIP/HNWI, 250 police, 200 ambulance, and 153 fire engine vehicle images. In the cross-validation experiment, the average accuracy for the network model was calculated. This involved dividing the dataset into 10 subsets. The network was trained and tested 10 times, with each network using various subsets of images for training and the remaining data for testing.
The demonstration confirms the robust and reliable performance of the proposed system. In the 10-fold cross-validation experiment, ResNet50-MOGA-CB achieved an average accuracy of 99.48 ± 0.14, surpassing the rest of the networks, as shown in Table 3. The next highest accuracy was 97.24 ± 0.51 for the VGG19-MOGA-CB network. Therefore, ResNet50-MOGA-CB was nominated for the proposed system. Figure 8 also shows ResNet50-MOGA-CB outperforming all the other networks in the evaluation metrics (accuracy, precision, recall, and F1-score) on the test data. In addition, confusion matrices were plotted for the networks, as highlighted in Figure 9, and the receiver operating characteristic (ROC) curves for these networks are shown in Figure 10. The average area under the curve (AUC) value was 1, showing the 100% accuracy of the network. The confusion matrix, AUC, and ROC likewise favor the ResNet50-MOGA-CB network over the other networks.
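The metrics reported in this section (accuracy, precision, recall, F1-score, confusion matrix, and ROC/AUC) can be computed as in the following sketch, assuming predicted labels and per-class probabilities are available from the chosen network.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             confusion_matrix, roc_auc_score)

def evaluate(y_true, y_pred, y_proba, class_names):
    """Report the metrics used in this section for a multi-class classifier."""
    acc = accuracy_score(y_true, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    cm = confusion_matrix(y_true, y_pred, labels=range(len(class_names)))
    # One-vs-rest AUC averaged over the classes (y_proba: shape (n, n_classes))
    auc = roc_auc_score(y_true, y_proba, multi_class="ovr", average="macro")
    print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} "
          f"F1={f1:.4f} AUC={auc:.4f}")
    print("confusion matrix (rows = true, cols = predicted):")
    print(cm)
```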
Ultimately, we compared the results of the proposed ResNet50-MOGA-CB network with existing methods, as displayed in Table 4. The proposed technique outperforms the existing ones in average accuracy. This technique was applied to emergency vehicle images and does not depend on peculiar images. Above all, this swift technique can predict the class in 0.16 s, outpacing the other networks. The method employs vehicle images and is independent of any particular vehicle image. The table also shows that the network achieved greater accuracy even though the number of classes is larger than in the rest of the approaches. Standalone FPGA-based CNN methods were also found to achieve high accuracy.
However, they need massive numbers of images for training. In contrast, ML models such as SVM and CB performed best when trained using the nominated features fetched from fewer images. Furthermore, CB has fewer parameters than the CNN models and a smaller weight footprint. The proposed technique employs FPGA-based CNN models for feature extraction with no learnable parameters, and the CB-based classifier and MOGA have few parameters; thus, this network is lightweight compared to CNN models.

4.2. Insights into Multi-Objective Genetic Algorithms through Componential Analysis

Massive machine learning models are complex to understand; hence, ablation tests were carried out to determine their performance. The performance of the network was examined by detaching the MOGA-based feature selection to assess its contribution. With this goal in mind, we trained and validated each of the 12 networks without the MOGA module using the same vehicle datasets. Finally, we compared the operational efficacy of these networks in terms of F1-score, recall, precision, and average accuracy, as provided in Table 5. The table shows that the operational efficacy of the networks increases when MOGA is included for feature selection. Some representative results of the ablation study for the multi-objective genetic algorithm are listed below:
  • Feature reduction:
    - Removed 30% of features: accuracy improved by 5%.
    - Removed 50% of features: accuracy improved by 10%.
    - Removed 70% of features: accuracy degraded by 5%.
  • Accuracy improvement:
    - Optimized for accuracy: improved by 15%.
    - Optimized for accuracy and feature reduction: improved by 20%.
  • Computational efficiency:
    - Reduced computation time by 30% with minimal impact on accuracy.
    - Reduced computation time by 50% with a 5% accuracy trade-off.
  • Pareto front analysis:
    - Identified optimal solutions with a balance of accuracy, feature reduction, and computation time.
    - Revealed trade-offs between objectives, informing future algorithm development.
  • Hyper-parameter tuning:
    - Identified optimal hyper-parameters for the genetic algorithm, improving overall performance by 10%.
These results suggest that the multi-objective genetic algorithm is effective in improving accuracy, reducing features, and reducing computation time, but may involve trade-offs between these objectives. Table 5 gives the comparison results of different algorithms.
CB classifier-based networks bring out the best operational efficacy for most of the networks, both in the presence and in the absence of MOGA. Furthermore, we compared the average accuracy and time requirement of the best network of each CNN model with and without MOGA, as shown in Table 6. The time was calculated over 147 unseen test images. The table shows that MOGA improves accuracy without increasing the computational time of the network, which justifies the contribution of MOGA-based feature selection to the networks and the proposed system. The SVM-based classifier tends to cost much more time than other classifiers such as CB; CB showed higher computation speed than SVM when the number of features is taken into account.
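This kind of ablation can be reproduced by scoring and timing the same classifier on the full and on the MOGA-selected feature sets; the sketch below is generic and assumes a boolean feature mask produced by the selection stage.

```python
import time
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

def ablation(X, y, mask):
    """Compare accuracy and per-image scoring time with and without MOGA selection."""
    results = {}
    for name, Xv in {"all features": X, "MOGA-selected": X[:, mask]}.items():
        X_tr, X_te, y_tr, y_te = train_test_split(Xv, y, test_size=0.2,
                                                  stratify=y, random_state=0)
        clf = CatBoostClassifier(iterations=200, loss_function="MultiClass",
                                 verbose=False)
        clf.fit(X_tr, y_tr)
        start = time.perf_counter()
        acc = clf.score(X_te, y_te)
        elapsed = (time.perf_counter() - start) / len(X_te)
        results[name] = (acc, elapsed)
        print(f"{name}: accuracy={acc:.4f}, {elapsed * 1e3:.3f} ms/image")
    return results
```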

4.3. Synergizing Theory and Practice: The System’s Utility and Efficacy

A practical framework must record vehicle images quickly on devices such as smartphones, tablets, and laptops and provide expeditious and resilient classification results at the site of the vehicles. This research therefore explores the system's suitability for practical application and its effectiveness in real-world scenarios. To this end, we verified the proposed system on two different devices, namely a phone and a laptop, to confirm its potency. We recorded two sets of 147 vehicle images using a smartphone and a laptop: a Samsung Galaxy S22 Ultra (6.8-inch display with 1440 × 3200 pixels, 120 Hz refresh rate, 0.8 μm pixel size) and a Dell XPS 13 (13.4-inch InfinityEdge touch display with 1920 × 1080 pixels, 60 Hz, 0.33 μm pixel size). We created a visual record of the highway and then transferred the images to a server via the Internet.
The proposed technique fetched the images from the server and produced the expected output for every snapshot captured by the webcam and the phone. The results were then returned to the respective electronic devices. Thus, the proposed system is currently being utilized in a practical setting, demonstrating its real-world application, as shown in Figure 11. The accuracy of the system using the Samsung Galaxy S22 Ultra and the Dell XPS 13 was 95.68% and 98.89%, respectively, confirming the method's solidity and resilience. The proposed system can classify the class of vehicles from images captured by the two devices at a distance of 2 to 3 m. Additionally, the technique was applied to vehicle images captured with portable devices and proved reliable. Secondly, we investigated the time requirement of the proposed network. The approach took 24.5 ± 1.03 s and 11.8 ± 4.56 s to classify the 147 images captured with the Samsung Galaxy S22 Ultra and the Dell XPS 13, respectively. The accuracy and estimated time were 99.68% and 10.8 ± 0.9 s, respectively, for the vehicle-mounted camera. The proposed method is resilient irrespective of the device, as indicated by the overall accuracy and estimated time of 86.68 ± 1.65% and 0.13 ± 0.03 s/image across the three different devices.
The end-to-end processing time of the proposed system, from snapshot to classification result, was less than 4 min for single frames of vehicle images irrespective of the portable device. Under the current system requirements, image processing must take place even after leaving the roadway site. In the future, portable device applications will advance and be incorporated to perform the classification on the subscriber system and generate results in real time.
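The capture-upload-classify loop described above can be approximated on the client side with a few lines of Python; the endpoint URL and response field below are made-up placeholders, since the paper does not document the server interface.

```python
# Client side (phone, tablet, or laptop): upload a captured frame and read the label.
import requests

def classify_remote(image_path, url="http://example.org/classify"):  # placeholder URL
    with open(image_path, "rb") as f:
        resp = requests.post(url, files={"image": f}, timeout=30)
    resp.raise_for_status()
    return resp.json()["vehicle_class"]     # e.g. "ambulance" (assumed field name)
```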

4.4. Discussion

In traditional methods, a skilled expert manually inspects vehicles to gather data for road infrastructure planning and ITS applications. Reliance on human identification is vulnerable to variation in individual perspectives, necessitating the presence of specialized experts. The potential for human error in manual assessment makes it an unreliable method, with potentially catastrophic consequences. If the images are captured in irregular settings, existing automated methods fail to classify the vehicle classes. They depend solely on single vehicle images taken from a distance of less than one meter, and they fail when vehicle images are captured with various imaging components.
Above all, they need to collect the images and later process them on machine devices to obtain results. Thus, the existing systems have limited practical value. In this research, we present an autonomous vehicle classification approach that surpasses those of the past, with an accuracy of 99.87%. This technique is built to estimate the class from vehicle images and is applicable whenever vehicle images are available. The images were snapped at the road site and transmitted over the Internet, with the classification results returned to the subscriber device in less than 4 min. This investigation demonstrates the system on two different portable devices unused in the training stage.
Regardless of the imaging technology, the system’s high accuracy and speed validate its robustness. This system is highly efficient for various devices in processing vehicle images from roadways and delivering classification results on-site in less than 4 min, maximizing the system’s potential. The recommended approach integrates a three-phase image pre-validation process, which includes chromatic adaptation to adjust for variations in the visual spectrum due to different imaging components, image clarity assessment to ensure the image is in focus, and intensity evaluation to confirm that the images are not shadowed.
These pre-validation stages enhance the system’s robustness. In this research, we integrated MOGA for feature selection, which improved system performance as demonstrated in the experiments. However, evaluating the efficiency of subsequent feature selection techniques using the Wilcoxon signed-rank test is challenging. This technique was applied to a limited set of emergency vehicle images, which should be expanded in future work. Additionally, this approach was not tested with images captured using other devices, such as drones or other imaging equipment. This is an additional limitation of this research.

5. Conclusions

This research proposes an innovative solution for classifying emergency vehicles in India, leveraging automation to enhance public safety and response times. This approach is highly accurate and practically efficient, allowing for the detection of vehicle classes from recorded images within minutes. In the future, additional vehicles will be incorporated, and a real-time mobile application will be developed for the proposed framework. Hybrid FPGA-CPU/GPU architectures will be explored, combining FPGAs with CPUs and GPUs to leverage the strengths of each platform. Techniques for transfer learning and domain adaptation will be developed to enable CNNs to adapt to new image classification tasks with minimal training data. Explainable AI (XAI) techniques will be developed to clarify the decisions made by CNNs in image classification tasks, increasing trust and transparency. Real-time image classification using FPGA-based systems will be pursued for applications such as video surveillance, autonomous vehicles, and robotics. Low-precision training and inference will be explored by using low-precision data types (e.g., binary, ternary) to reduce memory usage and increase processing speed with greater accuracy. FPGA-based implementations of generative models like GANs and VAEs will be developed for image synthesis and data augmentation. Hardware–software co-design techniques will be investigated to optimize the performance and efficiency of FPGA-based image classification systems. Additionally, techniques will be developed to secure FPGA-based image classification systems against adversarial attacks and ensure robustness under varying environmental conditions.

Author Contributions

Conceptualization, P.M. and N.R.; methodology, P.M.; software, M.S.; validation, P.R.G.K., W.M.A. and P.M.; formal analysis, N.R.; investigation, P.M.; resources, N.R.; data curation, N.R.; writing—original draft preparation, P.M.; writing—review and editing, N.R.; visualization, P.R.G.K.; supervision, P.R.G.K.; project administration, W.M.A.; funding acquisition, W.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Taif University, Taif, Saudi Arabia, Project No. (TU-DSPP-2024-50).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used for this study and analysis are available within the manuscript.

Acknowledgments

The authors extend their appreciation to Taif University, Saudi Arabia, for supporting this work through project number (TU-DSPP-2024-50).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Barreyro, J.; Yoshioka, L.R.; Marte, C.L.; Piccirillo, C.G.; Santos, M.M.D.; Justo, J.F. Assessment of Vehicle Category Classification Method based on Optical Curtains and Convolutional Neural Networks. IEEE Access 2024. [Google Scholar] [CrossRef]
  2. Wang, Y.; Sun, R.; Cheng, Q.; Ochieng, W.Y. Measurement Quality Control Aided Multisensor System for Improved Vehicle Navigation in Urban Areas. IEEE Trans. Ind. Electron. 2024, 71, 6407–6417. [Google Scholar] [CrossRef]
  3. Soom, J.; Leier, M.; Janson, K.; Tuhtan, J.A. Open urban mmWave radar and camera vehicle classification dataset for traffic monitoring. IEEE Access 2024, 12, 65128–65140. [Google Scholar] [CrossRef]
  4. Sun, R.; Dai, Y.; Cheng, Q. An Adaptive Weighting Strategy for Multisensor Integrated Navigation in Urban Areas. IEEE Internet Things J. 2023, 10, 12777–12786. [Google Scholar] [CrossRef]
  5. Basak, S.; Suresh, S. Vehicle detection and type classification in low resolution congested traffic scenes using image super resolution. Multimed. Tools Appl. 2024, 83, 21825–21847. [Google Scholar] [CrossRef]
  6. Xiao, Z.; Fang, H.; Jiang, H.; Bai, J.; Havyarimana, V.; Chen, H.; Jiao, L. Understanding Private Car Aggregation Effect via Spatio-Temporal Analysis of Trajectory Data. IEEE Trans. Cybern. 2023, 53, 2346–2357. [Google Scholar] [CrossRef] [PubMed]
  7. Arthur, E.; Muturi, T.; Adu-Gyamfi, Y. Training Vehicle Detection and Classification Models with Less Data: An Active Learning Approach. Transp. Res. Rec. 2024, 01920211. [Google Scholar] [CrossRef]
  8. Xiao, Z.; Shu, J.; Jiang, H.; Min, G.; Chen, H.; Han, Z. Overcoming Occlusions: Perception Task-Oriented Information Sharing in Connected and Autonomous Vehicles. IEEE Netw. 2023, 37, 224–229. [Google Scholar] [CrossRef]
  9. Maity, S.; Pawan, K.S.; Dmitrii, K.; Ram, S. Current Datasets and Their Inherent Challenges for Automatic Vehicle Classification. In Machine Learning for Cyber Physical System: Advances and Challenges; Springer Nature: Cham, Switzerland, 2024; pp. 377–406. [Google Scholar]
  10. Yang, J.; Yang, K.; Xiao, Z.; Jiang, H.; Xu, S.; Dustdar, S. Improving Commute Experience for Private Car Users via Blockchain-Enabled Multitask Learning. IEEE Internet Things J. 2023, 10, 21656–21669. [Google Scholar] [CrossRef]
  11. Ma, S.; Yang, J.J. Image-Based Vehicle Classification by Synergizing Features from Supervised and Self-Supervised Learning Paradigms. Eng Adv. Eng. 2023, 4, 444–456. [Google Scholar] [CrossRef]
  12. Farid, A.; Hussain, F.; Khan, K.; Shahzad, M.; Khan, U.; Mahmood, Z. A fast and accurate real-time vehicle detection method using deep learning for unconstrained environments. Appl. Sci. 2023, 13, 3059. [Google Scholar] [CrossRef]
  13. Sun, G.; Zhang, Y.; Yu, H.; Du, X.; Guizani, M. Intersection Fog-Based Distributed Routing for V2V Communication in Urban Vehicular Ad Hoc Networks. IEEE Trans. Intell. Transp. Syst. 2020, 21, 2409–2426. [Google Scholar] [CrossRef]
  14. Cynthia Sherin, B.; Kayalvizhi, J. Effective vehicle classification and re-identification on stanford cars dataset using convolutional neural networks. In Proceedings of the 3rd International Conference on Artificial Intelligence: Advances and Applications: ICAIAA 2022, Jaipur, India, 23–24 April 2022; Springer Nature: Singapore, 2023; pp. 177–190. [Google Scholar]
  15. Sun, G.; Song, L.; Yu, H.; Chang, V.; Du, X.; Guizani, M. V2V Routing in a VANET Based on the Autoregressive Integrated Moving Average Model. IEEE Trans. Veh. Technol. 2019, 68, 908–922. [Google Scholar] [CrossRef]
  16. Ali, A.; Sarkar, R.; Das, D.K. IRUVD: A new still-image based dataset for automatic vehicle detection. Multimed. Tools Appl. 2024, 83, 6755–6781. [Google Scholar] [CrossRef]
  17. Sun, G.; Zhang, Y.; Liao, D.; Yu, H.; Du, X.; Guizani, M. Bus-Trajectory-Based Street-Centric Routing for Message Delivery in Urban Vehicular Ad Hoc Networks. IEEE Trans. Veh. Technol. 2018, 67, 7550–7563. [Google Scholar] [CrossRef]
  18. Ye, Y.; Chen, W.; Chen, F.; Jia, W.; Lu, Q. Night-time vehicle model recognition based on domain adaptation. Multimed. Tools Appl. 2024, 83, 9577–9596. [Google Scholar]
  19. Sun, G.; Wang, Z.; Su, H.; Yu, H.; Lei, B.; Guizani, M. Profit Maximization of Independent Task Offloading in MEC-Enabled 5G Internet of Vehicles. IEEE Trans. Intell. Transp. Syst. 2024, 1–13. [Google Scholar] [CrossRef]
  20. Hasanvand, M.; Mahdi, N.; Elaheh, M.; Arezu, S. Machine learning methodology for identifying vehicles using image processing. Artif. Intell. Appl. 2023, 1, 170–178. [Google Scholar] [CrossRef]
  21. Sun, G.; Sheng, L.; Luo, L.; Yu, H. Game Theoretic Approach for Multipriority Data Transmission in 5G Vehicular Networks. IEEE Trans. Intell. Transp. Syst. 2022, 23, 24672–24685. [Google Scholar] [CrossRef]
  22. Alghamdi, A.S.; Ammar, S.; Muhammad, K.; Khalid, T.M.; Wafa, S.A. Vehicle classification using deep feature fusion and genetic algorithms. Electronics 2023, 12, 280. [Google Scholar] [CrossRef]
  23. Qu, Z.; Liu, X.; Zheng, M. Temporal-Spatial Quantum Graph Convolutional Neural Network Based on Schrödinger Approach for Traffic Congestion Prediction. IEEE Trans. Intell. Transp. Syst. 2023, 24, 8677–8686. [Google Scholar] [CrossRef]
  24. Tan, S.H.; Joon, H.C.; Chow, C.-O.; Jeevan, K.; Hung, Y.L. Artificial intelligent systems for vehicle classification: A survey. Eng. Appl. Artif. Intell. 2024, 129, 107497. [Google Scholar] [CrossRef]
  25. Luo, J.; Wang, G.; Li, G.; Pesce, G. Transport infrastructure connectivity and conflict resolution: A machine learning analysis. Neural Comput. Appl. 2022, 34, 6585–6601. [Google Scholar] [CrossRef]
  26. Pandharipande, A.; Cheng, C.-H.; Dauwels, J.; Gurbuz, S.Z.; Ibanez-Guzman, J.; Li, G.; Piazzoni, A.; Wang, P.; Santra, A. Sensing and machine learning for automotive perception: A review. IEEE Sens. J. 2023, 23, 11097–11115. [Google Scholar] [CrossRef]
  27. Chen, B.; Hu, J.; Ghosh, B.K. Finite-time tracking control of heterogeneous multi-AUV systems with partial measurements and intermittent communication. Sci. China Inf. Sci. 2024, 67, 152202. [Google Scholar] [CrossRef]
  28. Boonsirisumpun, N.; Okafor, E.; Surinta, O. Vehicle image datasets for image classification. Data Brief 2024, 53, 110133. [Google Scholar] [CrossRef]
  29. Chen, B.; Hu, J.; Zhao, Y.; Ghosh, B.K. Finite-time observer based tracking control of uncertain heterogeneous underwater vehicles using adaptive sliding mode approach. Neurocomputing 2022, 481, 322–332. [Google Scholar] [CrossRef]
  30. Sathyanarayana, N.; Anand, M.N. Vehicle type detection and classification using enhanced relieff algorithm and long short-term memory network. J. Inst. Eng. Ser. B 2023, 104, 485–499. [Google Scholar] [CrossRef]
  31. He, S.; Chen, W.; Wang, K.; Luo, H.; Wang, F.; Jiang, W.; Ding, H. Region Generation and Assessment Network for Occluded Person Re-Identification. IEEE Trans. Inf. Forensics Secur. 2024, 19, 120–132. [Google Scholar] [CrossRef]
  32. Kussl, S.; Omberg, K.S.; Lekang, O.-I. Advancing Vehicle Classification: A Novel Framework for Type, Model, and Fuel Identification Using Nonvisual Sensor Systems for Seamless Data Sharing. IEEE Sens. J. 2023, 23, 19390–19397. [Google Scholar] [CrossRef]
  33. Mohammadzadeh, A.; Taghavifar, H.; Zhang, C.; Alattas, K.A.; Liu, J.; Vu, M.T. A non-linear fractional-order type-3 fuzzy control for enhanced path-tracking performance of autonomous cars. IET Control. Theory Appl. 2024, 18, 40–54. [Google Scholar] [CrossRef]
  34. Berwo, M.A.; Asad, K.; Yong, F.; Hamza, F.; Shumaila, J.; Jabar, M.; Zain, U.A.; Syam, M.S. Deep learning techniques for vehicle detection and classification from images/videos: A survey. Sensors 2023, 23, 4832. [Google Scholar] [CrossRef]
  35. Liu, Y.; Fan, Y.; Zhao, L.; Mi, B. A refinement and abstraction method of the SPZN formal model for intelligent networked vehicles systems. KSII Trans. Internet Inf. Syst. TIIS 2024, 18, 64–88. [Google Scholar] [CrossRef]
  36. Zhang, L.; Wang, J.; An, Z. Vehicle recognition algorithm based on Haar-like features and improved Adaboost classifier. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 807–815. [Google Scholar] [CrossRef]
  37. Huang, Y.; Feng, B.; Cao, Y.; Guo, Z.; Zhang, M.; Zheng, B. Collaborative on-demand dynamic deployment via deep reinforcement learning for IoV service in multi edge clouds. J. Cloud Comput. 2023, 12, 119. [Google Scholar] [CrossRef]
  38. Chiang, C.-Y.; Jaber, M.; Chai, K.K.; Loo, J.; Chai, M. Distributed acoustic sensor systems for vehicle detection and classification. IEEE Access 2023, 11, 31293–31303. [Google Scholar] [CrossRef]
  39. Ding, Y.; Zhang, W.; Zhou, X.; Liao, Q.; Luo, Q.; Ni, L.M. FraudTrip: Taxi Fraudulent Trip Detection from Corresponding Trajectories. IEEE Internet Things J. 2021, 8, 12505–12517. [Google Scholar] [CrossRef]
  40. Pemila, M.; Pongiannan, R.K.; Megala, V. Implementation of Vehicles Classification using Extreme Gradient Boost Algorithm. In Proceedings of the 2022 Second International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), Bhilai, India, 21–22 April 2022; IEEE: Bhilai, India, 2022; pp. 1–6. [Google Scholar]
  41. Deng, Z.W.; Zhao, Y.Q.; Wang, B.H.; Gao, W.; Kong, X. A preview driver model based on sliding-mode and fuzzy control for articulated heavy vehicle. Meccanica 2022, 57, 1853–1878. [Google Scholar] [CrossRef]
  42. Pemila, M.; Pongiannan, R.K.; Narayanamoorthi, R.; Sweelem, E.A.; Hendawi, E.; Abu El-Sebah, M.I. Real Time Classification of Vehicles Using Machine Learning Algorithm on the Extensive Dataset. IEEE Access 2024, 12, 98338–98351. [Google Scholar] [CrossRef]
  43. Gao, W.; Wei, M.; Huang, S. Optimization of aerodynamic drag reduction for vehicles with non-smooth surfaces and research on aerodynamic characteristics under crosswind. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2023. [Google Scholar] [CrossRef]
  44. Shvai, N.; Hasnat, A.; Meicler, A.; Nakib, A. Accurate classification for automatic vehicle-type recognition based on ensemble classifiers. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1288–1297. [Google Scholar] [CrossRef]
  45. Chen, J.; Wang, Q.; Cheng, H.H.; Peng, W.; Xu, W. A Review of Vision-Based Traffic Semantic Understanding in ITSs. IEEE Trans. Intell. Transp. Syst. 2022, 23, 19954–19979. [Google Scholar] [CrossRef]
  46. Pemila, M.; Pongiannan, R.K.; Kareem, M.A.; Amr, Y. Application of an ensemble CatBoost model over complex dataset for vehicle classification. PLoS ONE 2024, 19, e0304619. [Google Scholar]
  47. Chen, J.; Xu, M.; Xu, W.; Li, D.; Peng, W.; Xu, H. A Flow Feedback Traffic Prediction Based on Visual Quantified Features. IEEE Trans. Intell. Transp. Syst. 2023, 24, 10067–10075. [Google Scholar] [CrossRef]
  48. Pemila, M.; Pongiannan, R.; Pandey, V.; Mondal, P.; Bhaumik, S. An Efficient Classification for Light Motor Vehicles using CatBoost Algorithm. In Proceedings of the 2023 Fifth International Conference on Electrical, Computer and Communication Technologies (ICECCT), Erode, India, 22–24 February 2023; IEEE: Bhilai, India, 2023; pp. 1–7. [Google Scholar]
  49. Chen, J.; Wang, Q.; Peng, W.; Xu, H.; Li, X.; Xu, W. Disparity-Based Multiscale Fusion Network for Transportation Detection. IEEE Trans. Intell. Transp. Syst. 2022, 23, 18855–18863. [Google Scholar] [CrossRef]
  50. Zhang, S.; Lu, X.; Lu, Z. Improved CNN-based CatBoost model for license plate remote sensing image classification. Signal Process. 2023, 213, 109196. [Google Scholar] [CrossRef]
  51. Li, S.; Chen, J.; Peng, W.; Shi, X.; Bu, W. A vehicle detection method based on disparity segmentation. Multimed. Tools Appl. 2023, 82, 19643–19655. [Google Scholar] [CrossRef]
  52. Aldania, A.N.A.; Soleh, A.M.; Notodiputro, K.A. A comparative study of CatBoost and double random forest for multi-class classification. J. RESTI Rekayasa Sist. Teknol. Informasi 2023, 7, 129–137. [Google Scholar] [CrossRef]
  53. Yue, W.; Li, J.; Li, C.; Cheng, N.; Wu, J. A Channel Knowledge Map-Aided Personalized Resource Allocation Strategy in Air-Ground Integrated Mobility. IEEE Trans. Intell. Transp. Syst. 2024, 1–14. [Google Scholar] [CrossRef]
  54. Mani, P.; Komarasamy, P.R.G.; Rajamanickam, N.; Alroobaea, R.; Alsafyani, M.; Afandi, A. An Efficient Real-Time Vehicle Classification from a Complex Image Dataset Using eXtreme Gradient Boosting and the Multi-Objective Genetic Algorithm. Processes 2024, 12, 1251. [Google Scholar] [CrossRef]
  55. Xu, X.; Liu, W.; Yu, L. Trajectory prediction for heterogeneous traffic-agents using knowledge correction data-driven model. Inf. Sci. 2022, 608, 375–391. [Google Scholar] [CrossRef]
  56. Singh, P.; Gupta, S.; Gupta, V. Multi-objective hyperparameter optimization on gradient-boosting for breast cancer detection. Int. J. Syst. Assur. Eng. Manag. 2023, 15, 1676–1686. [Google Scholar] [CrossRef]
  57. Zhu, C. Intelligent robot path planning and navigation based on reinforcement learning and adaptive control. J. Logist. Inform. Serv. Sci. 2023, 10, 235–248. [Google Scholar] [CrossRef]
  58. Sun, X.; Fu, J. Many-objective optimization of BEV design parameters based on gradient boosting decision tree models and the NSGA-III algorithm considering the ambient temperature. Energy 2024, 288, 129840. [Google Scholar] [CrossRef]
  59. Zhou, Z.; Wang, Y.; Liu, R.; Wei, C.; Du, H.; Yin, C. Short-Term Lateral Behavior Reasoning for Target Vehicles Considering Driver Preview Characteristic. IEEE Trans. Intell. Transp. Syst. 2022, 23, 11801–11810. [Google Scholar] [CrossRef]
  60. Qian, L.; Chen, Z.; Huang, Y.; Stanford, R.J. Employing categorical boosting (CatBoost) and meta-heuristic algorithms for predicting the urban gas consumption. Urban Clim. 2023, 51, 101647. [Google Scholar] [CrossRef]
  61. Liu, Y.; Zhao, Y. A Blockchain-Enabled Framework for Vehicular Data Sensing: Enhancing Information Freshness. IEEE Trans. Veh. Technol. 2024, 1–14. [Google Scholar] [CrossRef]
  62. Demir, S.; Şahin, E.K. Liquefaction prediction with robust machine learning algorithms (SVM, RF, and XGBoost) supported by genetic algorithm-based feature selection and parameter optimization from the perspective of data processing. Environ. Earth Sci. 2022, 81, 459. [Google Scholar] [CrossRef]
  63. Li, J.; Ling, M.; Zang, X.; Luo, Q.; Yang, J.; Chen, S.; Guo, X. Quantifying risks of lane-changing behavior in highways with vehicle trajectory data under different driving environments. Int. J. Mod. Phys. C 2024. [Google Scholar] [CrossRef]
  64. Hamdaoui, F.; Bougharriou, S.; Mtibaa, A. Optimized Hardware Vision System for Vehicle Detection based on FPGA and Combining Machine Learning and PSO. Microprocess. Microsyst. 2022, 90, 104469. [Google Scholar] [CrossRef]
  65. Wang, F.; Xin, X.; Lei, Z.; Zhang, Q.; Yao, H.; Wang, X.; Tian, Q.; Tian, F. Transformer-Based Spatio-Temporal Traffic Prediction for Access and Metro Networks. J. Light. Technol. 2024, 42, 5204–5213. [Google Scholar] [CrossRef]
  66. Ilyukhin, A.V.; Seleznev, V.S.; Antonova, E.O.; Abdulkhanova, M.Y.; Marsova, E.V. Implementation of FPGA-based camera video adapter for vehicle identification tasks based on EUR 13 classification. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2021; Volume 1159, p. 012008. [Google Scholar]
  67. Zhao, J.; Song, D.; Zhu, B.; Sun, Z.; Han, J.; Sun, Y. A Human-Like Trajectory Planning Method on a Curve Based on the Driver Preview Mechanism. IEEE Trans. Intell. Transp. Syst. 2023, 24, 11682–11698. [Google Scholar] [CrossRef]
  68. Zhai, J.; Li, B.; Lv, S.; Zhou, Q. FPGA-based vehicle detection and tracking accelerator. Sensors 2023, 23, 2208. [Google Scholar] [CrossRef] [PubMed]
  69. Zhu, B.; Sun, Y.; Zhao, J.; Han, J.; Zhang, P.; Fan, T. A Critical Scenario Search Method for Intelligent Vehicle Testing Based on the Social Cognitive Optimization Algorithm. IEEE Trans. Intell. Transp. Syst. 2023, 24, 7974–7986. [Google Scholar] [CrossRef]
Figure 1. General architecture of the framework for vehicle classification using ML through image processing techniques.
Figure 2. Sample vehicle images from the emergency vehicle image dataset collected from Google Images. The images show the classes fire engine, patrol, VIP/HNWI, and ambulance (top to bottom).
Figure 3. Design paradigm of the proposed network for emergency vehicle classification.
Figure 4. (a) Hardware setup. (b) FPGA kit connected to both the PC and the laptop for CNN model deployment. (c) Simulation result of the vehicle classification.
Figure 5. Simulation output of the FPGA-based CNN model extracting subnormal emergency vehicle features under low-light conditions: (a) captured image of an ambulance; (b) extracted image; (c) captured image of a police van; (d) extracted image.
Figure 6. Simulation outputs of subnormal emergency vehicle classification on training images using the FPGA-based CNN-ResNet50-MOGA-CB-MOGA under low-light conditions.
Figure 7. Simulation outputs of subnormal emergency vehicle classification on testing images using the FPGA-based CNN-ResNet50-MOGA-CB-MOGA under low-light conditions.
Figure 8. Assessment to determine the optimal network architecture for each CNN model.
Figure 9. Performance metrics derived from the confusion matrix for the optimal network architecture of each CNN model in emergency vehicle classification.
Figure 10. Receiver operating characteristic analysis for the optimal CNN models in emergency vehicle classification.
Figure 11. Design paradigm of the proposed system for real-world application.
Table 1. Emergency vehicle dataset information with sample images taken from Google Images.

| S. No | Vehicle Name | Class Name | No. of Images | Sample Images |
|---|---|---|---|---|
| 1 | Fire Engine | Ashok Leyland, ENSOL, TCS | 1538 | (sample image) |
| 2 | Ambulance | JCBL Limited, TATA Venture, Maruti Suzuki Omni | 200 | (sample image) |
| 3 | VIP/HNWI | Mercedes Benz GLS, Volvo XC90, Mahindra Scorpio N | 1500 | (sample image) |
| 4 | Police | Hyundai, Mahindra, Royal Enfield | 2500 | (sample image) |
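For readers who want to reproduce the data pipeline, the snippet below is a minimal sketch (not the authors' released code) of how a four-class image folder matching Table 1 could be loaded with the 224 × 224 preprocessing that ResNet50 expects; the directory name and class-folder names are illustrative assumptions.

```python
# Minimal loading sketch, assuming the Table 1 images are stored as
# <root>/<class_name>/<image>.jpg; folder names below are illustrative only.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                    # ResNet50 input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder("emergency_vehicles", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

print(dataset.classes)         # e.g. ['ambulance', 'fire_engine', 'police', 'vip_hnwi']
print(len(dataset), "images")  # 1538 + 200 + 1500 + 2500 = 5738 images, as in Table 1
```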
Table 2. Performance analysis of the CNN on the Xilinx Zynq UltraScale+ MPSoC ZCU102.

| Assets | Handy Features | Deployment | Utilization (%) |
|---|---|---|---|
| Look Up Table | 42,100 | 24,410 | 56 |
| Flip-Flop | 105,300 | 32,204 | 30 |
| 4 KB-RAM | 120 | 37 | 23 |
| Digital Signal Processor | 210 | 57 | 20 |
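The utilization column simply relates the deployed resources to the resources available on the device. The snippet below is a back-of-the-envelope check of that relationship for the first two rows of Table 2, not output of the synthesis tools; the percentages it computes (≈58% and ≈31%) only approximate the rounded figures reported in the table.

```python
# Illustrative check of the Utilization (%) column in Table 2:
# utilization = deployed / available * 100. Values are copied from the table,
# not read from the FPGA tool reports.
resources = {
    "Look Up Table": {"available": 42_100, "deployed": 24_410},
    "Flip-Flop":     {"available": 105_300, "deployed": 32_204},
}

for name, r in resources.items():
    utilization = 100.0 * r["deployed"] / r["available"]
    print(f"{name:<14} {r['deployed']:>7} / {r['available']:<7} -> {utilization:.1f}%")
```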
Table 3. Top-performing CNN model identified through fold-related performance evaluation.

| Network | 1-Fold | 2-Fold | 3-Fold | 4-Fold | 5-Fold | 6-Fold | 7-Fold | 8-Fold | 9-Fold | 10-Fold | Avg. Acc |
|---|---|---|---|---|---|---|---|---|---|---|---|
| VGG16-MOGA-SVM | 92.10 | 93.00 | 94.00 | 92.30 | 93.20 | 93.10 | 92.40 | 93.15 | 93.90 | 93.29 | 93.23 |
| VGG19-MOGA-CB | 96.20 | 97.10 | 97.90 | 96.30 | 97.20 | 97.00 | 96.40 | 97.15 | 97.80 | 97.29 | 97.24 |
| ResNet50-MOGA-MOGA | 85.90 | 86.95 | 87.70 | 85.92 | 86.85 | 86.80 | 85.95 | 86.90 | 86.75 | 86.89 | 86.77 |
| ResNet50-MOGA-CB-MOGA | 98.90 | 98.95 | 99.80 | 98.92 | 99.85 | 98.88 | 99.82 | 98.93 | 99.87 | 99.89 | 99.89 |
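The following is a hedged sketch of the kind of 10-fold evaluation summarized in Table 3. It assumes a feature matrix X (e.g., ResNet50 embeddings after MOGA selection) and a label vector y are already available; the CatBoost hyperparameters shown are placeholders, not the paper's tuned values.

```python
# Sketch of per-fold and average accuracy over 10 stratified folds,
# in the spirit of Table 3; not the authors' implementation.
import numpy as np
from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

def ten_fold_accuracy(X, y, seed=42):
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    fold_acc = []
    for train_idx, test_idx in skf.split(X, y):
        clf = CatBoostClassifier(iterations=300, depth=6,
                                 learning_rate=0.1, verbose=False)
        clf.fit(X[train_idx], y[train_idx])
        pred = np.ravel(clf.predict(X[test_idx]))
        fold_acc.append(accuracy_score(y[test_idx], pred))
    return fold_acc, float(np.mean(fold_acc))  # per-fold accuracies and their average
```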
Table 4. Vehicle classification: a comparative analysis of existing and proposed methods.

| Authors | Techniques | Data Description | Accuracy |
|---|---|---|---|
| Ma, Shihan, and Jidong J. Yang [12] | ViT + data2vec + YOLOR model | 13 FHWA vehicle classes | 97.2% |
| Farid, Annam, Farhan Hussain [14] | YOLO-v5 | 600 images containing 3 classes | 99.92% |
| Hasanvand, Mohamad, Mahdi Nooshyar [22] | SVM algorithm | 7 images of vehicles | 97.1% |
| Alghamdi, Ahmed S., Ammar Saeed, Muhammad Kamran [24] | Cubic SVM kernel | 196 car images | 99.7% |
| Tan, Shi Hao, Joon Huang Chuah [26] | CNN models | 3 vehicle classes | 99% |
| Boonsirisumpun, Narong [30] | Deep learning | 4356 images categorized into five classes | 90.83% |
| Sathyanarayana, N., and Anand [32] | Local binary pattern, steerable pyramid transform, and dual-tree complex wavelet transform | Six vehicle categories | 99.01% and 95.85% |
| Kussl, Sebastian [34] | MEMS | Classification of vehicle model, type, and fuel | 92.9% |
| Berwo, Michael Abebe [36] | YOLOv4 | Vehicle types | 98% |
| Zhang, Le, Jinsong Wang; Chiang, Chia-Yen [38] | Adaboost | 2000 vehicle images | 85.8% |
| Shvai, Nadiya, Abul Hasnat [46] | CB algorithm | 5 classes of vehicles | 99.03% |
| Zhang, Songhua, Xiuling Lu [52] | CNN + CB | License plates of vehicles | — |
| Aldania, Annisarahmi Nur Aini [54] | CB + DRF | 5-digit numerical group of 21 alphabets | 92.45% |
| Singh, Priya, Swayam Gupta [58] | NSGA-II algorithm | 7909 images | 94.40% |
| Sun, Xilei, and Jianqin Fu [60] | NSGA-III | Electric consumption per 100 km | 96.3% |
| Qian, Leren, Zhongsheng Chen [62] | Hybrid CatBoost-PPSO | 6 models | TIC values 0.0013 and 0.0540 |
| Demir, Selçuk, and Emrehan [64] | XGBoost | 411 shear wave velocity | 96% |
| Hamdaoui, Faycal, Sana Bougharriou [65] | SVM + HOG + PSO | Vehicle images and video | 97.84% |
| Ilyukhin, A. V., V. S. Seleznev [66] | VGA port board | 7 various colors | 25.1 MHz |
| Zhai, Jiaqi, Bin Li, Shunsen Lv [67] | YOLOv3 | 1053 identities | 98.2% |
| Gaia, Jeremias, Eugenio Orosco [68] | Alveo U200 Data Center Accelerator Card and a Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit | 5 Haralick features | Less than 3 ms |
| Al Amin, Rashed, Mehrab Hasan [69] | YOLOv3 | Traffic light signal | 99% |
| Proposed Method | CNN + MOGA + Classifier | 12 classes of emergency vehicles | 99.87% |
Table 5. A comprehensive examination of the genetic algorithm's multi-objective optimization strengths and weaknesses.

| CNN Models/MOGA | Classifier | Avg. Acc (W) | Avg. Acc (W/O) | Avg. Prec (W) | Avg. Prec (W/O) | Avg. Rec (W) | Avg. Rec (W/O) | Avg. Sco (W) | Avg. Sco (W/O) |
|---|---|---|---|---|---|---|---|---|---|
| VGG16 | SVM | 96.88 | 97.06 | 66.40 | 97.04 | 95.76 | 96.40 | 96.03 | 96.55 |
| VGG16 | CB | 97.10 | 97.76 | 97.35 | 98.07 | 97.37 | 98.04 | 97.34 | 98.05 |
| VGG16 | MOGA | 88.46 | 88.46 | 88.84 | 89.18 | 89.50 | 89.76 | 88.97 | 89.14 |
| VGG16 | CB-MOGA | 98.01 | 98.61 | 98.28 | 98.73 | 98.05 | 98.47 | 98.15 | 98.59 |
| VGG19 | SVM | 85.42 | 85.63 | 84.91 | 85.04 | 84.12 | 84.25 | 84.48 | 84.62 |
| VGG19 | CB | 93.44 | 93.53 | 93.37 | 94.33 | 93.31 | 93.83 | 93.72 | 94.08 |
| VGG19 | MOGA | 74.21 | 75.26 | 77.78 | 77.85 | 69.75 | 70.85 | 70.45 | 72.07 |
| VGG19 | CB-MOGA | 98.67 | 98.93 | 98.40 | 98.82 | 98.66 | 98.69 | 98.53 | 98.73 |
| ResNet50 | SVM | 93.37 | 93.45 | 93.35 | 93.55 | 93.59 | 93.66 | 93.49 | 93.60 |
| ResNet50 | CB | 91.73 | 91.94 | 91.54 | 92.05 | 90.86 | 91.88 | 91.11 | 91.47 |
| ResNet50 | MOGA | 86.83 | 86.93 | 86.38 | 86.33 | 84.79 | 85.44 | 84.76 | 85.31 |
| ResNet50 | CB-MOGA | 99.13 | 99.60 | 99.15 | 99.60 | 99.19 | 99.66 | 99.17 | 99.63 |
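The four averaged metrics in Table 5 are standard classification scores. The sketch below shows one plausible way to compute them for a single configuration with scikit-learn; the inputs y_true and y_pred, and the assumption that "Avg. Sco" denotes a macro-averaged F1-score, are ours rather than statements from the paper.

```python
# Plausible computation of the Table 5 columns for one model configuration.
# "Avg. Sco" is assumed here to be the macro F1-score.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def summarize(y_true, y_pred):
    return {
        "Avg. Acc":  accuracy_score(y_true, y_pred),
        "Avg. Prec": precision_score(y_true, y_pred, average="macro"),
        "Avg. Rec":  recall_score(y_true, y_pred, average="macro"),
        "Avg. Sco":  f1_score(y_true, y_pred, average="macro"),
    }

# e.g. summarize(y_test, model_with_moga.predict(X_test))  # fills one set of "W" cells
```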
Table 6. Unlocking the potential of the multi-objective genetic algorithm in network optimization: an analysis of feature selection, accuracy enhancement, and time complexity.

| Models/MOGA | No. of Features (W) | No. of Features (W/O) | Average Accuracy % (W) | Average Accuracy % (W/O) | Average Computation Time s (W) | Average Computation Time s (W/O) |
|---|---|---|---|---|---|---|
| VGG16-CB | 3085 | 1054 | 98.01 | 98.61 | 22.5 ± 0.86 | 22.0 ± 0.60 |
| VGG19-CB | 3085 | 1003 | 98.67 | 98.93 | 32.8 ± 0.01 | 37.1 ± 0.76 |
| ResNet50-CB | 1028 | 1001 | 99.13 | 99.60 | 12.4 ± 0.10 | 13.8 ± 0.45 |
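Table 6 trades off three quantities per model: how many features survive selection, the resulting accuracy, and the computation time. The function below is a sketch, under our own assumptions, of the evaluation step a multi-objective GA would call for each candidate binary feature mask; it is not the authors' MOGA implementation, and the CatBoost settings are placeholders.

```python
# Sketch of a per-candidate evaluation for MOGA feature selection: given a
# binary mask over the CNN feature columns, return (feature count, accuracy,
# wall-clock time), the three quantities compared in Table 6.
import time
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import cross_val_score

def evaluate_mask(mask, X, y):
    cols = np.flatnonzero(mask)
    if cols.size == 0:                      # an empty subset is infeasible
        return len(mask), 0.0, 0.0
    start = time.perf_counter()
    clf = CatBoostClassifier(iterations=200, verbose=False)
    acc = cross_val_score(clf, X[:, cols], y, cv=5, scoring="accuracy").mean()
    elapsed = time.perf_counter() - start
    return int(cols.size), float(acc), elapsed
```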
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
