Recognition of Commercial Vehicle Driving Cycles Based on Multilayer Perceptron Model

Wang, Xianbin; Zhao, Yuqi; Li, Weifeng

doi:10.3390/su15032644

by Xianbin Wang *, Yuqi Zhao and Weifeng Li

School of Traffic and Transportation, Northeast Forestry University, Harbin 150040, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(3), 2644; https://doi.org/10.3390/su15032644

Submission received: 9 November 2022 / Revised: 18 January 2023 / Accepted: 30 January 2023 / Published: 1 February 2023

(This article belongs to the Special Issue Intelligent Vehicle-Infrastructure System and Sustainable Transportation)

Abstract:

In this paper, we propose a multilayer perceptron-based method for recognizing the driving cycles of commercial vehicles. The method identifies the type of driving cycle of a commercial vehicle, which helps to improve the efficiency and sustainability of road traffic. We collect 106,200 km of driving data from long-distance commercial vehicles to validate the method. We pre-process six quantitative features as the data description: the average speed, the idle, low, medium, and high gear ratios, and the accelerator pedal opening. Our model includes an input layer, hidden layers, and an output layer. The input layer receives and processes the low-dimensional input features. The hidden layers consist of a feature extraction module and a class regression module. The output layer projects the extracted features to the classification space and computes the likelihood of each type. We achieve recognition accuracies of 99.83%, 97.85%, and 99.40% for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle, respectively. The experimental results demonstrate that our model outperforms the statistical method based on Naive Bayes. Moreover, our method uses the data more efficiently and thus generalizes better.

Keywords:

driving cycle recognition; multilayer perceptron; unbalanced datasets; commercial vehicle; sustainable transportation

1. Introduction

With the development of artificial intelligence, the transportation industry is undergoing great changes, and intelligent vehicle-road mobility is becoming an important form of sustainable transportation. Commercial vehicles mainly comprise freight vehicles and medium and large buses, and they are an essential part of the global transportation industry. The industrialization of intelligent connected commercial vehicles has remarkable application and economic value. The driving cycle describes micro and macro vehicle activity information, which covers the basic research content of people, vehicles, and roads. Han et al. [1] studied regional driving conditions and emission models using GPS-based passenger vehicle trajectory data. Kim et al. [2] used the trajectory data of commercial vehicles on the road to identify traffic flow. They leveraged artificial intelligence technology to analyze driving behavior event data and to predict traffic accidents. De Gennaro et al. [3] developed core algorithms based on a dataset of driving and mobility patterns collected by navigation systems on European roads, and analyzed the potential of big data for realizing green and low-carbon transportation. Recognizing the driving cycle of commercial vehicles helps an automated driving system select a reasonable driving mode. In this paper, we propose a multilayer perceptron model to classify and recognize the driving cycle of commercial vehicles.

Many researchers have studied feature extraction and classification methods for driving cycle recognition [4,5,6]. Shi et al. [7] analyzed the hierarchical properties of the road network and constructed a representative data set of the vehicle driving cycle. They also found that the probability distributions of velocity and acceleration were highly similar on roads of the same grade. Their work provided guidance for selecting appropriate data representations for the vehicle driving cycle. Shi et al. [8] collected data for four typical driving cycles in real vehicle tests and used a support vector machine with particle swarm optimization to conduct driving cycle recognition. Wang et al. [9] studied the characteristics of the commercial vehicle driving cycle and proposed a method to recognize the expressway driving cycle based on the Naive Bayes method. This work provided a baseline for the design of the commercial vehicle driving cycle. He et al. [10] built a Markov transfer matrix with Monte Carlo simulation using the velocity segment and the road section velocity. They generated velocity segments to reconstruct the global driving cycle, which further reflects the real-time road condition. Topić et al. [11] analyzed GPS- and CAN-based tracking data from multiple urban buses during regular operation. They predicted the fuel consumption based on the vehicle velocity, acceleration, and road slope time series. Their results demonstrated that the neural network-based approach is both effective in prediction accuracy and efficient in execution speed. The above works focused on exploring quantifiable feature selection for the driving cycle. Their recognition methods mainly relied on traditional pattern recognition classifiers or probabilistic reasoning. Thus, they could only perform recognition on low-dimensional driving cycle data. This limited information constrains the discriminative ability of these methods and hence their accuracy. In addition, these methods are easily affected by unbalanced class samples and dirty data.

Statistical learning theory reveals that it is difficult for traditional methods to obtain satisfactory classification results with limited amounts of data [12,13]. When the data is unbalanced, the classifier obtains high accuracy on classes with many samples but low accuracy on classes with few samples [14], which is the typical long-tail problem limiting traditional methods. Bhowan et al. [15] proposed a multi-objective genetic programming (MOGP) approach to build ensembles of diverse genetic program classifiers, which performed well on both the minority and majority classes. Wang [16] summarized a series of techniques to improve the classification performance on unbalanced data. A data augmentation strategy was presented to effectively alleviate the sample imbalance.

In addition, García et al. [17] introduced the definition, characteristics, and categorization of data preprocessing approaches and reviewed the state-of-the-art methods of data pre-processing. Hao et al. [18] formulated the problem of data cleaning and summarized their data error elimination technology by the data cleaning algorithm. Detecting and repairing dirty data is one of the fundamental challenges in data analytics. Chu et al. [19] leveraged machine learning methods to improve the effectiveness of data cleaning. Ma et al. [20] used different machine learning algorithms to model the risk of rear-end collision on the freeway on unbalanced datasets. Compared with the traditional model, multilayer perceptron (MLP) can reach better performance standards and maintain good generalization. Huang et al. [21] reformulated silent liveness detection from the traditional two-class classification problem as the multi-classification problem. Their experimental results indicated that the multi-class classification formulation can effectively reduce classification error and improve generalization ability.

In recent years, deep learning has made breakthroughs in various fields. Deep learning allows neural networks to learn representations of data with multiple levels of abstraction. It has dramatically improved the state of the art in speech recognition, visual object recognition, object detection, and many other domains, such as drug discovery and genomics [22,23]. The methods introduced in [24,25] inspire us to leverage neural networks and deep learning to tackle the recognition of commercial vehicle driving cycles. High-dimensional features with rich discriminative power can be obtained by feature extraction through neural networks, which fulfills the requirements for feature discrimination in recognition problems. Bratsas [26] compared the prediction of urban road traffic speed using three machine learning models: random forest (RF), support vector regression (SVR), and multilayer perceptron (MLP). The experimental results showed that the MLP model is better in cases with high variations. Murtagh [27] reviewed the theory and practice of MLP and compared traditional methods with MLP, showing the advantages of MLP in classification and regression. Perrotta et al. [28] studied the application of machine learning techniques to modeling the fuel consumption of articulated trucks, based on a large dataset of road characteristics from the UK Highways Agency and on vehicle information data collected by standard sensors. Han et al. [29] proposed an improved recognition network based on MLP, which feeds the features extracted by the network into the MLP, and obtained a higher accuracy. Li et al. [30] used an MLP model to extract cross-modal features, such as external environmental factors, the manipulation of vehicle companies, and the driver’s driving habits. They obtained accurate classification results and effectively predicted fuel consumption. Krizhevsky et al. [31] trained a large, deep convolutional neural network to classify 1.2 million high-resolution images into 1000 different classes. They used Dropout regularization and the SoftMax function in the model and achieved state-of-the-art results. Sai et al. [32] developed an MLP model based on order data from a car-sharing company in Lanzhou, China. They classified users into three expenditure level classes and accurately predicted new user classes within 84 days. The above studies showed that the multilayer perceptron can extract common information from multimodal data for classification. It has proven to be an excellent discriminator in various fields. However, the traditional multilayer perceptron model has difficulty dealing with complex data and suffers from gradient disappearance and overfitting during training. The Dropout layer proposed by Srivastava et al. [33] effectively alleviates the overfitting problem by regularizing the model during training. Agarap et al. [34] proposed the use of ReLU (Rectified Linear Unit) to solve the gradient disappearance problem. Currently, ReLU is the most widely used activation function in deep neural networks.

This paper focuses on designing an accurate driving cycle recognition model. We formulate driving cycle recognition as a typical multi-class classification task. Our work improves the recognition accuracy of commercial vehicle driving cycles and lays the foundation for a broad application of driving cycles. As shown in Figure 1, based on the above observations, we select data features from the driving cycle of commercial vehicles and propose a multilayer perceptron-based algorithm to recognize the driving cycle classes. By analyzing and cleaning the raw data, we select six-dimensional data features as the input of the multilayer perceptron model. In the last two fully connected layers, we use Dropout to tackle overfitting, and we use ReLU to deal with the gradient disappearance problem. The final recognition probabilities of the driving cycle classes are output through the SoftMax function.

The recognition accuracy of our model is clearly better than that of the traditional statistical learning method based on Naive Bayes. Moreover, the model uses the data more efficiently and generalizes better, which helps to recognize the driving cycle of commercial vehicles accurately.

The main contributions and innovations of this paper can be summarized as follows:

(1) We conduct the pre-processing of the raw data and select the six-dimensional features characterized by the driving conditions of commercial vehicles, in order to represent the state of the driving cycle. All data are normalized to feed into the network.

(2) We propose an improved multilayer perceptron model to classify the driving cycle of commercial vehicles, which achieves excellent classification results even with relatively small amounts of data in the suburban and urban road driving cycle.

(3) Since the data distribution on the suburban driving cycle is ambiguous, we design a re-labeling algorithm to convert the original three types of driving cycles into four types of driving cycles. The classification accuracy of the suburban working conditions is significantly improved on the re-labeled data.

This paper is organized as follows. Section 1 reviews the research background, including the definition of driving cycle recognition and the application of multilayer perceptron models in other fields. In Section 2, we make several modifications to the multilayer perceptron and design a driving cycle recognition model for commercial vehicles based on it. Section 3 introduces the data collection and data processing. We analyze the data characteristics through box plots and clean the data. Section 4 compares the results of the Naive Bayesian and the multilayer perceptron models on raw data and cleaned data, and demonstrates the superiority of the multilayer perceptron model. Section 5 discusses the relationship between this study and other relevant works worldwide and analyzes the relevance of our method to subsequent tasks. Section 6 presents the research conclusions. We review the achievements, limitations, and corresponding future work. We also present our contributions to society, economics, and environmental sustainability.

2. Multilayer Perceptron Based Model for the Recognition of the Driving Cycle of Commercial Vehicles

The multilayer perceptron model is a typical deep learning model, also called a feedforward neural network. It usually consists of an input layer, multiple hidden layers, and an output layer. Information flows between layers in a fully connected manner, which means that each neuron outside the input layer is connected to all neurons in the previous layer. The multilayer perceptron model realizes data feature extraction and enhancement by mapping the low-dimensional data received by the input layer into a high-dimensional space. The high-dimensional feature is then remapped to a low-dimensional output layer to accomplish a specific task. In the past few years, benefiting from the significant improvement in GPU computing power, the multilayer perceptron model has achieved breakthroughs in many classification and recognition tasks.

According to the China automotive test cycle—Part 2: Heavy-duty commercial vehicles [35], the vehicle driving cycle can be divided into three classes; these are the expressway cycle, the suburban road cycle, and the urban road cycle. The expressway refers to highways with crucial political and economic significance. It always has four or more lanes and a central divider. All access points and exits are controlled. Moreover, the supporting facilities are complete. It is designed for motor vehicles to drive at high speed. The urban road refers to roads with certain technical conditions. It usually has facilities to pass vehicles and pedestrians within the city. The suburban road refers to roads other than expressways and urban roads. It connects cities, villages, and industrial and mining bases. It is mainly built for cars and has specific technical standards and facilities.

To recognize the driving cycle of commercial vehicles, we formulate it as a classification problem of three types of driving cycles; these are the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle. We select the average speed, the accelerator pedal opening, the idle gear ratio, the low gear ratio, the medium gear ratio, and the high gear ratio as the six-dimensional quantitative features to represent the vehicle’s driving state. The proposed multilayer perceptron classification model is shown in Figure 2. We classify the driving cycle of the vehicle according to the six-dimensional features.

The multilayer perceptron-based commercial vehicle driving cycle recognition model consists of eight fully connected layers. The first layer is the input layer, which is responsible for receiving and processing the input low-dimensional information. The middle six layers are hidden layers, which map the low-dimensional features extracted by the input layer to the high-dimensional space and then map them back to the low-dimensional space. These layers enhance the discriminative ability of the feature representation and extract beneficial information for the classification. The last layer is the classification layer, which projects the extracted features to the three types, and obtains the likelihood estimates for each type.

Suppose the model input information is a vector X, the connection coefficient is W₁, the bias vector is b₁, and the activation function is f, then the computation for one fully connected layer can be expressed as follows:

f(W_{1} X + b_{1}), \qquad (1)

In this equation, the bias vector shifts the extracted features, adjusting the center position of their distribution. The activation function is a nonlinear function that scales the feature response. For example, ReLU [34] only keeps responses with non-negative values.
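As an illustration of Equation (1), the following minimal sketch computes one fully connected layer with ReLU as the activation f. The function names and the toy numbers are illustrative assumptions and are not taken from the authors' implementation.

```python
def relu(x):
    """ReLU keeps non-negative responses and zeroes the rest."""
    return [max(0.0, v) for v in x]


def dense(x, weights, bias, activation=relu):
    """Compute f(W x + b) for one fully connected layer.

    weights holds one row per output neuron; bias holds one entry
    per output neuron.
    """
    pre_activation = [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    return activation(pre_activation)


# Toy numbers chosen for illustration only (not from the paper).
x = [1.0, 2.0]
W1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, -1.0]
print(dense(x, W1, b1))  # [0.0, 0.5]
```

The first neuron has a negative pre-activation (-1.0) and is clipped to zero, while the second (0.5) passes through unchanged.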

We feed the aforementioned six-dimensional data into the input layer to extract features. To speed up the convergence of the network, we normalize the input data between 0 and 1. After passing through the first fully connected layer, the six-dimensional data becomes a 32-dimensional feature, which is the input to the hidden layer.

The hidden layer consists of a feature extraction part and a class regression part. The feature extraction part maps the low-dimensional data features into the high-dimensional feature space. This part is composed of four fully connected layers of 64, 128, 256, and 512 channels, respectively. The class regression part then remaps the high-dimensional data features to a low-dimensional space related to the class recognition. It is composed of two fully connected layers with 256 and 64 channels, respectively.
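The layer widths described above can be checked with a short calculation. The sketch below lists the widths stated in the text and counts the weights and biases of the resulting fully connected layers; the parameter total is derived here and is not reported in the paper.

```python
# Layer widths as described in the text: 6 inputs, 32 (input layer),
# 64-128-256-512 (feature extraction), 256-64 (class regression), 3 outputs.
LAYER_WIDTHS = [6, 32, 64, 128, 256, 512, 256, 64, 3]


def count_parameters(widths):
    """Return weights plus biases over all fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


n_layers = len(LAYER_WIDTHS) - 1
print(n_layers, count_parameters(LAYER_WIDTHS))  # 8 323235
```

The count of eight layers agrees with the eight fully connected layers named in the text.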

Note that we adopt the Dropout mechanism in the class regression part to avoid the overfitting problem. This problem makes the network fall into a local optimum prematurely during training and negatively affects the generalization performance of the model. When the prediction accuracy increases on the training set but decreases on the test set, the model is overfitted. Dropout [36] can effectively alleviate overfitting and regularizes the model to a certain extent. Dropout deactivates each neuron with a certain probability P (in this model, P is set to 0.2) at each training step. This stochastic process forces the model not to rely too heavily on particular channels or activations. Because it becomes more difficult for the model to overfit specific data, the model generalizes better.
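The Dropout mechanism can be sketched as follows, using P = 0.2 as stated in the text. Rescaling the surviving activations by 1 / (1 - P), often called inverted dropout, is an assumption of this sketch; the text does not say which variant is used.

```python
import random


def dropout(activations, p=0.2, training=True, rng=random):
    """Zero each activation with probability p during training.

    Survivors are scaled by 1 / (1 - p) so the expected activation is
    unchanged; this scaling is an assumption of the sketch.
    """
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0] * 10000
dropped = dropout(hidden, p=0.2, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)
print(round(zero_fraction, 2))
```

At test time (`training=False`) the function returns the activations unchanged, so no neuron is deactivated during prediction.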

The output layer consists of a fully connected layer of three channels and a SoftMax function. We deem the process from the hidden layer to the output layer to be a multi-class regression process. Therefore, we set the number of outputs of the last fully connected layer as the number of target classes. Then, we use the SoftMax function to turn the response values into the probability values for classification. The three values from the last fully connected layer correspond to the probability of three types of driving cycles. SoftMax converts these values into corresponding likelihood estimates for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle. The SoftMax function can be written as follows:

\mathrm{SoftMax}(z_{i}) = \frac{e^{z_{i}}}{\sum_{c = 1}^{C} e^{z_{c}}}, \qquad (2)

Here, z_i is the output value of the i-th node, and C is the number of output nodes, which equals the number of classes. Through the SoftMax function, the output values of the network are converted into a probability distribution: each value lies in the range [0, 1], and the values sum to 1.
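A minimal sketch of Equation (2) is given here. Subtracting the maximum before exponentiating is a common numerical safeguard that leaves the result unchanged; it is an addition of this sketch and is not mentioned in the text. The input values are made up.

```python
import math


def softmax(z):
    """SoftMax of Equation (2), shifted by max(z) for numerical stability."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up outputs of the last fully connected layer.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The returned values preserve the ordering of the inputs, each lies in [0, 1], and together they sum to 1.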

In addition, we adopt the Rectified Linear Unit (ReLU) as the activation function of the multilayer perceptron. Compared with the traditional sigmoid and tanh activation functions, ReLU [34] is the most commonly used activation function in deep neural networks. It is a nonlinear activation function defined by a truncation, which can be written as follows:

\mathrm{ReLU}(x) = \max(0, x), \qquad (3)

When the input value is positive, ReLU outputs it directly; otherwise, ReLU outputs zero. This mechanism effectively preserves positive response features while ignoring unimportant negative responses. The advantage of using ReLU as the activation function is that it can effectively alleviate the vanishing gradient problem: the gradient is not greatly reduced after back-propagation through multiple layers during training. Moreover, the ReLU operation is simple, which reduces the overall computational cost of the model.

We use the cross-entropy loss function (CELoss) to train the model which can be written as follows:

\mathrm{CELoss} = -\sum_{c = 1}^{M} y_{c} \log(p_{c}), \qquad (4)

Here, M is the number of classes, log denotes the natural logarithm, y is the class label, usually represented as a one-hot vector, and p is the class probability predicted by the network. During training, CELoss supervises the network according to the maximum likelihood estimation mechanism. During backpropagation, CELoss guides the optimizer to update the network weights so that the final predicted probability is consistent with the real distribution of the sample. When testing, we directly select the class with the maximum probability value as the predicted class.
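The loss of Equation (4) and the test-time decision rule can be sketched as below. The small `eps` guard against log(0) and the example probabilities are assumptions of the sketch.

```python
import math


def cross_entropy(one_hot, probs, eps=1e-12):
    """CELoss = -sum_c y_c * log(p_c); eps guards against log(0)."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))


def predict(probs):
    """Return the index of the class with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical prediction for a sample whose true class is index 1.
y = [0, 1, 0]
p = [0.1, 0.7, 0.2]
loss = cross_entropy(y, p)
print(loss, predict(p))
```

With a one-hot label, only the term of the true class contributes, so the loss reduces to -log(0.7) in this example.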

The proposed driving cycle classification model with eight fully connected layers can be expressed by the following formula:

F(X) = \mathrm{SoftMax}(f(W_{8} f(W_{7} f(W_{6} f(W_{5} f(W_{4} f(W_{3} f(W_{2} f(W_{1} X + b_{1}) + b_{2}) + b_{3}) + b_{4}) + b_{5}) + b_{6}) + b_{7}) + b_{8})) \qquad (5)

In this paper, we use eight fully connected layers to process the input of six-dimensional information and predict the probability for three driving cycle classes, respectively. The feature extraction part maps the low-dimension data into high-dimensional spaces layer by layer. It enhances the discriminative ability of human-designed data features. The class regression part remaps high-dimensional features to low-dimensional features, corresponding to the classification space. With the information compression on high-dimensional features, the most discriminative information is selected for classification. To improve the generalization performance of the model, we also adopt Dropout in the last two layers to avoid model overfitting. Finally, we leverage the SoftMax function to obtain the classification probabilities of the three classes and achieve the accurate recognition of the driving cycle for commercial vehicles.
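The complete forward pass can be sketched in plain Python as follows. The weights are random placeholders rather than trained values, the input vector is made up, and Dropout is omitted because it is only active during training. Equation (5) as printed also wraps the last layer in f; this sketch passes the three final outputs straight to SoftMax, which is an assumption based on the description of the output layer.

```python
import math
import random

LAYER_WIDTHS = [6, 32, 64, 128, 256, 512, 256, 64, 3]


def init_layers(widths, rng):
    """Random weights and zero biases; an untrained placeholder."""
    layers = []
    for n_in, n_out in zip(widths, widths[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(x, layers):
    """Apply ReLU dense layers, then SoftMax on the final three outputs."""
    for index, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    shift = max(x)
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


layers = init_layers(LAYER_WIDTHS, random.Random(0))
features = [0.8, 0.05, 0.02, 0.13, 0.80, 0.35]  # made-up normalized inputs
probabilities = forward(features, layers)
print(probabilities)
```

Because the weights are untrained, the three probabilities carry no meaning here; the sketch only shows how a six-dimensional input is widened to 512 channels and then narrowed to three class probabilities.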

3. Data Analysis

3.1. Raw Data Processing

Approximately 106,200 km of driving cycle data collected from 21 long-haul commercial vehicles are used in this study, as shown in Table 1. The data collection time is the fourth quarter of 2013, and the data processing time is January 2014. The commercial vehicles are Jiefang, with a total of 12 gears. Highway data covers various cities and provinces such as Beijing, Jiangsu, Zhejiang, Guangzhou, Heilongjiang, Jilin, Hebei, Yunnan, and Xinjiang. The original data types include time, latitude and longitude, GPS speed, accelerator pedal opening, wheel speed, brake switch, clutch switch, 10 Hz acceleration, 1 Hz acceleration, gear, etc. The driving cycle division method is combined with GPS latitude and longitude and artificial map comparison; refer to “China automotive test cycle—Part 2: Heavy-duty commercial vehicles”. Finally, the data of the different driving cycles of commercial vehicles are divided into three classes: the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle.

In the dataset, the mileage of the expressway is approximately 85,000 km, the mileage of the suburban road is approximately 19,000 km, and the mileage of the urban road is approximately 2200 km. We first segment the driving state data. The segmentation standard is chosen to reflect the characteristics of the different driving cycles while keeping the data easy to process. Based on a comparison and analysis of previous methods, we divide the 106,200 km of driving cycle data equally into 36,480 effective segments. Among them, there are 28,399 segments of the expressway driving cycle, 6542 segments of the suburban road driving cycle, and 1539 segments of the urban road driving cycle.

Due to the overlap of trajectories in the 21 commercial vehicles, we need more than one track map to present all the driving routes. We summarize the driving trajectories as two track maps according to the driving area, as shown in Figure 3. The national border of China in this map is drawn according to the 1:4 million “Topographic Map of the People’s Republic of China”, published by China Map Publishing House in 1989.

By analyzing the original data distribution, we select the average speed, idle gear ratio, low gear ratio, medium gear ratio, high gear ratio, and accelerator pedal opening as the six-dimensional data features. The average speed (unit: km/h) is obtained by converting the average GPS speed (unit: m/s). The data collection vehicles have 12 gears. We count how often each gear is used and compute the corresponding proportions, using the idle gear ratio (gear 0), low gear ratio (gear 1 to gear 5), medium gear ratio (gear 6 to gear 10), and high gear ratio (gear 11 and gear 12) as gear features. The accelerator pedal opening is the sixth feature (unit: %); the greater the opening, the greater the acceleration.
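The feature definitions above can be illustrated with a short sketch that turns the raw records of one segment into the six features. The function name, the dictionary keys, and the sample values are hypothetical, and averaging the pedal opening over the segment is an assumption, since the text does not state how it is aggregated.

```python
def segment_features(gps_speed_mps, gears, pedal_opening_pct):
    """Return the six features for one segment of equally sampled records."""
    n = len(gears)
    # Convert the mean GPS speed from m/s to km/h.
    avg_speed_kmh = sum(gps_speed_mps) / len(gps_speed_mps) * 3.6

    def share(low, high):
        """Fraction of records whose gear lies in [low, high]."""
        return sum(1 for g in gears if low <= g <= high) / n

    return {
        "avg_speed_kmh": avg_speed_kmh,
        "idle_gear_ratio": share(0, 0),
        "low_gear_ratio": share(1, 5),
        "medium_gear_ratio": share(6, 10),
        "high_gear_ratio": share(11, 12),
        "pedal_opening_pct": sum(pedal_opening_pct) / len(pedal_opening_pct),
    }


# Hypothetical four-sample segment.
features = segment_features(
    gps_speed_mps=[20.0, 22.0, 24.0, 22.0],
    gears=[11, 12, 12, 10],
    pedal_opening_pct=[30.0, 40.0, 50.0, 40.0],
)
print(features)
```

In this toy segment the mean speed of 22 m/s corresponds to about 79.2 km/h, and the four gear ratios sum to 1 because every record falls into exactly one gear group.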

In summary, the driving cycle of commercial vehicles in this paper is described by the average speed, the gear ratios, and the accelerator pedal opening. To use this descriptive state information for classification, the data are processed quantitatively. Specifically, we compute the average speed, the gear ratios (idle, low, medium, high), and the accelerator pedal opening over a fixed period; thus, we obtain a six-dimensional numerical feature describing the driving cycle. In addition, we normalize all data features so that the values lie between 0 and 1. Normalization puts all samples on a unified scale, which makes their statistical distributions easier to summarize and the subsequent data processing more convenient. During training, the normalized data also ensure faster convergence.
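One common way to bring every feature into the range from 0 to 1 is min-max scaling per feature column, sketched below. The text does not name the normalization formula, so min-max scaling is an assumption here, and the sample rows are made up.

```python
def min_max_normalize(rows):
    """Scale each feature column of rows to [0, 1] independently."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized


# Two made-up features per row: average speed (km/h) and pedal opening (%).
rows = [[20.0, 10.0], [60.0, 30.0], [100.0, 50.0]]
scaled = min_max_normalize(rows)
print(scaled)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

The gear ratios already lie between 0 and 1 by construction, so in practice only the speed and pedal opening columns change scale noticeably.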

3.2. Three-Classes Raw Data Analysis

The raw data refers to the data of each driving cycle labeled by GPS and manual map comparison. The raw three-class data comprise 28,399 expressway segments, 6542 suburban road segments, and 1539 urban road segments, 36,480 in total.

To analyze the characteristics of the raw data, we conduct an overfitting experiment on the full dataset using a multilayer perceptron. The principle of the overfitting experiment is that the multilayer perceptron has a strong fitting ability and can theoretically fit any data distribution. Therefore, if the model is trained on the same data for long enough, its predictions on that data should be close to the ground truth. In the full-data overfitting experiment, the final recognition accuracies for the expressway, suburban road, and urban road driving cycles are 83.15%, 48.69%, and 81.29%, respectively, as shown in Figure 4. The accuracy for the suburban road cycle is clearly lower than for the other two classes. In addition, even the highest accuracy among the three classes is below 90%. This result indicates that the raw dataset contains dirty data, which hinders the fitting of the data distribution.

To verify the presence of dirty data, we use box plots to check the distribution, as shown in Figure 5. Box plots are drawn for the six-dimensional data features of the three types of driving cycles, and they reflect the differences between the data distributions of each driving cycle. We first find the upper edge, lower edge, median, and two quartiles of a set of data. Then, we connect the two quartiles to draw the box, which contains 50% of the data. Finally, we connect the upper and lower edges to the box with whiskers. The median line inside the box represents the typical level of the sample data. The points outside the upper and lower edge lines are deemed to be outliers.
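The box plot quantities described above can be computed as in the following sketch. The 1.5 × IQR rule for placing the edges is the usual box plot convention and is an assumption here, since the text does not state the rule used; the speed values are made up.

```python
import statistics


def box_plot_stats(values, whisker=1.5):
    """Return quartiles, whisker ends, and outliers for one feature."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "lower_edge": min(inside), "upper_edge": max(inside),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Made-up average speeds (km/h) with one implausible value.
stats = box_plot_stats([78, 80, 81, 82, 83, 85, 86, 88, 20])
print(stats)
```

For this toy sample the quartiles are 80, 82, and 85, so the fences sit at 72.5 and 92.5 and the value 20 is reported as an outlier.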

As shown in the above figures, the distributions of data characteristics, such as idle gear ratio, low gear ratio, and accelerator pedal opening, are very similar between the expressway driving cycle and the suburban road driving cycle. There are major overlaps in the data distributions of the low gear ratio and the medium gear ratio between the suburban road driving cycle and the urban road driving cycle.

Based on the three-class data feature analysis, we observe that the data features of the driving cycles overlap. In real driving, some commercial vehicle drivers do not choose the driving mode that corresponds to the current road type. Sometimes, in order to save costs, drivers treat a suburban road as an expressway and drive fast. However, in some special sections, they also have to slow down. The raw data is therefore mixed with dirty data in which the driving behavior does not match the road type. This easily causes errors in the recognition of the driving cycle type. Thus, we need to find and relabel these dirty data.

3.3. Four-Classes Cleaned Data Analysis

After repeatedly sub-classifying and comparing the raw data, we arrive at an effective rule for dividing and cleaning the data. Specifically, we use a multilayer perceptron to perform an overfitting experiment on the full data and save the classification results of 100 epochs. We compare these results with the labels of the raw data. If a sample is classified correctly in more than 50% of the epochs, its label remains the same as the original label, because for an uncertain result we choose to trust the original label. If it is classified correctly in fewer than 10% of the epochs, we change its label to the class it is most often misclassified as. This is because, when more than 90% of the classifications are incorrect, we believe that the sample is wrongly labeled; thus, we change its label to a more reliable one. The remaining samples, with an accuracy between 10% and 50%, are deemed to be irregular driving data and are labeled as a new type, i.e., the mixed road driving cycle.
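The re-labeling rule can be sketched for a single sample as follows. The function name, the label strings, and the prediction histories are hypothetical. The text does not say how values of exactly 50% or exactly 10% are handled; this sketch assigns both boundary cases to the mixed class, which is an assumption.

```python
from collections import Counter

MIXED = "mixed"


def relabel(original_label, epoch_predictions, keep_above=0.5, flip_below=0.1):
    """Re-label one sample from its per-epoch predictions."""
    correct = sum(1 for p in epoch_predictions if p == original_label)
    rate = correct / len(epoch_predictions)
    if rate > keep_above:
        # Mostly classified as labeled: trust the original label.
        return original_label
    if rate < flip_below:
        # Almost never classified as labeled: adopt the most frequent
        # wrong prediction as the new label.
        wrong = Counter(p for p in epoch_predictions if p != original_label)
        return wrong.most_common(1)[0][0]
    return MIXED


# Hypothetical prediction histories over 100 epochs.
kept = relabel("suburban", ["suburban"] * 80 + ["expressway"] * 20)
flipped = relabel("suburban", ["expressway"] * 95 + ["suburban"] * 5)
mixed = relabel("suburban", ["suburban"] * 30 + ["urban"] * 70)
print(kept, flipped, mixed)  # suburban expressway mixed
```

Applying such a rule to every sample turns the three original classes into four, with the mixed class collecting the samples whose predictions disagree with the label too often to trust it but not often enough to overrule it.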

After re-labeling the raw data, we refer to these data as the cleaned data in this paper. The data volumes of the four driving cycles obtained after reclassification are 24,204 for the expressway driving cycle, 6368 for the suburban road driving cycle, 2784 for the urban road driving cycle, and 3124 for the mixed road driving cycle, a total of 36,480. We also use the box plot to analyze the data characteristics of the four driving cycles, and the results shown in Figure 6 are obtained.

The box plots of the four classes show obvious differences in the average speed, idle gear ratio, medium gear ratio, and high gear ratio between the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle. Therefore, the cleaned data separates the driving cycles more clearly and can lead to more accurate classification results.

As for the mixed road driving cycle, its low gear ratio characteristics are similar to those of the expressway driving cycle, its idle gear ratio and accelerator pedal opening characteristics are similar to those of the suburban road driving cycle, and its medium gear ratio characteristics are similar to those of the urban road driving cycle. This supports the view that the mixed road driving cycle data are composed of dirty data from these three types.

4. Experiment Analysis

In this paper, we use TensorFlow to build the multilayer perceptron model for driving cycle recognition [37,38]. The model is trained for 100 epochs. In each epoch, we randomly select 1000 samples from each driving cycle class to form the training set. To speed up training without affecting the accuracy of the model, we use a batch size of 64. We use a fixed learning rate [39] of 1 × 10⁻⁴ and the Adam optimizer [40] to update the network during gradient backpropagation. Adaptive Moment Estimation (Adam) is a popular optimization algorithm for deep neural networks and is widely used in many tasks.
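The per-epoch sampling scheme can be illustrated without any deep learning library. The sketch below only builds the index batches for one epoch. The function name `sample_epoch`, sampling without replacement, shuffling across classes, and keeping the last partial batch are assumptions, since the text does not specify them. The class sizes are the training-set counts listed in Table 2.

```python
import random


def sample_epoch(indices_by_class, per_class=1000, batch_size=64, rng=None):
    """Draw a class-balanced training set for one epoch and split it into batches."""
    if rng is None:
        rng = random.Random()
    chosen = []
    for indices in indices_by_class.values():
        # Assumption: sample without replacement, so every class needs
        # at least per_class samples.
        chosen.extend(rng.sample(indices, per_class))
    rng.shuffle(chosen)  # mix the classes before batching
    return [chosen[i:i + batch_size] for i in range(0, len(chosen), batch_size)]


# Index ranges sized like the raw three-class training set (Table 2).
indices_by_class = {
    "expressway": list(range(0, 19879)),
    "suburban": list(range(19879, 24458)),
    "urban": list(range(24458, 25535)),
}
batches = sample_epoch(indices_by_class, rng=random.Random(0))
print(len(batches), len(batches[0]), len(batches[-1]))
```

With three classes, each epoch sees 3000 samples, which gives 46 full batches of 64 and one final batch of 56 under these assumptions.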

4.1. Dataset and Comparison on Three-Classes of Raw Data

The cycle classes of raw data are determined by GPS and manual map comparison, which are labeled as the expressway, the suburban road, and the urban road. For each class, we randomly divide the data into the training set and test set by the ratio of 7:3. Table 2 shows the distribution of the obtained data.
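A per-class 7:3 split can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `stratified_split`, the fixed seed, and rounding the training share down are all choices made here. Integer arithmetic is used to avoid floating-point surprises at the cut point.

```python
import random


def stratified_split(labels, train_parts=7, test_parts=3, seed=0):
    """Split sample indices into train and test sets separately for each class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumption: the training share is rounded down.
        cut = len(indices) * train_parts // (train_parts + test_parts)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# Class sizes of the raw three-class data (Table 2, "Total" row).
labels = ["expressway"] * 28399 + ["suburban"] * 6542 + ["urban"] * 1539
train, test = stratified_split(labels)
print(len(train), len(test))
```

With these class sizes, rounding down yields 19,879, 4579, and 1077 training samples per class, which agrees with the training-set row of Table 2.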

4.1.1. Results by Naive Bayesian Method

Based on the above training set, we compute the conditional probability value for each class on the six feature dimensions. Then, we leverage the Naive Bayes method to classify the driving cycle type on the test set. The final recognition results of the three driving cycles are shown in Figure 7.
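As a rough illustration of the Naive Bayes baseline, the sketch below fits one Gaussian per class and feature and picks the class with the highest log posterior. The text does not state which likelihood model is used for the conditional probabilities, so the Gaussian form, the function names `fit_gaussian_nb` and `predict`, and the toy two-feature data are all assumptions made for this example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, eps=1e-9):
    """Return, per class, the log prior and per-feature means and variances."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # eps keeps the variance strictly positive for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + eps
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict(model, row):
    """Pick the class with the highest log posterior under feature independence."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


# Toy data for illustration only: [average speed, high gear ratio].
X = [[90, 0.80], [85, 0.90], [88, 0.85], [20, 0.10], [25, 0.05], [22, 0.20]]
y = ["expressway"] * 3 + ["urban"] * 3
model = fit_gaussian_nb(X, y)
print(predict(model, [87, 0.80]), predict(model, [21, 0.10]))
```

In the paper's setting, each row would hold the six features (average speed, four gear ratios, and accelerator pedal opening) instead of the two toy features used here.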

The recognition accuracies for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle are 77.84%, 41.01%, and 83.55%, respectively. The probabilities of recognizing the expressway driving cycle as the suburban road driving cycle and the urban road driving cycle are 18.09% and 4.07%, respectively. The probabilities of recognizing the suburban road driving cycle as the expressway driving cycle and the urban road driving cycle are 36.53% and 22.47%, respectively. The probabilities of recognizing the urban road driving cycle as the expressway driving cycle and the suburban road driving cycle are 5.63% and 10.82%, respectively.
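The reported percentages read as the rows of a confusion matrix normalized by the number of test samples in each true class, so each row sums to roughly 100% and the diagonal entry is the per-class recognition accuracy. A minimal sketch of that normalization follows. The function name `row_percentages` is hypothetical and the counts are made up for illustration; they are not the paper's data.

```python
def row_percentages(confusion):
    """Turn counts confusion[true][predicted] into percentages of each true class."""
    table = {}
    for true_label, row in confusion.items():
        total = sum(row.values())
        table[true_label] = {
            predicted: 100.0 * count / total for predicted, count in row.items()
        }
    return table


# Made-up counts for illustration only.
counts = {
    "expressway": {"expressway": 8, "suburban": 2, "urban": 0},
    "suburban": {"expressway": 3, "suburban": 5, "urban": 2},
    "urban": {"expressway": 0, "suburban": 1, "urban": 4},
}
percentages = row_percentages(counts)
# The diagonal gives the per-class recognition accuracy.
accuracy = {label: percentages[label][label] for label in percentages}
print(accuracy)
```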

To eliminate the influence caused by the division of the data set, we also use the naive Bayes method for classification using the conditional probability computed by the full amount of data. We obtain a similar result, which demonstrates that the division of data will not influence the final result.

4.1.2. Results by Multilayer Perceptron Model

Based on the same training set and test set used above, we conduct the three-type classification experiments using the proposed multilayer perceptron model. The accuracies for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle are 77.68%, 50.31%, and 83.95%, respectively. The results are shown in Figure 8.

4.2. Dataset and Comparison on Four-Classes of Cleaned Data

As described in Section 3, we relabel the data into four classes, i.e., the expressway driving cycle, the suburban road driving cycle, the urban road driving cycle, and the mixed road driving cycle. We also randomly divide the clean data for each class into the training set and test set by the ratio of 7:3. The cleaned data distribution is shown in Table 3.

4.2.1. Results by the Naive Bayesian Method

We use the Naive Bayes method to perform the four-type classification experiment on the cleaned data, and the results are shown in Figure 9. The recognition accuracies for the expressway driving cycle, the suburban road driving cycle, the urban road driving cycle, and the mixed road driving cycle are 93.53%, 73.73%, 97.01%, and 25.51%, respectively. The probabilities of recognizing the expressway driving cycle as the suburban road driving cycle, the urban road driving cycle, and the mixed road driving cycle are 0.63%, 0%, and 5.84%, respectively. The probabilities of recognizing the suburban road driving cycle as the expressway driving cycle, the urban road driving cycle, and the mixed road driving cycle are 6.28%, 4.34%, and 15.65%, respectively. The probabilities of recognizing the urban road driving cycle as the expressway driving cycle, the suburban road driving cycle, and the mixed road driving cycle are 0%, 2.99%, and 0%, respectively. The probabilities of recognizing the mixed road driving cycle as the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle are 43.22%, 15.80%, and 15.47%, respectively.

4.2.2. Results by Multilayer Perceptron Model

We use the multilayer perceptron model to perform a four-type classification experiment on the cleaned data, and the results are shown in Figure 10. We obtain an accuracy for the expressway driving cycle, the suburban road driving cycle, the urban road driving cycle, and the mixed road driving cycle of 98.43%, 85.51%, 97.85%, and 84.42%, respectively.

4.3. Dataset and Comparison on the Three-Classes of Cleaned Data

The data for the mixed road driving cycle is ambiguous due to irregular driving behaviors, obstacle segments caused by environmental factors [41], or driving faults. Therefore, we discard the data of the mixed road driving cycle in this section. We conduct the three-type classification experiments on the cleaned data, which classifies the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle. The distribution of training data and test data is shown in Table 4 below.

4.3.1. Results by Naive Bayesian Method

We conduct the three-type classification experiments on the cleaned data using the Naive Bayes method, and the results are shown in Figure 11. The recognition accuracies for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle are 93.61%, 89.32%, and 97.01%, respectively. The probabilities of recognizing the expressway driving cycle as the suburban road driving cycle and the urban road driving cycle are 6.39% and 0%, respectively. The probabilities of recognizing the suburban road driving cycle as the expressway driving cycle and the urban road driving cycle are 6.33% and 4.34%, respectively. The probabilities of recognizing the urban road driving cycle as the expressway driving cycle and the suburban road driving cycle are 0% and 2.99%, respectively.

4.3.2. Results by Multilayer Perceptron Model

We carry out the three-type classification experiment on the cleaned data using the multilayer perceptron model, and the results are shown in Figure 12. The model achieves recognition accuracies for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle of 99.83%, 97.85%, and 99.40%, respectively.

4.4. Analysis of Results

The accuracy of each road driving cycle with different datasets and different classification methods is summarized in Table 5. Among them, the Naive Bayes is abbreviated as NB and the multilayer perceptron is abbreviated as MLP. The overall accuracy is compared by calculating the average of each experimental result.

We summarize the above results as follows. First, the classification results on the cleaned data are better than those on the raw data. Specifically, in the three-type classification experiments based on Naive Bayes, the average accuracy on the cleaned data is 25.84 percentage points higher than on the raw data. In the three-type classification experiments based on the multilayer perceptron model, the average accuracy on the cleaned data is 28.38 percentage points higher than on the raw data. Second, the multilayer perceptron model outperforms the Naive Bayes method. Specifically, in the four-type classification experiment on the cleaned data, the average accuracy of the multilayer perceptron model is 19.10 percentage points higher than that of Naive Bayes. In the three-type classification experiments on the cleaned data, it is 5.72 percentage points higher. For time consumption, we measured the processing time over all the test data. The average processing time per sample is 6.05 × 10⁻⁵ s for our multilayer perceptron model and 3.99 × 10⁻⁴ s for Naive Bayes, i.e., Naive Bayes takes about 6.6 times as long per sample. All experiments were carried out on an NVIDIA A100 GPU and an AMD EPYC 7542 32-core CPU.
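The averages and gaps quoted above can be reproduced from the per-class accuracies in Table 5. The short check below uses exact decimal arithmetic with half-up rounding. The dictionary layout and the helper name `mean_2dp` are illustrative choices, while the numbers are copied from Table 5 and from the timing figures in the text.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-class accuracies (%) copied from Table 5.
per_class = {
    ("raw 3-class", "NB"): ["77.84", "41.01", "83.55"],
    ("raw 3-class", "MLP"): ["77.68", "50.31", "83.95"],
    ("clean 4-class", "NB"): ["93.53", "73.73", "97.01", "25.51"],
    ("clean 4-class", "MLP"): ["98.43", "85.50", "97.85", "84.42"],
    ("clean 3-class", "NB"): ["93.61", "89.32", "97.01"],
    ("clean 3-class", "MLP"): ["99.83", "97.85", "99.40"],
}


def mean_2dp(values):
    """Unweighted mean of the class accuracies, rounded half-up to two decimals."""
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


average = {key: mean_2dp(values) for key, values in per_class.items()}

# Gaps in percentage points between the averaged accuracies.
gaps = {
    "cleaning, NB": average[("clean 3-class", "NB")] - average[("raw 3-class", "NB")],
    "cleaning, MLP": average[("clean 3-class", "MLP")] - average[("raw 3-class", "MLP")],
    "MLP vs NB, 4-class": average[("clean 4-class", "MLP")] - average[("clean 4-class", "NB")],
    "MLP vs NB, 3-class": average[("clean 3-class", "MLP")] - average[("clean 3-class", "NB")],
}

# Ratio of the reported per-sample processing times (NB / MLP).
time_ratio = 3.99e-4 / 6.05e-5

print(average, gaps, round(time_ratio, 1))
```

Note that the averages are unweighted means over classes, so they differ from the overall accuracy over all test samples whenever the classes are imbalanced, as they are here.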

The above results demonstrate the superiority of the proposed approach in both data processing and model selection. It has the potential to serve as a foundation module for driving cycle identification in the autonomous driving of commercial vehicles.

5. Discussion

This paper aims to use deep learning methods to improve the recognition accuracy of vehicle driving cycles. Wang et al. [9] proposed an expressway driving cycle recognition method based on the Naive Bayes method, which is an effective method for the optimal design of commercial vehicle driving cycles. Their results showed that the recognition accuracy of expressway driving cycles was 88.26%. This study uses the same original data as [9]. We achieved 99.83% accuracy for expressway driving cycles based on the multilayer perceptron model. Since the recognition accuracy of suburban driving cycles in the original data was significantly lower than that of urban and expressway driving cycles, we analyzed the data further and found that the raw suburban driving cycle data contained dirty data. We cleaned the data with the help of the deep learning model and, using the cleaned data, achieved better results than with the raw data.

Due to the lack of open-source data, our method cannot directly be compared with other methods. However, even with different data, the formulation of the driving cycle recognition is consistent, which is a typical multi-classification problem. This paper has fully verified the effectiveness of the proposed method on three and four classifications, so even if the data changes, this method can still provide accurate recognition results for driving cycles. Driving cycle recognition is a pre-task for many tasks. As our method can effectively improve the accuracy of driving cycle recognition, it can help subsequent tasks achieve better results.

The vehicle driving cycles affect the performance of a hybrid vehicle control strategy. Feng et al. [42] downloaded the velocity data for driving cycles from the U.S. Environmental Protection Agency website. They studied the supervised driving cycle recognition using the k-Nearest Neighbor (kNN) algorithm. Their results showed that adaptive control could improve fuel economy by up to 2.63% with a suitable driving cycle recognition method. Bhatti et al. [43] adopted a hybrid data collection methodology. They developed a real-world urban driving cycle recognition method with a road slope profile for the hilly urban terrain of Islamabad. The driving cycle was constructed using the Markov chain Monte Carlo method, which considered the weights of different road types in the geographic area. Their results showed that, without the driving cycle recognition, the error range of the powertrain system accumulated from 10.2 to 22.2%. In addition, by considering the effect of the driving cycle on the energy management strategy (EMS), Wu et al. [44] proposed a fuzzy EMS based on driving cycle recognition, in order to improve the fuel economy of a parallel hybrid electric vehicle. The simulation research demonstrated that this EMS improved fuel economy more effectively than the fuzzy EMS without driving cycle recognition. In addition, Wang et al. [45] also found that driving cycles greatly influenced the fuel economy and exhaust of the vehicle, especially in hybrid electric vehicles.

6. Conclusions

In this paper, we proposed a multilayer perceptron-based algorithm for recognizing the driving cycle class of commercial vehicles. Based on the 106,200 km data collected from Chinese highways, we chose the average speed, gear ratio, and accelerator pedal opening as the input features for the multilayer perceptron model. Our method achieved significantly better recognition accuracy for the driving cycle than the traditional method using Naive Bayes. Specifically, the proposed method reached 99.83%, 97.85%, and 99.40% recognition accuracy for the expressway driving cycle, the suburban road driving cycle, and the urban road driving cycle, respectively. In the future, we will consider classifying the driving cycle for more detailed scenarios, so as to provide supportive guidance for autonomous commercial vehicles.

Our method has several limitations. We have not tested it on vehicle types other than commercial vehicles. As different vehicle types may require different data descriptions, we will consider extending the data features to more vehicle types in future work. In addition, although we derive the mixed driving cycle from the three driving cycle types, it may cover only some of the diverse driving scenarios in the real world. Therefore, we will classify the driving cycles into more diversified types in future work.

This paper mainly focuses on designing an accurate driving cycle recognition model, so we have not validated the proposed method with subsequent applications. However, our method is flexible enough to be integrated with driving style recognition and fuel economy analysis. It can provide accurate driving cycle recognition results for commercial vehicles and guide commercial vehicle drivers to choose an appropriate driving mode. This can reduce fuel consumption and environmental pollution, improve fuel economy, and support a sustainable development model. Moreover, new energy vehicles and intelligent driving systems have developed quickly in recent years. Driving cycle recognition can also be used to select the appropriate driving mode for new energy vehicles and intelligent driving systems, which reduces the burden of long-time driving on drivers and lowers accident mortality. The accurate recognition of the driving cycle makes driving mode selection convenient and efficient, which can further contribute to the safety and harmony of society.

Author Contributions

Conceptualization, X.W.; methodology, X.W. and Y.Z.; software, X.W. and Y.Z.; validation, X.W. and Y.Z.; writing—original draft preparation, X.W. and Y.Z.; writing—review and editing, X.W., Y.Z. and W.L.; supervision, X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key R&D Plan of Heilongjiang Province, grant number JD22A014, and the National Natural Science Foundation of China, grant number 52175497.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Han, B.; Wu, Z.; Gu, C.; Ji, K.; Xu, J. Developing a Regional Drive Cycle Using GPS-Based Trajectory Data from Rideshare Passenger Cars: A Case of Chengdu, China. Sustainability 2021, 13, 2114.
2. Kim, Y.; Park, J.; Oh, C. A Crash Prediction Method Based on Artificial Intelligence Techniques and Driving Behavior Event Data. Sustainability 2021, 13, 6102.
3. De Gennaro, M.; Paffumi, E.; Martini, G. Big Data for Supporting Low-Carbon Road Transport Policies in Europe: Applications, Challenges and Opportunities. Big Data Res. 2016, 6, 11–25.
4. Tong, H.Y.; Hung, W.T. A Framework for Developing Driving Cycles with On-Road Driving Data. Transp. Rev. 2010, 30, 589–615.
5. Lei, Z.; Cheng, D.; Liu, Y.; Qin, D.; Zhang, Y.; Xie, Q. A Dynamic Control Strategy for Hybrid Electric Vehicles Based on Parameter Optimization for Multiple Driving Cycles and Driving Pattern Recognition. Energies 2017, 10, 54.
6. Jeon, S.-I.; Jo, S.-T.; Park, Y.-I.; Lee, J.-M. Multi-mode driving control of a parallel hybrid electric vehicle using driving pattern recognition. J. Dyn. Sys. Meas. Control 2002, 124, 141–149.
7. Shi, S.; Yu, Z.; Lin, L.; Wu, D.; Zhang, Y. Road Hierarchy for Vehicle Driving Cycle Data Collection Based on K-core Algorithm. China J. Highw. Transp. 2016, 29, 170–178.
8. Shi, Q.; Qiu, D.; Wu, B.; Li, Y.; Liu, B. DCR and Applications Based on PSO-SVM Algorithm. China Mech. Eng. 2018, 29, 1875–1883.
9. Wang, X.; Shi, S.; Pei, Y. Identification of Expressway Driving Cycles for Optimization of Commercial Vehicle Driving Cycles. China J. Highw. Transp. 2022, 35, 355–362.
10. He, H.; Guo, J.; Peng, J.; Tan, H.; Sun, C. Real-time global driving cycle construction and the application to economy driving pro system in plug-in hybrid electric vehicles. Energy 2018, 152, 95–107.
11. Topic, J.; Skugor, B.; Deur, J. Neural Network-Based Prediction of Vehicle Fuel Consumption Based on Driving Cycle Data. Sustainability 2022, 14, 744.
12. Vapnik, V.N. An overview of statistical learning theory. IEEE Trans. Neural Netw. 1999, 10, 988–999.
13. Breiman, L. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Stat. Sci. 2001, 16, 199–231.
14. Wang, L.; Han, M.; Li, X.; Zhang, N.; Cheng, H. Review of Classification Methods on Unbalanced Data Sets. IEEE Access 2021, 9, 64606–64628.
15. Bhowan, U.; Johnston, M.; Zhang, M.; Yao, X. Evolving Diverse Ensembles Using Genetic Programming for Classification with Unbalanced Data. IEEE Trans. Evol. Comput. 2013, 17, 368–386.
16. Wang, X. Research on Unbalanced Data Classification Based on Generated Data Enhancement. Ph.D. Thesis, Beijing Jiaotong University, Beijing, China, 2021.
17. García, S.; Ramírez-Gallego, S.; Luengo, J.; Benítez, J.M.; Herrera, F. Big data preprocessing: Methods and prospects. Big Data Anal. 2016, 1, 9.
18. Hao, S.; Li, G.; Feng, J.; Wang, N. Survey of structured data cleaning methods. J. Tsinghua Univ. Sci. Technol. 2018, 58, 1037–1050.
19. Chu, X.; Ilyas, I.F.; Krishnan, S.; Wang, J. Data cleaning: Overview and emerging challenges. In Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA, 26 June–1 July 2016; pp. 2201–2206.
20. Ma, X.; Yu, Q.; Liu, J. Modeling Urban Freeway Rear-End Collision Risk Using Machine Learning Algorithms. Sustainability 2022, 14, 12047.
21. Huang, X.; You, F.; Zhang, P.; Zhang, Z.; Zhang, B.; Lv, J.; Xu, L. Silent liveness detection algorithm based on multi classification and feature fusion network. J. Zhejiang Univ. Eng. Sci. 2022, 56, 263–270.
22. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
23. Deng, L.; Yu, D. Deep learning: Methods and applications. Found. Trends Signal Process. 2014, 7, 197–387.
24. Nielsen, M.A. Neural Networks and Deep Learning; Determination Press: San Francisco, CA, USA, 2015; Volume 25.
25. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
26. Bratsas, C.; Koupidis, K.; Salanova, J.-M.; Giannakopoulos, K.; Kaloudis, A.; Aifadopoulou, G. A Comparison of Machine Learning Methods for the Prediction of Traffic Speed in Urban Places. Sustainability 2020, 12, 142.
27. Murtagh, F. Multilayer perceptrons for classification and regression. Neurocomputing 1991, 2, 183–197.
28. Perrotta, F.; Parry, T.; Neves, L.C. Application of machine learning for fuel consumption modelling of trucks. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 3810–3815.
29. Han, B.; Ren, F. Improved Xception Facial Expression Recognition Based on MLP. J. Hunan Univ. Nat. Sci. 2022, 49, 65–72.
30. Li, Y.; Tang, G.; Du, J.; Zhou, N.; Zhao, Y.; Wu, T. Multilayer Perceptron Method to Estimate Real-World Fuel Consumption Rate of Light Duty Vehicles. IEEE Access 2019, 7, 63395–63402.
31. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
32. Sai, Q.; Bi, J.; Xie, D.; Guan, W. Identifying and Predicting the Expenditure Level Characteristics of Car-Sharing Users Based on the Empirical Data. Sustainability 2019, 11, 6689.
33. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
34. Agarap, A.F. Deep learning using rectified linear units (relu). arXiv 2018, arXiv:1803.08375.
35. China Automotive Technology Research Center Co., Ltd. Driving cycle of Chinese motor vehicles Part 2: Heavy commercial vehicles. In State Administration of Market Supervision and Administration; China National Standardization Administration: Beijing, China, 2019; Volume GB/T 38146.2-2019, p. 80.
36. Babyak, M.A. What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models. Psychosom. Med. 2004, 66, 411–421.
37. Pang, B.; Nijkamp, E.; Wu, Y.N. Deep Learning with TensorFlow: A Review. J. Educ. Behav. Stat. 2020, 45, 227–248.
38. Tensorflow. Available online: https://tensorflow.org (accessed on 7 November 2021).
39. Smith, L.N. Cyclical learning rates for training neural networks. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 464–472.
40. Bock, S.; Weiß, M. A proof of local convergence for the Adam optimizer. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
41. Roh, C.-G.; Im, I.J. A Review on Handicap Sections and Situations to Improve Driving Safety of Automated Vehicles. Sustainability 2020, 12, 5509.
42. Feng, L.; Liu, W.; Chen, B. Driving pattern recognition for adaptive hybrid vehicle control. SAE Int. J. Altern. Powertrains 2012, 1, 169–179.
43. Bhatti, A.H.U.; Kazmi, S.A.A.; Tariq, A.; Ali, G. Development and analysis of electric vehicle driving cycle for hilly urban areas. Transp. Res. Part D Transp. Environ. 2021, 99, 103025.
44. Wu, J.; Zhang, C.H.; Cui, N.X. Fuzzy Energy Management Strategy for a Hybrid Electric Vehicle Based on Driving Cycle Recognition. Int. J. Automot. Technol. 2012, 13, 1159–1167.
45. Wang, J.; Wang, Q.N.; Zeng, X.H.; Wang, P.Y.; Wang, J.N. Driving cycle recognition neural network algorithm based on the sliding time window for hybrid electric vehicles. Int. J. Automot. Technol. 2015, 16, 685–695.

Figure 1. The overview structure of our model.

Figure 2. Model structure of the commercial vehicle driving cycle recognition based on multilayer perceptron.

Figure 3. Track map of test commercial vehicle: (a) The driving track of 11 commercial vehicles from Guangzhou to Zhengzhou, Jilin to Hangzhou, Xining to Lanzhou, etc.; (b) The driving track of 10 commercial vehicles from Nanjing to Beijing, Kunming to Nantong, Yuzhou to Kashgar, etc.

Figure 4. The results of the full data overfit experiment for three types of driving cycle.

Figure 5. Box plots of data characteristics of three types of the driving cycle in raw data: (a) Average speed; (b) Idle gear ratio; (c) Low gear ratio; (d) Medium gear ratio; (e) High gear ratio; (f) Accelerator pedal opening.

Figure 6. Box plots of data characteristics for four types of cleaned data conditions: (a) Average speed; (b) Idle gear ratio; (c) Low gear ratio; (d) Medium gear ratio; (e) High gear ratio; (f) Accelerator pedal opening.

Figure 7. The recognition results of Naive Bayes on the raw data.

Figure 8. The recognition results of the multilayer perceptron model on the raw data.

Figure 9. The recognition results of Naive Bayes on the four-class cleaned data.

Figure 10. The recognition results of the multilayer perceptron model on the four-class cleaned data.

Figure 11. The recognition results of Naive Bayes on the three-class cleaned data.

Figure 12. The recognition results of the multilayer perceptron model on the three-class cleaned data.

Table 1. Statistical table of commercial vehicle driving test data.

Serial Number	Test Vehicle	Operating Area	Driven Distance (km)	Date of Data Collection	Date of Data Processing
1	Jiefang J6P420 6×4	Guangzhou–Zhengzhou	3588	2013.10.01	2014.01.17
2	Jiefang J6P420 6×4	Guangzhou–Tianjin	4681	2013.10.01	2014.01.04
3	Jiefang J6P420 6×4	Changchun–Xinjiang–Heilongjiang	11,736	2013.10.14	2014.01.10
4	Jiefang J6P420 6×4	Guangzhou–Nanning–Shanghai–Nanning	5428	2013.10.14	2014.01.10
5	Jiefang J6P460 6×4	Yunnan Province	1289	2013.10.16	2014.01.11
6	Jiefang J6P420 6×4	Fujian Province	387	2013.10.18	2014.01.18
7	Jiefang J6P420 6×4	Xiamen–Suzhou	5359	2013.10.23	2014.01.11
8	Jiefang J6P390 6×4	Nanjing–Beijing	2366	2013.10.23	2014.01.04
9	Jiefang J6P390 6×4	Jiangsu Province	1530	2013.10.29	2014.01.10
10	Jiefang J6P420 6×4	Hangzhou–Nanning–Qinzhou–Hangzhou	3787	2013.11.01	2014.01.12
11	Jiefang J6P390 6×4	Guangdong–Henan	9291	2013.11.04	2014.01.17
12	Jiefang J6P420 6×4	Jilin–Henan–Hangzhou	6327	2013.11.05	2014.01.18
13	Jiefang J6P390 6×4	Changchun–Hebei–Yanbian–Jilin–Panshi	3625	2013.11.06	2014.01.02
14	Jiefang J6P390 6×4	Kunming–Nantong	6499	2013.11.13	2014.01.11
15	Jiefang J6P420 6×4	Jilin–Sichuan–Yunnan–Guangzhou	10,146	2013.11.18	2014.01.18
16	Jiefang J6P420 6×4	Xining–Lanzhou	1196	2013.11.20	2014.01.11
17	Jiefang J6P420 6×4	Yuzhou–Zhengzhou–Kashgar (Aksu, Urumqi) round-trip	9185	2013.11.22	2014.01.04
18	Jiefang J6P420 6×4	Guangzhou–Chengde	8532	2013.11.22	2014.01.10
19	Jiefang J6P390 6×4	Hefei–Chongqing–Guizhou–Hefei	5190	2013.11.25	2014.01.02
20	Jiefang J6P390 6×4	Sanming–Jiujiang, Sanming–Ganzhou	3383	2013.11.27	2014.01.17
21	Jiefang J6P420 6×4	Guangzhou–Suihua–Guangzhou–Harbin	9663	2013.12.02	2014.01.03

Table 2. Distribution of training and testing dataset of raw three-class data.

	The Expressway Driving Cycle	The Suburban Road Driving Cycle	The Urban Road Driving Cycle	Total
Training Set	19,879	4579	1077	25,535
Testing Set	8520	1963	462	10,945
Total	28,399	6542	1539	36,480

Table 3. Distribution of training and testing dataset of cleaned four-class data.

	The Expressway Driving Cycle	The Suburban Road Driving Cycle	The Urban Road Driving Cycle	The Mixed Road Driving Cycle	Total
Training Set	16,942	4457	1948	2187	25,534
Testing Set	7262	1911	836	937	10,946
Total	24,204	6368	2784	3124	36,480

Table 4. Distribution of training and testing dataset of cleaned three-class data.

	The Expressway Driving Cycle	The Suburban Road Driving Cycle	The Urban Road Driving Cycle	Total
Training Set	16,942	4457	1948	23,347
Testing Set	7262	1911	836	10,009
Total	24,204	6368	2784	33,356

Table 5. Summary of classification and recognition accuracy results of each experiment.

Data Set	Method	Expressway	Suburban Road	Urban Road	Mixed Road	Average
Raw 3-class	NB	77.84%	41.01%	83.55%	-	67.47%
Raw 3-class	MLP	77.68%	50.31%	83.95%	-	70.65%
Clean 4-class	NB	93.53%	73.73%	97.01%	25.51%	72.45%
Clean 4-class	MLP	98.43%	85.50%	97.85%	84.42%	91.55%
Clean 3-class	NB	93.61%	89.32%	97.01%	-	93.31%
Clean 3-class	MLP	99.83%	97.85%	99.40%	-	99.03%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
