A Review of the Data-Driven Prediction Method of Vehicle Fuel Consumption

Zhao, Dengfeng; Li, Haiyang; Hou, Junjian; Gong, Pengliang; Zhong, Yudong; He, Wenbin; Fu, Zhijun

doi:10.3390/en16145258


by Dengfeng Zhao ¹, Haiyang Li ¹,*, Junjian Hou ¹, Pengliang Gong ², Yudong Zhong ¹, Wenbin He ¹ and Zhijun Fu ¹

¹ Henan Provincial Key Laboratory of Intelligent Manufacturing of Mechanical Equipment, Mechanical and Electrical Engineering Institute, Zhengzhou University of Light Industry, Zhengzhou 450002, China

² Zhengzhou Senpeng Electronic Technology Co., Ltd., Zhengzhou 450052, China

* Author to whom correspondence should be addressed.

Energies 2023, 16(14), 5258; https://doi.org/10.3390/en16145258

Submission received: 3 June 2023 / Revised: 23 June 2023 / Accepted: 27 June 2023 / Published: 9 July 2023

(This article belongs to the Section I1: Fuel)


Abstract:

Accurately and efficiently predicting the fuel consumption of vehicles is the key to improving their fuel economy. This paper provides a comprehensive review of data-driven fuel consumption prediction models. Firstly, by classifying and summarizing the data that affect fuel consumption, it is pointed out that commonly used data currently involve three aspects: vehicle performance, driving behavior, and driving environment. Then, the model structure, predictive performance, and characteristics of traditional machine learning models (support vector machine, random forest) and neural network models (artificial neural network and deep neural network) are compared, and this paper points out that: (1) the prediction model of fuel consumption based on neural networks has a higher data processing ability, higher training speed, and stable prediction ability; (2) by combining the advantages of different models to build a hybrid model for fuel consumption prediction, the prediction accuracy of fuel consumption can be greatly improved; (3) when comparing the relevant indicators, both the neural network method and the hybrid model consistently exhibit a coefficient of determination above 0.90 and a root mean square error below 0.40. Finally, a summary and an outlook are given based on the predictive performance and application status of the various models.

Keywords:

fuel consumption; data-driven; machine learning; neural network; hybrid model

1. Introduction

With the development of the automotive industry, environmental pollution caused by vehicle exhaust emissions greatly impacts the human living environment and physical health [1]. The fuel consumed by vehicles mainly comes from non-renewable energy such as petroleum, and the problem of excessive fuel consumption and exhaust emissions urgently needs to be addressed. Accurately predicting a vehicle’s fuel consumption can help drivers adjust their driving strategies, optimize fuel efficiency, save fuel costs, and reduce environmental pollution. Furthermore, by monitoring real-time fuel consumption data, potential accidents can be effectively prevented, thereby enhancing driving safety.

At present, the prediction methods for vehicle fuel consumption can be roughly classified into two categories (as shown in Figure 1): (1) a physical fuel consumption prediction model constructed based on the principle of vehicle dynamics; (2) a data-driven fuel consumption prediction model.

The first type of model is mainly built through mathematical formulas based on the internal structure of the vehicle and the working principles of components, such as the engine. The model’s transparency is high and can provide more accurate prediction results [2]. However, the research of this type of model is mainly focused on the fixed path in some specific areas, and the influence of different road types and weather conditions on fuel consumption is often ignored, resulting in a single data dimension and the poor applicability of models. For example, Chang et al. [3] used sensors installed on a certain road section to obtain vehicle state parameters at specific locations as model inputs, while Huang et al. [4] used traditional microscopic models (MOVES) to predict vehicle fuel consumption, only using data obtained under simple road conditions, resulting in an inability to guarantee the predictive effect of fuel consumption in practical applications.

The second type of model mainly relies on sensors and other on-board equipment to obtain a large amount of vehicle operation data related to fuel consumption. By mining the optimal features from the data and establishing the nonlinear relationship between the data and fuel consumption, fuel consumption prediction is realized. Compared to the traditional physical model, the data-driven model is simple and easy to construct. It can automatically execute repetitive and tedious tasks, saving time and resources while ensuring good accuracy. The performance of data-driven models is usually evaluated through indicators including the coefficient of determination (R²), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), scatter index (SI), and uncertainty with a 95% confidence level (U₉₅). The calculation equation and evaluation standard for each index are shown in Table 1.
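The evaluation indicators listed above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions of R², MSE, RMSE, MAE, and MAPE; the scatter index is computed here as RMSE divided by the mean observed value, which is a common convention but an assumption with respect to Table 1, and U₉₅ is omitted. The function name and the sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return common error metrics for paired observed/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE in percent; requires all observed values to be non-zero.
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        # Assumed definition: RMSE normalised by the mean observation.
        "SI": rmse / mean_true,
    }

# Hypothetical fuel consumption values in L/100 km.
observed = [8.0, 10.0, 12.0, 14.0]
predicted = [8.5, 9.5, 12.5, 13.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A higher R² and lower values of the error-based indicators point to a better fit; reporting several of them together guards against a single metric hiding systematic errors.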

When using physical models to predict fuel consumption, researchers are required to possess strong domain knowledge, and the application scope is limited, making it challenging to widely apply them to vehicles with different characteristics. The introduction of data-driven methods effectively addresses the limitations of physical models. Well-trained models in data-driven approaches can consider data from different factors as inputs, and their application extends to a wider range of vehicle types. This paper discusses the applications of traditional machine learning (Traditional ML) and the neural network method in the prediction models of fuel consumption based on data-driven models (the process of realizing a fuel consumption prediction by the model is shown in Figure 2). Therefore, based on the existing studies listed in this paper, the above two commonly used data-driven methods are summarized from the two aspects of research years and the number of researchers (as shown in Figure 3).

Machine learning is a classic data-driven fuel consumption prediction method [30] that includes support vector machine (SVM), random forest (RF), decision tree (DT), etc. For example, Heni et al. [31] used SVM and gradient boosting machines to perform nonlinear regression analysis on data, and a large number of experiments based on real conditions have demonstrated the superiority of machine learning methods in fuel consumption prediction models. Hamed et al. [5] established a functional relationship between vehicle speed and fuel consumption based on a support vector machine, with an R² of up to 0.97. Abukhalil et al. [32] used on-board diagnostic (OBD-II) system data to construct an SVM model to estimate fuel consumption, with an RMSE of 2.43. SVM has a concise and interpretable structure, but it is more suitable for data processing on a smaller scale. On the other hand, the random forest algorithm can handle high-dimensional data and perform well in predicting fuel consumption [33] by effectively reducing dimensionality. For example, Gong et al. [34] considered 21 factors that affect fuel consumption to establish a random forest fuel consumption model with a predictive accuracy of 86%. Zhang et al. [35] established a fuel consumption estimation model based on the least squares method using vehicle speed and acceleration. Zhu et al. [36] proposed a prediction model based on the improved C4.5 decision tree and verified the effectiveness of the model using a set of test data from an expressway scenario. In addition, the application of gradient boosting algorithms [37,38], LightGBM [39], and linear regression (LR) [40] in fuel consumption prediction models has also achieved good results. In order to give full play to the advantages of traditional machine learning methods, Li et al. [28] and Mahzad et al. [41] developed multiple hybrid models, including the Aquila optimizer and extreme gradient boosting (AO-XGB), black widow optimization algorithm and extreme gradient boosting (BWOA-XGB), AO-SVM, AO-RF, etc. The results show that such models have better generalization ability and more reliable prediction accuracy.

Traditional machine learning methods have limited data processing capabilities and struggle to adapt to complex large-scale datasets. As a method of solving complex problems in the field of engineering, neural networks are more likely to capture minor changes in complex data, to have good generalization capabilities, and to adapt to changes in new data and new environments. At present, this technology has been widely used in the predictions of emissions and fuel consumption of vehicles [42,43,44], ships [45,46,47], and aircraft [48,49]. Katreddi et al. [50] established an artificial neural network (ANN) model using engine speed, vehicle speed, and other data to predict the fuel consumption of a vehicle in a single journey. The results show that the performance of the ANN is better than that of traditional machine learning methods, such as LR and RF. Zargarnezhad et al. [51] successfully estimated the additional fuel consumption caused by an increase in vehicle weight using an artificial neural network based on the relationship between changes in vehicle weight and engine displacement and fuel consumption, with an MSE of 0.308. Topić et al. [6] trained different fuel consumption prediction models on driving cycle data obtained from GPS and the CAN bus and showed that the ANN model has higher accuracy and better execution performance. Neural networks represented by ANN typically require data preprocessing and lack memory functionality during the prediction process. With the development of digital technology and the enrichment of network storage functions, deep neural network learning methods represented by the recurrent neural network (RNN), convolutional neural network (CNN), and multi-layer perceptron (MLP) neural network can directly extract features from original data without relying on feature engineering, reducing the time cost of human participation in feature extraction. For example, Ali et al. [52] used an MLP neural network to build a fuel consumption model with the total weight and vehicle speed of trucks as input and obtained a relatively low MSE of 0.0017. Because vehicle fuel consumption depends on time-series effects, conventional neural network methods without memory handle it poorly. Panapakidis et al. [53] proposed a hybrid model based on RNN, which fully evaluated the influence of exogenous parameters on ship fuel consumption and proved the superiority of the RNN prediction model. RNN is proficient at extracting vehicle fuel consumption features related to time series, especially in the form of long short-term memory (LSTM) networks, which can handle data with long time intervals and significant delays more effectively. Bougiouklis et al. [54] proposed an electric vehicle energy management strategy based on the LSTM neural network, which reduced energy consumption by 24.03%. Based on the advantages of CNN in image processing, Valido et al. [55] captured the image sequence of vehicles running on the road through a camera sensor, then located the position coordinates of each vehicle in the image through CNN, and finally estimated vehicle emissions and fuel consumption based on distance information and speed, with an average error of 5.48%. In summary, the neural network method has good fault tolerance and robustness, can deal with missing data and noise, and can adjust model parameters adaptively. Therefore, applying it in the prediction models of fuel consumption can obtain prediction results that are highly correlated with the actual fuel consumption of vehicles.

In this paper, various data-driven prediction models of fuel consumption in recent years were reviewed from the aspects of model construction, data acquisition, and prediction performance. By comparing the traditional machine learning method with the neural network method, the paper points out that: (1) the fuel consumption prediction model based on the neural network can better mine the feature information related to fuel consumption in the data, can establish the nonlinear relationship between sensor data and fuel consumption prediction, and has better model generalization ability, higher stability, and better prediction accuracy; (2) prediction models with a single method or a single dimension tend to pay too much attention to the details of the training data set and cannot be generalized, which easily leads to the overfitting of the model and ignores the impact of the interaction between different factors on fuel consumption; (3) the application of the hybrid model and multivariate data fusion technology can fully consider the influence of multi-dimensional factors such as person-vehicle-road on fuel consumption, and the research in this field will be the future development direction.

The remaining structure of this paper is as follows: The second section categorizes and summarizes the relevant data that affect fuel consumption and introduces the methods of obtaining different types of data. The third section discusses and analyzes fuel consumption prediction methods’ characteristics and research status. Finally, the paper summarizes the characteristics of different fuel consumption prediction models and gives prospects.

2. Data Analysis of Vehicle Fuel Consumption

The data required by the prediction model of fuel consumption mainly depend on variables that have an impact on vehicle fuel consumption. Generally speaking, they can be roughly divided into three categories: vehicle inherent variables, driving behavior variables, and driving environment variables (as shown in Table 2). Vehicle inherent variables include vehicle and engine model, engine capacity, total vehicle mass, etc., which can be obtained from the information provided by vehicle manufacturers. Driving behavior variables mainly reflect the changes in vehicle running state data caused by driver actions, such as pressing the pedals and turning, during starting, stopping, and normal driving. Driving environment variables mainly come from the influence of uncontrollable factors such as weather, altitude, and road slope on the vehicle’s running state. These two types of data mainly include driving speed and acceleration, engine speed and torque, engine load rate, driving distance, etc. The changes in vehicle operating data caused by the above variables can be obtained through on-board sensing devices such as GPS, gyroscope, OBD-II, and an on-board controller area network (CAN). Considering the convenience of data uploading and the portability of devices, smartphone sensors and apps can also be used to obtain information such as vehicle operation data, driving behavior, and the real fuel consumption of vehicles.

Due to the possibility of errors or communication failures during the data collection process, it is necessary to filter the original data to eliminate invalid and redundant data, including duplicate data, extreme data, and data from vehicles that are not running (such as data with a speed of 0 or GPS information not updated for a long time). For the filtered valid data, feature engineering is necessary to establish a dataset with a high correlation to fuel consumption, aiming to improve the robustness and convergence speed of the prediction model (as shown in Figure 4). This process mainly involves two stages: feature dimensionality reduction and feature selection. Principal component analysis (PCA) and the Pearson correlation coefficient (PCC) are the most commonly used techniques. PCA is commonly used for unsupervised dimensionality reduction tasks. It employs linear transformations to map the original features to a new low-dimensional space. By maximizing the variance retained along the new axes, PCA enhances the separability between samples, thereby preserving as much original data information related to fuel consumption as possible. The PCC method measures the strength of the linear relationship between two variables: dividing the covariance of the two variables by the product of their standard deviations yields a correlation coefficient between −1 and 1, which is then used to screen variables. In addition, sensitivity analysis is also used to determine the influence of an input variable on an output variable. The higher the sensitivity coefficient, the stronger the influence of that variable on the output (see Equation (8)):

W_{i} = \frac{MSE_{i}}{MSE} \quad (8)

where W_{i} is the sensitivity coefficient of the model to the variable i; MSE_{i} is the MSE of the model without the variable i.
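The Pearson screening step and the sensitivity coefficient of Equation (8) can be illustrated with a small, self-contained sketch. It makes several assumptions that are not stated in the text: the prediction model is replaced by an ordinary least-squares linear fit, the denominator MSE in Equation (8) is taken to be the error of the model trained on all variables, the errors are measured on the training samples, and the data and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fit_linear(rows, y):
    """Least-squares fit y ~ b0 + b.x via the normal equations (Gauss-Jordan)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def mse(rows, y, coef):
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    return sum((t - p) ** 2 for t, p in zip(y, preds)) / len(y)

def sensitivity(rows, y):
    """W_i = MSE_i / MSE, where MSE_i is the error after dropping variable i."""
    base = mse(rows, y, fit_linear(rows, y))
    weights = []
    for i in range(len(rows[0])):
        reduced = [r[:i] + r[i + 1:] for r in rows]
        weights.append(mse(reduced, y, fit_linear(reduced, y)) / base)
    return weights

# Hypothetical samples: [speed km/h, acceleration m/s^2, unrelated sensor reading].
rows = [[30, 0.5, 3], [50, -0.2, 1], [70, 0.1, 4],
        [90, 0.8, 1], [110, -0.5, 5], [60, 0.3, 9]]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
fuel = [2 + 0.08 * s + 1.5 * a + e for (s, a, _), e in zip(rows, noise)]

correlations = [pearson([r[i] for r in rows], fuel) for i in range(3)]
weights = sensitivity(rows, fuel)
print(correlations, weights)
```

In this toy setup, removing the speed column inflates the error far more than removing the unrelated column, so speed receives the larger W_i. With a real model, MSE_i would normally be measured on held-out data rather than on the training samples.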

3. Prediction Models of Vehicle Fuel Consumption

Machine learning is a method that uses algorithms and statistical models to learn general rules from observed data to predict unknown variables, including traditional machine learning (RF, SVM, LR, etc.) and neural network learning (ANN, FNN, BPNN, LSTM, etc.). It has the advantages of a simple structure, high prediction accuracy, and strong ability in data recognition, which facilitates the establishment of nonlinear relationships between observation data and automotive fuel consumption and makes it widely applicable in the field of automotive fuel consumption prediction. This section mainly introduces the structure, principle, and performance of fuel consumption prediction models.

3.1. SVM Model

SVM is mainly used to solve binary classification tasks and relies on the kernel function to realize mapping from low-dimensional data to high-dimensional data. Its working principle is shown in Figure 5. Figure 6 shows the process of using support vector machines for fuel consumption prediction.

SVM is suitable for processing small sample data and can select different kernel functions based on the type of datasets, which can be well applied to the fitting problem of fuel consumption models [56,57]. Wang et al. [58] used SVM to predict aircraft fuel consumption, with an average estimation error of −0.039%. Hussain et al. [7] used on-board sensor data to predict bus fuel consumption, and the R² of the SVM model was 0.95. Araújo et al. [21] used the SVM algorithm to predict the influence of road–tire interaction on vehicle energy consumption, and the results showed that using more data as input could improve the prediction performance of the model and reduce the error. However, the aforementioned studies used partially abstract input data, which posed challenges in terms of measurability. In response, Liu et al. [25] simplified the data collection process by using the most common variables, such as engine speed, vehicle speed, and acceleration, as inputs. They also achieved favorable prediction results, with a mean absolute error (MAE) of less than 0.16. Support vector regression (SVR) is a special algorithm for solving regression problems in SVM [59].
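To make the role of the kernel function and of SVR more concrete, the following minimal sketch shows a Gaussian (RBF) kernel, the ε-insensitive loss that SVR minimizes, and the kernel expansion a trained SVR uses for prediction. It does not train a model; the support vectors, coefficients, bias, and the `gamma` and `epsilon` values are hypothetical placeholders, not values from any of the cited studies.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: similarity in an implicit high-dimensional space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """SVR loss: errors inside the epsilon tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Kernel expansion f(x) = sum_i a_i * k(s_i, x) + b of a trained SVR."""
    return bias + sum(a * rbf_kernel(s, x, gamma)
                      for a, s in zip(dual_coefs, support_vectors))

# Hypothetical, already-scaled inputs: [engine speed, vehicle speed].
support_vectors = [[0.2, 0.3], [0.8, 0.7]]
dual_coefs = [-1.5, 2.0]   # placeholder values, not fitted
bias = 8.0                 # placeholder baseline consumption

estimate = svr_predict([0.8, 0.7], support_vectors, dual_coefs, bias)
print(estimate)
```

Choosing a different kernel only means swapping `rbf_kernel` for another similarity function, which is how SVM adapts to different types of datasets.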

Nevertheless, studies that employed a single SVM model to predict fuel consumption cannot guarantee ideal prediction results. Ma et al. [60] used the asymmetric ε-band fuzzy support vector regression based on the data domain description (ASVDD) to predict fuel consumption, and the MSE was less than 0.00127. Using a multi-fusion algorithm model to forecast fuel consumption can not only improve the overall performance of the model but can also mitigate the overfitting problem caused by noise and other factors. For example, Li et al. [61] proposed a coupled model of extreme learning machine and support vector machine (SVM-ELM) to predict energy consumption, with an R² of 0.99. In addition, when the least squares method (LSM) [62] or a genetic algorithm (GA) [17] is used in the support vector machine model, the prediction performance can also be significantly improved.

According to the input of different data sources and data characteristics, the SVM model has different adaptability and performance. The results of relevant studies are shown in Table 3.

3.2. RF Model

The RF model is a supervised learning algorithm that uses an ensemble of decision trees to learn rules from given samples. The importance of each feature can be estimated during the execution of the algorithm, and missing values can be automatically processed. The working principle is shown in Figure 7. In Figure 8, bagging stage 1 performs random sampling with replacement to assemble new datasets; in bagging stage 2, each decision tree processes the relevant data and extracts features; the voting stage is responsible for voting on the results generated by each decision tree to determine the final fuel consumption output.
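The bagging and voting stages can be sketched as follows. This is a deliberately reduced illustration, not a full random forest: each tree is a depth-1 regression tree (a stump) on a single feature, the random feature subsampling of a real random forest is omitted, and for regression the vote is taken as the average of the tree outputs. The data and all function names are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold split on a 1-D feature (a depth-1 regression tree)."""
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # the bootstrap sample held a single distinct x value
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        # Bagging stage 1: bootstrap sample, drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Bagging stage 2: fit one tree on the resampled data.
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    # Voting stage (regression): average the outputs of all trees.
    return sum(predict_stump(s, x) for s in forest) / len(forest)

# Hypothetical data: vehicle speed (km/h) vs. fuel consumption (L/100 km).
speeds = [20, 30, 40, 50, 60, 70, 80, 90]
fuel = [6.0, 6.2, 6.1, 6.3, 9.0, 9.2, 9.1, 9.3]

forest = fit_forest(speeds, fuel)
low = predict_forest(forest, 20)
high = predict_forest(forest, 90)
print(low, high)
```

Because every tree sees a slightly different resampled dataset, the averaged prediction is less sensitive to individual noisy samples than any single tree.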

RF has fewer parameters, which can compensate for the difficulty of parameter tuning in SVM and yield better performance, as demonstrated by Pereira et al. [65]. Therefore, RF is widely applied in the prediction of fuel consumption [66] and energy consumption [67]. Perrotta et al. [10] established three truck fuel consumption prediction models, namely SVM, ANN, and RF, with determination coefficients of 0.83, 0.85, and 0.87, respectively. Because that study focused specifically on articulated trucks and considered a limited set of variables during the data acquisition phase, neglecting factors such as temperature and driving behavior [68,69,70], the accuracy of the models was compromised. Massoud et al. [71] used RF to analyze the relationship between driving behavior data and fuel consumption, taking characteristic parameters representing vehicle speed and engine speed as input, and obtained an R² of 0.896 and an MSE of 1.506. Yang et al. [18] considered the influence of vehicle factors, environmental factors, and driving behavior factors on fuel consumption, and the MAE of the random forest fuel consumption prediction model was 0.63, which greatly improved the generalization level of the model.

When multiple factors are considered as inputs, the random forest (RF) model exhibits poorer performance. Therefore, Hu et al. [72] proposed a hybrid model consisting of RF, XGBoost, and multiple linear regression (MLR) methods to predict ship fuel consumption. Compared to using a single model, this hybrid model achieved smaller error values. Shi et al. [29] integrated the improved arithmetic optimization algorithm with RF, and the application effect of the model was significantly improved (considering R² and RMSE).

The prediction model of fuel consumption established based on RF can effectively identify the nonlinear relationship between different variables and provide prediction results that are highly correlated with actual fuel consumption, especially the application of the coupling model, which can effectively reduce the error of the fuel consumption prediction model. Table 4 summarizes the literature that used the RF model to predict fuel consumption.

3.3. Neural Network Model

As a mathematical model that mimics the structure and function of the biological brain, ANN can accept multiple data inputs to achieve single- or multiple-result outputs, and this is something SVM and RF models do not possess. Figure 9 shows the structure of the artificial neural network model.

The neural network has good nonlinear mapping ability [73] and can automatically identify and learn the characteristic information of input data [74,75]. Moreover, it has a strong capability in parallel computing and can show high efficiency in processing large-scale data [76,77]. Huang et al. [78] used a radial basis function neural network (RBFNN) to predict fuel consumption with an accuracy of 85%. However, it should be noted that the model they proposed was trained only on a local dataset, which implies a strong regional bias and limited applicability in other regions. In contrast, Wysocki et al. [13] trained a fuel consumption model using heavy-duty truck driving data collected over a five-year span. They employed an ANN model and achieved an RMSE of 0.32. However, it is important to note that historical data can be influenced by environmental changes and vehicle performance degradation, leading to unreliable predictions. To address the issue of low data reliability, Ling et al. [79] proposed a model predictive control (MPC) framework based on artificial neural networks. This framework relied on real-time predictions of vehicle speed to effectively tackle the problem. Simulation experiments demonstrated that heavy-duty vehicles (HDVs) controlled by MPC achieved a 5.9% reduction in fuel consumption. Asher et al. [26] and Sun et al. [14] applied an artificial neural network to the fuel consumption prediction of hybrid electric vehicles, and the MAE of the model was between 0 and 0.1%. In addition, there are a few studies on the fuel consumption prediction model based on a feedforward neural network (FNN) [80]. Topić et al. [6] used vehicle speed to predict the fuel consumption of a bus, and the R² of the FNN model was more than 0.97.

There are interactions among different variables that display clear nonlinear relationships. However, an ANN that uses only linear activations can only model linear relationships between variables. In contrast, the backpropagation neural network (BPNN) trains and adjusts the weights of connections in the network using nonlinear differentiable activation functions and is capable of effectively addressing complex nonlinear problems [81]. Du et al. [82] proposed a BPNN model with a structure of 9-10-1, analyzed fuel consumption levels from the two dimensions of time and space, and comprehensively described the relationship between fuel consumption and various influencing factors; the accuracy of the model reached 81.7%.
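The weight adjustment performed by backpropagation can be sketched with a network that has one hidden layer and a single output. The example below is only an illustration under stated assumptions: it uses tanh hidden units, a linear output, a squared-error loss, and plain stochastic gradient descent on a tiny hypothetical dataset with two normalized inputs. A 9-10-1 structure such as the one reported above would correspond to `init_network(9, 10)` in this sketch; the activation functions and training settings of the cited study are not given in the text.

```python
import math
import random

def init_network(n_in, n_hidden, seed=0):
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return hidden activations and the scalar output for one sample."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]
    return hidden, output

def train_step(net, x, y, lr=0.05):
    """One backpropagation update for a single sample (loss = 0.5 * error^2)."""
    hidden, output = forward(net, x)
    d_out = output - y                                    # dL/d(output)
    for j, h in enumerate(hidden):
        # Chain rule through tanh: tanh'(z) = 1 - tanh(z)^2.
        d_hidden = d_out * net["W2"][j] * (1.0 - h * h)
        net["W2"][j] -= lr * d_out * h
        net["b1"][j] -= lr * d_hidden
        for i, v in enumerate(x):
            net["W1"][j][i] -= lr * d_hidden * v
    net["b2"] -= lr * d_out

def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)

# Hypothetical normalized samples: ([speed, acceleration], fuel consumption).
data = [([0.1, 0.2], 0.15), ([0.4, 0.1], 0.30), ([0.6, 0.5], 0.55),
        ([0.9, 0.3], 0.70), ([0.3, 0.8], 0.50), ([0.8, 0.9], 0.90)]

net = init_network(n_in=2, n_hidden=4)
loss_before = mean_squared_error(net, data)
for _ in range(200):
    for x, y in data:
        train_step(net, x, y)
loss_after = mean_squared_error(net, data)
print(loss_before, loss_after)
```

The nonlinear hidden activation is what allows the network to represent nonlinear relationships; replacing `math.tanh` with the identity would reduce the whole model to a linear regression.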

While the study considered multiple factors influencing fuel consumption, the accuracy of using a standalone BP neural network for fuel consumption prediction is limited. Therefore, some studies have coupled multiple models to predict fuel consumption [83,84]. For example, Shang et al. [19] used data such as vehicle speed and GPS coordinates, combined with the hidden Markov model (HMM) and BPNN to predict fuel consumption, and the MSE of the model was less than 0.06 and the R² was more than 0.95. Similar coupling models for fuel consumption prediction also include the BPNN model based on the genetic annealing algorithm (GSA) and the BPNN model based on the Cauchy multi-verse optimizer (CMVO).

The results of fuel consumption prediction models based on different neural networks are shown in Table 5. Compared to the single neural network prediction model, the prediction accuracy of fuel consumption can be further improved by coupling the neural network with other methods.

3.4. Deep Neural Network

The neural network model represented by ANN and FNN is a static prediction process based on known historical data. However, the actual vehicle operation data are constantly changing, which may lead to a large deviation between the predicted value and the actual value [91]. Deep neural networks (DNN) are composed of neurons at multiple levels [92]. The automatic learning of data and automatic feature extraction can be realized through multiple feedback training, which is suitable for establishing the dynamic variation process between feature data and fuel consumption [93]. Li et al. [94] trained a well-performing MLP fuel consumption model using data on factors such as climate conditions and vehicle characteristics. Ziółkowski et al. [27] applied the Pearson correlation coefficient method to an MLP model and used data from 1750 passenger vehicles to construct a model with a structure of 22-10-3. They were able to effectively control the MAPE within the range of 5% to 8%. While MLP models can address the negative impact of noise factors, they lack the ability to perform recurrent processing compared to RNN and CNN models. MLP models can only propagate information in a forward direction and do not have the capability to autonomously handle relevant data features. RNN is a kind of neural network with time delay characteristics and contains a loop structure inside, which enables it to recursively propagate the input and retain the previously useful data processing state [95]. For example, Xu et al. [20] used a generalized recurrent neural network (GRNN) to establish a relationship model between driving behavior and fuel consumption and took the speed obtained on different routes as input to obtain lower relative error and MSE. Kanarachos et al. [96] selected an RNN based on the nonlinear autoregressive model with exogenous inputs (NARX) to predict the instantaneous fuel consumption of the vehicle, and the error between the prediction result and actual fuel consumption was less than 6%. The above two studies relied on the Internet of Vehicles and smartphones for data collection, respectively. However, the data upload process is susceptible to network delays, and RNNs with general network structures struggle to handle the long-term dependencies in the data and are prone to issues such as vanishing and exploding gradients. LSTM adds a gating mechanism to the RNN, which can effectively solve the problem of long sequence information attenuation and better deal with long lag data. The structure of LSTM is shown in Figure 10.
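The gating mechanism can be illustrated with a single scalar LSTM cell written out step by step. The equations are the standard LSTM formulation (forget, input, and output gates plus a candidate state); the parameter values below are hypothetical and chosen only to show how a forget gate close to 1 and an input gate close to 0 let the cell state survive many time steps, which is the property that counters information attenuation in long sequences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell; params maps gate name -> (w_x, w_h, b)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # share of the old cell state to keep
    i = gate("input", sigmoid)         # share of new information to write
    g = gate("candidate", math.tanh)   # proposed new content
    o = gate("output", sigmoid)        # share of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters: keep memory (forget ~ 1) and ignore new input (input ~ 0).
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.5, 0.5, 0.0),
}

h, c = 0.0, 1.0                        # start with a stored cell state of 1.0
for step in range(50):
    speed = 0.5 + 0.4 * math.sin(step) # hypothetical normalized speed signal
    h, c = lstm_step(speed, h, c, params)
print(h, c)
```

After 50 steps the cell state is still close to its initial value, whereas a plain recurrent unit would have overwritten it at every step.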

Based on the advantages of LSTM in fuel consumption prediction, Ping et al. [97] established an LSTM neural network with a different number of hidden nodes combined with driving behavior and traffic condition data to predict fuel consumption, and the prediction accuracy could reach 84.7%. Kan et al. [98] proposed a heavy truck fuel consumption estimation model based on LSTM, with an average error of 0.137. Jain et al. [99] used LSTM to monitor the instantaneous fuel economy of vehicles, and the overall accuracy of the model exceeded 98%. Based on the data obtained in various scenarios, Wang et al. [100] used LSTM to predict vehicle fuel consumption, and the error was less than 0.1. With a large number of weights for LSTM, it takes several iterations to obtain a well-trained model, and overfitting is prone to occur when the number of input data is insufficient. Therefore, Hua et al. [101] adopted model pre-training and transfer learning to achieve the high-level prediction of energy consumption.

CNN uses the convolution operation to extract features from input data and is characterized by local connections and weight sharing. Compared to RNN, CNN is suitable for processing data with spatial structures. CNN is composed of a convolutional layer, a pooling layer, and a fully connected layer, as shown in Figure 11. The convolutional layer performs convolution calculations on the input and extracts features; the pooling layer reduces the computation by reducing the dimension of the data; and the fully connected layer is responsible for the output of results.
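The three layer types can be sketched for a one-dimensional signal as follows. This is a minimal illustration with a single hand-picked kernel instead of learned weights; the speed trace, the kernel, and the dense-layer weights are hypothetical. The same kernel is applied at every position of the signal, which is what weight sharing means in practice.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution as used in CNNs (cross-correlation, no kernel flip)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def dense(values, weights, bias):
    """Fully connected layer with a single output neuron."""
    return sum(v * w for v, w in zip(values, weights)) + bias

# Hypothetical speed trace (m/s) and a kernel that responds to speed increases.
speed = [10, 12, 15, 15, 14, 10, 8]
kernel = [-1, 0, 1]

features = conv1d(speed, kernel)        # local feature extraction
pooled = max_pool1d(relu(features))     # dimension reduction
estimate = dense(pooled, [0.1, 0.1], 6.0)
print(features, pooled, estimate)
```

In a trained CNN the kernel and dense weights would be learned from data, and many kernels would run in parallel to extract different features.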

CNN has certain advantages in image processing [102,103] but is rarely used in vehicle fuel consumption prediction. Hien et al. [15] used a one-dimensional convolutional neural network to estimate the total fuel consumption of vehicles on highways and urban roads, and the R² was 0.99. Yan et al. [104] applied CNN to the energy management of hybrid electric vehicles, effectively improving the fuel economy of vehicles. Han et al. [105] proposed a coupling model concept based on CNN, and Metlek [16] verified the coupling model of CNN and LSTM with 13 different input parameters and obtained a high R² of 0.974.

Deep neural networks have strong adaptability and learning ability: they can accept raw data directly as input and transform it into more abstract representations, which lets them learn more complex functions from the data. They are also known for their strong dependence on input data; the more data that are available for training, the better the model tends to perform, and this dependence is more pronounced than for fuel consumption prediction models built using other methods. Although deep neural networks can approximate any nonlinear continuous function with arbitrary accuracy, they cannot explain their complex decision-making process, and the models are difficult to visualize. DNNs also place higher requirements on data quality, which increases the cost of manual annotation. In addition, the high complexity of deep neural network models demands higher-performance computing hardware.

4. Summaries and Prospects

4.1. Summaries

This paper mainly reviewed data-driven fuel consumption forecasting methods, including SVM, RF, ANN, BP, and RNN. Through comparative analysis, the advantages and disadvantages of the various fuel consumption prediction methods were summarized (Table 6), the prediction results were compared (based on the two evaluation indicators R² and RMSE, as shown in Figure 12), and the following conclusions were drawn:

(1) In data-driven fuel consumption prediction, the fuel consumption process of vehicles is affected by multiple time-varying factors (such as the vehicle running state, driver habits, and driving environment), so the poor fit caused by data coupling and similar effects needs further consideration. To address this problem, PCA and similar methods can be used to remove redundant features and mitigate poor model performance on high-dimensional datasets; the Pearson correlation coefficient can also be used to screen for features that are highly correlated with fuel consumption as model inputs, further ensuring that the model has sufficient accuracy.
(2) Traditional machine learning methods have good predictive performance, but some methods require features to be extracted manually. Existing studies mainly concentrate on data from a single scenario, so the models have poor applicability and limited generalizability. Therefore, fusing multi-dimensional features for fuel consumption modeling, starting from the data collection stage, can effectively improve the accuracy and enhance the generalization capacity of the model.
(3) Fuel consumption prediction models based on neural networks are accurate and stable in prediction, but they depend heavily on the amount of input data. When the input data are insufficient, they tend to show poor generalization ability or overfitting. To address this problem, data augmentation can be used to increase the number of samples and maximize the utilization of sample data.
(4) The accuracy of fuel consumption prediction models largely depends on the quality and quantity of input data. Vehicle sensor data are widely used for their accuracy, reliability, large data volume, and low cost, but they have some problems, such as transmission delay. Obtaining data with smartphones is more real-time, efficient, and convenient. Therefore, in the future, rapid and comprehensive data collection can be achieved by combining onboard sensing devices and smartphones. In addition, when large-scale datasets are used for model training, the generalization ability of the model can be effectively improved by applying normalization and other processing methods in the preprocessing stage.
(5) The hybrid fuel consumption model is composed of different machine learning methods and can combine the advantages of multiple models to deal with more complex tasks, with strong nonlinear expression ability and good robustness. However, the structure of such a model is complex, the computational cost is high, and the parameters are not easy to determine, all of which are drawbacks in practical application.
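The Pearson screening mentioned in conclusion (1) and the normalization mentioned in conclusion (4) can be sketched as follows. This is a minimal sketch under stated assumptions: the feature names, the data values, and the 0.5 correlation threshold are all hypothetical, and min-max scaling is only one of several possible normalization methods.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.5):
    """Keep features whose |r| with the target reaches the (assumed) threshold."""
    return [
        name for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up trip-level data: three candidate features and the fuel target.
columns = {
    "engine_load_pct": [20.0, 40.0, 60.0, 80.0, 100.0],
    "mean_speed_kmh": [90.0, 70.0, 65.0, 40.0, 30.0],
    "cabin_temp_c": [21.0, 22.0, 21.0, 22.0, 21.0],
}
fuel_l_per_100km = [5.0, 6.1, 6.9, 8.2, 9.0]

selected = select_features(columns, fuel_l_per_100km, threshold=0.5)
model_inputs = {name: min_max_normalize(columns[name]) for name in selected}
```

Taking the absolute value of the coefficient keeps strongly negative relationships as well as positive ones. Pearson screening captures only linear dependence, so a feature with a nonlinear effect on fuel consumption could be dropped by this filter.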

From Figure 12, the following conclusions can be drawn: (1) Neural network methods, such as BPNN and DNN, applied in fuel consumption prediction models, can provide relatively accurate prediction results. (2) Hybrid prediction models combining machine learning and neural network techniques leverage the advantages of different models, leading to predictions that are highly correlated with actual fuel consumption values. (3) Based on the error analysis results, deep neural networks and hybrid models demonstrate the best performance. The overall trend of the graphs is stable, and the error levels are relatively consistent.
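For reference, the two indicators used in this comparison can be computed as in the sketch below. R² follows Equation (1) in Table 1, and RMSE uses its standard definition, the square root of the mean squared error. The sample values and function names are hypothetical.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the target."""
    squared_errors = [(y - p) ** 2 for y, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up measured and predicted fuel consumption values (L/100 km).
actual = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 7.5, 9.0]

r2 = r_squared(actual, predicted)
error = rmse(actual, predicted)
```

Because RMSE carries the unit of the target variable, values from studies that report fuel consumption in different units are not directly comparable, whereas R² is dimensionless.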

In summary, current research on fuel consumption prediction struggles to meet the requirements of predictive performance on both input data and model selection at the same time. On the one hand, some researchers consider too many factors influencing fuel consumption, which makes data processing harder and leads to inadequate model accuracy. On the other hand, some studies lean towards simpler prediction models and a smaller, more manageable dataset to achieve higher predictive performance. Overall, by using a reasonable dataset and selecting suitable neural network methods or hybrid models, it is possible to obtain satisfactory fuel consumption prediction results. Although this approach may increase the complexity of the research, it can yield stable and reliable outcomes.

4.2. Prospects

The application of traditional machine learning methods, neural network methods, and hybrid models has removed the constraints of traditional fuel consumption prediction methods, and more information related to fuel consumption can be mined through big data and other technical means to accurately predict fuel consumption levels. The neural network model can map the nonlinear relationship between input and output well, and its prediction performance is more stable; the hybrid model can exploit the advantages of different prediction models more comprehensively, can offset the disadvantages of individual methods, and has high prediction accuracy. Therefore, establishing the relationship between on-board data and fuel consumption by coupling different neural network methods will be the development trend in the field of vehicle fuel consumption prediction.

Advances in automobile intelligence, data acquisition, and data processing are making vehicles more intelligent, efficient, safe, and reliable. Considering the influence of driving style, vehicle performance, road environment, and other factors on fuel consumption, another development trend in this field is to feed coupled driver-vehicle-road multivariate data into fuel consumption prediction models in order to establish more comprehensive and accurate models.

In future research, in addition to focusing on the aforementioned aspects, researchers should also consider the universality of the chosen models. It is important to conduct a comprehensive fuel consumption analysis that takes into account a variety of vehicle types, such as passenger cars, trucks, and buses, as well as the performance differences of these vehicle types in different geographic regions (high-altitude areas, mountainous regions, extremely cold regions, etc.).

Author Contributions

D.Z.: conceptualization; methodology; visualization. H.L.: writing-original draft; writing-review and editing. J.H.: data curation; writing-review and editing. P.G.: funding acquisition; investigation. Y.Z.: writing-review and editing. W.H.: methodology. Z.F.: methodology. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62073298), partly supported by the Key Research and Development Projects in Henan Province in 2022 (221111240200), partly supported by the Key Scientific and Technological Project of Henan Province (232102221040), partly supported by the Opening Project of Key Laboratory of operation safety technology on transport vehicles, Ministry of Transport, PRC (KFKT2022-05), and partly funded by the Research Foundation of Zhengzhou University of Light Industry (2021BSJJ021).

Data Availability Statement

This review is based on the analysis and review of existing research results in the literature, without creating new data sets.

Conflicts of Interest

The authors declare no conflict of interest.

References

Yang, N.; Yang, L.; Xu, F.; Han, X.; Liu, B.; Zheng, N.; Li, Y.; Bai, Y.; Li, L.; Wang, J. Vehicle Emission Changes in China under Different Control Measures over Past Two Decades. Sustainability 2022, 14, 16367.
Wang, J.; Rakha, H.A. Fuel consumption model for heavy duty diesel trucks: Model development and testing. Transp. Res. Part D Transp. Environ. 2017, 55, 127–141.
Chang, X.; Chen, B.Y.; Li, Q.; Cui, X.; Tang, L.; Liu, C. Estimating Real-Time Traffic Carbon Dioxide Emissions Based on Intelligent Transportation System Technologies. IEEE Trans. Intell. Transp. Syst. 2013, 14, 469–479.
Huang, W.; Guo, Y.; Xu, X. Evaluation of real-time vehicle energy consumption and related emissions in China: A case study of the Guangdong–Hong Kong–Macao greater Bay Area. J. Clean. Prod. 2020, 263, 121583.
Hamed, M.A.; Khafagy, M.H.; Badry, R.M. Fuel Consumption Prediction Model using Machine Learning. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 14569.
Topić, J.; Škugor, B.; Deur, J. Neural Network-Based Prediction of Vehicle Fuel Consumption Based on Driving Cycle Data. Sustainability 2022, 14, 744.
Hussain, M.; O’Nils, M.; Lundgren, J.; Carratú, M.; Shallari, I. Selection of optimal parameters to predict fuel consumption of city buses using data fusion. In Proceedings of the 2022 IEEE Sensors Applications Symposium (SAS), Sundsvall, Sweden, 1–3 August 2022; pp. 1–6.
Çapraz, A.G.; Özel, P.; Şevkli, M.; Beyca, Ö.F. Fuel Consumption Models Applied to Automobiles Using Real-time Data: A Comparison of Statistical Models. Procedia Comput. Sci. 2016, 83, 774–781.
Hassan, M.A.; Salem, H.; Bailek, N.; Kisi, O. Random Forest Ensemble-Based Predictions of On-Road Vehicular Emissions and Fuel Consumption in Developing Urban Areas. Sustainability 2023, 15, 1503.
Perrotta, F.; Parry, T.; Neves, L.C. Application of machine learning for fuel consumption modelling of trucks. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 3810–3815.
Yu, P.; Xi, J.; Yamauchi, H. Time and Environment Dependency Aware Fuel Consumption Tracking Method for Improving Drivers and Trucks Management. In Proceedings of the 2021 36th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), Jeju, Republic of Korea, 27–30 June 2021; pp. 1–4.
Yu, P.; Yamauchi, H. A Machine Learning Based Fuel Consumption Saving Method with Time and Environment Dependency Aware Management. In Proceedings of the 2022 5th International Conference on Electronics, Communications and Control Engineering (ICECC 2022), Association for Computing Machinery, New York, NY, USA, 25–27 March 2022; pp. 40–49.
Wysocki, O.; Deka, L.; Elizondo, D. Heavy duty vehicle fuel consumption modelling using artificial neural networks. In Proceedings of the 2019 25th International Conference on Automation and Computing (ICAC), Lancaster, UK, 5–7 September 2019; pp. 1–6.
Sun, R.; Chen, Y.; Dubey, A.; Pugliese, P. Hybrid electric buses fuel consumption prediction based on real-world driving data. Transp. Res. Part D Transp. Environ. 2021, 91, 102637.
Hien, N.L.H.; Kor, A. Analysis and Prediction Model of Fuel Consumption and Carbon Dioxide Emissions of Light-Duty Vehicles. Appl. Sci. 2022, 12, 803.
Metlek, S. A new proposal for the prediction of an aircraft engine fuel consumption: A novel CNN-BiLSTM deep neural network model. Aircr. Eng. Aerosp. Technol. 2023, 95, 838–848.
Wang, J.; Shen, L.; Bi, Y.; Lei, J. Modeling and optimization of a light-duty diesel engine at high altitude with a support vector machine and a genetic algorithm. Fuel 2021, 285, 119137.
Yang, Y.; Gong, N.; Xie, K.; Liu, Q. Predicting Gasoline Vehicle Fuel Consumption in Energy and Environmental Impact Based on Machine Learning and Multidimensional Big Data. Energies 2022, 15, 1602.
Shang, R.; Zhang, Y.; Shen, Z.-J.M.; Zhang, Y. Analyzing the Effects of Road Type and Rainy Weather on Fuel Consumption and Emissions: A Mesoscopic Model Based on Big Traffic Data. IEEE Access 2021, 9, 62298–62315.
Xu, Z.; Wei, T.; Easa, S.; Zhao, X.; Qu, X. Modeling relationship between truck fuel consumption and driving behavior using data from internet of vehicles. Comput. Aided Civ. Infrastruct. Eng. 2018, 33, 209–219.
Araújo, J.P.C.; Palha, C.A.O.; Martins, F.F.; Silva, H.M.R.D.; Oliveira, J.R.M. Estimation of energy consumption on the tire-pavement interaction for asphalt mixtures with different surface properties using data mining techniques. Transp. Res. Part D Transp. Environ. 2019, 67, 421–432.
Ahmadi, M.H.; Ahmadi, M.A.; Ashouri, M.; Astaraei, F.R.; Ghasempour, R.; Aloui, F. Prediction of performance of Stirling engine using least squares support machine technique. Mech. Ind. 2016, 17, 506.
Yao, Y.; Zhao, X.; Liu, C.; Rong, J.; Zhang, Y.; Dong, Z.; Su, Y. Vehicle Fuel Consumption Prediction Method Based on Driving Behavior Data Collected from Smartphones. J. Adv. Transp. 2020, 2020, 9263605.
Yang, L.; Tian, T.; Xu, Y.; Wu, C. Predicting fuel consumption of grain combine harvesters based on random forest. Trans. CSAE 2021, 37, 275–281.
Liu, X.; Jin, H. High-precision transient fuel consumption model based on support vector regression. Fuel 2023, 338, 127368.
Asher, Z.D.; Galang, A.A.; Briggs, W. Economic and Efficient Hybrid Vehicle Fuel Economy and Emissions Modeling Using an Artificial Neural Network. SAE Tech. Pap. 2018, 01, 315–322.
Ziółkowski, J.; Oszczypała, M.; Małachowski, J.; Szkutnik-Rogoż, J. Use of Artificial Neural Networks to Predict Fuel Consumption on the Basis of Technical Parameters of Vehicles. Energies 2021, 14, 2639.
Li, D.; Zhang, X.; Kang, Q.; Tavakkol, E. Estimation of unconfined compressive strength of marine clay modified with recycled tiles using hybridized extreme gradient boosting method. Constr. Build. Mater. 2023, 393, 131992.
Shi, X.; Yu, X.; Esmaeili-Falak, M. Improved arithmetic optimization algorithm and its application to carbon fiber reinforced polymer-steel bond strength estimation. Compos. Struct. 2023, 306, 116599.
Barbado, A.; Corcho, Ó. Interpretable machine learning models for predicting and explaining vehicle fuel consumption anomalies. Eng. Appl. Artif. Intell. 2022, 115, 105222.
Heni, H.; Diop, S.A.; Renaud, J.; Coelho, L.C. Measuring fuel consumption in vehicle routing: New estimation models using supervised learning. Int. J. Prod. Res. 2023, 61, 114–130.
Abukhalil, T.; Almahafzah, H.; Alksasbeh, M.; Alqaralleh, B.A.Y. Fuel consumption using OBD-II and support vector machine model. J. Robot. 2020, 2020, 9450178.
Wickramanayake, S.; Bandara, H.M.N.D. Fuel consumption prediction of fleet vehicles using Machine Learning: A comparative study. In Proceedings of the 2016 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 5–6 April 2016; pp. 90–95.
Gong, J.; Shang, J.; Li, L.; Zhang, C.; He, J.; Ma, J. A Comparative Study on Fuel Consumption Prediction Methods of Heavy-Duty Diesel Trucks Considering 21 Influencing Factors. Energies 2021, 14, 8106.
Zhang, J.; Li, K.; Xu, B.; Li, H. Estimation of Vehicle Instantaneous Fuel Consumption Based on Least Square Method. Qiche Gongcheng/Automot. Eng. 2018, 40, 1151–1157.
Zhu, G.; Zhao, L.; Huang, D.; Zhang, P. A method of vehicle fuel consumption estimation based on decision tree. J. Transp. Syst. Eng. Inf. Technol. 2016, 16, 200–206.
Bousonville, T.; Dirichs, M.; Krüger, T. Estimating truck fuel consumption with machine learning using telematics, topology and weather data. In Proceedings of the 2019 International Conference on Industrial Engineering and Systems Management (IESM), Shanghai, China, 25–27 September 2019; pp. 1–6.
Wang, Q.; Zhang, R.; Lv, S.; Wang, Y. Open-pit mine truck fuel consumption pattern and application based on multi-dimensional features and XGBoost. Sustain. Energy Technol. Assess 2021, 43, 100977.
Zeng, I.Y.; Tan, S.; Xiong, J.; Ding, X.; Li, Y.; Wu, T. Estimation of Real-World Fuel Consumption Rate of Light-Duty Vehicles Based on the Records Reported by Vehicle Owners. Energies 2021, 14, 7915.
Schone, A.; Byerly, A.; dos Santos, E.C., Jr.; Ben-Miled, Z. Route-Sensitive Fuel Consumption Models for Heavy-Duty Vehicles. SAE Int. J. Commer. Veh. 2021, 14, 85–951.
Esmaeili-Falak, M.; Benemaran, R.S. Ensemble deep learning-based models to predict the resilient modulus of modified base materials subjected to wet-dry cycles. Geomech. Eng. 2023, 32, 583–600.
Moradi, E.; Miranda-Moreno, L. Vehicular fuel consumption estimation using real-world measures through cascaded machine learning modelling. Transp. Res. Part D Transp. Environ. 2020, 88, 102576.
Dhanalaxmi, B.; Varsha, M.; Chowdary, K.R.; Mokshitha, P. An Enhanced Fuel Consumption Machine Learning Model Used in Vehicles. J. Phys. Conf. Ser. 2021, 1979, 012068.
Yamashita, R.-J.; Yao, H.-H.; Huang, S.-W.; Hackman, A. Accessing and constructing driving data to develop fuel consumption forecast model. IOP Conf. Ser. Earth Environ. Sci. 2017, 113, 012217.
Kim, Y.-R.; Jung, M.; Park, J.-B. Development of a Fuel Consumption Prediction Model Based on Machine Learning Using Ship In-Service Data. J. Mar. Sci. Eng. 2021, 9, 137.
Tarelko, W.; Rudzki, K. Applying artificial neural networks for modelling ship speed and fuel consumption. Neural Comput. Appl. 2020, 32, 17379–17395.
Jeon, M.; Noh, Y.; Shin, Y.; Lim, O.-K.; Lee, K.; Cho, D. Prediction of ship fuel consumption by using an artificial neural network. J. Mech. Sci. Technol. 2018, 32, 5789–5796.
Baumann, S.; Klingauf, U. Modeling of aircraft fuel consumption using machine learning algorithms. CEAS Aeronaut. J. 2020, 11, 277–287.
Pan, Z.; Chi, C.; Zhang, J. A Model of Fuel Consumption Estimation and Abnormality Detection based on Airplane Flight Data Analysis. In Proceedings of the 2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC), London, UK, 1 September 2018; pp. 1–6.
Katreddi, S.; Thiruvengadam, A. Trip Based Modeling of Fuel Consumption in Modern Heavy-Duty Vehicles Using Artificial Intelligence. Energies 2021, 14, 8592.
Zargarnezhad, S.; Dashti, R.; Ahmadi, R. Predicting vehicle fuel consumption in energy distribution companies using ANN. Transp. Res. Part D Transp. Environ. 2019, 74, 174–188.
Ali, S.; Saiied, M.A.; Mohammad, M.A.; Mehmet, S.K. Development of a multi-layer perceptron artificial neural network model to determine haul trucks energy consumption. Int. J. Min. Sci. Technol. 2016, 26, 285–293.
Panapakidis, I.; Sourtzi, V.-M.; Dagoumas, A. Forecasting the Fuel Consumption of Passenger Ships with a Combination of Shallow and Deep Learning. Electronics 2020, 9, 776.
Bougiouklis, A.; Korkofigkas, A.; Stamou, G. Improving Fuel Economy with LSTM Networks and Reinforcement Learning. In Proceedings of the 27th International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; pp. 230–239.
Valido, M.R.; Gomez-Cardenes, O.; Magdaleno, E. Monitoring Vehicle Pollution and Fuel Consumption Based on AI Camera System and Gas Emission Estimator Model. Sensors 2023, 23, 312.
Shi, F.; Chen, J.; Xu, Y.; Karimi, H.R. Optimization of Biodiesel Injection Parameters Based on Support Vector Machine. Math. Probl. Eng. 2013, 2013, 893084.
Zhu, T.; Yin, X.; Na, X.; Li, B. Research on a Novel Vehicle Rollover Risk Warning Algorithm Based on Support Vector Machine Model. IEEE Access 2020, 8, 108324–108334.
Wang, X.; Chen, X. A Support Vector Method for Modeling Civil Aircraft Fuel Consumption with ROC Optimization. In Proceedings of the 2014 Enterprise Systems Conference, Shanghai, China, 2–3 August 2014; pp. 112–116.
Zhang, X.; Wang, Y.; He, X.; Ji, H.; Li, Y.; Duan, X.; Guo, F. Prediction of Vehicle Driver’s Facial Air Temperature with SVR, ANN, and GRU. IEEE Access 2022, 10, 20212–20222.
Ma, X.; Zhu, M. Asymmetric ε-band fuzzy support vector regression based on data domain description. In Proceedings of the 27th Chinese Control and Decision Conference (2015 CCDC), Qingdao, China, 23–25 May 2015; pp. 3280–3286.
Li, M.; Wang, W.; De, G.; Ji, X.; Tan, Z. Forecasting Carbon Emissions Related to Energy Consumption in Beijing-Tianjin-Hebei Region Based on Grey Prediction Theory and Extreme Learning Machine Optimized by Support Vector Machine Algorithm. Energies 2018, 11, 2475.
Zeng, T.; Zhang, C.; Hu, M.; Chen, Y.; Yuan, C.; Chen, J.; Zhou, A. Modelling and predicting energy consumption of a range extender fuel cell hybrid vehicle. Energy 2018, 165, 187–197.
Zeng, W.; Miwa, T.; Wakita, Y.; Morikawa, T. Exploring trip fuel consumption by machine learning from GPS and CAN bus data. J. East Asia Soc. Transp. Stud. 2015, 11, 906–921.
Li, Y.; Zhou, S.; Liu, J.; Tong, J.; Dang, J.; Yang, F.; Ouyang, M. Multi-objective optimization of the Atkinson cycle gasoline engine using NSGA III coupled with support vector machine and back-propagation algorithm. Energy 2023, 262, 125262.
Pereira, G.; Parente, M.; Moutinho, J.; Sampaio, M. Fuel Consumption Prediction for Construction Trucks: A Noninvasive Approach Using Dedicated Sensors and Machine Learning. Infrastructures 2021, 6, 157.
Baumann, S.; Neidhardt, T.; Klingauf, U. Evaluation of the aircraft fuel economy using advanced statistics and machine learning. CEAS Aeronaut. J. 2021, 12, 669–681.
Lin, K.-C.; Lin, C.-N.; Ying, J.J.-C. Construction of Analytical Models for Driving Energy Consumption of Electric Buses through Machine Learning. Appl. Sci. 2020, 10, 6088.
Wang, Q.; Zhang, R.; Wang, Y.; Lv, S. Machine Learning-Based Driving Style Identification of Truck Drivers in Open-Pit Mines. Electronics 2020, 9, 19.
Huang, Y.; Ng, E.C.Y.; Zhou, J.L.; Surawski, N.C.; Lu, X.; Du, B.; Forehead, H.; Perez, P.; Chan, E.F.C. Impact of drivers on real-driving fuel consumption and emissions performance. Sci. Total Environ. 2021, 798, 149297.
Zhou, M.; Jin, H.; Wang, W. A review of vehicle fuel consumption models to evaluate eco-driving and eco-routing. Transp. Res. Part D Transp. Environ. 2016, 49, 203–218.
Massoud, R.; Bellotti, F.; Berta, R.; De Gloria, A.; Poslad, S. Exploring Fuzzy Logic and Random Forest for Car Drivers’ Fuel Consumption Estimation in IoT-Enabled Serious Games. In Proceedings of the 2019 IEEE 14th International Symposium on Autonomous Decentralized System (ISADS), Utrecht, The Netherlands, 8–10 April 2019; pp. 1–7.
Hu, Z.; Zhou, T.; Osman, M.T.; Li, X.; Jin, Y.; Zhen, R. A Novel Hybrid Fuel Consumption Prediction Model for Ocean-Going Container Ships Based on Sensor Data. J. Mar. Sci. Eng. 2021, 9, 449.
Fam, M.L.; Tay, Z.Y.; Konovessis, D. An Artificial Neural Network for fuel efficiency analysis for cargo vessel operation. Ocean Eng. 2022, 264, 112437.
Tran, T.A. Design the prediction model of low-sulfur-content fuel oil consumption for M/V NORD VENUS 80,000 DWT sailing on emission control areas by artificial neural networks. Proc. Inst. Mech. Eng. Part M J. Eng. 2019, 233, 345–362.
Zhang, F.; Martinez, C.M.; Clarke, D.; Cao, D.; Knoll, A. Neural Network Based Uncertainty Prediction for Autonomous Vehicle Application. Front. Neurorob. 2019, 13, 31133839.
Hegedüs, F.; Gáspár, P.; Bécsi, T. Fast Motion Model of Road Vehicles with Artificial Neural Networks. Electronics 2021, 10, 928.
Seo, J.; Park, S. Optimizing model parameters of artificial neural networks to predict vehicle emissions. Atmos. Environ. 2023, 294, 119508.
Huang, J.; Wang, Y.; Liu, Z.; Guan, B.; Long, D.; Du, X. On modeling microscopic vehicle fuel consumption using radial basis function neural network. Soft Comput. 2016, 20, 2771–2779.
Ling, G.; Lindsten, K.; Ljungqvist, O.; Löfberg, J.; Norén, C.; Larsson, C.A. Fuel-efficient Model Predictive Control for Heavy Duty Vehicle Platooning using Neural Networks. In Proceedings of the 2018 Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018; pp. 3994–4001.
Illahi, A.A.C.; Bandala, A.; Dadios, E.P. Neural Network Modeling for Fuel Consumption Base on Least Computational Cost Parameters. In Proceedings of the 2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), Laoag, Philippines, 29 November–1 December 2019; pp. 1–5.
Wu, J.-D.; Liu, J.-C. Development of a predictive system for car fuel consumption using an artificial neural network. Expert Syst. Appl. 2011, 38, 4967–4971.
Du, Y.; Wu, J.; Yang, S.; Zhou, L. Predicting vehicle fuel consumption patterns using floating vehicle data. J. Environ. Sci. 2017, 59, 24–29.
Jafarmadar, S.; Khalilaria, S.; Saraee, H.S. Prediction of the Performance and Exhaust Emissions of a Compression Ignition Engine Using a Wavelet Neural Network with a Stochastic Gradient Algorithm. Energy 2018, 142, 1128–1138.
Hu, Z.; Jin, Y.; Hu, Q.; Sen, S.; Zhou, T.; Osman, M.T. Prediction of Fuel Consumption for Enroute Ship Based on Machine Learning. IEEE Access 2019, 7, 119497–119505.
Witaszek, K. Modeling of fuel consumption using artificial neural networks. Diagnostyka 2020, 21, 103–113.
Soofastaei, A.; Alamdari, S.; Basiri, M.H.; Mousavi, A. Application of Machine Learning Techniques to Predict Haul Truck Fuel Consumption in Open-Pit Mines. J. Min. Environ. 2022, 13, 69–85.
Schone, A.; Byerly, A.; Hendrix, B.; Bagwe, R.M.; dos Santos, E.C. A machine learning model for average fuel consumption in heavy vehicles. IEEE Trans. Veh. Technol. 2019, 68, 6343–6351.
Zhao, X.; Yao, Y.; Wu, Y.; Chen, C.; Jian, R. Prediction model of driving energy consumption based on PCA and BP network. J. Transp. Syst. Eng. Inf. Technol. 2016, 16, 185–191.
Zhou, Y.; Zhu, Y.; Wang, L.; Guo, Y. Prediction model of fuel consumption of heavy truck based on improved BP neural network. In Proceedings of the 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Falerna, Italy, 12–15 September 2022; pp. 1–6.
Chen, C.; Chen, Q.; Liu, Q.; Yan, J. Research on vehicle fuel consumption prediction model based on Cauchy mutation multiverse algorithm. In Proceedings of the 2022 9th International Forum on Electrical Engineering and Automation (IFEEA), Zhuhai, China, 4–6 November 2022; pp. 1115–1119.
Wang, K.; Wang, J.; Huang, L.; Yuan, Y.; Wu, G.; Xing, H.; Wang, Z.-Y.; Wang, Z.; Jiang, X. A comprehensive review on the prediction of ship energy consumption and pollution gas emissions. Ocean Eng. 2022, 266, 112826.
Ko, K.; Lee, T.; Jeong, S. A Deep Learning Method for Monitoring Vehicle Energy Consumption with GPS Data. Sustainability 2021, 13, 11331.
Li, Y.; Zeng, I.Y.; Niu, Z.; Shi, J.; Wang, Z.; Guan, Z. Predicting vehicle fuel consumption based on multi-view deep neural network. Neurocomputing 2022, 502, 140–147.
Li, Y.; Tang, G.; Du, J.; Zhou, N.; Zhao, Y.; Wu, T. Multilayer Perceptron Method to Estimate Real-World Fuel Consumption Rate of Light Duty Vehicles. IEEE Access 2019, 7, 63395–63402.
Carvalho, E.; Ferreira, B.V.; Ferreira, J.; de Souza, C.; Carvalho, H.V.; Suhara, Y.; Pentland, A.S.; Pessin, G. Exploiting the use of recurrent neural networks for driver behavior profiling. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 3016–3021.
Kanarachos, S.; Mathew, J.; Fitzpatrick, M.E. Instantaneous vehicle fuel consumption estimation using smartphones and recurrent neural networks. Expert Syst. Appl. 2019, 120, 436–447.
Ping, P.; Qin, W.; Xu, Y.; Miyajima, C.; Takeda, K. Impact of Driver Behavior on Fuel Consumption: Classification, Evaluation and Prediction Using Machine Learning. IEEE Access 2019, 7, 78515–78532.
Kan, Y.; Liu, H.; Lu, X.; Chen, Q. A Deep Learning Engine Power Model for Estimating the Fuel Consumption of Heavy-Duty Trucks. In Proceedings of the 2020 6th IEEE International Energy Conference (ENERGYCon), Gammarth, Tunisia, 28 September–1 October 2020; pp. 182–187.
Jain, N.; Mittal, S. A machine learning pipeline for fuel-economical driving model. Int. J. Intell. Comput. Cybern. 2022, 15, 473–496.
Wang, G.; Zhang, L.; Xu, Z.; Wang, R.; Hina, S.M.; Wei, T.; Qu, X.; Yang, R. Predictability of Vehicle Fuel Consumption Using LSTM: Findings from Field Experiments. J. Transp. Eng. A Syst. 2023, 149, 04023030.
Hua, Y.; Sevegnani, M.; Yi, D.; Birnie, A.; McAslan, S. Fine-Grained RNN with Transfer Learning for Energy Consumption Estimation on EVs. IEEE Trans. Ind. Inf. 2022, 18, 8182–8190.
Aamir, M.; Rahman, Z.; Abro, W.A.; Tahir, M.; Mustajar, S. An Optimized Architecture of Image Classification Using Convolutional Neural Network. Int. J. Image Graph. Signal Process. 2019, 11, 30–39.
Sharma, N.; Jain, V.; Mishra, A. An Analysis of Convolutional Neural Networks for Image Classification. Procedia Comput. Sci. 2018, 132, 377–384.
Yan, Q.; Chen, X.; Jian, H.; Wei, W.; Wang, W.; Wang, H. Design of a deep inference framework for required power forecasting and predictive control on a hybrid electric mining truck. Energy 2022, 238, 121960.
Han, S.; Zhang, F.; Ren, Y.; Xi, J. Predictive Energy Management Strategies in Hybrid Electric Vehicles Using Hybrid Deep Learning Networks. China J. Highw. Transp. 2020, 33, 3352.
Kheirandish, A.; Shafiabady, N.; Dahari, M.; Kazemi, M.S.; Isa, D. Modeling of commercial proton exchange membrane fuel cell using support vector machine. Int. J. Hydrog. Energy 2016, 41, 11351–11358.
Bera, P. Fuel consumption analysis in dynamic states of the engine with use of artificial neural network. Combust. Engines 2013, 155, 16–25.
Predić, B.; Madić, M.; Roganovic, M.; Kovačević, M.; Stojanović, D. Prediction of passenger car fuel consumption using artificial neural network: A case study in city of NIŠ. Engineering 2016, 15, 105–116.

Figure 1. Classification of vehicle fuel consumption forecasting methods.

Figure 2. The general process of using models to predict fuel consumption.

Figure 3. Research status of data-driven fuel consumption prediction methods.

Figure 4. The general process of dataset creation.

Figure 5. The basic principle of SVM.

Figure 6. Fuel consumption forecasting process based on SVM.

Figure 7. The basic principle of RF.

Figure 8. Fuel consumption forecasting process based on RF.

Figure 9. The basic structure of the artificial neural network.

Figure 10. The cycle cell structure of LSTM.

Figure 11. The basic structure of CNN.

Figure 12. (a) Comparison of results based on R²; (b) comparison of results based on RMSE.

Table 1. The calculation equation and evaluation standard for each index.

Index	Equation	Evaluation Standard	Equation No.	References
R²	$R^{2} = 1 - \frac{\sum_{m = 1}^{n} (y_{m} - y_{m}^{\prime})^{2}}{\sum_{m = 1}^{n} (y_{m} - \bar{y})^{2}}$	Range: 0 to 1; higher is better	(1)	[5,6,7,8,9,10,11,12,13,14,15,16]
MSE	$MSE = \frac{1}{n} \sum_{m = 1}^{n} (y_{m}^{\prime} - y_{m})^{2}$	Lower is better	(2)	[17,18,19,20]
RMSE	$RMSE = \sqrt{\frac{1}{n} \sum_{m = 1}^{n} (y_{m}^{\prime} - y_{m})^{2}}$	Lower is better	(3)	[8,9,10,11,13,14,21,22,23,24]
MAE	$MAE = \frac{1}{n} \sum_{m = 1}^{n} \left| y_{m}^{\prime} - y_{m} \right|$	Lower is better	(4)	[18,24,25,26]
MAPE	$MAPE = \frac{1}{n} \sum_{m = 1}^{n} \left| \frac{y_{m}^{\prime} - y_{m}}{y_{m}} \right| \times 100\%$	Lower is better	(5)	[18,25,27]
SI	$SI = \frac{RMSE}{\bar{y}}$	$SI$ < 0.05: higher accuracy; 0.05 < $SI$ < 0.1: good accuracy	(6)	[28,29]
U₉₅	$U_{95} = 1.96 \sqrt{SD^{2} + RMSE^{2}}$	Lower is better	(7)	[28,29]

where $y_{m}^{\prime}$ is the predicted fuel consumption of the $m$-th test sample, L/h; $y_{m}$ is the true fuel consumption of the $m$-th test sample, L/h; $n$ is the number of samples; $\bar{y}$ is the average of the true fuel consumption, L/h; and $SD$ is the standard deviation of the difference between the true and predicted values.
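As an illustration of how the indices in Table 1 relate to one another, the following minimal sketch computes all seven from paired true and predicted values. The function name `regression_metrics` and the dictionary keys are hypothetical and not taken from any of the reviewed papers; $SD$ is assumed here to be the population standard deviation of the prediction errors, which is one common reading of the definition above.

```python
import math
from statistics import mean, pstdev


def regression_metrics(y_true, y_pred):
    """Return the Table 1 indices for paired true/predicted values (L/h)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - t for t, p in zip(y_true, y_pred)]  # y'_m - y_m
    y_bar = mean(y_true)                              # mean true consumption
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")

    mse = ss_res / n
    rmse = math.sqrt(mse)
    # Assumption: SD is the population standard deviation of the errors.
    sd = pstdev(errors)
    return {
        "R2": 1 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE is undefined if any true value is zero (division by zero).
        "MAPE": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "SI": rmse / y_bar,
        "U95": 1.96 * math.sqrt(sd ** 2 + rmse ** 2),
    }


if __name__ == "__main__":
    # Made-up values for illustration only, not measured data.
    for name, value in regression_metrics(
        [2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5]
    ).items():
        print(f"{name}: {value:.4f}")
```

Note that RMSE, MAE, and $U_{95}$ carry the unit of the fuel consumption signal, whereas R², MAPE, and SI are dimensionless, which is why only the latter group can be compared directly across studies that report consumption in different units.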

Table 2. Classification of fuel consumption data.

Data Type	Relevant Data	Collection Mode
Vehicle inherent variables	Vehicle and engine model, engine capacity, total vehicle mass	Provided by the vehicle manufacturer
Driving behavior variables	Driving speed and acceleration, engine speed and torque, engine load rate and revolution, driving distance, the real fuel consumption of vehicles	GPS, Gyroscope, OBD-II, CAN bus, Smartphone
Driving environment variables	Weather factors, altitude, road slope, and other road conditions information	GPS, Radar, Infrared ray
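To make the three categories in Table 2 concrete, the sketch below shows one possible way of assembling a single model input record from them. All class names, field names, and units are hypothetical and chosen for illustration only; the reviewed studies each use their own variable sets.

```python
from dataclasses import asdict, dataclass


@dataclass
class VehicleInherent:
    """Static values, typically provided by the manufacturer."""
    engine_capacity_l: float
    total_mass_kg: float


@dataclass
class DrivingBehavior:
    """Time-varying values, e.g. from OBD-II, CAN bus, or a smartphone."""
    speed_kmh: float
    acceleration_ms2: float
    engine_speed_rpm: float
    engine_torque_nm: float


@dataclass
class DrivingEnvironment:
    """Context values, e.g. from GPS or external sensors."""
    road_slope_pct: float
    altitude_m: float
    ambient_temp_c: float


def to_feature_row(vehicle, behavior, environment):
    """Flatten the three groups into one dict usable as a model input row."""
    row = {}
    for part in (vehicle, behavior, environment):
        row.update(asdict(part))
    return row


if __name__ == "__main__":
    # Made-up values for illustration only.
    row = to_feature_row(
        VehicleInherent(engine_capacity_l=2.0, total_mass_kg=1500.0),
        DrivingBehavior(60.0, 0.3, 2100.0, 140.0),
        DrivingEnvironment(road_slope_pct=1.5, altitude_m=110.0, ambient_temp_c=18.0),
    )
    print(row)
```

Keeping the groups separate until the final flattening step makes it straightforward to drop or swap one category, for example when environment data are unavailable.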

Table 3. Analysis of fuel consumption models based on SVM.

Reference	Model	Inputs	Vehicle Type	Performance	Characteristics
Abukhalil et al. [32]	SVM	Engine speed, revolutions per minute, speed, etc.	Passenger vehicles	RMSE: 2.43	Processing large-scale data is inefficient and time-consuming; the model accuracy is low when dealing with noisy data sets. The SVM model is relatively simple and has low requirements for hardware and software.
Zeng et al. [63]	SVM	Data collected from GPS and CAN bus (trip distance, speed, engine capacity, etc.)	Probe vehicles	R²: 0.92
Hussain et al. [7]	SVM	Information from On-board Sensors and records (traveled distance, hour of the day, driver ID, bus ID, etc.)	City buses	R²: 0.95
Capraz et al. [8]	SVM	Trip distance, speed, vehicle weight, acceleration, road slope, etc.	Passenger vehicles	R²: 0.94
Araújo et al. [21]	SVM	Load, speed, pendulum test value, mean texture depth, estimating the surface texture depth; load, speed, mean texture depth	Various types of vehicles	RMSE: 0.303 RMSE: 0.410
Liu et al. [25]	SVR	Engine speed and torque, speed, acceleration, engine oil temperature, etc.	Passenger vehicles	MAE: <0.16
Ahmadi et al. [22]	LSSVM	Rotation speed, temperature of heat source, pressure, etc.	Stirling engine	RMSE: 0.067 R²: 0.98	The performance is highly dependent on the model fit and the characteristics of the data set. High computing resources and storage space are required.
Wang et al. [17]	GA-SVM	Data from sensors (engine speed and torque, temperature, air mass flow rate, etc.)	Light diesel engine	MSE: 1.344 R²: 0.967
Li et al. [64]	NSGA-SVM	Speed, load, engine speed, cylinder pressure, etc.	Gasoline engine	Relative error: <3%

Table 4. Result analysis of fuel consumption models based on RF.

Reference	Model	Inputs	Vehicle Type	Performance	Characteristics
Hassan et al. [9]	RF	Vehicle speed, vehicle specific power, engine speed, engine stress	Passenger vehicles	RMSE: 0.15 R²: 0.871	Performs poorly on high-dimensional sparse data. The specific reasons behind a prediction are difficult to explain; when the number of trees is large, high computing resources and storage space are required.
Gong et al. [34]	RF	21 variables extracted from the driver-vehicle-road-environment	Heavy-duty diesel trucks	Accuracy: 86.58%
Perrotta et al. [10]	RF	Vehicle speed, acceleration, the torque and revolutions of the engine, etc.	Trucks	RMSE: 4.64 R²: 0.87
Yu et al. [11]	RF	Mileage and speed, temperature, air pressure, etc.	Trucks	RMSE: 5.073 R²: 0.98
Yao et al. [23]	RF	Driving data collected from smartphone applications (speed, acceleration, etc.)	Taxis	RMSE: 0.783
Yang et al. [18]	RF	Engine power and the number of cylinders, driving speed, driving habits, temperature, wind speed, etc.	Light-duty vehicles	MAE: 0.63 MSE: 0.805
Yang et al. [24]	RF	Engine speed, engine torque, speed, acceleration, deceleration, etc.	Grain combine harvesters	MAE: 0.24 RMSE: 0.14
Yu et al. [12]	Hybrid model	Mileage and speed, temperature, air pressure, etc.	Long-distance vehicles	R²: 0.976	The model is complex and places high demands on hardware and software.

Table 5. Result analysis of fuel consumption models based on NN.

Reference	Model	Inputs	Vehicle Type	Performance	Characteristics
Wysocki et al. [13]	ANN	Data from the CAN bus (engine speed and torque, etc.)	Heavy-duty trucks	RMSE: 0.32 R²: 0.99	The parameter setting is complicated, and the result is easily affected by the quality and quantity of input data. The model places high demands on the device's hardware and software resources.
Witaszek [85]	ANN	Vehicle speed and acceleration, road slope, throttle opening degree, selected gear number, and engine speed	Passenger vehicles	Relative error: <3%
Soofastaei et al. [86]	ANN	Data from past records (payload, total resistance, actual speed)	Haul trucks	R²: 0.903
Asher et al. [26]	ANN	Data from OBD-II (speed, acceleration, engine speed, etc.)	Hybrid vehicles	MAE: 0–0.1%
Schone et al. [87]	FNN	Distance, seven predictors derived from vehicle speed and road slope	Heavy-duty vehicles	RMSE: 0.0132 R²: 0.91
Topić et al. [6]	FNN	Speed, acceleration, slope time series	City buses	R²: >0.97
Du et al. [82]	BP	Data from records (time, location, speed, road condition, driver’s personal information, etc.)	Various types of vehicles	Accuracy: 81.7%	Strong nonlinear mapping ability; prone to falling into local optima; sensitive to initial values; large data sets require high-specification hardware.
Zhao et al. [88]	BP	Data from OBD-II and GPS (distance, acceleration, speed, etc.)	Taxis	Accuracy: 92.46%
Shang et al. [19]	HMM-BP	Data from records (vehicle ID, vehicle speed, moving direction, GPS longitude, latitude, etc.)	Taxis	MSE: <0.06 R²: >0.95	The performance is highly dependent on the model fit and the characteristics of the data set. High computing resources and storage space are required.
Zhou et al. [89]	GSA-BP	Engine speed and torque, vehicle speed, load rate, driving distance, etc.	Heavy duty vehicles	Accuracy: 96.51%
Chen et al. [90]	CMVO-BP	Vehicle speed, engine speed, and torque	Truck	Accuracy: 97.5%

Table 6. Characteristics of different fuel consumption prediction models.

Model Type	Traditional Machine Learning Model [7,8,9,10,11,21,23,106]	Neural Network Model [13,82,85,86,88,107,108]	DNN Model [15,20,97,98,99]	Hybrid Model [16,72,89,90]
Interpretability	Good	Medium	Poor	Poor
Efficiency	High	Medium	Low	Low
Accuracy	Low	Medium	High	High
Advantages	Simple model, suitable for processing small-sample data (SVM) and high-dimensional data (RF)	Strong nonlinear mapping ability, relatively simple structure	Can process time-series data (RNN, LSTM), automatic feature extraction, model stability	Suitable for all kinds of scenarios, accepts input from different types of data
Disadvantages	Features need to be extracted manually, poor performance when dealing with large amounts of data	Prone to falling into local optima (BPNN), cannot capture time-dependent features	Computationally heavy, over-reliance on the amount of input data	The model structure is complex, and parameter adjustment is difficult

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

