1. Introduction
Because nitrogen oxides (NOx) in the exhaust of low-speed marine diesel engines are harmful, the International Maritime Organization and national regulations impose strict NOx emission limits on marine engines. Low-speed marine engines are verified by emission tests to meet these regulations before being installed on board ships. However, as these engines operate for long periods, environmental changes and the degradation of components such as fuel injectors and solenoid valves affect their emission performance. To ensure that marine diesel engines operate efficiently with low emissions under different operating conditions, accurate parameter information must be available during operation; as a result, more and more sensors are being installed on marine diesel engines. Among these, the NOx sensor is a necessary tool for detecting NOx emissions; it provides NOx data to selective catalytic reduction (SCR) systems so that as much of the NOx in the exhaust gas as possible is reduced by urea. In practice, however, the NOx sensor is exposed to the engine exhaust gas for long periods, and impurities such as sulfides and particulate matter poison its sensitive components and shorten its service life. Continuous measurement of ship engine exhaust gas has been reported to reduce the service life of NOx sensors to 85 h [1]. This reduced lifetime increases both the cost of use and the uncertainty in application. For this reason, soft sensors, or virtual sensors, based on soft measurement techniques are being developed and applied. The basic idea of using soft measurement to solve the lifetime problem is to predict NOx emissions from diesel engines by modeling the correlation between NOx emissions and other directly or easily measurable variables, without using NOx sensors directly. According to their modeling principles, NOx simulation models can be divided into physical mechanism models and data-driven models [2].
Mechanism-based NOx simulation models are still mainly built on the widely accepted Zeldovich mechanism as the NOx generation model. According to how the relevant variables are constructed, they fall into two categories: (1) models that use mathematical equations based on the working mechanism of the engine and obtain a suitable NOx prediction through appropriate simplification or additional consideration of the mixing and combustion conditions during NOx formation [3,4,5,6,7,8,9,10]; and (2) models that use physical signals measured during engine operation (such as in-cylinder pressure and torque) to obtain intermediate variables through signal analysis and processing, and then combine these with the Zeldovich mechanism to analyze and deduce NOx generation [11,12,13,14,15]. Physical mechanism modeling requires familiarity with the formation mechanism of NOx for different engine models and fuel characteristics, together with rigorous derivation of the physical formulas, and it can describe in-cylinder combustion and the formation of NOx more completely. The solution often requires iteratively solving mathematical and physical equations; the results are highly interpretable, but the solution process is complicated. Using empirical equations to simplify the calculation of parameters such as NOx reduces the computational complexity but cannot guarantee sufficient accuracy.
The data-driven model uses artificial intelligence algorithms to learn the mapping between input and output from training data and finally obtain a relatively accurate calculation model. It is considered an effective way of predicting physical quantities that are difficult to measure, and it also provides new ideas for performance degradation and lifetime prediction. A data-driven NOx simulation model depends less on the formation mechanism and can achieve satisfactory accuracy with appropriate algorithm selection and optimization. Research on intelligent-algorithm modeling of NOx mainly focuses on the following: (1) using different artificial intelligence algorithms (such as artificial neural networks (ANN), deep neural networks (DNN), convolutional neural networks (CNN), and long short-term memory networks (LSTM)) and optimization algorithms to construct models that predict the emissions and characteristic parameters of gasoline engines or medium- and high-speed diesel engines [16,17,18,19,20,21,22,23]; (2) using appropriate algorithms to study the performance characteristics and predict the emissions of dual-fuel engines (such as isoamyl alcohol/gasoline and biodiesel/diesel) under different fuel ratios [24,25,26,27,28,29]; and (3) research into the generalization and adaptive optimization of the prediction ability of artificial intelligence models [30,31,32,33,34,35,36].
From the modeling characteristics and applications of the physical and data-driven models described above, it is clear that building a data-driven model is relatively simple because it does not rely on complex mathematical and physical equations derived from the working principles of the object. It therefore facilitates the exploration of research questions that lack detailed theoretical support, such as the characterization of dual-fuel or new-fuel engines. However, many algorithms rely on complex computing models and large data sets to achieve high prediction accuracy, which is not conducive to the adaptive expansion or the practical application of the model.
In this paper, a NOx calculation model for marine diesel engines is developed using a data-driven approach. Compared with a CNN or DNN, a backpropagation neural network (BPNN) has a simpler structure. After a BPNN has learned the data samples, each sample passes through repeated multilayer, multinode operations to realize regression prediction. Compared with a simple feedforward ANN, the BPNN optimizes and adjusts the weight and bias matrices of the network by backpropagating the error. With an appropriate activation function, the mapping between input and output is highly nonlinear, and the BPNN can, in principle, approximate any continuous nonlinear function with arbitrary precision. For non-time-series data such as diesel engine operating conditions, the BPNN is more suitable than LSTM for the nonlinear regression prediction of multidimensional input and output systems such as diesel engines. Meanwhile, to keep the structure of the BPNN as simple as possible, it is optimized using the group method of data handling (GMDH), and the final NOx calculation model is built on the resulting GMDH–BP model.
The main contents of this paper are as follows: (a) a three-dimensional simulation model is established from experimental data, and simulation experiments are conducted to provide data samples for the NOx calculation model; (b) variables related to NOx emissions are selected from the perspective of the formation mechanism, grey relational analysis is used to preselect strongly correlated influencing factors, and principal component analysis is used to reduce their dimensionality and obtain low-dimensional input data; (c) three computational models are established and their accuracy is compared: the GMDH–BPNN model, in which GMDH adaptively optimizes the structure of the BPNN; the GA–BPNN model, in which a genetic algorithm (GA) optimizes the weights and biases of the BPNN; and the unoptimized BPNN model.
2. Materials and Methods
In this paper, the NOx calculation model is constructed as in
Figure 1.
First, from the perspective of control, candidate input parameters for the NOx emission model are selected according to the mechanism of NOx formation during diesel engine operation and the controllable factors affecting it. Then, the Pearson correlation coefficient and grey relational analysis are used to quantify the correlation between these parameters and NOx, and principal component analysis is used to reduce the dimensionality of the influencing factors and obtain the input parameter combinations of the model. In this way, the dimensionality of the input data is reduced with minimal loss of important information, which reduces the complexity of the whole model. Finally, the structure of the BP neural network is optimized adaptively using GMDH to improve the accuracy of the model. For comparison, a BPNN whose weights and biases are optimized by a genetic algorithm and an unimproved BPNN are also used.
2.1. Input Parameters Selection
The research object of this paper is a 6EX340EF low-speed two-stroke diesel engine, and some of its parameters are shown in
Table 1 below.
Since the research object is a low-speed two-stroke diesel engine, the cost of the bench test is high, and only limited experimental data can be obtained.
Table 2 below shows the experimental data under four typical working conditions obtained from the bench test. Based on the working principles of marine diesel engines and the complex physical and chemical reaction processes of fluids such as combustible gas mixtures in three-dimensional space, computational fluid dynamics software discretizes the governing equations into algebraic equations and solves them numerically at discrete points in time and space, from which the required performance parameters can be calculated accurately.
The experimental data were used to calibrate the 3D simulation model of the marine diesel engine, and this model was then used to run an orthogonal experimental design that generates additional data samples. The process of obtaining data is shown in
Figure 2 below:
Reported data indicate that the NOx in the exhaust of a marine low-speed diesel engine burning heavy fuel oil (HFO) amounts to 8.7% by mass relative to the fuel (NOx/HFO) [37]. This NOx is divided into thermal NOx (83.92%), fuel NOx (11.08%), and prompt NOx (5%) [38]. For the formation mechanism of NOx, the Zeldovich explanation of thermal NOx formation is widely recognized. This mechanism states that NOx is produced by the chemical reaction of N2 and O2 in the combustion mixture. Since NOx is a product of combustion, the parameters of the scavenging, compression, fuel injection, and combustion processes and the structural characteristics of the marine diesel engine all affect combustion and hence the formation of NOx. From the formation mechanism of thermal NOx, it is known that the distribution of temperature and pressure in the cylinder, the concentration and distribution of oxygen, and the flame propagation rate control the formation of NOx. The influence of the structural parameters, initial state parameters, working process parameters, and typical performance characteristic parameters on these three conditions, and hence on NOx generation, is outlined in
Figure 3.
Many factors affect the formation of NOx. Selecting appropriate input parameters can reduce the computational complexity of the NOx simulation model and achieve higher accuracy. Therefore, it is necessary to analyze further how strongly each related factor influences NOx formation and to describe the correlation quantitatively. Because the structural parameters of the 6EX340EF engine studied here do not change during use, their correlation with NOx is not considered. The simulation experiment was designed to provide data samples of the working and performance parameters of the engine under different control parameters, and the influencing factors follow a normal distribution. Therefore, the commonly used Pearson correlation coefficient and grey relational analysis were used to quantify the correlation of the relevant variables with NOx; the results using the Pearson correlation coefficient are shown in
Table 3, and the results of the grey relational analysis are shown in
Figure 4.
It can be seen from the correlation coefficients that P_CR, torque, IMEP, PFP, HRR, Q_HR, T_max, and P have a significant positive correlation with NOx. The positive correlations of T_oil, Ex_o, and T_sca are weak. The correlations of Inj, Inj_D, Ex_c, P_sca, T_exh, P_exh, CA50, and CA10 are negative.
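The two correlation measures can be sketched as follows. This is a minimal illustration on synthetic data, not the data or code of this study: the variable names are hypothetical stand-ins, the series are min-max normalized before the grey relational analysis, and the distinguishing coefficient `rho = 0.5` is the conventional default rather than a value stated here.

```python
import numpy as np

def pearson_with_target(X, y):
    """Pearson correlation between each column of X and the target y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc * yc[:, None]).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return num / den

def grey_relational_grade(X, y, rho=0.5):
    """Grey relational grade of each column of X against the reference series y."""
    def scale(a):
        # Min-max normalisation makes all series dimensionless.
        return (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
    delta = np.abs(scale(X) - scale(y)[:, None])   # pointwise absolute differences
    d_min, d_max = delta.min(), delta.max()        # global extrema over all series
    xi = (d_min + rho * d_max) / (delta + rho * d_max)  # relational coefficients
    return xi.mean(axis=0)                         # grade = mean coefficient

# Synthetic stand-ins (assumptions): one positively related, one negatively
# related, and one unrelated variable.
rng = np.random.default_rng(0)
n = 200
t_max = rng.normal(size=n)
inj = rng.normal(size=n)
unrelated = rng.normal(size=n)
nox = 2.0 * t_max - 1.0 * inj + 0.1 * rng.normal(size=n)

X = np.column_stack([t_max, inj, unrelated])
r = pearson_with_target(X, nox)
g = grey_relational_grade(X, nox)
print("Pearson r:", np.round(r, 3))
print("Grey relational grade:", np.round(g, 3))
```

Grey relational grades always lie in (0, 1] and carry no sign, whereas the Pearson coefficient distinguishes positive from negative relationships, which is one reason the two methods can rank the same variables differently.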
From the grey relational analysis correlation graph, it can be seen that all of the analyzed variables are strongly correlated with NOx, and this result has similarities and differences with the Pearson results. To make full use of the information for the relevant parameters and to reduce the number of model input parameters, principal component analysis was used to perform dimensionality reduction for the above 20 parameters. After principal component analysis, the resulting scree plot is shown in
Figure 5, and its component coefficient is shown in
Table 4. As can be seen from
Figure 5, the curve flattens out after the 7th principal component; that is, the 20 parameters can be reduced to 7 principal components by using the coefficients in
Table 4.
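The dimensionality reduction step can be sketched as follows. This is a minimal sketch on synthetic data, assuming the parameters are standardized before the decomposition; the 20-column matrix is generated from 7 hidden factors purely for illustration and does not reproduce the coefficients of Table 4.

```python
import numpy as np

def pca(X, n_components):
    """PCA on standardised data via SVD.

    Returns the component scores, the loading (coefficient) matrix, and the
    explained-variance ratio of every component (the scree plot values).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = S ** 2 / (Z.shape[0] - 1)
    ratio = eigvals / eigvals.sum()
    loadings = Vt[:n_components].T      # shape (n_features, n_components)
    scores = Z @ loadings               # low-dimensional model inputs
    return scores, loadings, ratio

# Synthetic stand-in (assumption): 20 correlated parameters driven by 7 factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 7))
mixing = rng.normal(size=(7, 20))
params = latent @ mixing + 0.05 * rng.normal(size=(300, 20))

scores, loadings, ratio = pca(params, n_components=7)
print("Cumulative explained variance:", np.round(np.cumsum(ratio)[:8], 4))
```

Plotting `ratio` against the component index gives the scree plot; the number of components is read off where the curve flattens.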
2.2. Construction of NOx Simulation Model
When NOx is the object of study, the marine diesel engine is a multiple-input single-output (MISO) system, and the NOx output is influenced not only by the parameters of the current cycle but also by those of the previous cycle. The resulting complexity and diversity of the physical model and its parameters make it difficult to simulate and control in real-time applications. The BPNN is a typical data-driven model that learns by simulating the forward and feedback transmission of signals between neurons and then calculates and predicts NOx.
The GMDH is a heuristic self-organizing method to study the relationship between variables. It can be used to select and classify input parameters to reduce the dimension of input features. For example, when there are three layers and four inputs, the sample architecture of the GMDH algorithm is shown in
Figure 6. In
Figure 6, there are four inputs (X1, X2, X3, and X4). Of these, three (X1, X2, and X4) govern the system, whereas X3 does not affect the classification. The GMDH algorithm thus selects the features that matter. It generates a series of active neurons by cross-combining the input units of the system. Each neuron selects its optimal transfer function, and the neurons whose outputs are closest to the target variable are then selected from the generated generation. The selected neurons are combined to create new ones, and this process of inheritance, competition for survival, and evolution is repeated until a model of optimal complexity is obtained.
The GMDH neural network has the following characteristics: the modeling process is self-organizing and requires no initial assumptions; it offers optimized complexity and high-precision prediction; it finds the optimal structure of each layer of the self-organizing multilayer network, that is, it automatically retains useful variables and removes redundant ones; and it automatically selects the optimal number of network layers and the number of neurons per layer.
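The self-organizing selection described above can be sketched as follows. This is a minimal sketch of the classical GMDH scheme under stated assumptions, not the implementation used in this paper: each neuron is a quadratic polynomial of two inputs fitted by least squares, neurons are ranked by their error on a separate validation set, and layers are added until that error stops improving. The function names, the `keep` parameter, and the synthetic data are hypothetical.

```python
import numpy as np
from itertools import combinations

def _design(xi, xj):
    """Design matrix of the two-input quadratic polynomial neuron."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])

def gmdh_layer(X_tr, y_tr, X_va, y_va, keep):
    """Fit one neuron per input pair; keep the `keep` best on validation data."""
    neurons = []
    for i, j in combinations(range(X_tr.shape[1]), 2):
        A_tr = _design(X_tr[:, i], X_tr[:, j])
        coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
        out_va = _design(X_va[:, i], X_va[:, j]) @ coef
        err = float(np.mean((out_va - y_va) ** 2))   # external criterion
        neurons.append((err, (i, j), A_tr @ coef, out_va))
    neurons.sort(key=lambda n: n[0])
    best = neurons[:keep]
    Z_tr = np.column_stack([n[2] for n in best])
    Z_va = np.column_stack([n[3] for n in best])
    return Z_tr, Z_va, best[0][0], [n[1] for n in best]

def gmdh_fit(X_tr, y_tr, X_va, y_va, keep=3, max_layers=5):
    """Grow layers until the best validation error no longer decreases."""
    best_err, history = np.inf, []
    for _ in range(max_layers):
        Z_tr, Z_va, err, pairs = gmdh_layer(X_tr, y_tr, X_va, y_va, keep)
        if err >= best_err:
            break                       # extra complexity no longer pays off
        best_err = err
        history.append((err, pairs))
        X_tr, X_va = Z_tr, Z_va         # surviving outputs feed the next layer
    return best_err, history

# Synthetic stand-in (assumption): four inputs, the third one is irrelevant.
rng = np.random.default_rng(2)
X = rng.normal(size=(600, 4))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 3] + 0.05 * rng.normal(size=600)
best_err, history = gmdh_fit(X[:400], y[:400], X[400:], y[400:])
for depth, (err, pairs) in enumerate(history, start=1):
    print(f"layer {depth}: validation MSE = {err:.4f}, inputs used = {pairs}")
```

The stopping rule is what gives the method its optimized complexity: a deeper network is accepted only when it lowers the error on data that was not used for fitting.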
The BPNN model topology consists of an input layer, an output layer, and several hidden layers. The structure of the BPNN network is shown in
Figure 7. Each layer of the BPNN has a specific number of neurons, and the network structure used in this paper can be found in [10,14].
In the forward propagation process, the calculation principle is shown as Equation (1): the output of each layer is obtained by multiplying the input of that layer by the weights and adding the bias, and the nonlinear mapping is then realized by the activation function (such as sigmoid, ReLU, or tanh). Once the GMDH has determined the structure of the model, what remains to be determined is the weights and biases of the BPNN. Without an optimization algorithm, weights and biases that meet the accuracy requirements are obtained by repeating the training process.
In Equation (1), $o_j$ is the output of neuron $j$; $w_{ij}$ is the weight between neuron $i$ and neuron $j$; $b_j$ is the bias of neuron $j$.
In backpropagation, the overall error over all groups of training sample data is shown in Equation (2). The gradient descent method is used to update the weights and biases, and the updated weights and biases are shown in Equations (3) and (4). Through the forward propagation of signals and the backpropagation of errors, the weights and biases of the neural network are adjusted continually until the required accuracy is achieved.
In Equation (2), $\hat{y}_k$ is the calculated output, $y_k$ is the expected output, and $w^{(l)}$ and $b^{(l)}$ are the weight and bias of layer $l$.
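For reference, the forward pass, the overall error, and the gradient descent updates described for Equations (1) to (4) are commonly written as below. The notation is an assumption chosen to match the verbal definitions: $x_i$ is the input from neuron $i$, $f$ the activation function, $N$ the number of training samples, and $\eta$ the learning rate; the $1/(2N)$ scaling of the error is one common convention and may differ from the one used in this paper.

```latex
\begin{align}
o_j &= f\Big(\sum_i w_{ij}\, x_i + b_j\Big) \tag{1}\\
E &= \frac{1}{2N}\sum_{k=1}^{N}\big(\hat{y}_k - y_k\big)^2 \tag{2}\\
w^{(l)} &\leftarrow w^{(l)} - \eta\,\frac{\partial E}{\partial w^{(l)}} \tag{3}\\
b^{(l)} &\leftarrow b^{(l)} - \eta\,\frac{\partial E}{\partial b^{(l)}} \tag{4}
\end{align}
```

The partial derivatives in (3) and (4) are obtained layer by layer with the chain rule, starting from the output error $\hat{y}_k - y_k$, which is what gives backpropagation its name.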
In this paper, the data samples designed by the simulation experiment are used to train and test the GMDH–BPNN model. To assess the prediction performance of GMDH–BPNN, the unimproved BPNN is used as a baseline comparison model. In addition, genetic algorithms achieve multiparameter optimization by simulating the inheritance, crossover, mutation, and evolution of populations in nature; the BPNN optimized by the genetic algorithm is therefore used as a second comparison model. In each test, the same amounts of data are used for training and testing, but the samples assigned to each set are randomized. After repeated training and testing, the mean of the test results is taken as the prediction accuracy of the model.
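The training and evaluation procedure can be sketched as follows for the plain BPNN baseline. This is a minimal sketch under stated assumptions rather than the configuration used in this paper: one tanh hidden layer with 8 neurons, full-batch gradient descent, an 80/20 split, five repetitions, and a synthetic 7-input data set are all illustrative choices, and the GA and GMDH variants are not shown.

```python
import numpy as np

def train_bpnn(X, y, hidden=8, lr=0.05, epochs=3000, seed=0):
    """One-hidden-layer BPNN trained by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    for _ in range(epochs):
        # Forward propagation: weighted sum plus bias, then activation.
        H = np.tanh(X @ W1 + b1)
        pred = (H @ W2 + b2).ravel()
        # Backpropagation of the error for E = 1/(2n) * sum((pred - y)^2).
        err = pred - y
        gW2 = H.T @ err[:, None] / n
        gb2 = err.mean(keepdims=True)
        dH = (err[:, None] @ W2.T) * (1.0 - H ** 2)   # tanh derivative
        gW1 = X.T @ dH / n
        gb1 = dH.mean(axis=0)
        # Gradient descent update of weights and biases.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

def repeated_holdout(X, y, n_repeats=5, test_frac=0.2, seed=0):
    """Same split sizes every time, random membership; returns the mean test RMSE."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * len(y)))
    errors = []
    for r in range(n_repeats):
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        params = train_bpnn(X[train], y[train], seed=r)
        resid = predict(params, X[test]) - y[test]
        errors.append(float(np.sqrt(np.mean(resid ** 2))))
    return float(np.mean(errors)), errors

# Synthetic stand-in (assumption): 7 inputs, e.g. principal component scores.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 7))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=300)
mean_rmse, errors = repeated_holdout(X, y)
print("test RMSE per repeat:", np.round(errors, 3), "mean:", round(mean_rmse, 3))
```

Averaging over several random splits reduces the dependence of the reported accuracy on any single partition, which matters when the data set is small.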
3. Results
In this paper, the coefficient of determination (R2), the mean absolute error (MAE), and the root-mean-square error (RMSE) on the test set are selected for error evaluation. The three indicators for the different calculation models are shown in
Table 5. In order to reflect the randomness of NOx data occurrence in practical applications and to calculate the prediction effect of the models, a dotted line plot is used to compare the true and predicted values. The results of the comparisons between the predicted and true values of the three models on the test and training sets are shown in
Figure 8,
Figure 9 and
Figure 10.
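The three indicators can be computed as in the sketch below, which uses their standard absolute definitions. Whether the percentage errors reported in this paper are normalized (for example, relative to the measured NOx value) is not stated, so any such scaling is left out; the four sample values are made up for illustration.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Illustrative values only (assumption), not results from this study.
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([11.0, 12.0, 13.0, 16.0])
print(r2_score(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```

A better model has a larger R2 but smaller MAE and RMSE, so the three indicators should not all be read in the same direction.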
It can be seen from
Table 5 that although the differences among the error indicators of the three models are not pronounced, the errors are all lower than 10%, and the GMDH–BPNN model performs best on all three indicators (R2, MAE, and RMSE). This shows that the GMDH–BPNN prediction curve fits the original data curve better and that the errors between its predicted values and the test samples are smaller.
From the prediction curves of the three models on the training and test sets, it can be seen that all models follow the trend of the real NOx values well, but the prediction at extreme points is still unsatisfactory when large abrupt changes occur. This may be due to the small volume of sample data and overfitting on the training set, which leads to unsatisfactory results on the test set. Meanwhile, the accuracy of the GMDH–BPNN being slightly higher than that of the GA–BPNN indicates that, for the current data, the structure of the BPNN has slightly more influence on its accuracy than the weights and biases do.
4. Discussion
In this paper, principal component analysis is used to reduce the dimensionality of numerous correlated factors, which makes full use of the correlation between the correlated factors and NOx while reducing the dimensionality of the input parameters of the model. The GMDH algorithm is also used to optimize the structure of the model and further reduce the computational effort of the NOx calculation model. Based on the results of this paper, the following studies can be continued in depth.
- (a)
The sample data used in this paper was small, and the obtained model may not be able to fully explain the performance of the studied model under the target operating conditions.
- (b)
In order to retain as much of the relevant parameter information as possible, 20 relevant parameters were reduced in dimensionality in this paper to obtain the input parameters. In subsequent work, control parameters and easily accessible parameters can be selected as input parameters to facilitate the application of the NOx calculation model to an actual diesel engine.
- (c)
Based on (b), the NOx calculation model can be integrated with hardware to obtain NOx values without using NOx sensors at all; the application environment of such hardware is relatively benign, so it is not easily damaged. The accuracy and calculation speed of the hardware-integrated NOx calculation model can be optimized further, which can provide a solution for the online monitoring of marine diesel engine NOx emissions.
5. Conclusions
In this paper, the low-speed marine diesel engine was used as the research object. The input parameters of the computational models were obtained by correlation analysis and principal component analysis of the relevant parameters, and the simulation data were then used to develop and validate the three models. The results show that the prediction errors of the three NOx simulation models, GMDH–BPNN, GA–BPNN, and BPNN can be reduced to 2.296%, 4.678%, and 5.425%, respectively.