This section provides a comparative study of data-driven virtual sensors on a real-life natural gas steam reforming hydrogen production process. The case study also helps in understanding the hydrogen production process and its data features. The developed models (DVBPCA and MW-DVBPCA) are compared with several state-of-the-art virtual sensors: partial least squares (PLS) [30], VBPCA, long short-term memory (LSTM) [31], echo state networks (ESN) [32], and dynamic PLS (DPLS) [33]. The PLS is a classic static model whose advantages are its ability to handle data collinearities and its simple model structure. The VBPCA is a static probabilistic model. Unlike the PLS, the VBPCA determines the number of principal components automatically, and it is the basis of our developed models (DVBPCA and MW-DVBPCA). The LSTM and ESN are advanced neural network models that capture process dynamics. The DPLS extends the PLS using the FIR paradigm, so it also accounts for the dynamics of the hydrogen production process.
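The FIR-style augmentation that dynamic models such as the DPLS rely on can be sketched as follows. This is a minimal illustration rather than the implementation used in this study; the function name `augment_with_lags` and the toy data are assumptions introduced only for the example.

```python
def augment_with_lags(rows, order):
    """Stack each sample with its `order` previous samples (FIR-style).

    rows  : list of samples, each a list of explanatory-variable values
    order : number of past samples appended to the current one
    Returns one augmented row per time index t >= order.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    augmented = []
    for t in range(order, len(rows)):
        row = []
        for lag in range(order + 1):
            row.extend(rows[t - lag])  # current sample first, then older ones
        augmented.append(row)
    return augmented


# Toy data: 4 samples of 2 explanatory variables (made-up values).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
X_aug = augment_with_lags(X, order=2)
print(X_aug)
```

Each augmented row has (order + 1) times as many columns as the original, which is why high dynamic orders can make a model prone to overfitting.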
4.1. Natural Gas Steam Reforming Hydrogen Production Process
The natural gas steam reforming hydrogen production process is the largest industrial source of hydrogen. It consists of four main processes: feedstock purification, steam reforming, medium temperature conversion, and pressure swing adsorption. Among these, steam reforming is the dominant reaction process, which is shown schematically in Figure 3 [34]. Processed gases are mixed with steam in the pre-reformer, where all hydrocarbons and some
are converted to
,
and
. The temperature then decreases, which does not favor hydrogen generation. The pre-reformer output is therefore heated in a pre-heater and then fed continuously to the primary reformer for complete reforming.
The main chemical reactions of this process are shown as follows:
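For reference, the textbook reactions usually given for methane steam reforming are the reforming reaction and the water-gas shift reaction. They are written here in their standard form as an assumption and may differ in detail from the equation the text refers to:

```latex
\begin{align}
\mathrm{CH_4} + \mathrm{H_2O} &\rightleftharpoons \mathrm{CO} + 3\,\mathrm{H_2} \\
\mathrm{CO} + \mathrm{H_2O} &\rightleftharpoons \mathrm{CO_2} + \mathrm{H_2}
\end{align}
```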
According to Equation (34), the exit gases of the hydrogen production process consist of
,
,
, and
, and the concentrations of these gases are the KVs (labeled “Y” in Figure 3) related to product quality. These KVs therefore need to be strictly monitored. In practice, offline laboratory analysis and hardware sensors are the traditional methods for measuring the KVs, but they suffer from measurement delays and high investment costs. Meanwhile, devices connected in series (such as pre-reformers and pre-heaters) introduce considerable transportation delays, which must be considered in virtual sensor modeling. A virtual sensor that accounts for transportation delays is therefore desirable for real-time prediction of the KVs.
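One simple way to account for transportation delays is to pair each KV sample with explanatory-variable samples taken earlier in time. The sketch below assumes known integer delays measured in samples; the function name `align_with_delays` and the data are hypothetical and are not taken from the plant.

```python
def align_with_delays(inputs, target, delays):
    """Pair target[t] with inputs[j][t - delays[j]] for every input j.

    inputs : list of equally long series, one per explanatory variable
    target : key-variable series of the same length
    delays : non-negative integer delay (in samples) per input
    """
    if len(inputs) != len(delays):
        raise ValueError("one delay per input is required")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    max_delay = max(delays)
    X, y = [], []
    for t in range(max_delay, len(target)):
        X.append([series[t - d] for series, d in zip(inputs, delays)])
        y.append(target[t])
    return X, y


# Made-up series: u1 reaches the outlet 2 samples late, u2 has no delay.
u1 = [0, 1, 2, 3, 4, 5]
u2 = [10, 11, 12, 13, 14, 15]
y_raw = [100, 101, 102, 103, 104, 105]
X, y = align_with_delays([u1, u2], y_raw, delays=[2, 0])
print(X, y)
```

The first `max(delays)` samples are dropped because no sufficiently old input value exists for them.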
4.4. Parameter Selection
The parameters of each model need to be tuned so that the models are compared at their best prediction performance. The DE algorithm is used to minimize the average RMSE (i.e., the mean of the RMSEs of the four KVs) on the validation set. The parameters to be optimized are the number of principal components and the time delays for the PLS; the number of principal components, the time delays, and the dynamic orders for the DPLS; the time delays and the dynamic orders for the VBPCA-based models; and the time delays for the LSTM and ESN.
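The tuning loop described above can be sketched with a plain DE/rand/1/bin scheme and an average-RMSE objective. This is an illustrative sketch under stated assumptions, not the configuration used in this study: the population size, mutation weight, crossover rate, and the toy validation problem (two KVs that depend on one input delayed by 5 samples) are all hypothetical.

```python
import math
import random


def average_rmse(y_true, y_pred):
    """Mean of the per-KV RMSEs; y_true and y_pred are lists of rows."""
    n_kv = len(y_true[0])
    total = 0.0
    for k in range(n_kv):
        sq = [(t[k] - p[k]) ** 2 for t, p in zip(y_true, y_pred)]
        total += math.sqrt(sum(sq) / len(sq))
    return total / n_kv


def differential_evolution(objective, bounds, pop_size=20, generations=60,
                           weight=0.7, crossover=0.9, seed=0):
    """Plain DE/rand/1/bin; returns (best_vector, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Toy validation problem (hypothetical): both KVs depend on u delayed by 5 samples.
u = [math.sin(0.3 * t) for t in range(80)]


def validation_error(params):
    delay = int(round(params[0]))  # DE searches a continuous box; delays are integers
    y_true = [[u[t - 5], 2.0 * u[t - 5]] for t in range(10, 80)]
    y_pred = [[u[t - delay], 2.0 * u[t - delay]] for t in range(10, 80)]
    return average_rmse(y_true, y_pred)


best, best_score = differential_evolution(validation_error, bounds=[(0.0, 10.0)])
best_delay = int(round(best[0]))
print(best_delay, best_score)
```

Rounding inside the objective is one simple way to let a continuous optimizer handle integer-valued delays; other encodings are possible.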
For the VBPCA-based models, the number of principal components is determined automatically. The key question is whether each column of the loading matrix
is insignificant or significant, i.e.,
or
, which is controlled by
according to Equation (8). Taking the DVBPCA as an example, the variance of each
can be quantified by
for
, as displayed in Figure 6. The color of each point in Figure 6 indicates whether the value of
is insignificant or significant relative to a set threshold, where red means significant and blue means insignificant. As shown in Figure 6, the appropriate dimensionality of the principal component subspace is selected as 39.
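The thresholding idea can be illustrated with a small sketch. The exact criterion depends on Equation (8), which is not reproduced here, so the mean squared entry of each loading column is used as a stand-in for its variance; the function name, the threshold value, and the matrix are assumptions made for illustration only.

```python
def count_significant_columns(loading, threshold):
    """Return indices of loading-matrix columns whose mean squared entry
    exceeds `threshold`, together with all column variances.

    loading   : list of rows of the loading matrix (n_variables x n_components)
    threshold : hypothetical cut-off separating significant from insignificant
    """
    n_rows = len(loading)
    n_cols = len(loading[0])
    variances = [sum(row[j] ** 2 for row in loading) / n_rows
                 for j in range(n_cols)]
    keep = [j for j, v in enumerate(variances) if v > threshold]
    return keep, variances


# Made-up loading matrix: the middle column has been shrunk towards zero.
W = [[0.9, 0.001, -0.4],
     [-0.7, 0.002, 0.5],
     [0.8, -0.001, 0.3]]
keep, variances = count_significant_columns(W, threshold=1e-3)
print(keep, len(keep))
```

The number of retained columns plays the role of the selected dimensionality of the principal component subspace.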
For the ESN, the input regulation scale is set to 0.1, the reservoir size to 50, and the spectral radius to 0.8. Because the ESN involves random weights in the reservoir computing step, 20 modeling trials were performed. For each trial, the random weights are saved and fixed, and the DE algorithm is then used to optimize the model delays. The best result of the 20 trials is used for the model performance comparison. Note that the ESN is a single-output model; we therefore construct four ESN models, one for each KV. For the LSTM, based on tuning experience, the time step is set to 1, the learning rate to 0.1, the number of hidden layers to 1, and the number of neurons in the hidden layer to 100, a configuration that performed well empirically in each of the replicated experiments of this work.
For the MW-DVBPCA, the impact of the MW size is detailed in Figure 7. The MW size is expressed in units of 10 min, i.e., an MW size of 50 means that the MW contains 500 min of data. As shown in Figure 7, for small MW sizes, the data in the window may not adequately represent the relationships between the process variables. In contrast, an excessive MW size covers too many outdated samples, so the MW-DVBPCA fails to track process changes adequately. Therefore, according to Figure 7, the MW size was selected as 50.
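The moving-window mechanism itself can be sketched independently of the DVBPCA. In the example below a one-dimensional least-squares line stands in for the local model, which is an assumption made purely to keep the example short; the class name and the synthetic data with a process change are hypothetical.

```python
from collections import deque


class MovingWindowModel:
    """Refit a local least-squares line y = a*x + b on the latest samples."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)  # oldest sample drops out automatically
        self.slope = 0.0
        self.intercept = 0.0

    def update(self, x, y):
        """Add a newly labelled sample and rebuild the local model."""
        self.window.append((x, y))
        n = len(self.window)
        mean_x = sum(p[0] for p in self.window) / n
        mean_y = sum(p[1] for p in self.window) / n
        sxx = sum((p[0] - mean_x) ** 2 for p in self.window)
        sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in self.window)
        self.slope = sxy / sxx if sxx > 0 else 0.0
        self.intercept = mean_y - self.slope * mean_x

    def predict(self, x):
        return self.slope * x + self.intercept


# Synthetic process change: y = 2x for the first 100 samples, then y = -x + 5.
model = MovingWindowModel(window_size=50)
for i in range(200):
    x = float(i % 10)
    y = 2.0 * x if i < 100 else -x + 5.0
    model.update(x, y)

print(model.predict(3.0))
```

After the change, the window eventually contains only samples from the new regime, so the local model follows it. A larger window would keep outdated samples longer, and a very small one would fit too few points, which mirrors the trade-off described for Figure 7.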
4.5. Results and Analysis
The estimations of the KVs on the test set obtained by the seven investigated virtual sensors are visualized in Figure 8, Figure 9 and Figure 10. In Figure 8, the PLS and VBPCA clearly show poor estimation performance. Because of the significant dynamics of the hydrogen production process, the estimation accuracy of the two static models, PLS and VBPCA, is not as satisfactory as that of the five dynamic models (for example, around the 120th sample of the
concentration).
Figure 9 shows that the estimated values of the DVBPCA track the real values better than those of the other three models (for example, around the 300th sample of the
concentration and around the 300th sample of the
concentration). This is because, on the one hand, the DVBPCA can deal with the collinearities between the EVs, unlike the LSTM and ESN. On the other hand, the DVBPCA can mitigate the overfitting that results from high-order variable augmentation, unlike the DPLS. Moreover, the ESN constructs four separate virtual sensors, one for each KV, whereas the DVBPCA is a multi-output model that considers the inherent relationships between the KVs.
Figure 10 illustrates that the estimations of the four KVs by the MW-DVBPCA match the real values much better (particularly in the localized area of the
concentration around the 440th sample) than those of the other models, revealing the importance of considering time-varying properties in the virtual sensor modeling of the hydrogen production process. Although the overall predicted values of the proposed MW-DVBPCA match the true ones well, some discrepancies can be observed between test samples 200 and 300 for all four KVs. There are two possible reasons for these discrepancies. Firstly, the characteristics of the samples between test samples 200 and 300 change rapidly, and model learning does not accommodate such changes in time. Secondly, the samples between test samples 200 and 300 exhibit nonlinearities, whereas the local model constructed by the MW-DVBPCA is linear. Overall, the proposed models show noticeable advantages over the benchmark models.
Figure 11 compares the seven models by means of scatter plots, which allows a further comparison of the predictions of the data-driven models. As shown in Figure 11a, the predictions of the PLS for the
component deviate clearly from the real values. Figure 11b,c show that the VBPCA model somewhat improves the prediction accuracy of the
and
concentrations compared to the PLS model. However, the overall prediction accuracy is still very low. Figure 11d reveals that all models have relatively poor prediction accuracy for the
concentration, but the MW-DVBPCA presents a better result. As indicated in Figure 11, the scatter points of the MW-DVBPCA lie more closely around the diagonal line than those of the other models, illustrating its better performance.
The estimation performance of all data-driven virtual sensors is tabulated in Table 2 and Table 3. For further comparison, Table 2 and Table 3 also show the estimations of the data-driven virtual sensors that do not consider transportation delays. The parameter selection for these models is consistent with that of the corresponding models that consider transportation delays. Overall, the estimation results in Table 2 and Table 3 provide an initial validation of the effectiveness of the data-driven virtual sensors. However, the performance of the different data-driven virtual sensors varies considerably. As shown in Table 2 and Table 3, the models that consider delays perform better than the corresponding models that do not. Take the DVBPCA as an example. In terms of the RMSE index, the predictive performance on the four KVs of the model accounting for the time delays is improved by 18.6%, 13.0%, 2.3%, and 5.5%, respectively, compared to the model ignoring the time delays. This is because the hydrogen production process has substantial time delays, and it has been shown that ignoring the delays can significantly degrade performance [29]. The
values for the two static models, i.e., the PLS and VBPCA, are below 0.5; in contrast, the dynamic models, such as the LSTM, ESN, and DPLS, perform significantly better than the PLS and VBPCA, indicating that dynamic models better fit the data features of the actual hydrogen production process. Moreover, owing to its capability of dealing with overfitting, the DVBPCA performs better than the DPLS. Concretely, compared to the DPLS, the RMSEs of the four KVs obtained by the DVBPCA are decreased by 13.4%, 1.8%, 3.2%, and 4.8%, respectively. The MW-DVBPCA further improves the estimations of the four KVs. The
s of the
and
concentrations reach up to 0.9. Compared with the DVBPCA, the predictive performance of the MW-DVBPCA on the four KVs improves by 1.3%, 10.3%, 2.8%, and 33.4%, respectively, in terms of the MAE index.
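The way such indices and relative improvements are typically computed can be sketched as follows. The definitions below are the standard ones for RMSE, MAE, and the coefficient of determination; treating the latter as one of the reported indices is an assumption, and all numbers are made up for illustration rather than taken from Table 2 or Table 3.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def improvement_percent(baseline, improved):
    """Relative reduction of an error index, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Made-up measurements and predictions for a single KV.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))

# Hypothetical error indices of two models: 0.50 (baseline) versus 0.40.
print(improvement_percent(0.50, 0.40))
```

With this definition, an improvement of 20% means that the error index of the improved model is 80% of the baseline value.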
Furthermore, to check whether the performance of the MW-DVBPCA is significantly different from that of the other models, the Wilcoxon test is employed for statistical testing [35]. The Wilcoxon test is a non-parametric method used to examine whether there is a significant difference between the median values of the squared estimation errors obtained by two virtual sensors. In the Wilcoxon test, the evidence against the hypothesis that the two medians are equal is measured by the p-value: the smaller the p-value, the less the data support the hypothesis. Typically, the hypothesis is rejected if the p-value is less than the given significance level; that is, the median values of the two virtual sensors are statistically different.
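A paired comparison of this kind can be sketched with a signed-rank test. The text does not state which Wilcoxon variant is used, so the paired signed-rank form with a normal approximation is assumed here; the function name and the squared-error series are hypothetical.

```python
import math


def wilcoxon_signed_rank(errors_a, errors_b):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Returns (w_plus, p_value). Zero differences are discarded and tied
    absolute differences share their average rank; no tie or continuity
    correction is applied to the variance, so treat small samples with care.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, p_value


# Made-up squared errors: the benchmark is consistently worse than the proposed model.
benchmark_sq_err = [1.0 + 0.1 * i for i in range(30)]
proposed_sq_err = [0.5 + 0.05 * i for i in range(30)]
w_plus, p_value = wilcoxon_signed_rank(benchmark_sq_err, proposed_sq_err)
print(w_plus, p_value, p_value < 0.05)
```

For real analyses, an established statistics library is preferable to this approximation, particularly for small samples or many ties.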
The Wilcoxon test results are given in Table 4, where the hypotheses are stated in terms of the median values of the squared estimation errors obtained by the PLS, VBPCA, LSTM, ESN, DPLS, DVBPCA, and MW-DVBPCA, respectively. The significance level is set at 5%. As shown in Table 4, all p-values are far below the significance level. Hence, all hypotheses are rejected. In other words, the differences between the MW-DVBPCA and the other virtual sensors in the hydrogen production process are statistically significant.
4.6. Computational Efficiency Analysis
Since this article is concerned with real-time estimation, it is worth examining the runtime of the models. The offline and online computational efficiency of the virtual sensors is evaluated as the average over 10 independent simulations, covering the CPU time consumed offline for parameter optimization and the CPU time consumed online. All experiments were run on a Core i5 (2.90 GHz × 2) machine with 8 GB RAM, Windows 10, and MATLAB R2021a.
Table 5 lists the time taken by each virtual sensor for parameter determination. As can be observed, the offline CPU time of the DVBPCA model is much smaller than that of the LSTM, owing to its more concise structure. The offline CPU times of the other dynamic global models are almost the same as that of the DVBPCA, but these models are less accurate than the DVBPCA, as shown in Table 2 and Table 3. Note that the offline CPU time of the MW-DVBPCA is much larger than that of the DVBPCA because the MW-DVBPCA needs to rebuild the model each time it predicts a new valid sample. Fortunately, the parameter determination process is carried out offline, so it hardly affects the online computational efficiency of the MW-DVBPCA. The last column of Table 5 gives the online computation time of the MW-DVBPCA, which shows that the online computational efficiency of the developed models is not an issue. In practice, the online CPU times of all virtual sensors are less than 0.1 s/sample, far shorter than the minimum sampling period of the KVs in the hydrogen production process. The results show that all data-driven virtual sensors, including the developed DVBPCA and MW-DVBPCA, meet the time requirements for real-time estimation.
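Averaging CPU time over repeated runs, as done for the efficiency indices, can be sketched as follows. The experiments in this study were run in MATLAB; the Python helper below is only an illustration, and the function names and the stand-in workload are assumptions.

```python
import time


def average_cpu_time(func, repeats=10):
    """Average CPU time (seconds) of calling `func()` over `repeats` runs."""
    total = 0.0
    for _ in range(repeats):
        start = time.process_time()
        func()
        total += time.process_time() - start
    return total / repeats


def dummy_prediction():
    # Stand-in workload; a real virtual sensor's prediction step would go here.
    return sum(i * i for i in range(10_000))


seconds_per_sample = average_cpu_time(dummy_prediction, repeats=10)
meets_budget = seconds_per_sample < 0.1  # 0.1 s/sample is the bound quoted in the text
print(seconds_per_sample, meets_budget)
```

`time.process_time` counts CPU time of the current process rather than wall-clock time, which matches the notion of CPU time used in this comparison.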