1. Introduction
Hydrogen-fed polymer electrolyte membrane fuel cells (PEMFCs) are an efficient and flexible energy conversion technology capable of operating in future carbon-free, renewable, and distributed energy markets. Although PEMFCs are already commercialized, further research is required, especially at the stack level [
1]. Computational tools and numerical modeling of PEMFC stacks play an important role in studying the effect of operating conditions, system parameters, and different configurations on fuel cell performance [
2]. The fuel cell system comprises not only the fuel cell assembly, but also the auxiliary equipment related to fuel storage, air and fuel supply, cooling, and power conditioning [
1]. In fuel cell stack models, the electrochemical phenomena, energy balance, and mass balances are typically treated as separate modules. The electrochemical behavior of a PEMFC is governed by the membrane electrode assembly, which is depicted in
Figure 1, and characterized by a polarization curve, which is presented in
Figure 2. An equivalent circuit model is used to determine the polarization curve of the fuel cell (FC) in the steady state. The dynamics of an FC are governed by the energy and mass balances, as well as their interlinkage with the polarization curve model.
The polarization curve depicts the voltage–current characteristics of a fuel cell. Typically, such a model is constructed to describe the thermodynamic potential (or the open-circuit voltage) and the different losses (or overvoltage) in the performance. Often, the model is presented as a parameterized semi-empirical model with a theoretical background [
3,
4,
5,
6]. In addition, several other model structures can be found in the literature, and they have been recently presented [
7]. This kind of modeling approach aims to provide a model structure that generalizes to different fuel cells while remaining simple enough to be used in simulations of complete fuel cell power systems, where rigorous first-principles (mechanistic) models are often too complicated to apply [
2]. The semi-empirical models can also be found as a part of more rigorous fuel cell models [
8].
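For orientation, the general form that such semi-empirical models commonly take is sketched below. The symbols are generic textbook notation assumed here for illustration, not necessarily the notation of the cited works: the cell voltage equals the thermodynamic potential minus the activation, ohmic, and concentration overvoltages.

```latex
V_{\mathrm{cell}}(i) = E_{\mathrm{Nernst}} - \eta_{\mathrm{act}}(i) - \eta_{\mathrm{ohm}}(i) - \eta_{\mathrm{con}}(i)
```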
Parameter estimation (or identification) of the above-mentioned semi-empirical models has attracted considerable attention over the past 10 years. In particular, it leads to an industrially relevant nonlinear optimization problem for evolutionary optimizers. Evolutionary optimizers are understood as heuristic search methods that mimic natural evolution. For example, established methods such as genetic algorithms [
9,
10], differential evolution [
11,
12], and many new methods [
13,
14,
15] have been shown to work for the parameter estimation problem of the fuel cell polarization curve. In the process, some authors have also altered the model structure [
15] in order to improve the model accuracy.
In this study, an alternative way of applying evolutionary optimizer tools for polarization curve modeling is presented. Our novel approach involves model structure identification for the steady-state FC polarization curve model. Although evolutionary optimizers, such as genetic programming [
16], have previously been applied to determine solid oxide fuel cell models, those studies focused on reproducing the behavior of a single FC system. Here, the aim is to find one model structure that can also be generalized to different PEMFC systems. The resulting FC polarization curve model is linear with respect to its parameters, but nonlinear in its variables. This significantly simplifies the parameter estimation part of the model development. Although the structure identification essentially leads to a data-driven black-box model, the models developed in this study can be considered semi-empirical or gray-box models, as they utilize some of the auxiliary equations that are also used in the well-known semi-empirical model [
9,
10,
12]. In this study, the structure search results from Ohenoja et al. [
7] are extended to handle an alternative input variable set, and the structure search is repeated with varying model complexity. The proposed model structure is validated for different fuel cells and operating conditions, and the applicability of the model is discussed.
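Because the model is linear in its parameters, the coefficients of any fixed structure can be obtained by ordinary least squares. The following is a minimal sketch of that idea, not the authors' implementation: the function name, the three nonlinear terms, and the synthetic data are all hypothetical and do not reproduce the identified model structure.

```python
import numpy as np

def fit_linear_in_parameters(features, voltage):
    """Least-squares fit of V ~ a0 + sum_k a_k * f_k for fixed feature columns f_k."""
    design = np.column_stack([np.ones(len(voltage)), features])
    coeffs, *_ = np.linalg.lstsq(design, voltage, rcond=None)
    residuals = voltage - design @ coeffs
    return coeffs, float(residuals @ residuals)

# Synthetic, illustrative data only (not taken from the paper).
i = np.linspace(0.5, 20.0, 15)   # current, A
i_lim = 25.0                     # assumed limiting current, A

# Hypothetical nonlinear terms: nonlinear in the variables, linear in the parameters.
features = np.column_stack([i, np.log(i), (i / i_lim) ** 2])
true_coeffs = np.array([1.0, -0.01, -0.05, -0.2])
voltage = np.column_stack([np.ones_like(i), features]) @ true_coeffs

coeffs, sse = fit_linear_in_parameters(features, voltage)
```

With noise-free synthetic data the fit recovers `true_coeffs`, which illustrates why no iterative nonlinear optimizer is needed once the structure is fixed.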
The outline of this paper is as follows. In
Section 2, the optimization algorithm and the data utilized are presented. In
Section 3, the model structure identification results for three different cases are presented. In
Section 4 and
Section 5, the results are discussed, and the findings are summarized.
3. Results
The model structure identification is presented for three different cases. The model structure inspected in numerous studies [
9,
10,
12,
22] consists of seven unknown parameters. Hence, the first case presented in this study considers a maximum number of seven terms (
n = 7) in the model structure, ensuring a comparable model complexity with the widely used semi-empirical model structure. This case was also presented in Ohenoja et al. [
7], and is here accompanied by new results from two additional cases. In the second case, a different set of candidate model variables is used. In the third and final case, the model structure identification is performed with varying model complexity. The model structures presented are tested with four different FCs, one of them comprising polarization curves in four different operating conditions. The optimizations were run on a 3.4 GHz i7 PC with 12 GB of memory using Matlab® R2016a without parallel computing. The model structure search involved 5,000,000 objective function evaluations (and 20,000,000 regressions), requiring 38−51 min of wall-clock time, where the elapsed time increased linearly as a function of the allowed model complexity.
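The reported counts correspond to four regressions per objective function evaluation (20,000,000 / 5,000,000), which would be consistent with one least-squares fit per fuel cell; that reading is an assumption. Under it, a single objective function evaluation might look like the sketch below. The structure encoding, the transformation library, and the synthetic data sets are hypothetical and stand in for details not given in this section.

```python
import numpy as np

# Hypothetical transformation library; the actual set used in the study is not reproduced here.
TRANSFORMS = {
    "identity": lambda x: x,
    "square": lambda x: x ** 2,
    "log": lambda x: np.log(x),
}

def structure_sse(structure, datasets):
    """Total SSE of one candidate structure, refitted separately to each data set.

    structure: list of (column_index, transform_name) pairs, one per model term.
    datasets:  list of (X, v) pairs; X holds the candidate variables column-wise.
    """
    total = 0.0
    for X, v in datasets:
        cols = [TRANSFORMS[name](X[:, j]) for j, name in structure]
        design = np.column_stack([np.ones(len(v))] + cols)
        coeffs, *_ = np.linalg.lstsq(design, v, rcond=None)  # one regression per data set
        total += float(np.sum((v - design @ coeffs) ** 2))
    return total

# Two small synthetic data sets standing in for different fuel cells.
rng = np.random.default_rng(0)
datasets = []
for scale in (1.0, 2.0):
    X = rng.uniform(0.5, 5.0, size=(20, 2))
    v = scale * (1.0 - 0.1 * X[:, 0] - 0.05 * np.log(X[:, 1]))
    datasets.append((X, v))

good = structure_sse([(0, "identity"), (1, "log")], datasets)
poor = structure_sse([(0, "square")], datasets)
```

An evolutionary optimizer would then search over `structure`, using the returned total SSE as the fitness value.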
3.1. Case 1
The input variables for the first case, which were also presented in [
7], were
i,
T,
CO2,
pO2,
pH2,
ilim, and a calculated variable,
i/
ilim. The optimization results for the nominal model complexity are presented in
Figure 6, where the seven predicted polarization curves with experimental data are shown. Clearly, the model can follow the experimental data very accurately. In
Table 2, the
SSE values for each case are given. For comparison, the
SSE values from our earlier studies [
10,
24] are given. It should be noted that all of these studies use exactly the same data sets, and thus, a comparison of
SSE values is straightforward.
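The SSE is not defined in this section; assuming the usual sum of squared errors between the measured and predicted voltages over the N points of a data set, it reads as follows (the subscripts are notation introduced here for illustration).

```latex
\mathrm{SSE} = \sum_{k=1}^{N} \left( V_{\mathrm{meas},k} - V_{\mathrm{pred},k} \right)^{2}
```

Because this quantity is a sum rather than a mean, it grows with the number of data points, which is why SSE values are directly comparable only when the data sets are identical.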
Table 2 shows that the optimized model structure led to better results than the model structure in [
24] in terms of the
SSE for the SR-12 and Ballard FCs. In the BCS case, [
24] gives a better fit. For these three fuel cells,
SSE values are very low, and both studies led to acceptable results. For the 250W FC, we had data on several operating points, and the optimized model structure results in a significantly lower
SSE than in [
10]. In Ohenoja et al. [
10], the best fit had an
SSE value of 8.4854, whereas this model achieved an
SSE of 1.6154. Yang et al. [
15] reached an even lower
SSE value of 1.1746 for the 250W FC. This was achieved by adding three more free parameters in the modeling and parameter fitting.
The optimized model structure found for the FC polarization curve can be written as:
where:
The linear regression coefficients
a0 …
a7 are presented in the
Supplementary Material. Equation (5) shows that the model structure that was found incorporates operating conditions (
CO2,
pH2,
T). This is important in order to generalize the model predictions into different operating conditions. However, from the high
SSE value for the 250W/4 polarization data, it is clear that this generalization ability is limited. The term
i/
ilim is repeated several times in the optimized model. Indeed,
ilim has a strong effect on the polarization curve, as it determines the end point of the curve. In this exercise, it was assumed that the value of
ilim is known and fixed across the operating conditions. Naturally, unknown
ilim values would require new model structure identification.
3.2. Case 2
As recognized in [
7], the model structure search can also incorporate additional variables. Hence, the optimization is repeated with an altered set of input variables. In this case, the input variables were
i,
T,
CO2,
pO2,
pH2,
Lmem,
A, and
ilim. This way, the number of variable candidates remains constant. The model complexity is also kept at a comparable level by allowing up to seven terms in the resulting model. The resulting
SSEs are presented in
Table 2, and the optimized model structure in Equations (6) and (7):
where:
In this case, the model does not comprise
ilim. This might result in a predicted polarization curve whose zero crossing lies outside the actual current range of the FC. Hence, predictions at high currents may not be reliable with this model structure. Additionally, the incorporation of the logarithmic transformation of
i leads to a situation where the model cannot represent open-circuit voltage, as it has no solution when
i = 0. However, such a shortcoming may also be found in the semi-empirical models, as discussed in [
25]. The comparison of the
SSE values between the two model structures shows that this model structure has a higher total error (2.55). A closer look at the
SSE values shows that in Case 2, the prediction performance for the 250W FC was emphasized at the cost of the modeling performance for the three other fuel cells. This indicates a poor generalization ability for the model structure. Hence, the conclusion is that the variable set used in Case 1 is preferred over that of Case 2.
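The open-circuit limitation noted above can be illustrated numerically: a logarithmic term in i is not finite at i = 0, so a model containing it cannot be evaluated there. The toy model and its coefficients below are hypothetical and are not the Case 2 structure.

```python
import numpy as np

def log_term_model(i, a0=1.0, a1=-0.05):
    """Toy model V = a0 + a1 * ln(i); the coefficients are hypothetical."""
    with np.errstate(divide="ignore"):   # silence the log(0) warning for the demo
        return a0 + a1 * np.log(i)

currents = np.array([0.0, 0.1, 1.0, 10.0])   # A; the first point is open circuit
voltages = log_term_model(currents)
finite_mask = np.isfinite(voltages)          # False only at i = 0
```

In practice, a data set used with such a structure would have to exclude the open-circuit point.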
3.3. Case 3
In the third case, the effect of the model complexity on the model performance is studied. The structure search was performed with the number of allowable model terms varying between four and nine. The same variable candidates as in Case 1 were used. The resulting total
SSEs (all data) and
SSE for 250W FC are shown in
Figure 7. As expected, the model performance improves as the number of allowable model terms increases. However, it is notable that only a minor improvement is achieved beyond
n = 5.
The results in Case 3 outperform the model fitting observed in [
15], where a 10-parameter model structure had an
SSE value of 1.1746 for the 250W FC. They focused on the high-current region, and proposed two new expressions for the high-current region in order to improve the curve-fitting properties of the semi-empirical model. In
Figure 7, it is indicated that an even better model fitting for the 250W FC can be achieved with a nine-parameter model.
4. Discussion
The model structure found in this study (Case 1) provides very low
SSE values with the same number of free parameters, and therefore with a comparable model complexity to the semi-empirical model structure used in [
9,
10,
12,
22,
24]. Naturally, the remarkable difference in the observed
SSE values can be partly explained by the constrained parameter values in the semi-empirical model. It was shown in [
10] that the expanded search range leads to more accurate results. With the approach taken here, the optimization problem is unconstrained. It should also be noted that several authors have used the semi-empirical model structure, and have been able to find better
SSE values than reported in [
10] using different optimization algorithms. For example, Sun et al. [
12] managed to reach an
SSE value of 7.99 for the 250W fuel cell compared with 8.49 in [
10]. However, the data sets are not exactly the same, although they were interpreted from the same original polarization curves. Therefore, the direct comparison of
SSEs between different studies can be misleading.
In Case 2, the altered variable candidate list did not improve the model performance. An optimization case formulated with all of the possible variables, including
ρm and the number of cells in an FC stack, for example, would be preferable. In addition, the structure identification could incorporate more than one nonlinear transformation in series. Such alterations can be made with the approach presented, but require careful modifications to the optimization algorithm. Based on the results in Case 3, the model structure search could be utilized to extend the work in [
15]. It would be interesting to observe what kind of expression for the high-current region could be found with an evolutionary optimizer. However, in order to facilitate such a test, more data is required, as only a few data points in the polarization curve data that were utilized in this study are in this high-current region.
The approach taken in this study is fundamentally data-driven, and the link between the model parameter values and the physical phenomena that the mechanistic models hold is lost. However, the model structure established had (limited) generalization ability, and the model was linear with respect to its parameters. This feature can be beneficial in many FC system applications related to diagnostics, control, and large-scale simulations where easy model implementation or continuous adaptation is required. Naturally, an extension to dynamic models is required in such instances. The model structures that were identified in this study incorporated operating conditions, which can be linked to the mass and energy balances of an FC system. However, the model’s ability to capture transient states requires further studies.
Typically, the data-driven methods use a rather vast amount of (measured) data and data pre-processing. In this exercise, the data were very limited. Although the model was efficiently fitted to four different fuel cell polarization curves, a more rigorous benchmark test with a rich data set is needed. Data originating from [
9,
17] provide only an example with a limited number of data points. Benchmark data could be, for example, simulated via a rigorous model in different operating conditions with added noise and drift elements. Such data would allow efficient testing of the model performance.
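One possible way to generate such benchmark data is sketched below. A generic empirical polarization expression stands in for the rigorous model here, and the function names, all parameter values, the noise level, and the drift rate are hypothetical choices made only for illustration.

```python
import numpy as np

def reference_curve(i, e0=1.0, b=0.05, r=0.01, m=3e-5, n=0.3):
    """Stand-in for a rigorous model: a generic empirical polarization expression.

    V = e0 - b*ln(i) - r*i - m*exp(n*i); all parameter values are hypothetical.
    """
    return e0 - b * np.log(i) - r * i - m * np.exp(n * i)

def make_benchmark(i, n_repeats=5, noise_std=0.005, drift_per_repeat=-0.002, seed=0):
    """Repeat the current sweep n_repeats times, adding Gaussian noise and a linear drift."""
    rng = np.random.default_rng(seed)
    clean = reference_curve(i)
    drift = drift_per_repeat * np.arange(n_repeats)[:, None]   # one offset per sweep
    noise = rng.normal(0.0, noise_std, size=(n_repeats, i.size))
    return clean[None, :] + drift + noise

current = np.linspace(1.0, 25.0, 30)   # A; avoids i = 0 because of the log term
data = make_benchmark(current)         # shape: (n_repeats, number of current points)
```

Each row of `data` is one noisy sweep, so a candidate model structure could be fitted to some rows and tested on the others.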