The proposed design considers the different situations in which the measured variables, the variables obtained through the Kalman filter framework algorithms, and the variables retained after feature selection are used as inputs for comparison. Furthermore, it considers both the time series of each input variable and its statistics as the transformed input feature variables for comparative analysis.
3.3.1. Combination of Input Variables Based on Measured Variables
The measured current and voltage variables, the transformation of the sliding window with 50 time points, and the transformation of statistics of the sliding window with 50 time points are considered as the inputs of the six machine learning algorithms for experimental comparison.
- (a) Variable Ontology

We used the measured voltage, $U_t$, and current, $I_t$, to estimate the SOC at time $t$, $SOC_t$, as follows:

$$SOC_t = F(U_t, I_t),$$

where $F$ represents the function fitted by the algorithm.
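As an illustration of fitting the function $F$ from measured $(U_t, I_t)$ pairs to $SOC_t$, the following is a minimal sketch using scikit-learn's Random Forest (one of the six algorithms compared in this paper). The synthetic numbers and the toy voltage-SOC relation are assumptions for illustration, not the paper's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the measured variables (assumed values, not the paper's data).
rng = np.random.default_rng(0)
n = 200
U = rng.uniform(3.0, 4.2, n)    # terminal voltage U_t [V]
I = rng.uniform(-2.0, 2.0, n)   # current I_t [A]
soc = (U - 3.0) / 1.2           # toy monotone voltage-SOC relation, illustration only

# Variable ontology: the raw (U_t, I_t) pair is the whole input.
X = np.column_stack([U, I])
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, soc)
pred = model.predict(X)
```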
- (b) Transformation variables of sliding window with 50 time points

The $SOC_t$ at time $t$ is estimated by the voltage, $U$, and current, $I$, at time $t$ and at the 49 preceding time points, as shown below:

$$SOC_t = F(U_t, U_{t-1}, \ldots, U_{t-49};\; I_t, I_{t-1}, \ldots, I_{t-49}),$$

where $F$ represents the function fitted by the algorithm, each segment separated by a semicolon refers to the time series combination of one variable, and the input variables of the 49 time points before time 0 are filled with the value at time 0.
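The sliding-window transformation can be sketched as follows. The padding of the 49 pre-start points with the value at time 0 matches the text; the function name and the small toy series are assumptions for illustration:

```python
import numpy as np

def sliding_window_features(x, window=50):
    """Per row t, stack x_{t-window+1}, ..., x_t (oldest first);
    points before t = 0 are filled with the value at time 0."""
    x = np.asarray(x, dtype=float)
    padded = np.concatenate([np.full(window - 1, x[0]), x])
    return np.stack([padded[t:t + window] for t in range(len(x))])

# 3 voltage samples expand to 3 rows of 50 lagged values each.
U_feat = sliding_window_features([3.7, 3.6, 3.5], window=50)
```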
- (c) Transformation variables of statistics of sliding window with 50 time points

The $SOC_t$ at time $t$ is estimated using the mean, standard deviation, and median of the voltage and current measured at time $t$ and at the previous 49 time points, as shown below:

$$SOC_t = F\big(\mathrm{Mean}(U_{t-49}, \ldots, U_t), \mathrm{Std}(U_{t-49}, \ldots, U_t), \mathrm{Median}(U_{t-49}, \ldots, U_t);\ \mathrm{Mean}(I_{t-49}, \ldots, I_t), \mathrm{Std}(I_{t-49}, \ldots, I_t), \mathrm{Median}(I_{t-49}, \ldots, I_t)\big),$$

where $F$ represents the function fitted by the algorithm, $\mathrm{Mean}$, $\mathrm{Median}$, and $\mathrm{Std}$ represent the functions that determine the mean, median, and standard deviation of a variable, respectively, and the input variables of the 49 time points before time 0 are filled with the value at time 0.
3.3.2. Combination of Input Variables Based on Kalman Framework Algorithm
The 27 variables based on the Kalman filter framework algorithms (the measured current and voltage plus 25 derived variables), the transformation of the sliding window with 50 time points, and the transformation of statistics of the sliding window with 50 time points are considered as the inputs of the six machine learning algorithms for experimental comparison.
- (a) Variable Ontology

The Kalman filter framework algorithm establishes the state equation based on the equivalent circuit model of the battery to estimate the battery terminal voltage and SOC. It dynamically corrects the state variables using the error of the measured terminal voltage. In this paper, we employ the common second-order equivalent circuit DP model, whose circuit structure is depicted in Figure 2.
It can be observed that the DP model comprises a voltage source, an ohmic internal resistance, and two RC networks, where $U_{ocv}$ represents the battery open-circuit voltage, $R_0$ represents the battery ohmic internal resistance, and $U_t$ represents the battery terminal voltage. The parallel network constructed by $R_1$ and $C_1$ is used to describe the long-term concentration polarization effect, the parallel network constructed by $R_2$ and $C_2$ is used to describe the short-term electrochemical polarization effect, and $U_1$ and $U_2$ represent the low-order and high-order polarization voltages of the second-order circuit model, respectively.
According to Kirchhoff's law, the output voltage of the DP model is given as follows (with the discharge current taken as positive):

$$U_t = U_{ocv} - U_1 - U_2 - I_t R_0,$$
$$\dot{U}_1 = -\frac{U_1}{R_1 C_1} + \frac{I_t}{C_1}, \qquad \dot{U}_2 = -\frac{U_2}{R_2 C_2} + \frac{I_t}{C_2},$$

where $\dot{U}_1$ and $\dot{U}_2$ represent the derivatives of $U_1$ and $U_2$ with respect to time, respectively.
The state variables of the state equation of the above circuit model take the general form $x = [U_1, U_2, SOC]^T$, where the superscript $T$ refers to the transposition of the matrix.
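To make the state equation concrete, it can be discretized with a simple forward Euler step, as in the following sketch. The parameter values, the linear open-circuit-voltage curve, and the step function itself are illustrative assumptions, not values identified in the paper:

```python
import numpy as np

# Illustrative DP-model parameters (assumed values, not identified from data).
R0, R1, C1, R2, C2 = 0.05, 0.02, 2000.0, 0.01, 500.0
Q_NOM = 2.0 * 3600.0                     # nominal capacity [A*s], assumed

def dp_step(x, I, dt=1.0, uocv=lambda soc: 3.2 + soc):
    """One forward-Euler step of the state x = [U1, U2, SOC]^T.
    Discharge current is positive; returns (next state, terminal voltage U_t)."""
    U1, U2, soc = x
    U1 = U1 + dt * (-U1 / (R1 * C1) + I / C1)
    U2 = U2 + dt * (-U2 / (R2 * C2) + I / C2)
    soc = soc - dt * I / Q_NOM           # coulomb counting
    Ut = uocv(soc) - U1 - U2 - I * R0    # Kirchhoff output equation
    return np.array([U1, U2, soc]), Ut

x0 = np.array([0.0, 0.0, 1.0])           # start fully charged, no polarization
x1, Ut = dp_step(x0, I=1.0)              # 1 A discharge for one second
```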
The five algorithms based on the second-order model of Li-ion batteries, i.e., EKF, UKF, MIUKF, UKF-VFFRLS, and MIUKF-VFFRLS, follow the framework of Kalman filtering. Their state variables follow the general form presented above, which yields five output variables, i.e., the $U_1$ gain, $K_{U_1}$, the $U_2$ gain, $K_{U_2}$, the SOC gain, $K_{SOC}$, the estimated terminal voltage, $\hat{U}_t$, and the estimated SOC, $\widehat{SOC}_t$; these five variables are closely related to the final real SOC value. Therefore, a total of 25 variables, obtained from the five variables in each of the five algorithms, are selected as input variables of the machine learning algorithms. The original measured current and voltage are the basic variables used to estimate the SOC and are also included in the input variables. Therefore, a total of 27 variables are used to estimate the SOC, as shown below:

$$SOC_t = F\big(I_t, U_t, K_{U_1}^{a}, K_{U_2}^{a}, K_{SOC}^{a}, \hat{U}_t^{a}, \widehat{SOC}_t^{a}\big), \quad a \in \{\text{EKF}, \text{UKF}, \text{MIUKF}, \text{UKF-VFFRLS}, \text{MIUKF-VFFRLS}\},$$

where $F$ represents the function fitted by the algorithm and the superscript $a$ refers to the corresponding Kalman filter framework algorithm.
- (b) Transformation variables of sliding window with 50 time points

The $SOC_t$ is estimated by the 27 input variables at time $t$ and at the 49 preceding time points. The input variables of the 49 time points before time 0 are filled with the value at time 0. This formula is omitted due to its complexity.
- (c) Transformation variables of statistics of sliding window with 50 time points

The $SOC_t$ at time $t$ is estimated using the mean, standard deviation, and median of the 27 input variables at time $t$ and at the previous 49 time points. The input variables of the 49 time points before time 0 are filled with the value at time 0. This formula is also omitted due to its complexity.
3.3.3. Combination of Input Variables after Elimination of Redundant Variables Based on Kalman Framework Algorithm Variables
The transformation of the time sliding window increases the computational complexity of the tree model algorithms (Random Forest, XGBoost, and AdaBoost) because of the large number of variables based on the Kalman filtering framework, which considerably increases the computational time of the experiment. Furthermore, Linear Regression and SVR are sensitive to redundant feature variables; therefore, the redundant variables must be eliminated. Additionally, for the LSTM algorithm, we compare performance between all the feature variables and the selected feature variables. Therefore, in this paper, we employed the recursive feature elimination method to eliminate the redundant variables based on the Kalman filter framework, ultimately selecting the three most important feature variables for each algorithm as the input.
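As a sketch of the selection step, scikit-learn's RFE can recursively drop the least important feature until three remain. The stand-in data (8 columns rather than 27, with 3 truly informative) and the choice of Random Forest as the base estimator are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

# Stand-in for the Kalman-framework variables: 8 columns, 3 truly informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 0] + 2.0 * X[:, 3] - X[:, 5] + rng.normal(scale=0.1, size=200)

# Recursively eliminate the least important feature until three are left.
selector = RFE(RandomForestRegressor(n_estimators=50, random_state=0),
               n_features_to_select=3).fit(X, y)
selected = np.flatnonzero(selector.support_)
```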
- (a) Variable ontology after recursive feature elimination (RFE)

We employed a top-down method in which all the features were included at the beginning and were then gradually discarded while the results were analyzed. The evaluation method is described below.
We divided the data set into five disjoint parts of equal size. We then successively selected one of these five parts as the validation set, with the remaining four parts used as the training set, so that five separate model training and validation runs were performed. The average of the five validation results was taken as the validation error of the model.
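The five-fold evaluation described above can be sketched as follows; the toy data and the use of mean squared error as the validation metric are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

def cv_error(X, y, model, k=5):
    """Average validation error over k disjoint, equally sized folds:
    each fold serves once as the validation set, the rest as training."""
    errs = []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        model.fit(X[train_idx], y[train_idx])
        errs.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))
    return float(np.mean(errs))

# Toy stand-in data (assumed, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=100)
err = cv_error(X, y, LinearRegression())
```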
Lastly, the three most important features for each algorithm were selected, as shown in Table 1.
- (b) Transformation variables of sliding window with 50 time points after recursive feature elimination

The $SOC_t$ is estimated by the three most important input variables at time $t$ and at the 49 preceding time points. The input variables of the 49 time points before time 0 are filled with the value at time 0. This formula is omitted due to its complexity.
- (c) Transformation variables of statistics of sliding window with 50 time points after recursive feature elimination

The $SOC_t$ at time $t$ is estimated using the mean, standard deviation, and median of the three most important input variables at time $t$ and at the previous 49 time points. The input variables of the 49 time points before time 0 are filled with the value at time 0. This formula is omitted due to its complexity.