1. Introduction
The battery is a vital energy storage device widely utilized in various sectors, including automobiles, power, and communication. However, issues such as aging of electrode plates, sulfation, and grid corrosion drastically reduce the battery’s capacity, leading to premature failure. Accurate prediction of state of health (SOH) can provide a reliable basis for battery replacement and reduce the cost of battery pack replacement. At present, the mainstream power batteries on the market are categorized as lead–acid batteries and lithium batteries. When compared to the extensive research conducted on SOH estimation of lithium batteries, the research conducted on lead–acid batteries is significantly less. Therefore, we will also refer to the latest literature on lithium-ion batteries, which can aid us in better assessing the advantages and disadvantages of the SOH estimation methods.
Capacity degradation is the main failure mode of lead–acid batteries, so predicting battery life is equivalent to predicting how the residual capacity changes over the cycles. SOH is defined in Equation (1):
$$\mathrm{SOH} = \frac{C_t}{C_0} \times 100\% \quad (1)$$
where $C_t$ is the actual capacity and $C_0$ is the nominal capacity. In other words, SOH is the ratio of the actual capacity to the nominal capacity of a battery [1]; when the SOH falls to 80%, the battery has reached its end of life [2].
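As a minimal illustration of Equation (1) and the 80% end-of-life threshold, the following Python sketch computes SOH from a measured capacity. The function names and the 24 Ah reading are hypothetical and are not taken from the experiments.

```python
def state_of_health(actual_capacity_ah: float, nominal_capacity_ah: float) -> float:
    """Return SOH in percent: actual capacity divided by nominal capacity."""
    if nominal_capacity_ah <= 0:
        raise ValueError("nominal capacity must be positive")
    return actual_capacity_ah / nominal_capacity_ah * 100.0


def reached_end_of_life(soh_percent: float, threshold_percent: float = 80.0) -> bool:
    """True once SOH has dropped to the end-of-life threshold (80% in the text)."""
    return soh_percent <= threshold_percent


# Hypothetical reading: a 32 Ah battery that currently delivers 24 Ah.
soh = state_of_health(24.0, 32.0)
print(soh, reached_end_of_life(soh))
```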
Shida Jiang et al. divided SOH estimation methods into four categories: direct measurement-based methods, model-based methods, data-driven methods, and hybrid methods [3]. Lei Zhen et al. combined the improved ampere-hour method and the internal resistance method: by accurately measuring the internal resistance of the battery, they quantitatively calculated the remaining capacity during charging and discharging and qualitatively analyzed the health status of the battery during float charging [4]. However, this method is limited by accumulated errors and sensor noise.
The model-based method builds a model from the physical mechanism of the battery. The model parameters are identified from directly measured data, and the relationship between the model parameters and SOH is established by training on part of the data. In reference [5], a second-order equivalent circuit model of a lead–acid battery was established, and its parameters were identified using the iterative recursive least squares method. However, the model is relatively simplified and does not consider more complex, practical working conditions, so it cannot describe the complex electrochemical reactions of the battery. Reference [6] establishes a second-order equivalent circuit model for battery corrosion and uses corrosion detection to simulate the aging of lead–acid batteries, thereby evaluating the quality of the batteries.
The data-driven method can adapt to changing system parameters, with good real-time performance and robustness. Its estimation principle is relatively simple: by analyzing a large amount of data, it can usually achieve very high accuracy even without analyzing the aging mechanism of the battery. Many studies have applied machine learning methods such as neural networks and long short-term memory (LSTM) networks to predict health status. Reference [7] uses advanced particle swarm optimization to optimize the parameters of the least squares support vector machine (LS-SVM) regression model, which improves the prediction accuracy. Reference [8] established a nonlinear autoregressive neural network model and combined it with principal component analysis to improve the accuracy of health state prediction. Reference [9] uses the Pearson correlation coefficient and neighborhood component analysis for feature selection, combines a convolutional neural network (CNN) and LSTM for training and prediction, and achieves good prediction accuracy on the NASA and Oxford datasets. In reference [10], the current and voltage of the battery are used as inputs of an artificial neural network (ANN) to estimate the open circuit voltage (OCV) of the battery, from which the state of charge (SOC) is calculated. Finally, the slope of the SOC and current is used as the input of a neural network to estimate the SOH of the battery, and the prediction accuracy is higher than that of other traditional SOC estimation methods.
The hybrid method integrates circuit-model and data-driven methods in an attempt to overcome both the dependence of the data-driven method on the amount of data and the estimation error caused by inaccurate parameter identification in the model-based method. Reference [11] uses off-line battery model parameters as the features of capacity degradation to train a grey neural network model, which is then used for on-line capacity estimation.
In general, the data-driven method can realize the online estimation of battery health more easily and quickly.
The bat algorithm is a metaheuristic swarm intelligence algorithm that has the advantages of fast convergence, few parameters, and a simple model. It has achieved rich results in intrusion detection, fault location, model recognition, and other fields [12]. Reference [13] shows through experimental comparison that the bat algorithm is more powerful than the particle swarm optimization (PSO) algorithm, the genetic algorithm, and the harmony search algorithm. Reference [14] used an improved bat algorithm to optimize the structure and weights of an artificial neural network and applied it to a real time series prediction problem. The method accurately predicted future rainfall data, showing that combining a swarm intelligence algorithm with a machine learning algorithm can yield a more accurate time series prediction model. Therefore, we propose a prediction model that combines the bat algorithm with LSTM to predict the health status of lead–acid batteries.
Our main contributions are as follows: (I) An LSTM model optimized by the bat algorithm for SOH estimation is proposed. (II) Only a few short-term charging curve segments need to be used to extract features and to achieve fast prediction of battery health state estimation. (III) The experimental data were smoothed first and then trained for prediction. The results show that the prediction effect is better after smoothing.
The rest of this paper is organized as follows: In Section 2, the features are extracted from the charging curve, namely the five equal-time voltage differences and the total constant current charging time during the constant current charging period, which are used as the inputs of the SOH estimation model. Section 3 describes the method proposed for SOH prediction, introduces the LSTM network optimized by the bat algorithm improved with Levy flight, and uses the proposed model to predict the health status of lead–acid batteries. The experimental results are verified and analyzed in Section 4, and the conclusions of this work are given in Section 5.
3. The LSTM Prediction Model Based on the Levy Bat Algorithm
The LevyBA-LSTM SOH estimation model is proposed, using the six features extracted in the previous section as inputs to the LSTM of the model. We also introduce an improved bat algorithm to optimize the parameters of the LSTM network, thereby better achieving the health status assessment of batteries. The following subsections provide a detailed introduction to the construction principles of the model.
3.1. LSTM Evaluation Model
The LSTM model is a specific form of recurrent neural network (RNN). It addresses the short-term memory problem of the RNN by adding gates to the recurrent unit [18,19], enabling the network to make effective use of long-range temporal information. LSTM defines a cell state $C_t$ as an internal memory unit that runs through the entire chain and updates the information in the cell state through three gate structures: the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$. The output of an LSTM unit at time $t$ is calculated from the input $x_t$, the previous state $h_{t-1}$, and $C_t$, as shown in Equations (4)–(9), written here in the standard LSTM form:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (4)$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \quad (5)$$
$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \quad (6)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (7)$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \quad (8)$$
$$h_t = o_t \odot \tanh(C_t) \quad (9)$$
where $W_f$, $W_i$, $W_c$, $W_o$ and $b_f$, $b_i$, $b_c$, $b_o$ are the corresponding weight matrices and bias terms, respectively; $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions, respectively; $\odot$ denotes element-wise multiplication; $\tilde{C}_t$ is the candidate update content; and $h_t$ is the output of the LSTM at the current time step.
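The gate computations can be made concrete with a short, dependency-free Python sketch of a single LSTM time step in the standard form that Equations (4)–(9) refer to. The helper names (`affine`, `lstm_step`, `init_params`), the random initialization, and the hidden size of 4 are illustrative assumptions; only the six input features per cycle come from the text.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: gates, candidate, cell state, hidden output."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]      # forget gate
    i_t = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]      # input gate
    c_hat = [math.tanh(a) for a in affine(params["Wc"], params["bc"], z)]  # candidate content
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]   # new cell state
    o_t = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]      # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                     # hidden output
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases for the four weight/bias pairs."""
    rng = random.Random(seed)
    params = {}
    for gate in "fico":
        params["W" + gate] = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + n_in)]
                              for _ in range(n_hidden)]
        params["b" + gate] = [0.0] * n_hidden
    return params


# Six input features per cycle (as in the text); four hidden units (arbitrary).
params = init_params(n_in=6, n_hidden=4)
h, c = [0.0] * 4, [0.0] * 4
for x in ([0.1] * 6, [0.2] * 6, [0.3] * 6):  # made-up feature vectors
    h, c = lstm_step(x, h, c, params)
print(h)
```

Because the hidden output is an output-gate value in (0, 1) multiplied by a tanh, every component of `h` lies strictly between -1 and 1, which is why a final dense layer is normally needed to map it to a capacity value.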
3.2. Bat Algorithm (BA) Based on Levy Flight
The LSTM model not only transfers memory better over long sequences but also mitigates the vanishing gradient problem in backpropagation. However, during prediction, its hyperparameters are difficult to tune and its convergence is slow. Therefore, the bat optimization algorithm is combined with the LSTM to form the BA-LSTM model, which optimizes the LSTM hyperparameters and reduces the impact of these two difficulties.
The BA searches for the global optimal solution by simulating the foraging behavior of bats. A bat locates its prey by echolocation and, according to its proximity to the prey, adjusts the loudness $A$, the pulse rate $r$, and the frequency $f$. Once a target is found, the pulse rate is increased and the loudness is decreased, and the velocity and position are updated to search for the global optimal solution.
Assuming that the foraging space is $d$-dimensional, in the global search the frequency $f_i$, velocity $v_i^t$, and position $x_i^t$ of bat $i$ at iteration $t$ are updated as in the standard BA:
$$f_i = f_{\min} + (f_{\max} - f_{\min})\beta \quad (10)$$
$$v_i^t = v_i^{t-1} + (x_i^{t-1} - x_*) f_i \quad (11)$$
$$x_i^t = x_i^{t-1} + v_i^t \quad (12)$$
where $\beta \in [0, 1]$ is a random vector drawn from a uniform distribution, $f_{\min}$ and $f_{\max}$ bound the frequency, and $x_*$ is the current global optimal solution among all bats.
For the local search, once a solution has been selected from the current best solutions, each bat generates a new solution by a random walk around it:
$$x_{\mathrm{new}} = x_{\mathrm{old}} + \varepsilon A^t \quad (13)$$
where $\varepsilon \in [-1, 1]$ is the scaling factor, $A^t$ is the average loudness of all bats at iteration $t$, and $x_{\mathrm{old}}$ is a solution randomly selected from the current best solutions.
Update rules for the loudness $A_i$ and the pulse rate $r_i$: once a bat finds its prey, the loudness of the pulse gradually decreases and the pulse rate gradually increases. They are adjusted according to Equations (14) and (15):
$$A_i^{t+1} = \alpha A_i^t \quad (14)$$
$$r_i^{t+1} = r_i^0 \left[1 - \exp(-\gamma t)\right] \quad (15)$$
where $r_i^0$ is the initial pulse rate, and $\alpha$ and $\gamma$ are constants, generally set to 0.9.
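The basic BA described above (frequency, velocity, and position updates, a local random walk, and the loudness and pulse-rate rules of Equations (14) and (15)) can be sketched in plain Python as follows. This is a minimal illustration on a toy sphere function, not the implementation used in the experiments; the frequency range, initial pulse rate, acceptance rule, and bounds handling are assumptions taken from the commonly published form of the algorithm.

```python
import math
import random


def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


def bat_algorithm(objective, dim, lo, hi, n_bats=10, n_iter=200,
                  f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    rng = random.Random(seed)

    def clip(value):
        return min(max(value, lo), hi)

    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_bats)]
    v = [[0.0] * dim for _ in range(n_bats)]
    loudness = [1.0] * n_bats   # A_i, assumed to start at 1
    rate0 = 0.5                 # assumed initial pulse rate r_i^0
    rate = [0.0] * n_bats       # current pulse rate r_i
    fit = [objective(p) for p in x]
    best_fit = min(fit)
    best = x[fit.index(best_fit)][:]

    for t in range(1, n_iter + 1):
        mean_loudness = sum(loudness) / n_bats
        for i in range(n_bats):
            # Global search: frequency, velocity, then position.
            freq = f_min + (f_max - f_min) * rng.random()
            v[i] = [vi + (xi - b) * freq for vi, xi, b in zip(v[i], x[i], best)]
            cand = [clip(xi + vi) for xi, vi in zip(x[i], v[i])]
            # Local search: random walk around the best solution.
            if rng.random() > rate[i]:
                cand = [clip(b + rng.uniform(-1.0, 1.0) * mean_loudness) for b in best]
            cand_fit = objective(cand)
            # Accept improvements with probability tied to the loudness.
            if cand_fit <= fit[i] and rng.random() < loudness[i]:
                x[i], fit[i] = cand, cand_fit
                loudness[i] *= alpha                             # Equation (14)
                rate[i] = rate0 * (1.0 - math.exp(-gamma * t))   # Equation (15)
            if cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


best, best_fit = bat_algorithm(sphere, dim=2, lo=-5.0, hi=5.0)
print(best, best_fit)
```

The global best is only replaced by strictly better candidates, so the returned fitness can never be worse than that of the best bat in the initial population.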
In the basic BA described above, the velocity inertia weight is fixed at 1 when a bat updates its velocity, so the velocity changes in only one way, which limits the flexibility of the bat's flight. To take the bat's own experience into account, an inertia weight factor $w$ is introduced to make the flight more flexible. In addition, to improve the performance of the algorithm, a random Levy flight [12] is introduced into the position update to help individual bats escape local optima.
Based on these ideas, the velocity update formula of the improved BA is:
$$v_i^t = w(t)\, v_i^{t-1} + (x_i^{t-1} - x_*) f_i \quad (16)$$
where the inertia weight $w(t)$ varies with the iteration $t$, $w_{\max}$ is the maximum value of $w(t)$, and $w_{\min}$ is the minimum value.
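The position update formula is as follows:
$$x_i^t = x_i^{t-1} + \mu \oplus \mathrm{Levy}(\lambda) \quad (17)$$
where $\mu$ is a uniformly distributed random parameter, the symbol $\oplus$ denotes point-wise multiplication, and $\mathrm{Levy}(\lambda)$ is a random step drawn from the Levy distribution.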
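To make the two modifications more tangible, the sketch below shows one common way to realize them in Python. Two things here are assumptions rather than details given in the text: the Levy steps are drawn with Mantegna's algorithm (a widely used approximation), and the inertia weight decreases linearly from an assumed `w_max` of 0.9 to an assumed `w_min` of 0.4.

```python
import math
import random


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed method)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def inertia_weight(t: int, t_max: int, w_min: float = 0.4, w_max: float = 0.9) -> float:
    """Linearly decreasing weight from w_max to w_min (assumed schedule)."""
    return w_max - (w_max - w_min) * t / t_max


def levy_position_update(x, rng: random.Random):
    """Perturb each coordinate by a uniform factor times a Levy step."""
    mu = rng.random()  # uniformly distributed random parameter
    return [xi + mu * levy_step(rng) for xi in x]


rng = random.Random(42)
print(inertia_weight(0, 10), inertia_weight(10, 10))
print(levy_position_update([0.5, 0.5, 0.5], rng))
```

Most Levy steps are small, but occasional large jumps occur, which is the property that helps a bat leave a local optimum.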
The Levy-flight bat algorithm optimizes the LSTM in the following steps:
Step 1. Parameter initialization: Determine the LSTM hyperparameters to be optimized, namely the number of hidden layer nodes, the number of training iterations, and the initial learning rate; initialize the parameters of the bat algorithm.
Step 2. Data preprocessing: Divide the dataset into training and testing sets, normalize them, and set the output dimensions. Assign the x, y, and z components of each bat's position vector to the three LSTM hyperparameters to be optimized. Calculate the fitness value of each position vector according to Formula (18), which measures the error between y′i, the value predicted by the LSTM model, and yi, the expected output value.
Step 3. Update the bat's velocity and position according to Formulas (16) and (17), retaining the position with the minimum fitness value.
Step 4. If the maximum number of iterations has not been reached, repeat Step 3; otherwise, output the current optimal parameters.
Step 5. Assign the optimized parameters to the LSTM network, then train and predict with these parameters.
The overall framework is shown in Figure 4.
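The link between a bat's position and the LSTM in Steps 1 to 5 can be sketched as follows. The search ranges in `BOUNDS` are hypothetical (the text does not state them), and `toy_score` is a stand-in for actually training an LSTM and returning its prediction error, so the example only illustrates how a 3-D position is decoded into hyperparameters and ranked by fitness.

```python
def decode_position(position, bounds):
    """Map a 3-D bat position onto (hidden units, training iterations, learning rate)."""
    clipped = [min(max(p, lo), hi) for p, (lo, hi) in zip(position, bounds)]
    hidden_units = int(round(clipped[0]))
    iterations = int(round(clipped[1]))
    learning_rate = clipped[2]
    return hidden_units, iterations, learning_rate


def fitness(position, bounds, train_and_score):
    """Fitness of a bat: the prediction error obtained with its hyperparameters."""
    return train_and_score(*decode_position(position, bounds))


# Hypothetical search ranges; not taken from the source.
BOUNDS = [(10, 300), (10, 100), (1e-4, 1e-1)]


def toy_score(hidden_units, iterations, learning_rate):
    """Stand-in for LSTM training: a bowl with an arbitrary minimum at (120, 40, 0.01)."""
    return (((hidden_units - 120) / 100) ** 2
            + ((iterations - 40) / 50) ** 2
            + (learning_rate - 0.01) ** 2)


candidates = [[50.2, 80.0, 0.05], [119.6, 40.3, 0.01], [500.0, 5.0, 0.2]]
best = min(candidates, key=lambda p: fitness(p, BOUNDS, toy_score))
print(decode_position(best, BOUNDS))
```

Rounding the first two coordinates keeps the search space continuous for the bat algorithm while still handing integer layer sizes and iteration counts to the network.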
4. Experimental Analysis
4.1. Data Description
In this paper, the common charging method is adopted, that is, constant current and then constant voltage, namely CCCV. This charging method is relatively “mild”, which can weaken the impact on the battery and ensure high charging efficiency. The cycle charging and discharge experiments are used to obtain the capacity attenuation curve of the lead–acid battery, while the discharge experiment uses constant current discharge.
The battery pack used in this study consists of five single cells connected in series, each with a nominal voltage of 12 V and a nominal capacity of 32 Ah. For a single cell, the charging current is 6 A, the constant-voltage charging voltage is 14.8 V, the discharge current is 10 A, and the discharge cut-off voltage is 10.5 V. For the pack as a whole, the charging and discharging settings are: charging current 7 A, discharge current 10 A, charging cut-off voltage 74 V, discharge cut-off voltage 52.5 V, and charging time 5 h. The test equipment comprises a Xinkehua Capacity & Lifespan Tester (XT05), a 60 V–7 A charger, and a Dekang Battery Charging and Discharging Repair Integrated Tester (SF100-5); the ambient temperature is 26 °C.
The experimental steps are: (1) charge at a constant current of 7 A; (2) when the terminal voltage reaches 14.8 V, switch to constant-voltage charging, and end the charging phase when the charging time reaches 5 h; (3) discharge at a constant current of 10 A while recording the discharge current and time, and calculate the actual capacity by the ampere-hour method; (4) end the charge and discharge cycle when the terminal voltage reaches the cut-off voltage of 10.5 V; (5) repeat steps (1) to (4) 47 times. The 10,950 sets of data collected during these cycles were then analyzed. They include the total voltage, total current, individual battery voltage, charging capacity, and discharge capacity, recorded every two minutes.
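The ampere-hour calculation in step (3) amounts to integrating the discharge current over time. A minimal Python sketch, assuming current samples logged at the two-minute interval mentioned above and using a made-up three-hour discharge, is:

```python
def ampere_hour_capacity(currents_a, sample_interval_s):
    """Integrate the discharge current over time (rectangle rule) and return Ah."""
    return sum(currents_a) * sample_interval_s / 3600.0


# Hypothetical log: a constant 10 A discharge sampled every 120 s for 3 hours.
samples = [10.0] * 90
capacity_ah = ampere_hour_capacity(samples, 120)
print(capacity_ah)  # 30.0
```

With a truly constant current the result reduces to current times duration; the summation form is shown because logged currents normally fluctuate slightly.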
4.2. Performance Indicators
To comprehensively analyze the effectiveness of the selected method, this article uses four indicators to evaluate model performance: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination ($R^2$). MAE is the average magnitude of the absolute error between predicted and observed values, so it grows as the errors grow. The smaller the RMSE and MAPE, the higher the prediction accuracy of the model. $R^2$ is a comprehensive indicator of how well the model output explains the observed data; the closer $R^2$ is to 1, the better the fit. In their standard form, the calculation formulas are:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(SoH_i - SoH_i'\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|SoH_i - SoH_i'\right|$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{SoH_i - SoH_i'}{SoH_i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(SoH_i - SoH_i'\right)^2}{\sum_{i=1}^{n}\left(SoH_i - \overline{SoH}\right)^2}$$
where $SoH_i$ is the actual measured value, $SoH_i'$ is the model evaluation value, $\overline{SoH}$ is the mean of the measured values, and $n$ is the number of samples.
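The four indicators can be computed with a few lines of Python. The sketch below uses their standard definitions; the capacity values are made up purely to exercise the functions and are not experimental results.

```python
import math


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up capacities in Ah.
actual = [30.0, 29.0, 28.0, 27.0]
predicted = [29.5, 29.5, 28.0, 26.0]
print(rmse(actual, predicted), mae(actual, predicted),
      mape(actual, predicted), r2(actual, predicted))
```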
4.3. Data Preprocessing
Data outlier handling: If the outliers in the data deviate significantly from the remaining observed values of the sample, they need to be removed from the sample.
Data normalization processing: To reduce prediction errors, the parameters must be brought to a common scale before model training by normalizing the data to [0, 1]. The equation is:
$$x_n = \frac{x_m - \min(x)}{\max(x) - \min(x)}$$
where $x_m$ is the raw data, $x_n$ is the normalized data, and $\max(x)$ and $\min(x)$ are the maximum and minimum values of the variable $x$, respectively.
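A minimal Python sketch of this min-max scaling is given below; the function name, the handling of a constant series, and the sample voltages are illustrative assumptions.

```python
def min_max_normalize(values):
    """Scale a sequence linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map everything to 0 (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up terminal voltages in volts.
print(min_max_normalize([10.5, 12.0, 13.5]))  # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum are usually taken from the training set only and then reused for the test set, so that no information from the test cycles leaks into training.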
4.4. Model Training
The population size N of the bat optimization algorithm was set to 10, the maximum number of iterations to 10, and the individual dimension D to 3. The initial parameters of the LSTM model were: one hidden layer, 200 hidden units, 20 training iterations, and an initial learning rate of 0.005.
The normalized six-feature data are used as the input to the LevyBA-LSTM model, with one six-feature input vector for each cycle i, where i is the cycle index. The model output is the current actual capacity of the battery, Ct. The current SOH is calculated from Equation (1), and the performance indices are calculated from Equations (19) to (21). The first 30 cycles of the dataset were used as the training set, and the remaining 17 cycles were used as the test set.
4.5. Model Validation
In this paper, the experimental validation was performed in MATLAB R2019b. To further illustrate the predictive performance of the LevyBA-LSTM algorithm, it was compared with the basic LSTM algorithm and the PSO-LSTM algorithm. For a reliable assessment of the error, each model was trained 10 consecutive times, and the average of the 10 performance scores was taken as the final score.
Figure 5 compares the maximum available capacity predicted by the proposed model with that predicted by the PSO-LSTM model and the basic LSTM model. The predicted curves follow the true capacity trend over the cycles, which suggests that the architecture is robust. Specific numerical values are presented in Table 2.
As shown in Table 2, the PSO-LSTM model performs best on RMSE and $R^2$, while the LevyBA-LSTM model performs best on MAE and MAPE. The MAE of the LevyBA-LSTM model is 0.377 Ah, which corresponds to an SOH MAE of about 1.2% for the 32 Ah nominal capacity; this accuracy is acceptable in practice.
To reduce measurement noise, the experimental data are smoothed before training and prediction. As before, the first 30 cycles of the dataset are used as the training set, and the remaining 17 cycles are used as the test set. Figure 6 compares the data before and after smoothing. The prediction errors are shown in Figure 7, and the specific values are given in Table 3.
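The text does not state which smoothing method was used. Purely as an illustration of this kind of preprocessing, the sketch below applies a centered moving average, which is an assumed choice; the window length and the capacity values are likewise hypothetical.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Made-up noisy capacity readings in Ah.
noisy = [31.0, 30.2, 30.9, 29.8, 30.1, 29.0, 29.6]
print(moving_average(noisy, window=3))
```

A wider window removes more noise but also flattens genuine changes in the capacity fade curve, so the window length is a trade-off.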
As shown in Table 3, after the target data are smoothed, the proposed LevyBA-LSTM model obtains the smallest RMSE, MAE, and MAPE and the highest $R^2$ among the three models. Its MAE is 0.216 Ah, which corresponds to an SOH MAE of 0.68%, about 0.5 percentage points lower than without data smoothing. In summary, with smoothing preprocessing, the proposed LevyBA-LSTM model exhibits satisfactory performance for SOH estimation on the target dataset.
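The conversion from a capacity error in Ah to an SOH error in percent follows directly from Equation (1) with the nominal capacity of 32 Ah. As a worked check of the two reported values:

```latex
\begin{aligned}
\text{without smoothing:}\quad & \frac{0.377\ \text{Ah}}{32\ \text{Ah}} \times 100\% \approx 1.18\% \approx 1.2\% \\
\text{with smoothing:}\quad & \frac{0.216\ \text{Ah}}{32\ \text{Ah}} \times 100\% = 0.675\% \approx 0.68\% \\
\text{difference:}\quad & 1.18\% - 0.68\% \approx 0.5\ \text{percentage points}
\end{aligned}
```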
5. Conclusions
This paper addresses SOH estimation for lead–acid batteries. Features that reflect the decline of battery capacity are extracted from the charging curve, and an LSTM life evaluation model for lead–acid batteries, optimized by the bat algorithm, is established. The accuracy of the model is improved through repeated testing, training, and optimization. In addition, the basic LSTM and PSO-LSTM models are constructed for comparison and verification. The experimental results show that the proposed model has better adaptability and prediction accuracy, especially when the experimental data are smoothed before training and prediction.
Our future work will focus on expanding the training set to cover more battery types and operating conditions, so that the model can provide reliable SOH predictions under different working conditions. In addition, the prediction method studied in this paper is mainly aimed at batteries under stable working conditions, and further study of health state prediction under loads with relatively large fluctuations is needed.