1. Introduction
The use of lithium-ion batteries has increased rapidly in recent years due to their advantages over other batteries, such as their low cost, high energy density, low self-discharge rate, and long life [1]. Consequently, these batteries have found extensive applications across diverse sectors such as mobile communications, transportation, electric power storage, electric vehicles, storage of new energy sources, and aerospace [2]. However, it is crucial to accurately monitor and predict capacity, as incorrect capacity prediction, over-charging, or over-discharging can cause permanent damage to the battery [3]. Given the widespread use of lithium-ion batteries, great emphasis must be placed on their safe operation. Capacity and State of Health (SOH) are regarded as critical parameters for assessing the current status and performance of lithium-ion batteries [4].
A number of studies have been conducted to accurately predict battery health. This research can generally be categorized into two main groups: model-based methods and data-based methods [5,6].
Model-based methods estimate the SOH of a battery by modeling the battery and considering the internal degradation process [7]. The fundamental principle of these models is to analyze how the reactions within a lithium-ion battery affect its performance. They consider the effects of internal and external battery state variables on performance and then build a cell degradation model [8]. Although some progress has been made in model-based predictions, the complex chemical reactions inside lithium-ion batteries make it difficult to build an accurate aging model [9]. Furthermore, the condition of lithium-ion batteries is highly dependent on factors such as operating temperature, anode materials, and cathode materials [10]. Therefore, there is no general and accurate battery degradation model for determining the key parameters of battery life for accurate capacity prediction [11].
Data-driven methods have recently attracted more attention in capacity estimation due to their flexibility and their ability to capture nonlinear behavior. These methods use statistical or machine learning models to predict the capacity or health status of a battery.
Adaptive SOH estimation methods based on Feed-Forward Neural Networks (FNNs) with online AC complex impedance measurements [12], as well as approaches based on simple Recurrent Neural Networks (RNNs) operating on dynamic data [13], have been proposed to estimate the SOH of lithium-ion batteries. Support Vector Machines (SVMs) [14,15], Gaussian Process Regression (GPR) [16,17], Autoregressive Integrated Moving Average (ARIMA) models [18], Extreme Learning Machines (ELMs) [19,20], Long Short-Term Memory (LSTM) networks [11,21], Gated Recurrent Units (GRUs) [22,23], Savitzky–Golay filters combined with gated recurrent units (SG-GRU) [24], and Convolutional Neural Networks (CNNs) [25], all of which make predictions from current and voltage measurement data, have been widely used for lithium-ion battery capacity prediction due to their nonlinear mapping and self-learning capabilities. Data-driven methods can predict the SOH of a battery without requiring electrochemical information about its internal structure or aging mechanisms. They can therefore be applied without knowledge of a battery's electrochemical properties or environmental factors.
In previous studies, various models have been proposed to enhance prediction accuracy, with parameters typically chosen by trial and error or set to standard values. While these approaches have yielded satisfactory results, they often fall short of optimal performance because the parameters are not optimized systematically. Parameter optimization is crucial because it directly affects a model's ability to accurately predict battery health and lifespan, and traditional selection methods are time-consuming and may not exploit the full potential of the model architecture. To address this, a genetic algorithm (GA) is proposed for parameter optimization. Genetic algorithms are powerful search heuristics that mimic the process of natural selection, effectively exploring a large parameter space to identify the most suitable configurations for the models. In this study, the parameters of the CNN, RNN, and BP algorithms, which are among the models used in prior studies, were optimized using a genetic algorithm, resulting in improved prediction accuracy. Consequently, the following contributions were obtained.
Prediction accuracy is enhanced by establishing the parameters of the algorithms via the genetic algorithm, using optimal rather than random or standard values.
A total of four features are extracted from the charging and discharging phases. Using a relatively small number of features compared to the literature reduces the processing load and potential sources of error, and the high performance achieved despite this reduction underscores the value of this feature set.
Capacity prediction is conducted to estimate the capacity of the battery at each specific cycle. Consequently, these data provide information on both battery health and remaining battery life.
Compared to the literature, accurate results are produced with fewer data, validating the usefulness and correctness of the approach.
The remainder of the paper is structured as follows. Section 2 outlines the general flow of the investigation, including the methodology, data information, and feature extraction. The algorithms utilized in the research are presented in Section 3. The results are provided in Section 4. Conclusions and future work are addressed in Section 5.
2. Overview and Dataset Information
The general diagram of the proposed work is shown in
Figure 1. In summary, the flow comprises five steps: feature extraction, data processing stage, parameter optimization, capacity prediction, and performance evaluation.
Step 1: In the first step of the study, four different battery characteristics are extracted from the dataset obtained through experiments conducted by NASA in a laboratory setting. Subsequently, a Pearson correlation analysis is performed to determine the linear relationship between these features and battery capacity.
Step 2: In this step, the features and battery capacity slated for use undergo normalization within the range of 0–1. This normalization process is performed using the range method. After normalization, the relevant data for the three different training sets are separated into training and test datasets.
Step 3: In this step, the genetic algorithm is used to optimize the relevant parameters of the CNN, BP, and RNN algorithms for capacity prediction. During this process, the genetic algorithm selects the parameters that lead to the lowest Root Mean Square Error (RMSE) values. The appropriate parameters are determined separately for each different training set, resulting in optimal parameter sets being identified.
Step 4: Using the training data determined in the second step and the optimal parameters identified in the third step, capacity prediction is conducted using three distinct algorithms: Convolutional Neural Network (CNN), Backpropagation (BP), and Recurrent Neural Network (RNN) algorithms. For each algorithm, five separate capacity predictions are generated by training different networks five times each. Subsequently, these five predictions are averaged. The resulting average predictions undergo denormalization in preparation for the performance analysis.
Step 5: Five different indicators are used to check and evaluate the accuracy and reliability of the predictions.
2.1. Dataset Information
The lithium-ion battery accelerated life testing platform primarily consists of a programmable electronic load, a programmable DC power supply, a thermostat, various sensors, a data acquisition unit, and an electrochemical impedance spectroscopy tester [26]. The four batteries commonly used for SOH (State of Health) estimation are labeled B0005, B0006, B0007, and B0018 [10,27,28]. The test procedure for a lithium-ion battery is outlined as follows: First, in the charging experiment program conducted at an ambient temperature of 24 °C, the battery is charged at a constant current of 1.5 A. Once the terminal voltage reaches 4.2 V, the battery is switched to constant voltage charging mode until the charging current drops to 20 mA, indicating that the charging is complete. Finally, in the discharge experiment conducted at 24 °C ambient temperature, batteries B0005, B0006, B0007, and B0018 are discharged with a constant current of 2 A until the terminal voltages drop to 2.7, 2.5, 2.2, and 2.5 V, respectively.
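The charge/discharge test logic just described can be sketched in code. The thresholds (1.5 A, 4.2 V, 20 mA, per-battery discharge cutoffs) come from the text above; the function names and the simple mode logic are illustrative simplifications, not NASA's actual test-rig implementation.

```python
# Sketch of the CC-CV charge and constant-current discharge procedure.
# Threshold values follow the NASA test description; the control logic
# itself is a hypothetical simplification.

def cc_cv_charge(voltage, current=1.5, v_limit=4.2, i_cutoff=0.020):
    """Return the charging mode implied by the current measurement."""
    if voltage < v_limit:
        return "CC"            # constant current at 1.5 A
    elif current > i_cutoff:   # taper until current falls to 20 mA
        return "CV"
    return "DONE"              # charging complete

def discharge_done(battery_id, voltage):
    """Per-battery discharge cutoff voltages from the test procedure."""
    cutoffs = {"B0005": 2.7, "B0006": 2.5, "B0007": 2.2, "B0018": 2.5}
    return voltage <= cutoffs[battery_id]
```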
Figure 2 shows the 15th charge and discharge cycle of the B0005 battery obtained from the NASA dataset. The charging procedure method used during the acquisition of the dataset refers to the procedure applied through the charger to fully charge the lithium-ion battery. In this procedure, constant current (CC) and constant voltage (CV) are basically used for the lithium-ion battery. The initial stage of charging involves the application of constant current (CC), followed by the second stage where constant voltage (CV) is applied. The battery reaches full charge through this progressive charging process. The voltage and electrical characteristics of the aforementioned charging methods, as well as the temperature of the operating environment, are important factors that can affect the lifetime of a lithium-ion battery.
Figure 3 shows how the capacity of the four batteries decreases over the charge and discharge cycles. The point at which the capacity of a lithium battery drops to 70% of its rated value is considered battery failure [28]. Therefore, for a 2 Ah battery to be classified as dead, its capacity must drop to 1.4 Ah. In the chart, the data for battery B0007 indicate that its lifetime is not yet complete. The lifetimes of B0005, B0006, and B0018 are 124, 108, and 96 cycles, respectively. Each battery dataset contains three fields of information: charge, discharge, and impedance.
2.2. Feature Extraction and Selection
Voltage and current measurements during the charging and discharging cycles of the batteries were used to predict the battery capacity.
During Charging: The first stage of charging is constant current (CC), followed by the second stage of constant voltage (CV). The duration of the CC and CV stages varies depending on the capacity and number of cycles of the battery. Therefore, based on the current measured during charging, the time during which the current remains constant is determined as the first characteristic. In the current–time graph during charging, shown for the B0005 battery in
Figure 4a, the duration of constant current is arranged from smallest to largest for the 75th, 50th, and 25th cycles. It is predicted that this duration of constant current will decrease as the number of cycles increases. The duration of constant voltage is also determined as the second feature. As shown in
Figure 4b, for the B0005 battery during charging, the duration of constant voltage was arranged from smallest to largest for the 25th, 50th, and 75th cycles. With an increase in the number of cycles, it is expected that the duration of constant voltage will also increase.
During Discharging: Two important features were derived from the battery discharge measurements conducted by NASA during the discharge process. The average voltage measured during a cycle varies depending on the number of cycles. As clearly seen in
Figure 5, the average voltage decreases with an increase in the number of cycles. Therefore, the average voltage was determined as the third characteristic. Additionally, the point at which the lowest level is reached in the measured voltages varies depending on the number of cycles and, consequently, the battery’s capacity. It is observed that the lowest voltage is reached at the 75th, 50th, and 25th cycles, respectively. As the number of cycles increases, the duration of this lowest voltage moment is predicted to decrease. The moment of the lowest measured voltage was identified as the fourth determining feature.
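The four features described above can be computed directly from sampled current and voltage. The sketch below assumes uniformly sampled signals, and the tolerance used to detect the constant-current and constant-voltage phases is an illustrative placeholder, not a value from the paper.

```python
import numpy as np

def charge_features(t, current, voltage, i_cc=1.5, v_cv=4.2, tol=0.05):
    """HF1: constant-current duration; HF2: constant-voltage duration.
    `tol` is an assumed detection tolerance; sampling is assumed uniform."""
    dt = t[1] - t[0]
    hf1 = dt * np.sum(np.abs(current - i_cc) < tol)  # time spent near 1.5 A
    hf2 = dt * np.sum(np.abs(voltage - v_cv) < tol)  # time spent near 4.2 V
    return hf1, hf2

def discharge_features(t, voltage):
    """HF3: mean discharge voltage; HF4: time of the minimum voltage."""
    hf3 = voltage.mean()
    hf4 = t[np.argmin(voltage)]
    return hf3, hf4
```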
Since the other battery characteristics and voltage–current charts exhibit similar trends, it is considered sufficient to present only the feature extraction from the B0005 battery. These four features were extracted from all four different batteries in the NASA dataset. The direction and strength of the linear relationship between each extracted feature and the battery capacity were determined using the Pearson correlation coefficient. The correlation coefficients of the extracted features with the target battery capacities are shown in
Table 1. Here, HFs represent feature numbers: For the features during charging, HF1 is constant current time and HF2 is constant voltage time. For the features during discharge, HF3 is the average voltage and HF4 is the moment when the measured voltage is lowest. It is observed that the four features extracted from batteries B0005 and B0006 have high correlation coefficients. Therefore, these four features were used for capacity prediction in these batteries. However, the correlation coefficient of HF2 in B0007 and HF1 in B0018 is low. Therefore, HF1, HF3, and HF4 for battery B0007 and HF2, HF3, and HF4 for battery B0018 were used for capacity prediction.
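The correlation-based screening can be sketched as follows. The 0.9 threshold is a hypothetical placeholder: the paper selects features by inspecting the coefficients reported in Table 1 rather than by a fixed cutoff.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between a feature and capacity."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float(np.sum(xm * ym) / np.sqrt(np.sum(xm**2) * np.sum(ym**2)))

def select_features(features, capacity, threshold=0.9):
    """Keep features strongly correlated with capacity.
    `threshold` is illustrative, not from the paper."""
    return {name: r for name, col in features.items()
            if abs(r := pearson_r(col, capacity)) >= threshold}
```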
2.3. Normalization
In this paper, range-type normalization was applied to the capacities and features. All data are normalized between 0 and 1. The equation for range normalization is shown in Equation (1):

x′_i = (x_i − x_min) / (x_max − x_min),   (1)

where x′_i is the i-th value in the normalized series, x_i is the i-th value in the original series, x_min is the smallest value in the series, and x_max is the largest value in the series.
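Range normalization and its inverse (used for denormalization before the performance analysis in Step 4) can be implemented directly as a minimal sketch:

```python
import numpy as np

def range_normalize(x):
    """Equation (1): min-max (range) normalization onto [0, 1]."""
    x = np.asarray(x, float)
    return (x - x.min()) / (x.max() - x.min())

def denormalize(x_norm, x_min, x_max):
    """Inverse mapping back to the original scale."""
    return x_norm * (x_max - x_min) + x_min
```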
3. Related Algorithms
Genetic algorithm, CNN, BP, and RNN algorithms were used in this paper. CNN was utilized because of its performance capabilities reported in the literature. BP was used since it represents artificial neural networks in their most general form, and the RNN method was used because of its widespread usage in time series analysis. The genetic algorithm was employed for the parameter optimization of these algorithms, as it is effective in complex, multidimensional, and general searches or optimization problems.
3.1. Genetic Algorithm
The genetic algorithm is a computational method for solving optimization and search problems, inspired by natural selection and genetic processes. It is part of a sub-discipline called evolutionary computing.
The genetic algorithm is based on the principle of natural selection. This principle suggests that better-matched individuals within a population, i.e., individuals who can solve the problem more effectively, are more likely to contribute more to the next generation. The genetic algorithm mimics this process of natural selection and aims to advance a solution space by allowing individuals within a population to interact with each other through various genetic operators, such as crossover, mutation, and selection.
The genetic algorithm is particularly effective for complex, multidimensional, general search, or optimization problems. For instance, it can be applied to address challenges such as the traveling salesman problem, the optimization of machine learning models, or route planning. The working process of the genetic algorithm consists of the following steps:
Initial Population Generation: Generating the first-generation individuals using random or heuristic methods.
Fitness Assessment: Calculating the fitness value of each individual, i.e., measuring how good the solution is.
Selection: Selecting individuals based on their fitness values. Individuals with higher fitness have a greater probability of being selected.
Crossover: The process of crossing over, which allows the exchange of genetic material between selected individuals, thus creating new individuals.
Mutation: Introducing random genetic changes to newly created individuals.
Creating New Population: The individuals resulting from crossover and mutation constitute the next generation population.
Controlling the Termination Condition: Checking whether certain termination conditions have been met (e.g., reaching the maximum number of iterations or achieving a desired fitness level).
Evaluation of Results: Analyzing the obtained results and assessing the proximity to the desired solution.
The genetic algorithm functions used in this study are as follows:
Single Crossover: This is the simplest form of the crossover operator. In this method, the genetic material of two selected parental individuals is cut at a specific point and then the pieces are swapped. This process results in the creation of two offspring individuals.
Figure 6 shows an example of a single crossover.
Uniform Mutation: A mutation operator that applies random changes to the genetic material of an individual. This operator ensures that each gene is replaced with random values with a certain probability. Uniform mutation helps the search process cover a wider area by increasing genetic diversity. In this way, the population of the genetic algorithm enhances its ability to discover and optimize potential solutions.
Before Mutation: 01011011 After Mutation: 01111011
Roulette Selection Function: This function, used as a selection operator, allows the selection of individuals based on their fitness values. In this method, the probability of selecting each individual is determined by the ratio of its fitness value to the total fitness value. Thus, more suitable individuals are more likely to be selected. Consequently, fitter individuals are more likely to be passed on to the next generation. An example schematic of the roulette selection function is shown in
Figure 7.
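The three operators used in this study can be sketched as follows. The integer genome encoding and the mutation probability are illustrative assumptions; the paper's actual GA settings appear in Table 3.

```python
import random

def single_point_crossover(p1, p2):
    """Cut both parents at one random point and swap the tails (Figure 6)."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform_mutation(genome, low, high, p=0.1):
    """Replace each gene with a random value in [low, high] with probability p."""
    return [random.randint(low, high) if random.random() < p else g
            for g in genome]

def roulette_select(population, fitness):
    """Pick an individual with probability proportional to its fitness."""
    total = sum(fitness)
    pick = random.uniform(0, total)
    acc = 0.0
    for ind, f in zip(population, fitness):
        acc += f
        if acc >= pick:
            return ind
    return population[-1]
```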
In this paper, a genetic algorithm was used to determine the parameters of CNN, BP, and RNN for each battery separately with 40, 70, and 100 training data. In previous studies using the NASA dataset, predictions typically utilized 100 training data, which constitutes 60% of the data [29,30,31,32]. Therefore, this study started by making predictions using 100 training data. Additionally, it was observed that 76 training data were used in [33]. Subsequently, to evaluate the method's performance, predictions were also made with a reduced number of training data; the number of training data was therefore set to 40 and 70 for further predictions. The filternumber and filtersize parameters were determined for CNN, while the hiddenlayersize parameter was determined for BP and RNN.
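One possible way to decode a GA individual into the tuned hyperparameters is sketched below. The filternumber bound of 2000 follows the range discussed in Section 4; the filtersize and hiddenlayersize bounds are assumptions for illustration only.

```python
def decode_cnn(genome):
    """Map a two-gene individual to CNN hyperparameters.
    filternumber is bounded to 1-2000 per Section 4; the filtersize
    bound of 10 is an assumed illustration."""
    return {"filternumber": 1 + genome[0] % 2000,
            "filtersize": 1 + genome[1] % 10}

def decode_rnn_bp(genome, max_hidden=100):
    """Single-gene decoding for the BP/RNN hiddenlayersize;
    the upper bound here is a placeholder, not the paper's range."""
    return {"hiddenlayersize": 1 + genome[0] % max_hidden}
```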
3.2. CNN
A Convolutional Neural Network (CNN), traditionally used for image processing to extract features from two-dimensional data, is applied in this paper to predict battery capacity from one-dimensional time series data.
Figure 8 shows the architecture of the 1D-CNN used for the time series prediction model.
This architecture consists of an input layer, a convolutional layer, a flattened layer, a fully connected layer, and an output layer. Input features are fed into the convolutional layer, where a filter is applied to each input feature to produce a feature map. The activation function is then applied to the results. The output from the convolution layer is passed to a flattened layer to convert it into a one-dimensional array. The output from the flattened layer is fed to the fully connected layer, where weights are applied for data processing. The output of the fully connected layer is fed to the output layer. In this study, no activation function is applied in the convolution layer. In summary, the layers applied when using the CNN algorithm are as follows: 1. the sequence input layer, 2. the CNN layer, 3. the fully connected layer, and 4. the regression layer. The training options for the CNN network are shown in detail in
Table 2.
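A minimal forward pass of this architecture (convolution with no activation, flatten, fully connected output) can be written in plain NumPy. This is a sketch of the layer sequence described above, not the framework implementation actually used in the study.

```python
import numpy as np

def cnn_forward(x, filters, W_fc, b_fc):
    """1D-CNN forward pass: convolution (no activation, as in the paper),
    flatten, then a fully connected regression output.
    Shapes: x (n_features,), filters (n_filters, filtersize),
    W_fc (n_filters * (n_features - filtersize + 1),)."""
    k = filters.shape[1]
    # 'valid' 1-D convolution of each filter over the feature vector
    conv = np.array([[np.dot(x[i:i + k], f)
                      for i in range(len(x) - k + 1)]
                     for f in filters])
    flat = conv.ravel()                 # flatten layer
    return float(flat @ W_fc + b_fc)    # fully connected -> regression
```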
3.3. BP
The BP neural network is one of the most representative artificial neural network algorithms [
34]. The topology of a BP neural network model includes an input layer, hidden layers, and an output layer.
Figure 9 shows the architecture of a basic BP network. The fundamental concept of the BP neural network involves constructing a multilayer feed-forward neural network. During the propagation process, it adjusts the connection weights and thresholds between layers and nodes based on the error between the actual and expected output values, using the backpropagation algorithm. This algorithm does not require a predetermined mathematical mapping between input and output but learns rules and generates mathematical simulations of signals through error feedback. The learning process of the BP neural network is divided into two stages: the forward propagation process of the input signal and the BP process of the prediction error [
35].
Let the number of neurons in the output layer be m, the output of the BP neural network be y_k, and the expected output be ŷ_k. The error of the model E is calculated as follows [36]:

E = (1/2) Σ_k (ŷ_k − y_k)²,   (2)

The modification value of each weight is given by

Δw_ij = −η ∂E/∂w_ij,   (3)

where w_ij is the weight from input layer node i to hidden layer node j, η is the learning rate, and f_j is the transfer function of the j-th hidden layer.
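For a single linear output layer, the error and weight-update rules above reduce to the following sketch. Hidden layers and transfer functions are omitted for brevity, so this illustrates the gradient step only, not the full multilayer BP network.

```python
import numpy as np

def bp_update(W, x, y_true, eta=0.1):
    """One gradient-descent step for a linear layer y = W x:
    computes the squared-error loss and the weight modification."""
    y = W @ x                              # forward pass
    err = 0.5 * np.sum((y_true - y) ** 2)  # squared-error loss
    grad = np.outer(y - y_true, x)         # dE/dW for the linear layer
    return W - eta * grad, err             # apply delta_w = -eta * dE/dW
```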
In this study, the Levenberg–Marquardt training function was selected for the BP neural network. This backpropagation training function incorporates Jacobian derivatives. Despite its fast execution, it is important to note that the Levenberg–Marquardt training function is not supported on GPU hardware.
3.4. RNN
The RNN is a type of neural network where the output from the previous step serves as input for the current step, creating dependencies among all inputs.
Figure 10 illustrates the architecture of an RNN, where x_t represents the input at time t, h_t is the hidden layer output, and y_t is the output. W_h is the weight between the previous hidden layer and the hidden layer at time t, W_x is the weight between the input and the hidden layer, and W_y is the weight between the hidden layer and the output. Equations (4) and (5) give the formulas for the hidden state and the RNN output, respectively:

h_t = f(W_x x_t + W_h h_{t−1}),   (4)
y_t = g(W_y h_t),   (5)

where f and g are the activation functions of the hidden and output layers.
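The recurrence described above can be written as a single NumPy step. The tanh hidden activation, identity output, and optional bias terms are common conventions assumed here; the paper does not specify them.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, W_y, b_h=0.0, b_y=0.0):
    """One RNN time step: hidden state from the current input and the
    previous hidden state, then the output from the hidden state."""
    h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b_h)  # hidden state update
    y_t = W_y @ h_t + b_y                          # output at time t
    return h_t, y_t
```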
3.5. Evaluation Parameters
In this paper, five different metrics, namely Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Normalized Root Mean Squared Error (NRMSE), Mean Absolute Error (MAE), and Squared Correlation (R2), will be used to measure the similarity between the predictions generated by the proposed system and the actual values. These metrics serve to evaluate the system's performance from various perspectives. The calculations for these metrics are conducted using the formulas provided in Equations (6)–(10):

MSE = (1/N) Σ_i (y_i − ŷ_i)²,   (6)
RMSE = sqrt( (1/N) Σ_i (y_i − ŷ_i)² ),   (7)
NRMSE = RMSE / (y_max − y_min),   (8)
MAE = (1/N) Σ_i |y_i − ŷ_i|,   (9)
R2 = [ Σ_i (y_i − μ_y)(ŷ_i − μ_ŷ) ]² / [ Σ_i (y_i − μ_y)² · Σ_i (ŷ_i − μ_ŷ)² ],   (10)

where y_i is the true value, ŷ_i is the predicted value, N is the number of cycles, y_max is the largest true value, y_min is the smallest true value, μ_y is the mean of the true values, and μ_ŷ is the mean of the predicted values.
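The five metrics can be computed in a few lines of NumPy. R2 is implemented here as the squared Pearson correlation, matching the "Squared Correlation" naming used in this paper.

```python
import numpy as np

def metrics(y_true, y_pred):
    """MSE, RMSE, NRMSE, MAE, and squared correlation (R2)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    e = y_true - y_pred
    mse = np.mean(e ** 2)
    rmse = np.sqrt(mse)
    nrmse = rmse / (y_true.max() - y_true.min())  # normalized by true range
    mae = np.mean(np.abs(e))
    r2 = np.corrcoef(y_true, y_pred)[0, 1] ** 2   # squared correlation
    return {"MSE": mse, "RMSE": rmse, "NRMSE": nrmse, "MAE": mae, "R2": r2}
```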
4. Results and Discussion
After normalizing the extracted features and capacities and partitioning them into training and testing sets, the first step involves determining the optimal parameters. Therefore, the relevant parameters of the RNN, CNN, and BP algorithms were optimized using GA to achieve the lowest RMSE value. The GA options used during optimization are summarized in
Table 3 based on the respective algorithms. Except for the range values, the other parameter values were kept the same for all algorithms. The main reason for this decision is that, for BP and RNN, high hiddenlayersize values lead to an increase in RMSE and interrupt the program. In the BP and RNN algorithms, a large hidden layer size can cause the models to overfit and reduce their generalization ability; it also lengthens the training and prediction processes and increases the memory requirement. As a result, it is appropriate to keep this parameter within this range, since a high number of hidden neurons leads to greater complexity and lower performance over a longer time. For CNN, no such problem was encountered for filternumber values up to 2000, but at higher values, valid RMSE results could not be obtained. Therefore, the range 1–2000 was chosen.
The parameters obtained through genetic algorithm (GA) optimization, based on the aforementioned choices, are presented in
Table 4. During the GA process, the parameters are derived from the best-performing individual after conducting 20 iterations, as mentioned earlier. This optimization process is repeated separately for the different training datasets of each battery. The performance criterion used here is RMSE, which is commonly compared in the literature. Typically, the CNN algorithm demonstrates superior performance with a “filtersize” value of 1. However, optimal results were observed with “filtersize” values of 10 and 3 for the cases of 70 and 100 training data in the B0006 battery, respectively.
The optimal parameters for B0005 were determined and the relevant training data were used. The prediction results, obtained by averaging five iterations for 40, 70, and 100 training data, are presented in
Table 5. Based on these results, the GA-CNN model exhibited the highest performance for all values.
The optimal parameters for B0006 were determined and the relevant training data were used. The prediction results, obtained by averaging five iterations for 40, 70, and 100 training data, are presented in
Table 6. Generally, except for the case of 70 training data, the GA-CNN model showed the highest performance. However, with 70 training data, the GA-RNN model was found to be the best-performing model in terms of MSE, RMSE, and NRMSE values, while the GA-BP model performed best in terms of MAE value. The GA-CNN model achieved the highest R2 value.
The optimal parameters for the B0007 and B0018 batteries were determined and the relevant training data were used. The prediction results obtained by averaging five iterations for 40, 70, and 100 training data are presented in
Table 7 and
Table 8. Based on these results, the GA-CNN model exhibited the best performance overall.
The best prediction results for all training data across all batteries are summarized in
Table 9. The table lists the algorithms that achieved the best performance based on RMSE values. Except for the 70 training data of the B0006 battery, the algorithms that performed well in terms of RMSE also demonstrated high performance in other evaluation metrics. For this particular training dataset, the GA-RNN algorithm was selected because it delivered good results in three out of five parameters, primarily focusing on RMSE.
The charts of the prediction outcomes provided in the tables for all batteries for each of the training types and for every model are displayed in
Figure 11. Additionally, the percentage point errors of the best prediction algorithms are shown in
Figure 12. These algorithms are given in
Table 9 with their training numbers. Here, the training data are displayed with zero error. As can be seen from the graphs, percentage errors start with the test data.
Table 10 summarizes the RMSE and R2 values from various studies on battery life prediction tasks such as capacity, SOH, and RUL estimation. The data in this table include results obtained from studies conducted on the NASA dataset; the methods used to improve prediction accuracy are also indicated. As shown in the table, the performance of the proposed method is generally superior in all results except for the prediction using 100 training data for the B0006 battery. Specifically, predictions with 100 training data for battery B0005 achieve an RMSE roughly twice as good as in the literature studies. Similarly, the results obtained using 100 training data for battery B0007 are approximately three times better than the literature studies, while predictions with 70 and 100 training data for battery B0018 are about twice as good. Notably, for the prediction with 70 training data for battery B0007, the proposed model also outperforms existing studies in the literature.
5. Conclusions
In this study, parameters were determined using the genetic algorithm. GA minimizes time loss caused by trial-and-error methods to find optimal results. Moreover, its successful performance in prediction results enhances reliability by providing high accuracy. As evident in the prediction graphs and performance parameters, the CNN algorithm generally outperforms other algorithms across all batteries and training datasets. The highest accuracy achieved was 99.88% (1-RMSE), with 100 training data for battery B0005. The proposed method demonstrates superiority over the literature studies for three out of four batteries, underscoring the accuracy and reliability of this study. However, the lower prediction performance of battery B0006 compared to the literature studies highlights the challenges in predicting measurement results or electrochemical structures of this specific battery using methods like artificial neural networks and deep learning.
In conclusion, this study, which provides the opportunity to predict with high accuracy even with small training datasets, can be used in battery health and capacity prediction. In future work, the parameter optimization of the algorithms can be conducted using different optimization methods. Moreover, genetic algorithm optimization can be combined with different algorithms to further improve prediction accuracy.