1. Introduction
In recent years, growing concern about global warming and the rising price of fossil energy have pushed the automotive industry to focus on fuel economy and emission reduction. Electric vehicles are becoming mainstream in the transportation industry owing to their environmental friendliness, energy savings, and low cost. At the same time, the lithium-ion battery is widely used in new energy vehicles because of its high energy and power density, long cycle life, and other advantages. However, failures of lithium-ion batteries have led to frequent spontaneous combustion and explosion accidents in electric vehicles. In the field of energy-based maintenance (EBM), the safety of lithium-ion battery systems has received considerable attention [1,2]. Through the battery management system (BMS), the lithium-ion battery can be monitored, and the collected data can be processed and analyzed to manage the battery. When an abnormal state of a battery cell is found, the BMS quickly raises an alarm to prevent safety accidents [3,4,5,6]. Nevertheless, the BMS cannot detect all types of lithium-ion battery faults, so exploring new fault diagnosis methods is an important research direction in the field of EBM.
The faults of lithium-ion batteries are divided into short circuit faults, abuse faults, contact faults, and sensor faults. Short circuit faults are further divided into external short circuit (ESC) [7] and internal short circuit (ISC) [8,9,10]. The ESC fault releases a large amount of heat in a very short time, which can easily cause a fire. The ISC fault is not obvious at its early stage but deteriorates rapidly with time and eventually leads to the ESC fault. Abuse of a lithium-ion battery, such as thermal, mechanical, or electrical abuse, causes irreversible damage that not only reduces battery performance but also increases the likelihood of an ISC fault [11]. Contact faults arise from defects in the production process and improper connection of lithium-ion batteries: the increased resistance of the connection points between batteries affects their thermal safety and aging speed [12,13]. In addition, when a sensor in the battery system fails unexpectedly, the battery management system obtains inaccurate data and may take the wrong response measures. The safety of lithium-ion batteries therefore remains an obstacle to further development. Moreover, as a chemical power source, a lithium-ion battery is easily affected by the external environment and exhibits highly nonlinear characteristics, which makes fault diagnosis challenging [14].
Safety issues of lithium-ion batteries have attracted significant attention recently, and a substantial body of work has been dedicated to fault detection and diagnosis. These methods can be roughly divided into two categories: model-based diagnostic methods and model-free diagnostic methods. The model-free diagnostic methods include knowledge-based diagnostic methods and data-driven diagnostic methods.
The model-based fault diagnosis method usually involves building equivalent circuit models or electrochemical models of lithium-ion batteries [15,16]. It accurately describes the internal state of the system and is effective for diagnosing single-fault systems. Yu et al. [17] established a diagnostic model combining voltage and current parameters. They combined the least squares method with the unscented Kalman filter to predict the fault current and compared it with a current threshold to judge the fault. Wei et al. [18] put forward an electrothermal coupling model based on the electrical and thermal dynamic behavior of the battery, combined with a Lyapunov-based internal resistance estimator, to diagnose thermal abuse faults. Xiong et al. [19] proposed a model using the residual between the real and estimated state of charge and used a threshold to diagnose sensor faults. Dey et al. [20] proposed a model observer based on partial differential equations and used a threshold limit together with Lyapunov stability theory to judge the thermal failure of lithium-ion batteries. The model-based fault diagnosis method has the advantages of high efficiency and accuracy. However, it is unable to diagnose multiple faults because of inconsistencies between lithium-ion batteries and difficulties in parameter identification.
The knowledge-based diagnosis method depends on the characteristics of the experimental object, namely the lithium-ion battery system itself [21]. It includes algorithms such as expert systems and fuzzy logic rules. As the operating mechanism of the experimental object becomes better understood, a large amount of knowledge and experience is accumulated to help establish the mapping between various faults and data characteristics. This method can provide a clear interpretation of the diagnosis and requires less data support. For example, Wu et al. [22] proposed a fuzzy logic diagnosis method for overcharge, over-discharge, and thermal abuse faults by exploiting changes in electrical and capacity parameters. The knowledge-based fault diagnosis method is often combined with battery models and relies on historical data. However, a large amount of historical data is difficult to obtain, and the method relies heavily on the prior knowledge built into the expert system. Furthermore, the knowledge base of an expert system is relatively difficult to update and maintain. Therefore, the knowledge-based fault diagnosis method is effective in fault detection but is hindered by its dependence on historical data and subjective choices.
The data-driven diagnosis method mainly uses the real-time data of the BMS to judge and classify battery faults [23]. Xia et al. [24,25] put forward a voltage correlation coefficient method based on a recursive moving time window to identify battery short circuit faults. They also proposed a staggered voltage measurement method for the battery topology to accurately locate the faulty battery, with a threshold set to identify the battery fault. Kang et al. [26,27] proposed a cross-measurement topology for battery packs and realized multiple fault diagnosis by using an improved correlation coefficient method. Yang et al. [28] used the principal component analysis algorithm to extract data features and the correlation vector machine to classify them, so as to classify battery faults and fault degree. Zhou et al. [29] used various types of neural networks to diagnose power battery faults and output fault types. Jiang et al. [30] proposed a fault diagnosis method for lithium-ion batteries based on the isolation forest algorithm, with the variational mode decomposition algorithm used for data preprocessing, and realized multi-fault diagnosis of lithium-ion batteries. Ma et al. [31] used the contribution degree of principal component analysis to identify abnormal data, then used kernel principal component analysis to reconstruct the battery parameters and compared them with the normal parameters to achieve fault classification. Xu et al. [32] extracted the data features of a simulation model and used a decision tree and a cloud algorithm to classify faults. The data-driven diagnosis method has the merits of strong adaptability, low cost of update learning, and high robustness. However, noise in the measurement data and inconsistencies between cells can prevent the data from accurately reflecting the actual state of the battery. In addition, high demands on data quality and the black-box nature of the diagnostic process can affect the accuracy and interpretability of data-driven methods. In summary, the data-driven fault diagnosis method requires neither separate algorithms for different faults nor the support of large amounts of historical data, and it improves the speed and accuracy of fault diagnosis.
Based on previous research, this paper proposes a fault location and classification method for lithium-ion battery packs based on double fault windows and a least squares support vector machine optimized by grey wolf optimization (GWO-LSSVM). Firstly, for data preprocessing, the data sequence is decomposed by an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and then reconstructed. In addition, a correlation coefficient method based on dichotomy and a correlation coefficient method based on a time window are designed to identify the fault location. Principal component analysis (PCA) is then used to reduce the dimension of the decomposed intrinsic mode functions (IMFs), and the fault characteristics are determined by the contribution degree. Finally, the parameters of the least squares support vector machine (LSSVM) are determined by the grey wolf optimization (GWO) algorithm, and fault classification is realized. The main contributions of this paper are as follows:
1. A fault experiment platform is built, the faults are tested, and the voltage data are recorded.
2. A double fault window location method based on the dichotomy correlation coefficient and the time window correlation coefficient is proposed to effectively judge the fault window, reducing the calculation cost of fault location and improving efficiency.
3. The model takes the data noise and the inconsistency between batteries into account and therefore has good robustness.
The structure of this paper is as follows. Section 2 describes the experimental platform and experimental methods. In Section 3, the lithium-ion battery fault location and fault classification methods are introduced in detail. Section 4 discusses the performance of the proposed method on the lithium-ion battery fault classification problem in order to verify its effectiveness. The conclusion is given in Section 5.
3. Fault Location and Classification Methods
3.1. Voltage Correlation Coefficient
As shown in Figure 2, the batteries are connected in series by wires. In order to identify sensor faults, staggered voltage sensors are used to collect the voltage of each part of the battery pack. Battery faults are different from sensor faults. When a cell of a battery string is faulty, abnormal data are collected by the two adjacent voltage sensors that measure it, while the measurement data from the other voltage sensors remain unaffected. When a voltage sensor is faulty, the batteries in the battery string are not affected [24,25], and the adjacent voltage sensors collect normal data. The Pearson correlation coefficient is defined as follows:
$$r_{x,y} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$
where $r_{x,y}$ is the Pearson correlation coefficient of $x$ and $y$, $\operatorname{cov}(x, y)$ is the covariance of $x$ and $y$, and $\sigma_x$ and $\sigma_y$ are the standard deviations of $x$ and $y$, respectively.
In an ideal state, since the working state and working process of the series-connected batteries are the same, the variation trend of the battery voltages is also the same, so the correlation coefficient between voltage sensor readings should be 1. Under normal operation, however, differences in state of charge and state of health inevitably lead to inconsistency between batteries, so the correlation coefficient of adjacent voltage sensors is close to, but not equal to, 1. Assume that each voltage sensor collects the voltage sum of two adjacent batteries and that only one of the adjacent batteries fails. Battery faults and voltage sensor faults are then distinguished by calculating the two correlation coefficients of the adjacent voltage sensors. If both coefficients are not close to 1, a voltage sensor fault is indicated. If one coefficient is close to 1 and the other is not, the corresponding battery is faulty.
Similarly, when the measurement range of the voltage sensor is expanded, the above judgment rule is still applicable. When the number of batteries collected by the voltage sensor is N, the correlation coefficient method is used to compare the correlation coefficients of the values of N adjacent voltage sensors successively. Because voltage sensor failure and battery failure have different effects on the battery voltage trend, the fault type can be easily identified.
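The correlation check between adjacent voltage sensors can be illustrated with a short NumPy sketch. The staggered layout used here (three sensors, each summing two adjacent cells), the synthetic discharge curve, the injected sensor fault, and the function names are illustrative assumptions, not the experimental setup of this paper.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient: cov(x, y) / (std(x) * std(y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0.0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return float((xc * yc).sum() / denom)

def adjacent_correlations(sensor_voltages):
    """sensor_voltages has shape (n_sensors, n_samples).
    Returns r between sensor k and sensor k+1 for every adjacent pair."""
    v = np.asarray(sensor_voltages, dtype=float)
    return [pearson(v[k], v[k + 1]) for k in range(v.shape[0] - 1)]

# Synthetic data (assumed): 4 cells in series with a shared discharge trend
# and small independent noise; each sensor sums two adjacent cells.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
cells = 3.7 - 0.3 * t + 0.001 * rng.standard_normal((4, t.size))
sensors = np.stack([cells[k] + cells[k + 1] for k in range(3)])
healthy = adjacent_correlations(sensors)

# Hypothetical sensor fault: the middle sensor becomes noisy halfway through.
faulty = sensors.copy()
faulty[1, 250:] += 0.2 * rng.standard_normal(250)
after_fault = adjacent_correlations(faulty)

print("healthy:", healthy)
print("with sensor fault:", after_fault)
```

In this sketch both coefficients that involve the disturbed sensor fall well below 1, whereas the healthy pack keeps every coefficient close to 1.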
3.2. Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
In practical situations, the failure of lithium-ion batteries or various kinds of mechanical abuse, thermal abuse, etc., will lead to abnormal voltage change data. However, in the early stages of abuse and failure, the voltage changes reflecting the battery status information are very weak and difficult to distinguish. In addition, in the measurement of lithium-ion battery voltage, noise is often caused by mechanical oscillation, instrument drift, and other factors, which will greatly affect the analysis and extraction accuracy of lithium-ion battery voltage data, and even lead to data distortion. In view of the above problems, voltage data should be decomposed to eliminate noise and save fault information.
Empirical mode decomposition (EMD) was proposed to solve the trend term problem of signal data [33]. Compared with the wavelet analysis method, EMD decomposes a signal according to the time scale characteristics of the data itself, without a basis function being set in advance. As a signal processing method in the time-frequency domain, EMD is essentially a means of making non-stationary signals stationary. The fluctuations and trends of different scales in the signal are decomposed step by step to produce a series of data sequences with different characteristic scales, which are called IMFs. In order to suppress the spurious components and mode aliasing in EMD, the ICEEMDAN [34,35] method is used. During signal processing, ICEEMDAN adds Gaussian white noise that has itself been decomposed by EMD. Then, the residual and IMFs are obtained. The ICEEMDAN decomposition process of the original voltage data is as follows.
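White noise is added to the original signal to obtain a new signal:
$$x^{(i)} = x + \beta_0 E_1\left(w^{(i)}\right), \qquad \beta_0 = \varepsilon_0 \frac{\sigma(x)}{\sigma\left(E_1\left(w^{(i)}\right)\right)}$$
where $\beta_0$ is the system parameter, $\varepsilon_0$ is the estimated parameter, $x$ is the original signal, $\sigma(\cdot)$ is the standard deviation, $E_k(\cdot)$ denotes the $k$-order modal component generated by EMD decomposition, and $w^{(i)}$ denotes the $i$-th white noise to be added.
$$r_1 = \left\langle M\left(x^{(i)}\right) \right\rangle$$
where $x^{(i)}$ represents the signal data after white noise is added, $M(\cdot)$ denotes the local mean of the resulting signal, and $\langle \cdot \rangle$ denotes averaging over the noise realizations.
$$d_1 = x - r_1$$
where $r_1$ represents the first residual value and $d_1$ is the first IMF value. Then, the second IMF value can be calculated by
$$d_2 = r_1 - r_2, \qquad r_2 = \left\langle M\left(r_1 + \beta_1 E_2\left(w^{(i)}\right)\right) \right\rangle$$
Similarly, according to
$$r_k = \left\langle M\left(r_{k-1} + \beta_{k-1} E_k\left(w^{(i)}\right)\right) \right\rangle$$
where $k = 2, 3, \ldots$, the $k$-th IMF value can be calculated as $d_k = r_{k-1} - r_k$.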
Through the above steps, the signal decomposition of ICEEMDAN is realized. Furthermore, ICEEMDAN can only be used when residual components can be decomposed by EMD.
3.3. Double Fault Window Location Method Based on Correlation Coefficient
3.3.1. Fault Window Location Based on Dichotomy
In long-term test experiments, a large amount of battery pack test data is accumulated, but a sudden fault of the battery pack happens within a very short time and leads to rapid deterioration of the health of the faulty battery. Therefore, locating the fault window within a large amount of test data is the key to detecting sudden faults. Dichotomy (binary search) is an efficient search algorithm that can quickly find a specific interval in a large amount of data. Each comparison halves the search area, so the fault window can be located quickly. In addition, since the length of the data is unknown, the length of the fault interval is used as the termination condition of the dichotomy. The fault window is defined as the actual fault data length, and the maximum fault window is the fault interval.
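The dichotomy search can be sketched as follows. The rule for choosing a half (keep the half with the lower correlation coefficient between two sensor series), the synthetic signals, and the name `locate_fault_interval` are assumptions made for illustration. The sketch also assumes a single fault that does not straddle a split point.

```python
import numpy as np

def locate_fault_interval(x, y, max_window):
    """Repeatedly halve [lo, hi), keeping the half in which the correlation
    between x and y is lower, until the interval is at most max_window long."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo, hi = 0, len(x)
    while hi - lo > max_window:
        mid = (lo + hi) // 2
        r_left = np.corrcoef(x[lo:mid], y[lo:mid])[0, 1]
        r_right = np.corrcoef(x[mid:hi], y[mid:hi])[0, 1]
        if r_left <= r_right:
            hi = mid   # the left half looks less consistent
        else:
            lo = mid   # the right half looks less consistent
    return lo, hi

# Synthetic data (assumed): two sensor series sharing a common load pattern.
rng = np.random.default_rng(1)
n = 8192
t = np.arange(n)
common = 0.1 * np.sin(2.0 * np.pi * t / 200.0)
x = 7.4 + common + 0.002 * rng.standard_normal(n)
y = 7.4 + common + 0.002 * rng.standard_normal(n)
y[6200:6300] -= 0.3  # hypothetical sudden voltage drop seen by one sensor

lo, hi = locate_fault_interval(x, y, max_window=512)
print("fault interval:", lo, hi)
```

Only about log2(n / max_window) pairs of correlation coefficients are evaluated, which is where the reduction in calculation cost comes from.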
3.3.2. Fault Window Locating Based on Time Window
A time window is often used in data segmentation. The length of the time window dictates the sensitivity to voltage changes: a smaller time window results in higher sensitivity, while a larger one leads to lower sensitivity. With a suitable time window, the fault location can be found even when the lithium-ion batteries are inconsistent. For contact fault data that cannot be distinguished by dichotomy, the time window method is used to divide the fault data, and the correlation coefficient of each time window is calculated in turn to diagnose the fault window.
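A minimal sketch of the time window approach is given below. The non-overlapping windows, the threshold of 0.9, the synthetic contact-fault fluctuation, and the function names are illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np

def windowed_correlation(x, y, window):
    """Correlation coefficient of x and y in consecutive,
    non-overlapping windows of the given length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_windows = len(x) // window
    r = np.empty(n_windows)
    for k in range(n_windows):
        s = slice(k * window, (k + 1) * window)
        r[k] = np.corrcoef(x[s], y[s])[0, 1]
    return r

def fault_windows(r, threshold):
    """Indices of windows whose correlation falls below the threshold."""
    return [k for k, rk in enumerate(r) if rk < threshold]

# Synthetic data (assumed): two sensor series with a shared load pattern.
rng = np.random.default_rng(2)
n = 3000
t = np.arange(n)
common = 0.1 * np.sin(2.0 * np.pi * t / 50.0)
x = 7.4 + common + 0.002 * rng.standard_normal(n)
y = 7.4 + common + 0.002 * rng.standard_normal(n)
y[1500:1600] += 0.1 * rng.standard_normal(100)  # hypothetical contact-fault fluctuation

r = windowed_correlation(x, y, window=100)
suspect = fault_windows(r, threshold=0.9)
print("suspect windows:", suspect)
```

A shorter `window` reacts to briefer disturbances but makes each coefficient noisier, which reflects the sensitivity trade-off described above.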
3.4. Principal Component Analysis
PCA is a commonly used data dimensionality reduction algorithm. The algorithm not only transforms high-dimensional data into low-dimensional data, but also preserves as much information as possible. The basic idea of PCA is to map the original data to a new coordinate system by linear transformation, so that the data in the new coordinate system have the maximum variance. In this way, dimensionality reduction is achieved by retaining the principal components of the largest variance. Principal components are the projections of the original data in the new coordinate system, and they are the directions that best represent the data distribution in the original data.
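When the numbers of samples and indicator variables are $p$ and $q$, the raw data $x_{ij}$ are first normalized to $x_{ij}^{*}$. This process calculates the mean $\bar{x}_j$ and standard deviation $s_j$ of each column to obtain the standardized data:
$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad i = 1, \ldots, p, \quad j = 1, \ldots, q$$
Then, the $k$-th principal component consists of
$$F_k = a_{k1} x_1^{*} + a_{k2} x_2^{*} + \cdots + a_{kq} x_q^{*}$$
where $a_k = (a_{k1}, \ldots, a_{kq})^{T}$ is the unit eigenvector of the correlation matrix of the data associated with its $k$-th largest eigenvalue.
Finally, assuming $\lambda_k$ is the $k$-th eigenvalue, the cumulative contribution rate of $z$ principal components, denoted as $\eta_z$, is represented as follows:
$$\eta_z = \frac{\sum_{k=1}^{z} \lambda_k}{\sum_{k=1}^{q} \lambda_k}$$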
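The PCA procedure (column-wise standardization, eigendecomposition, and selection by cumulative contribution rate) can be sketched in NumPy as follows. The 95% target, the synthetic feature matrix, and the name `pca_by_contribution` are assumptions for illustration only.

```python
import numpy as np

def pca_by_contribution(data, target=0.95):
    """Standardize columns, eigendecompose the correlation matrix, and keep
    the smallest number z of components whose cumulative contribution rate
    reaches the target."""
    X = np.asarray(data, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardized data
    R = np.cov(Z, rowvar=False)               # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(R)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    z = int(np.searchsorted(cumulative, target)) + 1
    z = min(z, len(eigvals))                  # guard against rounding
    scores = Z @ eigvecs[:, :z]               # projections on z components
    return scores, cumulative, z

# Synthetic data (assumed): 5 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(3)
latent = rng.standard_normal((200, 2))
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0]])
features = latent @ mixing + 0.05 * rng.standard_normal((200, 5))

scores, cumulative, z = pca_by_contribution(features, target=0.95)
print("components kept:", z)
print("cumulative contribution:", np.round(cumulative, 4))
```

In a diagnosis pipeline the rows of `features` would be replaced by the IMF-derived features, and `scores` would be passed on to the classifier.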
3.5. Grey Wolf Optimization Algorithm
The GWO algorithm is an intelligent optimization algorithm proposed by Mirjalili et al. [36] based on the social behavior and hunting behavior of grey wolves. As shown in Figure 3, the grey wolf pack is divided into four classes: $\alpha$, $\beta$, $\delta$, and $\omega$. The $\alpha$ wolves are the highest leaders in the grey wolf population and have the highest decision-making power on hunting, habitat selection, and other activities. The secondary leaders of the grey wolf population are the $\beta$ wolves, whose main job is to assist the $\alpha$ wolves in management and to manage the activities of the rest of the pack. The $\delta$ wolves are the watchdogs of the grey wolf population, whose main job is to scout the boundaries of the pack, warn it of danger, and care for weak and injured wolves. The $\omega$ wolves are the lowest-ranking wolves in the grey wolf population. They obey the other three kinds of wolves in terms of leadership and make a great contribution to the balance and reproduction of the population. In hunting, grey wolves first seek and pursue prey. Second, they chase and harass prey. Then, they attack prey, and finally they capture prey.
GWO establishes a mathematical model according to the social hierarchy of the grey wolf. The first three optimal solutions correspond to the first three classes of the wolf group, $\alpha$, $\beta$, and $\delta$. The $\omega$ wolves are the candidate solutions, and their positions are updated according to $\alpha$, $\beta$, and $\delta$.
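In the grey wolf optimization algorithm, grey wolves use the following position update formula to surround prey during hunting:
$$\mathbf{D} = \left| \mathbf{C} \cdot \mathbf{X}_p(t) - \mathbf{X}(t) \right|, \qquad \mathbf{X}(t+1) = \mathbf{X}_p(t) - \mathbf{A} \cdot \mathbf{D}$$
where $\mathbf{X}_p$ and $\mathbf{X}$ are the position vector of the prey and the position vector of the grey wolf, respectively, and $t$ is the current iteration number. $\mathbf{A}$ and $\mathbf{C}$ are coefficient vectors, calculated as follows:
$$\mathbf{A} = 2a\,\mathbf{r}_1 - a, \qquad \mathbf{C} = 2\,\mathbf{r}_2$$
where $\mathbf{r}_1$ and $\mathbf{r}_2$ are two random vectors whose components lie in [0,1], $\mathbf{A}$ is used to simulate the attack behavior of grey wolves on prey, and its value is affected by $a$. The convergence factor $a$ is a key parameter for balancing the exploration and exploitation capability of GWO. Its value decreases linearly from 2 to 0 as the number of iterations increases.
$$\mathbf{D}_\alpha = \left| \mathbf{C}_1 \cdot \mathbf{X}_\alpha - \mathbf{X} \right|, \qquad \mathbf{D}_\beta = \left| \mathbf{C}_2 \cdot \mathbf{X}_\beta - \mathbf{X} \right|, \qquad \mathbf{D}_\delta = \left| \mathbf{C}_3 \cdot \mathbf{X}_\delta - \mathbf{X} \right|$$
where the positions $\mathbf{X}_\alpha$, $\mathbf{X}_\beta$, $\mathbf{X}_\delta$ of $\alpha$, $\beta$, and $\delta$ are updated using the above equation for all grey wolves. $\mathbf{D}_\alpha$, $\mathbf{D}_\beta$, and $\mathbf{D}_\delta$ are the distances of grey wolves from the $\alpha$, $\beta$, and $\delta$ wolves, respectively. In the iterative process, $\alpha$, $\beta$, and $\delta$ are used to guide the movement of $\omega$, thus achieving global optimization.
$$\mathbf{X}_1 = \mathbf{X}_\alpha - \mathbf{A}_1 \cdot \mathbf{D}_\alpha, \qquad \mathbf{X}_2 = \mathbf{X}_\beta - \mathbf{A}_2 \cdot \mathbf{D}_\beta, \qquad \mathbf{X}_3 = \mathbf{X}_\delta - \mathbf{A}_3 \cdot \mathbf{D}_\delta$$
$$\mathbf{X}(t+1) = \frac{\mathbf{X}_1 + \mathbf{X}_2 + \mathbf{X}_3}{3}$$
where $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$, respectively, represent the positions of individual $\omega$ wolves that need to be adjusted under the influence of wolf $\alpha$, wolf $\beta$, and wolf $\delta$. Here, the average value $\mathbf{X}(t+1)$ represents the final position.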
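The GWO update rules can be sketched as a compact NumPy routine. It follows the commonly published form of the algorithm (coefficients `A = 2*a*r1 - a` and `C = 2*r2`, three leaders, averaged position). The sphere test function, the population size, the iteration count, and the name `gwo_minimize` are assumptions for illustration and are not the settings used for tuning the classifier.

```python
import numpy as np

def gwo_minimize(objective, lower, upper, n_wolves=20, n_iter=200, seed=0):
    """Minimal grey wolf optimizer for box-constrained minimization."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    wolves = rng.uniform(lower, upper, size=(n_wolves, dim))
    leaders = np.zeros((3, dim))        # alpha, beta, delta (best so far)
    leader_fit = np.full(3, np.inf)
    for t in range(n_iter + 1):
        fitness = np.array([objective(w) for w in wolves])
        # Keep the three best positions seen so far as alpha, beta, delta.
        pool = np.vstack([leaders, wolves])
        pool_fit = np.concatenate([leader_fit, fitness])
        top = np.argsort(pool_fit)[:3]
        leaders, leader_fit = pool[top].copy(), pool_fit[top].copy()
        if t == n_iter:
            break
        a = 2.0 * (1.0 - t / n_iter)    # decreases linearly from 2 towards 0
        moved = np.zeros_like(wolves)
        for leader in leaders:
            A = 2.0 * a * rng.random((n_wolves, dim)) - a
            C = 2.0 * rng.random((n_wolves, dim))
            D = np.abs(C * leader - wolves)   # distance to this leader
            moved += leader - A * D           # X1, X2, X3 in turn
        wolves = np.clip(moved / 3.0, lower, upper)  # average of X1..X3
    return leaders[0], float(leader_fit[0])

# Usage on a simple test function (assumed): the 3-D sphere function.
best_x, best_f = gwo_minimize(lambda p: float(np.sum(p ** 2)),
                              lower=[-5.0, -5.0, -5.0],
                              upper=[5.0, 5.0, 5.0])
print("best position:", best_x)
print("best value:", best_f)
```

For classifier tuning, `objective` would typically return a validation error as a function of the two LSSVM parameters, with `lower` and `upper` bounding their search ranges.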
3.6. Least Squares Support Vector Machine
The least squares support vector machine (LSSVM) is an improved algorithm based on the support vector machine (SVM) framework. LSSVM replaces the hinge loss function used in SVM to evaluate the empirical risk with a squared-error loss and converts the inequality constraints of SVM into equality constraints, which reduces the complexity of solving for the Lagrange multipliers. The quadratic programming problem in SVM is thereby transformed into a system of linear equations, which reduces the computation and complexity of the algorithm.
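$$y(x) = \operatorname{sign}\left( w^{T} \varphi(x) + b \right)$$
The formula is the nonlinear optimal classification function of LSSVM; $w$ is the weight vector of the space, and $b$ is the offset. Define the sample dataset $D = \{(x_i, y_i) \mid i = 1, 2, \ldots, N\}$, where $x_i \in \mathbb{R}^{n}$, $y_i \in \{-1, +1\}$, and the sample dataset $D$ is mapped to a high-dimensional space by a nonlinear transformation function $\varphi(\cdot)$.
$$\min_{w, b, e} J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2} \quad \text{s.t.} \quad y_i \left[ w^{T} \varphi(x_i) + b \right] = 1 - e_i, \quad i = 1, \ldots, N$$
The above formula is the optimization problem and constraint condition of LSSVM, where $J$ represents the optimization function, $\gamma$ represents the penalty factor, and $e_i$ represents the $i$-th error variable.
$$L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left\{ y_i \left[ w^{T} \varphi(x_i) + b \right] - 1 + e_i \right\}$$
The formula is defined by the Lagrange function, where $\alpha_i$ are the Lagrange multipliers.
The partial derivatives with respect to $w$, $b$, $e_i$, and $\alpha_i$ are calculated and set to zero:
$$w = \sum_{i=1}^{N} \alpha_i y_i \varphi(x_i), \qquad \sum_{i=1}^{N} \alpha_i y_i = 0, \qquad \alpha_i = \gamma e_i, \qquad y_i \left[ w^{T} \varphi(x_i) + b \right] - 1 + e_i = 0$$
Eliminating $w$ and $e_i$ in the formula gives another formula:
$$\begin{bmatrix} 0 & y^{T} \\ y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix}$$
In that formula, $y = [y_1, \ldots, y_N]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, and $\Omega_{ij} = y_i y_j K(x_i, x_j)$.
$$y(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b \right), \qquad K(x, x_i) = \exp\left( -\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}} \right)$$
In the formula, $K(x, x_i)$ is the kernel function, and the radial basis function is usually chosen. $\sigma$ is the undetermined width parameter of the radial basis kernel function.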
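A minimal NumPy sketch of a binary LSSVM classifier with a radial basis kernel is shown below. It uses the standard LSSVM linear system for labels in {-1, +1}. The synthetic two-cluster data, the values `gamma=10.0` and `sigma=1.0`, and the function names are assumptions for illustration. A multi-class fault classifier would need an additional scheme such as one-vs-rest, which is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K[i, j] = exp(-||A[i] - B[j]||^2 / (2 * sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = X.shape[0]
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], np.ones(n)])
    solution = np.linalg.solve(M, rhs)
    return solution[0], solution[1:]          # offset b, multipliers alpha

def lssvm_predict(X_train, y_train, alpha, b, X_new, sigma):
    """sign(sum_i alpha_i * y_i * K(x, x_i) + b) for every row of X_new."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Synthetic data (assumed): two well-separated 2-D clusters.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(-2.0, 0.5, size=(40, 2)),
               rng.normal(2.0, 0.5, size=(40, 2))])
y = np.concatenate([-np.ones(40), np.ones(40)])

b, alpha = lssvm_fit(X, y, gamma=10.0, sigma=1.0)
pred = lssvm_predict(X, y, alpha, b, X, sigma=1.0)
accuracy = float(np.mean(pred == y))
print("training accuracy:", accuracy)
```

The penalty factor `gamma` and the kernel width `sigma` are the two quantities that an optimizer such as GWO would search over.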
3.7. Fault Diagnosis Thought
As shown in Figure 4, the process of battery pack fault diagnosis is divided into three parts: data preprocessing, data feature processing, and fault type classification. Data preprocessing includes ICEEMDAN decomposition and fault window location. The original voltage data of the battery pack are decomposed by ICEEMDAN, and the IMFs and residual terms are retained and reconstructed. The fault location is then determined by the double fault window location method. Data feature processing takes the IMFs as fault features and uses PCA to reduce the feature dimension. In fault type classification, the reduced-dimension fault features are used as input while GWO searches for the LSSVM parameters, which yields the optimal GWO-LSSVM classifier.