1. Introduction
As a core component of the power converter, insulated gate bipolar transistor (IGBT) modules have been widely employed in high-reliability and safety-critical systems, such as electric vehicles [1,2,3], aircraft [4], renewable energy generation [5,6], and high-speed railway [7]. However, owing to the challenging thermal environment combined with aggressive power density, IGBT modules may experience large temperature fluctuations, which may lead to severe thermal stress. Thermal stress may degrade the module’s electrical specifications and cause different degrees of thermo-mechanical failure, leading to reliability issues in power electronic applications. Research has shown that more than 30% of power conversion system breakdowns are caused by power device failure, and nearly 60% of device failures are induced by thermal stress [8,9,10,11]. Therefore, thermal management has become a significant issue in power conversion systems from the point of view of reliability [12,13].
Much research on the thermal management of IGBTs has been presented in the past. For example, temperature monitoring provides an efficient way to evaluate prototypes and to keep the device’s operational temperature below its threshold value; hence, it is a feasible approach to improving the reliability of power conversion systems. Existing IGBT temperature monitoring approaches may be divided into optical methods, electrical methods, and physically contacting methods [14,15,16,17,18,19,20,21,22,23,24]. These studies focus on ensuring that the mean operational temperature (i.e., $T_m$) stays below a safety threshold value under full-load conditions. Nevertheless, reliability tests on power devices indicate that a power device’s fatigue lifetime depends on thermal cycling, which is the temperature swing within devices caused by power cycles, i.e., loads. Thermal cycling strength can be characterized by the mean operational temperature ($T_m$) and the amplitude of temperature fluctuations ($\Delta T_j$), as shown in Figure 1 [25,26,27].
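As a minimal illustration of these two quantities, the sketch below extracts a mean temperature and a temperature swing from a sampled junction-temperature trace. It assumes that the swing is taken as the peak-to-peak value over the trace; the function name and the sample values are hypothetical and are not taken from the paper.

```python
def thermal_cycle_stats(temps):
    """Return (mean temperature, peak-to-peak swing) of a temperature trace in °C.

    Assumption: the swing is defined as max minus min over the whole trace.
    """
    if not temps:
        raise ValueError("temperature trace is empty")
    t_mean = sum(temps) / len(temps)
    delta_t = max(temps) - min(temps)
    return t_mean, delta_t


# Hypothetical trace: two identical cycles oscillating around 60 °C.
trace = [60.0, 64.0, 60.0, 56.0, 60.0, 64.0, 60.0, 56.0]
t_mean, delta_t = thermal_cycle_stats(trace)
print(f"mean = {t_mean:.1f} °C, swing = {delta_t:.1f} °C")
```

For a trace with several distinct cycles, a cycle-counting method would be needed instead of a single global maximum and minimum.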
Due to the mismatch between the coefficients of thermal expansion of the various material layers used in devices and packages, the bond wires and the solder layers are subjected to thermomechanical stresses, which cause bond wire degradation and solder joint fatigue, i.e., thermal damage. Power devices fail when thermal damage surpasses the threshold. Moreover, the temperature swing $\Delta T_j$ has a greater impact on device failure than the mean temperature $T_m$. From Figure 1, it can be concluded that, if $\Delta T_j$ is reduced by the same amount that $T_m$ is increased by, a much higher number of cycles to failure can be achieved. Unfortunately, $\Delta T_j$ is not taken into consideration in temperature monitoring approaches. Consequently, the device can only be kept operating “safely” up to the maximum allowable temperature, while the potentially damaging stresses due to $\Delta T_j$ in the module cannot be avoided.
To address the aforementioned issues, active thermal control (ATC) techniques have been developed to control the mean temperature $T_m$ and the temperature swing $\Delta T_j$ simultaneously. In practical applications, by regulating the module’s cooling system or power losses, ATC techniques can reduce both the amplitude of temperature fluctuations and the mean temperature level. Furthermore, ATC techniques do not require changes to the design of the power conversion system, which makes them cost-effective [28,29,30]. Many efforts have focused on ATC techniques, which can be categorized into dynamic-cooling approaches and electrical-parameter approaches.
The dynamic-cooling approach to ATC depends on the active control of the cooling system. Advanced dynamic cooling strategies have been proposed to reduce the module’s thermal cycling during operation by controlling the fan speed or the flow velocity of the water-cooling system, aiming to improve the reliability of the power converter [31,32,33]. This type of technology can be employed in any system with controllable cooling.
The electrical-parameter approach to ATC relies on the active control of electrical parameters that directly or indirectly influence the generation or distribution of power losses in the module. Generally, the control strategies of the electrical-parameter approach may be categorized into three control levels: modulation level, converter level, and system level. At the modulation level, the gate driver and modified modulation patterns are typically utilized to regulate the module’s power losses [34]. At the converter level, the thermal cycling of the modules can be reduced by modifying controllable variables, such as the DC-link voltage, current, and switching frequency [35,36]. At the system level, the presence of multiple power converters can be utilized to adjust the loss distribution without disturbing the main converter’s goal [37,38].
Although both the dynamic-cooling approach and the electrical-parameter approach can effectively reduce the module’s thermal cycling and thereby improve the reliability of the power converter, there are still some limitations in practical applications: (a) an output observer for the mean temperature $T_m$ and the temperature swing $\Delta T_j$ is essential to provide the feedback control signal, making the system costly and complex, and (b) the magnitudes of $T_m$ and $\Delta T_j$ can only be reduced; they cannot be controlled with precision according to the application and the desired profiles. These weaknesses may make the methods conservative. Consequently, designing a feedback control system that is independent of an observer and controls $T_m$ and $\Delta T_j$ with precision remains challenging and is vital to improving the reliability of power electronic applications.
Motivated by the analysis above, in this paper we propose a novel feedback control system based on the theory of finite-time boundedness (FTB), which is able to precisely control the mean temperature $T_m$ and the temperature swing $\Delta T_j$ of a power device without notably affecting the normal operation of the converter. Two aspects of the work are demonstrated: (a) a state-space thermal model of the power device is built, which obtains the temperature information (i.e., $T_m$ and $\Delta T_j$) in real time from the electrical variables of the converter, and (b) a feedback controller based on FTB is proposed to precisely control $T_m$ and $\Delta T_j$. Through this approach, the power losses of the module are regulated and the thermal stress is controlled; thereby, the damage caused by thermal cycling is reduced, improving the reliability of the converter. Compared to traditional ATC techniques, this approach has two advantages: (a) an accurate real-time estimate of the junction temperature of a power device is available without using an observer, and (b) the values of $T_m$ and $\Delta T_j$ can not only be reduced but also assigned precisely by the feedback controller.
The remainder of the paper is organized as follows. In Section 2, the development of a state-space thermal model of a power device is demonstrated. In Section 3, the feedback controller based on the FTB is introduced. In Section 4 and Section 5, the effectiveness of the proposed method is validated by simulation and experimental results, respectively.
2. Development of a State-Space Thermal Model
An accurate real-time estimate of the junction temperature of the power device is an important part of ATC algorithms. One way to obtain the temperature information is to use integrated sensors. Negative temperature coefficient (NTC) resistors and on-chip diodes are the two common types of sensors. Generally, NTC resistors are installed in the direct-bond-copper (DBC) substrate to acquire the baseplate temperature [39], and on-chip diodes are integrated within the IGBT chip itself to perform online measurement of the chip temperature [40]. However, during the design and manufacturing of the power device, both types of sensors require special considerations, such as electrical isolation and the layout and/or compatibility of pins, which may increase manufacturing cost and introduce new reliability problems.
Another way to acquire the temperature information is to use thermo-sensitive electrical parameters (TSEPs) of the power device, such as turn-on/off time, on-state collector-emitter voltage, short-circuit current, and peak gate current. However, these electrical parameters face many difficulties in online implementation, for instance, the need to compensate for the operation conditions and the need for a high-precision measurement circuit or a redesign of the converter structure. Therefore, an economical and straightforward temperature estimation approach is important for the ATC of a power device. In this paper, we propose a state-space thermal model for the real-time estimation of the junction temperature according to the thermal behavior of the module.
2.1. Modeling
Typically, a power module is composed of IGBT chips and diode chips, which act as the heat sources that contribute to the entire heat flow inside the module. Figure 2 shows a commercial power module (SKM75GB123D) made by SEMIKRON, where one IGBT chip and one free-wheeling diode chip are combined in parallel on each substrate tile. Generally, the temperature of an IGBT chip on a single substrate tile is influenced by the adjacent diode chips. In this paper, because of the long distance between the IGBT chip and the diode chip, the cross-coupling is relatively small and is therefore ignored. As a result, only the self-heating of the IGBT chip is considered in the modeling process.
The IGBT chip, acting as the heat source, contributes to the entire heat inside the device. The heat is generated in the chip and spreads through several layers of different materials down to the baseplate, constituting the thermal path of the device, as illustrated in Figure 3. Most of the heat spreads downward along an angle of 45°, which is regarded as the optimal thermal path. This thermal path can be described by the transient thermal impedance from the junction to ambient, $Z_{th,ja}(t)$, which is shown as follows:

$$Z_{th,ja}(t) = \frac{T_j(t) - T_a(t)}{P} \qquad (1)$$

where $Z_{th,ja}(t)$ denotes the transient thermal impedance from junction to ambient, $P$ denotes the total power losses of the module, $T_j$ denotes the junction temperature, and $T_a$ denotes the ambient temperature.
It is worth highlighting that variations in ambient temperature are generally slow compared with the thermo-dynamic characteristics of the power device, and the ambient temperature normally remains constant via the dynamic cooling system (e.g., by controlling the speed of the fan). Therefore, the ATC of the junction temperature $T_j$ is equivalent to the ATC of the junction-to-ambient temperature rise $T_j - T_a$, and, with the constant $T_a$ taken as the reference, $Z_{th,ja}(t)$ can be written as $Z_{th}(t)$ for model simplification. Thus, (1) can be rewritten as follows:

$$Z_{th}(t) = \frac{T_j(t)}{P} \qquad (2)$$
The function $Z_{th}(t)$ can be described by an electrical equivalent resistance-capacitance ($RC$) network, shown in Figure 4, which is known as the Foster network. A series of exponential terms is used to characterize the time response of the Foster network as follows:

$$Z_{th}(t) = \sum_{i=1}^{n} R_i \left(1 - e^{-t/(R_i C_i)}\right) \qquad (3)$$

where $R_i$ and $C_i$ denote the thermal resistance and thermal capacitance of the $i$-th $RC$ pair of the electrical equivalent network, and $n$ is the number of pairs.
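The time response of a Foster network can be evaluated directly. The sketch below assumes the usual Foster step-response form, $Z_{th}(t) = \sum_i R_i (1 - e^{-t/(R_i C_i)})$, and uses hypothetical fourth-order $RC$ values chosen only for illustration; they are not the fitted parameters of the module discussed here.

```python
import math


def foster_zth(t, rc_pairs):
    """Transient thermal impedance (K/W) of a Foster network at time t (s).

    rc_pairs is a list of (R_i, C_i) tuples in K/W and J/K.
    """
    return sum(r * (1.0 - math.exp(-t / (r * c))) for r, c in rc_pairs)


# Hypothetical 4th-order network (illustrative values only).
RC_PAIRS = [(0.05, 0.02), (0.10, 0.10), (0.20, 0.50), (0.15, 5.0)]

for t in (0.001, 0.01, 0.1, 1.0, 10.0):
    print(f"t = {t:6.3f} s  Zth = {foster_zth(t, RC_PAIRS):.4f} K/W")
```

The curve starts at zero and settles at the sum of the thermal resistances, which is the steady-state junction-to-reference thermal resistance.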
Taking the Laplace transformation of (3), the partial fraction expansion form of the transfer function of $Z_{th}$ in the frequency domain is obtained as follows:

$$Z_{th}(s) = \sum_{i=1}^{n} \frac{k_i}{s - p_i} \qquad (4)$$

where $k_i$ and $p_i$ denote the residues and poles of the transfer function, respectively, and $s$ denotes the complex variable.
Algebraic manipulation shows that the poles and residues are related to the $RC$ components of the Foster network (thermal resistances $R_i$ and thermal capacitances $C_i$) as follows:

$$k_i = \frac{1}{C_i}, \qquad p_i = -\frac{1}{R_i C_i} \qquad (5)$$

It should be pointed out that there is no correlation between the $RC$ elements and the physical characteristics of the thermal path, since the Foster network is just an equivalent circuit model of the module’s thermal system.
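The mapping between the $RC$ elements and the partial-fraction terms can be checked numerically. The sketch below assumes that each Foster pair contributes a first-order term $R_i/(1 + s R_i C_i)$ to the junction-temperature-to-power transfer function, so that the residue is $1/C_i$ and the pole is $-1/(R_i C_i)$. The $RC$ values and function names are hypothetical.

```python
def foster_poles_residues(rc_pairs):
    """Return (poles, residues) of the Foster transfer function.

    Each pair R/(1 + s*R*C) equals (1/C) / (s + 1/(R*C)).
    """
    poles = [-1.0 / (r * c) for r, c in rc_pairs]
    residues = [1.0 / c for _r, c in rc_pairs]
    return poles, residues


def zth_transfer(s, poles, residues):
    """Evaluate the partial-fraction form sum(k_i / (s - p_i)) at s."""
    return sum(k / (s - p) for k, p in zip(residues, poles))


RC_PAIRS = [(0.05, 0.02), (0.10, 0.10), (0.20, 0.50), (0.15, 5.0)]
POLES, RESIDUES = foster_poles_residues(RC_PAIRS)

# The DC gain (s = 0) should equal the total thermal resistance.
dc_gain = zth_transfer(0.0, POLES, RESIDUES)
print(f"DC gain = {dc_gain:.4f} K/W")
```

All poles are real and negative, which reflects the purely dissipative nature of the thermal path.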
The partial fraction expansion shown in (4) can be easily transformed into a state-space model, as shown in the following form:

$$\dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t) \qquad (6)$$

where the state vector $x(t)$ denotes the heat through the thermal path and $u(t) = P$ is the input of the thermal system, i.e., the power losses of the power module. The output equation gives the temperature of the system, $y(t)$, which is the junction temperature of the power module.
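Based on the relationships between the residues, the poles, and the $RC$ elements of the Foster network (thermal resistances $R_i$ and thermal capacitances $C_i$), the state-space model can be transformed into a parallel form with a diagonal system matrix, as shown in the following [41]:

$$A = \mathrm{diag}\left(-\frac{1}{R_1 C_1}, \ldots, -\frac{1}{R_n C_n}\right), \quad B = \left[\frac{1}{C_1}, \ldots, \frac{1}{C_n}\right]^{T}, \quad C = [1, \ldots, 1] \qquad (7)$$

where $A$ is the system matrix, $B$ is the input matrix, and $C$ is the output matrix.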
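Because the system matrix is diagonal, each state can be stepped forward independently. The following sketch assumes the realization in which each state is the temperature rise across one Foster pair (input gain $1/C_i$, unit output gains) and uses an exact zero-order-hold update over a fixed sampling interval. The $RC$ values, the 100 W step, and the function name are hypothetical, and the output is the temperature rise above the reference temperature.

```python
import math


def simulate_foster_state_space(power, dt, rc_pairs):
    """Simulate the diagonal state-space thermal model.

    power    : list of power-loss samples in W, held constant over each step
    dt       : sampling interval in s
    rc_pairs : list of (R_i, C_i) tuples
    Returns the temperature rise (K) after each step.
    """
    # exp(p_i * dt) with p_i = -1/(R_i*C_i): exact decay factor per step.
    decay = [math.exp(-dt / (r * c)) for r, c in rc_pairs]
    x = [0.0] * len(rc_pairs)
    out = []
    for p in power:
        # Exact solution of dx_i/dt = -x_i/(R_i*C_i) + p/C_i over one step.
        x = [a * xi + r * (1.0 - a) * p
             for a, xi, (r, _c) in zip(decay, x, rc_pairs)]
        out.append(sum(x))  # output matrix is a row of ones
    return out


RC_PAIRS = [(0.05, 0.02), (0.10, 0.10), (0.20, 0.50), (0.15, 5.0)]
rise = simulate_foster_state_space([100.0] * 2000, 0.001, RC_PAIRS)
print(f"temperature rise after 2 s at 100 W: {rise[-1]:.2f} K")
```

For a constant power step, this update reproduces the Foster step response exactly at the sample instants, which gives a simple consistency check of the model.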
2.2. Identification of Model Parameters
The state-space thermal model of a power device is given by (7), and the model’s parameters may be extracted for temperature estimation. As can be seen in (7), the matrices of the state-space model are composed of the $RC$ parameters of the Foster network; therefore, only the $RC$ parameters need to be acquired.
As an equivalent circuit model, the parameters of the Foster network are fitted from the transient thermal impedance $Z_{th}(t)$. Meanwhile, $Z_{th}(t)$ can be easily obtained by finite element analysis (FEA). A transient thermal analysis of the model system, based on the dimensions and materials of the device and heat sink, is performed with commercial FEA software, i.e., ANSYS.
The design of the thermal analysis is implemented as follows. (a) The heat sink is cooled by forced-air convection, and the cooling surface is kept constant at 25 °C. (b) The IGBT module operates in a full-bridge inverter, as shown in Figure 5. The total power losses of the IGBT module are composed of conduction loss and switching loss, and can be estimated as follows [28,42,43,44]:

$$P = P_{cond} + P_{sw} = I_c V_{ce} + (E_{on} + E_{off}) f_{sw} \qquad (8)$$

where $P$ denotes the total power losses of the module, $P_{cond}$ denotes the conduction power loss, $P_{sw}$ denotes the switching power loss, $I_c$ is the collector current, $V_{ce}$ is the on-state collector-emitter voltage, $E_{on}$ and $E_{off}$ represent the turn-on and turn-off energy of the module, respectively, and $f_{sw}$ is the switching frequency. Based on the operation conditions in Table 1, the power losses of the module are estimated by (8). (c) The thermal analysis is run in ANSYS in transient mode for 10 s with a sampling interval of 0.001 s. Placing the power losses on the IGBT chips, the results of the thermal analysis for the IGBT module are obtained, as shown in Figure 6. An experimental thermal analysis platform was also set up. In the experiment, the IGBT module was operated under the same conditions as in the simulation, and the temperature distribution map of the upper surface of the device was obtained by an infrared camera, as shown in Figure 6. From Figure 6, we can see that the temperature distributions from the simulation and the experiment are consistent. The slight temperature difference may be caused by the difference in heat dissipation conditions.
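The loss estimate can be sketched in a few lines. The example below assumes the simplest reading of the definitions above: conduction loss as the product of collector current and on-state voltage, and switching loss as the sum of the turn-on and turn-off energies multiplied by the switching frequency. Duty-cycle weighting and the dependence of the switching energies on current and temperature are deliberately left out, and all numerical values are hypothetical rather than datasheet values.

```python
def igbt_power_loss(i_c, v_ce, e_on, e_off, f_sw):
    """Estimate IGBT losses in W.

    i_c   : collector current in A
    v_ce  : on-state collector-emitter voltage in V
    e_on  : turn-on energy per switching event in J
    e_off : turn-off energy per switching event in J
    f_sw  : switching frequency in Hz
    Returns (total, conduction, switching).
    """
    p_cond = v_ce * i_c
    p_sw = (e_on + e_off) * f_sw
    return p_cond + p_sw, p_cond, p_sw


# Hypothetical operating point (illustrative numbers only).
p_total, p_cond, p_sw = igbt_power_loss(
    i_c=20.0, v_ce=2.0, e_on=2.0e-3, e_off=1.5e-3, f_sw=10e3
)
print(f"P_cond = {p_cond:.1f} W, P_sw = {p_sw:.1f} W, P = {p_total:.1f} W")
```

In this simplified form the switching loss scales linearly with the switching frequency, which is what makes the switching frequency a convenient handle for regulating the total loss.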
By substituting the simulation results into (1), the transient thermal impedance curves $Z_{th}(t)$ for the IGBT are derived, as shown in Figure 7. It has been found that a fourth-order Foster network approximates the transient thermal impedance well. Adopting the least-squares fitting method, the values of the $RC$ components are obtained and shown in Table 1.
In this way, the state-space thermal model is built and can estimate the temperature in real time from the power loss, which depends on the electrical variables. In contrast to traditional temperature measurement methods, the proposed method can be applied to any type of power device, and it is economical and convenient.
4. Simulation Validation
In this section, the effectiveness of the proposed method, which is able to precisely control the values of the temperature swing $\Delta T_j$ and the mean temperature $T_m$, is validated by a numerical analysis. The control scheme, which uses the electrical variables to adjust the power losses and thereby control the module’s thermal cycling, is shown in Figure 8.
In the control scheme, the switching frequency $f_{sw}$, the load current, and the DC-link voltage are collected from the physical system. In addition, the device’s electrical characteristics are taken into account to identify the on-state voltage $V_{ce}$ and the switching energies $E_{on}$ and $E_{off}$. These electrical parameters are used to calculate the module’s power losses, $P$, which are fed to the state-space model to estimate the module’s temperature. The feedback controller regulates the electrical variables according to the set values of the temperature swing $\Delta T_j$ and the mean temperature $T_m$. In this section, the other electrical variables are held fixed to reduce the control complexity, while the switching frequency is selected as the only variable used to regulate the power loss for temperature control.
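To make the role of the switching frequency concrete, the sketch below inverts a simple loss relation to find the switching frequency that would produce a requested power loss. This is not the FTB controller; it only illustrates the actuation step under the assumption that the loss is the sum of a conduction term (current times on-state voltage) and a switching term (switching energies times switching frequency). The 5 kHz to 20 kHz limits follow the range used in the simulation settings of this section; the function name and numerical values are hypothetical.

```python
def switching_frequency_for_loss(p_target, i_c, v_ce, e_on, e_off,
                                 f_min=5e3, f_max=20e3):
    """Return the switching frequency (Hz) that yields p_target (W).

    Assumes P = v_ce*i_c + (e_on + e_off)*f_sw and clamps the result
    to the admissible range [f_min, f_max].
    """
    e_total = e_on + e_off
    if e_total <= 0.0:
        raise ValueError("switching energies must sum to a positive value")
    f_sw = (p_target - v_ce * i_c) / e_total
    return min(max(f_sw, f_min), f_max)


# Hypothetical operating point: 40 W of conduction loss, 3.5 mJ per cycle.
f_nominal = switching_frequency_for_loss(75.0, 20.0, 2.0, 2.0e-3, 1.5e-3)
f_low = switching_frequency_for_loss(0.0, 20.0, 2.0, 2.0e-3, 1.5e-3)
f_high = switching_frequency_for_loss(1000.0, 20.0, 2.0, 2.0e-3, 1.5e-3)
print(f_nominal, f_low, f_high)
```

The clamping shows the practical limit of this actuator: once the switching frequency saturates, the requested loss, and hence the requested temperature, can no longer be reached by frequency adjustment alone.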
The simulation analysis is designed as follows to eliminate the influence of varying operation conditions: (a) the heat sink is water-cooled, and the water-cooling runners are inside the heat sink, as shown in Figure 9. The heat sink material is aluminum, and the water-cooling runners keep the heat sink temperature constant at 25 °C; (b) the DC-link voltage is 100 V, and the load current is sinusoidal, as shown in Figure 10; (c) the base value of the switching frequency $f_{sw}$ is 10 kHz, and it can vary from 5 kHz to 20 kHz; (d) the converter modulation frequency is 10 Hz; (e) the simulation analysis is run in ANSYS in transient mode for 10 s. The results of the simulation test are described next.
Firstly, the effectiveness of the state-space model, which estimates the temperature from the power losses of the device, is demonstrated. The power module used in this section is shown in Figure 2, and the power losses are calculated with (8). The state-space model, which is composed of (6) and (7) with the parameters in Table 1, estimates the temperature on the basis of the currents shown in Figure 10. Meanwhile, the FEA of the power module is run in ANSYS to acquire the temperature for the same load currents. The junction temperature $T_j$ estimated by the state-space model is compared with the results from FEA in Figure 11.
The estimate via the state-space model is consistent with the FEA results during the various operation conditions. The correlation coefficient between the two results is more than 0.95, and the maximum error is about 1.2 °C or 1.6% of the total range for each waveform. The difference may be linked to errors inherent in the modeling process. This indicates that the state-space model can accurately estimate the junction temperature in real time.
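Agreement metrics of this kind can be computed as in the sketch below, which evaluates the Pearson correlation coefficient and the maximum absolute error between an estimated trace and a reference trace. The two short traces are hypothetical placeholders, not data from the simulation, and the function name is chosen for illustration.

```python
import math


def compare_traces(estimate, reference):
    """Return (Pearson correlation, maximum absolute error) of two traces."""
    n = len(estimate)
    if n != len(reference) or n < 2:
        raise ValueError("traces must have equal length of at least 2")
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimate, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0.0 or var_r == 0.0:
        raise ValueError("correlation is undefined for a constant trace")
    corr = cov / math.sqrt(var_e * var_r)
    max_err = max(abs(e - r) for e, r in zip(estimate, reference))
    return corr, max_err


# Hypothetical traces in °C: the estimate is offset by 0.5 °C from the reference.
estimate = [60.0, 62.0, 64.0, 62.0, 60.0]
reference = [60.5, 62.5, 64.5, 62.5, 60.5]
corr, max_err = compare_traces(estimate, reference)
print(f"correlation = {corr:.3f}, max error = {max_err:.2f} °C")
```

A constant offset leaves the correlation at one while still producing a non-zero maximum error, which is why both metrics are reported together.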
Additionally, the converter modulation frequency also affects the junction temperature response. Hence, the modulation frequency of the converter is varied to demonstrate the consistency of the $T_j$ estimate under various operation conditions. The $T_j$ estimates at modulation frequencies of 5 Hz and 10 Hz are shown in Figure 12. The $T_j$ estimate via the state-space model agrees with the FEA results. It can be seen that the swing $\Delta T_j$ in $T_j$ decreases as the frequency increases, whereas the mean temperature $T_m$ remains constant. This is attributed to the frequency response of the thermal system, which behaves like a low-pass filter.
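The low-pass behavior can be illustrated with the frequency response of a Foster network. The sketch below assumes that each $RC$ pair contributes $R_i/(1 + j\omega R_i C_i)$ to the thermal impedance and evaluates the gain magnitude at several frequencies. The $RC$ values are hypothetical and are not the fitted parameters of the module.

```python
import math


def zth_gain(freq_hz, rc_pairs):
    """Magnitude (K/W) of the Foster thermal impedance at freq_hz."""
    w = 2.0 * math.pi * freq_hz
    z = sum(r / (1.0 + 1j * w * r * c) for r, c in rc_pairs)
    return abs(z)


# Hypothetical 4th-order network (illustrative values only).
RC_PAIRS = [(0.05, 0.02), (0.10, 0.10), (0.20, 0.50), (0.15, 5.0)]

for f in (0.0, 5.0, 10.0):
    print(f"f = {f:4.1f} Hz  |Zth| = {zth_gain(f, RC_PAIRS):.4f} K/W")
```

At zero frequency the gain equals the total thermal resistance, which sets the mean temperature rise, while the gain seen by a loss ripple falls as its frequency rises. This matches the observation that the swing shrinks with frequency while the mean stays unchanged.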
The results described above indicate that, without using an observer, the temperature is still accurately obtained via the state-space model during various operating conditions.
Secondly, the effectiveness of the feedback controller based on FTB, which can precisely control the temperature, is illustrated. The results in Figure 11 clearly show that the thermal cycle $\Delta T_j$ is large enough to cause severe thermal damage and accelerate the fatigue of the module. Thus, the thermal cycle needs to be reduced to improve the power device’s reliability.
Recall from Section 3.2 that the set values of the temperature swing $\Delta T_j$ and the mean temperature $T_m$ should be properly selected on the basis of the practical application and the desired profiles to obtain the controller $K$. In this section, they are selected according to the load currents in Figure 10 and the temperature results in Figure 11, where the modulation frequency is 10 Hz. The values of $\Delta T_j$ and $T_m$ are set to 3 °C and 60 °C, respectively. From these set values, (19) and (23) give the parameters used to calculate the controller $K$. Based on the theorem of FTB, the controller $K$ is then obtained by solving the linear matrix inequalities (LMIs) (13) and (14). The controller changes the electrical variable (i.e., the switching frequency in this section) to regulate the power losses for ATC.
The designed controller has been tested under various operation conditions by simulation. Based on the mission profiles shown in Figure 10, the temperature results with and without control are shown in Figure 13. Compared to the temperature results without control, the designed controller effectively reduces both the amplitude of temperature fluctuations and the mean temperature. The variations of the temperature swing $\Delta T_j$ and the mean temperature $T_m$ are shown in Table 2.
According to the results in Table 2, the designed controller strongly reduces the temperature fluctuations during a fast-changing power demand, and it almost eliminates the effect of the varying power loss profiles. Moreover, the mean temperature with control is equal to the set value (i.e., 60 °C), and the temperature variations never exceed the boundary of 3 °C. This indicates that the designed controller is able to precisely control the temperature.
The simulation results with realistic load profiles demonstrate the ability of the designed controller to precisely control the temperature swing $\Delta T_j$ and the mean temperature $T_m$. As a consequence, the thermal stress of the devices can be reduced and the lifetime can be extended by the active thermal controller. Moreover, the controller can be adapted to any multi-layer structured power device to improve operational reliability.
5. Experimental Validation
In this section, the effectiveness of the proposed method is further demonstrated by an experimental study. The experimental setup, illustrated in Figure 14, consists of a power converter formed by the IGBT module shown in Figure 2 (the packaging of one IGBT module is intentionally removed), a control system to perform the ATC of the IGBT module, a gate driver to generate gate signals for the IGBT module, a DC power supply for the test currents, an IR camera to measure the junction temperature of the IGBT, and an aluminum heat sink to cool the IGBT module.
Adopting the control strategy of the simulation section, the switching frequency $f_{sw}$ is selected as the only variable used to regulate the module’s power losses in order to precisely control the junction temperature, and the test conditions are set as follows: (a) the heat sink is cooled by forced-air convection, and the temperature of the bottom surface is kept constant at 25 °C; (b) the DC-link voltage is constant, and the test current is sinusoidal, as shown in Figure 8; (c) the base value of the switching frequency $f_{sw}$ is 10 kHz, and it can change from 5 kHz to 20 kHz; (d) the experimental analysis is run in transient mode for 300 s. The results of the experimental test are described next.
Firstly, the effectiveness of the state-space model, which is proposed to estimate the junction temperature, is demonstrated. As described in Section 2.2, the $R$ and $C$ parameters have been obtained using the FEA method and are shown in Table 1. The electrical variables, including the switching frequency, the load current, and the DC-link voltage, are collected and used to calculate the power loss of the module according to (8). The state-space model, which is composed of (6) and (7) with the parameters in Table 1, estimates the junction temperature based on the calculated power losses. Meanwhile, the IR camera is used to measure the junction temperature. The temperature results from the state-space model and the IR camera are shown in Figure 15.
It is obvious that the estimate from the state-space model tracks the measurement from the IR camera accurately during the various operation conditions with a maximum error of 2.8%. The maximum difference between the two signals is located at the peaks of the temperature profile, and is about 2.2 °C, which may be linked to the errors inherited from the modeling process and/or measurement noise from the IR camera. The results indicate that the state-space model can accurately obtain the junction temperature information during various operation conditions.
Secondly, the effectiveness of the feedback controller based on FTB, which is able to precisely control the temperature, is illustrated. The temperature results in Figure 15 show that the thermal cycle $\Delta T_j$ has reached 10 °C, which will lead to severe thermal stress and, therefore, needs to be controlled to reduce the thermal damage.
As described in Section 3.2, the set values of the temperature swing $\Delta T_j$ and the mean temperature $T_m$ should be properly selected based on the practical application and the desired profiles to calculate the controller $K$. In this section, they are set according to the temperature results in Figure 15 and are consistent with the settings of the simulation section. The values of $\Delta T_j$ and $T_m$ are 3 °C and 60 °C, respectively, and are used to calculate the controller design parameters by (23). Substituting these parameters into (13) and (14), the thermal controller $K$ is obtained.
The designed controller has also been tested experimentally under various operation conditions. The test load profiles are based on the currents shown in Figure 10. The temperature results with and without control are shown in Figure 16. Compared to the temperature results without control, the designed controller reduces the module’s thermal cycling caused by the variations of the current profiles. The values of the temperature swing $\Delta T_j$ and the mean temperature $T_m$ are listed in Table 3.
As can be seen, the value of the temperature swing $\Delta T_j$ drops from 9.4 °C to 2.36 °C, and the value of the mean temperature $T_m$ drops from 67.3 °C to 59 °C. In addition, the value of $T_m$ with control is close to the set value, and the value of $\Delta T_j$ never exceeds the given boundary of 3 °C. It should be noted that Figure 16 only shows the test currents for one fixed operating setting, but other combinations of settings have also been tested and give very similar results.
To further demonstrate the effectiveness of the proposed method, the realistic reliability improvements achieved by decreasing thermal damage are presented. Typically, the reliability of the IGBT module can be indicated by the evolution of the on-state collector-emitter voltage $V_{ce}$. The evolution of $V_{ce}$ can be observed using a power cycling test. Power cycling tests under operation conditions with and without ATC are carried out, and the values of $V_{ce}$ under these two test conditions are measured continuously. The test results are shown in Figure 17. Compared with the test results with ATC, the values of $V_{ce}$ without ATC grow more during the power cycling test, and the difference between the two voltage signals increases monotonically, meaning that the IGBT module without ATC suffers greater thermal damage. This indicates that, in practical applications, the reliability of the IGBT can be improved by the proposed ATC method.