1. Introduction
In recent years, environmental protection and renewable energy have gained increasing attention [1], and in the automotive industry, traditional fuel vehicles are gradually being replaced by more environmentally friendly new energy vehicles. The electric motor is one of the essential components of a new energy vehicle, and permanent magnet synchronous motors (PMSMs) are widely used due to their high efficiency, simple structure, and high power density. However, the temperature inside the motor rises sharply during operation, posing risks of insulation failure and demagnetization [2] when thermal limits are exceeded. Estimating the temperature distribution inside the motor accurately and stably is therefore a key issue for practical use.
The temperature estimation methods for PMSMs are mainly classified into two categories: sensor-based and sensorless methods. Sensor-based methods directly measure the temperature at certain positions inside the motor using thermal sensors [3,4]. However, these methods incur additional costs and manufacturing complexity, making them unsuitable for large-scale industrial production. Moreover, repairing or replacing a failed sensor can be time-consuming and costly.
Sensorless methods can be further divided into direct and indirect methods. Indirect methods include flux observers [5,6] and signal injection [7,8]. Direct methods generally predict the temperature at internal positions of the motor by directly establishing a thermal model. Among direct methods, the lumped-parameter thermal network (LPTN) [9] is the most widely used; it replaces the motor with a set of nodes. The complex thermodynamic behavior inside the motor is equivalently modeled as interactions between these nodes, based on the flow paths of heat, the law of heat conservation, and the mechanism of heat generation [10]. Parameters such as thermal losses, thermal capacitances, and thermal resistances in this thermal model can be obtained through theoretical or empirical formulas [11], finite element analysis (FEA) [12], computational fluid dynamics (CFD), or various data-driven methods [13,14]. Another common approach treats temperature estimation as a time-series prediction problem [15,16,17], utilizing supervised learning to fit nonlinear relationships from data. However, purely data-driven methods commonly lack physical interpretability, may diverge from physical mechanisms, and fail to utilize the actual physical information of the motor.
Recently, the concept of physics-informed machine learning (PIML), also called physics-based deep learning (PBDL), has gained prominence. These approaches combine prior physical knowledge with data-driven methods, which is very helpful when training data are scarce, model generalization is limited, or certain physical constraints need to be satisfied. One approach adds the differential equations of the dynamic system as regularization terms in the loss function, corresponding to the physics-informed neural network (PINN) [18,19]; the backpropagated gradients then contain information provided by the differential equations. Another approach integrates a complete physical model with deep learning. In the context of motor temperature estimation, several potential integration patterns are illustrated in Figure 1. Among them, the pattern in which the neural network comes first often requires the physical model to be differentiable, namely, differentiable physics (DP) [20,21,22], so as to enable the backpropagation of gradients.
In this work, we propose a lightweight, end-to-end trainable framework for temperature estimation that integrates neural networks, differentiable physical models, and simulation results. Specifically, according to the real geometry, material properties, winding and cooling configurations, and other information of the investigated PMSM, we establish a corresponding thermal simulation model in MotorCAD, an electromechanical design software package. The simulation model provides the structure of the thermal network and simulated thermal parameters, including thermal losses, capacitances, and resistances, that serve as reasonable initial values. Considering the time-varying characteristic of thermal parameters, a neural network for parameter correction is introduced. The network dynamically adjusts the thermal parameters based on the real-time operating conditions and temperature distribution. The corrected parameters are then fed into the corresponding differentiable LPTN, which significantly improves the accuracy of temperature estimation. To the best of our knowledge, this is the first time in the literature that the integration of differentiable physics into motor temperature estimation has been investigated.
The principal conclusions drawn from this work highlight the effectiveness of the proposed method in accurately estimating motor temperature using both synthetic and real-world data. The integration of physical principles through a differentiable physics model not only improves the accuracy and robustness of temperature estimations but also maintains consistency with physical mechanisms. This method is deemed highly practical, offering a significant improvement over purely data-driven methods by incorporating physical model constraints and simulations, which result in more reliable and physically consistent outcomes.
2. Related Work
Most prior works based on LPTNs primarily focus on how to identify the thermal parameters. Veg and Laksar [23] established a seven-node LPTN for a high-speed permanent magnet synchronous motor and calculated thermal resistances and other parameters using heat transfer coefficients; the accuracy of this method, based on theoretical formulas, is limited. Choi et al. [13] utilized measured data under different operating conditions and employed the least-squares method to obtain a set of optimal fixed thermal parameters, but this method cannot ensure the physical consistency of the results and ignores the time-varying characteristic of thermal parameters. Wallscheid and Böcker [24] constructed a four-node LPTN for a 60 kW HEV permanent magnet synchronous motor. Using a global particle swarm optimization algorithm and extensive measured data, they identified the unknown coefficients in empirical formulas while considering various physical constraints and prior knowledge such as heat transfer theory. This method effectively incorporates prior knowledge into the optimization algorithm, but the explicit empirical formulas generally involve simplifications, making it difficult to capture different or more complex nonlinear patterns. Kirchgässner et al. [25] viewed the four-node LPTN as a recurrent neural network and proposed a so-called thermal neural network. At each time step, thermal parameters that lose their physical meaning are predicted directly by independent neural networks, the temperature is then computed after discretizing the differential equations of the corresponding LPTN, and the error between the estimated temperature and the ground truth is used to update the neural networks. However, their method predicts thermal parameters merely from data, still in a data-driven fashion. When the neural networks are discarded, the remaining part cannot work independently as a physical model, and the behavior of the neural networks is relatively uncontrollable and prone to violating physical consistency. Wang et al. [26] established a ten-node LPTN for an automotive PMSM and incorporated three independent neural networks to predict thermal parameters based on theoretical values. This is a feasible attempt at combining physical models with neural networks. However, they neglect the deviation between theoretical and real values of the thermal parameters, which limits the final accuracy and robustness and cannot ensure that the estimated temperatures at all nodes of the LPTN conform to physical reality when underconstrained. Additionally, their work lacks more in-depth experiments and analyses, as well as comparisons with other algorithms to validate the method and the rationality of certain settings.
3. Background
The main idea of the LPTN is to represent the various components inside the motor (such as the windings, stator, rotor, etc.) by lumped nodes and to model the heat flows through an equivalent circuit diagram. Each node has a thermal capacitance that characterizes the heat storage capacity of the corresponding component. There typically exists a thermal resistance between every pair of nodes, reflecting the heat transfer between the internal components of the motor. Additionally, several components generate power losses, such as copper loss and iron loss; these losses are the major factor driving changes in the internal temperature distribution. A schematic of the i-th node in a typical thermal network is illustrated in Figure 2.
For node $i$, based on heat transfer theory and the heat diffusion equation [27], the following simplified ordinary differential equation can be derived [25]:
$$C_i \frac{\mathrm{d}\theta_i}{\mathrm{d}t} = P_i + \sum_{j \neq i} \frac{\theta_j - \theta_i}{R_{i,j}}, \tag{1}$$
where $R$ denotes the thermal resistance between nodes, $C$ the thermal capacitance, $P$ the loss, and $\theta$ the temperature. The number of thermal resistances generally increases quadratically with the number of nodes. For a thermal network with $n$ nodes, the equations can be combined and written in the following matrix form:
$$\dot{\boldsymbol{\theta}}(t) = \mathbf{A}\,\boldsymbol{\theta}(t) + \mathbf{B}\,\mathbf{P}(t), \tag{2}$$
with
$$A_{i,j} = \frac{1}{C_i R_{i,j}} \;(i \neq j), \qquad A_{i,i} = -\sum_{j \neq i} \frac{1}{C_i R_{i,j}}, \qquad \mathbf{B} = \mathrm{diag}\!\left(\frac{1}{C_1}, \dots, \frac{1}{C_n}\right).$$
From the perspective of state space, the state variable $\boldsymbol{\theta}$ represents the temperature at each node, $\mathbf{A}$ is the state transition matrix, and $\mathbf{B}$ is the input matrix. If the matrices $\mathbf{A}$ and $\mathbf{B}$ are time-invariant, then given the initial temperature $\boldsymbol{\theta}_0$, the temperature $\boldsymbol{\theta}(t)$ at each time can be calculated as follows:
$$\boldsymbol{\theta}(t) = e^{\mathbf{A}t}\,\boldsymbol{\theta}_0 + \int_0^t e^{\mathbf{A}(t-\tau)}\,\mathbf{B}\,\mathbf{P}(\tau)\,\mathrm{d}\tau. \tag{3}$$
However, in practical situations, the matrices $\mathbf{A}$ and $\mathbf{B}$ vary with time, because the capacitances and resistances actually change with the operating point and the temperature distribution inside the motor. For example, as the speed increases, the thermal resistances related to ventilation may decrease accordingly. The losses vary with speed and torque during operation; thus, the total loss as well as the ratio between the individual losses is variable. Therefore, the key to improving the accuracy of temperature estimation lies in determining $\mathbf{C}$, $\mathbf{R}$, and $\mathbf{P}$ at each step, that is, the thermal capacitances, thermal resistances, and losses. Then, several numerical methods can be used to solve Equation (2), such as forward or backward Euler, Runge–Kutta methods, etc. Implicit methods generally have better numerical stability. Taking backward Euler as an example, the equation can be discretized as follows:
$$\frac{\boldsymbol{\theta}_{k+1} - \boldsymbol{\theta}_k}{\Delta t} = \mathbf{A}_{k+1}\,\boldsymbol{\theta}_{k+1} + \mathbf{B}_{k+1}\,\mathbf{P}_{k+1}, \tag{4}$$
then
$$\boldsymbol{\theta}_{k+1} = \left(\mathbf{I} - \Delta t\,\mathbf{A}_{k+1}\right)^{-1}\left(\boldsymbol{\theta}_k + \Delta t\,\mathbf{B}_{k+1}\,\mathbf{P}_{k+1}\right). \tag{5}$$
We can implement this equation in an automatic differentiation framework; since it is entirely matrix-based, the gradients will not be blocked.
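As an illustration, the following is a minimal PyTorch sketch of one such backward-Euler step, assuming a dense matrix of pairwise resistances and treating the node losses as the only inputs; the function name, shapes, and interfaces are illustrative rather than taken from the actual implementation.

```python
import torch

def lptn_step(theta, C, R, P, dt):
    """One backward-Euler step of a differentiable LPTN (cf. Equation (5)).

    A minimal sketch under assumed shapes:
      theta : (n,)   node temperatures at step k
      C     : (n,)   thermal capacitances
      R     : (n, n) pairwise thermal resistances; the diagonal is ignored (set it to 1)
      P     : (n,)   power losses injected at each node
      dt    : float  time step in seconds
    """
    n = theta.shape[0]
    off_diag = 1.0 - torch.eye(n, dtype=theta.dtype)
    G = off_diag / R                          # conductances 1/R_ij, zero on the diagonal
    A = G / C.unsqueeze(1)                    # A_ij = 1/(C_i R_ij) for i != j
    A = A - torch.diag(G.sum(dim=1) / C)      # A_ii = -sum_j 1/(C_i R_ij)
    B = torch.diag(1.0 / C)                   # input matrix
    I = torch.eye(n, dtype=theta.dtype)
    rhs = theta + dt * (B @ P)
    # Implicit update: solve (I - dt*A) theta_{k+1} = theta_k + dt*B*P_{k+1};
    # torch.linalg.solve keeps the step differentiable w.r.t. C, R, and P.
    return torch.linalg.solve(I - dt * A, rhs)
```

Because the linear solve is differentiable, gradients with respect to the thermal parameters propagate through every time step, which is what enables end-to-end training in the following framework.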
4. Differentiable Physics Temperature Estimation Framework
We have implemented a differentiable LPTN in PyTorch and incorporated a neural network to dynamically correct the thermal parameters online. The overall estimation framework is shown in Figure 3, which illustrates the flow path for estimating the temperature at each time step. In general, the raw simulated thermal parameters are first optimized to obtain values more in line with reality (thermal parameter optimization) and then fine-tuned by a neural network to compensate for the relatively small time-varying changes (dynamic correction). These thermal parameters are then used to solve Equation (2) and obtain the estimated temperatures (differentiable LPTN), which are passed to the loss calculation and gradient backpropagation during training. A detailed explanation of the different components is provided in the following subsections.
4.1. Thermal Parameter Optimization
For a thermal network with $n$ nodes, there typically exist $n$ thermal capacitances, up to $n(n-1)/2$ thermal resistances, and fewer than $n$ thermal losses. The simulated values (SVs) of these thermal parameters, exported directly from the simulation software, are based on relevant physical theories and empirical formulas and therefore often diverge from their real-world counterparts due to model simplification, the diversity of operating conditions, and environmental effects. This discrepancy can degrade the accuracy of the estimation model. Hence, before directly utilizing these simulated thermal parameters, it is crucial to optimize them to better align with the measured data, which is the key step in enhancing the final estimation accuracy.
Therefore, we add a scaling ratio vector $\boldsymbol{r}$, corresponding to the thermal capacitances and resistances, into our framework as a learnable parameter. By element-wise multiplying the simulated values of the capacitances $\mathbf{C}^{\mathrm{SV}}$ and resistances $\mathbf{R}^{\mathrm{SV}}$ with $\boldsymbol{r}$, we obtain the optimized values (OVs) of these thermal parameters, namely, the optimized capacitances $\mathbf{C}^{\mathrm{OV}}$ and resistances $\mathbf{R}^{\mathrm{OV}}$. That is,
$$\mathbf{C}^{\mathrm{OV}} = \boldsymbol{r}_C \odot \mathbf{C}^{\mathrm{SV}}, \qquad \mathbf{R}^{\mathrm{OV}} = \boldsymbol{r}_R \odot \mathbf{R}^{\mathrm{SV}},$$
where the learnable $\boldsymbol{r} = [\boldsymbol{r}_C, \boldsymbol{r}_R]$ is updated via gradient descent to improve the final temperature estimation accuracy during the training process.
For the simulated values of the losses $\mathbf{P}^{\mathrm{SV}}$, the current operating condition $\boldsymbol{x}$ (including speed and torque) is first used to determine the total loss $P_{\mathrm{total}}$ from a lookup table (LUT) derived from real-world motor testing. By normalizing $\mathbf{P}^{\mathrm{SV}}$ (i.e., element-wise division by its sum) and then multiplying it by the total loss, a more accurate $\mathbf{P}^{\mathrm{OV}}$ is obtained. That is,
$$\mathbf{P}^{\mathrm{OV}} = P_{\mathrm{total}} \cdot \frac{\mathbf{P}^{\mathrm{SV}}}{\sum_i P_i^{\mathrm{SV}}}.$$
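As a concrete illustration of this step, the sketch below wraps the scaling ratios as learnable PyTorch parameters and rescales the simulated loss split to the LUT total. The class name, the per-entry resistance ratios, and the interfaces are assumptions for illustration, not the actual code.

```python
import torch
import torch.nn as nn

class ThermalParamOptimizer(nn.Module):
    """Learnable scaling of the simulated thermal parameters (sketch)."""

    def __init__(self, C_sv, R_sv):
        super().__init__()
        self.register_buffer("C_sv", C_sv)          # simulated capacitances, shape (n,)
        self.register_buffer("R_sv", R_sv)          # simulated resistances, shape (n, n)
        # scaling ratio vector r: one entry per capacitance and per resistance, initialised to 1
        self.r_C = nn.Parameter(torch.ones_like(C_sv))
        self.r_R = nn.Parameter(torch.ones_like(R_sv))

    def forward(self, P_sv, P_total):
        C_ov = self.r_C * self.C_sv                 # optimized capacitances
        R_ov = self.r_R * self.R_sv                 # optimized resistances
        P_ov = P_total * P_sv / P_sv.sum()          # normalised SV losses rescaled to the LUT total
        return C_ov, R_ov, P_ov
```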
4.2. Dynamic Correction
After obtaining the optimized thermal parameters $\mathbf{C}^{\mathrm{OV}}$, $\mathbf{R}^{\mathrm{OV}}$, and $\mathbf{P}^{\mathrm{OV}}$, and considering the time-varying characteristic of these parameters, we introduce a neural network into our framework. Taking into account the mechanisms of change and the influencing factors of these thermal parameters, the network takes as input the operating conditions $\boldsymbol{x}$ (such as speed, torque, coolant temperature, and ambient temperature) and the estimated temperatures of all nodes at the previous time step. It then outputs the correction vectors $\boldsymbol{k}_C$, $\boldsymbol{k}_R$, and $\boldsymbol{k}_P$, corresponding to $\mathbf{C}^{\mathrm{OV}}$, $\mathbf{R}^{\mathrm{OV}}$, and $\mathbf{P}^{\mathrm{OV}}$, respectively. The learnable weight of the network is $\boldsymbol{w}$. This step allows the optimized thermal parameters to be fine-tuned dynamically to improve the final accuracy of temperature estimation. For the $i$-th node of the lumped-parameter thermal network model at time $t$, its loss $P_i^t$, thermal capacitance $C_i^t$, and thermal resistance $R_{i,j}^t$ between node $i$ and node $j$ are adjusted accordingly, that is,
$$P_i^t = k_{P,i}^t \, P_i^{\mathrm{OV}}, \qquad C_i^t = k_{C,i}^t \, C_i^{\mathrm{OV}}, \qquad R_{i,j}^t = k_{R,ij}^t \, R_{i,j}^{\mathrm{OV}}.$$
Using these corrected thermal parameters, the temperature at the next time step can be calculated by Equation (5) and then used for loss calculation as well as gradient backpropagation.
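A minimal sketch of such a correction network is given below, using the hidden-layer sizes and Hardswish activation reported later in Section 5.3. The exponential output mapping that keeps the correction ratios positive is an assumed parameterisation, and all names are illustrative.

```python
import torch
import torch.nn as nn

class CorrectionNet(nn.Module):
    """Dynamic-correction network (sketch with assumed sizes and output mapping)."""

    def __init__(self, n_cond, n_nodes, n_res, n_loss, hidden=(32, 64)):
        super().__init__()
        self.sizes = [n_loss, n_nodes, n_res]        # split sizes for k_P, k_C, k_R
        self.net = nn.Sequential(
            nn.Linear(n_cond + n_nodes, hidden[0]), nn.Hardswish(),
            nn.Linear(hidden[0], hidden[1]), nn.Hardswish(),
            nn.Linear(hidden[1], sum(self.sizes)),
        )

    def forward(self, cond, theta_prev):
        # inputs: operating conditions and the previous node temperatures
        out = self.net(torch.cat([cond, theta_prev], dim=-1))
        k_P, k_C, k_R = torch.split(out, self.sizes, dim=-1)
        # exponentiation keeps the correction ratios strictly positive
        return torch.exp(k_P), torch.exp(k_C), torch.exp(k_R)
```

The corrected parameters are then the element-wise products of these ratios with the optimized values, which feed the LPTN step above.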
To avoid parameter coupling between $\boldsymbol{r}$ and $\boldsymbol{w}$ and to limit the feasible regions of the parameters during the actual training of the proposed framework, it is better to conduct the training in two steps. First, the scaling ratio vector $\boldsymbol{r}$ is trained to obtain the optimized thermal parameters. This step significantly reduces the temperature estimation error and, owing to the small number of learnable parameters in $\boldsymbol{r}$, is unlikely to result in overfitting. Then, the correction network $\boldsymbol{w}$ is trained to represent the time-varying characteristics of the thermal parameters. At this point, with the error already reduced after the first step, the initial phase of training is less prone to challenges such as gradient explosion, severe fluctuations, or falling into poorly generalizing local minima.
4.3. Loss and Backpropagation
The corrected thermal losses, capacitances, and resistances are fed into the subsequent differentiable LPTN to estimate the temperature. The estimated temperature is then compared with the true temperature, and finally the gradients are backpropagated to update $\boldsymbol{r}$ and $\boldsymbol{w}$.
In this work, the loss function includes not only the error between the estimated temperature and the true measured temperature at each time step, denoted as $\mathcal{L}_{\theta}$, but also an additional term related to the error between the estimated and measured temperature change rates, denoted as $\mathcal{L}_{\dot{\theta}}$. This transient characteristic is primarily introduced by the thermal capacitances, so adding this loss term is also beneficial for training. The weighting of the two terms is adjusted by the coefficient $\lambda$, i.e., $\mathcal{L} = \mathcal{L}_{\theta} + \lambda \mathcal{L}_{\dot{\theta}}$. The hyperparameter $\lambda$ leads to different learning curves and accuracy.
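A compact sketch of this combined loss is shown below, assuming MSE for both terms and finite differences for the temperature change rates; the exact weighting and error metric used in practice may differ.

```python
import torch

def thermal_loss(theta_hat, theta_true, dt, lam):
    """Combined loss L = L_theta + lambda * L_dtheta (sketch).

    theta_hat, theta_true : (T, n) estimated and measured node temperatures
    dt                    : time step in seconds
    lam                   : weighting coefficient lambda
    """
    l_theta = torch.mean((theta_hat - theta_true) ** 2)
    d_hat = (theta_hat[1:] - theta_hat[:-1]) / dt      # estimated temperature change rate
    d_true = (theta_true[1:] - theta_true[:-1]) / dt   # measured temperature change rate
    l_dtheta = torch.mean((d_hat - d_true) ** 2)
    return l_theta + lam * l_dtheta
```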
One can see that temperature estimation is essentially an iterative process that requires the real-time operating conditions and the temperature information of the previous time step. The proposed framework therefore works like a recurrent neural network (RNN). To avoid excessively long sequences that incur gradient explosion or vanishing, we employ truncated backpropagation through time (TBPTT), a method commonly used to train RNN-like networks. As shown in Figure 4, we manually truncate the temperature sequence into smaller segments and then backpropagate the errors through these segments during training.
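The sketch below shows how a single temperature sequence can be cut into fixed-length segments and trained with TBPTT; `step_fn` stands in for the whole pipeline of Figure 3 (correction network plus differentiable LPTN step), and the segment length and interfaces are assumptions.

```python
import torch

def train_tbptt(step_fn, loss_fn, optimizer, conditions, theta_true, dt, seg_len=1024):
    """Truncated backpropagation through time over one measurement sequence (sketch)."""
    theta = theta_true[0]
    T = theta_true.shape[0]
    for start in range(0, T - 1, seg_len):
        end = min(start + seg_len, T - 1)
        theta = theta.detach()                        # cut the computation graph between segments
        preds = []
        for k in range(start, end):
            theta = step_fn(theta, conditions[k], dt)  # one estimated time step
            preds.append(theta)
        loss = loss_fn(torch.stack(preds), theta_true[start + 1 : end + 1])
        optimizer.zero_grad()
        loss.backward()                               # gradients flow only within the segment
        optimizer.step()
    return theta
```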
5. Simulation
In this section, we first establish a fine-grained simulation model of the PMSM based on MotorCAD. Then, we generate simulation data under various operating conditions to validate the effectiveness of the proposed method. Finally, we investigate the performance and behavior of the framework under different settings through multiple experiments.
5.1. Thermal Simulation Model
The motor investigated in this work is an 8-pole, 48-slot PMSM designed for automotive use. The fundamental geometric and material parameters are presented in Table 1. The motor's hairpin winding consists of 5 layers connected in a Y configuration. To establish a corresponding simulation model in the MotorCAD software, we first specify the detailed actual geometric parameters in the geometry panel, including radial and axial dimensions, for example, the stator inner and outer diameters, axial length, slot depth and width, number of permanent magnet layers and the length and angle of each layer, shaft diameter, cooling duct diameters, etc. The configured radial section, axial section, and 3D view are shown in Figure 5.
Then, it is necessary to set the specific connection of the winding. The software supports directly selecting hairpin windings and allows customization of the winding connections. The customized winding connections are shown in
Figure 6.
Once the materials of the stator, rotor, and permanent magnets are set, the software provides the material-related properties such as thermal conductivity, specific heat, and density. For the thermal simulation, the cooling of this motor includes housing water jacket cooling, rotor water jacket cooling, and winding end spray cooling, as can be seen in Figure 5. The temperature of these coolants is controllable and measurable.
Finally, we can manually formulate duty cycle data for the transient temperature calculation. The duty cycle definitions mainly include torque–speed, loss–speed, and current–speed profiles. During calculation, MotorCAD builds a thermal network based on the actual information of the motor and obtains simulated values for the thermal parameters through theoretical and empirical formulas. The fine-grained simulation LPTN includes 135 nodes and is based on the actual geometric parameters, material properties, windings, and cooling system configurations. Subsequently, a simplified thermal model consisting of 10 nodes is developed, as shown in Figure 7 and Table 2.
Apart from the thermal resistances between the coolant nodes, there are in total 42 thermal resistances. Similarly, the software provides simulated values for the thermal parameters of the simplified thermal model, including the torque–speed grid loss data, $\mathbf{C}^{\mathrm{SV}}$, and $\mathbf{R}^{\mathrm{SV}}$. The torque–speed grid loss data are utilized to obtain $P_{\mathrm{total}}$ by bilinear interpolation.
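For illustration, the bilinear interpolation over the torque–speed grid can be performed with SciPy as sketched below; the grid ranges and loss values are placeholders, not data of the investigated motor.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical torque-speed loss grid standing in for the data exported from the simulation model.
speed_grid = np.linspace(0.0, 12000.0, 25)         # rpm
torque_grid = np.linspace(0.0, 300.0, 31)          # Nm
total_loss_grid = np.ones((25, 31)) * 1500.0       # W, placeholder values

lut = RegularGridInterpolator(
    (speed_grid, torque_grid), total_loss_grid,
    method="linear", bounds_error=False, fill_value=None,
)

# Bilinear interpolation of the total loss at an arbitrary operating point (speed, torque).
P_total = lut([[4500.0, 120.0]])[0]
```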
5.2. Synthetic Data
We randomly select candidate operating points within the motor's maximum torque/speed curve to construct a specific set of operating conditions. These conditions are then imported into MotorCAD, and the fine-grained thermal model is simulated to obtain temperature data as ground truth. With the simplified thermal model and the corresponding simulated thermal parameters, our proposed method is employed to enhance the temperature estimation accuracy of the nodes in the simplified thermal model, thereby validating the effectiveness of our approach. Different sets of candidate operating points, indicated by circles in Figure 8, are used for generating the training and testing conditions to avoid overlap.
We finally generated 30 training conditions (20 transient + 10 steady) and 10 testing conditions (5 transient + 5 steady). Each condition has a duration of 800 s and a sampling frequency of 2 Hz.
5.3. Validation Based on Synthetic Data
As described in the previous section, we first optimize the simulated thermal parameters directly exported from the software on all training data using gradient descent to obtain $\mathbf{C}^{\mathrm{OV}}$ and $\mathbf{R}^{\mathrm{OV}}$. The training process comprises 1400 epochs with a small learning rate of $1 \times 10^{-5}$, and the error curve during training is shown in Figure 9.
Then, we configure the neural network with two hidden layers of 32 and 64 neurons, respectively, and use Hardswish [28] as the activation function. The optimizer is Adam with an initial learning rate of $1 \times 10^{-4}$ and a cosine annealing decay strategy. Training runs for 1200 epochs with a TBPTT segment size of 1024 and the mean squared error (MSE) as the loss function. The error curves of the mean absolute error (MAE) and MSE over the 7 nodes are shown in Figure 10.
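The corresponding optimizer and scheduler setup in PyTorch is sketched below; the network is a stand-in with illustrative input and output sizes, and the per-epoch TBPTT pass is left as a stub.

```python
import torch
import torch.nn as nn

# Training setup matching the values above: Adam, initial lr 1e-4, cosine annealing, 1200 epochs, MSE loss.
model = nn.Sequential(
    nn.Linear(11, 32), nn.Hardswish(),   # 11 inputs: illustrative (conditions + node temperatures)
    nn.Linear(32, 64), nn.Hardswish(),
    nn.Linear(64, 56),                   # 56 outputs: illustrative (stacked correction vectors)
)
criterion = nn.MSELoss()
epochs = 1200
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... one TBPTT pass over all training conditions goes here ...
    scheduler.step()
```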
Figure 11 shows the estimation results of the proposed method. Compared with the results calculated merely based on simulation parameters, it can be seen that the proposed method can achieve excellent accuracy in areas with drastic temperature changes.
To better understand the behaviors of the network, further exploration of model interpretability is conducted. It is meaningful to observe the distribution of correction ratios. Hence, we create a histogram that represents the frequency distribution of correction ratios for all thermal resistances and thermal capacitances in the testing set, as shown in
Figure 12 and
Table 3. This provides insights into how the corrections are distributed across different components and nodes in the thermal model.
The correction ratios for thermal capacitances are reasonably balanced, showing neither over-correction nor under-correction. The correction magnitudes are relatively small, such as for the stator yoke and rotor. It is also observed that the optimized values for the magnet, Wdg_R, Wdg_A, and tooth are generally larger, leading to correction ratios all less than 1. A similar analysis can be applied to the correction ratios for thermal resistances. The correction magnitudes are mainly distributed between 0.8 and 1.5, indicating subtle rather than drastic adjustments. It is noteworthy that the network actually has the ability to output very small or large correction ratios.
5.4. Ablation Study
Based on synthetic data, we have conducted the following three ablation studies, with the final errors on the testing set shown in
Table 4.
5.4.1. The Importance of Simulation Values
We first investigated the necessity of $\mathbf{P}^{\mathrm{SV}}$, $\mathbf{C}^{\mathrm{SV}}$, and $\mathbf{R}^{\mathrm{SV}}$, i.e., whether the introduction of the simulation values has an impact on the final temperature estimation accuracy. When making predictions without relying on the simulation values, the network directly predicts parameter values instead of ratios. In this situation, at initialization the total loss is evenly distributed among the seven nodes. For the resistances, considering that most of the simulation values are small, all thermal resistances are randomly initialized with a mean of $1/e$, and the network's outputs undergo exponentiation with base $e$ to obtain the final predicted thermal resistances. For the capacitances, since the simulation values are in the range of hundreds to thousands, each node's thermal capacitance is initialized to around 1200, and the outputs of the network undergo exponentiation with base 10 to obtain the final predicted thermal capacitances. Such a conversion also ensures non-negativity. Furthermore, experiments are conducted with different data sizes, including all data (20 + 10), twelve transient and eight steady conditions (12 + 8), and seven transient and three steady conditions (7 + 3).
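A small sketch of this "predict values instead of ratios" parameterisation is given below; the raw-output distributions are illustrative, chosen only so that the initial means land near $1/e$ for the 42 resistances and near 1200 for the 7 capacitances, as described above.

```python
import torch

# Exponentiation of raw network outputs keeps the predicted parameters strictly positive.
raw_R = torch.randn(42) * 0.1 - 1.0     # raw outputs centred at -1 (illustrative)
R_pred = torch.exp(raw_R)               # exp(-1) = 1/e, so predicted resistances start near 1/e

raw_C = torch.randn(7) * 0.05 + 3.08    # 10**3.08 is roughly 1200 (illustrative)
C_pred = 10.0 ** raw_C                  # base-10 exponentiation, predicted capacitances near 1200
```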
5.4.2. Loss Term
For the loss function $\mathcal{L} = \mathcal{L}_{\theta} + \lambda \mathcal{L}_{\dot{\theta}}$, the weight of the temperature-change-rate term can be adjusted by the coefficient $\lambda$. As mentioned before, the thermal network's transient characteristics are caused mainly by the thermal capacitances; intuitively, adding a transient-related loss term can benefit the training of the neural network. Therefore, we compare four sets of experiments: three different values of $\lambda$ and a baseline using only $\mathcal{L}_{\theta}$. It is important to note that previous research is based on $\mathcal{L}_{\theta}$ alone. Depending on the chosen $\lambda$, the ratio between $\mathcal{L}_{\theta}$ and $\lambda \mathcal{L}_{\dot{\theta}}$ ranges from approximately 10:1 to about 1:1.
5.4.3. Without Correcting One
As shown in Figure 3, considering the time-varying characteristic of the thermal parameters, dynamic correction is applied to the thermal capacitances, resistances, and losses, namely, $\boldsymbol{k}_C$, $\boldsymbol{k}_R$, and $\boldsymbol{k}_P$, respectively. To examine the impact and necessity of the dynamic correction, three settings are compared: (1) without correcting the capacitances; (2) without correcting the resistances; (3) without correcting the losses. In each setting, the corresponding parameters remain at their optimized values rather than being dynamically corrected during training and testing.
7. Discussion
Previous studies have largely focused on either purely data-driven methods or models heavily reliant on physical principles without integrating the advantages of machine learning techniques. This paper proposes a temperature estimation framework that integrates physical information with data-driven methods. The proposed framework effectively combines neural networks, differentiable physical models, and simulation results and addresses the limitations of purely data-driven methods (lack of physical interpretability and potential divergence from physical principles) and purely physical models (rigidity and potential inaccuracies in modeling complex real-world phenomena). The effectiveness of this method is validated by using both synthetic data and measured data, including a thorough ablation study of various settings, diverse comparisons with common data-driven methods, and the exploration of temperature estimation for the node without any associated labels. Due to the incorporation of physical principles, the output temperatures are more reasonable and robust, and the overall results exhibit better physical consistency. This method holds significant practical value and is crucial for optimizing motor performance, extending lifespan, and ensuring safety in applications where thermal management is critical.
While the current findings are promising, several future research directions can further enhance the framework’s applicability:
Validating the proposed method’s effectiveness and generalization ability by utilizing a more extensive and diverse set of real-world data;
Investigating other neural network architectures, such as graph neural networks (GNNs) or convolutional neural networks (CNNs), could provide insights into their efficacy in capturing temporal dynamics and spatial relationships within motor systems;
Implementing the framework in real-time control systems and validating its performance in operational environments would be a crucial step toward its industrial application.