*3.2. Loss Function*

The SMD-Net is trained with the training data $\{\mathbf{p}_i, \mathbf{p}_{c_i}\}_{i=1}^{N_{tr}}$, in which $\mathbf{p}_i$ and $\mathbf{p}_{c_i}$ are the measurements and the labels, respectively, and $N_{tr}$ is the number of training samples. The loss function is defined as:

$$\mathrm{loss} = \frac{\rho_1}{N_{tr}} \sum_{i=1}^{N_{tr}} \left\| \hat{\mathbf{p}}_{c_i} - \mathbf{p}_{c_i} \right\|_2^2 + \frac{\rho_2}{N \times N_{tr}} \sum_{k=1}^{N} \sum_{i=1}^{N_{tr}} \left\| \mathbf{N}^{-1} \left( \mathbf{N} \left( \mathbf{h}_i^{(k)} \right) \right) - \mathbf{h}_i^{(k)} \right\|_2^2 \tag{22}$$

where $N$ denotes the total number of SMD-Net blocks; $\rho_1$ and $\rho_2$ are the weights of the two constraint terms; $\hat{\mathbf{p}}_{c_i}$ and $\mathbf{p}_{c_i}$ are the $i$th estimated interferogram and the $i$th ideal interferogram, respectively; and $\mathbf{h}_i^{(k)}$ represents the residual error in the $k$th SMD-Net block.
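
As a rough illustration of how the two terms of Eq. (22) combine, the following NumPy sketch evaluates the loss for a batch of samples. It is not the authors' implementation: the function name `smd_net_loss`, the array layout (one row per training sample), and the callables `transform` and `inverse_transform`, which stand in for the operator written $\mathbf{N}(\cdot)$ and its inverse in Eq. (22), are all assumptions made for this example. The weights `rho1` and `rho2` must be supplied by the caller, since no values are given here.

```python
import numpy as np


def smd_net_loss(p_c_hat, p_c, h_blocks, transform, inverse_transform, rho1, rho2):
    """Illustrative evaluation of Eq. (22); all names are hypothetical.

    p_c_hat, p_c : arrays of shape (N_tr, D), estimated and ideal interferograms.
    h_blocks     : list of N arrays of shape (N_tr, D), one per SMD-Net block.
    transform, inverse_transform : callables standing in for N(.) and N^{-1}(.).
    rho1, rho2   : weights of the two terms.
    """
    n_tr = p_c.shape[0]        # number of training samples, N_tr
    n_blocks = len(h_blocks)   # number of SMD-Net blocks, N

    # First term: squared L2 error between estimate and label, averaged over samples.
    # np.abs keeps the expression valid for complex-valued data as well.
    fidelity = np.sum(np.abs(p_c_hat - p_c) ** 2) / n_tr

    # Second term: squared L2 distance between N^{-1}(N(h)) and h,
    # averaged over all blocks and all samples.
    round_trip_error = 0.0
    for h in h_blocks:
        round_trip = inverse_transform(transform(h))
        round_trip_error += np.sum(np.abs(round_trip - h) ** 2)
    round_trip_error /= n_blocks * n_tr

    return rho1 * fidelity + rho2 * round_trip_error
```

When `inverse_transform` exactly undoes `transform`, the second term vanishes and the loss reduces to the weighted data-fidelity term; any mismatch between the two operators is penalized in proportion to `rho2`.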
