*4.4. Feature Fusion*

Conventional CNN–LSTM is a stacked structure, in which the features extracted by the CNN are passed on to the LSTM for further processing. In contrast, this paper adopted a parallel CNN–LSTM pattern: the CNN and the LSTM extract features independently, and the features they extract are then merged with the flattened statistical components. The resulting fusion features are therefore multi-scale and multi-domain (time and statistical domains), and are expressed as:

$$\text{Fusion}_{\text{features}} = \text{Concatenate}(\text{CNN}_{\text{features}}, \text{LSTM}_{\text{features}}, \text{Filter}(\text{Reshape}(\text{Statistics}(\mathbf{x}(t))))) \tag{23}$$
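
The composition in Equation (23) can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. The statistics chosen (mean, standard deviation, minimum, maximum), the index-based `filter_features` selection, and all function names are hypothetical stand-ins for the Statistics, Reshape, and Filter operators, and the CNN and LSTM feature vectors are assumed to be already computed.

```python
import statistics
from typing import List, Sequence


def statistics_features(window: Sequence[Sequence[float]]) -> List[List[float]]:
    """Statistics(x(t)): per-channel statistics (assumed set: mean, std, min, max)."""
    return [
        [statistics.fmean(ch), statistics.pstdev(ch), min(ch), max(ch)]
        for ch in window
    ]


def reshape_flat(nested: Sequence[Sequence[float]]) -> List[float]:
    """Reshape: flatten the (channels x statistics) table into one vector."""
    return [value for row in nested for value in row]


def filter_features(flat: Sequence[float], keep: Sequence[int]) -> List[float]:
    """Filter: keep only the selected components (assumed to be index-based)."""
    return [flat[i] for i in keep]


def fuse(
    cnn_features: Sequence[float],
    lstm_features: Sequence[float],
    window: Sequence[Sequence[float]],
    keep: Sequence[int],
) -> List[float]:
    """Concatenate the CNN, LSTM, and filtered statistical features."""
    stats = filter_features(reshape_flat(statistics_features(window)), keep)
    return list(cnn_features) + list(lstm_features) + stats


# Toy example: two channels, three time steps each.
window = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
fused = fuse([0.1, 0.2], [0.3], window, keep=[0, 2, 3, 4])
print(fused)  # [0.1, 0.2, 0.3, 2.0, 1.0, 3.0, 4.0]
```

The point of the sketch is that the two branches run in parallel: neither feature vector is fed into the other network, and the fusion itself is a plain concatenation along the feature axis.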
