*4.1. Case One*

4.1.1. Dataset Preparation and Parameter Settings

The NetCMAs system is equipped with vibration, stress, and temperature sensors, among others, which monitor the status of the whole port crane system in real time. The system has 32 sampling channels, with sampling points distributed over the T frame of the upper beam area, the beam rod load area, the lifting motor, the gearbox, the car motor, etc. The gearbox measurement positions are as follows: V-directional and H-directional vibration on the left side of the high-speed shaft, temperature on both the left and right sides of the high-speed shaft, and V-directional vibration on the left side of the low-speed shaft. The sampling frequency of the detection system is 2.5 kHz, the sampling time is 0.8 s, and the sampling interval is 10 s. To avoid continuous data transmission and the associated storage consumption, the valid values of each sampling period are calculated and saved. Figure 5 shows the driving part of the port crane lifting mechanism. Figure 6 shows the reduction gearbox and bearings. Figure 7 shows the installation position of the vibration sensor and the damaged bearing.
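The acquisition settings above can be illustrated with a short sketch. It computes the number of points in one sampling window and reduces the window to a single saved value. The text does not say which statistic is stored per sampling period; the root mean square (RMS) used here is an assumption, and the sine test signal and the name `window_rms` are hypothetical.

```python
import numpy as np

FS_HZ = 2500.0   # sampling frequency from the text (2.5 kHz)
WINDOW_S = 0.8   # sampling time of one period from the text
N_POINTS = int(round(FS_HZ * WINDOW_S))  # points acquired per window


def window_rms(x):
    """Root mean square of one window (assumed summary statistic)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))


# Synthetic stand-in for one vibration window: a 50 Hz sine of amplitude 2.
t = np.arange(N_POINTS) / FS_HZ
signal = 2.0 * np.sin(2.0 * np.pi * 50.0 * t)
rms_value = window_rms(signal)
```

Storing one value per window instead of the full waveform is what keeps transmission and storage small; the choice of statistic would depend on the monitoring system.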

**Figure 5.** Lifting mechanism of port crane.

**Figure 6.** Gear box of lifting mechanism of port crane. (**a**) Drive gear set for installation. (**b**) Gear bearing.

**Figure 7.** Experimental settings. (**a**) Installation position of vibration sensor. (**b**) Damaged bearing.

The dataset used in this case comprises four years of data from the No. 8114 lifting bridge of a port crane, recorded from the time the gearbox was first installed until the time of failure. Because of the extremely large volume of data, only representative data are shown in Figure 8. Some shock components can be seen in the waveform. Based on our practical application experience, all data are classified into four categories: run-in period (R), healthy (H), sub-healthy (SH), and failure (F).

**Figure 8.** V-direction vibration waveform of the gearbox.

After each new gearbox is installed, it goes through a run-in period before entering a healthy state. After a period of operation, wear brings the equipment into a sub-healthy state. Eventually, the damage becomes so significant that the equipment enters a fault state. Table 1 describes in detail the label and quantity information of the four state data in the experiment. Sample labels are one-hot encoded, and each state contains 100 samples. The dimension of each sample is 1600, so the dimensions after the 2D conversion are 40 × 40. To ensure the speed of iteration, the batch size is set between 20 and 40.
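The label encoding and 2D conversion described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the label order in `STATES`, the row-by-row fill of the 40 × 40 array, and the helper names `one_hot` and `to_image` are assumptions, and random values stand in for the real vibration samples.

```python
import numpy as np

STATES = ["R", "H", "SH", "F"]  # label order is an assumption
SAMPLE_LEN = 1600               # sample dimension from the text
SIDE = 40                       # 40 x 40 = 1600


def one_hot(indices, num_classes):
    """Return a (len(indices), num_classes) one-hot matrix."""
    indices = np.asarray(indices)
    encoded = np.zeros((indices.size, num_classes), dtype=np.float32)
    encoded[np.arange(indices.size), indices] = 1.0
    return encoded


def to_image(sample):
    """Reshape a 1600-point sample to 40 x 40, filling row by row (assumed)."""
    sample = np.asarray(sample)
    if sample.size != SAMPLE_LEN:
        raise ValueError("expected a sample with 1600 points")
    return sample.reshape(SIDE, SIDE)


# 100 samples per state, as in the text; random data replaces real signals.
rng = np.random.default_rng(0)
signals = rng.standard_normal((len(STATES) * 100, SAMPLE_LEN))
labels = np.repeat(np.arange(len(STATES)), 100)

images = np.stack([to_image(s) for s in signals])  # shape (400, 40, 40)
targets = one_hot(labels, len(STATES))             # shape (400, 4)
```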

**Table 1.** Four types of fault status information.


4.1.2. Experiment and Analysis

To demonstrate the effectiveness of AF-CDN, a simulation comparison is performed. Table 2 compares the algorithm proposed in this paper with a 1D-CNN algorithm that takes the FFT of the signal as input and a 2D-CNN algorithm based on raw data.
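The FFT-based input of the 1D-CNN baseline can be pictured with the sketch below. The text does not specify how the spectrum is computed or normalized, so the one-sided amplitude spectrum and the function name `magnitude_spectrum` are assumptions; the 2.5 kHz sampling rate and the 1600-point sample length come from Section 4.1.1, and the sine input is synthetic.

```python
import numpy as np


def magnitude_spectrum(x, fs_hz):
    """One-sided amplitude spectrum of a real signal (assumed normalization)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.abs(np.fft.rfft(x)) / n
    # Double every bin that has a negative-frequency twin.
    # DC never has one; for even n the Nyquist bin has none either.
    if n % 2 == 0:
        spectrum[1:-1] *= 2.0
    else:
        spectrum[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs_hz)
    return freqs, spectrum


# Synthetic 1600-point sample: a unit-amplitude 125 Hz sine sampled at 2.5 kHz.
fs_hz = 2500.0
t = np.arange(1600) / fs_hz
sample = np.sin(2.0 * np.pi * 125.0 * t)
freqs, spec = magnitude_spectrum(sample, fs_hz)
peak_freq = float(freqs[np.argmax(spec)])
```

With 1600 points at 2.5 kHz the frequency resolution is 1.5625 Hz, so the 125 Hz tone falls exactly on a bin and appears as a single peak of amplitude 1.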



As can be seen from the results of the 10 experiments in the table, AF-CDN provides better diagnostic results than both the algorithm that performs signal spectral analysis alone and the algorithm based on raw signal feature extraction. Figure 9 shows the statistical information of the 10 experimental results. AF-CDN not only improves the diagnosis accuracy but also greatly reduces the error of each diagnosis result. The performance of AF-CDN is 10.09% higher than that of FFT alone and 1.3% higher than that of original signal feature extraction. The average execution time of AF-CDN was 0.196 s, compared with 0.083 s for 1D-CNN and 0.154 s for Rawdata-CNN. The improvement in accuracy over 1D-CNN is significant, so the additional computational consumption is worthwhile. Compared to Rawdata-CNN, the execution time is 0.042 s higher. In a practical scenario, the computation time can be significantly reduced by using a high-performance GPU device, so the slight increase in computational consumption is worthwhile.
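The execution-time comparison can be checked with a few lines of arithmetic. The three times are the averages reported above; the dictionary names are hypothetical and nothing here is newly measured.

```python
# Average execution times reported in the text, in seconds.
exec_time_s = {"AF-CDN": 0.196, "1D-CNN": 0.083, "Rawdata-CNN": 0.154}

# Extra time AF-CDN needs relative to each baseline, rounded to milliseconds.
overhead_s = {
    name: round(exec_time_s["AF-CDN"] - t, 3)
    for name, t in exec_time_s.items()
    if name != "AF-CDN"
}

# Relative cost of AF-CDN with respect to each baseline.
ratio = {
    name: exec_time_s["AF-CDN"] / t
    for name, t in exec_time_s.items()
    if name != "AF-CDN"
}
```

The overhead relative to Rawdata-CNN is 0.042 s, matching the figure quoted in the text, and the overhead relative to 1D-CNN is 0.113 s.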

**Figure 9.** Results of the 10 experiments.

Figure 10 uses t-SNE visualization to display the experimental classification results. Figure 10a,b presents the visualization and error matrix of AF-CDN, respectively. Figure 10c,d presents the visualization and error matrix of the 1D-CNN algorithm, respectively.

**Figure 10.** Visualization of classification results and error matrix: (**a**) visualization of AF-CDN; (**b**) error matrix of AF-CDN; (**c**) 1D-CNN visualization; (**d**) 1D-CNN error matrix.
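An error matrix of the kind shown in Figure 10 can be built as in the following sketch, assuming it is a standard confusion matrix with true states on the rows and predicted states on the columns. The label order in `STATES` is an assumption and the label arrays are hypothetical; they are not results from the experiments. The t-SNE embedding itself is not reproduced here.

```python
import numpy as np

STATES = ["R", "H", "SH", "F"]  # label order is an assumption


def confusion_matrix(y_true, y_pred, num_classes):
    """Count samples per (true state, predicted state) pair."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


# Hypothetical labels for ten samples, for illustration only.
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 2, 1, 2, 1, 2, 3, 2])

cm = confusion_matrix(y_true, y_pred, len(STATES))
accuracy = np.trace(cm) / cm.sum()  # fraction of samples on the diagonal
```

Off-diagonal entries identify which pairs of states are confused with each other, which is how the misdiagnosis patterns in the figure can be read.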

It should also be pointed out that, compared with diagnosis based on feature extraction from raw data, the AF-CDN in this paper greatly reduces the probability of two kinds of misdiagnosis. This is also evident in the visualization and error matrix. The main reasons for this result can be summarized as follows: (1) rather than relying on the single FFT signal or the original 2D signal alone, the AF-CDN fuses the two signals; (2) compared with SGD, the gradient fusion algorithm based on the Kalman filter has a stronger parameter integration ability in the feature fusion of gradient information. In this way, the diagnostic ability of the network can be better improved. As can be seen from the visualizations of the two algorithms, misdiagnosis occurs more often between H and SH and between SH and F.
