#### *4.1. Simulated Model*

A mathematical model is used to verify the methodology. It corresponds to the vibratory signature of a bearing with an outer race defect (*xBPFO*). Equations (8)–(10) [27] describe the model used to represent the effect of each rolling-element passage over the faulty outer race as a function of time *t*. The passage of the balls over the outer race defect creates impacts at the frequency *fBPFO*. Each impact generates an impulse response of the structure with a natural frequency *f*0 and a damping factor μ. The frequency *fBPFO* depends on the rotation speed of the motor, *fr*, and on the bearing geometry, Equation (10).

Thus, the model is defined by four parameters: the amplitude *A*, the damping factor μ, the rotational speed *fr*, and the amplitude of the noise signal *b*(*t*). The exponential formula of Equation (9) is substituted for the amplitude *A*. The simulated roller bearing is an SKF 6206, whose characteristics are listed in Table 3. Every signal contains 16,384 samples (*N*) with a sampling rate of 51.2 kHz.

$$x\_{\rm BPFO}(t) = \sum\_{k=1}^{N} A\,\exp\left(-2\pi\mu f\_0 \left(t - \frac{k}{f\_{\rm BPFO}}\right)\right) \sin\left(2\pi f\_0 \left(t - \frac{k}{f\_{\rm BPFO}}\right)\right) + b(t) \tag{8}$$

$$A = \frac{e^{4\omega} - 1}{e^4} \tag{9}$$

$$f\_{\rm BPFO} = \frac{n\_b}{2} f\_{\rm r} \left( 1 - \frac{d\_{\rm ball}}{D\_{\rm m}} \cdot \cos(\alpha) \right) \tag{10}$$



To simulate the appearance and evolution of the defect, the database was built from fifty-one different values of the amplitude *A*, noted *Ai* with *i* = 1 ... 51. For each value, twenty signals were generated with a Gaussian variability of ±5% on the three parameters *fr*, μ, and *f*0, Table 4. The database thus contained 51 × 20 signals, ordered by increasing amplitude. The amplitude for *Ai* with *i* = 1–10 was held constant at zero, i.e., no variation of the amplitude *A*. The deviation started from *i* = 11 to 51, substituting Equation (9) into Equation (8) to create the signals, Table 4.
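The signal model of Equations (8)–(10) can be sketched in a few lines. The geometry and parameter values below are placeholders (the actual SKF 6206 characteristics are those of Table 3, and the parameter ranges those of Table 4), and the sum over *k* is restricted to the impacts that fall inside the signal duration, with a causal step applied to each impulse response, which the compact notation of Equation (8) leaves implicit:

```python
import numpy as np

# Assumed geometry and parameters, standing in for Tables 3 and 4.
n_b, d_ball, D_m, alpha = 9, 9.5e-3, 46e-3, 0.0   # balls, ball/pitch diameter, contact angle
f_r, f0, mu = 25.0, 3000.0, 0.05                  # rotation speed, natural frequency, damping
fs, N = 51_200, 16_384                            # sampling rate and sample count (Section 4.1)

def bpfo_frequency(f_r):
    """Equation (10): ball-pass frequency of the outer race."""
    return n_b / 2 * f_r * (1 - d_ball / D_m * np.cos(alpha))

def simulate_bpfo(A, noise_level=0.1, rng=np.random.default_rng(0)):
    """Equation (8): sum of damped impulse responses plus noise b(t)."""
    t = np.arange(N) / fs
    f_bpfo = bpfo_frequency(f_r)
    x = np.zeros(N)
    for k in range(1, int(t[-1] * f_bpfo) + 2):   # impacts within the record
        tk = t - k / f_bpfo
        x += np.where(tk >= 0,                    # causal impulse response
                      A * np.exp(-2 * np.pi * mu * f0 * tk) * np.sin(2 * np.pi * f0 * tk),
                      0.0)
    return x + noise_level * rng.standard_normal(N)

signal = simulate_bpfo(A=1.0)
```

Sweeping `A` over the fifty-one values *Ai*, with ±5% Gaussian jitter on `f_r`, `mu`, and `f0`, reproduces the structure of the database described above.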


**Table 4.** Simulation characteristics.

#### *4.2. Effect of Internal Parameters of the OPTICS Method*

OPTICS uses two parameters, ε and *MinPts*. ε is calculated in the initialization phase, after collecting all the data, and depends on the value of *MinPts*. For the simulation, ε remained in the range 0.909–0.920 for *MinPts* = 2–20; thus, it varied only slightly during the initialization phase. Its value for the *MinPts*th neighbour was kept for the rest of the algorithm. Figure 3 confirms this behaviour: after the initialization phase, ε increases abruptly.
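One common way to derive ε from *MinPts*, consistent with keeping "its value for the *MinPts*th neighbour", is to take the distance to the *MinPts*-th nearest neighbour averaged over the dataset. The sketch below uses random feature vectors (the real input would be the 17-feature vectors of the signals); the exact initialization of the paper may differ:

```python
import numpy as np

def epsilon_for_minpts(X, min_pts):
    """Mean distance to the min_pts-th nearest neighbour: a sketch of
    deriving epsilon from MinPts during the initialization phase."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(-1))      # pairwise Euclidean distances
    d.sort(axis=1)                        # column 0 is the zero self-distance
    return d[:, min_pts].mean()

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 17))        # 100 signals x 17 features (assumed)
eps = [epsilon_for_minpts(X, m) for m in range(2, 21)]
# As in Figure 3, eps varies slowly over MinPts = 2..20.
```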

**Figure 3.** ε as a function of *MinPts* for 0.1*b*(*t*) level noise.

*MinPts* is related to the number of signals per instant. Table 5 shows the effect of *MinPts* for three noise levels. The table assesses the effectiveness of the automated ε and *MinPts* in the initial configuration, i.e., the Euclidean distance native to the OPTICS algorithm and all seventeen features. From this table, the optimal value of *MinPts* is *n*/2: this value makes it possible to detect the fault earlier than the other settings.


**Table 5.** Effect of *MinPts* on detection time, with Euclidean distance, 17 features, ε = 0.094.

The choice of distance measure affects the results of clustering algorithms. The advantages and disadvantages of every distance used are summarized in Table 6. The Euclidean distance used by default in the OPTICS algorithm to compute the distance between two vectors had difficulty discriminating even approximately precise values of the data. Table 7 shows the effect of the distance implanted in the AOC-OPTICS method for three noise levels. The Manhattan distance led to the detection of the defect with a global accuracy of 96.7% across the different signal-to-noise ratios; the Mahalanobis distance followed with 88.2%, while the other distances reached 85.5%.
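The three best-performing metrics of Table 7 can be written directly on feature vectors. This is a minimal sketch on synthetic data (the covariance for Mahalanobis would in practice be estimated from the training features), not the AOC-OPTICS implementation itself:

```python
import numpy as np

def euclidean(x, y):
    """L2 distance, the OPTICS default."""
    return np.sqrt(((x - y) ** 2).sum())

def manhattan(x, y):
    """L1 distance, the best performer in Table 7."""
    return np.abs(x - y).sum()

def mahalanobis(x, y, cov_inv):
    """Covariance-weighted distance, second best in Table 7."""
    d = x - y
    return np.sqrt(d @ cov_inv @ d)

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 17))                 # 200 feature vectors (assumed)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))   # inverse covariance of the features
x, y = X[0], X[1]
d_e, d_m, d_mh = euclidean(x, y), manhattan(x, y), mahalanobis(x, y, cov_inv)
```

Swapping the metric passed to the clustering step is the only change needed to reproduce the comparison of Table 7.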


**Table 6.** Advantages and disadvantages of the different distances used. *x* and *y* are feature vectors.

**Table 7.** Effect of distances, with *MinPts* = *n*/2, 17 features, ε = 0.094.


#### *4.3. Effect of Ranking Features*

Feature ranking is usually applied during data preprocessing as a feature-selection step. The idea is to sample a random instance, compute its nearest neighbours, and update a vector of feature weights that distinguishes the features of neighbours belonging to different classes.

Two methods, chi-square and Relief, were compared. Table 8 presents the results of feature ranking. The comparative study shows the effectiveness of the Relief method, which detected the defect with high accuracy from ten features onwards, whereas chi-square reached its highest accuracy only with twelve features. From the results of Table 8, it can be concluded that Relief was the better ranking method: ten features were enough to obtain the highest accuracy.
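The neighbour-based weight update described above is the core of the Relief algorithm. The sketch below is a minimal Relief on synthetic data, where only feature 0 carries class information; the feature set, class labels, and iteration count are illustrative, not those of the paper:

```python
import numpy as np

def relief_weights(X, y, n_iter=100, rng=np.random.default_rng(3)):
    """Minimal Relief: reward features that separate an instance from its
    nearest miss (other class) more than from its nearest hit (same class)."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        i = rng.integers(n)
        d = np.abs(X - X[i]).sum(axis=1)          # distances to all instances
        d[i] = np.inf                             # exclude the instance itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))
        miss = np.argmin(np.where(y != y[i], d, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

# Synthetic check: two classes separated only along feature 0.
rng = np.random.default_rng(4)
X = rng.standard_normal((100, 5))
y = (rng.random(100) < 0.5).astype(int)
X[:, 0] += 3 * y                                  # feature 0 carries the class
w = relief_weights(X, y)
ranking = np.argsort(w)[::-1]                     # best features first
```

Keeping the ten top-ranked features, as Table 8 suggests, then amounts to slicing `ranking[:10]`.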

**Table 8.** Global accuracy. Effect of ranking features with Manhattan distance, *MinPts* = *n*/2, ε = 0.094.

