*Article* **Classification of Electrical Power Disturbances on Hybrid-Electric Ferries Using Wavelet Transform and Neural Network**

**Aleksandar Cuculić \*, Luka Draščić, Ivan Panić and Jasmin Čelić**

Faculty of Maritime Studies, University of Rijeka, Studentska 2, 51000 Rijeka, Croatia **\*** Correspondence: aleksandar.cuculic@pfri.uniri.hr

**Abstract:** Electrical power systems on hybrid-electric ferries are characterized by the intensive use of power electronics and a complex usage profile, often with limited battery storage power. It is extremely important to detect, in a timely manner, faults that can lead to system malfunctions and directly affect the safety and economic performance of the vessel. In this paper, a power disturbance classification method for hybrid-electric ferries is developed based on a wavelet transform and a neural network classifier. For each of the observed power disturbance categories, 200 signals were artificially generated. A discrete wavelet transform was applied to these signals, allowing different time-frequency resolutions to be used for different frequencies. Three statistical parameters are calculated for each set of wavelet coefficients: standard deviation, entropy and asymmetry of the signal, providing a total of 18 variables per signal. A neural network with 18 input neurons, 3 hidden neurons, and 6 output neurons was used to detect the aforementioned disturbances. The classification models with different wavelets were analyzed based on accuracy, confusion matrices, and other parameters. The analysis showed that the proposed model can be successfully used for the detection and classification of disturbances in the considered vessels, which allows the implementation of better and more efficient algorithms for energy management.

**Keywords:** hybrid-electric ferry; maritime transport; marine electrical systems; electrical power disturbances; wavelet transform; neural network

## **1. Introduction**

Ferry transport of goods and passengers plays an important socio-economic role in most coastal countries. This is particularly true in Europe, where an estimated 900 ferries are currently active, accounting for about 70% of global ferry traffic [1]. Despite its undoubted importance, the increase in ferry traffic brings with it a number of problems, of which environmental pollution is perhaps the most important.

The majority of ferries in operation today use conventional marine diesel engines for propulsion and electrical power generation, which pose a significant challenge in meeting the requirements of future environmental standards [2]. For this reason, there is a trend towards hybridization and electrification of the existing ferry fleet, especially in EU countries, which is further encouraged by generous government subsidies for the use of alternative and environmentally friendly energy sources [3].

Key technologies for the development of the current generation of hybrid-electric ferries and associated land-based infrastructure are electric propulsion and energy storage systems (ES) [4–6]. At the beginning of the application, in the first decade of the 21st century, all-electric and hybrid ferries had a relatively small capacity of passengers and cars and operated on short routes. These first ships served primarily as test platforms for evaluating new technologies and gaining in-depth knowledge of the advantages of use and possible disadvantages compared to conventional propulsion systems. As operational experience has shown that such solutions offer significant potential to reduce fuel consumption, pollutant emissions and operating costs [7], ferry operators have started to introduce all-electric and hybrid vessels with higher capacity and range in their fleets. The largest hybrid-electric plug-in ferry currently in operation has a capacity of 2000 passengers, 419 cars and 642 trailers, with an installed 5000 kWh Li-Ion battery ES [8].

**Citation:** Cuculić, A.; Draščić, L.; Panić, I.; Čelić, J. Classification of Electrical Power Disturbances on Hybrid-Electric Ferries Using Wavelet Transform and Neural Network. *J. Mar. Sci. Eng.* **2022**, *10*, 1190. https://doi.org/10.3390/jmse10091190

Academic Editors: Marco Altosole, Maria Acanfora, Flavio Balsamo and Bowen Xing

Received: 20 July 2022 Accepted: 22 August 2022 Published: 25 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Conventional ferries with diesel-mechanical propulsion (which still represent the majority of the global ferry fleet) have simple power grids fed by relatively low-power diesel generators, requiring only simple rule-based power management systems and electrical protection systems. On hybrid-electric ferries, on the other hand, the peculiarities of the power plant and the increased requirements for installed power make it necessary to develop new design solutions based on the integration of several different power sources, which often requires a combination of DC and AC networks within the same system. A further challenge is the need to connect the ship's own power plant to the shore-side power grid for cold ironing and ES charging.

It can be said that electrical networks on hybrid-electric ships have become very similar to terrestrial micro grids, mainly due to topology, islanding and increased use of power electronics and energy storage devices, but in some respects they have to meet much stricter requirements. This refers primarily to very high reliability requirements, the constraints on deployment in relatively limited ship space and fast dynamic load changes, especially in consideration of large pumps and electric propulsion [9].

An increased number of power electronics converters for energy storage grid connection and power flow control can be the source of power quality (PQ) disturbances. A review of the literature found that these disturbances are mainly manifested in voltage and frequency fluctuations, transients, sags, swells, harmonic distortions, power factor variations, etc. [10–13]. The mentioned types of disturbance and their characteristics are clearly defined by industry standards, which apply regardless of whether it is a ship- or land-based system [14,15]. The maximum permissible voltage and frequency deviations of the ship's network, as well as the duration of transients, are specified in the regulations of the leading classification societies, and the electrical protection devices are adjusted accordingly [16–18]. On most ships with classical propulsion and a standard rule-based power management system (PMS), the protection is activated only when a certain parameter exceeds the set point, so most phenomena whose values are below the limits at which the protection operates, especially transients of short duration, harmonics and voltage and frequency changes, are not detected. It is precisely such unrecognized phenomena that very often lead to disruptions in the ship's energy supply system, often with serious consequences. This problem, which points to the necessity of continuous monitoring to detect the aforementioned phenomena, is discussed in great detail in [19].

On hybrid-electric ships with fast dynamic load changes as well as island grids with limited power, such undetected changes can have a very strong impact on the power quality [20], and may cause various side effects, such as failure of digital and communication devices, unwanted tripping of circuit breakers and protection devices, overheating of electric motors, etc. Micro grid topology on hybrid and electric ferries, as well as the number and type of power sources, depend on the planned route, speed, number of voyages, and characteristics of land infrastructure at ports. This ultimately leads to very different power grid solutions, which often also react differently to the above phenomena. It presents a major challenge to the PMS, which must respond to any changes in power system parameters that may jeopardize the safety of the ship and the availability of power through the timely coordination of multiple power sources.

Research and practice have shown that conventional rule-based PMSs are not up to these challenges and that optimization-based PMSs and machine learning are better suited for hybrid and electric ships [21–25]. In order for such a PMS to properly perform its function, it is important that, among other things, it receives timely and accurate information about disturbances and changes in the parameters of the power system. In addition, the detected disturbances must be classified so that the PMS can associate them with a possible system failure or human error and take the necessary action.

The first step towards overcoming the aforementioned challenges is to identify the faults present in the system, which often involve a very large number of complex or nonlinear perturbations that must be generalized, quickly identified and classified. In addition, the method used should be generally applicable to all types of the aforementioned faults, detect them all quickly and accurately, and be relatively easy to implement in the ship's PMS. The information by which the computer can distinguish different disturbances is obtained by signal analysis. Nowadays, it is possible to analyze the disturbances using different transforms, of which the best known are the Fourier transform, the short-time Fourier transform and the wavelet transform, which is used in this work.

By reviewing recent scientific literature [26–30] and similar examples from other industrial branches, it has been shown that it can be beneficial to use neural networks for classification of such signals. With proper preparation and processing of input data, the neural network can classify faults quickly and with great accuracy, which is a great help in achieving the ultimate goal of timely detection of faults and improvement of power quality and power supply reliability on board hybrid-electric ferries. Although this work deals with hybrid electric ferries, because these vessels present special challenges for power and energy management, mentioned in the introduction, the proposed method can be useful in identifying power quality problems on all other types of vessels, regardless of the type and propulsion systems, and can contribute to the safety of the vessel, increase the efficiency of the power system, and develop better and more efficient PMS systems.

This paper is organized as follows. Electrical power disturbances and their effects in ship micro grids are analyzed in Section 2. The motivation and reasons for using the wavelet transform are presented in Section 3. The analysis of electrical power disturbances using the discrete wavelet transform is described in Section 4. In Section 5, the neural network classifier is presented. The neural network input data are explained in Section 6. The power disturbance classification process is described in Section 7. The performance of the classification model is analyzed in Section 8. Finally, conclusions are outlined in Section 9.

#### **2. Power Disturbances and Their Effects in Ship Micro Grids**

Power disturbances can have harmful consequences for power grid components, but also for the devices of the end users. What makes this phenomenon even more problematic is the fact that the impact on equipment is usually not visible until a fault occurs. Even if equipment failure does not occur, poor power quality increases losses and reduces equipment lifetime [31].

Electromagnetic interference can be divided into low-frequency interference with a range up to 9 kHz and high-frequency interference with a range above 9 kHz. Here, each frequency range is subdivided into the conducted range and the radiated range, depending on the propagation mode. In addition to frequency range, disturbances can be divided by state (stable and unstable), by duration (very short, up to 3 periods of the fundamental frequency; short, up to 3 min; long, up to 3 h; and very long, over 3 h), and by waveform [31]. Some of the most common low-frequency disturbances that occur in marine electrical systems are voltage surges, voltage dips, higher order harmonic disturbances, oscillatory transients, and voltage notches.

Voltage surges are a phenomenon in the electrical system in which the RMS value of the voltage increases by 10% to 80% of the nominal value. They can generate additional thermal load on the equipment and wiring, stressing the insulation material and accelerating its wear. Typically, voltage surges last between 10 ms and 1 min [15]. They can occur during a ground fault of a single phase conductor in an insulated neutral system, which is the predominant topology on ships. In this case, the voltages of the healthy phases toward ground rise from the phase value to the line value, i.e., there is an overvoltage on the healthy phases that persists until the ground fault is removed. Internal voltage surges can also occur when the system is suddenly unloaded due to intentional or unintentional disconnection of a large load or when capacitor banks are switched on [32].

Voltage dips are defined as a reduction of the RMS value of the grid voltage to between 10% and 90% of the nominal value. Most voltage dips do not fall below 50% of the nominal value and usually last between 10 ms and 1 min [31]. They are usually caused by switching on large loads with high inrush current, such as large induction motors or propulsion transformers. An internal voltage dip in the ship's electrical network can also occur in the event of a phase short circuit. Voltage dips can cause computers and programmable logic controllers (PLCs) to reset or shut down, relays and contactors to trip unintentionally [33], and frequency inverters to malfunction because of unstable firing circuits that generate the control pulses for the semiconductor valves. In electric motors, a voltage dip reduces the torque, which can prevent the motor from starting under load.

Both voltage surges and voltage dips can be represented by Equation (1):

$$v(t) = A \cdot (1 + \alpha \cdot (h(t - t_1) - h(t - t_2))) \cdot \sin(\omega t - \varphi),\tag{1}$$

where *v*(*t*) is the instantaneous voltage, *A* the voltage magnitude, *α* the coefficient that determines the amplitude of the disturbance, *h*(*t*) the Heaviside step function, *t* time, *t*<sub>1</sub> the moment at which the disturbance begins, *t*<sub>2</sub> the moment at which it ends, *ω* the angular frequency and *φ* the phase angle.
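Equation (1) can be evaluated sample by sample. The following is a minimal sketch in plain Python; the sampling rate, the dip depth `alpha = -0.4` and the times `t1` and `t2` are illustrative assumptions, not values taken from the paper.

```python
import math

def heaviside(t):
    """Unit step h(t): 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def sag_or_swell(t, A=1.0, alpha=-0.4, t1=0.04, t2=0.10, f=50.0, phi=0.0):
    """Equation (1). alpha < 0 models a dip, alpha > 0 a surge (assumed values)."""
    window = heaviside(t - t1) - heaviside(t - t2)   # 1 only for t1 <= t < t2
    return A * (1 + alpha * window) * math.sin(2 * math.pi * f * t - phi)

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

fs = 8000                                            # samples per second (assumed)
signal = [sag_or_swell(n / fs) for n in range(1600)] # 0.2 s of signal

period = fs // 50                                    # 160 samples per 50 Hz cycle
rms_before = rms(signal[0:period])                   # first cycle, before t1
rms_during = rms(signal[2 * period:3 * period])      # cycle starting at t1 = 0.04 s
print(rms_during / rms_before)                       # close to 1 + alpha
```

With these settings the RMS value inside the disturbance window is scaled by the factor 1 + *α*, which is how the dip and surge limits quoted above translate into a choice of *α*.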

Harmonics are considered sinusoidal voltages or currents whose frequency differs from the fundamental frequency of the network and can be divided into three groups: integer harmonics, interharmonics, and subharmonics. The main cause of harmonic voltage distortion in the electrical network of ships is power electronic converters used to control propulsion and general service motors. The presence of harmonics in the grid voltage causes a number of problems, mainly in electric motors and transformers. Increased iron losses occur due to hysteresis losses proportional to frequency and eddy currents proportional to the square of the frequency. Harmonic currents in electric motors cause torsional vibrations that can damage the bearings and shaft, especially if there is resonance between the torsional vibrations and the shaft.

Oscillatory transients are defined as momentary deviations in voltage or current from steady state. There is no clear boundary between voltage fluctuations and oscillatory transients, but any event lasting less than 10 ms can be considered a transient [34]. They can be divided into low frequency, with a frequency of less than 5 kHz and a duration of 0.3 ms to 50 ms, medium frequency, with frequencies of 5 kHz to 500 kHz and a duration of several tens of microseconds, and high frequency, with frequencies above 500 kHz and a duration of several microseconds [35]. Oscillatory transients can be represented by Equation (2):

$$v(t) = A \cdot \left[ \sin(\omega t - \varphi) + \alpha e^{-\frac{t-t_1}{\tau}} \cdot \sin(\omega_n (t - t_1) - \vartheta) \cdot \left( h(t - t_1) - h(t - t_2) \right) \right], \tag{2}$$

where *τ* is the time constant, *ω<sub>n</sub>* the angular frequency of the transient oscillation and *ϑ* the disturbance phase angle. The remaining parameters are the same as in (1).
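A minimal sketch of Equation (2) in plain Python follows. The fundamental of 50 Hz and the transient frequency of 733 Hz match the example used later in the paper; the amplitude `alpha`, time constant `tau`, window times and sampling rate are assumptions chosen for illustration.

```python
import math

def heaviside(t):
    """Unit step h(t): 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def oscillatory_transient(t, A=1.0, alpha=0.5, t1=0.0475, t2=0.0675,
                          tau=0.005, f=50.0, f_n=733.0, phi=0.0, theta=0.0):
    """Equation (2): a decaying burst at f_n added to the fundamental in [t1, t2)."""
    window = heaviside(t - t1) - heaviside(t - t2)
    fundamental = math.sin(2 * math.pi * f * t - phi)
    burst = (alpha * math.exp(-(t - t1) / tau)
             * math.sin(2 * math.pi * f_n * (t - t1) - theta))
    return A * (fundamental + burst * window)

fs = 8000                                            # samples per second (assumed)
signal = [oscillatory_transient(n / fs) for n in range(1600)]
clean = [oscillatory_transient(n / fs, alpha=0.0) for n in range(1600)]

# Samples where the disturbed signal departs from the clean sinusoid.
deviation = [abs(s - c) for s, c in zip(signal, clean)]
first = next(n for n, d in enumerate(deviation) if d > 1e-9)
last = max(n for n, d in enumerate(deviation) if d > 1e-9)
print(first, last, max(deviation))
```

Outside the window the two signals coincide, so the deviation is confined to the samples between `t1` and `t2`, and its envelope decays with the time constant `tau`.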

Microcomputers and PLCs are particularly sensitive to oscillatory transients, which may significantly reduce their service life [36]. If an oscillatory transient voltage is applied to the input of a voltage source frequency converter, a current flows that charges the capacitor used to stabilize the voltage in the DC circuit. If this current is not limited by series chokes or a transformer, the capacitor is suddenly charged to a value higher than expected, which creates an overvoltage condition. In this case, surge protection is activated, which disconnects the frequency converter and the electric motor from the mains [37].

Voltage notches are periodic short-term disturbances of power quality caused by operation of power electronics devices during current commutation. This type of interference is located between harmonics and transients. The reason is that, on the one hand, notches occur during normal operation and can be isolated as part of the harmonic spectrum of the voltage signal, but on the other hand their frequency is high and cannot be analyzed with standard equipment for harmonic distortion analysis [35]. Voltage notches can be represented by Equation (3):

$$v(t) = A\left[\sin(\omega t - \varphi) - \operatorname{sgn}(\sin(\omega t - \varphi))\sum_{n=0}^{i} k\left(h(t - (t_c + s \cdot n)) - h(t - (t_d + s \cdot n))\right)\right], \tag{3}$$

where *k* is the coefficient that determines the depth of the notch, *n* the index of the period in which the disturbance occurs, *i* the total number of periods in which the disturbance occurs, *t<sub>c</sub>* and *t<sub>d</sub>* the moments at which the disturbance starts and ends, respectively, and *s* the time shift between successive notches. As in (1), *h*(*t*) is the Heaviside step function.
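Equation (3) can be sketched in the same way. The notch depth `k`, the notch times `t_c` and `t_d`, the spacing `s` of one fundamental period and the number of periods `i` are assumed values used only to illustrate the structure of the formula.

```python
import math

def heaviside(t):
    """Unit step h(t): 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def sign(x):
    """sgn(x): -1, 0 or 1."""
    return (x > 0) - (x < 0)

def notched_voltage(t, A=1.0, k=0.3, t_c=0.004, t_d=0.0045, s=0.02, i=9,
                    f=50.0, phi=0.0):
    """Equation (3): rectangular notches of depth k, repeated every s seconds."""
    carrier = math.sin(2 * math.pi * f * t - phi)
    pulses = sum(heaviside(t - (t_c + s * n)) - heaviside(t - (t_d + s * n))
                 for n in range(i + 1))
    # The sign factor makes the notch always pull the voltage toward zero.
    return A * (carrier - sign(carrier) * k * pulses)

fs = 8000                                            # samples per second (assumed)
signal = [notched_voltage(n / fs) for n in range(1600)]

inside = signal[34]     # t = 4.25 ms, inside the first notch
outside = signal[40]    # t = 5 ms, just after the first notch
print(inside, outside)
```

Because the pulse train is shifted by `s` for each `n`, the same notch reappears at the same point of every subsequent period.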

Voltage notches in the mains voltage cause problems with synchronization clocks and counters that use the natural voltage zero crossing for counting. If a voltage notch that reaches zero occurs, the counter registers an additional crossing and thus counts more than the actual value [38]. Also, if the mains voltage makes additional zero crossings, the circuit breakers may trip prematurely. If voltage notches occur together with harmonic distortion in the voltage and frequency control circuits for generators, voltage and frequency instability in the network can occur [39].

A summary of power disturbances and their causes and effects is given in Table 1.


**Table 1.** Summary of power disturbances, their causes and effects.

#### **3. Motivation for Using Wavelet Transform**

Displaying signals in the time domain does not provide enough information to effectively identify different types of disturbances in the power system. Therefore, it is necessary to apply a mathematical transformation to the base signal to obtain additional information about it and to obtain relevant and sufficiently accurate parameters required for neural network training.

The Fourier transform, which is most commonly used to analyze periodic signals, is a reversible transform, meaning that it can be switched between the time and frequency domains at will, but only one of the two domains can be represented at a time. This means that information from the frequency domain is not available in the time domain and vice versa. For stationary signals whose frequency content does not change over time, i.e., which have the same frequency components throughout, this is not a problem, since it is not necessary to know at what time certain frequency components of the signal occur. On the other hand, the disturbances described in the previous section are typical representatives of non-stationary signals whose time-frequency characteristics and duration depend on a series of random events and operating conditions.

The above-mentioned drawbacks of the Fourier transform for the analysis of non-stationary signals can be addressed by applying the short-time Fourier transform (STFT), in which the original signal is divided into equal segments within which the signal can be considered stationary. Using a suitable window function, a three-dimensional diagram is obtained in which the vertical axis represents amplitude and the horizontal axes represent time and frequency. From such a diagram, one can see at what time the frequency components belonging to that part of the signal are present, i.e., one obtains the time-frequency representation of the signal [40].

The disadvantage of STFT is that one does not know which individual frequency components are present at any given time. Instead, it is only possible to know the time intervals in which certain frequency bands are present. This problem is caused by the window function having a finite width that covers only a portion of the signal, resulting in poor frequency resolution. When the window is infinite, the same result is obtained as with the Fourier transform, i.e., excellent resolution in the frequency domain and no resolution in the time domain. The narrower the window, the better the time resolution and the better the approximation to a stationary signal, but at the same time the frequency resolution is lower and vice versa. Therefore, the main problem of the short-time Fourier transform is the correct choice of the width of the window function that can be used to analyze different signals.

The wavelet transform allows the analysis of signals with multiple resolutions by using different resolutions at different frequencies. Multiresolution analysis is particularly suitable for signals where high-frequency components are short-lived while low-frequency components are long-lived, and these are precisely the interference signals considered in this work. Short-lived high-frequency components require very accurate temporal localization achieved by a narrow wavelet, resulting in poor frequency resolution, and vice versa. Low frequency components often determine most of the signal characteristics, and these characteristics are best quantified when the frequency resolution is as good as possible [41,42]. In view of the above, a multilevel discrete wavelet transform (DWT) is chosen for the power disturbance signal analysis.

#### **4. Power Disturbance Analysis Using Multilevel DWT**

The DWT is the most common method for implementing the wavelet transform in computers. It allows signal analysis (decomposition) and synthesis (reconstruction), and filters are used for this purpose. Multilevel decomposition and signal reconstruction is most often used in practice. The reason is that it allows higher frequency resolution, which means that the presence of individual frequency bands in time can be determined with greater accuracy.

The DWT uses filters with different cutoff frequencies to analyze signals at different scales. At each level of signal analysis, two half-band filters are used, one of which is a low-pass filter and the other a high-pass filter. Each filter consists of a number of coefficients that differ for the low-pass filter, the high-pass filter, and for signal decomposition and reconstruction. The number of coefficients depends on the type of wavelet. In signal processing, orthonormal wavelets such as Daubechies, Coiflet, and Symlet wavelets are used because they support both the DWT and the inverse DWT.

Daubechies wavelets used in this study are referred to as dbN, where N is the order of the wavelet, i.e., the number of vanishing moments. The length of the filter or the number of filter coefficients is 2N. The largest wavelet order is 45, but wavelets up to order 10 are most commonly used [43].

As the order of the wavelet increases, the time required to perform the transformation increases. On the other hand, such wavelets are smoother and better localized in the frequency domain, which is why the oscillations in the original signal can be better represented.

The operation of the DWT can be explained with reference to Figure 1. The discrete input signal x[n] consists of n samples and contains the maximum frequency f. HP and LP denote the high-pass and low-pass filters, the symbol ↓ 2 denotes downsampling by 2, while cD and cA denote the detail coefficients and the approximation coefficients, respectively. The signal x[n] must be sampled at a frequency of at least twice the signal bandwidth. Since the bandwidth is B = f − 0, the sampling frequency is 2f.

**Figure 1.** Example of signal decomposition (3 levels).

The disturbance signal x[n] is first passed through the LP filter, which passes frequencies from 0 Hz to half the maximum frequency of the signal. Since the filtered signal has half the bandwidth of the original, it contains twice as many samples as required. Downsampling by 2 is therefore performed (every other sample of the filtered signal is discarded) to remove this redundancy and obtain cA.

Then, the same input signal x[n] is passed through the HP filter, which passes frequencies from f/2 to f. As with the LP filter, the bandwidth is halved and downsampling must be performed. During downsampling, the content of the frequency components in the range f/2 to f is shifted to the new range 0 to f/2. Since there are no frequency components in this range, because they were previously removed by the HP filter, there is no loss of information and the signal can be reconstructed if necessary. In this way, cD is obtained, which contains half as many samples as the input signal x[n]. One stage of the decomposition can be described mathematically by Equations (4) and (5),

$$cA[k] = \sum_{n} x[n] \cdot LP[2k - n] \tag{4}$$

$$cD[k] = \sum_{n} x[n] \cdot HP[2k - n] \tag{5}$$

where *k* is the index of the approximation (detail) coefficient, *n* is the index of the input sample, and LP and HP are the low-pass and high-pass filter functions, respectively.
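One decomposition stage according to Equations (4) and (5) can be sketched as follows. For brevity the two-tap Haar filters are used here, which is an assumption made for illustration (the paper works with Daubechies wavelets), and the signal is treated as zero outside its sampled range, which adds one boundary coefficient.

```python
import math

# Haar analysis filters (assumed for illustration only).
LP = [1 / math.sqrt(2), 1 / math.sqrt(2)]
HP = [-1 / math.sqrt(2), 1 / math.sqrt(2)]

def analyze(x, filt):
    """c[k] = sum_n x[n] * filt[2k - n], with x taken as zero outside 0..len(x)-1."""
    N, L = len(x), len(filt)
    coeffs = []
    for k in range((N + L - 2) // 2 + 1):
        acc = 0.0
        for j in range(L):              # j = 2k - n, so n = 2k - j
            n = 2 * k - j
            if 0 <= n < N:
                acc += x[n] * filt[j]
        coeffs.append(acc)
    return coeffs

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # hypothetical input samples
cA = analyze(x, LP)    # approximation: scaled sums of neighbouring samples
cD = analyze(x, HP)    # detail: scaled differences of neighbouring samples

energy_in = sum(v * v for v in x)
energy_out = sum(a * a for a in cA) + sum(d * d for d in cD)
print(len(x), len(cA), len(cD), energy_in, energy_out)
```

The index `2k - n` combines filtering and downsampling by 2 in a single sum, so each output list is roughly half as long as the input, and for an orthonormal filter pair the signal energy is split between cA and cD.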

The approximation coefficients represent the general trend of the original signal, while the details contain the high frequency components of the signal. The approximation is a low-resolution representation of the original signal, and the details represent the difference between two successive approximations [44].

The values of the low-pass and high-pass filters with length L are not chosen arbitrarily, but the rule described in Equation (6) applies.

$$HP[L-1-n] = (-1)^n \cdot LP[n] \tag{6}$$

The coefficients of the high-pass filter are actually the coefficients of the low-pass filter in reverse order, and every other coefficient has the opposite sign. Such filters are called quadrature mirror filters and are often used in signal processing. The symbolic representation of the frequency responses of the filters obtained by decomposition into three stages is shown in Figure 2.

**Figure 2.** Symbolic representation of filter frequency responses for three level decomposition.
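Returning to Equation (6), the relation can be checked numerically. The sketch below derives a high-pass filter from the four db2 scaling coefficients; db2 is chosen only because its coefficients have a short closed form, and the ordering of the coefficients is one common convention that may differ between libraries.

```python
import math

s3 = math.sqrt(3.0)
# db2 low-pass (scaling) coefficients; the ordering is an assumed convention.
LP = [c / (4 * math.sqrt(2)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

def qmf_highpass(lp):
    """Equation (6): HP[L - 1 - n] = (-1)^n * LP[n]."""
    L = len(lp)
    hp = [0.0] * L
    for n in range(L):
        hp[L - 1 - n] = (-1) ** n * lp[n]
    return hp

HP = qmf_highpass(LP)

lp_gain_dc = sum(LP)                              # sqrt(2): passes the mean value
hp_gain_dc = sum(HP)                              # 0: blocks the mean value
cross = sum(a * b for a, b in zip(LP, HP))        # 0: the two filters are orthogonal
print(lp_gain_dc, hp_gain_dc, cross)
```

The zero sum of the high-pass coefficients confirms that a constant signal produces no detail coefficients, while the zero inner product reflects the orthogonality of the filter pair.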

As the degree of decomposition increases, narrower frequency bands are obtained. The narrowest and lowest frequency band is always that of the approximation coefficients of the last stage. The bandwidth can be calculated according to Equations (7) and (8) [45],

$$B_{cA} = \begin{bmatrix} 0, & 2^{-p-1} \cdot f_s \end{bmatrix} \tag{7}$$

$$B_{cDp} = \begin{bmatrix} 2^{-p-1} \cdot f_s, & 2^{-p} \cdot f_s \end{bmatrix} \tag{8}$$

where *p* is the decomposition level, *B<sub>cA</sub>* is the bandwidth of the approximation coefficients, *B<sub>cDp</sub>* is the bandwidth of the detail coefficients at level *p*, and *f<sub>s</sub>* is the sampling frequency.

The signal reconstruction process is shown in Figure 3. Upsampling by 2 (↑ 2) is performed on the coefficients cA and cD. This means that an additional sample, with the value 0 or a value interpolated from the adjacent samples, is inserted between every two samples of the coefficients. The resulting signals are passed through a high-pass filter HP' and a low-pass filter LP' for signal synthesis [40]. These filters are responsible for returning the coefficients to the original frequency domain [46]. The filters for signal synthesis are identical to those for signal analysis, except that their coefficients are listed in reverse order. The filter outputs are then summed to obtain the output signal x[n].

**Figure 3.** Example of signal reconstruction (3 levels).

The signal reconstruction can be described by Equation (9) [40].

$$x[n] = \sum_{k=-\infty}^{\infty} \left( cA[k] \cdot LP'[2k-n] + cD[k] \cdot HP'[2k-n] \right) \tag{9}$$
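A round trip through Equations (4), (5) and (9) can be sketched in plain Python. Several assumptions are made here: db2 filters are used instead of the db6 wavelet of the paper, the signal is zero outside its sampled range, and the sum in Equation (9) is indexed with the analysis coefficient arrays themselves, which is equivalent to convolving the upsampled coefficients with time-reversed filters.

```python
import math

s3 = math.sqrt(3.0)
LP = [c / (4 * math.sqrt(2)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]
L = len(LP)
# High-pass filter from Equation (6), written as a direct formula for HP[m].
HP = [(-1) ** (L - 1 - m) * LP[L - 1 - m] for m in range(L)]

def analyze(x, filt):
    """Equations (4)/(5): c[k] = sum_n x[n] * filt[2k - n]."""
    N = len(x)
    return [sum(x[2 * k - j] * filt[j] for j in range(L) if 0 <= 2 * k - j < N)
            for k in range((N + L - 2) // 2 + 1)]

def synthesize(cA, cD, N):
    """Equation (9): x[n] = sum_k cA[k] * LP[2k - n] + cD[k] * HP[2k - n]."""
    x = []
    for n in range(N):
        acc = 0.0
        for k in range(len(cA)):
            j = 2 * k - n
            if 0 <= j < L:
                acc += cA[k] * LP[j] + cD[k] * HP[j]
        x.append(acc)
    return x

fs = 8000                                            # samples per second (assumed)
x = [math.sin(2 * math.pi * 50 * n / fs) + 0.2 * math.sin(2 * math.pi * 733 * n / fs)
     for n in range(64)]
cA, cD = analyze(x, LP), analyze(x, HP)
x_rec = synthesize(cA, cD, len(x))
max_error = max(abs(a - b) for a, b in zip(x, x_rec))
print(len(cA), len(cD), max_error)
```

Under these assumptions the reconstruction error stays at the level of floating-point rounding, which illustrates why the downsampling in the decomposition does not lose information.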

As an example, the DWT decomposition of an oscillatory transient is performed with a db6 wavelet and decomposition level *p* = 5. The example disturbance signal lasts 1 s and is sampled at fs = 8 kHz, which means that the maximum frequency that can be detected is 4 kHz. The frequency bands of the DWT coefficients are calculated using Equations (7) and (8) and listed in Table 2.

**Table 2.** Frequency range of DWT coefficients for db6 wavelet and fs = 8 kHz.
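The band edges for this configuration follow directly from Equations (7) and (8). A short sketch that evaluates them for fs = 8 kHz and *p* = 5 is given below; the function names and dictionary keys are hypothetical.

```python
def dwt_bands(fs, levels):
    """Frequency band (low, high) in Hz of each coefficient set after `levels` stages."""
    bands = {"cA": (0.0, 2.0 ** (-levels - 1) * fs)}                 # Equation (7)
    for p in range(levels, 0, -1):
        bands[f"cD{p}"] = (2.0 ** (-p - 1) * fs, 2.0 ** (-p) * fs)   # Equation (8)
    return bands

def band_of(freq, bands):
    """Name of the coefficient set whose band contains `freq`."""
    return next(name for name, (low, high) in bands.items() if low <= freq < high)

bands = dwt_bands(fs=8000, levels=5)
for name, (low, high) in bands.items():
    print(f"{name}: {low:g} to {high:g} Hz")

print(band_of(50.0, bands), band_of(733.0, bands))
```

Evaluated this way, a 50 Hz fundamental falls into the approximation band and a 733 Hz component into the band of cD3.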


The DWT decomposition of the oscillatory transient is shown in Figure 4, which shows a total of seven plots. The first diagram shows the original signal, the second the approximation coefficient cA, and from the third to the last diagram the cD1-cD5 are plotted.

**Figure 4.** DWT decomposition on the example of oscillatory transient.

It should be noted that for clarity, Figure 4 shows only the first 2000 samples of the signal, but each signal displayed consists of 8000 samples. The original signal has a fundamental frequency of 50 Hz, while a frequency of 733 Hz occurs during the oscillating transient. The cA contains a frequency component of 50 Hz, which makes the signal shown in this diagram look like a pure sinusoidal voltage.

The graphs of cD5 and cD4 are zero because there is no frequency component of the original signal in these frequency bands. The 733 Hz frequency component of the signal is included in the cD3 graph, where it can be seen that it begins at the 380th sample, after which the amplitude decreases as the oscillatory transient in the original signal decays until it disappears. In the cD2 diagram, the same phenomenon can be observed as in the cD3 diagram, but with a much smaller amplitude. Ideally, this detail should have a value of zero, but due to the imperfection of the wavelet filters, the frequency bands overlap, as symbolically shown in Figure 2. Finally, in the cD1 diagram, all values are close to zero because the original signal has no frequency components in this frequency band.

Similarly, the DWT decomposition of voltage rise, fall and dip also shows the fundamental frequency component in the approximation diagram as in the previous cases. A sudden change of the voltage value in the original signal at the beginning and at the end of the transient causes peaks in the detail plots at the same time.

#### **5. Neural Network Classifier**

The proposed model for classifying electromagnetic disturbances in the energy system of the considered ships is based on the use of a shallow feed-forward neural network.

The neurons receive input signals multiplied by the associated weights, add the obtained products, add a sensitivity threshold to this sum, and pass it through the activation function. The output of the jth neuron *yj* can be expressed by Equation (10) as follows:

$$y_j = \psi\left(\sum_{i=0}^n w_{ij} x_i + \theta\right) \tag{10}$$

where *ψ* is the transfer (activation) function of the j-th neuron, *x<sub>i</sub>* the input signals, *w<sub>ij</sub>* the weight coefficients at the input of the j-th neuron and *θ* the sensitivity threshold.

The input layer receives the input signals and forwards them to the hidden layer. It does not perform any processing on the input signals, nor are weights or sensitivity thresholds assigned to them. The number of neurons in the input layer is equal to the number of input variables.

The central layer is called the hidden layer and contains the neurons that perform the data processing. The number of hidden neurons is determined by trial and error, starting with the smallest number and observing the resulting error; the smallest number of neurons that yields a satisfactorily small error is then chosen. Too few hidden neurons lead to large learning errors and poor generalization due to undertraining, while too many hidden neurons give a small learning error but make learning unnecessarily slow. The activation function in the hidden layer must be nonlinear so that the network can approximate both nonlinear and linear relationships between input and output variables [47]. In this classification model, the hyperbolic tangent function is used. Its values lie in the bounded range from −1 to 1 and are centered on zero, so the mean of all outputs in a layer can be close to zero, which facilitates and accelerates learning in the next layer of neurons.

The outputs of the hidden layer are directed to the last, the output layer, which is the output of the network. The number of neurons in the output layer corresponds to the number of categories used in classification, i.e., each neuron represents a category. When the input vector of the corresponding category is input to the network, the corresponding neuron in the output layer should provide output 1 and the other neurons should provide output 0 [47]. To achieve this, a SoftMax function is used in the output layer to generate a probability vector for each category.
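The layer structure described above can be sketched as a single forward pass. This is a minimal illustration under stated assumptions: the weights are random placeholders rather than trained values, and the names `softmax` and `forward` are hypothetical helpers, not functions of the MATLAB toolbox used in this work.

```python
import math
import random

def softmax(z):
    """Convert raw output-layer sums into a probability vector."""
    m = max(z)  # subtracting the maximum improves numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, W1, b1, W2, b2):
    """Hidden layer with tanh activation followed by a softmax output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    sums = [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]
    return softmax(sums)

# Random placeholder parameters for an 18-3-6 network (assumed, untrained).
rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(18)] for _ in range(3)]
b1 = [rng.uniform(-1, 1) for _ in range(3)]
W2 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
b2 = [rng.uniform(-1, 1) for _ in range(6)]
x = [rng.uniform(-1, 1) for _ in range(18)]

probs = forward(x, W1, b1, W2, b2)
predicted_category = probs.index(max(probs)) + 1  # categories are numbered from 1
print(probs, predicted_category)
```

The predicted category is the output neuron with the highest probability; for a well-trained network this probability approaches 1 for the correct category.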

After the neural network is formed, the initial weight coefficients and thresholds must be set. This step is important because the time the neural network needs to learn "well" depends on these initial values, i.e., they directly affect the rate at which the objective function converges towards a minimum [48]. If a priori knowledge about the network is available, it can be used to set the weights to specific values. In most cases, such knowledge is not available, so the weights are initialized with random values that are uniformly distributed in a certain interval. One of the methods used for this purpose is the Nguyen–Widrow method, which is standard in the MATLAB environment. This method generates random values in the interval [−1, 1] and distributes the neuron inputs (weighted sums) approximately uniformly over the active region of the neuron, which avoids saturation of the neuron and a slowdown in learning. The interval [−2, 2] is considered the active region of the hyperbolic tangent function. It should also be mentioned that this method can only be used for transfer functions that have a limited active range, as is the case for the sigmoidal function or the hyperbolic tangent function [49].
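As an illustration, the commonly cited textbook form of the Nguyen–Widrow initialization can be sketched as follows. This sketch is an assumption about the general idea only: the implementation in MATLAB (`initnw`) differs in its details, and the function name `nguyen_widrow_init` is hypothetical.

```python
import math
import random

def nguyen_widrow_init(n_inputs, n_hidden, rng):
    """Textbook Nguyen-Widrow initialization of one hidden layer (sketch)."""
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)  # scale factor of the method
    weights, thresholds = [], []
    for _ in range(n_hidden):
        # Start from uniformly distributed random weights in [-1, 1].
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        norm = math.sqrt(sum(v * v for v in w))
        # Rescale so that the weight vector of each neuron has length beta.
        weights.append([beta * v / norm for v in w])
        thresholds.append(rng.uniform(-beta, beta))
    return weights, thresholds, beta

weights, thresholds, beta = nguyen_widrow_init(18, 3, random.Random(0))
print(beta, thresholds)
```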

A neural network goes through three phases of work: learning, validation, and testing. Each phase requires a separate data set, obtained by dividing the total number of samples into three subsets in certain proportions. Often 70% of the samples are reserved for learning, 15% for validation, and 15% for testing the network, but other ratios are possible. The partitioning is usually done by random sampling.
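A random 70/15/15 partition can be sketched as follows for the 1200 signals used in this study (200 signals for each of the six categories). The helper `split_indices` and the fixed seed are assumptions made for reproducibility of the example; they do not reproduce the MATLAB data division function.

```python
import random

def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly partition sample indices into learning, validation and test subsets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1200)
print(len(train_idx), len(val_idx), len(test_idx))  # 840 180 180
```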

Training of the neural network is an iterative process that adjusts the network parameters according to a given algorithm. The learning process aims to find the values of the network parameters for which the error is minimal over the whole set of learning samples. Validation of the network is performed simultaneously on a separate set of validation samples. During validation, only the input variables of each sample, without the target outputs, are passed to the network to check whether it generalizes; this can only be checked on samples that have not been used for learning.

The learning and validation processes end when one of the stopping conditions is met. The most common conditions are: a sufficiently small squared error has been achieved on all samples, a sufficiently small gradient has been achieved, the maximum number of learning epochs has been reached, the set learning time has elapsed, the allowed number of consecutive epochs without improvement on the validation set has been exceeded, etc. The network is then trained and can be used to determine the category of new data.
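The validation-based stopping condition can be sketched as follows. The function `validation_stop_epoch`, the default limit of six consecutive failures and the error values are illustrative assumptions, not settings reported in this paper.

```python
def validation_stop_epoch(val_errors, max_fail=6):
    """Return the epoch at which learning stops because the validation error
    has not improved for max_fail consecutive epochs, or None otherwise."""
    best = float("inf")
    fails = 0
    for epoch, err in enumerate(val_errors, start=1):
        if err < best:
            best = err   # new best validation error, reset the counter
            fails = 0
        else:
            fails += 1
        if fails >= max_fail:
            return epoch
    return None

# Illustrative validation errors per epoch (assumed values).
stop = validation_stop_epoch([5.0, 4.0, 3.0, 3.1, 3.2, 3.3], max_fail=3)
print(stop)  # 6
```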

#### **6. Preprocessing of Data**

Signals obtained via DWT cannot be fed directly into the neural network input layer because they contain different numbers of samples. Even if this problem could be solved by interpolating the values or by other means, the problem of too many input values remains. Therefore, it is necessary to perform statistical data processing. Each signal obtained by DWT is represented by its standard deviation, asymmetry (skewness) and entropy.

The standard deviation of DWT coefficients is expressed by Equation (11):

$$
\sigma\_i = \sqrt{\frac{1}{N} \cdot \sum\_{j=1}^{N} \left( D\_{ij} - \mu\_i \right)^2} \tag{11}
$$

where *σ<sub>i</sub>* is the standard deviation of the i-th level DWT coefficients, *N* is the number of samples at the i-th level, *D<sub>ij</sub>* is the j-th sample of the i-th level DWT coefficients, and *μ<sub>i</sub>* is the arithmetic mean of the i-th level DWT coefficients.

The arithmetic mean of the samples in each set of DWT coefficients can be calculated according to Equation (12), and the asymmetry according to Equation (13).

$$\mu\_i = \frac{\sum\_{j=1}^{N} D\_{ij}}{N} \tag{12}$$

$$Sk\_i = \frac{1}{N} \cdot \sum\_{j=1}^{N} \left(\frac{D\_{ij} - \mu\_i}{\sigma\_i}\right)^3 \tag{13}$$

where *Ski* is the asymmetry of the i-th level DWT coefficients.

The entropy of the i-th level DWT coefficients, *Hi*, can be calculated according to Equation (14).

$$H\_i = -\sum\_{j=1}^{N} D\_{ij}^2 \log \left( D\_{ij}^2 \right) \tag{14}$$
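Equations (11)–(14) can be illustrated with a short sketch that computes the three statistical parameters for one set of DWT coefficients. Two assumptions are made here that are not stated in the text: the natural logarithm is used in Equation (14), and terms with zero squared value are skipped, since x·log(x) tends to 0 as x tends to 0. The function name `dwt_features` is hypothetical.

```python
import math

def dwt_features(coeffs):
    """Standard deviation (11), asymmetry (13) and entropy (14) of one coefficient set."""
    n = len(coeffs)
    mean = sum(coeffs) / n                                       # Equation (12)
    std = math.sqrt(sum((d - mean) ** 2 for d in coeffs) / n)    # Equation (11)
    # Asymmetry is undefined for constant data; 0.0 is returned in that case (assumption).
    skew = sum(((d - mean) / std) ** 3 for d in coeffs) / n if std > 0 else 0.0
    squared = [d * d for d in coeffs]
    entropy = -sum(e * math.log(e) for e in squared if e > 0.0)  # Equation (14)
    return std, skew, entropy

# Illustrative coefficient values (assumed).
std, skew, entropy = dwt_features([1.0, -1.0, 2.0, 0.0])
print(std, skew, entropy)
```

Applying such a function to each of the six coefficient sets of a signal gives the 18 variables that describe the signal.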

Some of the calculated parameters may have values very close to zero, while others may be very large. If such data were fed directly into the neural network, the error function would converge very slowly toward the minimum, i.e., the neural network would take a very long time to learn. To avoid this, the input data must be brought to a common scale, ideally to the interval in which the activation functions have the largest derivative, which increases the learning speed [46]. Since the hyperbolic tangent function and the sigmoidal function are widely used, the input data is often scaled to the interval [−1, 1] and in some cases to the interval [0, 1]. There are two common methods for this purpose: normalization and standardization. In MATLAB, the *patternnet* tool scales the input data automatically: the *mapminmax* function performs normalization to the interval [−1, 1], while *mapstd* performs standardization to zero mean and unit standard deviation. Tests have shown that standardization of this particular input data results in less variation in classification accuracy than normalization, so standardization is used in this work.
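The difference between the two scaling methods can be sketched for a single input variable. The functions below are simplified stand-ins written for illustration, not the MATLAB functions themselves; the use of the sample standard deviation and the example values are assumptions, and the input is assumed not to be constant.

```python
import statistics

def minmax_scale(values, lo=-1.0, hi=1.0):
    """Normalization: map the smallest value to lo and the largest to hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

def standardize(values):
    """Standardization: shift to zero mean and scale to unit standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    return [(v - mean) / std for v in values]

# One feature with widely differing magnitudes (illustrative values).
values = [0.001, 0.5, 20.0, 350.0]
scaled = minmax_scale(values)
standardized = standardize(values)
print(scaled)
print(standardized)
```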

#### **7. Power Disturbance Classification**

The low-frequency power disturbances used to test the neural network classifier were artificially generated using the MATLAB script pqmodel.m developed by R. Igual et al. [15]. With this script it is possible to generate all the mentioned low-frequency disturbances as well as many others. This part of the program is executed only on the first run; afterwards, the generated signals are stored and reloaded at each restart, so that the classification is always performed on the same signals.

To create power disturbance signals, six parameters must be entered into the script, namely: number of signals per disturbance type *ns*, signal sampling frequency *fs*, fundamental frequency of the electrical signal *f*, number of fundamental frequency periods in a signal *n*, signal amplitude *A* and disturbance category number. The power disturbance parameters used in this work are listed in Table 3.


**Table 3.** Power disturbance parameters.

The disturbance category numbers have the following meanings: 1. pure sinusoidal voltage, 2. voltage sag, 3. voltage swell, 4. oscillatory transient, 5. harmonically distorted voltage, and 6. voltage notches. Figure 5 shows examples of the training data for each of the disturbance categories (three randomly selected signals from each category are shown).

**Figure 5.** Samples of figures of the training data for each disturbance category.

With a total of six disturbance categories to classify and 200 signals in each category, the total data set consists of 1200 signals. When signals are generated, they are in a variable representing a three-dimensional matrix with the following dimensions: Number of signals per disturbance type × Number of points of the discrete signal × Total number of disturbance types.

In the script, the start time of the disturbance is chosen randomly. It is possible to define the limits within which the duration of the disturbance is randomly chosen, expressed as a number of fundamental signal periods. For voltage sags and swells, a minimum duration of one fundamental period and a maximum duration of 25 periods, i.e., half the duration of the entire signal, was chosen. It is also possible to specify the limits of the voltage sag or swell magnitude. For a voltage sag, the minimum amplitude is 40% and the maximum amplitude is 70% of the nominal amplitude. For a voltage swell, the minimum amplitude increase is 40% and the maximum is 70% of the nominal amplitude. The proportion of harmonics and the limits of the share of each harmonic component relative to the nominal voltage can also be changed. For harmonic disturbances, third-, fifth- and seventh-order harmonics are generated, and the share of each can be selected arbitrarily between 5% and 15% of the nominal value. For oscillatory transients, a minimum frequency of 300 Hz and a maximum frequency of 900 Hz were chosen, with a duration from half of a fundamental period to one third of the total number of fundamental periods in a signal. Finally, for voltage notches, the minimum and maximum notch depths are set to 10% and 40% of the nominal voltage, while the number of notches per period is randomly chosen from 1, 2, 4 and 6.
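To make the randomization concrete, the generation of a single voltage sag can be sketched as follows. This is not the pqmodel.m script: it is a simplified illustration in which the sampling frequency (8000 Hz), fundamental frequency (50 Hz), number of periods (50) and unit amplitude are assumptions inferred from the signal description in this paper, and the 40–70% limits are interpreted as the remaining amplitude during the sag.

```python
import math
import random

FS = 8000        # sampling frequency in Hz (assumed)
F0 = 50          # fundamental frequency in Hz
N_PERIODS = 50   # fundamental periods per signal (assumed)
AMPLITUDE = 1.0  # nominal amplitude (assumed)

def make_sag(rng):
    """Generate one sinusoidal signal with a randomly placed voltage sag."""
    samples_per_period = FS // F0
    n_samples = N_PERIODS * samples_per_period
    duration = rng.randint(1, 25) * samples_per_period  # 1 to 25 periods
    start = rng.randrange(0, n_samples - duration + 1)  # random start sample
    level = rng.uniform(0.40, 0.70)                     # amplitude during the sag
    signal = []
    for k in range(n_samples):
        scale = level if start <= k < start + duration else 1.0
        signal.append(scale * AMPLITUDE * math.sin(2 * math.pi * F0 * k / FS))
    return signal, start, duration, level

signal, start, duration, level = make_sag(random.Random(1))
print(len(signal), start, duration, round(level, 3))
```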

In the next part of the program, the DWT is performed on the input signals. The signal decomposition was performed in five levels. Since there are no subharmonics or interharmonics in the observed disturbances, increasing the decomposition level would bring little benefit, since only the lowest frequency band (0, 125) Hz would be further decomposed.
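The frequency bands of a five-level decomposition can be computed with a few lines of code. The sampling frequency of 8000 Hz is an assumption inferred from the lowest band (0, 125) Hz stated above; the bands are idealized, since real wavelet filters overlap at the band edges. The function name `dwt_bands` is hypothetical.

```python
def dwt_bands(fs, levels):
    """Ideal frequency bands (in Hz) of the DWT detail and approximation coefficients."""
    bands = {}
    upper = fs / 2.0  # Nyquist frequency
    for level in range(1, levels + 1):
        lower = upper / 2.0
        bands[f"cD{level}"] = (lower, upper)  # each detail level takes the upper half
        upper = lower
    bands[f"cA{levels}"] = (0.0, upper)       # the approximation keeps the remainder
    return bands

bands = dwt_bands(8000, 5)
for name, (low, high) in bands.items():
    print(name, low, high)
```

With these bands, a 733 Hz oscillatory transient falls into cD3 (500–1000 Hz) and the 50 Hz fundamental into cA5 (0–125 Hz).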

The decomposition into five levels results in one set of approximation coefficients and five sets of detail coefficients, which are forwarded to the third part of the program, where the statistical parameters are calculated. The obtained statistical parameters form a vector of features or variables that are input into the neural network. For each signal, a total of 18 variables (six coefficient sets with three statistical parameters each) are entered into the network. In addition, a matrix of the actual signal categories is formed, which is also input into the neural network.
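The structure of the decomposition can be illustrated with the simplest wavelet, db1 (Haar), which can be written in a few lines. This sketch assumes a signal of 8000 samples at a sampling frequency of 8000 Hz; it is not the MATLAB implementation, sign conventions of the detail coefficients may differ between libraries, and the longer db4 and db6 filters are not covered. The helper names are hypothetical.

```python
import math

def haar_step(x):
    """One DWT level with the Haar (db1) wavelet: pairwise sums and differences."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_wavedec(x, levels):
    """Multilevel decomposition, returned as [cA_n, cD_n, ..., cD_1]."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_step(approx)  # only the approximation is decomposed further
        details.append(d)
    return [approx] + details[::-1]

# Pure 50 Hz sinusoid, 8000 samples at 8000 Hz (assumed signal parameters).
x = [math.sin(2 * math.pi * 50 * k / 8000) for k in range(8000)]
coeffs = haar_wavedec(x, 5)
lengths = [len(c) for c in coeffs]
n_features = 3 * len(coeffs)  # three statistical parameters per coefficient set
print(lengths, n_features)    # [250, 250, 500, 1000, 2000, 4000] 18
```

The differing lengths of the coefficient sets show why the coefficients cannot be fed to the input layer directly and are instead summarized by statistical parameters.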

A neural network with 18 input neurons, 3 hidden neurons and 6 output neurons is used (Figure 6). This corresponds to 54 weight and 3 threshold values between the input and hidden layer as well as 18 weight and 6 threshold values between the hidden and output layer.
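The parameter counts follow directly from the layer sizes, as the following short check shows. The function name `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes):
    """Number of weights and thresholds between consecutive fully connected layers."""
    weights = [a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
    thresholds = list(layer_sizes[1:])  # one threshold per non-input neuron
    return weights, thresholds

weights, thresholds = count_parameters([18, 3, 6])
total = sum(weights) + sum(thresholds)
print(weights, thresholds, total)  # [54, 18] [3, 6] 81
```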

**Figure 6.** Neural network structure.

The hyperbolic tangent function is used in the hidden neurons and the softmax function in the output neurons. The Algorithms section of the training interface lists, in order: the method of data division, the training algorithm used, the performance function, and the method of computation. The settings listed correspond to those specified in the program. The Progress section lists, in order: the number of epochs, the time needed for learning, the value of the achieved error function, the value of the achieved gradient and the number of epochs in which successive validation checks take place. After the neural network has completed the phases of learning, validation and testing, the classification results are obtained, i.e., the parameters for evaluating the performance of the classification model.

Neural network weight and threshold values obtained for db6 wavelet transformed signal are presented in Table 4.


**Table 4.** Weights and threshold values for db6 wavelet neural network classifier.



Since the training, testing, and validation datasets are randomly partitioned from the overall dataset, the weights and thresholds of a given neural network classifier may vary. However, the Hinton diagram can be used to visualize the value of weights and thresholds within each layer of the neural network, where a particular rectangle correlates with the influence of the particular weight or threshold [50,51]. The overall configuration of weights and thresholds for the db6 wavelet transform neural network classifier is shown with the Hinton plot in Figure 7.

**Figure 7.** Hinton plot of weights and thresholds for db6 wavelet transform-based neural network.

#### **8. Performance Analysis of Classification Models**

Performance analysis is performed for three classification models that use different wavelets, while all other parameters remain unchanged. The mother wavelets used are db1, db4 and db6. The classification models are analyzed using a confusion matrix for the test data set, which consists of 180 randomly selected signals used to test the neural network. A total of 30 signals are sinusoidal (category 1), 27 contain a voltage sag (category 2), 32 a voltage swell (category 3), 24 oscillatory transients (category 4), 31 harmonic distortions (category 5), and 36 voltage notches (category 6). Since each signal category contains approximately the same number of signals and the signal categories are equally important, the accuracy of classification can be considered a relevant parameter.

The classification model performance results for all three wavelets used are shown in Figure 8. The execution time refers to the execution of the entire program. The true positive rate (TPR), precision, and F1 value are calculated separately for each signal category. These parameters are not critical for selecting a classification model but may help in choosing between models whose other parameters are approximately the same, especially for large datasets.
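The per-category parameters can be computed from a confusion matrix as sketched below. The layout assumed here follows the matrices discussed in this section, with rows as predicted categories and columns as actual categories; the 3 × 3 matrix is a hypothetical example, not data from this study, and the function names are illustrative.

```python
def per_class_metrics(cm):
    """TPR, precision and F1 per class; rows = predicted class, columns = actual class."""
    n = len(cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        actual = sum(cm[r][c] for r in range(n))  # column sum: signals of this class
        predicted = sum(cm[c])                    # row sum: signals assigned to this class
        tpr = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
        results.append({"tpr": tpr, "precision": precision, "f1": f1})
    return results

def accuracy(cm):
    """Share of correctly classified signals (diagonal over total)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

# Hypothetical 3-class confusion matrix (assumed values).
cm = [[8, 1, 0],
      [2, 9, 1],
      [0, 0, 9]]
metrics = per_class_metrics(cm)
print(metrics, accuracy(cm))
```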

**Figure 8.** Classification model performance results.

The confusion matrix provides more detailed information about the classification model. The neural network interface provides four confusion matrices: the first for the learning dataset, the second for the validation dataset, the third for the testing dataset, and the fourth is the overall confusion matrix. Only the third matrix, which is based on a test data set, is of interest for performance analysis. The confusion matrix for the classification model with db1 wavelet is shown in Figure 9.


**Figure 9.** Confusion matrix for classification model with db1 wavelet.

Figure 9 shows that the classifier detects category 4 signals (oscillatory transients) poorly. This follows from the TPR in the fourth column, which is only 37.5%. Out of a total of 24 signals of this type, only nine are correctly classified, while six are classified as sinusoidal voltage, one as a signal with harmonic distortion, and eight as voltage notches. This means that this model is very poor at distinguishing oscillatory transients from sinusoidal voltage and voltage notches. Column 6 shows a TPR of 86.1%: out of a total of 36 signals with voltage notches, 31 are correctly classified, two are classified as sinusoidal voltage, one as a voltage dip, and two as oscillatory transients. The other signal categories are well distinguished by the classifier. The first row of the confusion matrix shows that the classifier assigned a total of 41 signals to the sinusoidal voltage category, which is why this row has the lowest precision, 73.2%. In other words, the classifier often predicts the sinusoidal voltage category for other signals and was correct only 73.2% of the time when it did so. Precision is also low in row 6, at 79.5%. The confusion matrix shows that this model detects voltage sags, voltage swells and harmonic distortions very well, since the TPR and precision values are high for these categories. It can be concluded that, despite the high precision, the classification model with the db1 (Haar) wavelet is not satisfactory, since it recognizes oscillatory transients very poorly and therefore distinguishes them poorly from sinusoidal voltages and voltage notches.

The confusion matrix for the classification model with the db4 wavelet is shown in Figure 10. Compared with the matrix in Figure 9, a significant improvement can be seen in the classification of signals with oscillatory transients. This model correctly classified all signals with oscillatory transients; however, it classified four signals with voltage notches as oscillatory transients, which indicates that it still does not perfectly distinguish between these two categories of disturbances. The model also achieves higher precision for the sinusoidal signal category, as fewer signals are misclassified as sinusoidal voltage. In the sixth column of the matrix, out of a total of 36 signals with voltage notches, 32 were correctly classified, while the remaining 4 were classified as oscillatory transients. This is therefore the only relevant direction for further improvement of the model.


**Figure 10.** Confusion matrix for classification model with db4 wavelet.

The confusion matrix for the classification model with db6 wavelet is shown in Figure 11. It can be seen that this classification model distinguishes the signal categories very clearly. Examination of the confusion matrix shows that this model is an improvement over the db4 wavelet model because it detects voltage notches better and thus classifies with fewer errors. It classifies other signal categories as well as the db4 model, and the entire program executes 0.4 s faster, making it the best classification model of all the models examined in this dataset. In case of an increase in the size of the data set, equally good results or even an improvement of the results can be expected, since the network then has more learning patterns.


**Figure 11.** Confusion matrix for classification model with db6 wavelet.

#### **9. Conclusions**

Any electrical power system is subject to disturbances, most of which are caused by the operation of power electronics equipment, outages, the connection and disconnection of loads, or human error. In complex marine microgrids, which consist of several different power sources and energy storage devices, require connection to shore infrastructure during port calls, and are characterized by frequent rapid dynamic load changes, timely detection and classification of disturbances is a prerequisite for their elimination or mitigation. In this work, voltage sags, voltage swells, oscillatory transients, harmonic distortions and voltage notches were observed. These disturbances must be analyzed to extract additional information that allows the neural network to distinguish between them. The Fourier transform is not satisfactory because it does not provide information about when a frequency component occurs, and the short-time Fourier transform has only a single, fixed time-frequency resolution, which makes it difficult to analyze fast and slow changes in the signal simultaneously. Therefore, a discrete wavelet transform was chosen for the analysis, which analyzes low-frequency components with high frequency resolution and high-frequency components with high time resolution. The decomposition of each signal was performed in five levels, separately with the db1, db4 and db6 filters. For each obtained set of coefficients (frequency band), three statistical parameters are determined: standard deviation, entropy and signal asymmetry. This results in a total of 18 variables representing a signal, which are fed into the neural network. For disturbance detection, a probabilistic feed-forward neural network with 18 input neurons, 3 hidden neurons and 6 output neurons was used.

Classification models with different wavelet filters were tested on a separate dataset of 180 disturbance signals with an approximately uniform distribution of samples across disturbance categories. The models were analyzed based on program execution time, accuracy, precision, TPR, and F1 value. The confusion matrices of each classification model were also analyzed. The analysis showed that the model with the db1 wavelet had the shortest program execution time and satisfactory values for all parameters. However, the analysis of the confusion matrix shows that it is very poor at distinguishing oscillatory transients from sinusoidal voltages and voltage notches. Therefore, this model is still not satisfactory. The model with the db4 wavelet distinguishes the mentioned disturbances better and gives significantly better results for all parameters, except for the program execution time, which increases by 18.4% compared to the model with the db1 wavelet. The last tested model, with the db6 wavelet, gives the best results in terms of accuracy and other performance parameters, and also has a slightly shorter execution time than the model with the db4 wavelet. Thus, it is the best model for the given data set. The proposed model can be successfully applied to the detection and classification of faults in the considered vessels, which can contribute to the safety and reliability of the power supply and serve as a basis for the development of advanced machine learning-based power management systems.

**Author Contributions:** Methodology, A.C., L.D., I.P. and J.Ć.; Software, A.C., L.D., I.P. and J.Ć.; Resources, A.C., L.D., I.P. and J.Ć.; Writing – original draft preparation, A.C. and L.D.; Writing – review and editing, A.C., L.D., I.P. and J.Ć.; Supervision, A.C. and L.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

