## **1. Introduction**

Over the years, global electric energy consumption has increased from 440 Mtoe in 1973 to 1737 Mtoe in 2015 [1]. This has resulted in electricity becoming a specific product that is subject to market regulation in both quantitative and qualitative terms. Quantitative analysis is mainly focused on the balance between energy that is produced, transmitted, stored, consumed or lost. The current issues connected to quantitative aspects of energy consumption are related to demand-side response, the integration of renewable energy sources, and energy storage systems with electrical power systems [2–5]. The qualitative approach mainly uses power quality analysis. The issues related to power quality (PQ) include the definitions of the parameters and the methods of measurement and assessment, which are already standardized in, among others, [6–8]. The methods described in the power quality standards are based on power quality parameters measured during a representative period of time, normally one week, which corresponds to the normal working conditions of the observed network [9]. The parameters which characterize power quality include frequency variation, voltage variation, voltage fluctuation, voltage asymmetry, and voltage waveform distortion. These parameters are collected during the period of observation, with the aggregation time interval usually equal to 10 min; however, 1 min aggregation intervals are also currently studied [10]. These parameters produce a large volume of data to be considered in the analysis. Moreover, PQ data depend on the network conditions, load changes, generation level, and configuration of the network. For this reason, a rational approach is to search for data mining techniques able to extract and classify vectors of the power quality data that represent different features. This would allow the range of qualitative analyses to be extended by correlating the PQ data with information on the network, environment or market conditions.

There are many works dedicated to power quality disturbance extraction, PQ events recognition, and the classification of PQ events and disturbances, which are all directly focused on the measured voltage and current signals. Most of the works propose wavelet transforms, S-transforms, empirical mode decomposition, and other different decomposition techniques, which are supported by artificial neural networks in order to find valuable methods for PQ disturbance extraction and recognition [11–13]. However, a different problem can be formulated when there is a need for identifying and extracting some of the data that represent different features from the long-term aggregated power quality data. This necessitates a comprehensive method for the automatic classification of long-term power quality data into groups that represent similar features. This task is essentially a data mining area of interest. One of the data mining techniques that can meet these requirements is cluster analysis (clustering). The general application of data mining techniques in power systems is presented in [14]. Specific applications of cluster analysis to electrical power networks include:


This article extends the cluster analysis (CA) proposed by the authors in [9]. Jasiński et al. [9] present the results of the application of CA in order to achieve a desirable division of the long-term 10-min aggregated power quality data into groups of data representing similar features. The PQ data come from four real points of measurement in the supply network of a copper mine. The significant elements of the investigated power network are combined heat and power (CHP) plants with gas-steam turbines working as local distributed generation (DG), and a welding machine (WM) as the main time-varying load. Time-varying PQ conditions were intentionally created: the distributed generation was switched on and off for a period of time, and a network reconfiguration was also performed. The results discussed in [9] confirm the possibility of using cluster analysis to separate power quality data into groups related to the different working conditions of an electrical network, including the influence of DG, reconfiguration of the network, working days, and holiday time. The methodology of applying the cluster analysis, including the preparation of the database structure, was also described in [9]. The idea presented in [9] leads to efficient classification of the power quality data, but it does not provide a suitable method for the assessment of the resulting clusters of data. Searching for (1) a comprehensive solution that provides automatic classification of the multipoint measurement data, and (2) a method for comparative evaluation of the collected data, remains a desirable aim for wide-area monitoring systems and smart grids. Thus, this article extends the previously obtained classification [9] in order to achieve a quality assessment of the obtained clusters using global power quality indices. This leads to an automatic classification of the working conditions of an electrical power network (EPN), and the possibility of an easy comparison using global values that incorporate the impact of different PQ elements.

Both cluster analysis of PQ data and global power quality index (GPQI) application may be found in the literature:


The aspect that distinguishes the solution proposed in this paper from the methods described in quoted works is the area-based approach to the PQ assessment, involving all measurement points for the cluster analysis, as well as development of a new synthetic power quality index. Novel aspects of the method proposed in this article include:


The remainder of this paper is structured as follows: Section 2 reviews the present application of global power quality indices in the electrical power network, and also proposes new definitions of the GPQIs used in our assessment of clustered PQ data. Section 3 describes the methodology of the proposed algorithm for the comparative assessment of the power quality conditions using a combination of clustering and global power quality indices. The first step of the algorithm is the identification and allocation of the power quality data into groups that represent similar features. This part is based on previous experience with the CA application described in [9]. The second step is the assessment of the collected data using the proposed GPQI. The results of the assessment are presented using real multipoint power quality measurements in a medium voltage electrical network supplying the mining industry. Additionally, this section contains a sensitivity analysis of the proposed GPQI in terms of the selection of the power quality parameters used to construct the GPQI. The presented results are oriented toward one of the article's aims: highlighting the impact of DG on PQ in the industrial network. The obtained clusters represent different conditions of PQ indices which are directly associated with the impact of the DG. Qualitative assessment of the PQ data collected in the identified clusters using the proposed global power quality indices allows us to confirm several relations concerning the impact of DG on the PQ condition. Section 4 contains the discussion of the obtained results. Section 5 formulates the conclusions, interpretations in the perspective of further studies, and implications for the future.

## **2. Global Power Quality Indices**

Classical power quality assessment is a multi-criteria analysis approach that is independently applied to particular power quality parameters. The idea of a simplified and generalized assessment of the power quality condition uses a single index, known as a global, unified, total or synthetic index. In this paper, we decided to use global power quality indices (GPQIs) as a unified name. Before new definitions of GPQIs are introduced, it is useful to review the knowledge concerning the development of GPQIs. Singh et al. [43] present the application of a unified power quality index that uses the matrix method. The index, corresponding to voltage sag severity, was highlighted as a suitable proposition for power quality assessment, and is carried out in a three-stage approach. The first stage requires the preparation of a graphical system model (attribute digraph). The second stage is the conversion into an attribute matrix. The third stage is the presentation of the matrix as a variable permanent function. Ignatova and Villard [44] define green-yellow-red indicators for all PQ problems. The proposed algorithm obtains the green-yellow-red indicators for both events and disturbances. The index consists of all individual PQ parameters, which are expressed as a percentage in a range from 0% to 100%, where 0% denotes the worst PQ and 100% the optimal PQ. The index may be defined for each single point or for the whole facility. The benefit of the proposed generalization is that the PQ condition is easy to interpret in monitoring systems. Nourollah and Moallem [45] present the application of data mining to determine a unified power quality index which corresponds to all power quality parameters, with further classification, normalization, and incorporation. A fast independent component analysis algorithm was proposed to determine the power quality level of each distribution site.
The article [45] proposes two indices: the Supply-side Power Performance Index, which expresses the impact of six voltage indices, and the Load-side Power Performance Index, which corresponds to three current PQ indices. Raptis et al. [46] present artificial neural networks as a suitable tool to support PQ assessment using an index called the Total Power Quality Index. The index is an artificial neural network combination of eight power quality values used as input variables. The presented method uses a multilayer perceptron artificial neural network. Lee et al. [47] propose a distortion power quality index that accounts for the power distortion caused by non-linear loads. Its stated aim is to support harmonic pollution determination in a distributed power system by ranking the harmonic pollution of different non-linear loads. The index is obtained by multiplying the load composition rate by the total harmonic distortion of the load currents. Hanzelka et al. [48] propose the idea of a synthetic PQ index. This index is based on the maximum values of traditional PQ parameters. These parameters are slow voltage change, harmonic content in voltage (represented by total harmonic distortion in voltage, and particular harmonics from the 2nd to the 40th), unbalance, and voltage fluctuation (represented by long-term flicker severity). The proposed assessment provided only satisfactory or unsatisfactory results.

In the present work, two definitions of GPQIs are proposed—one for 10-min aggregated data, and the other for the events. The proposed indices are inspired by the synthetic approach described in [48,49]. Some elements of the GPQI definitions, in terms of the multipoint measurements, were also proposed by the authors in [50]. Typical for the generalization process is that global indices are usually less sensitive due to synthetization. In order to enhance the sensitivity, the global indices proposed in this work are not only based on classical 10-min aggregated power quality parameters, but they are also extended by other parameters like an envelope of voltage changes based on 200-ms values. In order to demonstrate the proposed approach, we also present an analysis of how selected parameters comprising the global index influence its sensitivity.

The first proposed global power quality index is called the aggregated data index (*ADI*), and is expressed in (1).

$$ADI = \sum_{i=1}^{7} k_i \cdot W_i \tag{1}$$

*ADI*—aggregated data index;

*i*—number of the factor ranging from 1 to 7;

*Wi*—the particular power quality factors which create a synthetic aggregated data index;

*ki*—the importance rate (weighted factors) of the particular power quality factor constituting the synthetic aggregated data index, range of [0, 1], where $\sum_{i=1}^{7} k_i = 1$.

The *ADI* utilizes five classical 10-min aggregated PQ parameters: frequency (*f*), voltage (*U*), short-term flicker severity (*P*st), asymmetry factor (*ku*2), and total harmonic distortion in voltage (*THDu*), together with two additional parameters which are responsible for the enhancement of the sensitivity of the proposed global index. The first additional parameter is an envelope of voltage deviation, obtained as the difference between the maximum and minimum of the 200-ms voltage values identified during the 10-min aggregation interval. The second is the maximum of the 200-ms values of the total harmonic distortion in voltage, similarly identified in the 10-min aggregation interval. The mentioned parameters are calculated in accordance with the IEC 61000-4-30 standard [7]. Three-phase values, such as *U*, *P*st, and *THDu*, are reduced to a single value by taking the mean of the three phase values. To be more specific, the particular factors that create the proposed *ADI* index are based on the differences between the measured 10-min aggregated power quality data and the recommended limits stated in the standards. The differences are expressed as a percentage in relation to the limits. The final values of the factors taken in the *ADI* calculation are the mean values of the time-varying factors during the time period of observation. Additionally, the contribution of the particular power quality factors to the global index can be controlled by the importance factors, which serve as the weights of the particular parameters. The values of the weighting factors are normalized to one. Selection of importance factors makes it possible to check the impact of single parameters as well as groups of parameters. The selection of parameters may be defined by a priori analysis of EPN problems (e.g., harmonics, voltage variations).
No a priori assumptions were made in this work, so all parameters were given the same weight and the same priority. The aim of the introduced weighting factors is to make it possible to focus the analysis on particular PQ parameters and neglect others; in other words, to obtain an analysis that is more sensitive to the selected PQ phenomena controlled by the weighting factors. For example, to justify adding the 200-ms values, analyses with and without them were conducted.
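The equal-weight choice and the "with and without 200-ms values" comparison can be illustrated with a minimal sketch. The factor names and the helper `importance_rates` are hypothetical labels introduced here for illustration; they are not taken from the paper. The sketch assumes that excluded factors simply receive a zero weight, so that the remaining weights still sum to one.

```python
# Hypothetical short names for the seven ADI factors W1..W7.
FACTORS = ("f", "U", "Pst", "ku2", "THDu", "Uenv", "THDumax")


def importance_rates(selected, factors=FACTORS):
    """Return equal weights for the selected factors and zero for the rest.

    The returned weights always sum to one, as required for the k_i in Eq. (1).
    """
    selected = set(selected)
    unknown = selected - set(factors)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    if not selected:
        raise ValueError("at least one factor must be selected")
    share = 1.0 / len(selected)
    return {name: (share if name in selected else 0.0) for name in factors}


# All seven factors with the same priority.
all_seven = importance_rates(FACTORS)

# Only the five classical 10-min parameters (200-ms based factors switched off).
classical_only = importance_rates(("f", "U", "Pst", "ku2", "THDu"))
```

Comparing an index computed with `all_seven` against one computed with `classical_only` is one possible way to see how much the 200-ms based factors contribute.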

Particular factors which create the global *ADI* index are defined as follows [50]:

$$W_1 = W_f = \frac{\mathrm{mean}\left(\left| f_{\mathrm{m}} - f_{\mathrm{nom}} \right|\right)}{\Delta f_{\mathrm{limit}}} \tag{2}$$

*W*<sup>1</sup> = *Wf*—factor of frequency change;

*f*m—10-min measured value of frequency;

*f*nom—nominal value of frequency;

*mean*(|*f*m − *f*nom|)—mean of frequency deviations in the observation time period;

Δ*f*limit—limit value of frequency change as a %.

$$W_2 = W_U = \frac{\mathrm{mean}\left(\left| U_{\mathrm{m}} - U_{\mathrm{c}} \right|\right)}{\Delta U_{\mathrm{limit}}} \tag{3}$$

*W*<sup>2</sup> = *WU*—factor of the voltage level;

*U*m—mean of 10-min measured values of voltage from three phases;

*U*c—declared voltage;

*mean*(|*U*<sup>m</sup> − *U*c|)—mean of voltage deviations in the observation period of time; Δ*U*limit—limit value of voltage change in volts.

$$W_3 = W_{Pst} = \frac{\mathrm{mean}\left(Pst_{\mathrm{m}}\right)}{Pst_{\mathrm{limit}}} \tag{4}$$

*W*<sup>3</sup> = *WPst*—factor of voltage variation;

*Pst*m—mean of 10-min measurement value of the short-term flicker severity index from three phases;

*mean*(*Pst*m)—mean of voltage variations in the observation time period;

*Pst*limit—limit value of short-term flicker severity.

$$W_4 = W_{ku2} = \frac{\mathrm{mean}\left(ku2_{\mathrm{m}}\right)}{ku2_{\mathrm{limit}}} \tag{5}$$

*W*<sup>4</sup> = *Wku*2—factor of voltage unbalance;

*ku*2m—10-min measured values of voltage unbalance;

*mean*(*ku*2m)—mean value of voltage unbalance in the observation time period;

*ku*2limit—limit level of voltage unbalance.

$$W_5 = W_{THDu} = \frac{\mathrm{mean}\left(THDu_{\mathrm{m}}\right)}{THDu_{\mathrm{limit}}} \tag{6}$$

*W*<sup>5</sup> = *WTHDu*—factor of total harmonic distortion factor of voltage supply;

*THDu*m—mean of 10-min measurement values of the total harmonic distortion factor of the voltage supply from three phases;

*mean*(*THDu*m)—mean value of the total harmonic distortion factor in the observation time period; *THDu*limit—limit level of the total harmonic distortion factor of the voltage supply.

$$W_6 = W_{Uenv} = \frac{\dfrac{\mathrm{mean}\left(\left| U_{\mathrm{max}} - U_{\mathrm{min}} \right|\right)}{U_{\mathrm{c}}}}{2 \times \Delta U_{\mathrm{limit}}} \tag{7}$$

*W*<sup>6</sup> = *WUenv*—factor of voltage deviation envelope;

*U*max—mean value of 200-ms voltage maximum values from three phases allocated in 10-min data; *U*min—mean value of 200-ms voltage minimum values from three phases allocated in 10-min data; *U*c—declared voltage;

*mean*(|*U*max − *U*min|)—mean of voltage envelope width in the observation time period; Δ*U*limit—limit level of voltage change.

$$W_7 = W_{THDu\mathrm{max}} = \frac{\mathrm{mean}\left(THDu_{\mathrm{max}}\right)}{THDu_{\mathrm{limit}}} \tag{8}$$

*W*<sup>7</sup> = *WTHDu*max—factor of the maximum 200 ms value of the total harmonic distortion factor of voltage supply;

*THDu*max—mean value of 200-ms maximum values of the total harmonic distortion factor of voltage supply from three phases;

*mean*(*THDu*max)—mean of the total harmonic distortion factor in the observation time period;

*THDu*limit—limit level of the total harmonic distortion factor of the voltage supply.

Once the particular factors *W*<sup>1</sup> ÷ *W*<sup>7</sup> have been prepared and their importance rates selected, the aggregated data index expresses the PQ level globally. The interpretation of the obtained index values is straightforward. A value of "0" represents the ideal PQ; "0–1" represents possible power quality deterioration, but in compliance with the requirements defined in the standards; and finally, a value greater than 1 indicates that the permissible parameter level defined in the standard is exceeded.
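As a rough illustration of Equations (1)–(8) and of the interpretation above, the following sketch computes the seven factors and the resulting *ADI* for a tiny made-up data set. All function names, factor keys, measured values and limits are assumptions chosen for illustration only; in particular, each limit is assumed to be expressed in the same unit as the corresponding measured quantity, and the envelope limit is assumed to be a fraction of the declared voltage.

```python
from statistics import fmean


def deviation_factor(measured, reference, limit):
    """Eqs. (2)-(3): mean absolute deviation from the reference over the limit."""
    return fmean(abs(value - reference) for value in measured) / limit


def level_factor(measured, limit):
    """Eqs. (4)-(6) and (8): mean measured level over its limit."""
    return fmean(measured) / limit


def envelope_factor(u_max, u_min, u_declared, limit):
    """Eq. (7): mean envelope width relative to U_c, over twice the limit."""
    width = fmean(abs(high - low) for high, low in zip(u_max, u_min))
    return (width / u_declared) / (2 * limit)


def adi(factors, weights):
    """Eq. (1): weighted sum of the factors; the weights must sum to one."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("importance rates must sum to 1")
    return sum(weights[name] * value for name, value in factors.items())


def interpret(value):
    """Map an ADI value to the three interpretation bands described in the text."""
    if value == 0:
        return "ideal"
    if value <= 1:
        return "within limits"
    return "limits exceeded"


# Made-up readings for two 10-min intervals and illustrative limits.
factors = {
    "f": deviation_factor([50.01, 49.98], reference=50.0, limit=0.5),
    "U": deviation_factor([20150.0, 20050.0], reference=20000.0, limit=2000.0),
    "Pst": level_factor([0.4, 0.6], limit=1.0),
    "ku2": level_factor([0.3, 0.5], limit=2.0),
    "THDu": level_factor([2.0, 3.0], limit=8.0),
    "Uenv": envelope_factor([20300.0, 20200.0], [19900.0, 19800.0], 20000.0, 0.1),
    "THDumax": level_factor([3.0, 4.0], limit=8.0),
}
weights = {name: 1.0 / len(factors) for name in factors}  # equal importance rates

adi_value = adi(factors, weights)
print(f"ADI = {adi_value:.3f} ({interpret(adi_value)})")
```

Because every factor is normalized by its own limit, a single factor above 1 can be masked by the others in the weighted sum; inspecting the individual entries of `factors` alongside the global value avoids that.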

The second proposed global index relates to events. The classical approach to power quality assessment utilizes a flagging concept, which generally prescribes the exclusion of the aggregated values that are affected by events like dips, swells and interruptions. The authors propose to use the information about the number of data points that are not considered in classical PQ analysis due to the flagging concept. This is used as the basis for a global index called the flagged data index (*FDI*), defined as follows [50]:

$$FDI = \frac{f}{n} \times 100\% \tag{9}$$

*FDI*—flagged data index;

*f*—number of 10-min data, which were flagged in the observation time period;

*n*—number of all 10-min data in the observation time period.

Interpretation of the obtained *FDI* values can be formulated such that "0%" represents the ideal PQ without any event disturbances, and "100%" expresses measurement data where each averaged value is contaminated by voltage events.
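Equation (9) reduces to counting flagged intervals. A minimal sketch, assuming the flags are available as one boolean per 10-min interval (the function name `fdi` and the input layout are illustrative assumptions):

```python
def fdi(flags):
    """Eq. (9): share of flagged 10-min intervals, in percent."""
    flags = list(flags)
    if not flags:
        raise ValueError("the observation period contains no data")
    flagged = sum(1 for flag in flags if flag)
    return 100.0 * flagged / len(flags)


# Hypothetical observation period of four intervals, one of them flagged.
example_fdi = fdi([False, True, False, False])
```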

The proposed concepts for the generalization of the power quality assessment using GPQIs can be implemented for the fixed time period of observation or for identified periods of time representing different features of the power quality condition of the monitored area of the power system. The identification of such periods can be achieved using cluster analysis.
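Applying the indices to identified periods rather than to one fixed period amounts to grouping the 10-min records by cluster label before evaluating them. The sketch below assumes the labels already exist (the clustering step itself is not shown) and, to stay short, evaluates only one factor, *W*<sub>THDu</sub>, together with the *FDI*. The record layout, the labels "DG on" and "DG off", and the limit value are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def per_cluster_indices(records, thdu_limit):
    """Group (label, THDu, flagged) records by label and evaluate each group.

    Returns {label: (W_THDu, FDI)}. Flagged records are left out of the
    aggregated factor but counted in the FDI; W_THDu is None if every
    record of a cluster is flagged.
    """
    groups = defaultdict(list)
    for label, thdu, flagged in records:
        groups[label].append((thdu, flagged))

    result = {}
    for label, rows in groups.items():
        unflagged = [thdu for thdu, flagged in rows if not flagged]
        w_thdu = fmean(unflagged) / thdu_limit if unflagged else None
        fdi = 100.0 * sum(1 for _, flagged in rows if flagged) / len(rows)
        result[label] = (w_thdu, fdi)
    return result


# Made-up records: (cluster label, 10-min THDu in %, flagged by an event).
records = [
    ("DG on", 2.0, False),
    ("DG on", 4.0, False),
    ("DG off", 8.0, False),
    ("DG off", 0.0, True),
]
indices = per_cluster_indices(records, thdu_limit=8.0)
```

The same grouping can be extended to all seven factors, which then gives one *ADI* and one *FDI* per cluster for direct comparison.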
