Proceeding Paper

Hourly Load Curves Disaggregated by Type of Consumer Using A Density-Based Spatial Clustering Technique †

by Carlos Andrés Peñaloza and Patricia Elizabeth Otero-Valladares *
Department of Electrical Energy, Escuela Politécnica Nacional, Quito 170525, Ecuador
* Author to whom correspondence should be addressed.
† Presented at the XXXI Conference on Electrical and Electronic Engineering, Quito, Ecuador, 29 November–1 December 2023.
Eng. Proc. 2023, 47(1), 23; https://doi.org/10.3390/engproc2023047023
Published: 7 December 2023
(This article belongs to the Proceedings of XXXI Conference on Electrical and Electronic Engineering)

Abstract

This paper presents a new methodology for obtaining load curves in distribution systems, applied to the Azogues Electric Company (EEA). The proposed approach imports the measurement database from Excel and, through an iterative density-based clustering algorithm that handles noise, generates daily load curves separated into weekdays and weekends, yielding one curve per type of consumer. The resulting groups are validated with the Silhouette verification index (IS), which identifies poor groupings so that they can be discarded before the final results are obtained. Results are presented in per unit (pu) as tables of hourly values for weekdays and weekends. A graphical comparison with the previous methodology, which processes real measurements in Excel, is also included.

1. Introduction

Electric companies periodically carry out studies of the distribution system to identify and execute the necessary infrastructure investments, with the aim of reducing technical losses and maintaining quality standards [1]. These studies handle data from low voltage measurements and from measurements at the feeder head in order to develop electrical studies, such as load characterization, and to support the system operator. Some of the measured data show great similarity or homogeneity, so they are analyzed as daily load curves over hours 0–23.
Currently, Excel is used with average calculations to obtain load curves by consumption group [1]. However, negative and zero-valued measurements distort the final shape of the group of curves, so the load curves must first be clustered manually and the atypical measurements eliminated. For this reason, it is worth having an algorithm based on data mining techniques that obtains one load curve for each type of consumer; these curves serve as the starting point for expansion planning and loss calculations at medium and low voltage levels [2].
Clustering algorithms are now widely used in data mining tools to identify important patterns and similar distributions in large databases. Several methods obtain groups or clusters using averages, dendrograms, or grids; the one chosen in this work is the density-based method, which identifies groups of arbitrary shape in the presence of atypical data or noise [3].

2. Materials and Methods

2.1. Meter Measurement Data

Measurement data from 236 meters were obtained from the main feeders of the EEA distribution subsystem, as well as from users connected to the secondary circuits, for the years 2018 and 2019. These meters are classified by type of consumer in the EEA user cadaster. Table 1 lists the meters by consumer type and voltage level. Low voltage meters measure on a 220 V grid, while the medium voltage meters are located on a 13.8 kV grid. The total for each type of consumer is shown in the right column.

2.2. Per unit System of Data

The real daily values of average power obtained from each meter are normalized to a per unit (pu) system over 24 h (0–23 h). The base power, in the denominator, is the maximum value of each day, and the measured data are in the numerator. This is represented in Equation (1):
$$P_{pu} = \frac{Promedio\,P}{P_{maxPorDia}} \quad (1)$$
  • $P_{pu}$: Dimensionless value in the per unit system.
  • $Promedio\,P$: Average value of active power in kW.
  • $P_{maxPorDia}$: Maximum value of active power in kW per day.
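
As an illustration of Equation (1), the following minimal Python sketch normalizes one day of hourly readings by that day's maximum. It is not the MATLAB code used in this work: the function name `to_per_unit` and the sample values are hypothetical, and rejecting days without a positive maximum is an assumption of this sketch.

```python
def to_per_unit(daily_kw):
    """Normalize one day's 24 hourly average-power values (kW) by the daily maximum."""
    if len(daily_kw) != 24:
        raise ValueError("expected 24 hourly values (hours 0-23)")
    p_max = max(daily_kw)
    if p_max <= 0:
        # Assumption: a day without a positive reading cannot serve as a base.
        raise ValueError("daily maximum must be positive")
    return [p / p_max for p in daily_kw]


# Hypothetical day: flat load during hours 0-17, evening peak at hour 20.
day_kw = [2.0] * 18 + [4.0, 6.0, 8.0, 6.0, 4.0, 3.0]
day_pu = to_per_unit(day_kw)
peak_hour = day_pu.index(max(day_pu))
print(peak_hour, day_pu[peak_hour], day_pu[0])
```

In this made-up day the peak hour maps to 1.0 pu and every other hour becomes a fraction of it, which is what allows curves from meters of very different size to be compared by shape.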

2.3. DBSCAN in MATLAB

Two types of characteristic load curves are grouped according to the day of the week, a group from Monday to Friday and another from Saturdays to Sundays. Also, the types of consumers are identified as residential, commercial, industrial, and others.
In this process, the knnsearch function is used to build a k-distance graph of the curves; the knee_pt function then finds the knee of this graph, which gives the value of Eps and its typical zone. Eps is a required argument for establishing the clusters with the DBSCAN function. The Mpts argument is set to a relatively low value so that all points belonging to the same group are included. The algorithm requires three input parameters and returns a vector idx with the resulting grouping or cluster assignment. Equation (2) shows the call in MATLAB R2021b:
$$idx = \mathrm{DBSCAN}(X,\ Eps,\ Mpts) \quad (2)$$
  • $X$: Data matrix (the list of points whose neighborhoods are evaluated).
  • $Eps$: Radius that delimits the neighborhood area of a point (neighborhood-Eps).
  • $Mpts$: Minimum number of data or objects within neighborhood-Eps.
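
To make the roles of X, Eps, and Mpts concrete, here is a minimal pure-Python sketch of density-based clustering in the style of DBSCAN. It is an illustration under stated assumptions, not the MATLAB routine used in this work: the names `dbscan` and `region_query` are chosen for this sketch, a point counts as its own neighbor, noise is labeled -1, and the sample points are hypothetical.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters and -1 for noise."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned while expanding an earlier cluster
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical 2-D points: two tight groups and one isolated outlier.
points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
          (1.0, 1.0), (1.0, 1.1), (1.1, 1.0),
          (5.0, 5.0)]
labels = dbscan(points, eps=0.2, min_pts=2)
print(labels)
```

In the load-curve setting each point would be a 24-value daily curve in pu rather than a 2-D coordinate; the distance computation works the same way for any dimension.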

2.4. KNNSearch

This function finds the K nearest neighbors according to Euclidean distance and returns their indices in a column vector d together with their respective distances kD. It takes the input data or database as arguments. Equation (3) shows its structure in MATLAB R2021b:
$$[d,\ kD] = \mathrm{knnsearch}(X,\ Y,\ Name,\ Value) \quad (3)$$
  • $X$: Values in pu of data 0–23 h for the measurement days.
  • $Y$: Values in pu 0–23 h.
  • $Name$: The argument name “K”, which requests the nearest-neighbor distances.
  • $Value$: Number of nearest neighbors to find under the distance metric.
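
The k-distance values that feed the K-dist graph can be sketched in a few lines of Python. This is a brute-force illustration, not the knnsearch implementation: the function name `k_distances` is hypothetical, and excluding each point from its own neighbor list is a choice made for this sketch.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        # Distances to every other point; the point itself is excluded.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Hypothetical 1-D data: three close points and one far away.
sample = [(0.0,), (1.0,), (2.0,), (10.0,)]
k_dist = k_distances(sample, k=2)
print(k_dist)
```

Sorting the values produces the monotone curve on which a knee can be located: points in dense regions contribute small distances, while isolated points produce the steep tail.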

2.5. Kneepoint

The kneepoint function walks along the K-dist plot of distances one bisection point at a time, fitting two lines: one to the points on the left of the bisection point and one to the points on its right. The knee is the bisection or threshold point that minimizes the sum of errors for the two fits. Any value below this threshold Eps can cluster the patterns efficiently, because such values lie in the typical territory of the k-dist plot [4]. Equation (4) shows the call in MATLAB:
$$x = \mathrm{knee\_pt}(distances) \quad (4)$$
  • $distances$: Euclidean distances of the K-dist graph.
  • $x$: Value in x of the knee of the K-dist graph.
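
The idea of locating a knee by minimizing the error of two line fits can be sketched as follows. This is a simplified Python illustration, not the knee_pt source: the names `line_fit_error` and `knee_index` are hypothetical, the x-values are assumed to be the sample indices, and the split convention (left side includes the candidate point, right side starts after it) is an assumption of this sketch.

```python
def line_fit_error(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def knee_index(ys):
    """Index of the split that minimizes the summed error of a left and a right line fit."""
    xs = list(range(len(ys)))
    best_i, best_err = None, float("inf")
    # Each side needs at least two points to define a line.
    for i in range(1, len(ys) - 2):
        err = (line_fit_error(xs[: i + 1], ys[: i + 1])
               + line_fit_error(xs[i + 1:], ys[i + 1:]))
        if err < best_err:
            best_i, best_err = i, err
    return best_i


# Hypothetical sorted k-distances: a gentle slope followed by a steep rise.
sorted_k_dist = [0.0, 0.1, 0.2, 0.3, 0.4, 2.0, 4.0, 6.0]
knee = knee_index(sorted_k_dist)
eps_candidate = sorted_k_dist[knee]
print(knee, eps_candidate)
```

With these made-up values the split lands on the last point of the gentle segment, and the distance at that index is the candidate for Eps.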

2.6. Validation Index Silhouette (IS)

Each group can be represented by a silhouette, which is based on a comparison of its closeness and separation. The silhouette shows which objects are well classified within their group and which merely lie between groups. The average silhouette width provides an assessment of clustering validity and can be used to select an “appropriate” number of clusters [5]. Equation (5) shows the call in MATLAB:
$$s = \mathrm{silhouette}(X,\ idx) \quad (5)$$
  • $X$: Data matrix of the objects.
  • $idx$: Partition obtained by applying a grouping or cluster technique.
  • $s$: Value between −1 and 1, where 1 denotes belonging to the cluster and −1 not belonging.
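
The silhouette value of a point compares its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b), as s = (b − a) / max(a, b) [5]. The Python sketch below computes it by brute force. It is an illustration, not the MATLAB silhouette function: the name `silhouette_values` is hypothetical, at least two clusters are assumed, and single-member clusters are given the value 0 by convention.

```python
import math


def silhouette_values(points, labels):
    """Silhouette value s = (b - a) / max(a, b) for every point."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # convention for a cluster with a single member
            continue
        a = mean_dist(i, own)  # closeness: mean distance inside the own cluster
        # separation: mean distance to the nearest other cluster
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values


# Hypothetical 1-D data with an obvious two-group structure.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_values(pts, [1, 1, 2, 2])   # sensible partition
bad = silhouette_values(pts, [1, 2, 2, 2])    # point 1 forced into the far group
print(good, bad)
```

In the second partition the misplaced point receives a clearly negative value, which is the signal used to spot an incorrect grouping.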

3. Discussion

3.1. Clustering on Weekdays and Weekends

Figure 1a is a K-dist graph used to find Epsilon (Eps) with the kneepoint function. The knee of the distance function is marked with a red circle, and any value up to that point on the curve is a valid input parameter for DBSCAN. Eps is established with the kneepoint function, and Mpts is set to a value greater than 1. Groupings such as the one in Figure 1b are then obtained; the plots carry labels on the bottom, top, and lateral sides.
Figure 2a shows the group of meters considered noise. Given the amount of data in this figure, no common behavior can be distinguished among the characteristic curves; a resulting yellow curve is also drawn, but it does not truly represent this grouping. The algorithm then validates the grouping performance. For this task, the IS index computed in MATLAB, shown in Figure 2b, detects which cluster is incorrectly grouped: its index values are −1.
The same procedure is applied to the weekends, giving different values for the knee point, the grouping, and the validation. Note that the noise is processed again: filters take the eliminated curves and reapply the clustering technique to them in order to extract further results from the system.
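
The validation step can be sketched as a simple filter over per-point silhouette values. This Python fragment is only an illustration: the function name `flag_bad_clusters`, the use of the mean silhouette per cluster, and the threshold of 0 are assumptions of this sketch rather than the rule applied in this work, and the numbers are made up.

```python
def flag_bad_clusters(labels, sil_values, threshold=0.0):
    """Return the cluster labels whose mean silhouette value is below the threshold."""
    sums, counts = {}, {}
    for lab, s in zip(labels, sil_values):
        sums[lab] = sums.get(lab, 0.0) + s
        counts[lab] = counts.get(lab, 0) + 1
    return sorted(lab for lab in sums if sums[lab] / counts[lab] < threshold)


# Hypothetical result: clusters 1 and 2 are cohesive, cluster 8 is not.
cluster_labels = [1, 1, 2, 2, 8, 8]
sil = [0.9, 0.8, 0.7, 0.6, -1.0, -0.9]
rejected = flag_bad_clusters(cluster_labels, sil)
kept = [lab for lab in cluster_labels if lab not in rejected]
print(rejected, kept)
```

Curves belonging to a rejected cluster could then be set aside and passed through the clustering step again, in the spirit of the noise reprocessing described above.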

3.2. System Results

After submitting the database to the proposed algorithm in MATLAB R2021b, load curves for weekdays and weekends were obtained. The objective is to establish only one curve per type of consumer, plus a system curve; reference [6] is used for this purpose. Figure 3 shows the characteristic curves of the different consumption groups and a total system curve, with the hour of the day (0–23 h) on the x-axis and the per unit value of the measurement on the y-axis.
Table 2 presents the numerical results of the curves shown in Figure 3, separated into (a) weekday results and (b) weekend clustering results.

3.3. Comparison with Previous Method

The method used by the EEA to obtain load curves takes the real measurements, exports them to an Excel file, and then applies a series of filters on georeferencing, voltage level, or feeder to obtain similar curves in real values of active power (kW). The analyst then empirically eliminates the curves that do not represent the characteristic behavior of the consumer type, and finally groups the similar curves and averages them. These curves are later used in distribution network simulation programs, where they must be in pu to obtain electrical analysis results, power flows, and losses.

Weekdays Comparison

Figure 4 and Figure 5 show the final weekday curves: the left side is the result of the electric company's previous method in Excel, while the right side is the curve obtained with the DBSCAN algorithm in MATLAB. The new method changes the y-axis from real values in kW to pu values, while the x-axis remains the hour of the day.
The curves in Figure 4 show different behavior for the same Urban Residential consumer type, with peaks at different hours. This is because the old method lacks enough data to apply the filters in Excel, while the new one groups similar curves directly with the density algorithm.
Figure 5 shows similar results at the consumption peaks. In this case, the Excel database included many similar curves for the commercial consumer, so the grouping step was successful. Likewise, the pu values make it easier to identify the resulting curve and its maximum demand.

4. Conclusions

The advantages of the method presented in this work are the use of a clustering algorithm to obtain load curves in pu and the use of a validation tool to discard atypical measurements. With these two features, the method outperforms the previous EEA methodology: it obtains only one curve per type of consumer through a clustering technique rather than relying on the analyst's experience.
During the clustering procedure, the algorithm identified as noise those load curves that behave like public lighting or like industries that operate only in the early morning hours or at night. These types of consumers were excluded from the final curves because of their small contribution to the electrical distribution system.

Author Contributions

Conceptualization, P.E.O.-V. and C.A.P.; methodology, C.A.P.; software, C.A.P.; validation, C.A.P., P.E.O.-V.; formal analysis, P.E.O.-V.; investigation, C.A.P.; data curation, C.A.P.; writing—original draft preparation, C.A.P.; writing—review and editing, P.E.O.-V.; visualization, P.E.O.-V.; supervision, P.E.O.-V.; project administration, P.E.O.-V.; All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data is unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ministerio de Electricidad y Energías Naturales No Renovables, “Plan Maestro de Electricidad 2018–2027”, Chapter 3, Estudio de la Demanda. Available online: https://www.recursosyenergia.gob.ec/plan-maestro-de-electricidad/ (accessed on 13 March 2023).
  2. ARCONEL. Plan Maestro de Electrificación 2013–2022, Perspectiva y Expansión del Sistema Eléctrico Ecuatoriano. Available online: https://www.regulacionelectrica.gob.ec/plan-maestro-de-electrificacion-2013-2022/ (accessed on 15 April 2023).
  3. Fong, S.; Rehman, S.U. DBSCAN: Past, Present and future. In Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Chennai, India, 17–19 February 2014.
  4. Irwin, D.; Albrecht, J.; Satopa, V. Finding a “Kneedle” in a Haystack: Detecting Knee Points in System Behavior. In Proceedings of the 31st International Conference on Distributed Computing Systems Workshops, Minneapolis, MN, USA, 20–24 June 2011.
  5. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65.
  6. Gönen, T. Electric Power Distribution Engineering; CRC Press: Boca Raton, FL, USA, 2014.

Figure 1. Complementary algorithms and clustering. (a) Kneepoint and Knnsearch on finding Eps; (b) Partial result of a residential group in the clustering algorithm.
Figure 2. Noise and validation. (a) Cluster 8 on weekdays, NOISE; (b) SILHOUETTE validation index, identifying cluster 8 as noise.
Figure 3. Graphical results in $P_{pu}$ of each consumer and system curve. (a) Weekdays; (b) Weekends.
Figure 4. Final load curve on weekdays, Urban residential. (a) Previous method; (b) DBSCAN.
Figure 5. Final load curve on weekdays, Commercial. (a) Previous method; (b) DBSCAN.

Table 1. Meter information by type of consumer.

| Type of Consumer | Low Voltage 220 V | Medium Voltage 13.8 kV | Total |
|---|---|---|---|
| Commercial | 25 | 20 | 45 |
| Industrial | 18 | 14 | 32 |
| Others | 9 | 9 | 18 |
| Residential | 131 | 0 | 131 |
| Not Identified | 5 | 5 | 10 |
| Total | 188 | 48 | 236 |

Table 2. Load curves numerical results imported from MATLAB (a) weekdays; (b) weekends.

(a) weekdays

| Daily Hour | Residential U | Residential R | Commercial | Industrial | Others | System Curve-Weekdays |
|---|---|---|---|---|---|---|
| 0 | 0.2295 | 0.4077 | 0.4293 | 0.0369 | 0.6848 | 0.4910 |
| 1 | 0.2234 | 0.3907 | 0.4179 | 0.0363 | 0.6836 | 0.4810 |
| 2 | 0.2166 | 0.3690 | 0.4070 | 0.0360 | 0.6790 | 0.4689 |
| 3 | 0.2203 | 0.3577 | 0.4083 | 0.0372 | 0.6730 | 0.4658 |
| 4 | 0.2592 | 0.3706 | 0.4330 | 0.0411 | 0.6695 | 0.4869 |
| 5 | 0.3544 | 0.4089 | 0.4916 | 0.0524 | 0.6723 | 0.5435 |
| 6 | 0.4687 | 0.4478 | 0.5940 | 0.1105 | 0.6973 | 0.6365 |
| 7 | 0.5211 | 0.4553 | 0.7355 | 0.2970 | 0.7510 | 0.7578 |
| 8 | 0.4886 | 0.4338 | 0.8741 | 0.6037 | 0.8016 | 0.8791 |
| 9 | 0.4213 | 0.4020 | 0.9673 | 0.8748 | 0.8257 | 0.9586 |
| 10 | 0.3743 | 0.3749 | 1.0000 | 1.0000 | 0.8305 | 0.9829 |
| 11 | 0.3463 | 0.3649 | 0.9785 | 0.9806 | 0.8287 | 0.9607 |
| 12 | 0.3098 | 0.3726 | 0.9308 | 0.8761 | 0.8245 | 0.9099 |
| 13 | 0.2718 | 0.3909 | 0.8990 | 0.8071 | 0.8219 | 0.8761 |
| 14 | 0.2661 | 0.4143 | 0.8957 | 0.8488 | 0.8245 | 0.8922 |
| 15 | 0.3017 | 0.4465 | 0.8853 | 0.9244 | 0.8312 | 0.9305 |
| 16 | 0.3647 | 0.4969 | 0.8404 | 0.8927 | 0.8480 | 0.9453 |
| 17 | 0.4668 | 0.5931 | 0.7768 | 0.6857 | 0.8909 | 0.9372 |
| 18 | 0.6531 | 0.7578 | 0.7039 | 0.3999 | 0.9550 | 0.9526 |
| 19 | 0.8800 | 0.9303 | 0.6224 | 0.1884 | 1.0000 | 0.9942 |
| 20 | 1.0000 | 1.0000 | 0.5499 | 0.0934 | 0.9988 | 1.0000 |
| 21 | 0.9326 | 0.9422 | 0.4998 | 0.0625 | 0.9487 | 0.9296 |
| 22 | 0.7421 | 0.8288 | 0.4698 | 0.0517 | 0.8692 | 0.8131 |
| 23 | 0.5992 | 0.7542 | 0.4571 | 0.0476 | 0.8148 | 0.7339 |
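
(b) weekends

| Daily Hour | Residential U | Residential R | Commercial | Industrial | Others | System Curve-Weekends |
|---|---|---|---|---|---|---|
| 0 | 0.7927 | 0.5184 | 0.7238 | 0.7468 | 0.7484 | 0.7648 |
| 1 | 0.7219 | 0.5000 | 0.7206 | 0.7263 | 0.7454 | 0.7397 |
| 2 | 0.6370 | 0.4811 | 0.7157 | 0.7064 | 0.7401 | 0.7107 |
| 3 | 0.5861 | 0.4721 | 0.7115 | 0.7083 | 0.7351 | 0.6961 |
| 4 | 0.5732 | 0.4762 | 0.7121 | 0.7213 | 0.7233 | 0.6945 |
| 5 | 0.5842 | 0.4936 | 0.7195 | 0.7379 | 0.6984 | 0.7005 |
| 6 | 0.6141 | 0.5212 | 0.7385 | 0.7494 | 0.6853 | 0.7168 |
| 7 | 0.6598 | 0.5554 | 0.7867 | 0.7726 | 0.7129 | 0.7555 |
| 8 | 0.7203 | 0.5941 | 0.8722 | 0.8246 | 0.7563 | 0.8162 |
| 9 | 0.7689 | 0.6157 | 0.9593 | 0.8731 | 0.7767 | 0.8652 |
| 10 | 0.7579 | 0.6118 | 1.0000 | 0.9177 | 0.7750 | 0.8801 |
| 11 | 0.7140 | 0.6155 | 0.9901 | 0.9538 | 0.7702 | 0.8760 |
| 12 | 0.6959 | 0.6265 | 0.9531 | 0.9438 | 0.7722 | 0.8647 |
| 13 | 0.6944 | 0.6290 | 0.9091 | 0.9199 | 0.7798 | 0.8519 |
| 14 | 0.6704 | 0.6347 | 0.8653 | 0.9420 | 0.7888 | 0.8452 |
| 15 | 0.6497 | 0.6425 | 0.8223 | 0.9834 | 0.8039 | 0.8453 |
| 16 | 0.6661 | 0.6558 | 0.7814 | 1.0000 | 0.8397 | 0.8542 |
| 17 | 0.7262 | 0.7342 | 0.7578 | 0.9952 | 0.9103 | 0.8934 |
| 18 | 0.8355 | 0.8788 | 0.7555 | 0.9549 | 0.9839 | 0.9551 |
| 19 | 0.9500 | 0.9916 | 0.7555 | 0.9188 | 1.0000 | 1.0000 |
| 20 | 1.0000 | 1.0000 | 0.7485 | 0.8959 | 0.9585 | 0.9972 |
| 21 | 0.9745 | 0.9164 | 0.7398 | 0.8728 | 0.8937 | 0.9526 |
| 22 | 0.8981 | 0.7987 | 0.7341 | 0.8659 | 0.8272 | 0.8935 |
| 23 | 0.8380 | 0.7223 | 0.7322 | 0.8689 | 0.7893 | 0.8559 |
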
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
