#### *4.2. Data Preprocessing: Feature Selection*

Data preprocessing comprises missing-data imputation, deletion of invalid data, selection of eligible days, and dimension reduction, all of which are needed to obtain reasonable results. We conducted feature selection as part of the dimension reduction for DR potential. When selecting features, we considered which factors affect DR reduction.


Therefore, we selected three features from the load profile as principal factors: daily consumption, peak hour, and the difference between maximum and minimum demand. Redundant features should then be removed through a correlation analysis among these features. As a result of the correlation analysis, we retained two features (daily consumption and peak hour) for the first-stage clustering based on demand characteristics. The correlation analysis among the demand-characteristic features is illustrated in Figure 4.

**Figure 4.** Feature selection from load profile characteristics of residential customers.
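The feature selection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not state which correlation coefficient or cut-off was used, so the Pearson coefficient, the `threshold=0.8` value, the greedy keep-or-drop rule, and the toy load profiles are all assumptions.

```python
from statistics import mean

def extract_features(profile):
    """Return (daily consumption, peak hour, max-min difference) for 24 hourly kWh values."""
    daily = sum(profile)
    peak_hour = max(range(len(profile)), key=lambda h: profile[h])
    spread = max(profile) - min(profile)
    return daily, peak_hour, spread

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def select_features(rows, names, threshold=0.8):
    """Keep a feature only if it is not strongly correlated with one already kept.

    The threshold is an assumed value; the paper does not report one.
    """
    columns = list(zip(*rows))
    kept = []
    for j in range(len(names)):
        if all(abs(pearson(columns[j], columns[k])) < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

def toy_profile(base, peak_hour, peak_height):
    """Hypothetical flat profile with a single one-hour peak (illustration only)."""
    profile = [base] * 24
    profile[peak_hour] = base + peak_height
    return profile

# Toy customers: larger consumers also have a larger max-min difference.
profiles = [
    toy_profile(0.2, 20, 1.0),
    toy_profile(0.3, 8, 2.0),
    toy_profile(0.4, 21, 3.0),
    toy_profile(0.5, 7, 4.0),
]
names = ["daily_consumption", "peak_hour", "max_min_difference"]
rows = [extract_features(p) for p in profiles]
selected = select_features(rows, names)
print(selected)
```

In this toy data the max-min difference is almost perfectly correlated with daily consumption, so it is dropped and the two remaining features match those retained in the text.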

#### *4.3. Load Profile Segmentation of Residential DR Customers*

When load profile clustering is conducted for customer segmentation, it is essential to determine the optimal number of clusters in the data. We used the NbClust package in the R statistical software to estimate the number of clusters, following Charrad et al. [31]. This package provides 30 indices for determining the number of clusters in a data set and proposes the best clustering scheme [32]. NbClust also provides Hubert statistic values and Dunn index values, which offer a graphical method for determining the number of clusters: the optimal number is located at the peak of the plot of second differences. For the first-stage clustering based on demand characteristics (i.e., daily consumption, peak demand time, and difference between peak and minimum demand), this peak indicates that the optimal number of clusters is six. Figure 5 shows the Hubert index and D index results.

**Figure 5.** Examination methods to determine the number of clusters by (**a**) the Hubert index and (**b**) the D index.
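The graphical rule described above (pick the number of clusters at the peak of the second-difference plot) can be sketched in a few lines. This is only an illustration of the rule, not a reimplementation of NbClust; the function names and the index values are hypothetical.

```python
def second_differences(values):
    """Discrete second difference v[i+1] - 2*v[i] + v[i-1] at each interior point."""
    return [values[i + 1] - 2 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]

def pick_k_by_second_difference(ks, index_values):
    """Return the k whose second difference is largest (the 'peak' of the plot)."""
    d2 = second_differences(index_values)
    best = max(range(len(d2)), key=lambda i: d2[i])
    return ks[best + 1]  # d2[0] belongs to ks[1], since endpoints have no second difference

# Hypothetical index values for k = 2..10 with a knee at k = 6 (not data from the paper).
ks = list(range(2, 11))
index_values = [10.0, 8.5, 7.2, 6.0, 4.0, 3.8, 3.6, 3.5, 3.4]
best_k = pick_k_by_second_difference(ks, index_values)
print(best_k)
```

The endpoints of the curve are skipped because a second difference needs a neighbour on both sides, so the smallest and largest candidate k can never be selected by this rule.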

After the first-stage clustering, second-stage clustering for customer segmentation based on demand patterns was conducted. The optimal numbers of clusters for the six groups separated by demand characteristics were 3, 2, 2, 2, 2, and 2, respectively. Therefore, we separated the residential customers who participated in the PTR pilot program into 13 groups according to load patterns and consumption.
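The two-stage structure can be illustrated with a small, self-contained sketch: cluster on demand-characteristic features first, then cluster the load patterns separately inside each first-stage group. Everything below is an assumption made for illustration (a plain Lloyd's k-means with deterministic farthest-point initialization, toy data, and the helper names); the paper's own runs used six first-stage groups with 3, 2, 2, 2, 2, and 2 second-stage clusters.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialization keeps this sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def two_stage_clustering(features, patterns, k1, k2_per_group):
    """Stage 1 on demand characteristics, stage 2 on load patterns within each group."""
    stage1 = kmeans(features, k1)
    final = [None] * len(features)
    next_id = 0
    for g in range(k1):
        idx = [i for i, l in enumerate(stage1) if l == g]
        sub_labels = kmeans([patterns[i] for i in idx], k2_per_group[g])
        for i, l in zip(idx, sub_labels):
            final[i] = next_id + l
        next_id += k2_per_group[g]
    return final

# Toy data: (scaled daily consumption, scaled peak hour) and a 4-point load shape.
features = [(0.10, 0.20), (0.12, 0.22), (0.11, 0.18), (0.13, 0.21),
            (0.90, 0.80), (0.88, 0.82), (0.91, 0.79), (0.87, 0.81)]
patterns = [(0.90, 0.10, 0.00, 0.00), (0.80, 0.20, 0.00, 0.00),
            (0.00, 0.00, 0.10, 0.90), (0.00, 0.10, 0.10, 0.80),
            (0.85, 0.15, 0.00, 0.00), (0.90, 0.05, 0.05, 0.00),
            (0.00, 0.00, 0.20, 0.80), (0.05, 0.00, 0.10, 0.85)]
groups = two_stage_clustering(features, patterns, k1=2, k2_per_group=[2, 2])
print(groups)
```

Because the second stage only ever compares customers that already share similar demand characteristics, the load-shape distance is not diluted by consumption-level differences.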

We performed load profile segmentation through our proposed method and then compared the resulting customer segmentation with that obtained from other clustering models. We examined 12 methods, including our proposed method. The remaining clustering methods are based on the fundamental k-means, SOM [16,17,33], and FCM [34–36] methodologies, in which the classification variables are: (1) demand characteristics, (2) load patterns, and (3) both characteristics and load patterns.

To compare the results, two internal evaluation measures were used: the Davies-Bouldin index (DBI) and the Dunn index (DI). A lower DBI and a higher DI indicate more compact, better-separated clusters. The DBI and DI results of the clustering methods are presented in Table 3. The proposed methodology showed the best result according to Table 3, so we conclude that it is appropriate.

We attribute the better result to the following. A two-stage clustering framework can reflect the impact of each feature: the first-stage segmentation, which considers the factors affecting DR reduction, performs a rough load profile clustering, so that the second-stage segmentation can then separate customers by load pattern within each group. Generally, clustering methods separate data based on the distance between input variables. Therefore, an undesirable result may be obtained if many input variables are used unnecessarily, because the effect of each variable becomes difficult to verify. The proposed methodology, in contrast, considers all variables by separating the clustering into two stages, which explains the strong result of the two-stage k-means clustering.

The proposed method separates residential customers into 13 groups, with the load profiles of each group illustrated in Figure 6. Groups 1 through 13 contain 14, 15, 25, 76, 56, 38, 85, 120, 88, 98, 68, 85, and 79 customers, respectively. The load profiles of the 13 groups showed morning peak, evening peak, nighttime peak, and dual morning and night peaks. Residential customers do not usually consume electricity during the daytime, so these peak characteristics are consistent with residential load profiles.

**Table 3.** DR operation clustering result evaluation according to variable selection and clustering structure.
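For readers unfamiliar with the two internal measures, the sketch below computes them from their standard definitions on a toy example. It assumes Euclidean distance and centroid-based scatter for the DBI; the paper does not state which variants were used, and the points are hypothetical.

```python
from itertools import combinations
from math import dist

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def davies_bouldin(clusters):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance. Lower is better."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)]
    n = len(clusters)
    return sum(
        max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(n) if j != i)
        for i in range(n)
    ) / n

def dunn(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter. Higher is better."""
    min_between = min(dist(p, q)
                      for a, b in combinations(clusters, 2) for p in a for q in b)
    max_diameter = max(dist(p, q) for c in clusters for p, q in combinations(c, 2))
    return min_between / max_diameter

# Two ways of splitting the same four toy points (not data from the paper).
good = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
bad = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
print(davies_bouldin(good), dunn(good))
print(davies_bouldin(bad), dunn(bad))
```

The compact, well-separated partition scores a low DBI and a high DI, and the poor partition scores the reverse, which is the direction of comparison used when ranking the methods.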


*Energies* **2020**, *13*, 1348

**Figure 6.** Load profiles of customer segmentation results by the proposed clustering methodology: (**a**) group 1; (**b**) group 2; (**c**) group 3; (**d**) group 4; (**e**) group 5; (**f**) group 6; (**g**) group 7; (**h**) group 8; (**i**) group 9; (**j**) group 10; (**k**) group 11; (**l**) group 12; (**m**) group 13.

#### *4.4. Customer Targeting for DR Operation*

To study efficient DR operation, we select appropriate customers for DR participation using the load profile segmentation result from the previous section. The 13 load patterns are shown in Figure 7. The load profile and the amount of consumption of each of the 13 load patterns can be used to estimate DR potential. If customers consume little electricity, their DR participation is inefficient for a DR operator, even if they have a suitable pattern for DR (i.e., nighttime peak or dual morning and night peaks). Therefore, the DR operator should consider both factors. To reflect them, we conducted a boxplot analysis: boxplots of the peak time (i.e., the hour of the day when maximum demand occurs) and of the average consumption for the 13 groups are illustrated in Figure 8.

First, we eliminated the groups whose peak demand times were inconsistent with the event hours. The groups failing this criterion were 6, 7, 8, and 9. Then, groups with low electricity consumption were also removed, as they are economically unattractive. These low-consumption groups were 4, 5, 12, and 13. We therefore suggest that utility companies operate the PTR program using the remaining groups, namely, groups 1, 2, 3, 10, and 11. The total number of customers included in this targeted enrollment scenario was 220.

**Figure 7.** Load profile of customer clusters by the proposed method.

**Figure 8.** Boxplot of the proposed clustering method: (**a**) peak time, and (**b**) average consumption.
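The two-step screening can be reproduced directly from the group sizes and excluded groups reported in the text. The snippet below is a bookkeeping sketch only: the variable names are illustrative, and the exclusion sets are taken as given rather than derived from the boxplots.

```python
# Customers per group, as reported for groups 1 through 13.
group_sizes = {1: 14, 2: 15, 3: 25, 4: 76, 5: 56, 6: 38, 7: 85,
               8: 120, 9: 88, 10: 98, 11: 68, 12: 85, 13: 79}

# Step 1: groups whose peak demand times do not match the event hours.
peak_mismatch = {6, 7, 8, 9}
# Step 2: groups with low electricity consumption.
low_consumption = {4, 5, 12, 13}

excluded = peak_mismatch | low_consumption
targeted_groups = [g for g in sorted(group_sizes) if g not in excluded]
targeted_customers = sum(group_sizes[g] for g in targeted_groups)
total_customers = sum(group_sizes.values())

print(targeted_groups)      # [1, 2, 3, 10, 11]
print(targeted_customers)   # 220
print(total_customers)      # 847
```

The totals agree with the 220 targeted customers and 847 participants stated in the text.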

After identifying the target groups, we calculated the amount of demand reduction from the actual PTR event data to assess the effect of customer targeting. There were 847 DR participants and nine event days on which the utility company notified residential PTR customers to reduce demand. For opt-in enrollment, we assumed that all residential customers (i.e., 847 customers) participate in the PTR pilot program; targeted enrollment instead recruits the groups of customers who are able to reduce their demand more than other groups during the event. There were 220 customers in the targeted enrollment group, which differs from the number in the opt-in enrollment group. Therefore, to compare the demand reduction for both types of enrollment, we calculated the average demand reduction on event days per customer in both cases.

Because the customer baseline load (CBL) must be estimated to quantify the demand reduction due to a DR event, we applied the Max 4 of 5 method, which has been used for the PTR program in Korea [30]. The Max 4 of 5 method estimates the CBL by averaging the four highest-demand days among five eligible days, where eligible days exclude weekends, event days, and holidays. The average demand reduction on event days per customer for opt-in enrollment, targeted enrollment, and the 13 groups is shown in Table 4. The average demand reductions for the opt-in and targeted enrollment programs were 0.2620 kWh and 0.3496 kWh, respectively, a difference of 0.0876 kWh.
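A minimal sketch of the Max 4 of 5 baseline follows. It assumes that the five eligible days are the five most recent ones before the event and that each value is the customer's consumption during the event window; neither detail is spelled out in the text, and the function names and numbers are hypothetical.

```python
def cbl_max4of5(eligible_day_loads):
    """Average of the four highest values among the last five eligible days.

    eligible_day_loads: event-window consumption (kWh) on eligible days,
    oldest first, with weekends, event days, and holidays already removed.
    """
    recent = eligible_day_loads[-5:]
    if len(recent) < 5:
        raise ValueError("need at least five eligible days")
    top_four = sorted(recent, reverse=True)[:4]
    return sum(top_four) / 4

def demand_reduction(cbl, actual_event_load):
    """Estimated reduction: baseline minus metered consumption during the event."""
    return cbl - actual_event_load

# Toy history for one customer (kWh), oldest first; only the last five days are used.
history = [1.2, 1.4, 1.0, 1.6, 1.3, 1.5]
cbl = cbl_max4of5(history)
reduction = demand_reduction(cbl, actual_event_load=1.1)
print(cbl, reduction)
```

Dropping the lowest of the five days keeps one unusually low-consumption day from pulling the baseline down.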


**Table 4.** Average demand reduction for event days per residential customer during the Peak Time Rebate (PTR) pilot program.

The electricity consumption from 6 to 8 PM was 1.3569 kWh. Relative to this usual demand during event hours, the demand reduction ratios were 19.31% for opt-in enrollment and 25.76% for targeted enrollment. This is an improvement of 6.45 percentage points; in relative terms, targeted enrollment increased demand reduction by 33.44% in comparison with opt-in enrollment. Thus, it is considerably more efficient to operate the DR program with customers who have larger DR potential, as defined in this study.
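The reported ratios follow from the figures given in the text, as the short check below shows. The variable names are illustrative; the percentage-point gap is taken between the rounded ratios, which is how the 6.45 figure arises.

```python
opt_in_reduction = 0.2620     # kWh per customer, opt-in enrollment
targeted_reduction = 0.3496   # kWh per customer, targeted enrollment
event_window_load = 1.3569    # kWh consumed from 6 to 8 PM

# Reduction ratios relative to the usual event-window demand, in percent.
opt_in_pct = round(100 * opt_in_reduction / event_window_load, 2)
targeted_pct = round(100 * targeted_reduction / event_window_load, 2)

# Gap in percentage points, computed from the rounded ratios.
point_gap = round(targeted_pct - opt_in_pct, 2)

# Relative increase of targeted over opt-in reduction, in percent.
relative_gain_pct = round(
    100 * (targeted_reduction - opt_in_reduction) / opt_in_reduction, 2)

print(opt_in_pct, targeted_pct, point_gap, relative_gain_pct)
```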

Additionally, we conducted a cost-effectiveness analysis of managing the DR program in two cases: (1) residential customers who want to participate in the DR program and (2) targeted residential customers who have large DR potential. In identifying the cost-effectiveness of DR customer targeting, we assume that the demand reduction of the targeted customers is the same as that of the actual DR participants. The economic analysis follows the California Standard Practice Manual and is performed from the perspective of the DR operator [29]. A total of 847 households participated in the PTR pilot program, with a total average reduction of 221.914 kWh, and we designated 635 households (75% of the total participants) as DR targeting participants.
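The two aggregate figures used in the economic analysis can be cross-checked against the per-customer result reported earlier. This is a consistency check on the stated numbers only, with illustrative variable names.

```python
participants = 847
avg_reduction_per_customer = 0.2620   # kWh, opt-in enrollment
targeting_households = 635

# Total average reduction across all participants (kWh).
total_reduction = participants * avg_reduction_per_customer

# Share of participants designated as DR targeting participants, in percent.
targeting_share_pct = 100 * targeting_households / participants

print(round(total_reduction, 3), round(targeting_share_pct))
```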

The customer operation cost decreased because of the reduced number of customers. The resulting increase in benefit is 437.256 KRW (at an exchange rate of 1100 KRW per USD), a 108.58% increase in benefit over the existing economic analysis result. The changes in the economic analysis due to customer targeting are presented in Table 5.


**Table 5.** Cost-Effectiveness Analysis Changes by Incentive DR Targeting (Unit: KRW).
