5.1. Electricity Consumption Data
The consumption data, totaling more than five hundred million rows, were stored in MongoDB and analyzed with Python. The following paragraphs describe the consumption data. The total hourly consumption for all types of consumers, for each weekday, is shown in
Figure 21. Weekend profiles differ slightly from those of the working days.
The total hourly consumption decomposition for residential groups and for each weekday is presented in
Figure 22. In this case, the load profiles of groups A and C are similar. The load profiles of groups B and D are also similar, while that of group E lies between the two pairs.
The proportion and the total consumption of consumers’ categories is given in
Figure 23. Residential consumers form the majority (66%) of the consumers and account for almost half of the total consumption.
The average consumption for each group of residential consumers is described in
Figure 24. Except for groups D and E, the averages are similar. Although, at first glance, the figure may seem to indicate significant variations between groups D and E and the rest, the differences are small enough to be attributed to statistical variation (3.76% between D and W, and 4.69% between E and W).
The daily load profile as total consumption for the three categories is given in
Figure 25. The differences are evident and depend on the activities of each consumer category.
The daily load profile as total consumption for the six residential groups is given in
Figure 26. The shapes are similar, especially for groups A and C and for groups B and D, while the E profile lies between the two pairs and the W profile is almost flat.
The heatmap in
Figure 27 shows the average hourly consumption level for each test group that was allocated a certain tariff. It reveals several aspects:
- Actual peak, off-peak and mid-peak hours do not identically correspond with ToU rates. For instance, the peak hours stretch from 17 to 21;
- D and W groups have the lowest average hourly consumption probably as a consequence of high peak rate;
- A and B groups have the highest peak consumption due to the less punitive peak rates.
5.2. ToU Tariffs
The ToU tariffs are multiplied by the hourly consumption of each consumer (identified by a meter ID) for the entire period from December 2009 to November 2010. The data set is processed as a DataFrame (df) using the Python pandas library.
Five ToU tariffs with different rates are characterized by three levels: peak, mid-peak and off-peak.
The intervals associated with the rates are as follows: peak hours: 18, 19 (2 hours); off-peak: 0–8, 20–23 (13 hours); mid-peak: 9–17 (9 hours).
Several what-if scenarios are carried out, simulating that a test group of consumers X pays other tariffs: the payment for a test group is calculated with each of the five ToU tariff rates.
The optimal payment is the minimum of the payments under the different ToU tariffs at the monthly level. The comparisons are performed monthly because the measures regarding house energy efficiency are gradually implemented during the trial period.
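The payment calculation and the monthly selection of the cheapest tariff can be sketched as follows. This is a minimal illustration in plain Python (kept dependency-free; the study itself used pandas), not the authors' implementation. The rates in `RATES`, the tariff subset, the function names and the sample readings are hypothetical, and the weekend flat rate of tariff W is deliberately left out.

```python
from collections import defaultdict

# Hour-of-day bands as stated in the text.
PEAK_HOURS = {18, 19}                  # 2 hours
MID_PEAK_HOURS = set(range(9, 18))     # 9-17, 9 hours


def band(hour):
    """Return the ToU band ('peak', 'mid' or 'off') for an hour of day (0-23)."""
    if hour in PEAK_HOURS:
        return "peak"
    if hour in MID_PEAK_HOURS:
        return "mid"
    return "off"                       # 0-8 and 20-23, 13 hours


# HYPOTHETICAL rates per kWh; these are not the trial's actual values.
RATES = {
    "A": {"off": 0.12, "mid": 0.14, "peak": 0.20},
    "B": {"off": 0.11, "mid": 0.135, "peak": 0.26},
}


def monthly_payments(readings, rates=RATES):
    """Sum the payment per (meter_id, month) under every tariff.

    readings: iterable of (meter_id, month, hour, kwh) tuples.
    Returns {(meter_id, month): {tariff: payment}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for meter_id, month, hour, kwh in readings:
        b = band(hour)
        for tariff, rate in rates.items():
            totals[(meter_id, month)][tariff] += kwh * rate[b]
    return totals


def optimal_tariff(payments):
    """Pick, for each (meter_id, month), the tariff with the minimum payment."""
    return {key: min(p, key=p.get) for key, p in payments.items()}


# Hypothetical readings for one meter in one month.
readings = [
    (1001, "2009-12", 3, 1.0),     # off-peak hour
    (1001, "2009-12", 12, 2.0),    # mid-peak hour
    (1001, "2009-12", 18, 3.0),    # peak hour
]
payments = monthly_payments(readings)
best = optimal_tariff(payments)
```

Running the what-if scenario for a whole test group amounts to passing that group's readings through `monthly_payments` and aggregating the results per month.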
The difference between the payment with the initial tariff E and the payment with each ToU tariff is evaluated by calculating a monthly coefficient.
The reduction coefficient used to improve the ToU tariffs is calculated as the average of the monthly coefficients.
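As an illustration of this step, the sketch below assumes that the monthly coefficient is the percentage difference between the ToU payment and the tariff E payment, and that the reduction coefficient is the plain mean of the monthly values. The exact definition is not recoverable from the text, so this form, the function names and the payment figures are assumptions.

```python
from statistics import mean


def monthly_coefficients(pay_tou, pay_e):
    """Percentage by which the ToU payment exceeds the tariff E payment, per month.

    ASSUMED definition: 100 * (P_ToU - P_E) / P_E.
    """
    return {m: 100.0 * (pay_tou[m] - pay_e[m]) / pay_e[m] for m in pay_e}


def reduction_coefficient(coeffs):
    """Average of the monthly coefficients."""
    return mean(coeffs.values())


# HYPOTHETICAL monthly payments, for illustration only.
pay_e = {"2009-12": 100.0, "2010-01": 120.0, "2010-06": 80.0}
pay_tou = {"2009-12": 125.0, "2010-01": 150.0, "2010-06": 76.0}

coeffs = monthly_coefficients(pay_tou, pay_e)
reduction = reduction_coefficient(coeffs)
```

A negative monthly value in this convention marks a month in which the ToU tariff was cheaper than tariff E.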
The residential consumers were initially allocated the ToU tariffs, forming six groups that correspond to the tariffs, as in
Figure 28. The purpose of applying various ToU tariffs was to test whether, and to what extent, consumers can be persuaded, via tariffs, to change their consumption behavior. In addition, the electricity bill amount was observed, as the payment is a significant incentive.
However, the recommended tariffs that would minimize the electricity payment differ considerably from the allocated ones. For this analysis, we performed monthly what-if scenarios, which led to the conclusion that in most cases tariffs A and W are recommended, since they minimize the consumers’ payment, as shown in
Figure 29.
Considering the frequency of recommended ToU tariffs, we conclude that the allocated tariff minimized the payment for group W in 50% of the months (6 of 12) and for group A in 7 of the 12 months. For groups C and D, the more convenient option is tariff A or W, mainly because W has a higher peak rate, similar to their allocated tariffs. The advantage of tariff W is the lower flat rate applied on weekend days. Thus, it benefits the consumer groups with high weekend consumption. Also, A is an efficient tariff for all test groups (especially A and B), as it has the lowest peak rate.
Thus, we simulated that each group of consumers would pay each of the proposed tariffs and identified the tariffs that minimize the payment, as in
Figure 30.
We also simulated the payment with each of the ToU tariffs for all consumers, regardless of the test group at the monthly level.
Figure 31 shows higher payments during the winter months, driven by higher consumption for heating. The lowest payment is obtained with tariff A (seven times) and W (five times).
In other words, tariffs B, C and D, with higher peak rates, cannot be recommended, as they result in higher payments. Most consumers are better off with tariff A, which has the lowest peak rate. Although tariff W has the highest peak rate during working days, on weekend days its rate is flat and lower, which coincides with high consumption, as in
Figure 32.
Figure 33 shows the differences between payment with tariff E and payment with the allocated ToU tariffs for each month.
On average, the consumers’ payment was around 19.39% higher than with tariff E. In only three months (June, August and September) did tariff W prove more efficient than tariff E.
Additionally,
Figure 34 shows the variation of the ToU tariffs over a year, revealing concentric circles for tariffs A–D, while tariff W, with its butterfly shape, differs, crossing the A and B circles in certain months (December and March), mainly due to the low weekend rate.
Calculating the mean of these payment differences between tariff E and the ToU tariffs, we identified the reduction that would improve the ToU tariffs, as in
Table 1.
While
Figure 26 seems to suggest that there are differences in the consumption profiles of the various customer categories, per ToU tariff, this is only due to the varying sizes of the categories, and not to their consumption patterns.
Figure 35, which shows the average consumption for each customer category, per tariff, indicates only slight variations between the categories.
Regarding the data on which
Figure 35 is based, the conclusion of our analysis is that there is no clear evidence that the proposed method of changing customers’ consumption behavior via differential tariffs was effective. The consumers followed their inherent consumption patterns regardless of the penalties imposed, to various degrees, by most of the ToU tariffs on consumption at peak hours. This results in a large hourly variation in the total electricity to be produced by generators and transported/distributed by the grid operators, with a minimum consumption of 890,219 kWh at 4 o’clock (at night) and a maximum of 3,691,177 kWh at 18 (in the evening). Mitigating this large variation was exactly the reason the ToU tariffs were proposed in the first place.
Following this conclusion, we attempted to identify whether the bulk of the peak-hour consumption can be attributed to any particular group, irrespective of the ToU tariffs, as it was clear that the tariffs caused no real differences. We found that the consumers that rank highest by total consumption also rank highest by consumption at peak hours (17–22).
First, we ordered the consumers in descending order of their total consumption, obtaining the results in
Table 2.
Next, on the same table, without any reordering, we attempted to identify the customers that contribute most to the consumption at peak hours and found that they are almost the same, as given by
Table 3.
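The comparison of the two rankings can be sketched as below. This is a simplified illustration, not the procedure used to build Tables 2 and 3: the peak window is taken as hours 17 to 22 inclusive (an assumption about the stated interval), and the function names and readings are hypothetical.

```python
PEAK_HOURS = set(range(17, 23))   # 17-22 inclusive (assumed reading of the interval)


def totals_by_consumer(readings):
    """Return (total kWh, peak-hour kWh) dictionaries keyed by meter ID.

    readings: iterable of (meter_id, hour, kwh) tuples.
    """
    total, peak = {}, {}
    for meter_id, hour, kwh in readings:
        total[meter_id] = total.get(meter_id, 0.0) + kwh
        peak.setdefault(meter_id, 0.0)
        if hour in PEAK_HOURS:
            peak[meter_id] += kwh
    return total, peak


def top_n(values, n):
    """Set of the n keys with the largest values."""
    return set(sorted(values, key=values.get, reverse=True)[:n])


def overlap_share(total, peak, n):
    """Share of the top-n consumers by total that are also top-n by peak consumption."""
    return len(top_n(total, n) & top_n(peak, n)) / n


# HYPOTHETICAL readings for four meters.
readings = [
    (1, 18, 5.0), (1, 3, 5.0),
    (2, 19, 4.0), (2, 10, 4.0),
    (3, 20, 1.0), (3, 2, 2.0),
    (4, 12, 2.5),
]
total, peak = totals_by_consumer(readings)
share = overlap_share(total, peak, n=2)
```

An overlap share close to 1 corresponds to the finding that the two rankings contain almost the same customers.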
Our conclusion was that it would be possible to notably alter the hourly pattern of total consumption by changing the consumption behavior of a reduced set of customers (less than one third of them, for a radical change).
We further analyzed several what-if scenarios, based on the idea of changing the consumption behavior of selected subsets of various sizes. The proposed change was to move about 50% of the consumption of the selected consumers from the peak hours (17–22) to the off-peak interval (1–6, during the night). While direct modification of consumption behaviors may not be possible, for a small enough number of customers technical approaches may be found (e.g., small electricity accumulators).
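One way to express such a scenario in code is sketched below. It assumes 24-value hourly profiles per consumer, inclusive hour ranges, and that the moved energy is spread evenly over the off-peak hours; the text does not specify how the shifted consumption is distributed, so that choice, the function names and the profiles are assumptions.

```python
PEAK = range(17, 23)      # 17-22 inclusive (assumed)
OFF_PEAK = range(1, 7)    # 1-6 inclusive (assumed)


def shift_load(profile, share=0.5):
    """Copy a 24-value hourly profile, moving `share` of the peak energy to off-peak hours."""
    shifted = list(profile)
    moved = 0.0
    for h in PEAK:
        delta = profile[h] * share
        shifted[h] -= delta
        moved += delta
    for h in OFF_PEAK:
        shifted[h] += moved / len(OFF_PEAK)   # spread evenly (assumption)
    return shifted


def system_profile(profiles, selected=(), share=0.5):
    """Total hourly consumption when only the `selected` meters shift their load."""
    total = [0.0] * 24
    for meter_id, profile in profiles.items():
        p = shift_load(profile, share) if meter_id in selected else profile
        for h in range(24):
            total[h] += p[h]
    return total


def peak_reduction_pct(before, after):
    """Percentage reduction of the maximal hourly consumption."""
    return 100.0 * (max(before) - max(after)) / max(before)


# HYPOTHETICAL profiles: meter 1 has an evening peak, meter 2 is flat.
profiles = {
    1: [4.0 if h in PEAK else 1.0 for h in range(24)],
    2: [1.0] * 24,
}
before = system_profile(profiles)
after = system_profile(profiles, selected={1})
reduction = peak_reduction_pct(before, after)
```

The shift conserves total energy, so only the shape of the hourly curve changes. Note that a large shift can create a new maximum in the off-peak interval, which is why the reduction is measured on the maximum of the whole shifted curve.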
The results of the what-if scenarios can be seen in
Figure 36.
The consequences of attaining the proposed scenarios are given in
Table 4:
As per the values given in
Table 4, if such results are achievable, the best-case consequence would be a 23.15% reduction in the maximal consumption.
5.3. Questionnaire Insights
Clusters are one of the most efficient ways to represent features of data, and the SOM algorithm is especially relevant for analyzing survey data because of its visualization properties. When data are unlabeled, the algorithm is useful for indicating the number of classes; if data are labeled, the algorithm may be used for dimensionality reduction.
The algorithm creates one or more prototype vectors that are relevant for the input data set, and it preserves the topology of the data by projecting the set of prototype vectors from the high-dimensional input space onto a low-dimensional grid. The pre- and post-trial surveys contain open and closed questions related to consumer profile and consumption trends, but especially to the attitude towards consumption and the considerations related to reducing electricity consumption.
The questionnaire responses were loaded into pandas DataFrame objects (questions were represented as columns and each respondent id defined the index for each row), and various functions were applied in order to determine the data types, the number of missing responses and statistical insights. Data preprocessing implied replacing inaccurate data with significant values, in accordance with the question type, as described in the data processing steps in
Figure 37. Also, some redundant questions were removed.
For the vast majority of the questions, scaling was not necessary because of the question type—binary or categorical—and also because of the meaningful codification of answers where this was relevant. For some questions, the standard scaler was used, and for a few features with large magnitudes, a min-max scaler was applied.
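The two scalers mentioned above can be illustrated with a dependency-free sketch. The formulas are the standard ones (standardization with the population standard deviation, which is also what scikit-learn's `StandardScaler` uses, and min-max scaling to [0, 1]); the feature values are hypothetical and no claim is made about which questions each scaler was applied to.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def min_max_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


# HYPOTHETICAL feature with a large magnitude.
values = [10, 20, 30]
scaled = min_max_scale(values)
standardized = standard_scale([1, 2, 3])
```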
The network was trained for 10,000 epochs with a learning rate of 0.01 and a sigma of 1. One feature is selected (question: “I [we] can reduce my [our] electricity bill by changing the way the people I live with and I [we] use electricity”), and the distribution of nodes is presented in
Figure 38.
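To make the training parameters concrete, the following is a minimal self-organizing map written in plain Python. It is a sketch under several assumptions, not the implementation used in the study: each "epoch" is taken to be one randomly drawn sample, the learning rate and sigma are held constant (SOM libraries typically decay them), the neighborhood is Gaussian, and the grid size, function names and input data are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, x))
            if d < best_d:
                best, best_d = (i, j), d
    return best


def train_som(data, rows, cols, epochs=10_000, learning_rate=0.01, sigma=1.0, seed=0):
    """Train a rows x cols SOM; defaults mirror the parameters quoted in the text."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial prototype vectors in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for _ in range(epochs):
        x = rng.choice(data)                       # one sample per step (assumption)
        bi, bj = best_matching_unit(weights, x)
        for i in range(rows):
            for j in range(cols):
                grid_d2 = (i - bi) ** 2 + (j - bj) ** 2
                h = math.exp(-grid_d2 / (2 * sigma ** 2))   # Gaussian neighborhood
                w = weights[i][j]
                for k in range(dim):
                    w[k] += learning_rate * h * (x[k] - w[k])
    return weights


# HYPOTHETICAL answers, already scaled to [0, 1].
answers = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
som = train_som(answers, rows=3, cols=3)
```

Because the initial weights are random, different seeds can give different maps, which is the source of the run-to-run inconsistency discussed later in this section.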
The map in
Figure 38 reflects the network nodes for a subset of the post-questionnaire set that includes answers related to the attitude in relation to the reduction of energy consumption. Answers related to the personal assessment of the knowledge of reducing consumption were considered in the input data. The SOM visualization helps to identify the classes of consumers, using as input space the attitude type questions towards a certain situation, such as: the society/individual must or should reduce the consumption of electricity, or the motivation behind consumption reduction: environmental problems or personal financial reasons.
A unified distance matrix (U-Matrix) is a special type of graph that reflects the distances between the nodes in the grid. A large distance is represented by a dark area, while lighter areas indicate smaller distances between nodes (edges between similar data groups). For an input space of dimension 11, the features were the answers to questions about the household income and the answers to the question
“I/we have made changes to the way I/we live in order to reduce the amount of electricity I/we use”. After generating a 30 × 30 SOM, in which each vector represents one or more items, the U-Matrix was constructed as in
Figure 39 by summing the Euclidean distances between each cell and its neighboring cells and calculating the average. If the result is small, the items more likely belong to the same class.
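The averaging step described above can be sketched as follows. The sketch assumes a rectangular grid and a 4-cell neighborhood (up, down, left, right); the neighborhood actually used for Figure 39 is not stated, and the weight values here are hypothetical.

```python
import math


def u_matrix(weights):
    """Average Euclidean distance from each node's weight vector to its grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    um = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-neighborhood (assumption)
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    dists.append(math.dist(weights[i][j], weights[ni][nj]))
            um[i][j] = sum(dists) / len(dists) if dists else 0.0
    return um


# HYPOTHETICAL 2 x 2 map: the top row and the bottom row are far apart.
weights = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[3.0, 4.0], [3.0, 4.0]],
]
um = u_matrix(weights)
```

Small values mark nodes that sit inside a group of similar items, while ridges of large values mark the boundaries between groups.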
In the U-Matrix, the lines suggest that there are four areas of similar consumers. The dimensionality reduction is graphed in
Figure 40.
We can deduce that the items in the blue area are very different from the items in the brown area, while the green-blue and brown-yellow areas are somewhat similar. We can also observe one SOM limitation, namely that categorical answers are not handled well, because the algorithm assumes that the variables are continuous. At the same time, inconsistent solutions were identified when running the analysis multiple times, because the initial positions of the neurons differ. The number of clusters can only be determined after the consistency of the algorithm has been established.
Another limitation is that the number of iterations is difficult to determine, but according to [28] the map will converge after an adequate number of iterations. We also set up a group of questions, named set of questions 1, related to the same topic: “perception of the usefulness of the instruments received at the beginning of the trial (monitors, stickers, magnets, etc.)”.
For all these questions, the answers are on a scale from 1 to 6; for questions such as “how useful were the stickers or magnets”, an answer of 1 means totally useless and 6 means very useful. Missing answers were filled with 0, and scaling was applied to improve accuracy, because some questions had an answer scale only from 1 to 5.
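A possible form of this preprocessing is sketched below. The text does not say how the 1 to 5 answers were aligned with the 1 to 6 scale, so the linear rescaling used here is an assumption, as are the function name and the sample answers; missing answers are encoded as 0, as stated.

```python
def rescale_answer(answer, old_max, new_max=6, low=1):
    """Map an answer from the scale [low, old_max] to [low, new_max]; missing -> 0."""
    if answer is None:
        return 0.0                     # missing answers are filled with 0
    return low + (answer - low) * (new_max - low) / (old_max - low)


# HYPOTHETICAL answers to a question with a 1-5 scale (None = no answer).
raw = [1, 3, 5, None]
scaled = [rescale_answer(a, old_max=5) for a in raw]
```

With this mapping, the endpoints of both scales coincide, so "5 out of 5" and "6 out of 6" are treated as the same level of agreement.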
This set of questions also contains questions linked to the electricity monitor, from the evaluation of the time needed to understand the device’s operating mode (1: very easy, 6: very difficult) to the bill evaluation in terms of electricity consumption. It can easily be observed that questions with a similar response (black squares as in
Figure 41) have been arranged by the neural network very close to each other. Black squares represent answers codified by 5, which means strongly agreed or very satisfied with the outcomes. The default color for missing values (zeros) was red, but it was removed for a clearer representation. Likewise, other answers are represented close to each other (1: green, 2: cyan, 3: blue, 4: white, 6: yellow). The white markers are in the immediate proximity of the black squares, which means the network correctly identified groups of respondents who agree or are satisfied with the electricity monitor and may consider that the amount of electricity was reduced over the trial. Other markers (blue, green and cyan) are also close to each other but more scattered over the map, which means that some of the respondents hold strong beliefs with respect to some situations or questions but disagree with other statements.
For another set of questions with a topic related to the person’s own measures taken to reduce consumption, the magnitude of these measures, the degree of modification of the consumption mode (day/night or hourly intervals), the grouping of answers is represented in
Figure 42.
We can see that the respondents represented by the yellow marker (6: strongly agree) are very well delimited from those who gave answers from 1 to 4. It can be deduced that those who have adopted their own measures (minor or major) have also observed a decrease in energy consumption and reported a general change in consumption mode.