Open Access | Editor’s Choice | Article

Peer-Review Record

A Novel on Transmission Line Tower Big Data Analysis Model Using Altered K-means and ADQL

Sustainability 2019, 11(13), 3499; https://doi.org/10.3390/su11133499

by Se-Hoon Jung ¹ and Jun-Ho Huh ²,*

Reviewer 1: Ni Zhang

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 14 May 2019 / Revised: 14 June 2019 / Accepted: 19 June 2019 / Published: 26 June 2019

(This article belongs to the Special Issue Artificial Intelligence for Renewable Energy Systems)

Round 1

Reviewer 1 Report

This paper presents a model for analyzing electric power data, both internal and external, and shows the advantages of the proposed model compared with other existing ones. The paper is well organized. The reviewer has the following comments:

1. In the experiment part, the authors mainly compare accuracy among the methods. However, it would be better if the authors also took performance and computation time into consideration. If the proposed method requires much more computation power or time than the other methods, there may be a tradeoff.

2. In Figure 8, the accuracy of the proposed method increased dramatically after a certain episode; the authors could add more discussion about that.

Overall, this paper shows a promising model that can predict the power data more accurately. However, it depends on a large amount of data captured by a large number of sensors, which might require a large capital investment if implemented in the real world. The authors can keep improving the model to reduce this need in the future.

Author Response

Reply-

Dear Editor and Reviewers, respectfully,

I would first like to thank you for your comments on performance and calculation time compared with previous studies. Since I did not address this in the performance evaluation, I am particularly grateful for them. Based on what you pointed out, I added comparison and evaluation results against the old power prediction systems. The time required to obtain prediction results with the same data and analysis models was added to "5.4 Performance evaluation with the old prediction system based on electric pole big data."

Add ) Table 3 shows the learning time for 175,176 data and the calculation time for 75,076 test data, measured for the proposed model and for the models proposed in previous studies. The learning time for 175,176 data was reduced by approximately 6~20 seconds compared with the previous studies. The test time was reduced by approximately 3~8 seconds. For the previous studies, the combined time for data learning and model prediction was about 200 seconds on average. The combined time of the algorithm proposed in the present study for the same power data prediction model was 182 seconds, down by about 18 seconds. The proposed algorithm thus reduced the total calculation time by about 4.7% to 12.5% relative to the individual previous studies.

Table 3. Comparison of electric pole data prediction systems (computation time).

Part	Prediction System	Learning Time (s)	Test Time (s)
Study of [85]	K-means + Random Forest	157	34
Study of [86]	Hierarchical Clustering	164	39
Study of [87]	K-means + Sequence	171	37
Proposed Study	Altered K-means + ADQL	151	31
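
The timing figures quoted above can be checked directly against Table 3. The following Python sketch is only an arithmetic illustration: the dictionary layout and the names `table3`, `total_time`, and `reductions` are assumptions made here, not part of the paper. It sums learning and test time per system and computes the relative reduction of the proposed system against each previous study.

```python
# Values transcribed from Table 3, in seconds: (learning time, test time).
# The data structure and all names below are illustrative assumptions.
table3 = {
    "Study of [85]": (157, 34),
    "Study of [86]": (164, 39),
    "Study of [87]": (171, 37),
    "Proposed Study": (151, 31),
}


def total_time(name):
    """Return learning time plus test time for one system."""
    learning, test = table3[name]
    return learning + test


proposed = total_time("Proposed Study")
previous = {name: total_time(name) for name in table3 if name != "Proposed Study"}

# Average combined time of the three previous studies.
mean_previous = sum(previous.values()) / len(previous)

# Relative reduction (in percent) of the proposed system against each study.
reductions = {name: 100 * (t - proposed) / t for name, t in previous.items()}

for name, pct in reductions.items():
    print(f"{name}: {previous[name]} s -> {proposed} s ({pct:.1f}% less)")
```

Computed this way, the proposed system needs 182 seconds in total, the previous studies average about 200 seconds, and the per-study reductions range from roughly 4.7% to 12.5%.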

In Figure 8, the accuracy of the proposed method increased dramatically after a certain episode; the authors could add more discussion about that.

Reply-

I would first like to thank you for your comments, which concern the part of Figure 8 in which the accuracy rate increased once the algorithm had been repeated 720 times or more. My explanation is as follows: the accuracy rate increased at 720 repetitions or more because, at that point, learning had ended at 70% or higher, or measurement had been completed for the learning scope of outliers, as seen in Figure 7, in which feedback values were compensated by the reinforcement learning algorithm. Because of the rising accuracy rate under the relearning condition, the algorithm increased model prediction rates by about 2~3% compared with the old algorithms. These details were added to the body to reflect your comments.

Add) The algorithm increased its model predictability by about 2~3% at 720 repetitions compared with the old LSTM-DQN, DQN, and A3C algorithms. In the study's performance evaluation, which was repeated 1,000 times, this section marked the point, at 70%, where the reinforcement learning of the prediction model ended as seen in Figure 5, that is, the end of outlier learning in the electric pole sensor data and the end of the prediction model. These findings confirm that the proposed algorithm was superior to the old algorithms in judging relearning and outliers.

Figure 5. Prediction results of learning accuracy based on the clustering data.

Reply-

I would first like to thank you for your comments, which concern the stage in which data was secured from large amounts of sensor data. I completely agree with you on this. I invested a lot of time and money in securing the data and classifying the preprocessing content in the study. It is especially difficult to access and secure national electric pole sensor data in large amounts at an individual research level rather than a joint project level. Researchers have pointed out these considerable difficulties. To address them, I plan to add research on improving the prediction performance of the data analysis model when data is deficient. This was addressed in the conclusion.

Add ) Future study will investigate an improved analysis model for electric power data in a data-deficient environment, where only small amounts of electric power data are available, rather than an environment for the analysis of large amounts of data. To this end, it will examine a model that assesses data by reducing the scope of the reward function in the reinforcement learning environment, in order to increase the performance of the A-Deep Q-Learning proposed in the present study.

(Extra Reply) The overall block diagram of the system proposed in the study was added to aid the reviewers' understanding. Related research data and related research were also added to propose a smart grid system through the analysis of electric pole sensor data. I respectfully request another chance at review based on these data and another round of detailed review.

Add 1) Figure 1 shows the block diagram of the system proposed in the study. The proposed analysis model for electric pole data consists of three levels: the raw data level for electric pole data collection and preprocessing, the clustering level to which an altered K-means algorithm is applied, and the reinforcement learning level, which learns to check outliers in clustered electric pole sensor data by itself. The raw data level involves gathering electric pole data, eliminating unnecessary data, and normalizing data. The clustering level includes the altered K-means algorithm, to which the principal component analysis proposed in the study is applied. The reinforcement learning level includes a learning model that predicts outliers in the lower-level clustering by applying A-Deep Q-Learning, altered from the off-policy and Q-table methods.
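
As a rough illustration of the first level only (eliminating unnecessary data and normalizing), the following Python sketch drops incomplete records and rescales each field. It rests on assumptions: the response does not say which normalization was used, so min-max scaling is a stand-in, and the function name `preprocess`, the record layout, and the field names are hypothetical.

```python
def preprocess(records, fields=("temperature", "pitch", "roll")):
    """Drop records with missing fields, then min-max scale each field to [0, 1].

    Min-max scaling is an assumption; the source only says data is normalized.
    """
    # Keep only records in which every selected field is present.
    complete = [r for r in records if all(r.get(f) is not None for f in fields)]
    if not complete:
        return []

    scaled = [dict() for _ in complete]
    for f in fields:
        values = [r[f] for r in complete]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        for out, r in zip(scaled, complete):
            out[f] = (r[f] - lo) / span
    return scaled


# Hypothetical sensor readings; the second record is incomplete and is dropped.
sample = [
    {"temperature": 10.0, "pitch": 0.0, "roll": 2.0},
    {"temperature": 20.0, "pitch": None, "roll": 1.0},
    {"temperature": 30.0, "pitch": 4.0, "roll": 0.0},
]
clean = preprocess(sample)
print(clean)
```

The output of such a step would then feed the clustering level; the clustering and reinforcement learning levels are not covered by this sketch.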

Add 2) Optimization at Reference

21. J. Wu, K. Ota, M. Dong, J. Li and H. Wang, "Big Data Analysis-Based Security Situational Awareness for Smart Grid," in IEEE Transactions on Big Data, vol. 4, no. 3, pp. 408-417, 1 Sept., 2018.

22. L. Sun, K. Zhou, X. Zhang and S. Yang, “Outlier Data Treatment Methods Toward Smart Grid Applications,” in IEEE Access, vol. 6, pp. 39849-39859, 2018.

23. R. Menezes Salgado, T. Carvalho Machado and T. Ohishi, "Intelligent Models to Identification and Treatment of Outliers in Electrical Load Data," in IEEE Latin America Transactions, vol. 14, no. 10, pp. 4279-4286, Oct., 2016.

24. D. Li, and S. K. Jayaweera, “Machine-learning aided optimal customer decision for an interactive smart grid,” IEEE Systems Journal, vol. 9, no. 4, pp.1529–1540, Dec., 2015.

25. M. J. Ghorbani, M. A. Choudhry, and A. Feliachi, “A multiagent design for power distribution systems automation,” IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 329–339, Jan., 2016.

26. Y. B. He, G. J. Mendis, and J. Wei, “Real-time detection of false data injection attacks in smart grid: a deep learning-based intelligent mechanism,” IEEE Transactions on Smart Grid, vol. 8, no. 5, pp. 2505–2516, Sep., 2017.

27. G. K. Venayagamoorthy, R. K. Sharma, P. K. Gautam et al., “Dynamic energy management system for a smart microgrid,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no.8, pp. 1643–1656, Aug., 2016.

28. R. Thapa, L. Jiao, B. J. Oommen and A. Yazidi, "A Learning Automaton-Based Scheme for Scheduling Domestic Shiftable Loads in Smart Grids," in IEEE Access, vol. 6, pp. 5348-5361, 2018.

29. P. Palensky, D. Dietrich, "Demand side management: Demand response intelligent energy systems and smart loads", IEEE Trans. Ind. Informat., vol. 7, no. 3, pp. 381-388, Aug., 2011.

69. L. Yin, T. Yu, L. Zhou, L. Huang, X. Zhang, B. Zheng, "Artificial emotional reinforcement learning for automatic generation control of large-scale interconnected power grids", IET Gener. Transmiss. Distrib., vol. 11, no. 9, pp. 2305-2313, Jun., 2017.

70. T. Yu, B. Zhou, K. W. Chan, L. Chen, B. Yang, "Stochastic optimal relaxed automatic generation control in non-markov environment based on multi-step Q(λ) learning", IEEE Trans. Power Syst., vol. 26, no. 3, pp. 1272-1282, Aug., 2011.

71. Z. Yan and Y. Xu, "Data-Driven Load Frequency Control for Stochastic Power Systems: A Deep Reinforcement Learning Method With Continuous Action Search," in IEEE Transactions on Power Systems, vol. 34, no. 2, pp. 1653-1656, Mar., 2019.

72. D. Zhang, X. Han and C. Deng, “Review on the Research and Practice of Deep Learning and Reinforcement Learning in Smart Grids,” Journal of Power And Energy Systems, vol. 4, no. 3, pp. 362-370, Sep., 2018.

73. F. Ruelens, B. J. Claessens, S. Vandael et al., “Residential demand response of thermostatically controlled loads using batch reinforcement learning,” IEEE Transactions on Smart Grid, vol. 8, no. 5, pp. 2149–2159, Sep., 2017.

74. Z. Wen, D. O’Neill, and H. Maei, “Optimal demand response using device-based reinforcement learning,” IEEE Transactions on Smart Grid, vol. 6, no. 5, pp. 2312–2324, Sep., 2015.

77. Gang Ma, Linru Jiang, Guchao Xu, Jianyong Zheng, "A Model of Intelligent Fault Diagnosis of Power Equipment Based on CBR", Mathematical Problems in Engineering, vol. 2015, pp. 1, 2015.

78. Connor Jennings, Dazhong Wu, Janis Terpenny, "Forecasting Obsolescence Risk and Product Life Cycle With Machine Learning", Components Packaging and Manufacturing Technology IEEE Transactions on, vol. 6, no. 9, pp. 1428-1439, 2016.

79. P. Verma, P. Singh and R. D. S. Yadava, "Fuzzy c-means clustering based outlier detection for SAW electronic nose," 2017 2nd International Conference for Convergence in Technology (I2CT), Mumbai, 2017, pp. 513-519.

80. M. Gupta, J. Gao, C.C. Aggarwal, J. Han, "Outlier detection for temporal data: A survey", IEEE Transactions on Knowledge and Data Engineering, vol. 26, pp. 2250-2267, 2014.

81. W. Alves, D. Martins, U. Bezerra and A. Klautau, "A Hybrid Approach for Big Data Outlier Detection from Electric Power SCADA System," in IEEE Latin America Transactions, vol. 15, no. 1, pp. 57-64, Jan., 2017.

82. J. Xiong et al., "Enhancing Privacy and Availability for Data Clustering in Intelligent Electrical Service of IoT," in IEEE Internet of Things Journal, vol. 6, no. 2, pp. 1530-1540, Apr., 2019.

83. M. Salehi, C. Leckie, J. C. Bezdek, T. Vaithianathan and X. Zhang, "Fast Memory Efficient Local Outlier Detection in Data Streams," in IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 12, pp. 3246-3260, 1 Dec., 2016.

84. R. Menezes Salgado, T. Carvalho Machado and T. Ohishi, "Intelligent Models to Identification and Treatment of Outliers in Electrical Load Data," in IEEE Latin America Transactions, vol. 14, no. 10, pp. 4279-4286, Oct., 2016.

We changed the title.

- (Before) A novel on Electric power Data Analysis Model using Altered K-means and ADQL

- (After) A novel on Electric Pole Big Data Analysis Model using Altered K-means and ADQL

Author Response File: Author Response.pdf

Reviewer 2 Report

This study proposes a data analysis and prediction model for electric power outliers that uses deep reinforcement learning to assess whether something is wrong with electric power data. It is well written and can be accepted by this journal.

Author Response

Reply-

Dear Editor and Reviewers, respectfully,

The overall block diagram of the system proposed in the study was added to aid the reviewers' understanding. Related research data and related research were also added to propose a smart grid system through the analysis of electric pole sensor data. I respectfully request another chance at review based on these data and another round of detailed review.

Add 2) Optimization at Reference

22. L. Sun, K. Zhou, X. Zhang and S. Yang, “Outlier Data Treatment Methods Toward Smart Grid Applications,” in IEEE Access, vol. 6, pp. 39849-39859, 2018.

24. D. Li, and S. K. Jayaweera, “Machine-learning aided optimal customer decision for an interactive smart grid,” IEEE Systems Journal, vol. 9, no. 4, pp.1529–1540, Dec., 2015.

28. R. Thapa, L. Jiao, B. J. Oommen and A. Yazidi, "A Learning Automaton-Based Scheme for Scheduling Domestic Shiftable Loads in Smart Grids," in IEEE Access, vol. 6, pp. 5348-5361, 2018.

29. P. Palensky, D. Dietrich, "Demand side management: Demand response intelligent energy systems and smart loads", IEEE Trans. Ind. Informat., vol. 7, no. 3, pp. 381-388, Aug., 2011.

74. Z. Wen, D. O’Neill, and H. Maei, “Optimal demand response using device-based reinforcement learning,” IEEE Transactions on Smart Grid, vol. 6, no. 5, pp. 2312–2324, Sep., 2015.

77. Gang Ma, Linru Jiang, Guchao Xu, Jianyong Zheng, "A Model of Intelligent Fault Diagnosis of Power Equipment Based on CBR", Mathematical Problems in Engineering, vol. 2015, pp. 1, 2015.

80. M. Gupta, J. Gao, C.C. Aggarwal, J. Han, "Outlier detection for temporal data: A survey", IEEE Transactions on Knowledge and Data Engineering, vol. 26, pp. 2250-2267, 2014.

We changed the title.

- (Before) A novel on Electric power Data Analysis Model using Altered K-means and ADQL

- (After) A Novel on Electric Pole Big Data Analysis Model Using Altered K-means and ADQL

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper proposes a machine-learning algorithm for data analysis of power system data applied under the concept of smart grids.

1. Many parts of the paper can be presented in the appendix or omitted from the paper as the length of the paper is long.

2. Many parts of the paper such as the introduction are too wordy and in some cases not relevant to the main focus of the paper.

3. There are numerous examples of long run-on sentences and grammar errors (even in the title, a word such as "approach" seems to be missing). Editing by a professional English-speaking editor is recommended.

4. Some figures can be combined into one figure with multiple subplots.

5. The terminology used in the paper is not very familiar to power system professionals and is closer to that of computer-science machine-learning experts. A review of this paper by power utility professionals is recommended.

6. Legends need to be provided for all the figures.

7. All abbreviations need to be explained before used.

8. Too many keywords. Only keep the 5 most relevant keywords.

9. Pseudo-codes can be presented in the appendix. Comments are not really necessary.

10. More information on the data set needs to be provided for the reader.

11. Verification using another data set is recommended as one experiment does not prove the validity and accuracy of the method.

12. There are many punctuation errors in the writing of the paper. Proper use of comma (,) is recommended.

13. Although the literature review is complete, some closely related papers from IEEE transactions and journals are missing, such as (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8404086). Disclaimer: the reviewer has no affiliation with this paper.

14. "Electric power outlier", a term repeated numerous times in this paper, is unfamiliar and not commonly used. Revising this term is recommended.

15. In many cases, the term "Power" used in this paper might confuse the reader with the concept of electric power P = VI, whereas this paper talks about data related to power or energy systems. The term "electric power data" needs to be clarified and maybe changed. What data are we talking about?

In the equations presented in the paper, not all the parameters are introduced. Please revise.

Author Response

1. Many parts of the paper can be presented in the appendix or omitted from the paper as the length of the paper is long.

Reply-

Dear Editor and Reviewers, respectfully,

You have pointed out that I need to move many parts of my paper to the Appendix or omit them. I totally agree with you on this. In writing this paper, I provided the content of engineering data analysis so that researchers in both the engineering and humanities fields could check it. Combining this with your Comment No. 9, I provided the pseudocode in the Appendix. Thank you.

Add ) Algorithm 2 in the Appendix shows the altered K-means algorithm proposed in the study for the clustering of unlabeled data. Algorithm 4 in the Appendix shows the ADQL algorithm, a reinforcement learning algorithm for the analysis and compensation of electric pole data.

Appendix

Algorithm 2 Clustering using PCA and Initial Centroid Subspace.

Data : Non-labeling Electric pole Data Set

Output : Data by Cluster

/*Scatter plots and p, the number of data, in the input data will be checked.*/

Input:

Training set , , , …

where (drop =1 by convention)

/* Principal component analysis will be conducted for all the input data entities, and principal components will be extracted until the point where a constant value will be maintained to explain all the data. */

/* The central point segmentation method will be applied to , the number of random clusters and the number of random central points, based on the principal components that have been extracted through principal component analysis. , the central point of each initial cluster, will be measured with a random cluster index vector. */

repeat each do

if > n, which is each all data

then

for each , do

/* , represent the principal component direction and element */

each data assign =

update the temporary centroid number

assign the temporary centroid for scope(),

end

/* The minimum value of , the central point of each segmented area, will be calculated with , the sum of squared distance to each entity. */

/* will be calculated, which is the minimum average distance between the central point of a random cluster and the entities included in an external cluster.*/

/* S(k), which is the maximum cluster dissimilarity based on a difference between , the degree of separation based on the average distance between the entities included in different clusters and all the other entities, and , the degree of cohesion based on the average distance between an entity within a cluster and one in an external cluster, will be treated as , which represents the number of clusters, K.*/

for each temporary, do

calculation from each data to cluster centroid(cohesion),

calculation from each data to cluster centroid(separation),

assign the initial centroid number, S(k)

end

if Selection -> Clustering number K

then

for check of each vector data,

each =

if > , which is the initial centroid

if there are two objects distributed

then assign the object to , the centroid of , the first cluster

else

then assign the object whose two vectors record the biggest length first to , the centroid of , the first cluster

/* Measure the distance() between the remaining objects() the centroid of , the first cluster */

for check of each vector data,

assign the object whose distance measurement is the biggest to , the centroid of , the second cluster

for check of each vector data,

assign the object with the maximum value to , the centroid of , the third cluster

if (of the maximum measurements)

then

until the remaining objects form a cluster in a direction that is the closest to the area selected based on the initial values. Then the user will appropriate the sum for the centroid of another cluster and repeat the Stages until there is no more travel of each cluster centroid.

end for

end for main
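
The centroid-seeding part of Algorithm 2 can be sketched in Python. This is one reading of the pseudocode, not the authors' implementation: the mathematical symbols are missing from this record, so the sketch assumes that the first centroid is the object with the largest vector length and that each further centroid is the object farthest from the centroids already chosen. The function name `farthest_first_seeds` and the sample points are hypothetical, and the PCA step and the choice of K through S(k) are left out.

```python
import math


def farthest_first_seeds(points, k):
    """Pick k initial centroids from points.

    Assumed reading of Algorithm 2: start with the largest-norm point, then
    repeatedly add the point whose nearest chosen centroid is farthest away.
    """
    origin = (0.0,) * len(points[0])
    # First centroid: the object whose vector has the biggest length.
    seeds = [max(points, key=lambda p: math.dist(p, origin))]
    while len(seeds) < k:
        # Next centroid: the object with the largest distance to its
        # closest already-selected centroid.
        nxt = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds


# Hypothetical 2-D points standing in for principal-component scores.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (0.0, 6.0)]
seeds = farthest_first_seeds(points, 3)
print(seeds)
```

Seeds chosen this way would then be passed to ordinary K-means iterations, which repeat until the centroids stop moving.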

Algorithm 4 Electric pole Outlier Data using Reinforcement learning with clustering as ADQL.

Data : Electric pole Clustering Data set

Output : Outlier Data

Initialize experience memory L

Initialize parameters of representation Clustering Outlier () and action scorer () randomly

Initialize power clustering data value function with weight neural network

for episode=1, do

Initialize clustering and get start state description

for t=1, T do

then

if random() < ε do

Select a random action

else

Compute for all actions using

execute

Action , Observe reward , Observe new state

if > 0

set priority = 1

else

= 0

Store to A

Store in D

Select random mini batch of transitions from D

With fraction having = 1

Perform gradient descent step on loss reduction and neural network update

if == inlier

then

learning outlier network

end for

end for main
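
Two steps of Algorithm 4 lend themselves to a short Python sketch: the random-versus-greedy action choice and the priority-based mini-batch sampling. Since the symbols are missing from this record, the names `epsilon` and `rho` and the functions `select_action`, `store`, and `sample_batch` are assumptions made for illustration, and the neural network update is left out entirely.

```python
import random


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice: random action with probability epsilon, else argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def store(memory, state, action, reward, next_state):
    """Append a transition; positive-reward transitions get priority 1, others 0."""
    priority = 1 if reward > 0 else 0
    memory.append((state, action, reward, next_state, priority))


def sample_batch(memory, batch_size, rho, rng):
    """Draw a mini batch in which about a fraction rho has priority 1."""
    high = [t for t in memory if t[4] == 1]
    low = [t for t in memory if t[4] == 0]
    n_high = min(len(high), round(rho * batch_size))
    n_low = min(len(low), batch_size - n_high)
    return rng.sample(high, n_high) + rng.sample(low, n_low)


# Toy usage with made-up transitions: every fourth step earns a reward.
rng = random.Random(0)
memory = []
for step in range(20):
    reward = 1 if step % 4 == 0 else 0
    store(memory, state=step, action=0, reward=reward, next_state=step + 1)

batch = sample_batch(memory, batch_size=8, rho=0.25, rng=rng)
greedy = select_action([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
print(len(batch), greedy)
```

Sampling a fixed share of positive-reward transitions keeps rare rewarding experiences in each gradient step, which appears to be the purpose of the priority flag in the pseudocode.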

2. Many parts of the paper such as the introduction are too wordy and in some cases not relevant to the main focus of the paper.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. Reflecting your comments, I rewrote the introduction and connected it to the need for an electric pole data analysis model (a recent social issue). I also added a block diagram of the proposed system to the body of the paper.

Add ) There has been an ongoing need for research that analyzes the data of outdoor electric poles in the domain of internal power IoT data analysis. Electric poles form a power supply line along the streets and have recently caused several issues worldwide, including spontaneous combustion and fire. In South Korea, a spontaneous combustion case occurred in Gangneung-si in April 2019 and caused great damage to human life and property. If there had been an analysis model for electric pole data, it would have helped to reduce the scope of the damage. Since electric poles are all equipped with sensors, it is easy to collect and analyze their data. The data is, however, classified as power data managed by the government, which makes it difficult to secure. Electric pole data consists of the following:

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. The contents have been revised from the reader's perspective with the assistance of a native English speaker. Thus, I’d like to respectfully request your re-review if possible. The contents added or changed are highlighted in red.

4. Some figures can be combined into one figure with multiple subplots.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have combined the figures into one (Figure 3~Figure 5 -> Figure 3). I once again respectfully request another review.

a. Clustering result in electric pole data (Roll-Acceropitch).

b. Clustering result in electric pole data (Acceropitch-Temperature).

c. Clustering result in electric pole data (Roll-Temperature).

Figure 3. Result of clustering (k=4) using altered K-means in electric pole data.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. To reflect your comments, I discussed the content again with an expert on power data analysis. As a joint project with the Korea Electric Power Corporation, this study aimed to analyze the data of electric poles along the streets and build a model to predict their outliers. I discussed the parts about data analysis again with the research team at the corporation. I thank you for your comments.

6. Legends need to be provided for all the figures.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I added legends to all the figures in the paper. Thank you.

a. Clustering result in electric pole data (Roll-Acceropitch).

b. Clustering result in electric pole data (Acceropitch-Temperature).

c. Clustering result in electric pole data (Roll-Temperature).

Figure 3. Result of clustering (k=4) using altered K-means in electric pole data.

Figure 4. Scope of learning completion according to a reward function over episodes.

Figure 5. Prediction results of learning accuracy based on the clustering data.

7. All abbreviations need to be explained before used.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have added full names and explanations about them before their abbreviations.

8. Too many keywords. Only keep the 5 most relevant keywords.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have narrowed down keywords to five that have the greatest relevance to the paper.

Keywords: Altered K-means; A-Deep Q Learning; Electric Pole Big Data Analysis; Big Data Analysis; Python.

9. Pseudo-codes can be presented in the appendix. Comments are not really necessary.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. I provided pseudocode in the Appendix. Thank you.

Appendix

Algorithm 2 Clustering using PCA and Initial Centroid Subspace.

Data : Non-labeling Electric pole Data Set

Output : Data by Cluster

/*Scatter plots and p, the number of data, in the input data will be checked.*/

Input:

Training set , , , …

where (drop =1 by convention)

repeat each do

if > n, which is each all data

then

for each , do

/* , represent the principal component direction and element */

each data assign =

update the temporary centroid number

assign the temporary centroid for scope(),

end

/* The minimum value of , the central point of each segmented area, will be calculated with , the sum of squared distance to each entity. */

/* will be calculated, which is the minimum average distance between the central point of a random cluster and the entities included in an external cluster.*/

for each temporary, do

calculation from each data to cluster centroid(cohesion),

calculation from each data to cluster centroid(separation),

assign the initial centroid number, S(k)

end

if Selection -> Clustering number K

then

for check of each vector data,

each =

if > , which is the initial centroid

if there are two objects distributed

then assign the object to , the centroid of , the first cluster

else

then assign the object whose two vectors record the biggest length first to , the centroid of , the first cluster

/* Measure the distance() between the remaining objects() the centroid of , the first cluster */

for check of each vector data,

assign the object whose distance measurement is the biggest to , the centroid of , the second cluster

for check of each vector data,

assign the object with the maximum value to , the centroid of , the third cluster

if (of the maximum measurements)

then

end for

end for main

Algorithm 4 Electric pole Outlier Data using Reinforcement learning with clustering as ADQL.

Data : Electric pole Clustering Data set

Output : Outlier Data

Initialize experience memory L

Initialize parameters of representation Clustering Outlier () and action scorer () randomly

Initialize power clustering data value function with weight neural network

for episode=1, do

Initialize clustering and get start state description

for t=1, T do

then

if random() < ε do

Select a random action

else

Compute for all actions using

execute

Action , Observe reward , Observe new state

if > 0

set priority = 1

else

= 0

Store to A

Store in D

Select random mini batch of transitions from D

With fraction having = 1

Perform gradient descent step on loss reduction and neural network update

if == inlier

then

learning outlier network

end for

end for main

10. More information on the data set needs to be provided for the reader.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have added explanations about the data sets in the study as follows:

Add ) The following data was used in the performance evaluation: the data was collected from electric pole sensors in Dalseong-gun, Daegu, South Korea from January 23 to May 31, 2016 and provided by the Korea Electric Power Corporation. The collected data amounted to 2.2 GB, containing 547,621 data (6,699,561 rows). Each row was stored with data attributes including the electric pole number, location of equipment, temperature, humidity, pitch, roll, intensity of illumination, ultraviolet rays, pressure, remaining battery, and cycle. In addition, basic information including location, code, facility, date, time, pole, and position was collected and used in the analysis system. Of these, the temperature, pitch, and roll data were extracted to analyze outliers in electric pole and power data and used in the performance evaluation. A total of 1.2 GB, containing 250,251 data (3,383,241 rows), was identified and used in the experiment.

11. Verification using another data set is recommended as one experiment does not prove the validity and accuracy of the method.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. Reflecting your comments, I have added content about the effectiveness and accuracy of the data provided in the study. Tables 2 and 3 offer detailed explanations about the data, the reasons for the higher accuracy of the analysis model compared with previous studies, and the improvement in computation time compared with other researchers. I once again respectfully request another review.

Add ) Table 2 shows the direct comparison and evaluation results between the old prediction systems for electric pole data and the proposed prediction system for electric pole data outliers. For evaluation, each system was implemented in Python based on the algorithms proposed in each study. A total of 250,251 electric pole data were entered in the performance evaluation. While 175,176 data, accounting for 70% of the entire data set, were used to train the learning model of electric pole outlier data, the remaining 30%, or 75,076 data, were applied to the test models. The performance evaluation results show that the proposed prediction model recorded a prediction rate of 94.688% (5,616), which was approximately 2.29~4.19 percentage points higher than previous studies. The accuracy rate of the entire data model including outlier and normal data was 95.544% (236,987), which is approximately 0.8~4.3 percentage points higher than previous studies. These results show that the techniques of previous studies, which choose the number of clusters randomly rather than automatically, recorded lower prediction rates for electric pole data that included new inputs for outlier data learning. Table 3 shows the learning time for 175,176 data and the calculation time for 75,076 test data, measured for the proposed model and for the models proposed in previous studies. The learning time for 175,176 data was reduced by approximately 6~20 seconds compared with the previous studies. The test time was reduced by approximately 3~8 seconds. For the previous studies, the combined time for data learning and model prediction was about 200 seconds on average. The combined time of the algorithm proposed in the present study for the same power data prediction model was 182 seconds, down by about 18 seconds. The proposed algorithm thus reduced the total calculation time by about 4.7% to 12.5% relative to the individual previous studies.

Table 2. Comparison of electric pole data prediction systems (prediction rate and accuracy rate).

Part	Prediction System	Prediction Rate	Accuracy Rate
Study of [85]	K-means + Random Forest	90.490%	91.242%
Study of [86]	Hierarchical Clustering	91.485%	94.678%
Study of [87]	K-means + Sequence	92.395%	94.148%
Proposed Study	Altered K-means + ADQL	94.688%	95.544%
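
The margins quoted for Table 2 are differences between the proposed system's rates and those of each previous study. The following Python sketch recomputes them; the dictionary layout and the names `table2`, `pred_gains`, and `acc_gains` are illustrative assumptions, not taken from the paper.

```python
# Values transcribed from Table 2, in percent: (prediction rate, accuracy rate).
# The data structure and all names below are illustrative assumptions.
table2 = {
    "Study of [85]": (90.490, 91.242),
    "Study of [86]": (91.485, 94.678),
    "Study of [87]": (92.395, 94.148),
    "Proposed Study": (94.688, 95.544),
}

prop_pred, prop_acc = table2["Proposed Study"]

# Gain of the proposed system over each previous study, in percentage points.
pred_gains = {
    name: round(prop_pred - pred, 3)
    for name, (pred, _acc) in table2.items()
    if name != "Proposed Study"
}
acc_gains = {
    name: round(prop_acc - acc, 3)
    for name, (_pred, acc) in table2.items()
    if name != "Proposed Study"
}

for name in pred_gains:
    print(f"{name}: prediction +{pred_gains[name]}, accuracy +{acc_gains[name]}")
```

The prediction-rate gains come out between about 2.29 and 4.20 percentage points and the accuracy-rate gains between about 0.87 and 4.30, in line with the ranges stated in the response.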

Table 3. Comparison of electric pole data prediction systems (computation time).

Part	Prediction System	Learning Time (s)	Test Time (s)
Study of [85]	K-means + Random Forest	157	34
Study of [86]	Hierarchical Clustering	164	39
Study of [87]	K-means + Sequence	171	37
Proposed Study	Altered K-means + ADQL	151	31

12. There are many punctuation errors in the writing of the paper. Proper use of comma (,) is recommended.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have corrected my punctuation errors except for the parts with some terms. Thank you.

Reply-

I appreciate that you found time to review my paper and give me comments for its revision despite your hectic schedule. As you have pointed out, I have added research on electric pole sensor data analysis, reinforcement learning, and smart grids related to my paper. Thank you.

Add 1) Optimization at Reference