Article
Peer-Review Record

An Automated End-to-End Side Channel Analysis Based on Probabilistic Model

Appl. Sci. 2020, 10(7), 2369; https://doi.org/10.3390/app10072369
by Jeonghwan Hwang and Ji Won Yoon *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 13 November 2019 / Revised: 18 March 2020 / Accepted: 23 March 2020 / Published: 30 March 2020
(This article belongs to the Special Issue Side Channel Attacks and Countermeasures)

Round 1

Reviewer 1 Report

The article is clearly nicely written, and the results seem interesting. 

 

However, I have some concerns regarding the novelty of the article. There is nowhere a strict comparison to other related results. 

In addition, the reference list is small and includes very old articles. 

 

This means that the authors might not have seen some related and recent results.

 

In addition there should be additional comments on the usefulness of the results.

Author Response

I have some concerns regarding the novelty of the article. There is nowhere a strict comparison to other related results.

--> A comparison with the sliding-window correlation method has been added to Section 4.1.

In addition, the reference list is small and includes very old articles.
This means that the authors might not have seen some related and recent results.

--> To the best of our knowledge, our method, which exploits statistical signal processing with a Bayesian approach, has not previously appeared in side-channel analysis.

In addition there should be additional comments on the usefulness of the results.

--> Subfigure 6b was added so that the usefulness of the results can be seen intuitively.

 

All the changes in paragraphs have been highlighted in the manuscript.

Reviewer 2 Report

In this work, authors present a way to automatically split a power trace into different segments related to the execution of an exponentiation operation where the exponent is supposed to be "secret". To do so, they detect change points in the signal, divide the power trace into segments and later extract features from each segment. 

I like the idea of using the information of a single power trace without a previous model and I think it may lead to quite interesting results, but it looks to me that this is a work in progress rather than a finished work. Besides, the authors could spend some more time checking prepositions, articles and third-person agreement.

In the abstract, authors say that they show the improvement of accuracy, however, there is no comparison with previous works, so it is really hard to tell whether such improvement exists or is just theoretical. 

In section 2.1 when the authors describe the notations, they mention a set of hyperparameters, but they do not mention how they are selected, how they are tuned (if they are tuned) and so on. The values of these hyperparameters are used to compute the change-points and during the preprocessing, so I believe the results of your approach depend on them. Then further information must be provided. I would like to see such information and some further reasoning about the shapes of the distributions. In addition, a window size appears on line 78 and was not mentioned before.

In section 3.4 authors mention they use 3 approaches to extract the features (and they acknowledge "this part is to be further researched") but they only mention 2; section 3.4.1 and 3.4.2. Authors should further test the approaches to find their best configurations instead of trying one configuration with no justification in the results section.

In section 4, it is not clear to me what is the relation between the standard deviation and the "power" of the noise signal, plus the details about how this noise signal is generated are not given. Authors use the number of change points detected to evaluate their approach, I assume the right number is 44, but this information should similarly be detailed. You mention a confusion matrix and then give the mean accuracy of the whole approach, since you mention the confusion matrix I would like to see the results referring to that matrix. In table 4, you say that no information about the key is used, then how can you compute the accuracy??

When conducting side-channel power attacks, it is important to give information about the hardware on which the victim application was running, the way each of the samples was obtained and the targeted algorithm. If you have used traces from any public dataset, you should mention it. Besides, authors say they retrieve the secret exponent, but they never mentioned the algorithm, details of its implementation or even the key length. In order to claim that this approach is general, it has to be tested in different scenarios.

Author Response


I have labeled your comments as C-#s and my responses to the comments as R-#s for your convenience.


C-1.
In the abstract, authors say that they show the improvement of accuracy, however, there is no comparison with previous works, so it is really hard to tell whether such improvement exists or is just theoretical.

R-1.
An additional experiment was added to Section 4.1.

C-2.
In section 2.1 when the authors describe the notations, they mention a set of hyperparameters, but they do not mention how they are selected, how they are tuned (if they are tuned) and so on. The values of these hyperparameters are used to compute the change-points and during the preprocessing, so I believe the results of your approach depend on them. Then further information must be provided. I would like to see such information and some further reasoning about the shapes of the distributions. In addition, a window size appears on line 78 and was not mentioned before.

R-2.
The hyperparameter values have been added.
--> This point has been added to the first part of Section 4.

C-3.
In section 3.4 authors mention they use 3 approaches to extract the features (and they acknowledge "this part is to be further researched") but they only mention 2; section 3.4.1 and 3.4.2. Authors should further test the approaches to find their best configurations instead of trying one configuration with no justification in the results section.

R-3.
This was a typo; the number of approaches has been corrected to two.

C-4.
<1> In section 4, it is not clear to me what is the relation between the standard deviation and the "power" of the noise signal,
<2> plus the details about how this noise signal is generated are not given.
<3> Authors use the number of change points detected to evaluate their approach, I assume the right number is 44, but this information should similarly be detailed.
<4> You mention a confusion matrix and then give the mean accuracy of the whole approach, since you mention the confusion matrix I would like to see the results referring to that matrix. In table 4, you say that no information about the key is used, then how can you compute the accuracy??

R-4.
<1> The standard deviation is important in estimating change points because the likelihood, p(y | r), is computed using the standard deviation. It directly affects the ratio of the likelihood to the prior distribution of change points, p(r | lambda), which acts as a regularizer on the number of change points.
<2> This point has been added to the first part of Section 4.
<3> Here, it is 44; we will elaborate on the data.
<4> In a real attack environment, external information is not available; only information that can be derived from the time series itself is used. The DB-index is one such piece of information.
That is why we mentioned 'external evaluation' of the clustering. As you mentioned, the real key exponent is external information; evaluating the clusters against it is not part of the attack but only a way of evaluating our method.
--> This will be added to Section 4.3.
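
The role of the standard deviation described in R-4 <1> can be written out explicitly. The following is only a sketch under an assumed model (a piecewise-constant mean with Gaussian noise); the segment sets S_k(r), the segment means \mu_k, and the Gaussian form are assumptions made for illustration and are not taken from the manuscript.

```latex
% Assumed model: constant mean \mu_k on segment S_k(r), Gaussian noise with std \sigma.
p(r \mid y, \lambda) \;\propto\; p(y \mid r)\, p(r \mid \lambda),
\qquad
p(y \mid r) = \prod_{k} \prod_{t \in S_k(r)} \mathcal{N}\!\left(y_t \mid \mu_k, \sigma^2\right)

% Taking logarithms (terms that do not depend on r are collected in "const"):
\log p(r \mid y, \lambda)
  = -\frac{1}{2\sigma^2} \sum_{k} \sum_{t \in S_k(r)} (y_t - \mu_k)^2
    + \log p(r \mid \lambda) + \text{const}
```

Under this assumption, a small \sigma gives the data-fit term a large weight relative to the prior, which favors more change points, while a large \sigma lets the prior p(r | lambda) dominate and suppresses them.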


C-5.
When conducting side-channel power attacks, it is important to give information about the hardware on which the victim application was running, the way each of the samples was obtained and the targeted algorithm. If you have used traces from any public dataset, you should mention it. Besides, authors say they retrieve the secret exponent, but they never mentioned the algorithm, details of its implementation or even the key length. In order to claim that this approach is general, it has to be tested in different scenarios.

R-5. The same as R-4 <2>

 

All the changes have been highlighted in the manuscript.

Round 2

Reviewer 1 Report

Comments about future research and extensions should be added to the conclusions section to improve it.

Are there cases when this type of approximation can be used more effectively?

What are the major weaknesses of this method? Please comment.

Can it work effectively in large scale models?

The proposed results by the authors could be applied to other related applications like problems related to discrete time systems or power systems which can benefit from side channel analysis, which exploits statistical signal processing with Bayesian approach. Some suggested references that could be added for this are the following including some references therein:

Dassios I., Szajowski K. Bayesian optimal control for a non-autonomous stochastic discrete time system. Applied Mathematics and Computation, Elsevier, Volume 274, pp. 556-564 (2016).

I. K. Dassios, K. Szajowski, A non-autonomous stochastic discrete time system with uniform disturbances, IFIP Advances in Information and Communication Technology, Volume 494, 2016, Pages 220-229 (2016).

Dassios I., Zimbidis A., Kontzalis C. The Delay Effect in a Stochastic Multiplier–Accelerator Model. Journal of Economic Structures, Springer, Volume 3, Issue 7, pp. 1-24 (2014).

Author Response

C-1) Are there cases when this type of approximation can be used more effectively?
C-2) What are the major weaknesses of this method? Please comment.

R-1,2) Our approach first finds the piecewise-constant parts between the operations and then clusters the segments.
It therefore works effectively when the idle part of the time series can be detected with mean change detection.
The weakness is the opposite case, that is, when the idle part changes drastically, with a magnitude bigger than \sigma.
We have added the major weakness to the discussion part.
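
To make the condition in R-1,2 concrete, the following is a minimal sketch of mean change detection on a piecewise-constant signal. It is a simplified sliding-window stand-in, not the Bayesian change point model of the manuscript; the function names, the `window` and `threshold` parameters, and the toy trace are all hypothetical.

```python
from statistics import mean


def detect_mean_changes(trace, window, threshold):
    """Return indices where the mean of the next `window` samples differs
    from the mean of the previous `window` samples by more than `threshold`.

    Nearby detections (closer than `window`) are merged, keeping the
    index with the largest mean difference.
    """
    changes = []  # list of (index, score)
    for i in range(window, len(trace) - window + 1):
        before = mean(trace[i - window:i])
        after = mean(trace[i:i + window])
        score = abs(after - before)
        if score <= threshold:
            continue
        if changes and i - changes[-1][0] < window:
            # Same change as the previous detection: keep the stronger one.
            if score > changes[-1][1]:
                changes[-1] = (i, score)
            continue
        changes.append((i, score))
    return [i for i, _ in changes]


def split_segments(trace, changes):
    """Cut the trace at the detected change points."""
    bounds = [0] + list(changes) + [len(trace)]
    return [trace[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy trace (assumed): idle level 0, one operation at level 5, idle again.
toy_trace = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
toy_changes = detect_mean_changes(toy_trace, window=3, threshold=2.5)
toy_segments = split_segments(toy_trace, toy_changes)
```

In this sketch the detector only works because the idle level stays flat. If the idle part itself drifted by more than the threshold, the same test would report change points inside it, which mirrors the weakness the authors describe.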

C-3) Can it work effectively in large scale models?

R-3) We have added one more experiment and its result in Section 4.5.

Reviewer 2 Report

Thank you for highlighting the changes in the manuscript, it really helps with the review. However, it has an inconvenience. Note that if I just look at the highlighted parts, and considering that you received a major review one could say that you have spent more time answering me than editing the paper. For example, I feel like there has been just a slight modification regarding the English in the manuscript. Some examples (there are more, so please review the manuscript):

"In general, the number of clusters are known known in data and if it is" --> is known

"it becomes very trivial problem to decide which operation or period is related to one of clusters" --> it becomes a very trivial problem to decide which operation or period is related to each of the clusters.

"We apply the median filter to power trace with its stride size same as the window length and reduce the length of power trace to analyze" --> We apply the median filter to the power trace. We use the stride size equal to the window length. As a result, we reduce the length of power trace to analyze. (That's just what I believe you mean)

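The preprocessing step quoted in the last example above (a median filter whose stride equals the window length) can be sketched as follows. This is only an illustration of that reading of the sentence; the function name and the choice to drop an incomplete trailing window are assumptions.

```python
from statistics import median


def median_downsample(trace, window):
    """Apply a median filter with stride equal to the window length.

    Each non-overlapping block of `window` samples is replaced by its
    median, so the output is roughly len(trace) / window samples long.
    An incomplete trailing block is dropped (an assumption of this sketch).
    """
    return [
        median(trace[i:i + window])
        for i in range(0, len(trace) - window + 1, window)
    ]


# Toy trace with two outliers (9 and 100) that the median suppresses.
reduced = median_downsample([1, 9, 2, 3, 100, 4, 5, 6, 7, 8], window=3)
```

Because the blocks do not overlap, the filter both removes isolated spikes and shortens the trace by a factor of about `window`.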
Having said that, let's move to the technical part. I still like the approach and I consider it solves in an elegant way part of the synchronization problem between the power trace observed and the real execution of the target algorithm. The idea is quite clear, as it is the procedure you propose. Thank you for adding some details about the platform, just a couple of questions here:

1) You are using the square and multiply implementation of RSA, are you using it from any library? (Bouncy Castle, OpenSSL...) or did you implement it? Please add this information too.

2) What's the frequency of the processor in the Samsung Galaxy S3? Just to compare it with the oscilloscope.

I appreciate that you have added a comparison with previous approaches, but you say "Our approach showed better performance in finding the locations of the executed operations and finding that of inserted noise". That is true based on the plots, but "better" does not provide any precise idea about the degree of improvement. Use accuracy or precision or an F1 score for that.

The length of the key you use in the experiments (16 bits) is really short, which makes me wonder how much your approach is affected by the key length. What is the limit? Key length? The ratio between the frequency of the processor and the sampling device? I'd like it if you could elaborate a little bit on that.

To sum up, even when I am satisfied with most of the changes and the explanations given by the authors, I cannot say "minor review" because I would like to see the language reviewed and some other details fixed before that. 

Author Response


C-1) You are using the square and multiply implementation of RSA, are you using it from any library? (Bouncy Castle, OpenSSL...) or did you implement it? Please add this information too.

R-1) We implemented it.
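
For readers unfamiliar with the algorithm under discussion, the following is a textbook left-to-right square-and-multiply sketch. It is not the implementation used to generate the authors' traces; the function names and the "S"/"M" operation log are hypothetical, added only to show why the sequence of operations reveals the exponent.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary modular exponentiation.

    Returns the result and a string logging each operation:
    "S" for a squaring, "M" for a multiplication.
    """
    result = 1
    ops = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus  # always square
        ops.append("S")
        if bit == "1":
            result = (result * base) % modulus  # multiply only on 1-bits
            ops.append("M")
    return result, "".join(ops)


def recover_exponent(ops):
    """Rebuild the exponent from the operation log: "SM" is a 1, a lone "S" is a 0."""
    bits = ops.replace("SM", "1").replace("S", "0")
    return int(bits, 2)


value, op_log = square_and_multiply(5, 0b1011, 221)
```

If an attacker can segment a power trace into operations and label each segment as a squaring or a multiplication, `recover_exponent` shows that the secret exponent follows directly from that labeling.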

C-2) What's the frequency of the processor in the Samsung Galaxy S3? Just to compare it with the oscilloscope.

R-2) Specification says it's "1.4GHz Quad-core Cortex A9".

C-3) I appreciate that you have added a comparison with previous approaches, but you say "Our approach showed better performance in finding the locations of the executed operations and finding that of inserted noise". That is true based on the plots, but "better" does not provide any precise idea about the degree of improvement. Use accuracy or precision or an F1 score for that.

R-3) A ground truth for the execution times of the operations does not exist. In general, the operations are currently located by visual inspection, and this is the very point where our approach is needed.
From this point of view, we could create a ground truth by dividing the time series into operations by eye and then evaluate each method against it. However, we thought it inappropriate to use such a ground truth and write "our approach showed better performance when we refer to the ground truth that has been set by us".
Nevertheless, as the reviewer requested, we have quantified how much improvement our approach shows.

[This part is provided to the reviewer only and is not added to the manuscript]
With our approach, we obtained the following result:
Sensitivity : 100%, Precision : 100%
With the naive algorithm, the result depends on the minimum distance (MD) between two peaks. (Figure 8 in the article shows the result for MD = 20, which was chosen so that the number of peaks is similar to the number of ground-truth change points.)

The attached graphs show the precision, sensitivity and F1 score of naive peak detection.
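
One way such scores could be computed for detected change points is sketched below. The response does not state the matching rule that was used, so the greedy one-to-one matching within a `tolerance` (in samples), the function name, and the toy numbers are all assumptions.

```python
def change_point_scores(detected, truth, tolerance):
    """Precision, sensitivity (recall) and F1 for detected change points.

    A detection counts as a true positive if it lies within `tolerance`
    samples of a ground-truth change point that has not been matched yet
    (greedy one-to-one matching, an assumption of this sketch).
    """
    unmatched = sorted(truth)
    true_pos = 0
    for d in sorted(detected):
        hit = next((t for t in unmatched if abs(t - d) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    false_pos = len(detected) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / (true_pos + false_pos) if detected else 0.0
    sensitivity = true_pos / (true_pos + false_neg) if truth else 0.0
    total = precision + sensitivity
    f1 = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f1


# Toy example: two of four detections fall within 2 samples of a true change point.
toy_scores = change_point_scores([11, 19, 25, 40], [10, 20, 30], tolerance=2)
```

Reporting all three numbers, together with the tolerance used, addresses the reviewer's point that "better" alone does not convey the degree of improvement.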

 

C-4) The length of the key you use in the experiments (16 bits) is really short, which makes me wonder how much your approach is affected by the key length. What is the limit? Key length? The ratio between the frequency of the processor and the sampling device? I'd like it if you could elaborate a little bit on that.

R-4) We have added one more experiment and its result in Section 4.5.

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

The article is nicely written, but the approach is simple and, in my opinion, the authors should work more on adding recent references; the current ones are extremely old for such a general approach. In addition, there is still no strict comparison.

 

Author Response

[Recent references]

Recent references have been added to the Introduction and Discussion sections.

[On strict comparison]

To the best of our knowledge, our approach is the first to apply change point detection to side-channel analysis. That is why we compare our approach with relatively 'old' methods. The comparison with other methods is in the first experiment (Section 4.1).

Reviewer 2 Report

As I have said during the whole review process, I really like the idea of this work. I appreciate the authors' willingness to answer all my questions about their experimental setup and the different results they have added to the paper, which, in my opinion, have helped to clarify their work. I am satisfied with those, so thank you.

The authors still have to proofread the paper before it gets published, but the current version is already quite clear.

About the technical part, currently, my main concern is how a reader of this work could reproduce your results for comparison. Since you have coded your own version of the encryption algorithm, this can be hard. So I was wondering if it is possible to make the implementation available somehow (GitHub, for example), and in that case, you should add a link to the source code in the paper.

Author Response

[For reproducibility]

Our work was done "given RSA power trace data". The implementation of the RSA algorithm was not part of our work and therefore we do not have the source code that generated the power trace data.

The data were generated in another part of the project, which we credit in the acknowledgements.

I have uploaded our source code together with the data for reproducibility, and I have added the link to the conclusion part.
