Article
Peer-Review Record

Performance Comparison of Machine Learning Disruption Predictors at JET

Appl. Sci. 2023, 13(3), 2006; https://doi.org/10.3390/app13032006
by Enrico Aymerich 1,*, Barbara Cannas 1, Fabio Pisano 1, Giuliana Sias 1, Carlo Sozzi 2, Chris Stuart 3, Pedro Carvalho 3, Alessandra Fanni 1 and the JET Contributors †
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4:
Submission received: 29 December 2022 / Revised: 26 January 2023 / Accepted: 1 February 2023 / Published: 3 February 2023

Round 1

Reviewer 1 Report

Manuscript number: APPLSCI-2160223

 

Referee’s report on “Performance Comparison of Machine Learning Disruption Predictors at JET”

By E. AYMERICH, B. CANNAS, F. PISANO, G. SIAS, C. SOZZI, C. STUART, P. CARVALHO, A. FANNI, AND THE JET CONTRIBUTORS

 

Tokamaks use strong magnetic fields to confine high-temperature plasmas, with the goal of creating the conditions for extracting power from fusion reactions in the plasma.

However, the plasma is the seat of many instabilities that may lead to minor or major disruptions. Major disruptions abruptly destroy the plasma’s magnetic confinement, thus terminating the fusion reaction and rapidly depositing the plasma energy into the confinement vessel. The resulting thermal and electromagnetic force loads can irreparably damage key device components. The physics behind these mechanisms is complex and poorly understood at present. In addition, many physical processes, both kinetic and fluid (MHD), seem to intervene in the generation and amplification of the so-called major disruption mechanism. This mechanism is often linked to the L–H (low-to-high confinement) transition, with the formation of a transport barrier that is generally unstable.

 

For disruption avoidance in large-scale tokamaks such as ITER, a diagnostic system that provides adequate information to guide the steering of the tokamak is therefore required. If an impending major disruption is predicted with sufficient “warning time”, a disruption mitigation system using techniques such as massive gas injection can be triggered. This mitigation system terminates the discharge but significantly reduces the deleterious effects of the disruption. Missing a real disruption, or calling it too late, is usually costly because its damaging effects go unmitigated (and must be avoided in large-size tokamaks), while triggering a false alarm (false positive) wastes experimental time and resources. Moreover, such systems must deliver their predictions in real time. The manuscript proposed by the authors falls within this line of study and compares three classical algorithms from the artificial intelligence community: MLP-NN (a traditional multi-layer perceptron neural network), GTM (generative topographic mapping), and CNN (a convolutional deep learning neural network).

 

I believe the paper contains significant results and should be published after the following matters are dealt with properly. For publication the presentation needs improvement in several respects. Items to be addressed by the authors:

 

1) One of the strengths of the paper is the generation of a two-dimensional map, presented for example in Figure 3, obtained by the GTM technique. The training sequence was used to assign a different color to the map units. According to the authors, GTM has a less smooth disruptive likelihood and needs an assertion time to trigger the alarm. Is an assertion time introduced in the training phase to draw the two-dimensional map? Doesn’t introducing an assertion time in the GTM algorithm, which avoids false alarms, create a certain asymmetry between the three numerical schemes?
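The assertion-time mechanism this point asks about can be sketched as follows. This is a minimal illustrative example, not the authors’ implementation; the function name and parameters are hypothetical:

```python
def trigger_alarm(likelihood, threshold, assertion_steps):
    """Fire an alarm only after the disruptive likelihood has stayed
    at or above `threshold` for `assertion_steps` consecutive samples.
    Returns the sample index at which the alarm fires, or None."""
    run = 0
    for i, p in enumerate(likelihood):
        run = run + 1 if p >= threshold else 0
        if run >= assertion_steps:
            return i  # alarm fires here; isolated spikes are ignored
    return None

# A single spike above threshold does not trigger; a sustained run does.
alarm_index = trigger_alarm([0.1, 0.9, 0.2, 0.9, 0.95, 0.97], 0.8, 3)
```

The asymmetry concern is visible here: such a filter suppresses false alarms from a noisy likelihood, so a predictor evaluated with it faces a different effective decision rule than one evaluated without it.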

 

2) In the case of the GTM algorithm, could the authors give some additional information on how this map is obtained? I find it difficult to understand how grey cells can arise, i.e. an undefined situation (mixed disrupted and non-disrupted states), and why the GTM algorithm does not manage to converge to a solution. Is it due to a lack of information in the training phase, an insufficient number of shots in the training sequence, or an experimental shot with missing information?

 

3) A comparison between the three methods is provided in tables 4, 6, and 8. These three tables use the same database of P = 108 disrupted pulses and N = 149 non-disrupted pulses. These data constitute the "test set" of table 2, obtained from JET only. The total number of test cases (108 + 149) seems quite low in absolute terms. In the case of the future ITER reactor, it seems unlikely that a database of disruptive pulses will be available, given the risk of damage that these disruptions could cause. In this case, how can we guarantee that the different algorithms will be as efficient on a different machine such as ITER, even if JET is a large tokamak? Did the authors test one of the algorithms on another machine and obtain similar results? In other words, can we talk about the portability of the algorithms to a tokamak other than JET?

 

4) Comparing tables 4, 6 and 8, the MA (missed alarm) indicator is very similar in tables 4 and 8 (MA = 2.78%) and somewhat lower in table 6 (GTM algorithm) (MA = 1.85%). Thus, it seems that the GTM algorithm is able to detect a disruption that the other two algorithms miss. Have the authors analyzed these particular cases in more detail to understand the reasons for, or the physical information associated with, this detection by GTM?

 

5) One method that could accelerate the convergence of the algorithms is to find a precursor to a major disruption that depends directly on a physical phenomenon (a tearing mode, for example, with a mode-locking type m/n = 1/1, 1D profile, …). Did the authors try to force the search for a precursor from a physical law, introduced directly by the authors, to see whether the MA or AUC indicators were strongly modified?

 

6) A direct comparison on the exact same database is key to accurately measuring progress and to allowing a detailed comparison of the relative strengths and weaknesses of the different methods. The shots in the JET ILW campaigns may run at high power and plasma current, have higher disruption rates, and may be affected by active disruption mitigation systems. Thus, many shots can be terminated by the DMS long before the onset of a disruption. In such cases, it is accordingly impossible to know whether any disruption would actually have occurred. For each shot, if there is a disruption, it only occurs at the end, which makes the problem complex.

 

One may also wonder, given the limited number of shots used in the training phase, whether the number of shots that can actually be used is sufficient to guarantee the full efficiency of the algorithm. The authors should clearly indicate the limits of this approach here.

 

This type of situation may be quite common for campaigns carried out on large tokamaks such as JET; the situation is perhaps a little less critical for campaigns carried out on smaller tokamaks. Shouldn’t we consider improving the training phase by taking into account, in addition to the shots made on JET, a database of complementary shots made on a small tokamak, which would provide a much larger sample of shots?

 

7) Could the authors specify the type of "normalized locked mode" used in the training phases of the various algorithms, or how it is taken into account in each of them? It seems that it is not taken into account in GTM during the training phase, but rather directly in the decision phase, which can somewhat "distort" the comparison between the various algorithms. This should be stated more clearly in the text.

 

8) Page 18, in the conclusion: MLP-NN, GTM and CNN have been trained as disruption predictors using the same plasma parameters: the electron temperature, density and radiation profiles, the locked mode signal, the radiation fraction of the total input power, and the internal inductance.

 

The choice of plasma parameters remains quite limited; however, it appears consistent with the choice retained in ref. 25 (locked mode, plasma current, radiated power, safety factor, i.e. magnetic field pitch), which contains a large amount of disruption-relevant information. Could the authors indicate which plasma data they consider essential to include in the algorithms to guarantee a good prediction?

 

9) From the information given in tables 4, 6 and 8, it is difficult to estimate whether the false alarms come from the classical methods, which are often erratic and not attributable to physically meaningful events, or are associated with small disruptive events, which are in principle more difficult to detect. For each shot, if there is a disruption, it only occurs at the end. Moreover, depending on the machine, disruptions in JET can be quite rare (<10% of shots), which means that the actual learning signal for disruption events is quite sparse. Do the authors use an up-weighting technique (as a hyperparameter) to stabilize the training and increase performance?
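The up-weighting this point refers to is commonly implemented with inverse-frequency class weights in the training loss. A minimal sketch, assuming a binary disrupted/non-disrupted labeling (the function name is hypothetical):

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency,
    normalized so that the average sample weight equals 1. Rare
    (disruptive) samples are thus up-weighted in the training loss."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * m) for cls, m in counts.items()}

# With 10% disruptive samples, the minority class gets weight 5.0
weights = inverse_frequency_weights([0] * 90 + [1] * 10)
```

Such weights can be passed to most training frameworks (e.g. as per-sample loss multipliers) to counteract the sparsity of disruptive samples.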

 

10) A user of a deployed version of the predictive system must define an alarm threshold, such that when the output signal reaches a certain value an alarm is triggered and disruption mitigation actions are engaged. This alarm threshold allows the user to trade off between maximizing true positives and minimizing false positives. The thresholds obtained for the three algorithms seem very close; is this related to the metric used (the area under the ROC curves)?
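The true-positive/false-positive trade-off mentioned here is typically resolved by sweeping candidate thresholds along the ROC curve and picking the one maximizing Youden's J statistic. An illustrative sketch (one common criterion among several; the function name is hypothetical):

```python
def best_threshold(scores, labels, candidates):
    """Among candidate alarm thresholds, pick the one maximizing
    Youden's J = TPR - FPR, i.e. the ROC point farthest above the
    chance diagonal. Labels: 1 = disrupted, 0 = regularly terminated."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    def youden_j(t):
        tpr = sum(s >= t for s in pos) / len(pos)  # true-positive rate
        fpr = sum(s >= t for s in neg) / len(neg)  # false-positive rate
        return tpr - fpr

    return max(candidates, key=youden_j)
```

If all three predictors produce similarly shaped ROC curves, a criterion like this will naturally place their operating thresholds close together, which may explain the observation.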

 

11) Usually, many signals (such as the plasma current, stored energy, …) present different characteristic scales on different tokamak machines, in particular on ITER. A normalization should ideally be used such that the normalized signal has the same “physical meaning” across devices. Since it is practically impossible to operate the future ITER reactor in a disruptive regime, and since the characteristics of the machine remain out of the ordinary (e.g. a strong contribution of alpha particles to self-heating is expected, a situation that cannot be encountered in smaller machines), what approach is envisaged for using these algorithms, ultimately developed for smaller tokamaks, on ITER?
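One simple form of the normalization suggested here is a per-machine z-score, so that signals with very different absolute scales become comparable across devices. A sketch under that assumption (dimensionless physics-based scalings, e.g. normalizing by the flat-top plasma current, would serve the same purpose):

```python
import statistics

def zscore_per_machine(signals):
    """Normalize each machine's signal with its own mean and standard
    deviation, so that values from devices with different absolute
    scales (e.g. plasma current on JET vs. a smaller tokamak) become
    comparable. `signals` maps a machine name to raw samples."""
    normalized = {}
    for machine, xs in signals.items():
        mu, sigma = statistics.mean(xs), statistics.pstdev(xs)
        normalized[machine] = [(x - mu) / sigma for x in xs]
    return normalized
```

Note that a purely statistical rescaling like this does not guarantee the same "physical meaning" across machines, which is precisely the concern raised in this point.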

 

12) The authors do not provide any indication in the manuscript of the computational performance or the HPC approach envisaged. The algorithms used only reach their full capacity with large computational resources and are particularly well suited to GPUs, which allow several shots to be processed in parallel, especially in the training phase. Could the authors indicate in a few lines the approach used in terms of computation and the type of resources employed?

 

13) These various algorithms allow a rather fine analysis of the experimental data. On the other hand, the L–H transition was not predicted theoretically but was first observed experimentally. Could these techniques be used to detect correlations or hidden physical mechanisms, thus allowing a contribution from the theoretical point of view?

Comments for author File: Comments.pdf

Author Response

Thank you for your review; following your and the other Reviewers’ evaluations, we amended the paper. We report the comments here, with our replies in red.

Author Response File: Author Response.docx

Reviewer 2 Report

In this study, three different machine learning models were used for disruption prediction in view of ITER operation. The most common performance indices have been used to compare the different DP models. This study has good reference value for the application of artificial intelligence in the field of nuclear power. In my opinion, this paper can be accepted.

Author Response

Thank you for your review; following your and the other Reviewers’ evaluations, we improved the description of the methods.

Author Response File: Author Response.docx

Reviewer 3 Report

The manuscript compares the performance of three machine learning algorithms as disruption predictors at JET. The paper merely compares the performance of three machine learning techniques, and as such no new technique or improvement is presented. Therefore, the novelty is minimal and this is purely a horse-race paper. Despite this shortcoming, the paper may be of interest to researchers in this particular field of study. Overall the paper is well written and clearly explains the process; however, there are some minor comments that shall be addressed before final acceptance.

General Comments:

1. The dataset is too small (412 instances); why is deep learning selected for such a small dataset?

2. What measures are in place for addressing overfitting? 

Specific Comments:

0. Abstract

0.1 Define terms such as ITER, DEMO, and CFETR before using them.

1. Introduction

1.1 Line 37: "Tokamak nuclear..... " [Provide a reference at the end of the sentence]

2. Results

2.1 Table 3: The number of hidden neurons is given as 10; however, it would make more sense to state the number of hidden layers and the number of neurons in each layer.

Author Response

Thank you for your review; following your and the other Reviewers’ evaluations, we amended the paper. We report the comments here, with our replies in red.

Author Response File: Author Response.docx

Reviewer 4 Report

In this article, the performance indices have been used to compare different DP models, in particular 3 machine learning methods: MLP-NN, GTM, and CNN. This article does not contain new methods but provides a case study and tutorial, and this is helpful for industrial practice.

 

I have the following comments:

 

1. Regarding the data selection in Section 2.2, some explanation should be given of the data set changes caused by the changes in the JET operating space in 2013. If it is not clearly indicated whether the JET operation changes affect the data quality, or whether some processing of the data has been changed, this will have a certain impact on the validity of the experimental results and the credibility of the method.

 

2. In terms of the structure of the article, Section 2 should focus on the data and models, but in the experimental results of Section 3 the models are specifically described and introduced in combination with this project. Would it be better to move this to Section 2? Moreover, in Sections 2.3.1 and 2.3.3, MLP-NN and CNN are not explained in the context of this project, whereas in Section 2.3.2 GTM is.

 

3. The caption of Figure 3 in Section 2.3.2 is incomplete, and the specific mapping process of GTM is not made clear when GTM is introduced, which makes GTM difficult to understand.

Author Response

Thank you for your review; following your and the other Reviewers’ evaluations, we amended the paper. We report the comments here, with our replies in red.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Manuscript number. APPLSCI-2160223 

 

Referee’s report on “Performance Comparison of Machine Learning Disruption Predictors at JET”

By E. AYMERICH, B. CANNAS, F. PISANO, G. SIAS, C. SOZZI, C. STUART, P. CARVALHO, A. FANNI, AND THE JET CONTRIBUTORS

 

This paper is a useful contribution to the understanding of a complex problem of growing scientific interest. The authors have modified the initial manuscript and have fully answered the various questions. I believe the paper contains significant results and should be published in its present form.

Reviewer 4 Report

This version is better than the previous version. I don't have major comments.
