Article
Peer-Review Record

Feasibility of Principal Component Analysis for Multi-Class Earthquake Prediction Machine Learning Model Utilizing Geomagnetic Field Data

Geosciences 2024, 14(5), 121; https://doi.org/10.3390/geosciences14050121
by Kasyful Qaedi 1, Mardina Abdullah 1,2,*, Khairul Adib Yusof 1,3,* and Masashi Hayakawa 4,5
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 March 2024 / Revised: 24 April 2024 / Accepted: 26 April 2024 / Published: 29 April 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. The topic of the publication is highly relevant: it follows the growing interest in applying modern mathematical methods to earthquake forecasting, since previously used technologies are not effective enough.

2. In my opinion, the authors did not pay enough attention to the description of the primary data used, especially magnetic data.

2.1. The authors use one-minute data for 1970-2021 compiled in the SuperMag database. The collected data are the results of magnetic measurements at stations and observatories of various levels and status, i.e. the data vary widely in quality and may include noise, spikes, jumps, and gaps. The authors do not specify how critical this factor is for further analysis. An alternative to the SuperMag project is the INTERMAGNET network of magnetic observatories, which defines measurement and processing standards and provides multi-level control of results.

2.2. The 50-year period covered by the magnetic data is quite long. It spans both the era when digital magnetic observations were first being established and the era of modern magnetometers. Data obtained during these epochs can differ significantly in both the quality of measurements and the quality of preliminary processing. This raises the problem of data uniformity: how critical is it?

2.3. The problem of data uniformity is also related to the problem of data completeness: the networks of magnetic observatories and stations in the 1970s and 2010s differ significantly. This applies to both their spatial distribution over the Earth and their coverage over time. The authors do not consider these issues, although it is clear that training on older, low-quality magnetic data may be ineffective in earthquake forecasting tasks.

2.4. When describing the data used, it is necessary to provide relevant references to online sources.

2.5. The authors use the term "SuperMag station". This seems incorrect, because SuperMag is only a project to compile magnetic data in a single database. The stations and observatories themselves do not belong to this project. They belong to different organizations, but can be combined into different networks depending on the tasks and standards.

2.6. Problems of completeness and uniformity also apply to long-term series of seismic data (the earthquake catalogue). In addition, no geophysical, geological or tectonic aspects are taken into account, i.e. the heterogeneity of the conditions in which the earthquakes occurred. For example, seismicity in subduction zones, spreading zones and intraplate regions differs greatly, yet the training does not take this heterogeneity into account. Why?
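
The data-quality concern in 2.1-2.3 can be made quantitative per station and per epoch. The following is a minimal sketch, assuming a one-minute series held as a Python list with missing samples stored as None or NaN. The function name and the 100 nT spike threshold are hypothetical and are not taken from the manuscript or from SuperMag.

```python
import math


def quality_summary(samples, spike_threshold_nt=100.0):
    """Summarise gaps and spikes in a one-minute series of field values (nT).

    `samples` is a list in which a missing measurement is None or NaN.
    `spike_threshold_nt` is a hypothetical cut-off on the minute-to-minute
    change; it is an assumption, not a value from the manuscript.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    n_missing = sum(1 for value in samples if is_missing(value))
    n_spikes = 0
    previous = None
    for value in samples:
        if is_missing(value):
            previous = None  # do not difference across a gap
            continue
        if previous is not None and abs(value - previous) > spike_threshold_nt:
            n_spikes += 1
        previous = value

    total = len(samples)
    return {
        "n_samples": total,
        "gap_fraction": n_missing / total if total else 0.0,
        "n_spikes": n_spikes,
    }
```

Comparing such summaries between, say, the 1970s and the 2010s for the same station would show directly whether the uniformity problem is critical for the training set.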

3. There are questions about the stage of preliminary analysis and data preparation:

3.1. It would be good to know how the authors chose the size of the zone around the magnetic observatory (200 km) and the length of the window before the earthquake (7 days).

3.2. Mainshocks are mentioned on line 92, but there is no explanation of how they are distinguished from aftershocks. No such explanation is found in article [25] either.

3.3. On lines 94-95, the authors write that they use the Ap index to "eliminate periods of geomagnetic quiescence", i.e. to analyze the perturbed magnetic field, in accordance with the work [26]. However, the following sentence says that Ap<27 was used to identify quiet conditions. Were these conditions excluded or used?

Lines 97-98 indicate that a Dst threshold of -30 nT was used to filter out very strong disturbances, with a reference to [27]. But [27] does not use the Dst index; it uses only the total Kp index.

In general, it is unclear how the authors identify quiet or disturbed segments of magnetic data and which ones they use.

3.4. Line 101: class IX, with a magnitude range of 6.5-7.0, is not indicated.

3.5. Lines 109-110 mention the complexity of the data; it would be good to illustrate this graphically.
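
The ambiguity raised in 3.3 disappears if the selection rule is written out explicitly. The sketch below is only an illustration: it uses the two thresholds quoted from the manuscript (Ap<27 and Dst of -30 nT), while the strict inequalities, the dictionary keys and the `keep` argument are assumptions. Which of the two subsets the authors actually analysed is exactly what the manuscript needs to state.

```python
AP_QUIET_MAX = 27       # daily Ap below this counts as quiet (threshold quoted in the manuscript)
DST_QUIET_MIN = -30.0   # nT; Dst above this counts as quiet (threshold quoted in the manuscript)


def is_quiet(ap, dst):
    """Return True when both indices indicate quiet geomagnetic conditions.

    The strict inequalities are an assumption; the manuscript does not say
    whether the boundary values themselves are included.
    """
    return ap < AP_QUIET_MAX and dst > DST_QUIET_MIN


def select_days(days, keep="quiet"):
    """Return the dates of the days to analyse.

    `days` is a list of dicts with the hypothetical keys 'date', 'ap', 'dst'.
    `keep` states the choice explicitly: 'quiet' or 'disturbed'.
    """
    if keep not in ("quiet", "disturbed"):
        raise ValueError("keep must be 'quiet' or 'disturbed'")
    want_quiet = keep == "quiet"
    return [d["date"] for d in days if is_quiet(d["ap"], d["dst"]) == want_quiet]
```

Stating the value of `keep` (or its equivalent) in the text would settle whether quiet periods were excluded or used.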

4. Figure 3 is difficult to understand:

4.1. What do the "Data Points" show? How are they related to time, if they are related?

4.2. What is shown on the ordinate axis? In what units?

4.3. How do the X,Y,Z on the plots relate to the real X,Y,Z obtained from SuperMag?

4.4. The graphs are highly compressed and almost no details are visible. Would it make sense to show several fragments at an acceptable scale?

5. Lines 182-184 - "Negative values aligned strongly with PC1, with a minimum of -2017.8 nT, while positive values also showed strong alignment, reaching a maximum of 1334.9 nT." (end of quotation). The logic is not very clear: why "while" if the effects for negative and positive values are similar?

6. Lines 184-185: why does the yellow box reflect the correlation? Would it make sense to plot the dependency PC1=f(X)? And why specifically for X?

7. Lines 185-186 — why did this special section arise? And why in this particular place?

8. Table 1 — is it necessary to show the values up to 0.01 nT if the changes between them noticeably exceed even 1 nT?

9. Line 237 — perhaps there should be a dot instead of a comma?
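
Regarding comment 6, the dependency PC1=f(X) can be obtained directly from the component series. The following is a rough sketch in plain Python, not the authors' pipeline: it centers the three components, finds the first principal direction of their covariance matrix by power iteration, and returns the PC1 scores so that they can be plotted or correlated against X (and equally against Y or Z). All function names are hypothetical.

```python
import math


def centered(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def covariance_matrix(columns):
    """Sample covariance matrix of already-centered columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
            for ci in columns]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration for the dominant eigenvector of a symmetric matrix.

    Starts from a uniform vector, so it assumes that vector is not exactly
    orthogonal to the dominant direction.
    """
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def pc1_scores(x, y, z):
    """Return (loading vector, PC1 score per sample) for three components."""
    columns = [centered(x), centered(y), centered(z)]
    loading = leading_eigenvector(covariance_matrix(columns))
    scores = [sum(loading[k] * columns[k][i] for k in range(3))
              for i in range(len(x))]
    return loading, scores


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    ca, cb = centered(a), centered(b)
    num = sum(p * q for p, q in zip(ca, cb))
    den = math.sqrt(sum(p * p for p in ca) * sum(q * q for q in cb))
    return num / den
```

Calling `pc1_scores(x, y, z)` and then `pearson(x, scores)`, `pearson(y, scores)` and `pearson(z, scores)` would show whether X is really the component most closely tied to PC1. The sign of a principal component is arbitrary, so only the magnitude of each correlation is meaningful.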

 

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors.

After reading your article, two or three questions remain to be answered.
The first is about the data you use.
Why the Mercalli scale? It is a very subjective way of assessing earthquakes. It can be applied only in inhabited areas or territories where buildings exist. From Figure 1 it is obvious that you consider EQs that occurred at sea or in deserts, where direct application of the Mercalli scale is not possible. I understand that you convert Richter magnitudes to Mercalli intensities, but this is not accurate at all. The conversion window from Richter to Mercalli is very wide: e.g. an EQ of 5.0-5.9 Richter, which is itself a very wide range, could be denoted as VI or VII on the Mercalli scale, which is very unclear and inaccurate.

The second question has to do with geomagnetic field data.
Figure 3 is not clear. It is difficult to distinguish the colors.
The main question is: what is the predictor or precursor in this data set? Is it the tight cluster that appears in all three panels of Figure 3? Is it the maximum or minimum values of the geomagnetic data? I think it should be clearly stated what we are meant to detect in this figure, or which important feature we should keep in mind.

Finally, the conclusion section is not very informative. It should clearly state the practical use of all this processing and how feasible it is to apply this method to realistic earthquake prediction.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript uses geomagnetic field data from the SuperMAG database to derive EQ precursor signals. Principal Component Analysis was used to reduce the dimensionality of global geomagnetic field data in order to improve the accuracy of EQ predictive models. Two types of model are considered: multi-class ML models and Support Vector Machine (SVM) models. The corresponding accuracies are obtained. In this sense, the research is simply an improvement of existing models using machine learning.

 

The present manuscript is well structured. My main remark to the authors is that it has nowhere been proven that EQs can be predicted, especially with magnetic field variation data that themselves contain a strong Dst variation, which is not removed in this case. It is well known that Dst variations, as well as Kp/Ap, are among the main parameters of solar-terrestrial physics.

 

Minor remarks:

 

1) Line 17: "have been studied" is more appropriate.

 

2) Lines 248 and 285 contain inaccuracies in the text.

 

3) Figure 1 represents the map from SuperMAG (screenshot) and is well mentioned in the text.

Author Response

Please see the attachment

Author Response File: Author Response.docx
