Communication
Peer-Review Record

Deep Learning Application for Classification of Ionospheric Height Profiles Measured by Radio Occultation Technique

Remote Sens. 2022, 14(18), 4521; https://doi.org/10.3390/rs14184521
by Mon-Chai Hsieh 1, Guan-Han Huang 1, Alexei V. Dmitriev 1,2,* and Chia-Hsien Lin 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 4 August 2022 / Revised: 2 September 2022 / Accepted: 6 September 2022 / Published: 9 September 2022
(This article belongs to the Section Satellite Missions for Earth and Planetary Exploration)

Round 1

Reviewer 1 Report

The paper is devoted to the important problem of automating the interpretation of ionospheric measurements. The authors built a model to classify ionospheric layers in radio occultation data. The research seems to be well designed and of good quality. As I see, the authors even described the hyper-parameter estimation (a procedure that is usually omitted). I would recommend some elaboration.

 

1) It is not clear how the new model advances over the “classical” model. The authors could compare with their results in [Ratovsky et al., 2016, DOI: 10.1016/j.asr.2016.12.026].

2) Usually, when authors use the term “electron content” they mean integrated parameters (for example, an electron content profile can be the latitudinal dependence of the electron content integrated from the satellite to GPS). As I understand, here the authors use it for electron density profiles. I would recommend using that term.

3) The figures are of low resolution.

4) Fig. 1. Add lines to separate (a) from (b); when two panels are labeled as (a), it is not so clear. Align the (f) caption.

5) Provide a reference for the negative values due to bias in RO.

6) What are the test/validation/training sets? I am concerned that the same data may have been used for all sets.

7) It is not clear what the ground truth was and how bad RO profiles were excluded.

8) Why 1/5 and 1/6 in Eqs. (1-3)?

9) Fig. 5a: it seems that a spurious zero-point appears and produces a line at the bottom of the profile.

10) I wonder why the authors obtained such poor classification for Es. It has quite distinguishable features.

11) For further studies, I would suggest involving ionosonde data to build a model and then classifying the RO data.

12) Also, I would note that with some modification the suggested technique could be used for global ionosphere modeling, e.g., for TEC (see, for example, [Zhukov et al., 2020, DOI: 10.1007/s10291-020-01055-1]).

 

Author Response

Reply to Reviewer #1

We greatly appreciate the Reviewer’s comments and suggestions. They are very useful for improving our manuscript.

 

1) It is not clear how the new model advances over the "classical" model. The authors could compare with their results in [Ratovsky et al., 2016, DOI: 10.1016/j.asr.2016.12.026].

It is hard to directly compare our classification models with the classical IRI model, as well as with ground-based observations by ionosondes and incoherent scatter radars, because of transient phenomena such as the Es layer and scintillations. On the other hand, we can say that the RO technique provides truthful data on EC height profiles. We added the following sentence in the Introduction Section (lines 58-61):

“The comparison of the FS-3/C-1 RO technique with the standard techniques of the ground-based digisonde and incoherent scatter radar at middle latitudes demonstrated very good agreement between the EC profiles obtained by the different techniques in both the bottomside and topside ionosphere [12 (Ratovsky et al., 2017)].”

 

2) Usually, when authors use the term "electron content" they mean integrated parameters (for example, an electron content profile can be the latitudinal dependence of the electron content integrated from the satellite to GPS). As I understand, here the authors use it for electron density profiles. I would recommend using that term.

In the revised manuscript, we have replaced “electron content” with “electron density” and use the abbreviation EC for the latter.

 

3) The figures are of low resolution.

The quality of the figures has been improved.

 

4) Fig. 1. Add lines to separate (a) from (b); when two panels are labeled as (a), it is not so clear. Align the (f) caption.

Additional lines have been added in order to separate the panels.

 

5) Provide a reference for the negative values due to bias in RO.

We added the following sentence and references to the paper (lines 51-54):

“In routine GNSS RO measurements, the effect of the bias term cannot always be eliminated, which results in unexpected negative values in the resultant EC profiles in the bottomside ionosphere [9,10 (Yue et al., 2011; Li and Jin, 2016)].”

Yue, X.; Schreiner, W.S.; Rocken, C.; Kuo, Y.-H., Evaluation of the orbit altitude electron density estimation and its effect on the Abel inversion from radio occultation measurements. Radio Sci. 2011, 46, RS1013, doi:10.1029/2010RS004514.

Li, J.; Jin, S. Second-order ionospheric effects on ionospheric electron density estimation from GPS Radio Occultation. 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2016, 3952-3955, doi:10.1109/IGARSS.2016.7730027.

 

6) What are the test/validation/training sets? I am concerned that the same data may have been used for all sets.

The numbers of samples for the training and test sets have been added as Table 2. The training set has no intersection with the test set. We added the following text in the revised manuscript (lines 150-154):

“We evenly split the six classes into training and test sets in the proportion 80% to 20%, respectively. Due to the insufficient size of the data, we do not further split a validation set from the training set. For the same reason, we apply cross-validation to reduce the bias of the model evaluation. The exact number of samples in the training and test sets for each class is shown in Table 2.”
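
For readers of this record, a minimal sketch of such a stratified 80/20 split followed by 5-fold cross-validation is given below. It is not the authors' code: the file names and variable names (profiles, labels) are illustrative assumptions, and scikit-learn is used for the splitting.

```python
# Hypothetical sketch of the stratified 80/20 split and 5-fold cross-validation
# described above. File and variable names are assumptions, not from the paper.
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

profiles = np.load("ec_profiles.npy")   # assumed file, shape (712, 261)
labels = np.load("ec_labels.npy")       # assumed file, integer class 0..5

# A stratified split keeps the 80/20 proportion within each of the six classes.
X_train, X_test, y_train, y_test = train_test_split(
    profiles, labels, test_size=0.2, stratify=labels, random_state=0)

# 5-fold cross-validation on the training set reduces evaluation bias
# when no separate validation set is held out.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for tr_idx, va_idx in skf.split(X_train, y_train):
    X_tr, X_va = X_train[tr_idx], X_train[va_idx]
    y_tr, y_va = y_train[tr_idx], y_train[va_idx]
    # train a fresh model on (X_tr, y_tr) and evaluate it on (X_va, y_va)
```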

 

7) It is not clear what the ground truth was and how bad RO profiles were excluded.

The ground truth is presented in the Supplementary Materials. The bad RO profiles were not used for the modeling. We added the following explanation in Section 2, Experimental Data (lines 117-124):

“We have analyzed around seven thousand RO profiles obtained during the years 2011 to 2013 (solar maximum) and manually selected only those containing the prominent features described above (712 profiles in total). The number of samples selected for each class is listed in Table 1. All the samples of each class are presented in the Supplementary Materials. They are used as the ground truth for the classification models (see the next Section). It should be noted that we excluded from consideration bad or corrupted RO profiles. We also did not consider profiles containing several prominent layers simultaneously, such as the Es, F1 and F2 layers, in order to avoid misclassifications during the modeling.”

 

8) Why 1/5 and 1/6 in Eqs. (1-3)?

The value 1/5 corresponds to the average over the five test sets. The value 1/6 corresponds to the average over the six classes. The explanation of the two values has been improved accordingly in the revised manuscript (lines 219-222):

“[The matrix in Eq. (1)] is the average of the five confusion matrices for the models evaluated on the corresponding test sets, [the quantity in Eq. (2)] is the average accuracy of the i-th class over the five test sets, and [the quantity in Eq. (3)] is the average accuracy over the five test sets and the six classes.”
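
To make the 1/5 and 1/6 factors concrete, the three averages can be written out as below. The notation is assumed, since the manuscript's symbols did not carry over into this record: M_k is the confusion matrix from the k-th of the five test sets and a_{i,k} is the accuracy of class i on test set k.

```latex
\begin{align}
  \bar{M}   &= \frac{1}{5}\sum_{k=1}^{5} M_k,     \\ % Eq. (1): average confusion matrix
  \bar{a}_i &= \frac{1}{5}\sum_{k=1}^{5} a_{i,k}, \\ % Eq. (2): average accuracy of class i
  \bar{a}   &= \frac{1}{6}\sum_{i=1}^{6} \bar{a}_i.  % Eq. (3): average over the six classes
\end{align}
```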

 

9) Fig. 5a - It seems that wrong zero-point appear and make a line in the bottom of profile.

No, the peak in the E-layer is supported by several firm data points. However, Figure 5 has been substantially revised.

 

10) I wonder why the authors obtained such poor classification for Es. It has quite distinguishable features.

We cannot say that an accuracy of ~0.7 for Es in the CNN model is poor classification. We have recalculated the models on the basis of the corrected data set. After that, the accuracy increased to 0.75.

We discuss this issue in the Discussion Section (lines 277-285):

“Both models demonstrate a moderate accuracy (>0.6) in the identification of the sporadic Es layer (F2E class) and the F1 layer (F2F1 class). The numbers in the confusion tables indicate that the two models have a preference either to classify F2E as the Sc class or to classify F2F3 as F2ED. This could be due to the loss of position information in the max-pooling [25] or noisy RO structures…. Note that although the performance for the F2F3 class is poor, it is still much higher than the random-guess level of 0.17.”

 

11) For further studies, I would suggest involving ionosonde data to build a model and then classifying the RO data.

12) Also, I would note that with some modification the suggested technique could be used for global ionosphere modeling, e.g., for TEC (see, for example, [Zhukov et al., 2020, DOI: 10.1007/s10291-020-01055-1]).

We added the following paragraph at the end of Section 5, Discussion (lines 335-340):

“From the confusion matrices shown in Figure 4, it can be seen that the number of misclassifications for most classes is relatively small. In the future, it will be interesting to consider the possibility of applying the present technique to the classification of ionospheric height profiles acquired from ground-based ionosondes in addition to the spaceborne RO data. With some modification, this technique might also be useful for ionospheric modeling, such as global TEC models [29 (Zhukov et al., 2021)].”

Author Response File: Author Response.doc

Reviewer 2 Report

See attached PDF

Comments for author File: Comments.pdf

Author Response

Reply to Reviewer #2

We are very grateful to the Reviewer for the valuable comments and suggestions, which helped us to significantly increase the quality of the paper.

 

- Please consider adding more information on how the Table 1 dataset is obtained. How is the initial classification done, and what is the certainty that misclassifications are not contaminating your dataset?

We added the following text in Section 2, Experimental Data (lines 117-124):

“We have analyzed around seven thousand RO profiles obtained during the years 2011 to 2013 (solar maximum) and manually selected only those containing the prominent features described above (712 profiles in total). The number of samples selected for each class is listed in Table 1. All the samples of each class are presented in the Supplementary Materials. They are used as the ground truth for the classification models (see the next Section). It should be noted that we excluded from consideration bad or corrupted RO profiles. We also did not consider profiles containing several prominent layers simultaneously, such as the Es, F1 and F2 layers, in order to avoid misclassifications during the modeling.”

 

 

- Consider explaining some basic information on what layers such as convolution and max-pooling do within a neural network architecture. It does not have to be very extensive, but some basic information should be provided. See for example [1] for motivation.

The required information is added to the model description accordingly (lines 172-175):

“The convolution kernel extracts features from the RO profile. The max-pooling layer down-samples the profile, which allows the next convolution kernel to extract coarser features.”
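
As an illustration of the two layer types described in this reply (not the authors' code), the following Keras sketch shows how a 1-D convolution produces feature maps from a 261-point EC profile and how max-pooling halves the resolution so that the next kernel sees coarser structure; the kernel sizes and feature counts are arbitrary.

```python
# Minimal, assumed sketch: Conv1D feature extraction and MaxPooling1D
# down-sampling on a 261-point electron density profile.
import tensorflow as tf

x = tf.keras.Input(shape=(261, 1))                 # one EC profile, one channel
h = tf.keras.layers.Conv1D(16, kernel_size=5, padding="same",
                           activation="relu")(x)   # -> (261, 16) feature maps
h = tf.keras.layers.MaxPooling1D(pool_size=2)(h)   # -> (130, 16), down-sampled
h = tf.keras.layers.Conv1D(32, kernel_size=5, padding="same",
                           activation="relu")(h)   # coarser-scale features
print(tf.keras.Model(x, h).output_shape)           # (None, 130, 32)
```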

 

 

- The same applies to the more advanced layers, such as batch normalization, and to the techniques of hyperparameter grid search and cross-validation. I understand that you cannot go into too much detail on these, but they should be explained and introduced before you present them. I suggest a format like “We include batch-normalization layers… these layers can improve the training time of the neural network….” and don’t forget to reference key articles when applying such techniques (i.e., [2]).

The required information has been added to the model description accordingly.

Lines 177-182:

“In this part, the output array from the convolutional layers is first flattened into a 1D array, which is then fed to 3 fully-connected (FC) layers and 2 dropout layers; the latter randomly set inputs to 0 to reduce overfitting [20].”

Lines 188-189:

“The batch normalization layer regularizes the input and can accelerate convergence [21].”

 

- Please include the separation between train/validation/test or train/validation (it is not fully explained, as far as I could see) in a clear manner. Consider adding a table similar to Table 1 but with the numbers of samples taken for training and testing. This is extremely important in such multi-class classification problems that involve imbalanced datasets.

The numbers of samples for the training and test sets have been added as Table 2. The training set has no intersection with the test set. Due to the insufficient size of the data, we do not further split a validation set from the training set. The description of the dataset splitting is added at the end of Section 2 (lines 150-154):

“We evenly split the six classes into training and test sets in the proportion 80% to 20%, respectively. Due to the insufficient size of the data, we do not further split a validation set from the training set. For the same reason, we apply cross-validation to reduce the bias of the model evaluation. The exact number of samples in the training and test sets for each class is shown in Table 2.”


- Please explain in the text the hyperparameters (k, n, r, d, p), not just in the figure captions.

The explanation of the hyperparameters has been added to the text (lines 189-193):

“In the model, 1-D convolutions of kernel size k are repeated r times. Each time, n features are extracted by the kernels. The convolutions are followed by a batch normalization layer and a max-pooling layer to form a pooling process. The pooling process is repeated d times, and the output is then flattened with a dropout rate of p.”
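
A hedged reconstruction of this description is sketched below: convolutions of kernel size k repeated r times with n features each, a pooling process repeated d times, then flattening with a dropout rate p. The exact layer ordering, activations, optimizer, and default values are assumptions, not the authors' code.

```python
# Assumed sketch of the parameterized architecture described above.
import tensorflow as tf

def build_cnn(k=5, n=32, r=2, d=3, p=0.3, n_points=261, n_classes=6):
    inputs = tf.keras.Input(shape=(n_points, 1))
    h = inputs
    for _ in range(d):                    # pooling process, repeated d times
        for _ in range(r):                # r convolutions of kernel size k
            h = tf.keras.layers.Conv1D(n, k, padding="same",
                                       activation="relu")(h)
        h = tf.keras.layers.BatchNormalization()(h)  # regularize, speed up convergence
        h = tf.keras.layers.MaxPooling1D(2)(h)       # down-sample by a factor of 2
    h = tf.keras.layers.Flatten()(h)
    h = tf.keras.layers.Dropout(p)(h)     # dropout rate p against overfitting
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(h)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```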

 

- Please include what framework/library you used for your modelling process (MATLAB, TensorFlow, Keras, PyTorch, etc.) and please reference its use appropriately.

A description of the framework and the library has been added at the beginning of Section 3 (lines 164-165):

“The models are constructed using the Keras library under the TensorFlow framework [19].”

 

- Consider adding some typical scores/evaluation metrics apart from accuracy and confusion matrices. These could include the ROC curve, recall, precision and F1 score. This is an important matter in fields adjacent to yours (e.g., solar physics [3]), and it is important for such reporting to become standard in all applied machine learning fields.

The related metrics, including the recall, precision, binary accuracy, and F1-score of the two models, have been added as Table 3. The related descriptions have also been added to Section 4 (lines 262-267).
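
For illustration, per-class metrics of this kind can be computed from the test-set predictions as sketched below, reusing the assumed names (build_cnn, X_train, y_train, X_test, y_test) from the earlier sketches in this record.

```python
# Assumed sketch: confusion matrix and per-class precision/recall/F1 on the test set.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

model = build_cnn()
model.fit(X_train[..., None], y_train, epochs=50, verbose=0)
y_pred = np.argmax(model.predict(X_test[..., None]), axis=1)

print(confusion_matrix(y_test, y_pred))                  # counts per (true, predicted) pair
print(classification_report(y_test, y_pred, digits=3))   # precision, recall, F1 per class
```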

 

Further analysis may also be included, or at least commented on. My personal suggestion is to include some of the suggestions below, since your dataset is small enough for them to be directly applicable:

- Your dataset could definitely be used to perform leave-one-out validation (K-fold cross-validation with folds of a single sample) as discussed in other works (e.g., [4, 5]). This would give a “real” value of the accuracy you obtain: you use “all your dataset” to predict the one remaining sample, and sequentially re-train your model on the rest of the dataset to test the next one, and so on.
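
A minimal sketch of the suggested leave-one-out loop is shown below, assuming the names (profiles, labels, build_cnn) from the earlier sketches; each of the 712 profiles is predicted by a model trained on the remaining 711, which is feasible only because the dataset is small.

```python
# Assumed sketch of leave-one-out validation: one held-out profile per fold.
import numpy as np
from sklearn.model_selection import LeaveOneOut

correct = 0
for tr_idx, te_idx in LeaveOneOut().split(profiles):
    model = build_cnn()                               # fresh model for every fold
    model.fit(profiles[tr_idx][..., None], labels[tr_idx], epochs=50, verbose=0)
    pred = np.argmax(model.predict(profiles[te_idx][..., None]), axis=1)
    correct += int(pred[0] == labels[te_idx][0])
print("LOO accuracy:", correct / len(labels))
```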

See our common reply to the next comment.

 

- Very importantly, while you have a clearly imbalanced dataset, you are not using (or at least not mentioning) any techniques that could greatly improve your results (see package [6]). There are many ways to do that, starting from the simplest one, imposing imbalance weights on the training process, to undersampling/oversampling techniques (e.g., [7]), or even directly adapting the loss function to take into consideration the different ratios between your classes (see [4], Eq. 2).

I strongly believe such adaptations would provide better results. Consider adding them, or elaborate in your response on why you avoid their use.

We have recalculated the models in several different ways. Although the poor performance on the imbalanced class can be improved by adjusting the weighting in the loss function, we found that the performance on the classes not affected by the imbalance problem decreases accordingly. We have added the following text in the Discussion Section (lines 306-313):

“Although adjusting the weighting in the loss function can improve the poor performance on the imbalanced class [26], we found that the performance on the unaffected classes also decreased accordingly. As mentioned earlier, due to the size of the dataset, we apply the cross-validation technique to reduce the bias in the model evaluation. However, it is still computationally expensive to analyze the change of the error in the model evaluation with respect to the cross-validation splitting size [26, 27, 28]. As a result, further adjustments of the weightings in the loss function and a more detailed analysis of the cross-validation splitting size are considered as future work.”
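
One common form of the loss re-weighting discussed here is sketched below: inverse-frequency class weights passed to Keras fit(). This is an assumed illustration, not necessarily the weighting scheme the authors tested.

```python
# Assumed sketch: balanced class weights for the imbalanced six-class dataset.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))   # {0: w0, 1: w1, ..., 5: w5}

model = build_cnn()
model.fit(X_train[..., None], y_train, epochs=50,
          class_weight=class_weight, verbose=0)   # rare classes weigh more in the loss
```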

 

 

All figures: Consider providing them in a vector graphic format to allow better readability (i.e., .pdf or .eps rather than .png or .jpg).

 

Figure 1: Please re-make this figure with a much larger font size. At the moment it is literally impossible to read or evaluate. Also, ideally put the labels (a, b, c, d, e) above the panels rather than below, since the figure is read top to bottom.

The figure has been improved.

 

 

Figure 4: In the confusion matrix, consider including the absolute number for each case (not just the percentage), and explain the colorbar in the caption.

The absolute numbers for the two confusion matrices have been added as Figures 4c and 4d, and explanations of the colorbars have been added to the caption accordingly.

 

 

Figure 5: Same as Figure 1

The figure has been improved.

 

 

Minor Concerns:

Discussion section: ---------------------------------------------

The discussion section at the moment is not much of a discussion but mainly a restatement of the results presented above. Please consider elaborating on a number of matters that could be interesting from a physical and methodological point of view, such as:

 

- Elaborate on why both models achieve a worse-than-chance (<50%) accuracy for the F2F3 class. Can this be the result of a problematic input dataset, or of mislabeling between other classes? How accurate is your ground truth to begin with?

The explanation of the misclassification of F2F3 is presented in the Discussion Section:

Lines 282-285:

"The statistics of this class is relatively low (see Table 1) because the F3 layer occurs relatively rare, which might be a reason of the low performance for prediction. Note that although the performance for the F2F3 class is poor, it is still much higher than the lucky guess of 0.17."

Lines 299-305:

“The prediction of the relatively weak F3 layer is difficult for both models, and it is often misclassified as the F2ED class. In addition to the poor statistics, this shortcoming can also be the result of a problematic input dataset. As one can see in Figure 5e, the prominent F3 layer at heights >400 km is accompanied by a weak EC enhancement in the bottomside ionosphere at heights <200 km. The latter is a distinct feature of the F2ED class. This situation occurs so often that it is very difficult to select samples with a pure F3 layer.”

 

 

- Are there other works (apart from [8]) that have applied different techniques to your problem in the past that you would like to mention and compare with? Maybe you can add a few more references there if they exist.

We do not know of other works devoted to ML applications for the classification of RO profiles. We would appreciate the Reviewer's recommendation of any relevant paper.

- Since the results, as you write on lines 229-237, are not better than DIAS, maybe a further discussion of possible future work can be added that could be beneficial for other research groups tackling this problem?

There are two crucial differences between DIAS and our technique: 1. the size of the data sets and 2. the dimensionality. We explain these important issues in the Discussion Section (lines 314-321):

“The average level of accuracy obtained in this study is lower than that obtained by the DIAS ground-based ionosonde technique, which gives an average score of ~0.91 [11]. For the F2, F1 and E layers, our classification provides accuracies comparable with the ARTIST technique [15]. The DIAS technique resolves a 2D image segmentation problem for 19,000 ionograms. Hence, this very deep learning technique is based on a huge number (at least 17 million) of training parameters. While it is interesting to test whether a 1D version of DIAS would benefit the classification of RO profiles, the number of parameters may be too large to train on our small dataset.”


Also consider providing a list of the misclassifications observed in the dataset in the supplementary file, for further investigation by your peers.

Full lists of both the correct and incorrect classifications are presented in the Supplementary Materials. There are 36 subdirectories for each model, named 00, 01, 02, ..., 55, corresponding to the i-th row and j-th column of the absolute-number confusion matrices.

 

 

Edits/Typos:

- Line 33: known → know

- Line 45: eliminated. This results → eliminated that results

- Line 64: widely accepted → widely-accepted (even better if you skip the expression entirely)

- Line 81: often including → including often

- Line 83: six → 6

- Line 135: model → models

- Line 272: both the → both

Corrected. Thank you for the help.

Author Response File: Author Response.doc

Reviewer 3 Report

This paper applied a CNN to classify the height profiles of electron content measured through radio occultation in the COSMIC/Formosat-3 mission. Based on the prominent ionospheric layers and distorted profiles, six classes of height profiles are distinguished. A 1-dimensional CNN and a fully connected network have been applied for classification.

I have several points and comments that will enhance the paper structure and its readability.

At first, the abstract is written very poorly. Readers cannot grasp the actual theme of what has been done in the paper. No background knowledge or definitions are given, nor are the problems or challenges explained.

The keywords are neither sufficient nor explanatory.

Similarly, the introduction section does not cover any visual details to reflect the generic flow of the method. The main contributions are not clear; the authors are advised to add the contributions in bullet form after the challenges and problems existing in the classification of ionospheric height profiles.

The scientific language of the paper is very weak. Also, there are several confusing sentences, such as "FS-7/C-2 has a low inclination that makes it possible detail scanning of the ionosphere at low latitudes using a TriGNSS Radio Occultation System (TGRS) that receives the refracted signals...". What do the authors mean here? Please rephrase it and remove such confusion from the whole manuscript.

Similarly, some contents are misplaced; for instance, Section 2, Experimental Data, should be discussed in the Results section. The main proposed section is not discussed well enough to reflect the working of the CNN. The authors can get thoughts and expressions from https://doi.org/10.1002/int.22537 and DOI: 10.1109/TII.2021.3116377 by their inclusion in the literature.

In Figure 2, what does 261 as input represent?

Several claims and statements have no supporting references. For instance, Sections 4 and 5 do not have a single reference.

Finally, there are several typos and grammatical mistakes. The authors need to work on the sentence structure.

Author Response

Reply to Reviewer #3

We thank the Reviewer for the comments and suggestions. They were very useful for the improvement of our paper. The manuscript was revised accordingly.

 

At first, the abstract is written very poorly. Readers cannot grasp the actual theme of what has been done in the paper. No background knowledge or definitions are given, nor are the problems or challenges explained.

We modified the beginning of the abstract accordingly:

“Modern space missions provide a great number of height profiles of the ionospheric electron density measured by the remote sensing technique of radio occultation (RO). The deduction of the profiles from the RO measurements suffers from a bias resulting in negative values of the electron density. We developed a machine learning technique that allows automatic identification of the ionospheric layers and avoids the bias problem. A Convolutional Neural Network algorithm was applied for the classification of the height profiles.”

 

The keywords are neither sufficient nor explanatory.

We modified the keywords as follows:

“radio occultation; ionospheric layers; machine learning; Convolutional Neural Networks”

Here:

radio occultation – a remote sensing experimental technique

ionospheric layers – the structure of the ionosphere

machine learning – a general method for data analysis and modeling

Convolutional Neural Networks – a particular ML technique for modeling

We would appreciate the Reviewer's advice on any more explanatory keywords.

 

Similarly, the introduction section does not cover any visual details to reflect the generic flow of the method. The main contributions are not clear; the authors are advised to add the contributions in bullet form after the challenges and problems existing in the classification of ionospheric height profiles.

We added the bullet list at the end of the Introduction (lines 77-93):

“In the present paper, we propose a new approach for the identification and classification of the ionospheric layers. The approach is based on the application of machine learning to the analysis and modeling of EC height profiles acquired by the spaceborne RO GPS technique in the FS-3/C-1 mission. This technique makes it possible to automatically process the great amount of EC data for the determination of the key ionospheric layers and to avoid the bias problem. It can also be easily applied to the data acquired from the FS-7/C-2 mission.

The study is organized as follows:

  • Section 2 is a description of the experimental data. Six classes of the EC height profiles are introduced for the modeling.
  • Section 3 describes the two Convolutional Neural Network (CNN, [16, 17]) models used in this study for the classification of the EC profiles.
  • Section 4 demonstrates the results of applying the CNN models for the classification of the EC profiles and the determination of the key ionospheric layers, such as the E (including sporadic), F1, F2 and F3 layers.
  • Section 5 discusses and compares the results obtained from the different CNN models.
  • Section 6 presents the conclusions.”

 

The scientific language of the paper is very weak. Also, there are several confusing sentences, such as "FS-7/C-2 has a low inclination that makes it possible detail scanning of the ionosphere at low latitudes using a TriGNSS Radio Occultation System (TGRS) that receives the refracted signals...". What do the authors mean here? Please rephrase it and remove such confusion from the whole manuscript.

We have rephrased the sentence (lines 61-64):

“The mission FS-7/C-2 has a low inclination. It is equipped with a TriGNSS Radio Occultation System (TGRS) that receives the signals from numerous GNSS satellites [13]. That makes it possible to scan the low-latitude ionosphere in detail and to provide up to 4000 RO profiles per day.”

In addition, we have tried to simplify other complex sentences in order to make their content clearer.

 

Similarly, some contents are misplaced; for instance, Section 2, Experimental Data, should be discussed in the Results section.

The models applied are strongly dependent on the content of the experimental data. Without basic information about the data, one cannot determine the requirements for the models, such as the dimensionality and the number of nodes in the input and output layers (see below). Section 2 is necessary to introduce this important information.

 

The main proposed section is not discussed well enough to reflect the working of the CNN. The authors can get thoughts and expressions from https://doi.org/10.1002/int.22537 and DOI: 10.1109/TII.2021.3116377 by their inclusion in the literature.

We revised Sections 1 and 3 substantially. Namely:

- we added Table 2, which lists the numbers of samples in the training and test data sets for all six classes (lines 150-154);

- we explain the configuration of the models and the training technique in more detail.

 

In Figure 2, what does 261 as input represent?

This is the number of input nodes in the EC height profile. We have revised the text accordingly:

Lines 97-99:

“For the modeling, the profiles are interpolated with an equidistant step of 2 km in the height range from 80 to 600 km. As a result, the interpolated EC profile consists of 261 points.”

Lines 164 – 168:

“For the classification of the EC height profiles, we use two different CNN models. The models are constructed using the Keras library under the TensorFlow framework [19]. The input layer of the models consists of 261 nodes, corresponding to the number of points in the EC height profiles interpolated from 80 to 600 km with a 2-km step. The output layer consists of 6 nodes, corresponding to the number of EC profile classes.”
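
The interpolation step described in these two passages can be sketched as follows; the per-profile data layout (arrays of heights and EC values) is an assumption.

```python
# Assumed sketch: resample one raw RO profile onto the fixed 261-point grid.
import numpy as np

grid = np.arange(80.0, 602.0, 2.0)       # heights 80, 82, ..., 600 km (261 points)
assert grid.size == 261

def to_input_vector(heights_km, ec_values):
    """Linearly interpolate one RO profile onto the fixed height grid."""
    order = np.argsort(heights_km)        # np.interp requires ascending abscissae
    return np.interp(grid, heights_km[order], ec_values[order])
```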

 

Several claims and statements have no supporting references. For instance, Sections 4 and 5 do not have a single reference.

We added references to these Sections:

[24] – to Section 4, Results (line 263).

[25–29] – to Section 5, Discussion.

 

Finally, there are several typos and grammatical mistakes. The authors need to work on the sentence structure.

We have corrected the typos and improved the sentence structure.

Author Response File: Author Response.doc

Round 2

Reviewer 2 Report

The authors have covered my points and questions fully to my satisfaction. Therefore, I believe the paper is now fine and should be published.

However, it would be great if, in the final submission, there were some extra editorial check to make the figures slightly better. While improvements were made on the scientific aspect, the figures could still use some refinement. In particular, Figures 1 and 5 remain very hard to read and could also use some alignment of their labels (a, b, c, ...), since at the moment they are not well made.

As I understand, Figures 1 and 5 are made in a way that makes increasing their font size difficult; however, it is important to make them legible. Consider changing their format from .gif to anything else and aligning them with any tool (Inkscape/PowerPoint/Photoshop, etc.).

The comment above does not affect my opinion on publication, as the changes and the responses of the authors were fine. It is simply a suggestion to make your work more impactful for future readers and easier to read.
