Article
Peer-Review Record

Multiplicative Long Short-Term Memory with Improved Mayfly Optimization for LULC Classification

Remote Sens. 2022, 14(19), 4837; https://doi.org/10.3390/rs14194837
by Andrzej Stateczny 1,*, Shanthi Mandekolu Bolugallu 2, Parameshachari Bidare Divakarachari 3, Kavithaa Ganesan 4 and Jamuna Rani Muthu 5
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 19 August 2022 / Revised: 23 September 2022 / Accepted: 23 September 2022 / Published: 28 September 2022
(This article belongs to the Special Issue New Advancements in Remote Sensing Image Processing)

Round 1

Reviewer 1 Report (New Reviewer)

In this paper, the authors propose a strategy to classify LULC types using the multiplicative Long Short-Term Memory network with the Improved Mayfly Optimization method. While the topic is interesting and such a method would be highly useful for LULC classification studies, the paper suffers from several style and content issues that need to be thoroughly revised. Below, I provide some major suggestions for improving the paper.

 

I strongly suggest rewriting the introduction and discussion sections, especially the discussion section, which is quite unclear. The proposed IMO with mLSTM should be described, along with an explanation of why the method can achieve better classification accuracy. The advantages and characteristics of this method compared with conventional methods need to be mentioned. Which land use types is this method best suited to? This should be clarified.

 

The structure is not a good fit for the paper. The first three sections need to be adjusted and merged into a new Introduction. The authors have to analyze the existing contributions to the related topic in depth, instead of listing relevant literature.

 

The Results section mixes in content that belongs in the Methods section. Equations (24), (25), and (26) should be in the Methods section. The Results and Discussion sections should be elaborated separately.

 

Conclusion section should focus on the experimental results and innovative findings, rather than a simple process introduction.

 

The quality of maps was quite low, and the figures and tables need to be rearranged to represent core results more clearly.

Author Response

Reviewer 1:

  1. I strongly suggest rewriting the introduction and discussion sections, especially the discussion section, which is quite unclear. The proposed IMO with mLSTM should be described, along with an explanation of why the method can achieve better classification accuracy. The advantages and characteristics of this method compared with conventional methods need to be mentioned. Which land use types is this method best suited to? This should be clarified.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have updated the introduction and discussion with proper descriptions in Sections 1 and 2.

  1. The structure is not a good fit for the paper. The first three sections need to be adjusted and merged into a new “Introduction”. The authors have to analyze the existing contributions to the related topic in depth, instead of listing relevant literature.

Answer:

Thank you for your important comments. We have stated the overall problem of this research, based on an in-depth analysis of the existing contributions, at the end of the literature section.

The overall literature shows that there is still a significant need for information about the environment and natural resources, yet many existing maps and digital databases were not created specifically to meet the needs of different users. One of the primary causes, though typically underappreciated, is the kind of classification or legend used to describe fundamental facts such as land cover and land use. Many of the current classifications either focus on a single project or take a sectoral approach, and they are generally not comparable with one another. Although numerous categorization systems are in use around the globe, no single one is universally recognized as the best way to categorize land use or land cover. To overcome this, this research created a new land cover classification system named IMO-mLSTM. The suggested methodology is comprehensive in that it can readily handle any identifiable land cover found anywhere in the world and is applicable at any scale. Additionally, the system can be used to evaluate the coherence of current categories, as described in the following sections.

The above information is updated at the literature section.

  1. The Results section mixes in content that belongs in the Methods section. Equations (24), (25), and (26) should be in the Methods section. The Results and Discussion sections should be elaborated separately.

Answer:

Thank you for your useful comments. We have moved Equations (24), (25), and (26) into Section 3.5.2. Additionally, we have presented the results and discussion as a separate section (Section 4).

  1. Conclusion section should focus on the experimental results and innovative findings, rather than a simple process introduction.

Answer:

Thank you for your valuable comments. We have updated the conclusion section with experimental results and innovative findings.

When compared with existing LULC classification techniques, the proposed method outperformed them in terms of recall, accuracy, and precision. The simulation results demonstrate that the suggested IMO-mLSTM approach attained a classification accuracy of 99.99% on Sat 4, 99.98% on Sat 6, and 98.52% on Eurosat datasets.

The above information is updated at conclusion section.

  1. The quality of maps was quite low, and the figures and tables need to be rearranged to represent core results more clearly.

Answer:

Thank you for your useful comments. We have updated the quality of maps (Figure 9, 12, 14). Additionally, we have updated the figures (5 to 16) and tables (2 to 10) at section 4.

Reviewer 2 Report (New Reviewer)

Dear Authors,

Thank you for submitting the manuscript to the Remote Sensing journal. The manuscript presents interesting work but needs some modifications.

1. Please modify the images having large fonts.

2. The comparative analysis conducted with other works seems contradictory. Please modify or remove it.

Thank you.

Author Response

Reviewer 2:

  1. Please modify the images having large fonts.

Answer:

Thank you for your valuable comments. We have updated the images (5 to 16) with uniform font sizes.

  1. The comparative analysis conducted with other works seems contradictory. Please modify or remove it.

Answer:

Thank you for your important comments. We have modified the comparative analysis (Table 9) at section 4.4.

Reviewer 3 Report (New Reviewer)

Reviewer’s Report on the manuscript entitled:

Multiplicative Long Short-Term Memory with Improved Mayfly Optimization for LULC Classification

The authors classified Sat 4, Sat 6, and Eurosat datasets using Long-Short Term Memory (LSTM) with Improved Mayfly optimization and achieved the overall accuracy of 99.99% on Sat 4, 99.98% on Sat 6 and 98.52 % on Eurosat datasets. In my view, the topic is interesting, however, the literature review is incomplete and the presentation and structure need improvement. Below, please see my comments:

 

Abstract. Which bands on Eurosat and Sat 4, and Sat 6 were used for the LULC classification? RGB? Please mention this.

 

Line 95. Please include this survey article:

https://doi.org/10.3390/rs12071135

Literature review.

Please include the following recent articles:

Naushad et al. [https://doi.org/10.3390/s21238083] showed that the ResNet-50 architecture achieved 99.17% overall accuracy for Eurosat datasets using RGB bands. This reference should also be mentioned in the Literature review and in Table 9. It seems that the deep transfer learning architecture in this paper outperformed your proposed method applied to Eurosat data: 99.17% vs 98.52%. This needs to be discussed as well.

Please also briefly describe the following article:

LULC classification using U-Net:

https://doi.org/10.3390/rs13183600

 

Section 3. Problem statement. This can be moved to the end of Introduction (Section 1) in Line 95 before mentioning the major contributions. Then, make sure to update lines 106-110.

 

Line 235. Please end the sentence and start a new sentence. Also, please comment on the advantages and disadvantages of using Equation (1) vs Equation (2).

 

Line 414. Needs reference.

Line 502. Please name it “Results and Discussion”

 

Figures 5, 6, 7, 8, 10, 11, 13, and 16 look too bulky. Please consider producing them with a higher quality. Also, the scenes in Figures 9, 12, and 14 can be enlarged relative to the figure size.

 

Sections 5.1, 5.2, 5.3. Please add the confusion matrices for each case.

 

The limitations of this study should be mentioned in the Conclusion section.

Thank you!

Regards,

Author Response

Reviewer 3:

Multiplicative Long Short-Term Memory with Improved Mayfly Optimization for LULC Classification

  1. Which bands on Eurosat and Sat 4, and Sat 6 were used for the LULC classification? RGB? Please mention this.

 Answer:

Thank you for your valuable comments. We have updated the abstract with bands detail.

Various spectral feature bands are involved, but unexpectedly little consideration has been given to these characteristics in deep learning models. Due to the wide availability of RGB models in computer vision, this research mainly uses RGB bands only.

The above statement is updated at abstract part.

  1. Line 95. Please include this survey article:

https://doi.org/10.3390/rs12071135

Answer:

Thank you for your important comments. We have included the article [14] at the introduction part.

Cited Reference:

[14] Talukdar, Swapan, Pankaj Singha, Susanta Mahato, Swades Pal, Yuei-An Liou, and Atiqur Rahman. "Land-use land-cover classification by machine learning classifiers for satellite observations—A review." Remote Sensing 12, no. 7 (2020): 1135.

Literature review. 

  1. Please include the following recent articles:

Naushad et al. [https://doi.org/10.3390/s21238083] showed that the ResNet-50 architecture achieved 99.17% overall accuracy for Eurosat datasets using RGB bands. This reference should also be mentioned in the Literature review and in Table 9. It seems that the deep transfer learning architecture in this paper outperformed your proposed method applied to Eurosat data: 99.17% vs 98.52%. This needs to be discussed as well. 

Answer:

Thank you for your useful comments.

By reaching 99.17% accuracy, ResNet-50 [24] outperformed the best results of the proposed method in terms of accuracy. Whereas the proposed IMO-mLSTM converted the RGB images into greyscale images, the existing ResNet-50 [24] used the RGB version of the EuroSAT dataset. As a result, the classification accuracy of the proposed method was lower than that of the existing ResNet-50 [24]. If the proposed method were trained on the original RGB data, it would be able to outperform ResNet-50 [24]. We hope that the proposed method will be a better option for LULC classification.

The above statement is updated at section 4.4.
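
The greyscale conversion mentioned in this response can be illustrated with a minimal sketch. It assumes the common ITU-R BT.601 luma weights; the record does not state which conversion the authors actually used, and the name `rgb_to_grey` is hypothetical.

```python
import numpy as np

# Assumed ITU-R BT.601 luma weights; the authors' actual conversion is not stated.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_grey(image):
    """Collapse an (H, W, 3) RGB array into an (H, W) greyscale array."""
    return image.astype(np.float64) @ LUMA_WEIGHTS

# A tiny 1x2 image: a dark red pixel and a black pixel.
patch = np.array([[[100, 0, 0], [0, 0, 0]]], dtype=np.uint8)
grey = rgb_to_grey(patch)
print(patch.shape, "->", grey.shape)
```

Three channel values per pixel become one, which is the loss of spectral information that the response offers as a possible reason for the accuracy gap.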

Cited Reference:

[24] Naushad, Raoof, Tarunpreet Kaur, and Ebrahim Ghaderpour. "Deep transfer learning for land use and land cover classification: A comparative study." Sensors 21, no. 23 (2021): 8083.

  1. Please also briefly describe the following article: 

LULC classification using U-Net: https://doi.org/10.3390/rs13183600

Answer:

Thank you for your valuable comments. We have briefly described the article [21] at section 2.

Cited Reference:

[21] Solórzano, Jonathan V., Jean François Mas, Yan Gao, and José Alberto Gallardo-Cruz. "Land use land cover classification with U-net: Advantages of combining sentinel-1 and sentinel-2 imagery." Remote Sensing 13, no. 18 (2021): 3600.

  1. Section 3. Problem statement. This can be moved to the end of Introduction (Section 1) in Line 95 before mentioning the major contributions. Then, make sure to update lines 106-110.

 Answer:

Thank you for your important comments. We have moved the problem statement (Section 3) and combined it with the introduction. We have also updated the description of the structure of this research work.

  1. Line 235. Please end the sentence and start a new sentence. Also, please comment on the advantages and disadvantages of using Equation (1) vs Equation (2).

 Answer:

Thank you for your useful comments. We have provided the advantages and disadvantages of using Equation (1) vs Equation (2) at section 3.2.

In z-score normalization, the data is normalized with respect to its mean (μ) and standard deviation (σ). For the z-score, a normal distribution is often assumed. If the data is unbalanced, the distribution to the left and right of the origin line is not equal.

This normalization method relies on the mean and standard deviation of the data, which might change over time. Such normalization methods are helpful for maintaining the relationships within the original input data. Normalization is the finest method for image enhancement since it improves image quality without losing image information [32].

The above information is updated at section 3.2.
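
As a minimal sketch of the trade-off discussed here, the snippet below contrasts z-score normalization with min-max scaling. It assumes that the alternative to the z-score in Equations (1) and (2) is min-max scaling, which this record does not confirm; the function names are hypothetical.

```python
import numpy as np

def z_score(image):
    """Centre pixel values on the mean and scale by the standard deviation."""
    pixels = image.astype(np.float64)
    std = pixels.std()
    if std == 0:  # flat image: avoid division by zero
        return np.zeros_like(pixels)
    return (pixels - pixels.mean()) / std

def min_max(image):
    """Assumed alternative: map the observed pixel range onto [0, 1]."""
    pixels = image.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:  # flat image: avoid division by zero
        return np.zeros_like(pixels)
    return (pixels - lo) / (hi - lo)

# One bright outlier (200) among otherwise dark pixels.
sample = np.array([[10, 20], [30, 200]], dtype=np.uint8)
z = z_score(sample)
m = min_max(sample)
print(z.round(3))
print(m.round(3))
```

The z-score output has zero mean and unit standard deviation but no fixed bounds, while the min-max output is bounded to [0, 1] but is compressed by the single outlier.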

  1. Line 414. Needs reference.

Answer:

Thank you for your important comments. We have provided the citations for the section 3.5.

Cited Reference:

 

[33] Alshari, Eman A., and Bharti W. Gawali. "Development of classification system for LULC using remote sensing and GIS." Global Transitions Proceedings 2, no. 1 (2021): 8-17.

  1. Line 502. Please name it “Results and Discussion”

 Answer:

Thank you for your valuable comments. We have provided the name “Results and Discussion” for section 4.

  1. Figures 5, 6, 7, 8, 10, 11, 13, and 16 look too bulky. Please consider producing them with a higher quality. Also, the scenes in Figures 9, 12, and 14 can be enlarged relative to the figure size.

 Answer:

Thank you for your useful comments. We have updated the quality of figures (5,6,7,8,9,10,11,12,13,14,16) at section 4.

  1. Sections 5.1, 5.2, 5.3. Please add the confusion matrices for each case.

 Answer:

Thank you for your valuable comments. We have provided the confusion matrices for each case (figure 10, 14 and 17) at section 4.

 

  1. The limitations of this study should be mentioned in the Conclusion section.

Answer:

Thank you for your important comments. We have included the limitations of this study in the conclusion.

However, the proposed IMO with mLSTM takes a longer time and requires more memory to train on the data. Therefore, future research will integrate a hybrid optimization-based strategy into the developed framework to further enhance LULC classification with a shorter training time.

The above statement is updated at the end of conclusion.

Round 2

Reviewer 1 Report (New Reviewer)

The authors have correctly addressed the comments. I provide some minor suggestions:

Please check the correctness of Equations (24), (25), and (26). 100 or 100%?

The quality of Figures 10, 14, 17 can be improved. You may change black to other colors for better contrast.

Author Response

Reviewer 1:

The authors have correctly addressed the comments. I provide some minor suggestions:

  1. Please check the correctness of Equations (24), (25), and (26). 100 or 100%?

Answer:

Thank you for your valuable comments. We have updated the equations (24), (25) and (26) at section 3.5.2.
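
The "100 or 100%" question concerns how a ratio is reported as a percentage. The sketch below assumes that Equations (24), (25), and (26) define accuracy, precision, and recall, which the record suggests but does not state; the confusion matrix values are invented for illustration.

```python
import numpy as np

def percent_metrics(confusion):
    """Return overall accuracy and per-class precision and recall, in percent.

    Rows are true classes and columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=np.float64)
    true_pos = np.diag(confusion)
    predicted = confusion.sum(axis=0)  # column totals
    actual = confusion.sum(axis=1)     # row totals
    precision = np.divide(true_pos, predicted,
                          out=np.zeros_like(true_pos), where=predicted > 0)
    recall = np.divide(true_pos, actual,
                       out=np.zeros_like(true_pos), where=actual > 0)
    accuracy = true_pos.sum() / confusion.sum()
    # Multiplying the ratio by 100 gives a value expressed in percent.
    return accuracy * 100, precision * 100, recall * 100

# Hypothetical two-class confusion matrix, not taken from the paper.
confusion = [[8, 2], [1, 9]]
accuracy, precision, recall = percent_metrics(confusion)
print(accuracy, precision.round(2), recall.round(2))
```

Under this reading, the factor in the equations is the plain number 100 and the percent sign belongs to the reported unit.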

  1. The quality of Figures 10, 14, 17 can be improved. You may change black to other colors for better contrast.

Answer:

Thank you for your valuable comments. We have updated the quality of Figures (10, 14, 17) at section 4.1.

Reviewer 3 Report (New Reviewer)

I would like to thank the authors for addressing my comments. I have some minor suggestions:

Line 134. Instead of listing all the authors simply say "in [24]" or "Naushad et al. [24]"

Line 478. Please only capitalize the first letter "Multiplicative LSTM"

The quality of Figures 9, 12, 16 can be improved. You may enlarge each panel and remove some of the blank spaces between the panels.

Lines 137-145. Please use your own words when you describe this to avoid possible plagiarism. 

Thank you

Author Response

Reviewer 3:

I would like to thank the authors for addressing my comments. I have some minor suggestions:

  1. Line 134. Instead of listing all the authors simply say "in [24]" or "Naushad et al. [24]"

Answer:

Thank you for your valuable comments. We have updated the Line 134 as “in [24]” instead of "Naushad et al. [24]".

  1. Line 478. Please only capitalize the first letter "Multiplicative LSTM"

Answer:

Thank you for your valuable comments. We have updated the Line 477 (Instead of 478) as "Multiplicative LSTM" at section 3.5.2.

  1. The quality of Figures 9, 12, 16 can be improved. You may enlarge each panel and remove some of the blank spaces between the panels.

Answer:

Thank you for your valuable comments. We have improved the quality of Figures 9, 12, and 16 in Section 4.1 by enlarging each panel and removing some of the blank space between the panels.

  1. Lines 137-145. Please use your own words when you describe this to avoid possible plagiarism. 

Answer:

Thank you for your valuable comments. We have reduced the plagiarism by rewriting the sentences (Lines 135-144) in our own words in Section 2.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The methods are not adequately described. The research design is not appropriate.

There is a semantic problem in the data processing, methods and results sections. There are many unanswered questions, such as why this satellite data was chosen, and how to analyze the number of different classes. There is a lot of misunderstanding even in the abstract part. For example, why are both tables and graphs used in the article even though they show the same results in the results section? Even this point is meaningless.

 

Is the abbreviation (LSTM) in the title necessary and important? Not everyone knows the LSTM acronym, whereas most readers are familiar with the abbreviation LULC. On the other hand, how logical is the expression “in Remote Sensing Image” for this study, given that there are hundreds of types of satellite imagery?

 

Page 1 line 22

“After selecting the images, hybrid feature extraction is performed using haralick texture features, oriented gradient histogram, and a local Gabor binary pattern histogram sequence to extract features from the images.”

 

In the sentence above, what does “After selecting the images” mean? This point should be well explained. There are many topics and methods such as band selection or image selection in satellite data.

 

Page 1 line 25

“The excellent discriminative strength and partially invariant to colour and grayscale images are two significant advantages of hybrid feature extraction.”

The place of the above statement should not be the abstract part. The use of this sentence harms the semantic integrity of the paragraph.

 

Page 1 line 26

“the Improved Mayfly Optimization  (IMO) method is used to choose the optimal features.”

Was the IMO method developed and used in this study? Or does the study use an IMO method developed by someone else?

 

Page 1 line 27

“The suggested feature selection algorithms have several advantages, including a high learning rate and computational efficiency.”

In the sentence above, what is meant as “the suggested feature selection algorithms”? In the previous sentence, only the IMO method was mentioned. No proposed algorithm is mentioned. Therefore, there are problems in terms of semantic integrity in the part of the abstract section up to this point.

 

 

Page 1 line 28

“After getting the optimal feature settings, the LULC classes are classified using a multi-class classifier known as the Multiplicative Long Short-Term Memory (mLSTM) network.”

 

In the sentence above, the phrase “known as the Multiplicative Long Short-Term Memory (mLSTM) network” should be written more clearly. Is the mLSTM method a known method in the literature or is it the name of the method developed in this study? which option?

 

Page 1 Line 32

“When compared to conventional methods, the suggested mLSTM method improved the accuracy by a minimum of 0.003 % and a maximum of 2.70 % on Sat 4, Sat 6, and Eurosat datasets.”

In the sentence above, the names of the methods referred to as "conventional methods" should be given in the same sentence, because there are dozens, if not hundreds, of methods in the literature.

Author Response

Reviewer 1:

 

The methods are not adequately described. The research design is not appropriate. There is a semantic problem in the data processing, methods and results sections. There are many unanswered questions, such as why this satellite data was chosen, and how to analyze the number of different classes. There is a lot of misunderstanding even in the abstract part. For example, why are both tables and graphs used in the article even though they show the same results in the results section? Even this point is meaningless.

 

  1. Is the abbreviation (LSTM) in the title necessary and important? Not everyone knows the LSTM acronym. But most readers are familiar with the abbreviation LULC. On the other hand, how logical is the expression “in Remote Sensing Image” for this study? Because there are hundreds of types of satellite imagery.

Answer:

Thank you for your constructive comments. As per the reviewer’s comment, we have modified the title as “Multiplicative Long Short-Term Memory with Improved Mayfly Optimization for LULC Classification”.

  1. Page 1 line 22 “After selecting the images, hybrid feature extraction is performed using haralick texture features, oriented gradient histogram, and a local Gabor binary pattern histogram sequence to extract features from the images.” In the sentence above, what does “After selecting the images” mean? This point should be well explained. There are many topics and methods such as band selection or image selection in satellite data.

 Answer:

Thank you for your valuable comments.

The phrase “After selecting the images” means that the satellite dataset images are first pre-processed. The pre-processed outputs are then passed to the feature extraction stage, as explained in the abstract.

  1. Page 1 line 25 “The excellent discriminative strength and partially invariant to colour and grayscale images are two significant advantages of hybrid feature extraction.” The place of the above statement should not be the abstract part. The use of this sentence harms the semantic integrity of the paragraph.

Answer:

Thank you for your important comments. We have placed the sentence “The excellent discriminative strength and partially invariant to colour and grayscale images are two significant advantages of hybrid feature extraction” at section 4.3.

  1. Page 1 line 26 “the Improved Mayfly Optimization  (IMO) method is used to choose the optimal features.” Was the IMO method developed and used in this study? Rather use an IMO method developed by someone else?

Answer:

Thank you for your useful comments. The Improved Mayfly Optimization (IMO) method is not an existing method. Unlike the existing method, Improved Mayfly Optimization is updated with a weighting factor and integrated with mLSTM in this research.

  1. Page 1 line 27 “The suggested feature selection algorithms have several advantages, including a high learning rate and computational efficiency.” In the sentence above, what is meant as “the suggested feature selection algorithms”? In the previous sentence, only the IMO method was mentioned. No proposed algorithm is mentioned. Therefore, there are problems in terms of semantic integrity in the part of the abstract section up to this point.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have unified the feature selection name as IMO based feature selection algorithms in the entire paper.

  1. Page 1 line 28 “After getting the optimal feature settings, the LULC classes are classified using a multi-class classifier known as the Multiplicative Long Short-Term Memory (mLSTM) network.” In the sentence above, the phrase “known as the Multiplicative Long Short-Term Memory (mLSTM) network” should be written more clearly. Is the mLSTM method a known method in the literature or is it the name of the method developed in this study? which option?

Answer:

Thank you for your important comments. Explanation is given in Page 1 line 28 at abstract part.

The proposed method is named Multiplicative Long Short-Term Memory (mLSTM), and it is not an existing method. Unlike existing methods, it integrates Improved Mayfly Optimization with mLSTM.

  1. Page 1 Line 32 “When compared to conventional methods, the suggested mLSTM method improved the accuracy by a minimum of 0.003 % and a maximum of 2.70 % on Sat 4, Sat 6, and Eurosat datasets.” In the sentence above, the names of the methods referred to as "conventional methods" should be given in the same sentence, because there are dozens, if not hundreds, of methods in the literature.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have modified the name of the method in Line 32 as “Deep Belief Network” at the abstract part.

Reviewer 2 Report

Please see the attached file.

Comments for author File: Comments.pdf

Author Response

Reviewer 2:

 

This manuscript proposes a multiplicative LSTM model improved by Mayfly optimization for LULC classification. The idea is interesting and the results are basically satisfactory. However, some problems remain in the manuscript, as follows:

 

  1. In the experiments, could the authors compare the proposed method with more state-of-the-art methods to validate its effectiveness more extensively?

 

Answer:

Thank you for your important comments. Yes, we have compared the proposed method with more state-of-the-art methods [15], [16], [28] to validate its effectiveness. Furthermore, we have included the existing method [29] in Table 9 in Section 5.4.

 

Cited Reference:

 

[15] S. Singh, A. Bhardwaj, and V.K. Verma, “Remote sensing and GIS based analysis of temporal land use/land cover and water quality changes in Harike wetland ecosystem, Punjab, India”, Journal of Environmental Management, vol.262, pp.110355, 2020.

 

[16] S. Nayak, and M. Mandal, “Impact of land use and land cover changes on temperature trends over India”, Land Use Policy, vol.89, pp.104238, 2019.

 

[28] He, Tongdi, and Shengxin Wang. "Multi-spectral remote sensing land-cover classification based on deep learning methods." The Journal of Supercomputing 77, no. 3 (2021): 2829-2843.

 

[29] Papadomanolaki, M.; Vakalopoulou, M.; Zagoruyko, S.; Karantzalos, K. Benchmarking deep learning frameworks for the classification of very high resolution satellite multispectral data, ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences, 2019, 3.

 

 

  1. The visual results of land use and land cover classification should be shown in the experiments. 

 

Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have included the visual results of Sat4 (figure 9), visual results of Sat6 (figure 12) and visual results of Eurosat (figure 14) at section 5.1, 5.2 and 5.3 respectively.

 

  1. For image normalization, did the authors use moment matching in “Block adjustment-based radiometric normalization by considering global and local differences”? Please show the reference. 

 

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have included the references [31] and [32] for image normalization at section 4.2.

 

  • After collecting the satellite images, normalization and histogram equalization methods are applied to improve the quality of the images. Image normalization, also called contrast stretching, changes the range of pixel values, which helps to improve the visual quality of the collected satellite images.
  • Additionally, the histogram equalization technique adjusts the contrast of the images by using the histogram values. In image enhancement, histogram equalization is the best technique, which delivers better image quality without losing the image information.
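
The two enhancement steps described in these bullets can be sketched in plain NumPy. This is an illustrative version for 8-bit greyscale images under the textbook definitions, not the authors' implementation; the function names are hypothetical.

```python
import numpy as np

def contrast_stretch(image):
    """Linearly map the observed pixel range onto 0..255."""
    pixels = image.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.zeros_like(image, dtype=np.uint8)
    return np.round((pixels - lo) / (hi - lo) * 255).astype(np.uint8)

def equalize_histogram(image):
    """Remap grey levels of a uint8 image through its cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest level present
    total = image.size
    if total == cdf_min:  # only one grey level present
        return np.zeros_like(image)
    lookup = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lookup = np.clip(lookup, 0, 255).astype(np.uint8)
    return lookup[image]

# A low-contrast patch whose values occupy only 100..130.
dull = np.array([[100, 110], [120, 130]], dtype=np.uint8)
stretched = contrast_stretch(dull)
equalized = equalize_histogram(dull)
print(stretched)
print(equalized)
```

On this evenly spaced patch both functions give the same result; they differ on skewed histograms, where equalization redistributes grey levels by frequency rather than by value.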

 

Cited References:

 

[31] Singh, Dalwinder, and Birmohan Singh. "Investigating the impact of data normalization on classification performance." Applied Soft Computing 97 (2020): 105524.

 

[32] Xiong, Jianbin, Dezheng Yu, Qi Wang, Lei Shu, Jian Cen, Qiong Liang, Huanyang Chen, and Baocheng Sun. "Application of histogram equalization for image enhancement in corrosion areas." Shock and Vibration 2021 (2021).

 

 

  1. More details on feature extraction are suggested. 

 

Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have described feature extraction in detail in Section 4.3.

 

  1. Some references are missing fundamental information, such as volume, number and page. Please check them very carefully. 

 

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have verified and updated the references with volume, number and page.

Reviewer 3 Report

In fact, the paper does not discuss the choices that were made.

It is more of a "research" design than a research paper.

Most of the explanations of the proposed methodology are just "copy/paste" of the underlying algorithms without clearly explaining the inputs/outputs and consequences of the underlying algorithm.

 

L.151-152 is totally wrong, just look for any article mentioning "positional coding".

Moreover, any convolution model takes into account a small part of the local spatial information.

 

L.169 Categorization of pixels? The term "Semantic Segmentation" is more generic.

 

F.1 The use of "old fashioned" feature extraction seems questionable.

Why not use a small CNN to extract these optimized local features?

That way you don't need to select the "best" features since they are already optimized for your case.

 

L.230 The proposed Mayfly optimization algorithm is never discussed.

Why use this algorithm and not a brute force algorithm?

Why not use a more standard algorithm like the Sequential Feature Selector?

Why not use an algorithm that combines feature extraction and feature selection?

like Contextual Bandit with Adaptive Feature Extraction?

 

L.304 There is no discussion of extracted features.

One of the goals of the feature extractor is to drastically reduce the amount of features for performance (in time and accuracy).

Something like 128 or less but here more than 3k?

 

L.327 The input to the LSTM needs to be clarified! The LSTM takes a sequence as input.

In this paragraph, I understand that the sequence is composed of different resolutions.

Where does the time information @t come from?

 

L.369 image classification ? at the beginning it is a pixel classification (so a semantic segmentation).

Indeed, the dataset is intended for image classification, but the article speaks almost everywhere of pixels.

A clarification should be made.

 

L.373 it seems strange to learn a network on a CPU ... it is probably very slow ...

Author Response

Reviewer 3:

In fact, the paper does not discuss the choices that were made. It is more of a "research" design than a research paper. Most of the explanations of the proposed methodology are just "copy/paste" of the underlying algorithms without clearly explaining the inputs/outputs and consequences of the underlying algorithm.

L.151-152 is totally wrong, just look for any article mentioning "positional coding". Moreover, any convolution model takes into account a small part of the local spatial information.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have modified lines 151-152 in Section 4.

L.169 Categorization of pixels? The term "Semantic Segmentation" is more generic.

Answer:

Thank you for your important comments. As per the reviewer’s comment, we now use the same term, image classification, throughout the paper.

F.1 The use of "old fashioned" feature extraction seems questionable. Why not use a small CNN to extract these optimized local features? That way you don't need to select the "best" features since they are already optimized for your case.

 Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have included the recent feature extraction methods (Harris Corner Detection) at section 4.3.

L.230 The proposed improved Mayfly optimization algorithm is never discussed. Why use this algorithm and not a brute force algorithm? Why not use a more standard algorithm like the Sequential Feature Selector? Why not use an algorithm that combines feature extraction and feature selection, like Contextual Bandit with Adaptive Feature Extraction?

 Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have explained the advantages of the proposed Improved Mayfly Optimization algorithm over the brute-force algorithm, the Sequential Feature Selector, and Contextual Bandit with Adaptive Feature Extraction in Section 4.4.

Traditionally, a brute-force algorithm attempts to establish a systematic relation among the presented candidates, and outlier data instances have little relation to other features, so this technique is not suitable for large datasets or datasets with outliers. The Sequential Feature Selection technique selects features using network cross-validation, and the selection varies for every data instance, so it does not provide stable performance. Contextual Bandit with Adaptive Feature Extraction is a cluster-based model that is usually applied to unlabeled data or unsupervised classification. In supervised learning, applying Contextual Bandit with Adaptive Feature Extraction tends to cause loss of information in network learning. This research is a supervised classification based on labelled data, so the clustering technique of Contextual Bandit with Adaptive Feature Extraction is not required for this model. We have therefore introduced an optimization algorithm called Improved Mayfly Optimization, which gives better performance. The improved mayfly makes more use of group information to decide its own behavior, which ensures the diversity of the group and thereby promotes the balance between the exploration and exploitation stages and the search efficiency of the algorithm.

The same information is updated at section 4.4

L.304 There is no discussion of extracted features. One of the goals of the feature extractor is to drastically reduce the amount of features for performance (in time and accuracy). Something like 128 or less but here more than 3k?

Answer:

Thank you for your constructive comments. As per the reviewer’s comment, we have briefly discussed the extracted features. We have also updated the extracted and selected feature values of the proposed method in Table 1 in Section 4.4.2.

L.327 The input to the LSTM needs to be clarified! The LSTM takes a sequence as input. In this paragraph, I understand that the sequence is composed of different resolutions. Where does the time information @t come from?

 Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have updated the information about ‘t’ at section 4.5.2.

L.369 image classification? at the beginning it is a pixel classification (so a semantic segmentation). Indeed, the dataset is intended for image classification, but the article speaks almost everywhere of pixels. A clarification should be made.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we now use the same term, image classification, throughout the paper.

L.373 it seems strange to learn a network on a CPU ... it is probably very slow ...

Answer:

Thank you for your constructive comments. As per the reviewer’s comment, we have updated the system environment to include a Graphics Processing Unit (GPU), which is described at the beginning of Section 5.

Reviewer 4 Report

The authors proposed a novel approach for LULC classification using Multiplicative Long Short-Term Memory and Mayfly Optimization. The approach is original and interesting.

However I have remarks to improve the final version. 

The introduction is short and insufficient, it deserves to be more extended.

The relationship between the problem statement and the literature review is unclear. The authors should clarify the shortcomings of the related work that led to these statements.

The proposed approach is clear and well presented and argued. I just have a reservation about the quality of the images that can be improved.

Author Response

Reviewer 4:

The authors proposed a novel approach for LULC classification using Multiplicative Long Short-Term Memory and Mayfly Optimization. The approach is original and interesting.

However I have remarks to improve the final version. 

  1. The introduction is short and insufficient, it deserves to be more extended.

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have extended the introduction part with valuable points.

  1. The relationship between the problem statement and the literature review is unclear. The authors should clarify the shortcomings of the related work that led to these statements.

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have clarified the shortcomings of the related work and problem statement at section 3.

We have derived the problem statement (section 3) from the literature review (section 2), which are clarified as follows,

  • This data was only suitable for minimum class classification not for maximum class classification and showed poor performance in various environmental conditions - This point is derived from reference [22]

 

  • Given satellite image datasets are transmitted through the atmosphere, therefore the risk of data loss is higher. Also, failed to achieve better land use classification in the large datasets due to “curse of dimensionality” issue - This point is derived from reference [25].

 

  • In classification, the images corresponding to distinct classes have much more discriminating behaviors, so it affects the precision, accuracy and recall – This point is derived from reference [23]
  1. The proposed approach is clear and well-presented and argued. I just have a reservation about the quality of the images that can be improved.

Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have improved the quality of the images (figure 1 at Section 4) and (figure 4 at section 4.5.2) respectively.

Round 2

Reviewer 1 Report

The authors clarified many of the questions raised in my previous review. The manuscript can be accepted as a scientific study in its current form.

Author Response

Reviewer 1:

The authors clarified many of the questions raised in my previous review. The manuscript can be accepted as a scientific study in its current form.

Answer:

Thank you for your valuable time and effort.

Reviewer 2 Report

I have no other concerns.

Author Response

Reviewer 2:

I have no other concerns.

Answer:

Thank you for your valuable time and effort.

Reviewer 3 Report

The research design is still not appropriate. There are still semantic problems in all sections, with wrong sentences.
The previous review was not correctly taken into account. The method is incorrectly described.
In addition, there is control missing in the experiment, and the method may be incorrectly evaluated.
For now it's black magic.

L84 - Reliability analysis was not previously given top priority in studies on image classification ?
> You mean that other research doesn't care about the performance and transferability of their models ? please ...

L87 - There are two fundamental issues with the existing remote sensing images: managing the vast volume of data and image noise
> The volume of information is not necessarily a problem; it depends on the computational complexity, capability and storage.
With the emergence of deep learning and the use of GPGPU this is not really a problem today, nor is image noise.
You should rather talk about image variance, which is not the same thing.

L126 - However, other metrics such as root mean square and recall were not calculated using this technique.
> What is the point here ? Are these metrics useful ? especially the RMSE ?

L139 - However, satellite data are frequently inaccurate or limited due to restrictions on data transfer equipment
> This does not mean anything

L145 -  These outcomes are better, but they are hard to generalize because they rely on huge number of diverse data sets.
> This sentence is wrong, because if they are based on very numerous and diversified data sets,
then the underlying model is necessarily more generalized than a model learned on data from a single database

L163 - even though, the satellite image was affected by small amount of sensor bands
> bad english ?

L165-175 - first point incomprehensible ; second point is wrongly stated : data loss ? curse of dimensionality ? third point : behavior ?
> These points should be rewritten

L177 - The focus of this research is to examine the impact of spatial information which has not been
employed in image classification to give additional info for remote sensing applications
and enhance the overall reliability of satellite image-based LC mapping.
> it's still wrong as previously addressed

L181 - Landscape measurements with and without categorization significantly enhances the classification performance [...] metrics
> This sentence should be clarified: what is the point about spectral values and segmentation ? You are writing an article about image classification,
not about segmentation or spectral decomposition ... you get lost in the explanations

L185 - Image classification is the process of assigning images to pre-defined classification
> [...] to pre-defined classes ...

L186 - Images are categorized in classification techniques
> Images are categorized USING classification techniques ... but the paragraph is useless

L264-237 and E2
> What is the point of using a sigmoid-like function as a normalization process ? Where does this "technique" come from ?
Do you have any reference that uses such a procedure ? Is that a proposal ?
In equation 2, if I is in [0-255] for RGB or in [0-16384] like most multi-spectral images, then this equation will clip values to {0,1}.
I think this is a wrong usage or wrongly explained; this should be clarified
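
The saturation concern above can be illustrated with a minimal sketch. It assumes a plain logistic sigmoid as a stand-in for Equation 2, which is not reproduced in this record, and compares it with min-max scaling on raw 8-bit values; the function names and sample values are illustrative only. With this stand-in, zero maps to 0.5 and every value above a few dozen maps to exactly 1.0 in double precision, so most of the intensity range collapses to a single number.

```python
import math


def sigmoid(x: float) -> float:
    # Plain logistic function; an assumed stand-in for the paper's Equation 2.
    return 1.0 / (1.0 + math.exp(-x))


def min_max(x: float, lo: float, hi: float) -> float:
    # Linear rescaling of x from [lo, hi] to [0, 1].
    return (x - lo) / (hi - lo)


# Hypothetical raw 8-bit intensities.
pixels = [0, 1, 8, 64, 128, 255]

sig = [sigmoid(p) for p in pixels]
mm = [min_max(p, 0, 255) for p in pixels]

for p, s, m in zip(pixels, sig, mm):
    print(f"raw={p:3d}  sigmoid={s:.6f}  min-max={m:.6f}")
```

Min-max scaling keeps the six inputs distinct, whereas the sigmoid applied to unscaled intensities does not; whether the same happens with the authors' actual formula depends on how the input is centred and scaled before the sigmoid is applied.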

# 4.3 Feature extraction

L238-285 > All feature extractors are explained without telling anything about the input data. Is each of them computed on each spectral band ?
Are they all computed over a sum of spectral bands used like a gray scale one ... the method is incorrectly described
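
A minimal sketch of why the band question matters, assuming the grayscale value is an unweighted mean over bands (an assumption; the manuscript's conversion is not specified in this record) and using hypothetical 4-band (R, G, B, NIR) pixel values:

```python
from typing import Sequence


def to_gray(pixel: Sequence[float]) -> float:
    # Assumed conversion: unweighted mean over all bands.
    return sum(pixel) / len(pixel)


# Hypothetical pixels with very different spectral signatures.
high_nir_pixel = (40, 90, 35, 200)   # strong near-infrared response
flat_pixel = (110, 95, 80, 80)       # roughly flat response

print(to_gray(high_nir_pixel), to_gray(flat_pixel))
```

Both pixels collapse to the same gray level, so any descriptor computed on the gray image cannot tell them apart, while a per-band computation could.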

# 4.3 Feature selection

L288-290 - Traditionally, Brute force algorithm attempts to provide systematic [...] other features, so this technique is not suitable for large dataset and dataset with outliers.
> This relies on the metric used to evaluate the fit of the selected features, not on the underlying feature selection algorithm.
Thus this sentence is wrong ! And a brute force algorithm can be suitable, the only problem is the computation time.
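
The computation-time point can be made concrete with a short sketch: an exhaustive search has to score every non-empty subset of the candidate features, and that count grows as 2^n - 1. The helper names below are illustrative, and the value 55 is used only because it is the feature count quoted for Table 1.

```python
from itertools import combinations
from typing import Iterator, Tuple


def enumerate_subsets(n_features: int) -> Iterator[Tuple[int, ...]]:
    # Yield every non-empty subset of feature indices 0..n_features-1.
    features = range(n_features)
    for k in range(1, n_features + 1):
        yield from combinations(features, k)


def subset_count(n_features: int) -> int:
    # Closed form for the number of non-empty subsets.
    return 2 ** n_features - 1


small = list(enumerate_subsets(4))
print(len(small), subset_count(4))   # enumeration agrees with the closed form
print(subset_count(55))              # subsets to score for 55 features
```

The search itself is well defined for any subset-quality metric; only the number of evaluations makes it impractical beyond a few dozen features.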

L291-295 > The SFS (Sequential Feature Selection) does not rely on "network cross-validation", please read more about it. Same for the other algorithm.
The author does not propose a good discussion about the choice of the "Improved Mayfly Optimization".
The rest of this section about feature selection remains unclear, as previously addressed.
Table 1 is not clear. For example, the cell "9000 x 55": does this mean there are 9000 images with 55 features ?
Or is it a feature matrix of size 9000x55 per image in the Eurosat ?

T1 - it clearly shows that the proposed IMO based feature selection drastically reduces the amount of extracted features from the collected dataset
> No, it does not clearly show that; a reduction may occur (since I still don't know what this table really means),
but no metric is proposed to SHOW that good ones are selected

# 4.5

The classification procedure and the LSTM input/output are still incorrectly described.
There is not sufficient control of over-fitting, which may be the issue with the reported performance

Author Response

Reviewer 3:

The research design is still not appropriate. There are still semantic problems in all sections, with wrong sentences. The previous review was not correctly taken into account. The method is incorrectly described. In addition, there is control missing in the experiment, and the method may be incorrectly evaluated. For now it's black magic.


  1. L84 - Reliability analysis was not previously given top priority in studies on image classification ? > You mean that other research doesn't care about the performance and transferability of their models ? please ...

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 87-90 in the introduction.

L87 - There are two fundamental issues with the existing remote sensing images: managing the vast volume of data and image noise > The volume of information is not necessarily a problem; it depends on the computational complexity, capability and storage. With the emergence of deep learning and the use of GPGPU this is not really a problem today, nor is image noise. You should rather talk about image variance, which is not the same thing.

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 92-97 in the introduction.

Cited Reference:

[21] Cheng, Gong, Xingxing Xie, Junwei Han, Lei Guo, and Gui-Song Xia. "Remote sensing image scene classification meets deep learning: Challenges, methods, benchmarks, and opportunities." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 13 (2020): 3735-3756.

L126 - However, other metrics such as root mean square and recall were not calculated using this technique. > What is the point here ? Are these metrics useful ? especially the RMSE ?

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 136-137 in Section 2.

L139 - However, satellite data are frequently inaccurate or limited due to restrictions on data transfer equipment > This does not mean anything

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 150-152 in Section 2.

L145 -  These outcomes are better, but they are hard to generalize because they rely on huge number of diverse data sets. > This sentence is wrong, because if they are based on very numerous and diversified data sets, then the underlying model is necessarily more generalized than a model learned on data from a single database

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 157-160 in Section 2.

L163 - even though, the satellite image was affected by small amount of sensor bands > bad English ?

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated lines 175-181 in Section 2.

L165-175 - first point incomprehensible ; second point is wrongly stated : data loss ? curse of dimensionality ? third point : behavior ? > These points should be rewritten

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have updated the problem statements in lines 182-192 of Section 3 as follows:

  1. Existing deep learning techniques for LULC classification suffer from irrelevant feature selection. Irrelevant features selected in the model bias it towards misclassification.
  2. Some of the visual features of images from different LULC classes are highly similar. Some classes share similar features, which tends to cause misclassification.
  3. Over-fitting occurs in deep learning models due to irrelevant feature selection and excessive training of the model. Optimal and adaptive feature selection helps to greatly reduce over-fitting in classification.

L177 - The focus of this research is to examine the impact of spatial information which has not been employed in image classification to give additional info for remote sensing applications
and enhance the overall reliability of satellite image-based LC mapping. > it's still wrong as previously addressed

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have eliminated the unnecessary information and updated the paragraph in lines 185-195 at the beginning of Section 4.

L181 - Landscape measurements with and without categorization significantly enhances the classification performance [...] metrics > This sentence should be clarified: what is the point about spectral values and segmentation? You are writing an article about image classification,
not about segmentation or spectral decomposition ... you get lost in the explanations

Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have eliminated the unnecessary information and updated the paragraph in lines 185-195 at the beginning of Section 4.

L185 - Image classification is the process of assigning images to pre-defined classification > [...] to pre-defined classes ...

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have eliminated the unnecessary information and updated the paragraph in lines 185-195 at the beginning of Section 4.

L186 - Images are categorized in classification techniques > Images are categorized USING classification techniques ... but the paragraph is useless 

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have eliminated the unnecessary information and updated the paragraph in lines 185-195 at the beginning of Section 4.

L264-237 and E2 > What is the point of using a sigmoid-like function as a normalization process ? Where does this "technique" come from ? Do you have any reference that uses such a procedure ? Is that a proposal ? In equation 2, if I is in [0-255] for RGB or in [0-16384] like most multi-spectral images, then this equation will clip values to {0,1}. I think this is a wrong usage or wrongly explained; this should be clarified

Answer:

Thank you for your useful comments. As per the reviewer’s comment, we have described the normalization process, with reference [31], in lines 218-235 of Section 4.2.

Cited reference:

[31] Singh, D., & Singh, B. (December 2020). Investigating the impact of data normalization on classification performance. Applied Soft Computing, 97, 105524. https://doi.org/10.1016/j.asoc.2019.105524

# 4.3 Feature extraction


L238-285 > All feature extractors are explained without telling anything about the input data. Is each of them computed on each spectral band ? Are they all computed over a sum of spectral bands used like a gray scale one ... the method is incorrectly described?

Answer:

Thank you for your important comments.

We have not computed the features on each spectral band, because we used airborne images, which make it easier to gather data. These airborne images are converted to grayscale for further processing.

As per the reviewer’s comment, we have described the feature extraction in lines 245-257 of Section 4.3.

# 4.3 Feature selection


L288-290 - Traditionally, Brute force algorithm attempts to provide systematic [...] other features, so this technique is not suitable for large dataset and dataset with outliers.
> This relies on the metric used to evaluate the fit of the selected features, not on the underlying feature selection algorithm. Thus this sentence is wrong ! And a brute force algorithm can be suitable, the only problem is the computation time.

Answer:

Thank you for your useful comments. I agree with your valuable points. Computation time, and the reason for it, is discussed in Section 4.4.

As per the reviewer’s comment, we have described the brute-force algorithm in lines 302-305 of Section 4.4.

L291-295 > The SFS (Sequential Feature Selection) does not rely on "network cross-validation", please read more about it. Same for the other algorithm. The author does not propose a good discussion about the choice of the "Improved Mayfly Optimization". The rest of this section about feature selection remains unclear, as previously addressed. Table 1 is not clear. For example, the cell "9000 x 55": does this mean there are 9000 images with 55 features? Or is it a feature matrix of size 9000x55 per image in the Eurosat?

Answer:

Thank you for your useful comments. The cell “9000 x 55” indicates 9000 images with 55 features. As per the reviewer’s comment, we have described the IMO-based feature selection (Table 1) in lines 404-409 of Section 4.4.2.

The SFS explanation is given in Section 4.4, lines 312-314, as

“During sequential forward selection (SFS), features are sequentially added to an empty candidate set until further features do not decrease the criterion.”

The choice of the “Improved Mayfly Optimization” is explained in lines 319-323 of Section 4.4 as

“These existing feature selection methods do not adaptively select the features and tend to select many irrelevant features for classification. So, an optimization algorithm called Improved Mayfly Optimization is introduced in this research that learns the features adaptively and selects the relevant features. Similarly, IMO is less complex compared to SFS, Brute force algorithm and Bandit with Adaptive Feature Extraction”
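
The quoted definition can be sketched in a few lines. This is a minimal illustration of generic greedy forward selection, not the authors' implementation; the criterion function, the set of "useful" features and all names are hypothetical, and lower criterion values are assumed to be better.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    criterion: Callable[[Sequence[int]], float],
) -> List[int]:
    """Greedy SFS: add one feature at a time while the criterion decreases."""
    selected: List[int] = []
    remaining = list(range(n_features))
    best_score = float("inf")
    while remaining:
        # Score each candidate subset formed by adding one remaining feature.
        scored = [(criterion(selected + [f]), f) for f in remaining]
        score, feature = min(scored)
        # Stop once no single addition decreases the criterion.
        if score >= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected


# Hypothetical toy criterion: penalise missing useful features heavily
# and irrelevant selected features lightly.
USEFUL = {1, 3}


def toy_criterion(subset: Sequence[int]) -> float:
    chosen = set(subset)
    return len(USEFUL - chosen) + 0.1 * len(chosen - USEFUL)


chosen_features = sequential_forward_selection(5, toy_criterion)
print(chosen_features)
```

Nothing in the procedure depends on a neural network or on cross-validation as such; the criterion is a free choice, and it could be a cross-validated error, a filter score, or any other subset-quality measure.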

T1 - it clearly shows that the proposed IMO based feature selection drastically reduces the amount of extracted features from the collected dataset > No, it does not clearly show that; a reduction may occur (since I still don't know what this table really means), but no metric is proposed to SHOW that good ones are selected

Answer:

Thank you for your important comments. As per the reviewer’s comment, we have described the IMO-based feature selection in lines 394-401 of Section 4.4.2. In Table 1, the selected features and the length of the selected features are tabulated based on the best fitness value for the feature (fitness will provide the maximum length of the feature). The reduction in the features is described in Table 1, Section 4.4.2.

# 4.5

The classification procedure and the LSTM input/output are still incorrectly described.
There is not sufficient control of over-fitting, which may be the issue with the reported performance

Answer:

Thank you for your valuable comments. As per the reviewer’s comment, we have clearly described the LSTM process with its input/output in Section 4.5, and the hyperparameters are given in Section 4.5.2.

Reviewer 4 Report

The authors addressed most of the recommendations.

The current version is of better quality, and thus I recommend publication

Author Response

Reviewer 4:

The authors addressed most of the recommendations.

The current version is of better quality, and thus I recommend publication

Answer:

Thank you for your valuable time and effort.

Round 3

Reviewer 3 Report

I for one do not agree with the publication of this article. I leave it to the editor to make the final decision, since it depends on the impact factor, the quality of the paper and other things. For a high standard of quality, I can't agree to this. For me, the choice of terms is extremely important, CNN does not select features, it learns them, or more precisely it learns a latent space where the input images are projected, this high dimensional space can be interpreted in many ways, but never in terms of "selection". This is a point, but after several revisions, it appears that the paper contains a significant amount of errors and poorly explained parts, especially regarding Deep Learning, algorithms (normalization, feature selection, ...) and the data.

For example, one equation (E2) has totally changed between the first and the last version without any impact on the final classification. Another example: until the previous version, the data came from Sat 4, Sat 6, and Eurosat; now we find out that they did not use any spectral band of these images, but that the images were converted to gray scale. Second, I didn't see this before, but in the first version they were talking about satellite data being affected by data loss and the "curse of dimensionality", whereas these are aerial images, which do not suffer from these factors; fortunately this second point has been removed.

Finally, regarding the accuracy of the given method, it seems strange to have an almost perfect classification (around 99.95%) with only a gray scale image. While the state of the art, which takes into account all spectral bands and spatial information, does not achieve such perfect performance, but more likely around 96-98%. On this paper, GoogleNet shows an accuracy of 96.69%, a simple reference [15] is proposed (Remote sensing and GIS based analysis of temporal land use/land cover and water quality changes in Harike wetland ecosystem https://arxiv.org/pdf/1709.00029.pdf), the verification of this reference shows that this exact score does not exist, on the final accuracy table (3), the accuracy of GoogleNet on EuroSat is around 98.18% and 98.29% on SAT-6. How can these differences be explained? Has the author learned a GoogleNet on grayscale images? This reference [15] is interesting because it shows that training on grayscale (or single-band) data gives lower performance (97%) compared to RGB data (98.57) for ResNet. Therefore, one would expect to have additional performance using all spectral bands. Thus, how can a method that relies on grayscale images achieve better performance than multi-spectral methods? Especially considering the state of the art of image classification techniques?

So I have serious ethical concerns about this study.
Best regards,

Author Response

Editor Comment:

Please address the final comments from reviewer#3 namely about the consistency of the results.

Response:

Thank you very much for sharing your expert opinions on our work. We really appreciate the time and effort taken in reviewing this submission. As a result, we strongly believe that our manuscript has benefited from your constructive comments and suggestions, which were helpful in improving the quality of this paper. Here, we have clarified the queries raised by the Reviewer regarding the consistency of the results.

As the reviewer noted, we have not used spectral images; grayscale images are used during feature extraction, and we have not combined any image bands during acquisition. The input airborne images are collected from three online datasets, Sat 4, Sat 6, and Eurosat, and the dataset links are given below. In addition, similar to other deep learning models, the proposed multiplicative LSTM network learns only the feature vectors that are selected by the improved mayfly optimizer. As the reviewer said, we have not used the multiplicative LSTM network for selection; it is used only for LULC classification. As detailed in the manuscript, the Haralick texture features, histogram of oriented gradients, local Gabor binary pattern histogram sequence and Harris corner detection are used as feature extraction techniques, and the improved mayfly optimizer is used for feature selection. In the resulting phase, the selection of optimal features and the improvements made in the LSTM network greatly improve the classification accuracy, similar to reference [29].

https://www.kaggle.com/code/nilesh789/land-cover-classification-with-eurosat-dataset/notebook

https://csc.lsu.edu/~saikat/deepsat/

 

We hope the statements above clarify the reviewer's doubts, and we request that he/she reconsider the manuscript for publication.

 # Reviewer 3 Comments:

  1. I for one do not agree with the publication of this article. I leave it to the editor to make the final decision, since it depends on the impact factor, the quality of the paper and other things. For a high standard of quality, I can't agree to this. For me, the choice of terms is extremely important, CNN does not select features, it learns them, or more precisely it learns a latent space where the input images are projected, this high dimensional space can be interpreted in many ways, but never in terms of "selection". This is a point, but after several revisions, it appears that the paper contains a significant amount of errors and poorly explained parts, especially regarding Deep Learning, algorithms (normalization, feature selection, ...) and the data.

For example, one equation (E2) has totally changed between the first and the last version without any impact on the final classification. Another example: until the previous version, the data came from Sat 4, Sat 6, and Eurosat; now we find out that they did not use any spectral band of these images, but that the images were converted to gray scale. Second, I didn't see this before, but in the first version they were talking about satellite data being affected by data loss and the "curse of dimensionality", whereas these are aerial images, which do not suffer from these factors; fortunately this second point has been removed.

Response:

Thank you for your observation and guidance.

As per the reviewer’s suggestion, we have revised the paper accordingly. Additionally, we have not used any raw data directly. Instead, we have followed a procedure of pre-processing, feature extraction, feature selection and classification for training and testing on the data. Through this process, we aim to show that the approach is effective. Our main objective is to follow this process to achieve higher accuracy.

  1. Finally, regarding the accuracy of the given method, it seems strange to have an almost perfect classification (around 99.95%) with only a gray scale image. While the state of the art, which takes into account all spectral bands and spatial information, does not achieve such perfect performance, but more likely around 96-98%. On this paper, GoogleNet shows an accuracy of 96.69%, a simple reference [15] is proposed (Remote sensing and GIS based analysis of temporal land use/land cover and water quality changes in Harike wetland ecosystem https://arxiv.org/pdf/1709.00029.pdf), the verification of this reference shows that this exact score does not exist, on the final accuracy table (3), the accuracy of GoogleNet on EuroSat is around 98.18% and 98.29% on SAT-6. How can these differences be explained? Has the author learned a GoogleNet on grayscale images? This reference [15] is interesting because it shows that training on grayscale (or single-band) data gives lower performance (97%) compared to RGB data (98.57) for ResNet. Therefore, one would expect to have additional performance using all spectral bands. Thus, how can a method that relies on grayscale images achieve better performance than multi-spectral methods? Especially considering the state of the art of image classification techniques?

Response:

Thank you for your valuable comment. As per the reviewer’s comment, we have verified and updated Table 9 in Section 5.4. Previously, reference [15] was wrongly cited; we have now provided the proper citation [22] and validated all the result values. Additionally, reference [29] trained on the raw data and achieved an accuracy of 99.98% on the Sat 4 dataset and 99.93% on the Sat 6 dataset. In our research work, we have carried out pre-processing, feature extraction, feature selection and classification for the training and testing stages. So, we have also achieved higher accuracy on the Sat 4, Sat 6 and Eurosat datasets.

 Cited Reference:

[22] Helber, P., Bischke, B., Dengel, A., & Borth, D. (July 2019). Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(7), 2217-2226. DOI: 10.1109/JSTARS.2019.2918242.

[23] Unnikrishnan, A.; Sowmya, V.; Soman, K. P. (2018). Deep AlexNet with reduced number of trainable parameters for satellite image classification. Procedia Computer Science, 143, 931-938. https://doi.org/10.1016/j.procs.2018.10.342.

[29] Papadomanolaki, M.; Vakalopoulou, M.; Zagoruyko, S.; Karantzalos, K. (June 2016). Benchmarking deep learning frameworks for the classification of very high-resolution satellite multispectral data. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. III, no. 7, 83–88. https://doi.org/10.5194/isprs-annals-III-7-83-2016.
