A Novel Method for Imputing Missing Values in Ship Static Data Based on Generative Adversarial Networks
Round 1
Reviewer 1 Report
A Novel Method for Imputing Missing Values in Ship Static Data Based on Generative Adversarial Networks
Junbo Gao, Ze Cai, Wei Sun and Yingqi Jiao
Overall, this is a reasonable paper that should be published, but some corrections are needed first.
1) Some important references are missing:
The seminal paper on gradient penalties:
Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Neural Information Processing Systems: La Jolla, CA, USA, 2017; Volume 30 of Advances in Neural Information Processing Systems.
The other important paper on GANs and missing values (in addition to the GAIN paper):
Poudevigne-Durance, T.; Jones, O.D.; Qin, Y. MaWGAN: A Generative Adversarial Network to Create Synthetic Data from Datasets with Missing Data. Electronics 2022, 11, 837.
A reference for Isolation Forests:
Whichever reference you used.
2) Wasserstein distance:
The methodology described in Equations (2)–(6) does not use the Wasserstein distance. It even refers to the cross-entropy loss, which is the basis of the Jensen-Shannon divergence, which the authors specifically claim _not_ to be using. Either these equations need to be rewritten to describe a Wasserstein GAN, or the authors' description in the text of what they are doing needs to be corrected.
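For reference, the two objectives at issue can be sketched as follows. The notation follows the usual GAN literature and Gulrajani et al., not the manuscript: the symbols D (discriminator or critic), P_r (real data distribution), P_g (generator distribution), P_x̂ (distribution of points interpolated between real and generated samples) and λ (penalty weight) are assumptions made here for illustration and are not taken from Equations (2)–(6).

```latex
% Standard GAN discriminator loss: a cross-entropy loss,
% tied to the Jensen-Shannon divergence.
L_D^{\mathrm{GAN}} =
  -\mathbb{E}_{x \sim P_r}\big[\log D(x)\big]
  -\mathbb{E}_{\tilde{x} \sim P_g}\big[\log\big(1 - D(\tilde{x})\big)\big]

% WGAN-GP critic loss: a Wasserstein distance estimate
% plus a gradient penalty.
L_D^{\mathrm{WGAN\text{-}GP}} =
  \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
  -\mathbb{E}_{x \sim P_r}\big[D(x)\big]
  +\lambda\,\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
    \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]
```

A loss containing logarithms of the discriminator output is of the first kind; a Wasserstein formulation has no logarithms and an unbounded critic output.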
3) Outliers:
Any distribution-based imputation method needs to be wary of outliers. The plots 5(g), 5(i) and 5(j) all clearly show outliers, which should probably have been removed before the data were used for imputation.
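As an illustration of the kind of pre-filtering meant here, a minimal sketch follows. It uses a simple interquartile-range (Tukey) rule as a stand-in chosen for brevity; it is not the isolation forest mentioned in the manuscript. The function name `iqr_filter`, the multiplier `k=1.5` and the sample values are all hypothetical.

```python
from statistics import quantiles


def iqr_filter(values, k=1.5):
    """Split values into (kept, outliers) using Tukey's IQR rule.

    None marks a missing entry; it is ignored when computing the
    quartiles and always kept, so it can still be imputed later.
    """
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)  # quartiles of the observed values
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    kept, outliers = [], []
    for v in values:
        if v is not None and not (low <= v <= high):
            outliers.append(v)
        else:
            kept.append(v)
    return kept, outliers


# Hypothetical ship lengths in metres: one missing entry, one implausible value.
lengths = [120.0, 135.5, None, 128.0, 131.0, 126.5, 900.0, 133.0]
kept, outliers = iqr_filter(lengths)
```

The filter is applied per attribute and leaves missing entries in place, so the imputation model is trained only on plausible observed values.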
4) Not MCAR:
Table 3 gives the test result showing the data is not MCAR (this is also clear from Table 2b, which would show a geometric distribution if the data were MCAR). What are the consequences of this? There are no theoretical guarantees that GAIN works for data that are MAR, so are you worried? If not, why not?
Some other minor corrections:
line 12: rather than "performed best", you should say "outperformed the missing forest and polynomial fit methods"
line 103: "missing forest" should be "isolation forest"
Table 2b: "Missing percentage" should just be "percentage"
Figure 7: not really needed given Table 6 and Figure 6.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
This article proposes a ship static data imputation method based on generative adversarial networks (GAN). The authors claim that their proposed method improves Generative Adversarial Imputation Nets (GAIN) by using the Wasserstein distance and a gradient penalty, and that it optimizes the data preprocessing process by incorporating knowledge from the ship domain. The following comments are suggested.
1. It is suggested to rewrite the abstract and highlight the novelty and findings of this study.
2. This study lacks an in-depth literature review of the problem it addresses. It is suggested to strengthen this section and to consider related papers, such as "A hybrid imputation method for multi-pattern missing data: A case study on type II diabetes diagnosis".
3. The contribution of this study should be summarized at the end of the introduction.
4. Please clarify equations 4-6.
5. It is recommended to expand the description of Figures 2 and 3 and add more detail in the related section.
6. The methods used for each box in Figure 1 are unclear; pseudocode is recommended.
7. The authors are recommended to state the claims of this study and the experimental evaluation that supports them.
8. It is recommended to report the accuracy and precision of the proposed model before and after imputation.
9. It is recommended to design an experimental evaluation that investigates the impact of using the Wasserstein distance and gradient penalty in the generative adversarial imputation nets.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The authors have addressed all of my comments appropriately.
Reviewer 2 Report
The authors have responded to most comments, and the revised manuscript is eligible for publishing.