Analyzing the Performance of the S3 Object Storage API for HPC Workloads
Round 1
Reviewer 1 Report
The paper addresses an interesting and important concept, “Analyzing the Performance of the S3 Object Storage API for HPC Workloads”. The authors have invested considerable thought and effort into the problem investigated. The work is almost well-written and well-understood and falls within the scopes defined in the journal, though in some parts, the writing deteriorates considerably and needs some amendments. There are, however, some deficiencies with the paper that should be addressed/responded to before it can be recommended for publication in the Journal:
- Section 1 (Introduction) needs to be improved. More recent studies published in credible journals could be included (2021).
- Section Related Work: the authors need to discuss each of the topics they used in this research one by one. You can change the section name to the Background section.
- The article needs to have a Research Finding, Discussion, Conclusion, and Future Work sections.
Reviewer Recommendation: Major Revision
Author Response
Point 1: The work is almost well-written and well-understood and falls within the scopes defined in the journal, though in some parts, the writing deteriorates considerably and needs some amendments
Response 1: Thank you for your positive feedback. Can you please specify in which parts the writing deteriorates considerably?
Please note that the new version contains several improvements, and the descriptions of some figures were also amended.
Point 2: Section 1 (Introduction) needs to be improved. More recent studies published in credible journals could be included (2021).
Response 2: The introduction is now improved. The changes are highlighted in the new version.
The following related studies from 2021 are now included in the new version:
- Jamal, A.; Fleiner, R.; Kail, E. Performance Comparison between S3, HDFS and RDS storage technologies for real-time big-data applications. 2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI). IEEE, 2021, pp. 000491–000496.
- Milojicic, D.; Faraboschi, P.; Dube, N.; Roweth, D. Future of HPC: Diversifying Heterogeneity. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2021, pp. 276–281.
- Liu, Z.; Kettimuthu, R.; Chung, J.; Ananthakrishnan, R.; Link, M.; Foster, I. Design and Evaluation of a Simple Data Interface for Efficient Data Transfer across Diverse Storage. ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 2021, 6, 1–25.
Point 3: Section Related Work: the authors need to discuss each of the topics they used in this research one by one. You can change the section name to the Background section.
The article needs to have a Research Finding, Discussion, Conclusion, and Future Work sections.
Response 3: Thank you for pointing this out. The section was named “1.1. Background and Related Work” and placed as a subsection inside the “Introduction” section. There, we examine the current state of the research field carefully and cite relevant publications. We also highlight controversial and diverging hypotheses. The title of the “Summary” section, which briefly discusses and highlights the obtained results and possible future work, was changed to “Conclusions”.
According to the MDPI description in https://www.mdpi.com/journal/applsci/instructions#manuscript, the journal now accepts free-format submission. We also adjusted our section titles according to the requirements and moved and extended content in these sections.
Reviewer 2 Report
The authors present a paper concerning the improvement of the performance of a simple storage service (S3). To do this, in a first step the authors will compare the S3 algorithm to other existing algorithms. A methodology has been set up to make this comparison. From there the authors have obtained results.
This paper seems difficult to read, especially the different benchmarks seem abstract because we don't know if they are the different algorithms, tools or other things. For example: IO500, MD-Workbench, IOR, MDTest, MinIO, etc. We get a bit lost in reading the article. The same is true for the results with the graph legends not all described. Figure 9 does not seem to be the same size as the other figures.
Moreover, the state of the art seems thin, it would be necessary to detail a little more. For example, make a summary table comparing S3 to Google, IBM, etc., giving the advantages and disadvantages and the functionalities.
Author Response
Point 1: This paper seems difficult to read, especially the different benchmarks seem abstract because we don't know if they are the different algorithms, tools or other things. For example: IO500, MD-Workbench, IOR, MDTest, MinIO, etc. We get a bit lost in reading the article. The same is true for the results with the graph legends not all described. Figure 9 does not seem to be the same size as the other figures.
Response 1: Each benchmark (IO500, MD-Workbench, IOR, MDTest) is described in Section 3.1 (Benchmarks); the modifications made to each benchmark are described in detail in Section 3.2. The cited references offer further information for the interested reader.
We added descriptions for the missing graph legends and adjusted the size of Figure 9 in the new version.
Please note that the new version contains several improvements, and the descriptions of some figures were also amended.
Point 2: Moreover, the state of the art seems thin, it would be necessary to detail a little more. For example, make a summary table comparing S3 to Google, IBM, etc., giving the advantages and disadvantages and the functionalities.
Response 2: We tried our best to shed more light on this in the new version to clarify the misunderstanding: we are actually testing the S3 interface/API provided by different vendors. In particular, Table 2, entitled “IO500 results comparing S3 Cloud providers of different”, compares the S3 interface of Google Storage with the S3 interface of IBM Storage, etc. The in-house tests of the S3 interface from MinIO are provided only as a reference. In terms of functionality, all these implementations provide the S3 interface; additional features are not relevant in this article, as we focus on performance.
We have now clarified this in the article.
Round 2
Reviewer 1 Report
ACCEPT
Reviewer 2 Report
No suggestions