Open Access Article

Peer-Review Record

Performance Evaluation of Distributed Database Strategies Using Docker as a Service for Industrial IoT Data: Application to Industry 4.0

Information 2022, 13(4), 190; https://doi.org/10.3390/info13040190

by Theodosios Gkamas, Vasileios Karaiskos and Sotirios Kontogiannis *

Reviewer 1: Anastasija Nikiforova

Reviewer 2: Anonymous

Submission received: 16 March 2022 / Revised: 5 April 2022 / Accepted: 7 April 2022 / Published: 9 April 2022

(This article belongs to the Special Issue Big Data, IoT and Cloud Computing)

Round 1

Reviewer 1 Report

The topic covered should be of interest to a wide readership. The paper provides a good quantitative overview that gives readers the exact results.

The paper is well-written and well-structured.

There are however several suggestions on how to improve the paper.

Abstract: the authors are invited to make the abstract more fluent and motivated.

I.e. there is a need to emphasize why "This paper presents the authors’ proposition for cloud-centric sensory measurements and measurements acquisition" by making a link to the provided background.

Similarly, there should be a motivation why exactly three scenarios are tested, i.e. "Three distinct scenarios have been thoroughly tested: i) data insertions, ii) select/find queries, and iii) queries related to aggregate correlation functions.". Most probably this will be discussed in the light of the centricity of these scenarios, i.e. the most basic but widely used.

Paragraphs dedicated to DBs would be more appropriate for the next Sections, i.e. their presence in the Introduction is questionable.

The authors are invited to provide not only the volume of the dataset (Section 4) but also the number of attributes and / or discuss how the difference between them will affect the results of the experiment.

The selection of artifacts such as TIG should be well motivated by elaborating on both (1) why this particular artifact makes sense and why not alternatives (alternatives should be mentioned), and (2) how the results would or would not be affected by using an alternative.

The list of references is not up-to-date. In the light of the topicality of the topic, it is important to cover more recent studies. This will provide evidence that the statements and the issue concerned are indeed topical.

The authors are advised to briefly discuss which other parameters should determine the choice of the DB. Here the discussion on the security of databases, in the light of NoSQL DBs' relatively low security, is crucial. For one of the most recent studies see -

Daskevics, A., & Nikiforova, A. (2021, December). IoTSE-based open database vulnerability inspection in three Baltic countries: ShoBEVODSDT sees you. In 2021 8th International Conference on Internet of Things: Systems, Management and Security (IOTSMS) (pp. 1-8). IEEE.

with a particular focus on MongoDB found here - https://www.bitdefender.com/blog/hotforsecurity/bad-actors-target-mongodb-databases-threatening-to-contact-gdpr-legislators-unless-ransom-is-paid/

Additionally, the discussion in the context of other studies on DB performance should be established, i.e. whether the results are similar or differ significantly? As an example -

Martins, P., Abbasi, M., & Sá, F. (2019, April). A study over NoSQL performance. In World Conference on Information Systems and Technologies (pp. 603-611). Springer, Cham.

Martins, P., Tomé, P., Wanzeller, C., Sá, F., & Abbasi, M. (2021, March). NoSQL Comparative Performance Study. In World Conference on Information Systems and Technologies (pp. 428-438). Springer, Cham.

Seghier, N. B., & Kazar, O. (2021, September). Performance Benchmarking and Comparison of NoSQL Databases: Redis vs MongoDB vs Cassandra Using YCSB Tool. In 2021 International Conference on Recent Advances in Mathematics and Informatics (ICRAMI) (pp. 1-6). IEEE.

Other minor comments:

The language requires some improvements; support from a native speaker would be beneficial.

The authors are also invited to control the tense they use throughout the paper, i.e. subsection 4.1 uses the future tense, while 4.2 uses the past tense. Please make sure that both use the past tense.

The formatting guidelines should be checked. The current version does not follow reference formatting guidelines.

The authors should check that all abbreviations are explained, e.g. "IoE", "IIoT" are not.

In addition, the authors are invited to make sure that the title matches the study as closely as possible. The term "strategy" is probably not the most accurate. What about the "performance" that you have checked? For me, it would be beneficial to include this term in the title.

Otherwise, the study is suited for the journal and is of potential interest for the wide readership, representing both theoreticians and practitioners once the above listed are improved.

Author Response

Reviewer #1

Comment 1: Abstract: the authors are invited to make the abstract more fluent and motivated.

I.e. there is a need to emphasize why "This paper presents the authors’ proposition for cloud-centric sensory measurements and measurements acquisition" by making a link to the provided background.

Response: A better explanation is included in the abstract.

Comment 2: Similarly, there should be a motivation why exactly three scenarios are tested, i.e. "Three distinct scenarios have been thoroughly tested: i) data insertions, ii) select/find queries, and iii) queries related to aggregate correlation functions.". Most probably this will be discussed in the light of the centricity of these scenarios, i.e. the most basic but widely used.

Response: A proper explanation is added, about our choice of the three testing scenarios.

Comment 3: Paragraphs dedicated to DBs would be more appropriate for the next Sections, i.e. their presence in the Introduction is questionable.

Response: Thank you for your comment. In our opinion, however, the presentation of MongoDB’s and PostgreSQL’s attributes should remain in the introductory part, since it is not suitable for the Related Work section.

Comment 4: The authors are invited to provide not only the volume of the dataset (Section 4) but also the number of attributes and / or discuss how the difference between them will affect the results of the experiment.

Response: As the reviewer requested, the number of attributes of each record in the DBMS is now specified; it is equal to 23.

Comment 5: The selection of artifacts such as TIG should be well motivated by elaborating on both (1) why this particular artifact makes sense and why not alternatives (alternatives should be mentioned), and (2) how the results would or would not be affected by using an alternative.

Response: The comment is taken into consideration.

Comment 6: The list of references is not up-to-date. In the light of the topicality of the topic, it is important to cover more recent studies. This will provide evidence that the statements and the issue concerned are indeed topical.

Response: The Related Work section is updated including a few more recent publications.

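Comment 7: The authors are advised to briefly discuss which other parameters should determine the choice of the DB. Here the discussion on the security of databases, in the light of NoSQL DBs' relatively low security, is crucial. For one of the most recent studies see -

Daskevics, A., & Nikiforova, A. (2021, December). IoTSE-based open database vulnerability inspection in three Baltic countries: ShoBEVODSDT sees you. In 2021 8th International Conference on Internet of Things: Systems, Management and Security (IOTSMS) (pp. 1-8). IEEE.

with a particular focus on MongoDB found here - https://www.bitdefender.com/blog/hotforsecurity/bad-actors-target-mongodb-databases-threatening-to-contact-gdpr-legislators-unless-ransom-is-paid/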

Response: Both references are included in the manuscript and are properly discussed.

Comment 8: Additionally, the discussion in the context of other studies on DB performance should be established, i.e. whether the results are similar or differ significantly? As an example -

Martins, P., Abbasi, M., & Sá, F. (2019, April). A study over NoSQL performance. In World Conference on Information Systems and Technologies (pp. 603-611). Springer, Cham.

Martins, P., Tomé, P., Wanzeller, C., Sá, F., & Abbasi, M. (2021, March). NoSQL Comparative Performance Study. In World Conference on Information Systems and Technologies (pp. 428-438). Springer, Cham.

Seghier, N. B., & Kazar, O. (2021, September). Performance Benchmarking and Comparison of NoSQL Databases: Redis vs MongoDB vs Cassandra Using YCSB Tool. In 2021 International Conference on Recent Advances in Mathematics and Informatics (ICRAMI) (pp. 1-6). IEEE.

Response: Additional recent references were included and discussed in the related work section of the manuscript.

Other minor comments:

MC 1. The language requires some improvements, where the native speaker would be a beneficial support.

R1: The manuscript has been proofread to improve the use of the English language.

MC 2. The authors are also invited to control the tense they use throughout the paper, i.e. subsection 4.1 uses the future tense, while 4.2 uses the past tense. Please make sure that both use the past tense.

R2: Fixed.

MC 3. The formatting guidelines should be checked. The current version does not follow reference formatting guidelines.

R3: The references in the bibliography section are now properly written.

MC 4. The authors should check that all abbreviations are explained, e.g. "IoE", "IIoT" are not.

R4: Fixed. Those abbreviations are explained in the “Abbreviations” section at the end of the manuscript.

MC 5. In addition, the authors are invited to make sure that the title matches the study as closely as possible. The term "strategy" is probably not the most accurate. What about the "performance" that you have checked? For me, it would be beneficial to include this term in the title.

R5: The title of the manuscript has been updated to contain the word “Performance”.

Reviewer 2 Report

Dear authors,

The topic is relatively up-to-date and promising, which gives some hope for reads and citations, which is good.

The abstract is VERY promising; it suggests a huge dose of really useful research in your paper, which is GREAT.

If the paper itself is as good as the abstract suggests, then I am really surprised you sent it to MDPI.Information (which is young and doesn’t have IF calculated yet) and not to MDPI.AppliedSciences (which has IF=2.679).
If the paper itself is not as good as the abstract suggests, I will keep in mind the difference between Information and AppliedSciences.

The text contains some imperfections, which will be listed below. My goal is to help you see the imperfections, so that your paper writing skill could be leveraged, and your papers would become more interesting, easy to read, and professional.

Of course you can agree or disagree with my comments - you are the experts, the researchers, I am just a reviewer, trying to help you prepare better papers, more professional, easier to read, understand, and appreciate.

[line.number] - text - my.comment

[26] - several - definition of „several” == „more than two but not many”. You didn’t mean that, did you? Did you mean „numerous”?
[31] - us - the use of 1st person is considered non-professional or non-scientific, and it should be avoided e.g. by using passive. In this particular case I do not see inevitability (or even the need for) 1st person. Just delete „us with” and the sentence will be much more proper for a scientific paper
[34] - a few - why do you emphasize that there are not many (more advantages)? Did you mean that there ARE additional (advantages)? In that case saying „a few” limits the scope of possibilities. In my opinion you mean „There are additional advantages of… , including…”
[37] - DCS - formally, you are required to define an abbreviation upon the 1st use; you have decided to define it within the text below, which is strange, but can be accepted in this case
[41] - PLCs - I am not so sure, I think that I have seen/programmed a mesh consisting of PLCs able to cooperate and watch each other. Your sentence suggests it is not possible, and that PLCs are always standalone units. (If this is not what you meant, this is what it sounds like.)
[42] - will - it means always, and that is not always true. You rather should say „is able to…” or „is capable of …ing”
[45] - now - it is a tricky word, while it adds a time axis to your considerations. And while the reader does not know WHEN did you write the paper, instead of „now” you should write „as in March of 2022…” . Of course, that is not what you meant, for this reason it is better not use time-describing words, and instead of „PLCs can now be…” write „PLC-based systems can be designed as…” or „PLCs can be used as…”
[49] - big data - this phrase means nothing. Data can not be big or small. IF you mean the Big Data paradigm, then you have to use capital letters, because it is a name. So…: Big Data
[49] - handle Big Data - if you mean the handling of the specificity of the Big Data paradigm, then the sentence would be o.k. (with capital letters). But if you mean just „handling massive datasets” (or numerous data entries), you may want to rephrase that.
[56] == [49]
[62] - data are - either „data is” or „data entries are”
[62] - encoding on - either „encoding of” or „encoding information on”
[63] - array of value - values
[70] == [31]
[77] - On the other hand, - this phrase means that you will now present the second point of view, quite different/opposite to the first one, and maybe complementing the first one, like „the second side of a coin”, which is not what you mean I think. I think that you just want to present another way of addressing similar challenges, with other framework/system (but not complementary). (Maybe just delete the „On the other hand,” ?)
[110] - industry 4.0 - Industry 4.0 (see [49]: big data)
[161] - industrial 4.0 - what is that !???
[180] - autor’s performance - o, really? The performance of authors? Don’t think so.
[184] - The dataset used to perform - „used to” primarily means „he had that in habit”, „he usually did that”. So you mean that the dataset usually (every evening? every Sunday?) decided to do some platform testing (just for fun?) ? //consult native speaker for clarification if needed
[184] - will - you are messing around with the timeline again. Why do you „start the future”? The experiment is already done, the paper is ready, so the future tense is very confusing.
[206] - and - (you have more than two)
[207] - Jitter of Response Time - you mean „Jitter” -OR- „Response Time”? Because there is no such thing as jitter of response time, while response time is a single value for particular response, and jitter is jitter ;-) Jitter can be considered only with many (more than one) consecutive frames that should arrive „in order” (but usually will not), but not for a single frame/value. Unless you did not mean one particular value of response time or not the jitter - so correct what is not o.k. or not clear. If you mean the distribution of the time distance of response times, or variation of these, or distribution, or whatever in this flavor, - this is not jitter. In jitter the sequence and the correctness of the sequence is crucial.
[207] - the Loss, meaning the database drops - WHAAAT? Do you know what is a „database drop” !?!?!?!?! Correct this sentence FAST !!!!!! Before anyone reads it…
[256table1] - loss of response time - ??? How do you „lose” the „response time”? The response time is a value, for instance 0.1 second. How do you „lose” it?
[256table1] - (jitter of) == [207]
[266table2] - (loss of response time) == [256table1]
[266table2] - (jitter of) == [256table1]
[286table3] - (loss of response time) == [266table2]
[286table3] - (jitter of) == [266table2]
[296table4] - /caption/table/ - the caption doesn’t help understanding the table. Make a better caption (or better column names) so that we would be able to see the reason behind this table. Neither column names nor row names seem to be compatible with the caption/description. Requires strong thinking - this is always bad - the reader will skip this table or not understand it or stop reading here.
[305] - overhead,rather - overhead, rather
[305] - where they use - who are „they”? (If you mean the two tested versions (these are not persons) then consider „using” instead of „where they use”)
[320] - mongos - isn’t that a Name?
[321] - and will carry the same data at all times - not true. It’s a lie, a fairytale. Don’t trust it. It’s fake news. Using replicas (/mirrors) for speed-up (or scalability) requires contacting one particular instance, and the other instances will be synchronized later. (So… not „at all times”)
[342] - in Celcius - yes, nice to know, interesting, however not very useful. What difference does it make? If you want to give details, you might give the datatypes, but I do not see the point either
[340-344] - exact copy of [194-197] - why? To make the paper longer?
[440] - your Conclusion section looks like an Introduction: Look at the paragraphs: 1.the.industry.is.evolving ; 2.the.paper.focuses.on ; 3.the.simplicity.of.MongoDB.shows - you should make the Conclusions in regards to YOUR research/results, to finally conclude if your work gave something important/useful or maybe proved that a particular approach is hopeless or promising. I suggest rewriting the Conclusion section completely -or- adding a „targeted” paragraph, that would address your results, and would provide the luxury of formulating your own fact-based (research-based) opinions (conclusions).
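
The distinction raised in the [207] comment, between an order-independent spread of response times and an order-dependent variation between consecutive samples, can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and do not come from the manuscript; the second function is only a simplified stand-in for a jitter-like measure, not a definition used by the authors.

```python
from statistics import mean, pstdev


def response_time_spread(times_ms):
    """Order-independent spread: population standard deviation of the samples."""
    return pstdev(times_ms)


def consecutive_variation(times_ms):
    """Order-dependent measure: mean absolute difference between
    consecutive samples (a simplified, jitter-like quantity)."""
    diffs = [abs(b - a) for a, b in zip(times_ms, times_ms[1:])]
    return mean(diffs)


# Hypothetical response times in milliseconds: same values, different order.
ordered = [10.0, 12.0, 14.0, 16.0]
shuffled = [10.0, 16.0, 12.0, 14.0]

print(response_time_spread(ordered), response_time_spread(shuffled))
print(consecutive_variation(ordered), consecutive_variation(shuffled))
```

Both sequences have the same standard deviation, because that statistic ignores ordering, while the consecutive-difference measure changes once the samples are reordered. This is why the two quantities should not be reported under a single name.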

The paper is nice, and after you address the above-mentioned issues, the paper will be even better :)

You do not have to give your answer to me in a letter-to-reviewer, just modify the paper so that it would be better.

I wish you good modifications of the indicated details and a great paper and scientific career.

Best regards,
Reviewer.

Comments for author File: Comments.pdf

Author Response

Reviewer #2

Thank you very much for your comments and concern; we really appreciate them. Most of the comments have been taken into consideration.

Comment 1: [296table4] - /caption/table/ - the caption doesn’t help understanding the table. Make a better caption (or better column names) so that we would be able to see the reason behind this table. Neither column names nor row names seem to be compatible with the caption/description. Requires strong thinking - this is always bad - the reader will skip this table or not understand it or stop reading here.

Response 1: The caption of Table 4, along with the table itself, has been revised to explain the table better.

Comment 2: [440] - your Conclusion section looks like an Introduction: Look at the paragraphs: 1.the.industry.is.evolving ; 2.the.paper.focuses.on ; 3.the.simplicity.of.MongoDB.shows - you should make the Conclusions in regards to YOUR research/results, to finally conclude if your work gave something important/useful or maybe proved that a particular approach is hopeless or promising. I suggest rewriting the Conclusion section completely -or- adding a „targeted” paragraph, that would address your results, and would provide the luxury of formulating your own fact-based (research-based) opinions (conclusions).

Response 2: The conclusion section was completely rewritten, in line with the reviewer’s comments.

Round 2

Reviewer 1 Report

The paper has undergone a round of revisions, in which the comments were either taken into account with appropriate changes, or the authors explained why the changes were not made. Since the paper has been improved and provides rich content of potential interest to the readership, I recommend accepting the paper in its current form! The only point is that the authors need to format the references according to the guidelines of the journal. Otherwise, congratulations!!
