Article
Peer-Review Record

Development of a Voice Virtual Assistant for the Geospatial Data Visualization Application on the Web

ISPRS Int. J. Geo-Inf. 2023, 12(11), 441; https://doi.org/10.3390/ijgi12110441
by Homeyra Mahmoudi 1, Silvana Camboim 2,* and Maria Antonia Brovelli 1
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 27 June 2023 / Revised: 24 September 2023 / Accepted: 6 October 2023 / Published: 26 October 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Lines 3-6. The abstract is expected to state the obtained results clearly, not only the approach and the problem. Please state the obtained results clearly.

 

Lines 67-69. Please provide references and/or specific examples of the challenges.

 

Lines 70-75. Please identify those gaps specifically.

 

Lines 83-87 … The microphone is a critical device for converting voice commands …

Review: Text stating such common knowledge can be removed to save readers’ time.

 

Fig. 5. It appears that attributes such as the map, window size, and zoom in/out scale are first defined in the standard UI without voice interaction, and that the user then gives a voice command using the controlled terms retrievable in the help. Please state this clearly. It will help readers see the real progress made in saving users time when getting answers.

 

Lines 366-368.  Testing is a critical phase…

Review: I feel that this text is another example of redundancy. It would be fine in a textbook but can be avoided in a research paper. My general recommendation is to remove all such common-knowledge sentences.

 

Lines 369-372. This text can be shortened to be less wordy, for example:

In this phase, participants completed 18 tasks. Table 1 shows completion rates, task time, error rates, and usability feedback.

 

Lines 374-378

1. … It provided an indication of the effectiveness of the application in enabling users to accomplish their intended goals.

2. This metric assessed the efficiency of the application in terms of task completion speed.

Review: I feel these explanations of the metrics are redundant for the educated readers of a scientific journal and can be removed too.

 

Lines 380-381. Error Rate. Typically, an error rate is a percentage or a fraction of errors, such as 2.5% or 0.25. Table 1 presents the actual number of errors, not error rates. I would recommend reporting actual error rates along with the definition of errors. For multiple-choice questions, there is one error per question when an incorrect option is selected. For open questions, it can be a number between 0 and 100, depending on the actual number of errors in the answer.
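To illustrate the difference between an error count and an error rate, a minimal sketch follows. The function name and the numbers are hypothetical and are not taken from Table 1; the sketch also assumes that each attempt yields at most one error, as in the multiple-choice case.

```python
def error_rate(error_count: int, attempts: int) -> float:
    """Return errors as a fraction of attempts, between 0.0 and 1.0.

    Assumes at most one error per attempt (hypothetical convention).
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= error_count <= attempts:
        raise ValueError("error_count must lie between 0 and attempts")
    return error_count / attempts


# Hypothetical numbers for illustration only, not values from Table 1.
rate = error_rate(error_count=3, attempts=120)
print(f"count=3, fraction={rate:.3f}, percentage={rate:.1%}")
```

Reporting the count together with the number of attempts lets readers derive either the fraction or the percentage.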

 

Consider the questions listed in Figs. 9 and 11: “Where is the Pyramid of Giza?” and “Find Napoli”. Please explain how an error in answering such questions has been defined. Is it counted the same if the system finds nothing, shows a wrong place nearby, or shows a wrong place far away with a similar name, e.g., Gaza instead of Giza?
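One possible way to make such a definition explicit is to grade each answer by its distance from the expected location. The sketch below is an illustration only, not the authors' method: the category names, the 1 km and 50 km thresholds, and the approximate coordinates are all assumptions.

```python
import math
from typing import Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees

# Approximate coordinates, used only as illustrative inputs.
GIZA: LatLon = (29.9792, 31.1342)   # Giza pyramid complex
CAIRO: LatLon = (30.0444, 31.2357)  # central Cairo, a nearby wrong answer
GAZA: LatLon = (31.5017, 34.4668)   # Gaza City, a similar-sounding wrong answer


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def grade_result(expected: LatLon, returned: Optional[LatLon],
                 ok_km: float = 1.0, near_km: float = 50.0) -> str:
    """Classify a search result; thresholds are hypothetical defaults."""
    if returned is None:
        return "no_result"      # the system found nothing
    distance = haversine_km(expected, returned)
    if distance <= ok_km:
        return "correct"
    if distance <= near_km:
        return "wrong_nearby"   # wrong place close to the target
    return "wrong_far"          # wrong place far from the target


for name, place in [("nothing", None), ("Cairo", CAIRO), ("Gaza", GAZA)]:
    print(name, grade_result(GIZA, place))
```

With categories like these, a table could report how many errors fall into each class instead of a single undifferentiated count.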

 

Please report any studies comparing the time needed for oral queries, including setting up all specifications, with the time needed for short written queries such as “pyramid of Gaza” or “Napoli” in systems like Google Maps. Please also list the actual questions asked in the experiments reported in Table 1, with their respective characteristics such as accuracy.

Lines 422-427. The conclusion is expected to summarize the obtained results, not only the problems and future directions. Please add a summary of the obtained results.

Comments on the Quality of English Language

See comments above.

Author Response

We are grateful for the careful review of our paper and invaluable comments. We believe that your feedback is instrumental in ensuring the excellence of our work, and thus, we deemed it essential to take the necessary time for a thorough revision and crafting a comprehensive response to your comments. In the attachment, you will find our detailed responses to your remarks.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

1. In line 143, the text mentions that there are some limitations. It would be better to list some manifestations of these limitations.

 

2. In this study, some user question keywords were used as the corpus, but I think this is insufficient. It requires the use of a geospatial keyword corpus. In the current research, how do you handle it if the keywords recognized in user speech are not in your corpus?

 

3. In the experiment corresponding to Table 1, the Task Completion Rate is not 100%. Please analyze the relevant reasons. Is the Error rate in Table 1 missing a percentage sign (%)?

Comments on the Quality of English Language

Some minor errors need to be corrected.

Author Response

We are grateful for the careful review of our paper and invaluable comments. We believe that your feedback is instrumental in ensuring the excellence of our work, and thus, we deemed it essential to take the necessary time for a thorough revision and crafting a comprehensive response to your comments. In the attachment you will find our detailed responses to your remarks.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

This paper is a (practically oriented) description of an open-source web application platform used for visualizing spatial data and including an auditory virtual assistant. I have no doubt that this paper would be of interest to the community of web cartographers. However, there is also some room for improvement. Accordingly, I would like to suggest the following points for a revision:

1)      The discussion section reads like the discussion of a technical report. The authors could be much more detailed in showing readers how far specific aspects of their new web application extend, support, or even contradict approaches from previously published studies. In the discussion, the authors should also highlight the particularities of the project that give it its innovative character.

 

2)      You integrate ChatGPT in your evaluation. I do not really understand what readers could learn from this for a spatial problem. Please be a bit more precise on that.

 

3)      Do you see any risks in working with the Web Speech API for the aims of your project? You make yourselves dependent on tools developed by others. Would it be an idea to include a limitations section pointing to the risks and the potential for improvement of your project?

Comments on the Quality of English Language

Moderate editing of English language required

Author Response

We are grateful for the careful review of our paper and invaluable comments. We believe that your feedback is instrumental in ensuring the excellence of our work, and thus, we deemed it essential to take the necessary time for a thorough revision and crafting a comprehensive response to your comments. In the attachment, you will find our detailed responses to your remarks.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

Authors of this manuscript developed an extension to a visualization web platform that allows users to interact with maps using their voice. The focus was on developing an effective vocabulary specific to the geospatial context and later testing it within the platform. This is certainly a worthwhile goal; however, I found several major issues with the manuscript that need to be addressed. One such issue is the English: many places need improvement, and the Results section in particular was full of typos and grammatical mistakes that should have been caught before submission. The structure of the manuscript should more closely follow the traditional definitions of the Methods, Results, and Discussion sections. Given that the vocabulary is an important reusable outcome of this work, I suggest publishing it as supplemental material. See other comments listed by section:

 

Introduction

L20: developmental is not used in this context

L31-33: What do you mean by “are addressed”? In your work? In others’ works? There are no references, and it’s not quite clear how, e.g., real-time data processing, data democracy, or AI are related to this.

The second paragraph is highly repetitive.

 

Literature review

L55: In contrast?

L67: The paragraph on challenges lacks citations (who identified those challenges) and generally needs to be clarified. For example, why is real-time data processing a challenge in this context? Balancing data democracy? It’s unclear how your work addresses these challenges.

L74: Open-source? BStreams doesn’t seem to be open source. Please clarify; I couldn’t find the source code. The GitHub code you provided has some derivation of the MIT license, but it’s not an officially recognized open-source license.

Generally, this article has only 12 references. I am not a fan of long unrelated references, but it seems there must be more in the literature than that (but I acknowledge speech recognition is not my expertise).

 

Materials and Methods

L91-94: This sounds repetitive.

L100: how does your application use machine learning and gestural inputs? If you don’t use it, don’t put it in the methods section.

Figures 1 and 2: What are ‘recipe’ and ‘plan’? I personally don’t find these figures helpful without more explanation. The specific example in Figure 5 is slightly more useful, but even that could be clearer.

L107: Acceleration?

L111: If you directly use any of the concepts from a specific paper, like PlanGraph, I would expect these to be more explained, perhaps mentioned better in the literature background section.

L117: Similarly, the BStreams platform should be properly introduced, with an explanation of why the authors decided to use it for this research.

L118: don’t say ‘some’, be specific

L138: Could you better explain why the classification was needed?

Figure 5: Perhaps a more complex example would be suitable.

L145: general public? Please be more specific.

L154: Why not give the actual percentage of men vs women?

L179: Usually, Figure 7 should be referred to before Figure 8.

Figure 8: I am afraid I don’t understand the layout of the table, why is there a count column twice? Please reconsider the layout.

L212: Here you say that human responses involve more complex sentences, but on line 227 you say humans tend to interact more simply.

Figure 9: I am not sure showing the decomposed sentence structure is relevant here, if yes, the abbreviations need to be explained.

Importantly, there are parts of this section that would fit better in the Results section, for example the survey demographics and results and the result of the ChatGPT comparison.



Results

There are several issues here. While reading the description, it’s unclear what was already part of the BStreams platform and what was your own development. The passive voice is not helping here. When you say ‘application’, do you mean the BStreams application or your case study application?

Also, most of the content here, including the implementation and testing, would perhaps fit better in the Methods section. Results should cover the case study used for testing, which is hardly described, and the results of the testing (plus the previously mentioned results from the survey and the ChatGPT comparison).

 

L274-278: There is no need to explain object-oriented programming.

 

Section 4.4: This entire section is somewhat problematic. MainSpeechAPI.js and VoiceMapChart.js are not part of the shared code on GitHub. Are they part of the BStreams platform and, if so, why do you discuss them? Moreover, this section is very lengthy and full of typos. I suggest shortening it significantly; if readers are interested in implementation details, the source code should be available.

 

L362: However, I couldn’t match the descriptions to the linked source code very well. Also, how is your application available on the platform? I tried and couldn’t find it, which may be my mistake.

 

L386: The testing results need to be discussed in greater detail; you need to help the reader interpret the table.

 

Discussion

Discussion is typically a separate section (not under Results) and helps put your results in the context of current literature, which is mostly missing here. Figure 14 should be part of the previous section.

 

Conclusion

It should be more to the point and should not discuss various challenges that your application does not address anyway. Future work should go into the Discussion. Summarize what you did and what the scientific contribution is.

 

Comments on the Quality of English Language

Regarding English, see my comments above.

Author Response

We are grateful for the careful review of our paper and invaluable comments. We believe that your feedback is instrumental in ensuring the excellence of our work, and thus, we deemed it essential to take the necessary time for a thorough revision and crafting a comprehensive response to your comments. In the attachment, you will find our detailed responses to your remarks.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

 

Response file:

Responses 1 and 2: skilfully => skillfully

Comment 2. Please provide references and/or specific examples of the challenges.

Review: This is definitely better than in the first version. But the answer is difficult to follow without specific examples, especially for a reader who is not deeply versed in the field. The lack of infrastructure can be exemplified by the lack of software with particular features. The heterogeneity of voice commands across languages can be illustrated with references to the percentage of mismatched terms between languages A and B. For the balance between democracy and domain expertise, it can again be the percentage of difference in terminology. The pronounced gap in understanding of and compliance with geospatial information standards can also be illustrated with a specific, real example to be informative for the reader.

 

Response 8: We replaced "Error rate" with "Error Count",

Review: Table 1 still shows "Error rate".

Response 9.

Review: I did not find the answer to Comment 9 in Response 9. The response explains the sources of the errors, such as accent, but I asked how the errors are measured and counted. Do you count different situations as equivalent single errors, or do you grade the error as small, medium, or large, depending on whether the system finds nothing, shows a wrong place nearby, or shows a wrong place far away with a similar name, e.g., Gaza instead of Giza?

Comments 10: Please report any studies of the time for oral queries including setting up all specifications versus short written queries like “pyramid of Gaza” or  “Napoli” in systems like Google map.

Response 10: Although it is an appropriate suggestion, given the article's length, we would not be able to explore it adequately in the allocated scope.

Review: Table 1 shows the level of accuracy reached by using oral instructions. It does not show their benefits over the existing UI without oral instructions. You are not required to follow my specific suggestion, but the benefits of oral instructions over the current UI need to be shown somehow, for example by being faster or more convenient for the user. This would confirm the intent of this paper and of the whole endeavor of voice assistants: “Voice assistants can elevate interaction in geospatial data web platforms” (line 1).

Comments on the Quality of English Language

I found only one small error in the response file.

Author Response

Please see the attachment. 

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The authors provided a revised version of the manuscript. In a detailed response letter, they address all review comments. In particular, the new discussion is much more detailed and highlights the innovative character of this research. Against this background, I would like to recommend this version of the manuscript for publication in IJGI.

Comments on the Quality of English Language

Minor editing of English language is required

Author Response

Dear Reviewer,

We would like to extend our sincere gratitude for your thorough and insightful review. Your constructive feedback and rigorous evaluation have greatly contributed to the improvement of the research.

We greatly appreciate your valuable comments, which guided us in revising the manuscript comprehensively. We are particularly pleased to hear that the enhanced discussion now better highlights the innovative aspects of our research. Once again, we want to express our gratitude for your time, effort, and expertise in reviewing our manuscript. Your feedback has been invaluable in shaping this research into its current form. 

Sincerely,

Homeyra Mahmoudi, Prof. Silvana Camboim, Prof. Maria Antonia Brovelli

Reviewer 4 Report

Comments and Suggestions for Authors

Thank you for addressing most of my comments; the revised version is indeed much more readable. Here are the remaining ones:

L79: there seems to be an extra "is" in the sentence

L85 and 86 are duplicated

Figures 1 and 2: Constraints are not explained or mentioned in the text. Please make the figures consistent; for example, check the spelling of subactions. It is also unclear why subactions have a grey area in one of the images.

L 101: free? in which sense, it's commercial, no? 

L103: has been facilitated - weird expression

L113: please standardize spelling of PlanGraph across the manuscript

Figure 3: in the caption you have model, but isn't this a plan?

L134-135: Please reread and correct the sentence grammar

L136: what is "diverse public"? In line 156 you talk about researchers, so obviously, it's not diverse public. Why don't you specify which channels you used for the survey? There is obvious bias in who responded, which is expected, but you need to be clear about it.

Table 5: "Most verb"? Please use upper/lower case consistently

L232: some grammar issue in this sentence

L247: no need to repeat the shortcuts, just specify them once for both trees

L259: seems repetitive

L285: Simplify the sentence, sounds awkward

L316-318: Simplify the sentence, sounds awkward

L340: access it

L344: looking at the source code, the mentioned scripts are part of the non-open source code, so you should make clear which parts are part of BStreams and what the open source code is actually doing.

L383: reread the "in contrast" sentence and fix it

L384: cluster radius? 

L391-392: I don't understand what you are saying here

L393-394: Again, this is unclear

L422: perhaps 'research domain' is better

L455: innovative

L460-464: this paragraph is not very convincing and not very well written. Be more specific, and don't say things like 'tangible'. Which gaps have you addressed specifically?

 

 

Comments on the Quality of English Language

Regarding English, please see my other comments.

Author Response

Please see the attachment.

Author Response File: Author Response.docx
