Review Reports - Privacy-Aware Visualization of Volunteered Geographic Information (VGI) to Analyze Spatial Activity: A Benchmark Implementation

Round 1

Reviewer 1 Report

A well-written paper with an interesting method which has been tested with data and scenario analysis.

I only have a couple of minor suggestions:

Figure 4: use 'a' and 'b', or 'right' and 'left', in the caption to differentiate the two parts of the figure.
The conclusion can be expanded to better highlight the findings of the paper.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Good job. I would like some comment on this proposal to interact with anonymous users, for example to seek a convergence of opinions ... (Di Zio, S., Castillo-Rosas, J. D., & Lamelza, L. (2017). Real Time Spatial Delphi: Fast convergence of experts' opinions on the territory. Technological Forecasting and Social Change, 115, 143–154. https://doi.org/10.1016/j.techfore.2016.09.029).

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

This paper presents a new approach that uses HLL technology as the basis for developing privacy-aware visualization of data collected from VGI. As the volume of VGI has increased dramatically in the past few years, the privacy issue indeed demands more attention.
As the term “privacy-aware” is frequently used, a formal definition should be given.
As the major argument concerns VGI and natural resource management, a more in-depth discussion of the fundamental characteristics (e.g., properties, limitations) should be provided. How this can be linked to the test data chosen should also be explored.
As there have been various approaches that try to provide privacy protection, comparing the outcomes of these approaches, especially with the outcomes of the approach proposed in this paper, would be helpful for evaluating the possible future development of the proposed approach, e.g., by highlighting the contribution of using less data storage to “approximately represent” the original dataset based on selected quantitative measures.
The selected metrics (post count, user count, posts (photos) per day, and quasi-UID) should be clearly defined. If possible, I would also suggest explaining the data in more detail, e.g., the geographic distribution of the test data (not only the number of geotagged photos).
LINE 241: this should be Section 3 (Figure 1?).
Between LINE 289 and 297, there is a lot of discussion regarding the parameters that affect the performance. What is the status of these parameters in the experiment?
Please explain how “the coordinates and user days are measured by concatenation” in LINE 325-326.
Since the grid size parameter (100km and 50km in this paper) has an impact on the outcomes, how is the selected grid size (granularity) justified?
In the “Sandy” case, how does the proposed approach determine the additional query results with direct database access? What if there is no post from Alex on May 9, 2012? How is the 100km grid bin related to the query of “Berlin” and “San Francisco”?

Author Response

Please see the attachment.

Author Response File: Author Response.docx