Group-Privacy Threats for Geodata in the Humanitarian Context
Round 1
Reviewer 1 Report
This is a really interesting and important concept, worthy of exploration. These topics and concepts clearly have great value as a thought problem and literature review.
I'd like to see more specific detail and suggested methods to carry out the triage process. The authors make a good case for this as an important issue and successfully argue for a more careful approach to ensure group privacy and appropriate data use.
However, there is no clear suggestion of a process or rubric to accomplish this. The authors acknowledge that these decisions may be fluid and the assessment would likely vary by the individual making the judgement during triage -- which I found completely unsatisfying.
As I was reading the manuscript, I was quite excited by this important and very valid concept, and then in the discussion and conclusions sections, was left without any clear direction from the authors or really any sense the authors deeply explored and tested methods to do such triage as opposed to just addressing these concepts conceptually.
There are issues with English use, structure and phrasing throughout. While I did not have time to identify all of these issues, I have highlighted several examples in the earlier portion of the manuscript (listed below) which should help the authors when making revisions throughout.
Most of these issues relate to passive voice, wordiness, incorrect words/slang or other minor structural issues. None of these are deal-breakers, but a careful review and edit of the language will strengthen the paper.
Examples of language/use edits:
Line 37 whose debates have emerged relatively recently [passive voice] consider rephrasing language to active voice: a debate that has recently arisen,
This sentence also needs a citation, since you state it is a recent debate but give no example(s).
Line 39 but affects groups as well. [wordy] rephrase as: but also affects groups.
Line 41 more on the individual [wordy] delete: more
Line 55 call to also consider [wordy] delete: also
Line 57 “group privacy is an underrated but worth studying problem”. [is this the actual quote? Usage is incorrect] Consider removing the direct quote and rephrase as: group privacy is an underrated problem worthy of study.
Line 57 Group privacy has also recently attracted attention in civil societies [wordy] consider rephrasing as: Recently, group privacy has attracted attention in civil societies
Line 81 The main objective of this paper… [papers do not have objectives, people do] consider rephrasing as: Our objective in this paper…
Line 87 used within the context of [wordy] consider rephrasing as: used in the context of…
Line 113 Discussion on group privacy [incorrect use] should be: Discussion of group privacy
Line 265 Majorly [slang] this should not be used in formal writing, delete
Line 507 trump [should be third-person singular] trumps
Lines 519-561 The entire discussion section is presented as one very long paragraph; this should be split into multiple paragraphs.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
This is a very interesting and socially relevant submission. It covers an area that should be more studied by the geospatial community. It is well and clearly written and provides a good systematisation of the major issues involved in group privacy in the context of geospatial data.
I have to say that I was a bit disappointed after reading the text, as it does not develop the case for geospatial data and even less for the humanitarian community. It stays at the abstract level. Yes, there are a few examples with geospatial data, but nothing that does not happen with all types of data.
Please see a few comments below:
- There is a need to define group privacy and clearly distinguish it from personal privacy.
- No related work is described, i.e., what has been done (even if not much) in this area.
- Very concentrated on remotely sensed imagery, not considering other existing datasets. I acknowledge the very high relevance of images in disaster management, but there are other types of datasets that are also relevant, such as hydrography, topography and land use, just to mention a few examples.
- The problems posed in terms of group privacy are those known to automatic classifications of aerial images. It seems quite obvious that there will be biases that are solely related to the weaknesses of the classification methods (AI based or not). These errors can be introduced on purpose but that's normally not the case.
- Although the authors discuss AI biases in classifying images, known to exist, for example, in recruitment processes (as mentioned by the authors), the case for non-neutral classification in the building example is not clear. For example: "What are the biases presented in Figure 2? Why are they not neutral?"
- In the unlikely situation where the humanitarians do the training by themselves, the problem (bias) lies in what they decide to use as the training dataset and not in the use of the image in itself. It would still be a bias provoked by the human/algorithm tuple and not by the data.
- The triage idea is interesting but needs to be more thoroughly adapted to this specific case, perhaps with a few use cases.
- The submission needs to make a clear case for geospatial data and the humanitarian context.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 3 Report
While I think the manuscript is a significant contribution, I think some "honing" of the presentation is called for. Starting with the title, I would suggest thinking of "triage" as a more relevant concept and more in line with the exploration. In the same vein, I think this "triage" should be far more central in the introduction.
Otherwise, I think the conclusion could speak more to the problem of using privacy, which is culturally and legally connected to individuals, as a concept for data-centric work on disaster applications, when saving lives will almost always trump privacy, but the data you discuss has little information about individuals. Perhaps considering demographic and individual data is a matter for future research?
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Figure 3 is confusing. The flow chart has multiple inputs (no clear start point) and no end point/action. I recommend reworking this flowchart to begin with a "Data Set" as the input and then list actions/decision-points on the arrows (e.g. YES/NO, LOW/HIGH RISK or ??) and then explain this more clearly and completely in the caption and text. For example, context analysis is clearly a crucial step (blue box), but there is no description of what would happen here, and a search of the manuscript reveals no section explicitly covering this topic and how one would go about it. (You do talk around the issue in other sections on triage classification, but I don't have a good sense of how one would actually conduct the context analysis as this is currently presented.)
Overall the manuscript raises important issues and the need for future work to clarify procedures, metrics and methods to assess these threats to privacy. Balancing the need to provide support during humanitarian crises against the long-term impacts on data privacy is something we must consider both in the use of remotely sensed geospatial data and in the use of traditional geospatial data collected in situ by governments, non-profits and others involved in responding to these crises.
The manuscript is much improved; however, there are still areas that are wordy and could be streamlined. I'd recommend having someone give this a secondary edit to further enhance and strengthen the overall presentation.
battlefield is one word (correct on 494, 500)
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
I find the answers to reviewers satisfactory.
Author Response
We would like to thank the reviewer for their feedback.