Peer-Review Record

Contextually-Based Social Attention Diverges across Covert and Overt Measures

by Effie J. Pereira 1,*, Elina Birmingham 2 and Jelena Ristic 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 30 January 2019 / Revised: 27 May 2019 / Accepted: 30 May 2019 / Published: 10 June 2019
(This article belongs to the Special Issue Visual Orienting and Conscious Perception)

Round 1

Reviewer 1 Report

This paper investigates, across two studies, whether there is a behavioral response bias or a gaze bias toward a social (face) stimulus, compared to a non-social (house) stimulus, when controlling for critical internal features (e.g., luminance, attractiveness, "facial expression"). The approach of controlling for these features is highly welcome, and the method and results are sound.

However, I have a few remarks on the framing of the study:

- I do not understand the choice of title; could it be brought closer to the results? Otherwise it may be hard for interested readers to find the study, and hard for readers who see only the title to grasp its essence.

- Specifically, I think the dissociation between the behavioral and ocular responses should be put in the foreground and addressed and discussed more clearly. The eye-movement results show a clear social bias, which should not be discussed away. But what does the dissociation really tell us about social perception? (Regarding the title: what should "modulation" vs. "determination" mean here?)

- Describing the features controlled for as "background context" or "visual context" puzzled me; are these not rather low-level, physical features? Is this terminology common in the field?

- It is a common contrast to compare faces and houses, but I still consider this contrast very odd and something that should really be improved in these designs. First, the shapes and organization of the stimuli are very different; faces may be much more interesting due to their less organized, less geometrical, and more variable internal structure. This may also explain the results for the inverted faces in the oculomotor data and could be discussed in more detail.

- At this point, the attractiveness rating is kind of odd. This really is comparing apples with pears, I would say. This should at least be addressed in the discussion.

Author Response

We would like to thank the reviewer for their thoughtful comments and suggestions. As we detail in the response letter attached, we have addressed all comments that were raised and have edited the manuscript accordingly (all major changes are highlighted in blue).

Author Response File: Author Response.pdf

Reviewer 2 Report

In this paper, Pereira et al. report two experiments aimed at investigating whether faces and facial features attract covert and overt attention more than houses do when low-level features are controlled for and the faces/houses are presented within a visual context. Pereira et al. conclude that "background contextual information does not determine spontaneous social attention biasing in manual measures, although it may act to facilitate oculomotor behavior". At present, the paper is adequately written and the findings are adequately presented. However, I have seven major concerns with the manuscript. I am confident that four of these concerns can be addressed in a major revision. I am not sure whether the final three can be addressed, but I would like to offer the authors a chance to do so.

1. Introduction:
In the introduction, the authors focus heavily on how faces may or may not attract and maintain attention. However, in the methods and results, the focus is much more on the eyes. The "face" actually does not exist as an ROI; it seems that, implicitly, the sum of the eyes and mouth is taken as the face. Please focus the introduction more on the eyes and mouth if this is the goal of the study; otherwise, focus the method and results more on the face as a whole.

2. Statistics:
Throughout the manuscript, the authors want to show how the eyes do not attract attention automatically. As such, the statistical analyses should be geared towards this, and standard repeated-measures ANOVAs are not the right tool. I suggest that the authors use Bayesian analyses to quantify the evidence in support of the hypothesis that there is no difference in manual/eye-movement reaction times between cues.

For the current analyses, it is unclear whether tests for sphericity were conducted. I expect this assumption to be violated for the proportion of saccades measure.
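For concreteness, such a Bayesian comparison and a sphericity check might look as follows. This is a minimal sketch in Python using the pingouin package; the file and column names are hypothetical, not taken from the manuscript:

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data (file and column names are illustrative):
# one row per participant x cue type, with the mean manual RT per cell.
rts = pd.read_csv("manual_rts.csv").sort_values("subject")
face = rts.loc[rts.cue == "face", "rt"].to_numpy()
house = rts.loc[rts.cue == "house", "rt"].to_numpy()

# Bayesian paired t-test: a BF10 well below 1 (e.g., < 1/3) quantifies
# evidence FOR the null of no RT difference between cue types.
print(pg.ttest(face, house, paired=True)["BF10"])

# Mauchly's test of sphericity for a within factor with 3+ levels, e.g.
# the proportion of first saccades directed at each ROI.
sacc = pd.read_csv("saccade_props.csv")  # columns: subject, roi, prop
print(pg.sphericity(sacc, dv="prop", within="roi", subject="subject"))
```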

3. Graphs:
The graphs are not informative at present. The authors conduct a repeated-measures ANOVA but display group averages in a bar plot with CIs over the group mean. I would urge the authors to consider alternative visualisations that capture the consistency across the different conditions, e.g., by depicting individual data (a sketch follows the reference below). See for inspiration, e.g.:
Rousselet, G. A., Pernet, C. R., & Wilcox, R. R. (2017). Beyond differences in means: robust graphical methods to compare two groups in neuroscience. European Journal of Neuroscience, 46(2), 1738-1748.
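One way to depict individual data is to connect each participant's paired condition means and overlay the group means. A minimal matplotlib sketch with simulated values; all numbers and labels here are illustrative, not the study's data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulated per-participant condition means (purely illustrative).
rng = np.random.default_rng(1)
face = rng.normal(420, 30, size=24)
house = face + rng.normal(5, 15, size=24)

fig, ax = plt.subplots()
for f, h in zip(face, house):
    ax.plot([0, 1], [f, h], color="grey", alpha=0.4)   # one line per participant
ax.plot([0, 1], [face.mean(), house.mean()],
        color="black", lw=3, marker="o")               # group means
ax.set_xticks([0, 1])
ax.set_xticklabels(["face cue", "house cue"])
ax.set_ylabel("Manual RT (ms)")
plt.show()
```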

4. Sloppy language:
I have major problems with how a lot of terminology is used. This applies particularly to the terms “attention biasing” or “oculomotor biasing”.
- What exactly do the authors mean by "(social) attentional biasing"? (e.g., lines 91, 94). Please define attention and bias, because it seems these terms are used to indicate the same thing (namely that one spatial location is favoured as the locus of some covert or overt process).
- Line 47. “Facilitated” sounds like there is a process behind it. Please stick to the outcome, e.g. a quicker response.
- Line 57. “faster saccades”. This is wrong. Saccades are not faster, the peak velocity remains the same. The saccade occurs earlier.
- On line 138/142, the authors write that a factor (Target location or identity) manipulated something… These factors do not manipulate anything. This is what the authors do.
- Line 171, what is “performance accuracy”? I can understand either term, but not the combination.
- Line 192-192, what is the “proper amount of alertness”. Without a definition of alertness and a benchmark for “proper”, this sentence is incomprehensible.
- Line 230: It is not that 17% of trials are biased. The bias is the fact that in 17% of trials the saccade is directed toward the eyes, and that this percentage is more than the percentage of saccades to the other ROIs.
- What is a "breakaway"? This again implies some kind of process. Please stick to the description of the behavior (an eye movement away from the fixation cross). See also line 300, where the authors talk about a "breakaway bias". These terms complicate the manuscript unnecessarily.
- Line 233-234, please stick to “eye movements” instead of “oculomotor movements”.
- Line 317-318, what is oculomotor preference? And how does it differ from “overt attentional biases”? Again, stick to the description of the measured behavior (differences in manual reaction times, or proportion of saccades directed at ROIs).
- Line 332-333, what is meant by "the fleeting nature"?
- Line 366 Overt measures do not produce effects.
- Line 377 What is meant by "reliable orienting"? I find that the entire text from line 370 to line 378 is unclear.
- Line 379 “Overt data” does not exist.

5. Controlling for luminance:
The authors state in the methods section of experiment 1 that "face and house cues were matched for average luminance". This is only true if the authors used a linearised screen. Was that the case?
What is more important is that average luminance is not very informative about the visual content of an image. A uniform grey image can have exactly the same average luminance as a black-and-white striped pattern, yet the luminance contrast in these two cases is highly different. Did the luminance (Michelson) contrasts differ between the two categories?
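For reference, Michelson contrast is defined as (Lmax − Lmin) / (Lmax + Lmin). A minimal sketch of the computation on a greyscale image array (the function name is mine, not from the manuscript or the SHINE toolbox):

```python
import numpy as np

def michelson_contrast(img: np.ndarray) -> float:
    """Michelson contrast of a greyscale image: (Lmax - Lmin) / (Lmax + Lmin)."""
    lmax, lmin = float(img.max()), float(img.min())
    return (lmax - lmin) / (lmax + lmin)

# A uniform grey image has contrast 0; a full black-and-white pattern has 1,
# even though both can share the same average luminance.
print(michelson_contrast(np.full((10, 10), 128.0)))        # -> 0.0
print(michelson_contrast(np.tile([0.0, 255.0], (10, 5))))  # -> 1.0
```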

6. Controlling for internal configuration:
The authors state that the faces and houses have been matched for the internal configurations. From the example in Figure 1, I do not see how this has been achieved. The face has two eyes (top) and a mouth (bottom). The house has two windows (top) and a window and door (bottom). This needs to be worked out more thoroughly.

7. No direct comparison context vs. no context:
The authors state that “added contextual information may have exerted modulating effects…”. This conclusion is based not on the present study, but on the present study in combination with a previous study (Pereira et al., 2019). In fact, there is no direct comparison of visual context vs. no visual context for the faces and houses. How can the authors be sure that their conclusions hold? This is especially pressing since the authors start their discussion with “The present study examined how background context influenced …”, which implies that the no visual context / visual context comparison is essential.

Minor comments:

Line 30. “Our” ability. Who is referred to? The authors, or humans?
Line 97. Please define “natural oculomotor behavior” or replace.
Line 116. "Positioned at fovea". This is incorrect; the fixation cross is positioned in the center of the screen.
Line 117. The size of the cues (4.2 by 6 degrees) does not match Figures 1 and 2. Is this correct?
Line 119. “Fixation” -> “Fixation cross”.
Line 188. The first sentence surprises me. In the previous paragraph I read that there is no facilitation, only to read about significant main effects here. Please rephrase.
Line 217. What is meant by a "transient effect" specifically?
Line 222. Remove “once again”.

Author Response

We would like to thank the reviewer for their thoughtful comments and suggestions. As we detail in the response letter, we have addressed all comments that were raised and have edited the manuscript accordingly (all major changes are highlighted in blue).

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

I thank the authors for the revisions that have been made. I especially applaud the new visualisations, which capture the between-participant variance really well. All but one of my questions have been adequately answered.

One question, however, has not been answered, and it is an important one with regard to the authors' claim to have controlled for "low-level differences" or stimulus properties. This is my previous question about the linearised screen (point 5 in my initial comments). The authors claim on page 7 that "Along with size and distance from the fixation cross, the face and house cues were matched for average luminance (computed using the MATLAB SHINE toolbox [73])". In addition, the authors also report Michelson contrasts.
However, it seems that these measures were computed from the per-pixel values (i.e., 0 for black, 255 for white, or between 0 and 1, as the authors seem to use) of the various images. This assumes that the screen used in the experiment has a linear relation between pixel value and luminance. For most screens this is NOT the case out of the box (a linear mapping looks awful to humans, so screens and software are optimised to look good to us). Screens therefore have to be linearised before making claims about luminance and/or contrast. If a screen is NOT linearised, the differences in luminance are generally much smaller in the black-to-dark-grey part of the range than in the light-grey-to-white part. As such, the luminances or Michelson contrasts as measured from the screen may differ substantially from those computed directly from the image pixel values. My short investigation of the SHINE toolbox leads me to conclude that it does not compute luminance and/or contrast from screen properties. I would therefore like the authors to clarify whether a linearised screen was used and, if not, to measure the luminance of the screen at each grayscale value and redo the calculations based on the obtained luminance values.
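To illustrate the suggested correction: one could fit a gamma curve to photometer readings taken at a set of grayscale levels and recompute contrast in measured luminance units. A sketch with made-up readings (the luminance values below are invented for illustration; scipy's curve_fit performs the fit):

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up photometer readings: screen luminance (cd/m^2) at grayscale levels.
levels = np.array([0, 32, 64, 96, 128, 160, 192, 224, 255], dtype=float)
lum = np.array([0.4, 1.1, 3.2, 7.5, 14.0, 23.5, 36.0, 52.0, 71.0])

# Fit L(p) = L0 + A * (p / 255) ** gamma, the usual display nonlinearity.
def gamma_fn(p, L0, A, g):
    return L0 + A * (p / 255.0) ** g

params, _ = curve_fit(gamma_fn, levels, lum, p0=[0.5, 70.0, 2.2])

def pixel_to_luminance(img):
    """Map 8-bit pixel values to estimated on-screen luminance."""
    return gamma_fn(np.asarray(img, dtype=float), *params)

def michelson(l):
    return (l.max() - l.min()) / (l.max() + l.min())

# Contrast computed on pixel values vs. on luminance can differ substantially:
img = np.array([[64.0, 192.0]])
print(michelson(img), michelson(pixel_to_luminance(img)))
```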

Author Response

Point 1. [The reviewer's Round 2 comment on screen linearisation, reproduced verbatim from the report above.]


Response: We thank the reviewer for clarifying this point. Although we did not use a linearized monitor, we verified with a Datacolor Spyder3Pro colorimeter that the image pixel values accurately reflect the luminance values displayed on screen. We report this point in Footnote 1 on pages 7-8.


Author Response File: Author Response.pdf
