Brief Report
Peer-Review Record

Visual Search Asymmetry Due to the Relative Magnitude Represented by Number Symbols

by Benjamin A. Motz 1,*, Robert L. Goldstone 1, Thomas A. Busey 1 and Richard W. Prather 2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 30 June 2021 / Revised: 7 September 2021 / Accepted: 15 September 2021 / Published: 17 September 2021

Round 1

Reviewer 1 Report

Motz and colleagues present a series of experiments suggesting a potential effect whereby numerical symbols representing a larger quantity are found more quickly among numbers representing a smaller quantity than when the target and distractor symbols are reversed. This suggests an effect of conceptual magnitude on perceptual processing, an intriguing proposal. Overall, I enjoyed reading this paper and am mostly convinced by their arguments. The paper has a number of strengths; it’s well-written, for one, and I appreciated their inclusion of an experiment that did not support their hypothesis, and their words of caution that the effect may not be very robust. Although I am favorably disposed toward the paper overall, I have some relatively minor comments and suggestions that are outlined below.

  1. The author list at the top includes three authors (Motz, Goldstone, & Busey), but then an ORCID ID is provided for an additional author (Prather). 
  2. The authors note in the abstract that in one experiment the pattern was “diminished.” I am not terribly comfortable with that word since it implies that the pattern was still present, whereas, it seemed to me, there was essentially no evidence of the effect in that experiment (perhaps the authors are basing the wording off of a significant t-test, but see my concerns about this t-test in comment # 14).
  3. I thought the Introduction was excellent. It clearly and efficiently presents the literature and sets the stage for the experiments to come.
  4. The sample sizes for the experiments seemed rather small to me (e.g., n=12 in Experiment 1), and it wasn’t clear how they were decided upon. It would be great if the authors could provide some rationale.
  5. In most of the experiments, there was no effect on error rates. I don’t think this is a major problem, but I think the authors should address why they would find an effect on RTs but not accuracy (and to what extent this weakens their argument). 
  6. For all ANOVAs, I recommend that the authors list the factors, how many levels were within each factor, and whether factors were within- or between-subject, just to make things explicit.
  7. Relatedly, the authors should be explicit about how they analyzed error rates. I assume these analyses were also ANOVAs but this should be made clear. 
  8. Remaining on the subject of analyses, I noticed that target-present and target-absent trials were included in the ANOVA for Experiment 1. I found myself wondering throughout the manuscript whether there’s an effect of magnitude on the target-absent trials. Should it be faster to search through 5s to determine the absence of 2s than vice versa? Or perhaps the opposite? Regardless, target-absent vs. target-present trials seem very different to me, at least in terms of an effect of magnitude, and I wonder if target-absent trials should have their own, separate analysis. 
  9. I’m not sure I fully understand the logic of the set size analysis in Experiment 3 in regards to disentangling effects of target acceptance vs. distractor rejection. In the prologue to Experiment 3, the authors say that if the effect is based on distractor rejection, there should be no advantage for the larger quantity target with small set sizes, but in the results they mention that there was “no reversal” of the effect at smaller set sizes, so it’s not clear to me whether the prediction would be no effect or a reversal of the effect with smaller sets. 
  10. In Experiments 3 and 4, the authors cite Tong and Nakayama (1999) for their decision to only include target-present trials in the analysis, but it’s not clear what the logic is, especially since they included target-absent trials in the ANOVAs for the earlier experiments. It would be great if the authors could provide the rationale and why it’s different between experiments.
  11. There are a couple of times when the authors claim that an effect is “marginally significant” when p < .05 (p = .041 and .033). Are the authors using a different alpha value to determine significance than the typical .05? If so, what is it? I was particularly confused because there are other analyses where the p value is pretty similar (e.g., .027), where the authors do not claim that the effect is marginal. It comes across as arbitrary.  
  12. Relatedly, the claim that the magnitude effect is due to target salience seems shaky to me. There was an interaction between set size and target size in Experiment 3 (although the authors say it was only marginally significant, in one example of the concern raised in the previous comment), and while the authors claim that the magnitude effect was maintained at smaller set sizes (last full paragraph of Experiment 3 results/discussion), I don’t see a statistical test of that. It looks to me from Fig. 3 that t-tests of set sizes 2 and 3 might find no significant effect of magnitude. 
  13. I would encourage the authors to include a figure of the data in Experiment 4, for the sake of transparency. 
  14. The authors include a t-test of set size 6 in Experiment 4 that to me is inappropriate. In addition to looking at a single set size, despite the absence of a main effect or interaction, the authors included both target-present and target-absent trials (whereas target-absent trials were not included in the ANOVA), and the t-test was one-tailed. These decisions seem engineered to find something significant, and I would recommend removing the analysis. 
  15. The presence of the just-mentioned one-tailed t-test led me to wonder whether the t-tests included in the other experiments were one-tailed or two-tailed. I assume two-tailed, but it would be best to be explicit. 

Author Response

Index / Comment / Response

R1.1

The author list at the top includes three authors (Motz, Goldstone, & Busey), but then an ORCID ID is provided for an additional author (Prather).

This appears to be a typesetting error that occurred when our submission was transitioned into MDPI format for review.  We have now corrected it in the revision.

R1.2

The authors note in the abstract that in one experiment the pattern was “diminished.” I am not terribly comfortable with that word since it implies that the pattern was still present, whereas, it seemed to me, there was essentially no evidence of the effect in that experiment (perhaps the authors are basing the wording off of a significant t-test, but see my concerns about this t-test in comment # 14).

We intend no discomfort!  We have replaced the word “diminished” with “not evident.”

R1.3

I thought the Introduction was excellent. It clearly and efficiently presents the literature and sets the stage for the experiments to come.

Thank you very much for this comment. 

R1.4

The sample sizes for the experiments seemed rather small to me (e.g., n=12 in Experiment 1), and it wasn’t clear how they were decided upon. It would be great if the authors could provide some rationale.

Our sample sizes are not justifiable.  In full disclosure, we started running these experiments in 2012, and these sample sizes were characteristic of norms in psychophysics research at the time.  Given that we lack a rationale (other than 10-year-old norms), we have refrained from inserting one.

R1.5

In most of the experiments, there was no effect on error rates. I don’t think this is a major problem, but I think the authors should address why they would find an effect on RTs but not accuracy (and to what extent this weakens their argument).

We have added a sentence noting that error rates are generally very low in visual search experiments and that the absence of differences in error rates suggests that the effects are not caused by changes in participants’ decision criteria.

R1.6

For all ANOVAs, I recommend that the authors list the factors, how many levels were within each factor, and whether factors were within- or between-subject, just to make things explicit.

Good points.  We have added these details to Experiment 1, and referenced this design in the subsequent analyses.

R1.7

Relatedly, the authors should be explicit about how they analyzed error rates. I assume these analyses were also ANOVAs but this should be made clear.

R1.8

Remaining on the subject of analyses, I noticed that target-present and target-absent trials were included in the ANOVA for Experiment 1. I found myself wondering throughout the manuscript whether there’s an effect of magnitude on the target-absent trials. Should it be faster to search through 5s to determine the absence of 2s than vice versa? Or perhaps the opposite? Regardless, target-absent vs. target-present trials seem very different to me, at least in terms of an effect of magnitude, and I wonder if target-absent trials should have their own, separate analysis.

We wondered the same, and indeed, this was the motivation for Experiment 3, which merits some clarification (see comment R1.9, below, and our response).

R1.9

I’m not sure I fully understand the logic of the set size analysis in Experiment 3 in regards to disentangling effects of target acceptance vs. distractor rejection. In the prologue to Experiment 3, the authors say that if the effect is based on distractor rejection, there should be no advantage for the larger quantity target with small set sizes, but in the results they mention that there was “no reversal” of the effect at smaller set sizes, so it’s not clear to me whether the prediction would be no effect or a reversal of the effect with smaller sets.

We have clarified our language in this prologue.  Indeed, our use of the term “reversal” was inconsistent with the way we motivated Experiment 3.  A better way to describe this analysis is to ask whether the asymmetry was completely eliminated, which it was not (see also R1.12).

R1.10

In Experiments 3 and 4, the authors cite Tong and Nakayama (1999) for their decision to only include target-present trials in the analysis, but it’s not clear what the logic is, especially since they included target-absent trials in the ANOVAs for the earlier experiments. It would be great if the authors could provide the rationale and why it’s different between experiments.

We have clarified this sentence in Experiment 3 to add additional rationale.

R1.11

There are a couple of times when the authors claim that an effect is “marginally significant” when p < .05 (p = .041 and .033). Are the authors using a different alpha value to determine significance than the typical .05? If so, what is it? I was particularly confused because there are other analyses where the p value is pretty similar (e.g., .027), where the authors do not claim that the effect is marginal. It comes across as arbitrary. 

We appreciate Reviewer 1’s perspective.  However, we, too, consider the conventional .05 alpha threshold to be rather arbitrary, and also contentious (see Benjamin et al., 2018, “Redefine statistical significance” in Nature Human Behaviour).  Our reporting takes care with p-values that fall particularly close to .05, and we believe this to be merited (see also R2.5).  Throughout the manuscript we also refrain from making strong claims on the basis of the present results, we present all findings without censorship, and we highlight the limitations of our findings. 

R1.12

Relatedly, the claim that the magnitude effect is due to target salience seems shaky to me. There was an interaction between set size and target size in Experiment 3 (although the authors say it was only marginally significant, in one example of the concern raised in the previous comment), and while the authors claim that the magnitude effect was maintained at smaller set sizes (last full paragraph of Experiment 3 results/discussion), I don’t see a statistical test of that. It looks to me from Fig. 3 that t-tests of set sizes 2 and 3 might find no significant effect of magnitude.

A common misconception is that the absence of a statistically-significant effect is evidence of the absence of an effect.  However, we have now conducted the requested analysis.  In this particular case, the difference between finding numerically-larger targets and numerically-smaller targets is statistically significant for set sizes 2 and 3 (combined).  We have added this post hoc contrast.

 

R1.13

I would encourage the authors to include a figure of the data in Experiment 4, for the sake of transparency.

We have now added this figure.  See also comment R2.7.

R1.14

The authors include a t-test of set size 6 in Experiment 4 that to me is inappropriate. In addition to looking at a single set size, despite the absence of a main effect or interaction, the authors included both target-present and target-absent trials (whereas target-absent trials were not included in the ANOVA), and the t-test was one-tailed. These decisions seem engineered to find something significant, and I would recommend removing the analysis.

This is a fair criticism, and this analysis has been removed.

R1.15

The presence of the just-mentioned one-tailed t-test led me to wonder whether the t-tests included in the other experiments were one-tailed or two-tailed. I assume two-tailed, but it would be best to be explicit.

Yes, two-tailed all around (except for the analysis mentioned in R1.14, which has now been removed).  We have added clarification in Experiment 1.

 

Reviewer 2 Report

Comments on the study entitled “Visual search asymmetry due to the relative magnitude represented by number symbols” (ID vision-1300916).

The authors extend our knowledge of numerical cognition by providing insight into a visual search asymmetry: stimuli that symbolically represent a large numerosity are identified more quickly than stimuli that represent a small numerosity. These asymmetric biases were found irrespective of the orientation or the frequency of the stimuli. The authors suggest that the magnitude represented by a number symbol can affect visual search.

I find the study clear and well designed. In my view, the manuscript in its present form is nearly ready to be considered for publication. The study would be a valid contribution to the field. I have some minor comments on the statistical approach. Please find my comments below; they apply to the descriptions of all experiments.


Statistical analysis:
- The authors should report which statistical software was used to compute the analyses. This information need only be described in the “2.2 Results and Discussion” section, since the same approach is used in the subsequent experiments.

- The authors should report more detail on the adopted analytical approach. I would recommend clarifying which main effects and interactions were considered in the analysis.

- In most cases, the values of the “reaction time” variable presented a right-skewed distribution. Did the authors perform a data transformation (e.g., a log transformation)? Did the authors check the assumptions of the regression analysis?

- Planned comparisons should be performed in the case of a significant interaction between the main effects of interest. For example, I focused on Experiment 1 (Section 2.2). Although the figure is quite clear, I cannot find which interactions were considered in the analysis. As suggested in the second comment, the authors should clarify the structure of their regression analysis. Please provide better clarification and motivation for the planned comparisons in each experiment.

- In Experiment 2 (Section 3.2), the results did not reject the null hypothesis (i.e., no difference in visual search time due to the nature of the stimulus), which is the result expected by the authors. To support their non-significant results, the authors should consider reporting an index, for example an approximate Bayes Factor, to compare models with and without the effect of interest (i.e., the nature of the stimulus). Please consider the following article:
Wagenmakers, E. J. A practical solution to the pervasive problems of p values. Psychon. Bull. Rev. 14, 779–804. https://doi.org/10.3758/BF03194105 (2007).

- In Experiment 3 (Section 4.2), the interaction between set size and target size is significant because of the slopes of the two regression lines. However, this side effect does not influence the main results.

- The figure for the results of Experiment 4 is missing.

- Finally, the description of the statistical analyses is redundant across experiments. I think the results could be considerably condensed by tabulating the statistics.

Author Response

Index / Comment / Response

R2.1

The authors should report which statistical software was used to compute the analyses. This information need only be described in the “2.2 Results and Discussion” section, since the same approach is used in the subsequent experiments.

We have now noted our use of IBM SPSS for all statistical analysis.

R2.2

The authors should report more detail on the adopted analytical approach. I would recommend clarifying which main effects and interactions were considered in the analysis.

We appreciate this comment and have added more detail (see also R1.6 and R1.7, above).

R2.3

In most cases, the values of the “reaction time” variable presented a right-skewed distribution. Did the authors perform a data transformation (e.g., a log transformation)? Did the authors check the assumptions of the regression analysis?

Two points of clarification are merited here.

 

First, this comment mentions “reaction time” (in quotes), but the phrase “reaction time” never appears in our article.  In formal psychophysics parlance, “reaction time” refers to the amount of time to detect the presence of a stimulus.  The phrase “response time” refers to the amount of time to detect the presence of a stimulus, and to select and execute a response.  Raw response time data are, indeed, right-skewed (positively skewed), but not quite as dramatically as raw reaction time data.

 

Second, we never attempt to model raw response times.  All analyses are conducted on participant averages, which are not skewed.  No data transformation is necessary in this case. 
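To make this aggregation step concrete, a minimal sketch follows (purely illustrative; the column names and values are hypothetical, and this is not our analysis script, which used SPSS — see R2.1):

```python
import pandas as pd
from scipy import stats

# Hypothetical trial-level data: one row per trial, with raw response times in ms.
trials = pd.DataFrame({
    "participant": [1, 1, 1, 1, 2, 2, 2, 2],
    "condition":   ["large_target", "small_target"] * 4,
    "rt":          [512, 598, 487, 640, 533, 571, 509, 615],
})

# Collapse trial-level RTs to one mean per participant per condition.
# These cell means, not the skewed raw RTs, would enter a repeated-measures ANOVA.
cell_means = (
    trials.groupby(["participant", "condition"])["rt"]
    .mean()
    .unstack("condition")
)

# A paired comparison on the participant means (two-tailed, as elsewhere in the manuscript).
t, p = stats.ttest_rel(cell_means["large_target"], cell_means["small_target"])
print(cell_means)
print(f"t = {t:.2f}, p = {p:.3f}")
```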

R2.4

Planned comparisons should be performed in the case of a significant interaction between the main effects of interest. For example, I focused on Experiment 1 (Section 2.2). Although the figure is quite clear, I cannot find which interactions were considered in the analysis. As suggested in the second comment, the authors should clarify the structure of their regression analysis. Please provide better clarification and motivation for the planned comparisons in each experiment.

The structure of the ANOVA design has been clarified. 

R2.5

In Experiment 2 (Section 3.2), the results did not reject the null hypothesis (i.e., no difference in visual search time due to the nature of the stimulus), which is the result expected by the authors. To support their non-significant results, the authors should consider reporting an index, for example an approximate Bayes Factor, to compare models with and without the effect of interest (i.e., the nature of the stimulus). Please consider the following article:

Wagenmakers, E. J. A practical solution to the pervasive problems of p values. Psychon. Bull. Rev. 14, 779–804. https://doi.org/10.3758/BF03194105 (2007).

We agree with Reviewer 2’s suggestion that we should avoid confusing a failure to reject the null hypothesis with evidence for the absence of an effect (as mentioned in our response to R1.12).  However, we have opted against reporting a Bayes Factor.  Bayes Factors provide no new information – there is a mathematical conversion from p-values to BFs that requires no new parameters – so the BF is redundant.  We also feel it would be inconsistent to report a BF for only this one contrast.  Additionally, Figure 2 passes Berkson’s interocular traumatic test: the pattern of differences in Experiment 1 is plainly not present in Experiment 2.  Finally, because all raw data are available without restriction, interested readers are equipped to calculate whatever indices they may desire.
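For reference only (we do not report this index in the manuscript), the approximation the reviewer points to (Wagenmakers, 2007) compares models via their BIC values:

$$
\mathrm{BF}_{01} \approx \exp\!\left(\frac{\mathrm{BIC}(H_1)-\mathrm{BIC}(H_0)}{2}\right),
\qquad \mathrm{BIC}(H_i) = -2\ln\hat{L}_i + k_i\ln n,
$$

where $\hat{L}_i$ is the maximized likelihood of model $H_i$, $k_i$ its number of free parameters, and $n$ the number of observations.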

R2.6

In Experiment 3 (Section 4.2), the interaction between set size and target size is significant because of the slopes of the two regression lines. However, this side effect does not influence the main results. 

This finding has been clarified, and an additional analysis has been added (see also R1.12).

R2.7

The figure for the results of Experiment 4 is missing.

We have now added this figure.  See also comment R1.13.

R2.8

Finally, the description of the statistical analyses is redundant across experiments. I think the results could be considerably condensed by tabulating the statistics.

We appreciate this suggestion, but we are not convinced that readers would favor ANOVA tables, and tabulating the statistics is not our preference. 

 
