1. Introduction
Vision can be roughly decomposed into looking and seeing [
1]. Looking is selecting a fraction of visual inputs for deeper processing, e.g., by directing one’s gaze to a particular location so as to put it into one’s attentional spotlight; and seeing is inferring the visual properties from the selected fraction, e.g., to recognize a face within one’s attentional spotlight. Selection is also called attentional selection, and it can be top-down (also called goal-directed or endogenous), such as when directing one’s gaze to the page of a book in order to read, or bottom-up (also called goal-independent, input stimulus driven, or exogenous), such as when gaze is distracted from the page of the book by a sudden movement in the peripheral visual field [
2, 3]. This paper focuses on exogenous selection and shows that it can be induced by a visual input feature, ocularity, that is often hardly visible to seeing.
In this paper, the strength of a visual location to attract attentional selection exogenously is defined as saliency. For example, a bright spot in a dark field is salient, so is a vertical bar among horizontal bars, a red spot among green ones, or an object moving to the left among many other objects that are static or moving in another direction. Hence, it has been well known that visual locations with a high contrast in visual input features such as luminance or colour, orientation, or motion direction are very salient [
4, 5, 6]. Thus, these visual feature dimensions, colour, orientation, and motion direction, are called basic feature dimensions. In other words, a visual location is salient if, in one of the basic feature dimensions, its visual input feature value has a large spatial contrast from the input feature values at neighbouring locations in the scene. Defining input ocularity as the relative difference between the input to the left eye and the input to the right eye for a visual input item, this paper demonstrates that a contrast in input ocularity also makes a visual location salient. In fact, saliency by ocularity contrast can be no weaker than saliency by orientation contrast, and hence ocularity also constitutes a basic feature dimension.
Reaction time (RT) in a visual search task to find a target is often used to assess saliency [
6], such that a shorter RT is associated with a larger saliency at the target’s location. Traditionally, a spatial contrast in input ocularity has not been thought to induce a high saliency, since ocularity is hardly visible to perception. This invisibility makes it difficult or impossible to search for a target defined only by its unique ocularity, and the RT for this search would be very long or indefinite. For example, an input presented to the left eye only and another input presented to the right eye only differ greatly in ocularity, but perceptually viewers can barely tell the difference between the two inputs if they are otherwise identical (unless their two eyes differ substantially in, e.g., eyesight). In a previous study, Wolfe and Franzel [
7] asked observers to find a monocular white dot (in a dark background) among other white dots presented to the other eye. The target identity would be obvious if observers were allowed to view the visual stimuli by one eye only while closing the other eye. Without this allowance, the input stimuli appeared like that in the image for the fused perception in
Figure 1B, and, consequently, the observers were randomly guessing in their responses. A successful performance in such a search task requires that the observer sees or recognizes the target’s distinctive feature, even though the saliency to be assessed by the RT is associated with looking (by exogenous selection) and not with seeing or recognizing.
Looking and seeing have different roles, and thus they are likely dissociable. If so, then conceivably, the eye-of-origin singleton like that in
Figure 1B could be very salient to attract gaze while its unique eye-of-origin could not be recognized for the search task. To assess its saliency requires a task that does not require observers to recognize the eye-of-origin. For example, one of my previous works [
8] involved searching for a uniquely tilted bar in a background of uniformly oriented bars. As illustrated schematically in
Figure 2, I presented all bars monocularly and found that the RT to find the target bar was shorter when the target had a unique eye-of-origin (in the dichoptic congruent (DC) condition) than when the target shared the same eye-of-origin with all the background bars (in the baseline condition). While observers did not need to recognize the eye-of-origin of the target to perform the task, the target was more salient when its eye-of-origin was unique. I also found that if instead a non-target background bar far from the target had the unique eye-of-origin (in the dichoptic incongruent (DI) condition in
Figure 2), the RT became longer compared to when all bars had the same eye-of-origin [
8]. This suggested that the non-target singleton in eye-of-origin attracted attention away from the target and thereby interfered with the task. A subsequent experiment with eye tracking confirmed this suggestion: in most trials of the DI condition, the gaze was first drawn to the non-target singleton before being directed to the target [
9].
Figure 1 shows an analogy between colour features and ocularity features. A colour singleton in
Figure 1A is analogous to an eye-of-origin singleton in
Figure 1B, and they are both salient. This is so even though the colour singleton is distinctive to perception while the eye-of-origin singleton is not. Extending the analogy to that between
Figure 1C and
Figure 1D, the ocularity singleton in
Figure 1D has a weaker ocularity contrast (from the other input items) than the eye-of-origin singleton in
Figure 1B, just as the colour singleton in
Figure 1C has a weaker colour contrast from the background items than the colour singleton in
Figure 1A. As long as the colour contrast is sufficient, the colour singleton is salient, even though its saliency decreases with decreasing colour contrast. Hence, we expect that the ocularity singleton in
Figure 1D can attract attention as long as the ocularity contrast is sufficient, even though it may be less salient than the eye-of-origin singleton in
Figure 1B. This paper will show that this is indeed the case.
In addition, in a scene of mainly ocularly balanced input items, when one visual item has a large ocular imbalance in its input, i.e., when its inputs to the two eyes differ substantially in input strength, this visual item appears glossy or lustrous and can cause viewers to avoid looking at it. This paper will show that such a perceptual quality of lustre, at the level of seeing rather than looking, could also obscure the behavioural manifestation of the saliency by ocularity contrast. While the invisibility of the eye-of-origin feature obscured the saliency effects of an eye-of-origin singleton in Wolfe and Franzel’s study, visual aversion rather than invisibility obscures the saliency effects by an ocularity singleton due to ocular imbalance.
To reveal saliency effects by ocularity singletons, visual search tasks similar to those in the previous studies [
8,
9] were used. The task was either to search for a bar with a unique orientation in a background of uniformly oriented bars, or to search for a letter ‘T’ among letter ‘L’s. In either task, the observer had to press as quickly as possible a left or right button on a keypad held by both hands, using the left or the right thumb, respectively, to indicate whether the target was in the left or right half of the search array in their perceived visual image or scene. Let $C_L$ and $C_R$ be the input contrasts to the left and right eyes, respectively, for each search item, e.g., a bar or a letter. The value of $C_L$ or $C_R$ is defined as the difference $|L - L_b|$ between the luminance $L$ of an input item and the luminance $L_b$ of the background (which is uniformly white, black or grey), divided by the maximum possible luminance difference achievable by the visual display. Thus, $0 \le C_L \le 1$ and $0 \le C_R \le 1$ always. Denoting $S_+ \equiv C_L + C_R$ and $S_- \equiv C_L - C_R$ so that $|S_-| \le S_+$ always, the ocularity of an item is defined as $O \equiv S_-/S_+$. A monocular item has $O = 1$ or $O = -1$; an ocularly balanced item has $O = 0$; and a left-eye dominant item has a positive $O$ while a right-eye dominant item has a negative $O$. The dichoptic condition of the visual inputs could be baseline, when all the search items have the same ocularity $O$. It can be DC, when all the search items have the same ocularity except for the target, or DI, when all search items have the same ocularity except for one non-target item in the opposite lateral half of the search array (in the perceived scene) from that of the target; see Figure 2 for an example.
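The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the experiments: the function names are hypothetical, ocularity is taken as the left-minus-right contrast difference divided by the contrast sum, and the trial classifier does not check that a DI singleton lies in the opposite half of the array from the target.

```python
from collections import Counter


def input_contrast(item_lum, background_lum, max_lum_diff):
    """Monocular contrast: |L - L_b| over the display's maximum luminance difference."""
    return abs(item_lum - background_lum) / max_lum_diff


def ocularity(c_left, c_right):
    """Relative left-minus-right contrast difference, in the range [-1, 1]."""
    total = c_left + c_right
    if total <= 0:
        raise ValueError("the item must be visible to at least one eye")
    return (c_left - c_right) / total


def dichoptic_condition(ocularities, target_index):
    """Classify a trial as 'baseline', 'DC', or 'DI' (hypothetical helper).

    Assumes at most one item differs in ocularity from all the others and
    does not verify the lateral position of a non-target singleton.
    """
    common_value, _ = Counter(ocularities).most_common(1)[0]
    singletons = [i for i, o in enumerate(ocularities) if o != common_value]
    if not singletons:
        return "baseline"
    if len(singletons) > 1:
        raise ValueError("expected at most one ocularity singleton")
    return "DC" if singletons[0] == target_index else "DI"
```

For instance, `ocularity(1.0, 0.0)` gives 1 for a left-eye monocular item, and `dichoptic_condition([0.8, -0.8, 0.8], 1)` labels the trial as DC because the only item with a deviant ocularity is the target.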
In all the experiments in this study, the search target was never defined by its ocularity value, so the dichoptic condition was irrelevant to the visual search task. If the ocularity singleton is salient enough to attract attention exogenously, then the baseline, DC, and DI conditions give three different attentional cueing situations for the task: uncued, validly cued, and invalidly cued, respectively. In other words, while the baseline condition has no ocularity singleton to cue attention, the ocularity singleton in the DC and DI conditions cues attention towards and away from the target, respectively. Some experimental sessions randomly interleaved trials from all three dichoptic conditions; others randomly interleaved just the baseline and the DC conditions. The RTs for the three dichoptic conditions will be denoted as $RT_{baseline}$, $RT_{DC}$, and $RT_{DI}$, respectively. A relative ease of the task in the DC compared to the DI conditions, e.g., a positive $RT_{DI} - RT_{DC}$, is a cueing effect or a behavioural manifestation of the saliency by the ocularity singleton, in one of the usual manners to assess attentional attraction by contrasting the validly cued and invalidly cued situations. Sometimes, but not always, the saliency of the ocularity singleton can also be manifested as a positive $RT_{baseline} - RT_{DC}$, the ease of the task in the DC condition relative to that in the baseline condition (cf. validly cued versus uncued), or as a positive $RT_{DI} - RT_{baseline}$ (cf. invalidly cued versus uncued). The saliency can also be manifested as the priority to saccade towards the ocularity singleton before saccading elsewhere, when gaze is tracked during the search. Some results in this paper have been presented previously in preliminary forms [10, 11].
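The three RT contrasts described above (DI versus DC, baseline versus DC, and DI versus baseline) can be written out explicitly. The snippet below is a minimal sketch with a hypothetical function name and made-up RT values; it simply forms the differences and is not taken from the analysis code of this study.

```python
def cueing_effects(rt_baseline, rt_dc, rt_di=None):
    """Return the RT differences used as signatures of saliency.

    A positive value means the first-named condition was slower. The DI
    terms are omitted for sessions that contained no DI trials.
    """
    effects = {"baseline_minus_DC": rt_baseline - rt_dc}
    if rt_di is not None:
        effects["DI_minus_DC"] = rt_di - rt_dc
        effects["DI_minus_baseline"] = rt_di - rt_baseline
    return effects


# Illustrative (invented) mean RTs in milliseconds, not data from this paper.
example = cueing_effects(rt_baseline=1000, rt_dc=800, rt_di=1100)
```

With these invented numbers, all three differences are positive, which is the pattern expected when the ocularity singleton both helps as a valid cue and distracts as an invalid cue.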
2. Materials and Methods
All experiments, Experiments 1–4, in this study were adapted from previous experiments in terms of equipment, stimulus designs, and visual tasks. Hence, I will briefly outline the essentials and highlight the differences, since more details can be found in published papers [
8,
9]. In this paper, each reported luminance of visual inputs always refers to the luminance measured directly from the visual display, without taking into account the luminance reduction along the optical pathway from the display to the eyes by the presence of the stereo goggles, the mirror stereoscope, and/or the half-reflective mirror for the eye tracker. For example, in Experiment 1, the actual luminance reaching the eyes was reduced by seven eighths by the pair of stereo goggles used to view the dichoptic stimuli.
2.1. Experiment 1
Experiment 1 extends from Experiment 4 of [
8], and the main text highlights the main difference between the two studies. All stimulus bars were
bright rectangles on a black background, displayed on a Clinton Monoray cathode ray tube (CRT) at a frame rate of 150 Hz, viewed at a distance of 40 cm in a dim room using a pair of FE-1 shutter goggles from Cambridge Research Systems. The target bar was tilted
from horizontal, in the clockwise or counter-clockwise direction with equal probability, and the non-target bars were tilted
from horizontal in the other direction. The bars formed an array of 22 rows by 30 columns, extending
in visual space, and each bar’s location was randomly and independently displaced from its regular grid position by up to
horizontally and vertically. The target’s location was about
(at least
horizontally) from the centre of the array. A task-irrelevant bright binocular dot (
) was placed at the centre of mass of every four closest neighbouring bars, and a task-irrelevant binocular disk (
diameter) was also placed at each of the four corners outside the array of bars. These binocular dots and disks were used to anchor binocular vergence. The dots were displayed simultaneously with the stimulus bars while the disks were constantly on the display throughout each experimental session. The dichoptic inputs
and
for each bar was such that
were fixed for all bars, while
varied with experimental conditions and trials. When
, the bar was 24
on the screen without wearing the stereo shutter goggles. In the session of
Figure 3B, each bar (except for the zero-disparity items in the left-most and right-most columns) had a random horizontal disparity independently chosen from the range
by shifting the bar’s locations in the two monocular images by an equal amount (half of the disparity magnitude) in the opposite horizontal directions. The central fixation point (
diameter disk) was binocular and presented for about 1.2 s on a blank screen (other than the four corner disks to help anchor vergence) before the bar stimulus onset. This fixation screen was triggered by the subject’s button press to start a trial, and the bar stimulus disappeared after the subject’s button response for the target’s location. In each session, there were 50 trials for each condition for each subject, except for subject LZ, who had only 20 trials for each condition.
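The disparity manipulation described above (a random horizontal disparity per bar, realised by shifting the bar by half the disparity in opposite directions in the two monocular images, with zero disparity in the left-most and right-most columns) can be sketched as follows. The function names, the `max_disparity` parameter, and the sign convention (which eye's image moves rightward for a positive disparity) are assumptions made for illustration only.

```python
import random


def assign_disparities(n_rows, n_cols, max_disparity, rng=None):
    """Draw an independent disparity for each grid item from [-max, +max].

    Items in the left-most and right-most columns keep zero disparity.
    """
    rng = rng or random.Random()
    grid = []
    for _ in range(n_rows):
        row = [rng.uniform(-max_disparity, max_disparity) for _ in range(n_cols)]
        row[0] = 0.0
        row[-1] = 0.0
        grid.append(row)
    return grid


def monocular_positions(x, disparity):
    """Split a disparity equally between the two monocular images.

    Returns (left-eye x, right-eye x); the sign convention is an assumption.
    """
    half = disparity / 2.0
    return x + half, x - half
```

For example, `monocular_positions(10.0, 0.5)` returns `(10.25, 9.75)`, so the two monocular copies differ horizontally by exactly the requested disparity while their mean position is unchanged.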
2.2. Experiments 2–4
These three experiments were adapted from Experiment 1b in a previous eye tracking study [
9], using a setup with a mirror stereoscope (in front of a Mitsubishi 21-inch CRT display at a frame rate of 100 Hz) and a video eye tracker schematically shown in
Figure 4A of [
9]. All the eye-tracking techniques and details (such as the criteria for fixations and saccades) are as in [
9] unless mentioned otherwise. The right eye of the observers was tracked.
In Experiment 2, the bars were in a array extending . Each bar was , tilted from horizontal. The locations of the bars were randomly and independently jittered from the respective regular grid positions by up to horizontally and vertically. The possible grid locations for the target in the search array were within Rows 4–20, closest on a circle that was centred on the array and had a radius of 10 grid units. Each target was at least 6 grid units horizontally from the centre of the array. Each observer participated in one session, which had 450 trials, with 50 trials for each condition.
In Experiment 3, the letters were each in size, and each horizontal or vertical stroke of the letters had a thickness of . The locations of the letters were randomly jittered from the regular grid positions by up to horizontally and vertically. The search array was , extending . The possible grid locations for the target in the search array were within Rows 4–14, closest on a circle that was centred on the array and had a radius of 7 grid units. Each target was at least 5 grid units horizontally from the centre of the array. Each observer participated in one session, which had 400 trials, with 50 trials for each condition.
In Experiment 4, the letters were in a array (extending ), each letter was in a sans-serif font. The locations of the letters were randomly jittered from the regular grid positions by up to horizontally and vertically. The target was in Rows 4–11, at a horizontal distance of 2–5 grid units from the centre of the array, and within a radius of 3–6 grid units from the centre of the array. Each observer participated in one session, which had 45 search trials for each condition. The eye tracking data in this experiment had poor quality or were incomplete for several observers and hence were not analysed further.
In Experiment 2 and Experiment 3, each bar or letter in the search array was dark in a white background of , and its luminance was for the left eye and for the right eye with , , and defining the relative strength of left eye and right eye inputs. The values of and were assigned according to experimental conditions as described in the main text.
In Experiment 4, each letter in the search array was lighter than the grey background, unlike Experiments 2 and 3. In addition, each letter also had an additional random luminance fluctuation. For each zero-ocularity non-target letter, the monocular luminance of the strokes in the letter was where was the background luminance, and , and x was a random number within for each letter (identical for the two monocular images) and was independent between letters. For the target letter, the monocular luminance of the strokes in the letter was , where or for the left or right monocular image, and x (identical for the two monocular images) was a random number within the range .
In each of Experiments 2–4, the disparity of each search item (except for the zero-disparity items in the left-most and right-most columns) was randomly and independently assigned to within the range of
by shifting the horizontal positions of each monocular image item in a similar way as that in the session depicted by
Figure 3B for Experiment 1. In addition to the binocular dots (which were square dots with side length
) between the search items to anchor vergence, the monocular images of the search arrays were enclosed by a vergence anchoring frame, which was black for Experiments 2–3 and white for Experiment 4, like that illustrated in
Figure 4B of [
9]. The size of this anchoring frame was 10% larger than the extent of the search array.
2.3. Data Analysis
For each observer and each stimulus condition, the mean button-press RT was the average of the button-press RTs across the pool of trials for this observer and this condition. This pool excluded the trials in which the button press was incorrect and the trials in which the button-press RT was an outlier. A button-press RT was an outlier if it was shorter than s or exceeded the average RT by more than three standard deviations of the RTs (with the average and standard deviation calculated before removing the RT outliers) for this observer and this condition. The same pool of trials was also used to calculate the average RT for the gaze to reach the target (additionally excluding trials in which the gaze did not reach the target).
The average RT or error rate across subjects was the average of the mean RTs or error rates, respectively, of individual subjects. All error bars in the plots are standard errors of the mean, whether the mean is across the trials for a subject or across observers. A difference between two quantities was considered significant when the p-value in a t-test satisfies . Matched-sample (within-subject) t-tests were used when comparing two conditions involving the same set of subjects. When RTs varied too much between observers, the RTs were normalized by dividing each subject’s RT for a given condition by this subject’s RT for the corresponding baseline condition, before averaging across subjects. These results are then plotted as the normalized RTs.
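The trial-pooling and normalization steps described in this subsection can be sketched as below. This is a hedged illustration rather than the analysis code of this study: the function names are hypothetical, the lower RT cutoff is passed in as the parameter `min_rt` rather than fixed, the average and standard deviation for the outlier rule are assumed to be computed over the correct trials, and the sample (n − 1) standard deviation is assumed.

```python
import math
import statistics


def mean_rt(trials, min_rt):
    """Mean RT for one observer and one condition.

    trials: list of (rt, correct) pairs. Incorrect trials are dropped, then
    RTs below min_rt or more than three standard deviations above the mean
    (both computed before outlier removal) are dropped.
    """
    correct = [rt for rt, ok in trials if ok]
    avg = statistics.mean(correct)
    sd = statistics.stdev(correct)
    kept = [rt for rt in correct if min_rt <= rt <= avg + 3 * sd]
    return statistics.mean(kept)


def normalize_by_baseline(rt_by_condition, baseline_key="baseline"):
    """Divide each condition's RT by the same subject's baseline RT."""
    base = rt_by_condition[baseline_key]
    return {cond: rt / base for cond, rt in rt_by_condition.items()}


def standard_error(values):
    """Standard error of the mean across trials or across observers."""
    return statistics.stdev(values) / math.sqrt(len(values))
```

Averaging the per-subject outputs of `normalize_by_baseline` across subjects then gives normalized RTs of the kind plotted when raw RTs vary too much between observers.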
3. Results
The first experiment, Experiment 1, extends directly from Experiment 4 in the original study [
8] that demonstrated the saliency of an eye-of-origin singleton. The task for the observers was to search for a uniquely oriented bar and report as soon as possible by button press whether the target was in the left or right half of the array of bars in their perceived image. All stimulus bars were bright on a black background, tilted
or
from horizontal. The target bar and non-target bars were tilted in opposite directions such that there was a
orientation contrast between the target and non-target bars. This experiment copied the following aspects from Experiment 4 in the original study [
8]: equipment setup, the observer’s task, and the properties of the visual stimuli, in terms of the size (22 rows and 30 columns extending over a
visual space) of the search arrays, the size of the stimulus bars (each a
rectangle), the task-irrelevant binocular dots to anchor vergence, the central fixation point before the search array onset, and the viewing distance (40 cm). It differed from the previous design in the following aspects: (1) a larger (doubled) extent of the random position jitters of the search items, (2) the angle (
, rather than
) of each bar from horizontal, and (3) the dichoptic input properties of the search items. Aspects (1) and (2) made the target bar less salient by the spatial, non-dichoptic, properties of the visual stimuli and should prolong the RT to perform the task. The third aspect was to alter input dichoptic properties session by session to investigate how saliency depended on these properties. In each session, each trial was equally likely to be a baseline, DC, or DI trial. Each of the four observers had normal depth vision and participated in all the sessions of this experiment in a random order. One observer (LZ) was the author, while the other participants were naive to the purpose of the study (although one (KM) of them could perhaps guess the purpose since he was familiar with the research interests of the author).
3.1. From Eye-Of-Origin Singletons to Ocular Dominance Singletons
In one session of Experiment 1, all bars were monocular like in the previous study [
8], so that the ocularity singleton was an eye-of-origin singleton. In each trial, the eye of origin of the target bar was randomly the left or the right eye. In another session, each bar was shown to both eyes, but its luminance (43.2 cd/m²) in one eye was nine times as strong as its luminance (4.8 cd/m²) in the other eye, making the bar left-eye dominant or right-eye dominant in its input. In the baseline dichoptic condition, all the bars for a given trial had the same dominant eye. In the DC or DI condition, the ocularity singleton had one dominant eye and the other bars had the other dominant eye. In each trial, the target bar’s dominant eye was randomly the left or the right eye with equal chance.
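As a worked illustration of what this 9:1 luminance ratio implies, the ocularity of a left-eye dominant bar can be computed directly. The calculation assumes that ocularity is the left-minus-right contrast difference divided by the contrast sum, and that on a black background each monocular contrast is proportional to the bar's luminance in that eye (i.e., the background luminance is treated as negligible); both are stated here as assumptions of this sketch.

```latex
O = \frac{C_L - C_R}{C_L + C_R}
  = \frac{43.2 - 4.8}{43.2 + 4.8}
  = \frac{38.4}{48}
  = 0.8
```

Under the same assumptions, a right-eye dominant bar has $O = -0.8$, so the ocularity singleton and the other bars differ in the sign but not the magnitude of their ocularity.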
Figure 3 shows the observations from these two sessions. When all the stimulus bars were monocular so that the ocularity singletons were eye-of-origin singletons (Figure 3A), $RT_{DC}$ was significantly shorter than $RT_{baseline}$ and $RT_{DI}$. This indicates that the target was more salient when it was also the eye-of-origin singleton, consistent with the findings from the previous study [8]. However, $RT_{DI}$ was not significantly longer than $RT_{baseline}$, unlike the finding from the previous study [
8], perhaps because the RTs in this study were about twice as long as those in the previous study mainly due to the smaller orientation contrast between the target bar and the background bars. Conceivably, if the ocular singleton distracted attention in the DI condition away from the target so that it consumed approximately an extra 200 millisecond (ms) for the task completion, this extra duration would be more noticeable out of an
ms, rather than an
–2000 ms. In any case, the significant difference between $RT_{DC}$ and $RT_{DI}$ is a signature of the saliency by the eye-of-origin singleton. The error rate for the DI condition was slightly, although not significantly, larger than those in the DC and baseline conditions. It is likely that the observers, in their hurried decisions, were more prone to mistaking the salient non-target ocular singleton for the target.
When each stimulus bar was left-eye or right-eye dominant, rather than monocular, qualitatively similar results were observed (
Figure 3B). Hence, saliency by an eye-of-origin singleton can be extended to saliency by an ocularity singleton, at least in this particular stimulus situation when the ocularity singleton was an eye-dominance singleton and when all the search items had the same magnitude
of ocularity.
3.2. An Ocularity Singleton Is More Salient When It Has a Larger Ocularity Contrast from Background Inputs
Another experiment, Experiment 2, further investigated the saliency by ocularity singletons by varying the ocularity contrast between the singleton and the background items. This experiment is very similar to an experiment (Experiment 1b) in a previous eye tracking study [
9] in the equipment setup (using a mirror stereoscope and a video eye tracker), the task and the visual stimuli (including the binocular dots between the search items to anchor vergence). Like in Experiment 1, the observer’s task was to search for a uniquely oriented bar among uniformly oriented background bars, and each trial was randomly a baseline, DC, or DI trial with equal chance. Each bar was a
rectangle, tilted
from horizontal, and the target bar was tilted in a unique direction from horizontal. The bars were dark in a white background (of 110
), and they formed a
array (with some random spatial jitters from the regular grid) extending to a visual space of
. Each bar in a search array had a random disparity value within a range of
, except for the zero-disparity bars in the right-most and left-most columns of the search array. Six observers participated; all had normal depth vision, and all except one were naive to the purpose of the study.
The input contrasts and for the left and right eyes for the stimulus bars were such that was held constant across all the bars within a trial and across the trials within a session. Meanwhile, the ocularity magnitude was constant across the bars within a trial, but varied randomly across trials to be , , or . In a DC or DI trial, the ocularity singleton had the unique sign of the ocularity O (i.e., the unique dominant eye) among all the bars. Hence, each session randomly interleaved trials from nine different conditions arising from all the combinations of the three ocularity magnitudes and the three possible dichoptic conditions (baseline, DC, and DI).
Figure 4 shows the eye traces in four example DI trials with the largest magnitude
for the largest ocularity contrast between the ocularity singleton and the other bars. In each example, the gaze started at the central fixation point at the start of the stimulus onset, since the eye tracker verified that the gaze was directed centrally before the search array onset for each trial. In
Figure 4A, the first gaze shift occurred at 228 ms after the stimulus onset and was distracted to the non-target ocularity singleton, and the incorrect button was pressed at 303 ms (a very short RT) from the stimulus onset even though gaze went toward the target by the time the button was pressed. Due to the latency for executing a motor command, it was likely that, in this trial, the decision to press the button was made before the saccade towards the target. In
Figure 4B, the first saccade was in the correct direction towards the target, but the incorrect button was pressed at an RT of 383 ms upon the gaze reaching the distractor. In
Figure 4C, the gaze distraction in the first saccade was overcome so that the correct button was pressed at an RT of 648 ms after the subsequent saccade brought the gaze to the target. In
Figure 4D, the gaze went first towards the target before turning towards the distractor, and the correct button was pressed after the gaze had reached the distractor.
Figure 5 shows the RTs, averaged across six observers, for the gaze to reach the target and for the correct button presses in the nine conditions. It also shows the error rates of the button presses and the error rates in the lateral direction of the first saccade. The first saccade is defined as erroneous when the target was in the right or left half of the search array in the perceived image while this saccade went (from the central fixation point) leftward or rightward, respectively. The shorter RTs and fewer errors in the DC than the DI trials are again signatures of the saliency of the ocularity singleton. With eye tracking, another signature is the direction of the first saccade reporting the priorities of visual locations for visual selection. In this experiment, these signatures are apparent only when the ocularity contrast
between the ocularity singleton and the other input bars was the largest.
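The definition of an erroneous first saccade given above (the saccade leaves the central fixation point towards the lateral half of the array that does not contain the target) can be sketched as follows. The function names are hypothetical, and the coordinate convention (positive values to the right of the array centre) is an assumption of this sketch.

```python
def first_saccade_is_error(target_x, saccade_dx):
    """True if the first saccade went to the lateral half opposite the target.

    target_x: horizontal target position relative to the array centre.
    saccade_dx: horizontal displacement of the first saccade from fixation.
    Positive values mean rightward (assumed convention).
    """
    return (target_x > 0 and saccade_dx < 0) or (target_x < 0 and saccade_dx > 0)


def first_saccade_error_rate(trials):
    """Fraction of trials with an erroneous first saccade.

    trials: list of (target_x, saccade_dx) pairs for one condition.
    """
    errors = sum(first_saccade_is_error(tx, dx) for tx, dx in trials)
    return errors / len(trials)
```

For example, a trial with the target on the right (`target_x = 6`) and a leftward first saccade (`saccade_dx = -4`) counts as an error, whereas a rightward first saccade on the same trial does not.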
The button presses had an
ms, which is shorter in this Experiment 2 than that
ms in Experiment 1. This is mainly because Experiment 2 had (1) a larger orientation contrast between the target bar and non-target bars and (2) a smaller spatial extent for the search array. The shorter
made the difference
more apparent when
was significant. In comparison, in Experiment 1,
was not significant and negligible. In other words, the distracting effect of the ocularity singleton, measured by
, was more noticeable in Experiment 2 in which the overall RTs were shorter. This distracting effect was also apparent in the previous study [
8] in which the button press
ms. Meanwhile, since RTs cannot be shorter than a certain minimum floor value, the cueing effect measured by a significantly positive difference
is not as substantial here in Experiment 2 as it was in Experiment 1.
When the ocularity singleton was sufficiently salient, the error rate for the first saccade direction during search was larger in the DI than the other conditions. This error rate was much less than 50% in this study, like that in Experiment 1b of [
9]. However, this error rate could be more than 50% when the search array was more spatially extended and when the salient singletons are more eccentric, as in Experiment 1a of [
9].
To observe a more substantial RT difference
when the ocularity singleton is sufficiently salient, another eye tracking experiment, Experiment 3, used a task in which
was longer. It was like Experiment 2, but the task was to search for a target letter ‘T’ among non-target letter ‘L’s. This task is harder since the target does not have a uniquely oriented bar to make it salient in the baseline condition. To focus on
, Experiment 3 omitted the DI trials and added additional conditions with
. Each experimental session randomly interleaved eight conditions, made from all the combinations of four possible
values (
,
,
, and
) and two dichoptic conditions (baseline and DC).
Figure 6 shows the RTs and error rates in Experiment 3 averaged across five observers. For the conditions with a larger
and
(and thus a larger ocularity contrast), the difference
was substantial, but was only significant when
(due to a large variability across observers). This was so both for the RT of gaze arrival to the target and for the RT of correct button presses. In Experiment 3, four out of the five observers were naive to the purpose of the study, and the sixth observer was removed from data analysis because he could not see depth well (qualitatively, the same results would be obtained if this observer’s data were included).
3.3. Asymmetry between Ocularly Balanced and Ocularly Unbalanced Items
Figure 7 shows the observations from another two sessions in Experiment 1, which involved orientation singleton search and randomly interleaved the baseline, DC, and DI trials in each session. In one of these two sessions, all the bars were monocular except for the ocularity singleton, which was binocular, while the binocular summation
of inputs was constant across all the bars. In this session (see
Figure 7A), the saliency of the ocularity singleton was apparent, and both
and
were significantly positive. In the other session (
Figure 7B), all the bars were binocular, except for the ocularity singleton, which was monocular (again with
held constant across the bars), and
was insignificant in any observer or averaged across the observers. However, these two sessions were identical in terms of the ocularity contrast between that of the ocularity singleton bar and that of the other bars. Interestingly, the error rates in button presses were highest, though not statistically significant, in the DI condition in both sessions. This hints that, when the ocularity singleton was a monocular item among binocular items, this singleton may still distract attention when it was a non-target even though it did not reduce
when it was the target.
The author (LZ), who was one of the subjects, observed that the monocular singleton was highly conspicuous, since it appeared shiny or much brighter than the other search items, as if it were a light source. Furthermore, she noticed that this shiny appearance made her instinctively avert her gaze from it, as if to avoid looking at the Sun. Another observer reported that the brightest item in the search array was not the search target in half of the trials, and that this perhaps inhibited him from looking at this brighter item when it was actually the target. Spatially local imbalance between the inputs to the two eyes has been observed to cause the perception of lustre or glossiness [
7,
12,
13,
14,
15,
16]. This appearance is understandable: a specular surface can reflect a light source so directionally that the reflected light reaches one eye of the viewer but not the other, causing an ocular imbalance of inputs, which can then be used to infer the shiny or glossy nature of the object surface. Perhaps the monocular singleton distracted attention in the DI trials because it was highly conspicuous, and, when it was also the search target in the DC trials, it did not reduce the
because the observers felt inhibited from shifting gaze towards it. This is another example of an item’s appearance interfering with the behavioural investigation of its saliency. In this example, the appearance is lustre or glossiness caused by local ocular input imbalance; in Wolfe and Franzel’s study [
7], the appearance is the invisibility of the eye-of-origin feature.
Figure 8 shows that the saliency of the monocular singleton can be unmasked in another session in which the ocularity contrast was transient, present only in the initial 0.15 s of the search stimulus display. In other words, after the initial 0.15 s, the monocular singleton became binocular while keeping the binocular input summation
unchanged and identical to that of the other bars in the search array. Now, averaged across the same set of four observers,
was significantly shorter than
. Although the ocularity contrast was brief, it was sufficient to attract attention, since the exogenous attraction of attention acts quickly [
17,
18].
However, since seeing the properties (e.g., lustre) of a visual object usually occurs after looking (overtly or covertly via attentional guidance), the briefness of the monocularity of the input item was likely to reduce or even prevent the perception of lustre, thereby reducing or preventing the interference by this percept.
In contrast, for sessions with a binocular singleton bar among monocular bars, making the ocularity contrast transient did not qualitatively change the RTs or the error rates.
Baker and colleagues [
19] observed that the perceived luminance contrast of an input item is often a nonlinear summation of the input luminance contrasts
and
to the two eyes, so that given a fixed
, the perceived contrast is often larger when the inputs are ocularly unbalanced
. The strongest nonlinearity is when the perceived luminance contrast is max(
), the larger one of
and
, and this is approximately the case when visual inputs are luminance decrements from the background luminance. In contrast, a lack of nonlinearity is when
is the perceived luminance contrast, and this is approximately the case when visual inputs are luminance increments from the background luminance, unless
is very different from unity. Experiment 1 used luminance increments as visual inputs, but its monocular bars had
or
. Hence, given the same
for our monocular and binocular bars, the monocular bar should appear stronger in luminance contrast. It is thus unclear whether the lustrous percept arose from the stronger perceived luminance contrast or from the ocular imbalance of the visual inputs in the context of ocularly balanced inputs.
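The two regimes described above (linear summation versus taking the larger of the two monocular contrasts) can be illustrated with a minimal sketch. It assumes a generic Minkowski summation with exponent `m`, which is not the model fitted in [19] but simply interpolates between the two regimes; the names `summed_contrast`, `monocular_advantage`, `c_left`, `c_right`, and `total` are hypothetical and introduced only for illustration.

```python
def summed_contrast(c_left, c_right, m):
    """Illustrative Minkowski summation of the two eyes' contrasts.

    m = 1 gives linear summation (c_left + c_right); as m grows, the
    result approaches max(c_left, c_right). This generic form is an
    assumption for illustration, not the model fitted in [19].
    """
    return (c_left ** m + c_right ** m) ** (1.0 / m)


def monocular_advantage(m, total=1.0):
    """Perceived contrast of a monocular bar divided by that of a
    binocular bar, when both share the same binocular summation."""
    monocular = summed_contrast(total, 0.0, m)            # all input to one eye
    binocular = summed_contrast(total / 2, total / 2, m)  # equal split between eyes
    return monocular / binocular


for m in (1, 2, 8):
    ratio = monocular_advantage(m)
    print(f"m = {m}: monocular / binocular perceived contrast = {ratio:.3f}")
```

Under this assumed rule, the ratio is 1 for linear summation and approaches 2 as the rule approaches max(), which is the sense in which a monocular bar with the same binocular summation can appear stronger in contrast.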
Experiment 1 had two additional sessions involving monocular singletons among binocular bars. They were identical to the two sessions in
Figure 8, except that the monocular singleton bar had its max(
) identical to the max(
) for the binocular bars, which had
, i.e., for the monocular bar, the luminance contrast presented to one eye only was identical to the luminance contrast for each eye in all the binocular bars.
Figure 9 shows that these single-strength monocular singletons (in
Figure 9B) were still distracting in the DI trials and did not help to reduce the RTs in the DC trials. The author, as a subject, observed that the lustrous percept they evoked was roughly as conspicuous and uncomfortable as that evoked by the double-strength monocular singletons shown in
Figure 8 and
Figure 9A.
3.4. Effects of Depth, Luminance Contrast, and Duration of the Ocularity Contrast between an Ocularly Unbalanced Target among Ocularly Balanced Non-Targets
We have seen that the reaction time to find the target can be affected by the ocularity contrast between the target and the background items and by an observer’s aversion to a target’s perceived lustrous nature.
Figure 8 and
Figure 9 suggest that the aversion to the lustrous property can be reduced by making the ocularity contrast brief, so as to prevent the slower process of perception (of lustre) while allowing the faster process of saliency. However, the ocularity contrast was made brief by removing it abruptly during the search. Presumably because of this abrupt input change, three of the four observers reported seeing blinking or movement associated with an input item in the search array in some of the trials. Since this perceived input change could also serve as a cue to attract attention, it is unclear whether the observed attentional cueing effect was due to the saliency from the ocularity contrast or to the perceived sudden change.
Additionally, when a visual input item is viewed by both eyes in natural vision, the depth of this item can also affect reaction time since attention is often biased towards objects nearer to observers. The depth effect has been so far ignored in this paper, although the search items were assigned heterogeneous depth values (see Materials and Methods) in Experiments 2–3, as well as in Experiment 1’s session shown in
Figure 3B. Input luminance contrasts can also affect reaction times, since a stronger luminance contrast makes an input item more attractive to attention.
Experiment 4 was designed to examine the effects of the multiple stimulus factors mentioned above. It was a modified version of Experiment 3, using the same equipment setup and the same task of searching for a letter ‘T’ among letter ‘L’s. Each session randomly interleaved one baseline condition, in which all search items had zero ocularity, and nine DC conditions, in which only the target ‘T’ had non-zero ocularity O. These nine DC conditions differed from each other in the ocularity magnitude of the target and in whether this non-zero O was static or was transiently present for the initial period or s. When the target’s non-zero O was transient, it decayed smoothly to zero over the duration between to s from the search array onset. The smooth decay, rather than an abrupt disappearance, was designed to eliminate the perception of input changes. None of the nine observers in Experiment 4, including the sole non-naive observer (the author), noticed any changes (e.g., blinking or movements) associated with any search item during the visual search.
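A transient ocularity that is held and then decays smoothly to zero can be sketched as follows. This is only an illustration under stated assumptions: the raised-cosine shape, the function name `ocularity_at`, and the parameter names are hypothetical, and the timing values in the usage lines are placeholders rather than the values used in Experiment 4.

```python
import math


def ocularity_at(t, o_initial, t_hold, t_end):
    """Ocularity of the target at time t (seconds from search array onset).

    The ocularity stays at o_initial until t_hold, then follows a
    raised-cosine ramp down to zero at t_end. The ramp shape is an
    assumption; the text only states that the decay was smooth.
    """
    if t <= t_hold:
        return o_initial
    if t >= t_end:
        return 0.0
    phase = (t - t_hold) / (t_end - t_hold)  # runs from 0 to 1 during the decay
    return o_initial * 0.5 * (1.0 + math.cos(math.pi * phase))


# Placeholder timings (hypothetical, not the experimental values).
for t in (0.0, 0.10, 0.125, 0.15, 0.175, 0.20, 0.30):
    print(f"t = {t:.3f} s: O = {ocularity_at(t, 0.8, 0.10, 0.20):.3f}")
```

A ramp of this kind has zero slope at both ends, so there is no abrupt step in the input at the start or end of the decay, which is the property the design relies on to avoid a perceived input change.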
In addition, each letter in the search array was independently assigned a random value in its binocular input summation
, making the letter array appear heterogeneous in luminance contrast. In each trial, the
for the target ‘T’ was neither the maximum nor minimum of all the
’s. Averaged across the trials, the
for the target was the same as the average
for the non-target ‘L’s. This design aimed to reduce any conspicuousness that might arise from a lustre percept of the ocularity singleton. Among the eight naive observers, only one noticed that, in occasional trials, the target letter appeared to have a layer of glaze on it. After data collection, some observers were invited to view the search stimuli from the static DC conditions at leisure and were asked whether they perceived lustre on the target letter ‘T’. Typically, lustre was not perceived when the target’s
, but was perceived when this
[
14].
Experiment 4 also differed from Experiment 3 by having lighter letters on a darker grey background, instead of dark letters on a lighter background. This was so that an ocularly unbalanced item should appear to have approximately the same perceived contrast as an ocularly balanced item with the same binocular summation
of input contrasts [
19].
It is worth noting that, in each session of this experiment, the ocularly unbalanced singleton was encountered in 90% of the trials and was always the target whenever it was present. In contrast, in each session of Experiments 1–3, at least one third of the trials had no ocularity singleton; the target was the ocularity singleton in no more than 50% of the trials; and in some trials, the ocularity singleton was not the target. Hence, Experiment 4 may be more likely to make observers associate, consciously or unconsciously, an ocularity contrast with the target. In addition, in this experiment, letting attention be drawn towards the ocularity singleton (rather than resisting the attentional attraction) should improve the task performance.
Note that the ocularity O for all the non-targets was zero in Experiment 4. In contrast, in Experiments 2–3, this O was non-zero for all the non-targets and was the negative of the O for the ocularity singleton. Therefore, in the DC conditions, for a given target ocularity, the ocularity contrast between the target and the background items in Experiment 4 was only half of that in Experiments 2 and 3.
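The factor of one half can be written out explicitly. The following worked step assumes that the ocularity contrast is measured as the absolute difference between the target's ocularity and that of the non-targets; the symbol \Delta O is introduced here only for illustration.

```latex
\begin{align*}
\text{Experiments 2--3:}\quad \Delta O &= \lvert O - (-O) \rvert = 2\lvert O \rvert,\\
\text{Experiment 4:}\quad \Delta O &= \lvert O - 0 \rvert = \lvert O \rvert .
\end{align*}
```

Under this assumption, a target with the same ocularity O produces half as much ocularity contrast against ocularly balanced non-targets as against non-targets with the opposite ocularity.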
The observations from Experiment 4 are shown in
Figure 10. The
was significantly shorter than
when the target’s
was sufficiently strong and transient (
Figure 10A). This is consistent with the idea, suggested by observations in
Figure 8 and
Figure 9, that an ocularly unbalanced target among ocularly balanced non-targets may repel observers unless the target’s non-zero ocularity is transient. However, unlike in sessions depicted in
Figure 8 and
Figure 9, observers in Experiment 4 did not perceive the transient changes caused by the disappearance of the ocularity contrast. Therefore, the attentional cueing effect in Experiment 4 should be due to the saliency of the ocularity singletons and not to any perception of input changes.
To analyse the depth effect, each target was categorized as a far-target or a near-target according to whether its distance from the viewer (given by its binocular disparity) was farther or nearer than that of 50% of the letters in the search array. The attentional cueing effect of the ocularity contrast was much more evident for the far-targets than for the near-targets (
Figure 10B,C). In general, the far-targets evoked substantially longer RTs than the near-targets, particularly in the baseline condition. Hence, in the DC conditions, the natural attentional bias against the far-targets can be partially overcome by the saliency from the ocularity contrast. When heterogeneous depth values were also randomly assigned to search items in Experiments 2–3 and in Experiment 1’s session shown in
Figure 3B, depth effects were also present, but were weaker. This is likely because these previous sessions had binocularly unbalanced inputs for all the search items, targets and non-targets. It is known that depth acuity is better when visual inputs for objects are binocularly balanced [
20].
To analyse the effects of luminance contrast, each target was categorized as having a strong or a weak contrast according to whether its
was larger or smaller than 50% of all the
’s (for all the letters) in a trial.
Figure 10D,E shows that the attentional cueing effect by the ocularity contrast was much more evident for targets with weaker luminance contrasts
. These targets required longer RTs, particularly in the baseline condition. Hence, saliency by ocularity contrast in the DC conditions also helped to partially overcome the attentional bias against these weaker-contrast inputs. Together,
Figure 10B–E indicates that ocularity contrast is particularly helpful for visual inputs that are farther from the viewer or weaker in luminance contrast, helping them to overcome their disadvantage in attracting attention.
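The two categorizations used in these analyses (far versus near, strong versus weak contrast) are both median splits within a trial, and can be sketched as below. This is a minimal illustration, not the analysis code of the study: the function name `median_split`, the handling of a target exactly at the median, and all numerical values are assumptions.

```python
from statistics import median


def median_split(target_value, all_values, low_label, high_label):
    """Label the target by comparing its value with the median of the array.

    A target exactly at the median is given low_label here; the text does
    not say how such ties were treated, so this choice is an assumption.
    """
    return high_label if target_value > median(all_values) else low_label


# Hypothetical single trial: viewing distances (arbitrary units) and
# binocular contrast summations for all letters, target included.
distances = [48.0, 49.0, 50.0, 51.0, 52.0, 53.0]
contrasts = [0.2, 0.3, 0.4, 0.5, 0.6]

depth_label = median_split(52.0, distances, "near-target", "far-target")
contrast_label = median_split(0.3, contrasts, "weak contrast", "strong contrast")
print(depth_label, "/", contrast_label)
```

The RTs in each condition can then be averaged separately within each label, which is how the comparisons between far- and near-targets, and between weak- and strong-contrast targets, are framed in the text.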