#### *2.2. Stimuli*

The stimulus set consisted of 240 different color images created as 3D graphics using Blender v2.79. The images depicted wooden dummies (80 stimuli), structures of cubes (80 stimuli), and chairs (80 stimuli) on white backgrounds (see Figure 1). The stimuli were composed of modular structures (cylindrical for the dummies, cubical for the cubes, and rectangular for the chairs) that lacked detail (e.g., face, hair, and hands/feet in the case of the wooden dummies) and were characterized by a light wood-like texture. This method was illustrated in a previous study by our research group [51]. In the present study, 16 models were designed for each of the three stimulus categories (e.g., a mannequin with both arms at its sides, or with the left arm extended forwards), and each model was then rotated around its vertical (longitudinal) z-axis (0°, ±20°, ±40°), yielding five points of view per model. This increased the visual variety and reduced possible habituation effects due to stimulus repetition (3 types × 16 models × 5 rotations = 240 stimuli). The stimulus categories were balanced for stimulus distribution across the four quadrants of the visual field. The maximum size of the stimuli was 3.75 × 3.95 cm, subtending visual angles of 1° 53′ × 1° 59′. ANOVAs revealed no difference across stimulus categories in stimulus luminance (≈15.9 cd/m², *p* = 0.29) or volume occupation (non-empty pixels: ≈10.86%, *p* = 0.39). Moreover, since each stimulus served as both target and non-target in non-consecutive experimental runs (as described in the following section), mean luminance did not differ between attention conditions (target vs. non-target).
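As an illustrative check (not part of the original stimulus pipeline), the reported visual angles can be recomputed from the maximum stimulus size and the 114 cm viewing distance given in the Task and Procedure section. The function names below are hypothetical and introduced only for this sketch.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Full visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def to_deg_arcmin(angle_deg: float) -> tuple[int, int]:
    """Split a decimal angle into whole degrees and rounded arcminutes."""
    total_arcmin = round(angle_deg * 60)
    return divmod(total_arcmin, 60)


VIEWING_DISTANCE_CM = 114  # screen-to-eye distance reported in the text

width_angle = to_deg_arcmin(visual_angle_deg(3.75, VIEWING_DISTANCE_CM))
height_angle = to_deg_arcmin(visual_angle_deg(3.95, VIEWING_DISTANCE_CM))

# 3 stimulus types x 16 models x 5 rotations
n_stimuli = 3 * 16 * 5
```

With these inputs, `width_angle` and `height_angle` evaluate to `(1, 53)` and `(1, 59)`, matching the reported 1° 53′ × 1° 59′.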

**Figure 1.** Examples of stimuli. The figure shows a few examples of the stimuli used in the present study. Bodies (wooden dummies), objects (chairs), and cube structures were created as 3D graphics. The stimulus set included 240 different images: 16 models for each of the three categories of stimuli, presented from five different points of view obtained by rotating each model around its vertical axis (−40°, −20°, 0°, +20°, +40°).

#### *2.3. Task and Procedure*

Once the EEG cap was placed, the participants were seated in an acoustically and electrically shielded cabin, facing a high-resolution VGA (video graphics array) computer screen 114 cm away from their eyes. A fixation dot remained at the center of the screen for the entire duration of the experiment, and the participants were asked to fixate on it to minimize movement artifacts (i.e., eye movements, blinks, and body movements) during EEG recording. Stimulus presentation was controlled using Eevoke v2.2 (ANT Neuro, Hengelo, The Netherlands). Each trial consisted of an image centrally displayed for 500 ms, followed by an empty, isoluminant white background for 900 ± 100 ms (inter-stimulus interval, ISI). Each of the 240 images was repeated twice during the experiment (in non-consecutive runs) and presented in both upright and upside-down orientations, for a total of 960 stimuli. An experimental run included 80 trials in a pseudo-randomized order, counterbalanced for the type of stimulus (bodies, cubes, chairs) and orientation (upright, upside-down). Twelve different runs were created and presented to the participants in pseudo-randomized and counterbalanced order. Before the beginning of the EEG recording, the participants were given the experimental instructions (printed and standardized) and completed a practice session (two additional runs) to familiarize themselves with both the task and the setting. The participants were asked to identify a specific target stimulus, regardless of its orientation, by pressing a button on a joypad with the index finger (see Figure 2). The target was verbally indicated by the experimenter at the beginning of each run and represented one-third of the total images displayed in that run. The left and right hands were used alternately between runs, and the order was counterbalanced across volunteers. All the participants were blinded to the goal of the study and to the stimulus properties.
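The trial arithmetic described above can be sketched as follows. This is only a consistency check on the reported numbers, not the presentation script itself. The uniform distribution of the ISI jitter and the fixed random seed are assumptions, since the text specifies only 900 ± 100 ms.

```python
import random

N_IMAGES = 240
N_REPETITIONS = 2
ORIENTATIONS = ("upright", "upside-down")
TRIALS_PER_RUN = 80
STIMULUS_MS = 500
ISI_MS = 900
ISI_JITTER_MS = 100

# 240 images x 2 repetitions x 2 orientations
total_trials = N_IMAGES * N_REPETITIONS * len(ORIENTATIONS)
n_runs = total_trials // TRIALS_PER_RUN


def sample_isi_ms(rng: random.Random) -> float:
    """Draw one ISI in ms; uniform jitter is an assumption, not stated in the text."""
    return rng.uniform(ISI_MS - ISI_JITTER_MS, ISI_MS + ISI_JITTER_MS)


rng = random.Random(0)  # arbitrary seed, for reproducibility of the sketch only
run_isis = [sample_isi_ms(rng) for _ in range(TRIALS_PER_RUN)]

# Expected run duration, using the mean ISI of 900 ms
expected_run_duration_s = TRIALS_PER_RUN * (STIMULUS_MS + ISI_MS) / 1000
```

Under these assumptions, the 960 trials divide evenly into the 12 runs of 80 trials reported above, and each run lasts about 112 s on average.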

**Figure 2.** Timescale of the experimental design. The stimuli were presented for 500 ms at the center of the screen, separated by an ISI (inter-stimulus interval) of 900 ± 100 ms. The participants were instructed to respond by button press to a specific target category (e.g., bodies, as illustrated by the red squares in the present figure), which was indicated at the beginning of each run.

#### *2.4. EEG Recording and Data Analysis*

EEG data were recorded using a standard EEG cap with 128 electrodes located according to the 10–5 International System [52] using EEProbe v2.2 software (ANT Neuro, Hengelo, The Netherlands). The sampling rate was 512 Hz, and the averaged mastoids served as the reference. Electrooculograms (EOG) were also collected. The impedance of the electrodes was maintained below 5 kΩ. The EEGs and EOGs were amplified and subjected to a half-amplitude band-pass filter (0.16–70 Hz, 50 Hz notch). An automated artifact rejection procedure was used to remove EEG segments marked by eye movements (saccades and blinks), muscle-related potentials, or amplifier blockages. Peak-to-peak amplitudes exceeding 50 μV were considered artifacts. Trials containing errors (non-targets wrongly indicated as targets) and omissions (targets that were not recognized) were also manually discarded. EEG epochs were synchronized with the stimulus onset. ERPs were averaged over epochs extending from 100 ms before to 1000 ms after the stimulus onset and were subjected to a band-pass filter of 0.16–30 Hz. ERPs were identified and measured with reference to the average baseline voltage, computed over the 100 ms preceding the stimulus onset. The electrode sites and latency windows for each ERP component were chosen based on where and when the components of interest reached their maximum amplitude [53] and in accordance with previous literature [45–47]. For the purposes of the present manuscript, only the stimuli presented in the upright orientation were considered (50% of all trials). This avoided any possible confounding effect caused by the inversion of stimuli depicting bodies but not objects [51,54,55]. ERP averages were computed as a function of the attention, electrode, and hemisphere factors. The two levels of the attention factor (target, non-target) were obtained by collapsing all the target and non-target images, respectively, regardless of the content (to increase the EEG signal-to-noise ratio).
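A minimal single-channel sketch of the epoch-level steps described above (peak-to-peak rejection at 50 μV, baseline correction over the 100 ms pre-stimulus interval, and averaging) is given below. It is not the EEProbe implementation: the function names are hypothetical, epochs are assumed to be plain lists of voltages in μV sampled at 512 Hz from −100 to 1000 ms, and the peak-to-peak criterion is assumed to apply to the whole epoch.

```python
from statistics import mean

SAMPLING_RATE_HZ = 512
EPOCH_START_MS = -100           # epochs start 100 ms before stimulus onset
REJECTION_THRESHOLD_UV = 50.0   # peak-to-peak criterion reported in the text


def ms_to_sample(time_ms: float) -> int:
    """Index of a latency (ms, relative to stimulus onset) within an epoch."""
    return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def baseline_correct(epoch: list[float]) -> list[float]:
    """Subtract the mean pre-stimulus voltage from every sample."""
    baseline = mean(epoch[:ms_to_sample(0)])
    return [voltage - baseline for voltage in epoch]


def is_artifact(epoch: list[float]) -> bool:
    """Flag epochs whose peak-to-peak amplitude exceeds the threshold."""
    return max(epoch) - min(epoch) > REJECTION_THRESHOLD_UV


def average_erp(epochs: list[list[float]]) -> list[float]:
    """Average the baseline-corrected epochs that pass artifact rejection."""
    kept = [baseline_correct(e) for e in epochs if not is_artifact(e)]
    return [mean(samples) for samples in zip(*kept)]
```

In practice the same logic would be applied per electrode and per condition (target, non-target); the single-channel form is used here only to keep the sketch short.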

The mean area voltage of the N2 component was measured at the AFp3h, AFp4h, AFF1, AFF2, F1, and F2 electrode sites during the 225–265 ms time window (see Figure 3). The mean area voltage of the selection negativity (SN) component was measured at the P9, P10, PPO9h, PPO10h, PO7, and PO8 electrode sites during the 240–280 ms time window. The mean area voltage of the P300 component was measured at the CPz, Pz, and POz electrode sites during the 350–450 ms time window. The N2 and SN data were subjected to multifactorial repeated measures ANOVAs with three within-group factors: attention (non-target, target), electrode (three levels depending on the ERP component of interest), and hemisphere (left, right). The P300 data were subjected to a multifactorial repeated measures ANOVA with two within-group factors: attention (non-target, target) and electrode (CPz, Pz, POz). Multiple comparisons were computed using Tukey's post-hoc tests. All the ANOVAs were performed using Statistica software (version 10; StatSoft, Tulsa, OK, USA).
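The window-based measurement described above can be sketched as follows, under the assumptions that an averaged ERP is a list of voltages sampled at 512 Hz starting 100 ms before stimulus onset, and that "mean area voltage" is the average voltage within the window (both window edges included). The helper names are hypothetical.

```python
from statistics import mean

SAMPLING_RATE_HZ = 512
EPOCH_START_MS = -100  # first sample of the averaged ERP

# Time windows (ms after stimulus onset) reported for each component
WINDOWS_MS = {"N2": (225, 265), "SN": (240, 280), "P300": (350, 450)}


def window_slice(start_ms: float, end_ms: float) -> slice:
    """Sample range covering a latency window, both edges included."""
    def to_index(time_ms: float) -> int:
        return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)
    return slice(to_index(start_ms), to_index(end_ms) + 1)


def mean_amplitude(erp: list[float], component: str) -> float:
    """Mean voltage of one ERP waveform within a component's time window."""
    return mean(erp[window_slice(*WINDOWS_MS[component])])
```

Each participant would contribute one such value per condition and electrode, which then forms one cell of the repeated measures design.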

**Figure 3.** Grand average event-related potential (ERP) waveforms recorded over the entire scalp (128 electrodes). The red lines represent the evoked response to target stimuli, while the blue lines represent the evoked response to non-target stimuli. The electrode sites where the three components of interest (N2, SN, and P300) reached their maximum amplitude are shown by the green circles. The time windows in which the N2 (225–265 ms), SN (240–280 ms), and P300 (350–450 ms) were analyzed are highlighted by the green areas.

Standardized weighted low-resolution electromagnetic tomography (swLORETA) was applied to the difference waves obtained by subtracting the ERPs elicited by non-target stimuli from those elicited by target stimuli in the SN time window (240–280 ms). LORETA, which is a discrete linear solution to the inverse EEG problem, corresponds to the 3D distribution of neuronal electric activity that yields maximum similarity (i.e., maximum synchronization) in terms of orientation and strength between neighboring neuronal populations (represented by adjacent voxels). In this study, an improved version of sLORETA (standardized low-resolution electromagnetic tomography) was used, which incorporates a singular value decomposition-based lead field weighting (swLORETA) [56]. The source space had the following characteristics: a grid spacing (the distance between two calculation points) of five points and an estimated SNR of three (the signal-to-noise ratio defines the regularization; a higher SNR value leads to less regularization and less blurred results). The source reconstruction was performed on group data to identify statistically significant active electromagnetic dipoles (*p* < 0.05).

The accuracy (percentage of hits), reaction times (RTs), and errors (percentage of wrong responses to non-targets) were also recorded. Repeated measures ANOVAs with one within-group factor, hand (left, right), were performed on the mean RTs and on the percentages of hits and errors.
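The behavioral measures defined above can be illustrated with the following sketch. The trial record layout (`is_target`, `responded`, `rt_ms`) is hypothetical, and computing the mean RT over hits only is an assumption, as the text does not state which responses entered the RT average.

```python
from statistics import mean


def summarize_responses(trials: list[dict]) -> dict:
    """Hit percentage, error percentage, and mean RT for a set of trials.

    Each trial is a dict with the (hypothetical) keys:
    is_target (bool), responded (bool), rt_ms (float or None).
    """
    targets = [t for t in trials if t["is_target"]]
    non_targets = [t for t in trials if not t["is_target"]]
    hits = [t for t in targets if t["responded"]]
    errors = [t for t in non_targets if t["responded"]]
    return {
        "hit_pct": 100 * len(hits) / len(targets),
        "error_pct": 100 * len(errors) / len(non_targets),
        # Assumption: RTs are averaged over correct target responses only
        "mean_rt_ms": mean(t["rt_ms"] for t in hits) if hits else None,
    }
```

Computing this summary separately for the runs performed with the left and the right hand gives the two levels of the hand factor for each participant.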
