*Article* **Emotion Word Processing in Immersed Spanish-English/English-Spanish Bilinguals: An ERP Study**

**Anna B. Cieślicka \* and Brenda L. Guerrero**

Department of Psychology and Communication, Texas A&M International University, Laredo, TX 78041, USA

**\*** Correspondence: anna.cieslicka@tamiu.edu

**Abstract:** We conducted a lexical decision task to measure Spanish-English/English-Spanish bilinguals' behavioral (RT) and electrophysiological (EPN, Early Posterior Negativity and LPC, Late Positive Complex) responses to English emotion words and their Spanish translation equivalents. Bilingual participants varied in age of acquisition (AoA of Spanish/English: early, late), language status (L1 Spanish, L1 English) and language dominance (English-dominant, Spanish-dominant, balanced) but were all highly immersed bicultural individuals, uniformly more proficient in English than Spanish. Behavioral data showed faster and more accurate responses to English than Spanish targets; however, the emotion effect was only present for Spanish, with positive Spanish words recognized significantly faster than those that were negative or neutral. In the electrophysiological data, the emotion response was affected by language of the target stimulus, with English targets eliciting larger EPN amplitudes than Spanish targets. The reverse effect was found on the LPC component, where Spanish targets elicited a higher positivity than English targets. Dominance did not turn out to be a significant predictor of bilingual performance. Results point to the relevance of proficiency in modulating bilingual lexical processing and carry implications for experimental design when examining immersed bilinguals residing in codeswitching environments.

**Keywords:** emotion words; bilingual; early posterior negativity; late positive complex; proficiency; dominance

#### **1. Introduction**

Emotion words can be categorized into emotion-label and emotion-laden words, where the former name a specific emotional state (e.g., *angry, overjoyed*), while the latter do not directly refer to an emotion but elicit it (e.g., *kitty, war*; Altarriba and Basnight-Brown 2010). Such words differ along the dimension of valence (positive, negative, or neutral) and arousal, or the amount of physiological response (high or low) they evoke (Lang et al. 1997). Research on emotion processing has consistently demonstrated the so-called "emotion effect", i.e., the differential processing of emotionally relevant content relative to non-emotional, neutral material (see Citron 2012).

The emotion effect may manifest as a faster response to emotionally valenced words in a lexical decision task (e.g., Estes and Adelman 2008; Kousta et al. 2009; Kuchinke et al. 2005; Larsen et al. 2006; Schacht and Sommer 2009b), enhanced priming for emotion relative to neutral words (Altarriba 2006; Altarriba and Canary 2004), faster lexical access of emotion words in reading (Kissler and Herbert 2013), better recall (e.g., Altarriba and Bauer 2004; Anooshian and Hertel 1994; Ayçiçeği-Dinn and Caldwell-Harris 2009; Rubin and Friendly 1986), a stronger attentional blink effect in response to emotion vs. neutral words (Colbeck and Bowers 2012), slower naming latencies in the Stroop task (Eilola et al. 2007; Sutton et al. 2007), increased galvanic skin response (GSR) to emotion words in psychophysiological studies (e.g., Harris et al. 2003), or a larger amplitude of an event-related potential (ERP) response in electrophysiological studies (e.g., Hofmann et al. 2009; Holt et al. 2005; Kissler et al. 2009; Zhang et al. 2014).

**Citation:** Cieślicka, Anna B., and Brenda L. Guerrero. 2023. Emotion Word Processing in Immersed Spanish-English/English-Spanish Bilinguals: An ERP Study. *Languages* 8: 42. https://doi.org/10.3390/languages8010042

Academic Editor: John W. Schwieter

Received: 5 September 2022 Revised: 19 January 2023 Accepted: 20 January 2023 Published: 31 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### *1.1. Electrophysiological Correlates of Emotion Word Processing*

Two ERPs have emerged as major indices of emotion effects: the early posterior negativity (EPN) and the late positive complex (LPC). The EPN component, recorded mainly at occipito-temporal sites, is viewed as an index of early (i.e., automatic) lexical access as it starts immediately after the onset of the lexicality effect in the word recognition task. Peaking between 200 and 300 ms, it reflects an enhanced attention to emotional content at early processing stages (Herbert et al. 2006; Junghöfer et al. 2001; Kissler et al. 2007, 2009; Optiz and Degner 2012; Schacht and Sommer 2009a, 2009b; Schupp et al. 2003). The EPN emotion effect has been consistently found in studies using varied methodologies, such as silent reading (Kissler and Herbert 2013; Kissler et al. 2007, 2009), grammatical decision (e.g., Kissler et al. 2007, 2009) or lexical decision (Citron et al. 2011; Palazova et al. 2011; Schacht and Sommer 2009b; Scott et al. 2009), and it seems independent of the stimulus presentation rate or the nature of the task (Herbert et al. 2008; Kissler et al. 2006, 2009; but see Rellecke et al. 2011). EPNs have been reported for emotion words of different grammatical categories, such as nouns (Kissler et al. 2007), verbs (Schacht and Sommer 2009b), and adjectives (Herbert et al. 2006, 2008).

The LPC component, which begins at approximately 400–500 ms post-stimulus and lasts for a few hundred milliseconds, is primarily recorded at centro-parietal electrodes and reflects higher-order cognitive stages of more elaborate semantic processing (Citron 2012; Cuthbert et al. 2000; Fischler and Bradley 2006; Kissler et al. 2009; Palazova et al. 2013; Schupp et al. 2003). The LPC is sensitive to the dimension of valence, in that its amplitude increases for positively (or negatively) valenced words relative to neutral words (Hofmann et al. 2009; Kanske and Kotz 2007); however, increased LPCs have also been reported for high arousal neutral over emotionally valenced words (e.g., Citron et al. 2011; Recio et al. 2014; see also Yao et al. 2016). Unlike the EPN, the LPC is affected by task requirements. It manifests in tasks requiring explicit processing of the emotional content or deep semantic processing, such as an overt valence categorization task (e.g., Delaney-Busch et al. 2016) but not in shallow tasks, such as orthographic judgment to spelling patterns (Fischler and Bradley 2006), same/different font judgment (Schacht and Sommer 2009b), or a semantic categorization task (e.g., Delaney-Busch et al. 2016).

While the electrophysiological literature is generally consistent when it comes to emotion effects present in both early and late ERP components, the findings differ regarding differential processing of emotion over neutral words. An enhanced EPN has been found in response to positively valenced versus neutral or versus both negative and neutral words (Chen et al. 2015; Palazova et al. 2011; Recio et al. 2014), or in response to positive high-arousal and negative low-arousal words (Citron et al. 2013) and to emotionally arousing pleasant words over neutral words (e.g., Schacht and Sommer 2009b). Other research has shown an increase in the EPN elicited by both positive and negative versus neutral verbal stimuli (Herbert et al. 2008; Kissler and Herbert 2013; Kissler et al. 2007, 2009; Optiz and Degner 2012; Palazova et al. 2011; Schacht and Sommer 2009b). In later time windows, positive words have been associated with a larger LPC in comparison to negative or neutral words (Herbert et al. 2006, 2008; Kissler and Herbert 2013; Kissler et al. 2009; Palazova et al. 2011; Recio et al. 2014; Schapkin et al. 2000; Zhang et al. 2014); however, other studies have yielded results showing a larger LPC in response to negative relative to neutral words (e.g., Bayer et al. 2010; Hofmann et al. 2009), to negative words as compared to both positive and neutral words (Bernat et al. 2001; Delaney-Busch et al. 2016; Kanske and Kotz 2007), or an increased LPC amplitude in response to both negative and positive vs. neutral words (e.g., Conrad et al. 2011). These inconsistencies may be attributed to a number of factors that have been shown to modulate emotion processing, such as word frequency (e.g., Kissler et al. 2007; Kuchinke et al. 2007; Scott et al. 2009), concreteness (e.g., Hinojosa et al. 2014; Imbir et al. 2016; Kanske and Kotz 2007; Palazova et al. 2013; Yao et al. 2016), grammatical class (Palazova et al. 2011; Schacht and Sommer 2009b), arousal level of the emotionally valenced word (e.g., Citron et al. 2011, 2013; Delaney-Busch et al. 2016; Hofmann et al. 2009; Recio et al. 2014), task demands (e.g., Fischler and Bradley 2006; Hinojosa et al. 2010; Kissler et al. 2009; Schacht and Sommer 2009b), the origin (automatic vs. reflective) of the word's emotional content (Imbir et al. 2016), and individual differences (Citron 2012; Gibbons 2009; Mueller and Kuchinke 2016).

#### *1.2. Emotion Word Processing in Bilinguals*

The question of emotion word processing is even more complex with bilinguals who express and perceive emotions in more than one language. Early research into second language (L2) emotion processing (e.g., Bond and Lai 1986) has suggested the possibility that L2 emotion word processing is characterized by more distance than L1, primarily because of the strong coupling between cognition and emotion and the fact that emotional connotations of words are established during the person's cognitive growth (see Harris 2015). Hence, L1 emotion words are intrinsically linked to a person's emotional responses, unlike L2 words that have been acquired later in life. This diminished L2 emotionality has been referred to as *disembodied cognition* (Pavlenko 2012) or *reduced emotional resonance in L2* (Toivo and Scheepers 2019). Indeed, a number of studies into bilingual emotion word processing have found attenuated effects for bilinguals' L2 as compared to L1 emotion words (e.g., Anooshian and Hertel 1994; Caldwell-Harris 2014; Colbeck and Bowers 2012; Gonzales-Reigosa 1976; Harris et al. 2003; Harris 2004; Sheikh and Titone 2016).

For example, Iacozza et al. (2017) recorded pupillary responses of Spanish-English bilinguals engaged in reading emotionally-charged English or Spanish sentences. After each sentence, participants were instructed to rate its emotional impact. While pupillary responses showed a significantly larger effect for L1 (Spanish) compared to English sentences, explicit ratings of emotionality were comparable across both languages. Iacozza et al. (2017) suggest that the sympathetic nervous system might show an attenuated emotional response in L2 depending on the language context. While an automatic, implicit measure of emotional reactivity, such as pupil dilation, is likely to show differential effects in the native and foreign language, a more explicit measure, such as subjective rating of the material's emotional impact, might reveal no such effects. Attenuated response of the sympathetic nervous system to L2 emotion stimuli was also reported by Jankowiak and Korpal (2017), who presented late proficient Polish (L1)–English (L2) bilinguals with emotionally-laden spoken and written L1/L2 narratives. The GSR results showed a reduced response to L2 compared to L1, and the effect was further constrained by the modality of the presentation with visual stimuli eliciting a more pronounced skin conductance level than the auditory stimuli.

However, other research has shown comparable effects for bilinguals' L1 and L2 emotion word processing (e.g., Conrad et al. 2011; Eilola et al. 2007; Eilola and Havelka 2011; Ferré et al. 2010; Harris et al. 2006; Kim 1993; Optiz and Degner 2012; Ponari et al. 2015; Sutton et al. 2007), suggesting that the decreased sensitivity of bilinguals to L2 emotional content might be more nuanced than initially assumed and affected by such bilingual participant characteristics as age of acquisition (AoA), level of proficiency, amount of exposure to L2, and language dominance (e.g., Ayçiçeği and Harris 2004; Eilola et al. 2007; Harris 2004; Harris et al. 2003, 2006; Kazanas and Altarriba 2016; Sutton et al. 2007). For example, using a modified Stroop paradigm, Eilola et al. (2007) found equal emotional response for both L1 and L2 in late Finnish-English bilinguals who were highly proficient in their L2 (see also Sutton et al. 2007).

In addition to proficiency, language dominance has also been shown to affect bilingual emotion processing. In Harris et al.'s (2003) GSR study, native speakers of Turkish who learned English after 12 years of age were presented with L1 and L2 emotionally valenced words, including taboo words and childhood reprimands. Results showed a significant difference for reprimands in L1 vs. L2, with reprimands in Turkish eliciting a significantly stronger GSR than in English. Contrary to the expectation that taboo words in one's native language would elicit stronger responses than similar L2 taboo words learned later in life, reactivity to taboo words in both languages was highly comparable, suggesting that L1 is not necessarily a more emotional language in cases where L2 becomes more dominant (see also Ayçiçeği and Harris 2004).

Indeed, Caldwell-Harris (2014, 2015) suggests that all of these bilingual participant factors are interrelated and converge in modulating emotional processing. For example, high proficiency is typically causally linked with early acquisition and correlated with frequency of use or amount of exposure to language. Increased exposure and frequency of use are, in turn, relevant for dominance, which also depends crucially on immersive learning, as when a bilingual resides in an L2-speaking country. Based on her review of studies into the emotionality differences between multilinguals' languages, Caldwell-Harris (2015) emphasizes the need to account for those modulating factors, suggesting that emotional processing differences are most pronounced when the person's L1 is dominant and L2 is less proficient and learned later.

One question raised in the ERP bilingual emotion literature is whether L2 emotion words can evoke an early EPN response comparable to that elicited by L1 or whether the emotional content of L2 words only becomes available at later stages of processing. Because the EPN and LPC components reflect, respectively, early (automatic) versus late (higher-order) lexical processes, they can provide an insight into whether L1 and L2 emotion word processing differ. If both components are comparably affected by the emotional valence of words, regardless of whether the words are presented in L1 or L2, this would be indicative of early access of emotional content in both L1 and L2. However, if EPN responses are more pronounced for L1, this would support the idea of an attenuated emotional response in L2. In turn, the LPC, reflecting more elaborate semantic processing strategies, might be observed for L2 even when no EPN response was recorded. This is because of the possibility that L2 words have been by then re-translated into L1 and it is the L1 word's emotional content that causes an increased amplitude in LPC.

The existing ERP emotion processing studies have shown mixed results so far. For example, in Chen et al.'s (2015) lexical decision experiment, Chinese-English bilinguals who were late learners of L2 English showed an enhanced EPN effect only in response to L1 positive words during the time windows of 250–300 ms and 300–350 ms. In turn, valence effects for L2 did not emerge until 400–500 ms post-stimulus, and their topography differed from the EPN component, suggesting no EPN effect for L2 words. Similarly, no effects were found for L2 in later time windows. The LPC effects were shown only for L1 in the time windows of 500–550 ms and 550–600 ms, such that larger amplitudes were recorded for neutral than for positive words. The only marginally significant emotion effect for L2 was found in the time window between 400–500 ms, with neutral words eliciting a marginally larger negativity than positive words.

Conversely, other studies showed the presence of emotion effects for both L1 and L2 on both early and late ERP components (Conrad et al. 2011; Kissler and Bromberek-Dyzman 2021; Optiz and Degner 2012). In addition, the timing of the early EPN component has been shown to differ across L1 and L2 in that the processing of emotion words in L2 may be delayed relative to L1. For example, Conrad et al. (2011) conducted an ERP study with Spanish-German and German-Spanish bilinguals matched on L2 proficiency. Results revealed that, regardless of the language status, emotionally valenced words evoked a larger amplitude of an EPN and LPC as compared to neutral words for both bilingual groups. These findings suggest that emotion word processing in L1 and L2 does not differ qualitatively, although quantitatively the EPN response was delayed by 50–100 ms for L2 relative to L1. Conrad et al. (2011) interpret this time shift as indicative of a general delay in L2 visual recognition processes rather than a delayed L2 emotion recognition per se.

Further support for qualitatively comparable L1 and L2 emotion effects was demonstrated by Optiz and Degner (2012), who asked German-French and French-German bilinguals to perform a go/no-go lexical monitoring task. Participants were presented with valenced and neutral L1/L2 words and asked to determine whether a pseudoword was orthographically similar to real words in the respective target language. Results showed an amplified EPN for both positive and negative compared to neutral words, regardless of the language status. As in Conrad et al.'s (2011) experiment, the timing of the EPN differed across L1 and L2, in that emotional processing of L2 words was delayed relative to L1. Optiz and Degner (2012) attribute this delay to costs of interference resolution in highly proficient L2 users. Since L1 and L2 lexicons in such proficient speakers are highly integrated, access to a word's emotional content results in automatic activation of both L2 and L1 lexical representations, hence incurring extra processing costs. However, in a recent study, Kissler and Bromberek-Dyzman (2021) failed to find any timing differences in the onset of EPN or LPC for L1 vs. L2 emotion responses in German-English bilinguals.

Previous studies into L2 emotion word processing have focused on comparing bilinguals against monolinguals (e.g., Kim 1993), late AoA L2 learners with different L2s (e.g., Conrad et al. 2011; Optiz and Degner 2012), unbalanced late bilinguals' performance in their L1 and L2 (Kissler and Bromberek-Dyzman 2021), or low-proficiency bilinguals performing in their L2 (e.g., Chen et al. 2015). Participants in these studies were consistently dominant in their L1. More recently, Vélez-Uribe and Rosselli (2021) examined emotion processing in Spanish-English bilinguals varying along the dimension of dominance and proficiency. While the balanced group comprised individuals with comparable levels of proficiency in their L1 and L2, the unbalanced bilinguals were more proficient in English than Spanish. Participants were asked to perform an emotion rating task in both Spanish and English. Results revealed a significant language effect on both the EPN and LPC components. The EPN data showed a larger amplitude for words in Spanish than English and the main effect of valence, i.e., an enhanced EPN response to positive vs. neutral and neutral vs. negative words. The LPC data showed overall larger amplitudes for words in English than in Spanish. In addition, significant differences emerged between balanced vs. unbalanced groups. Whereas balanced bilinguals showed comparable emotion effects for both English and Spanish, the unbalanced group manifested differences in the LPC amplitudes for Spanish words, such that positive targets recorded an enhanced positivity relative to negative and neutral targets. Vélez-Uribe and Rosselli (2021) explain these differences by suggesting that emotional content in the more proficient language might be processed identically by balanced and unbalanced bilinguals but that the processing patterns might diverge for the less proficient language.

#### *1.3. The Present Study*

The present study aims to further explore the time course of bilingual emotion word processing by focusing on bilingual participants who are not only highly proficient in their L2 but for whom L2 often becomes their dominant language. We use a lexical decision task to measure proficient Spanish-English and English-Spanish bilinguals' reaction time (RT) and electrophysiological (EPN and LPC) responses to English emotion-label and emotion-laden words and their Spanish translation equivalents. Our bilinguals offer a unique opportunity to assess the contributions of the various factors modulating L1 and L2 ERP emotion effects. Specifically, our participants reside in a highly immersive environment, a US-Mexican border town, where both languages are spoken interchangeably. While the majority of them learned Spanish as their L1 or were exposed to both Spanish and English simultaneously, their early educational experience in US-based schools and subsequent English-only academic environments led to many of these bilinguals becoming dominant in English. In addition, they fall on a continuum of AoA, in that some of them learned English in early childhood, while others learned it after they had acquired Spanish. We ask the following questions: (1) Do L1 and L2 emotion word processing differ qualitatively and/or quantitatively in highly proficient immersed bilinguals who are routinely exposed to both languages and reside in a bilingual community? (2) How do language dominance and AoA modulate L1/L2 emotion word processing?

Given the inconsistent research with bilinguals (Conrad et al. 2011; Kissler and Bromberek-Dyzman 2021; Optiz and Degner 2012), our question regarding quantitative/qualitative differences between L1 vs. L2 emotion processing is purely exploratory. Of note, Spanish-English/English-Spanish bilinguals examined here are typically more proficient in English than in Spanish, regardless of their L1. It is therefore likely that ERP responses might be more pronounced for emotion words in the more proficient language (English) than for emotion words in Spanish. In addition, L1 and L2 emotion processing might be affected by language dominance. For example, for Spanish-English bilinguals dominant in Spanish, EPN and LPC responses to emotionally valenced words might be more enhanced when presented in their dominant language (Spanish) than in the weaker language (English). If, however, English has become the dominant language for a Spanish-English bilingual, the reverse might be expected, with English (L2) emotion words evoking a larger EPN and LPC response than Spanish (L1).

As for the role of AoA, early bilingualism might be indicative of greater proficiency on account of length of exposure (Caldwell-Harris 2014). We therefore expected to see a more pronounced emotion effect for English emotion words in early than in late English L1 speakers who are dominant in English. Such bilinguals not only learned English early in their life and thus benefit from an increased length of exposure relative to late learners, but they have maintained greater dominance and proficiency in English over Spanish. Early Spanish learners might show an enhanced emotion effect for Spanish words; however, if they became dominant and more proficient in English, this effect might be diminished. On the other hand, late L2 learners might show an attenuated emotion effect similar to Chen et al.'s (2015) results.

Overall, while the research questions asked here are largely exploratory, the novelty of our study lies in the fact that most of the existing literature has examined bilinguals dominant in their L1 who have learned their L2 in more formal settings, whereas the population we investigate is unique. Specifically, it consists of bilinguals whose L2 was acquired in the immersive environment and became their more proficient language. These bilinguals are not only highly proficient in English, but they reside in the bilingual and bicultural community characterized by dense codeswitching practices. Exploring L1 and L2 emotion word processing in such bilinguals might help to shed new light on the interplay of the various participant characteristics in modulating the emotion effect.

#### **2. Materials and Methods**

#### *2.1. Participants*

The participants were 27 bilinguals (9 male, 18 female, *M age* = 21.25, *SD* = 5.65) recruited from the student population of a South Texas university. Informed consent was obtained from all subjects involved in the study. Data from one participant were discarded due to an excessive amount of muscular and ocular artifacts (40%) in the EEG recordings. Participants were all right-handed (Oldfield 1971), with normal or corrected-to-normal vision. Proficiency was established based on the adapted version of the Language History Questionnaire (LHQ; Li et al. 2006), while language dominance was assessed with the Bilingual Dominance Scale (BDS) (Dunn and Tree 2009). A total of 16 participants reported Spanish as their L1, 4 grew up as simultaneous bilinguals, and 6 learned English as L1. Out of the 16 L1 Spanish participants, only 2 were born in Mexico, while the remaining were second and third generation immigrants who were born, raised, and educated in the US. All L1 English and simultaneous bilinguals were born and raised in the US. Regardless of their native language, the majority of the bilinguals became dominant in English. Specifically, based on the BDS, 16 bilinguals were categorized as English-dominant, 5 as balanced, and 5 as Spanish-dominant. Proficiency self-ratings revealed that, overall, bilingual participants rated English significantly higher than Spanish in terms of speaking, reading, understanding, and writing. In total, 15 bilinguals reported losing fluency in Spanish and all but 3 had over 7 years of schooling in English. Self-ratings for English were therefore consistently significantly higher than for Spanish in all bilingual groups (see Table 1 for summary of the participant characteristics).


**Table 1.** Participants' language background information. Proficiency rating was measured with a 7-point Likert scale where 1 = very poor, 7 = native like. Asterisks show significant differences in proficiency ratings for English and Spanish.

Note. \* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001.

With regard to AoA, bilinguals were categorized as early and late. The term *early bilingual* is used in the literature to describe an individual who has been exposed to L2 from a very early age, simultaneously with their cognitive and linguistic growth, for example when a child grows up in a bilingual home with parents speaking two languages interchangeably. In turn, a *late bilingual* is a person whose exposure to L2 started after the foundations of their L1 had already been established and who starts learning L2 either through classroom instruction or immigration. The early category included participants who acquired the language before the age of 5, while the late category included two subgroups: participants who learned Spanish/English between the ages of 6–9 and those who learned their L2 between 10–16 (see Heredia and Cieślicka 2014). Since only two participants reported learning their L2 between the ages of 10–16, the two late subcategories were collapsed for further analyses. Overall, for AoA of English, fourteen bilinguals reported learning English before the age of 5 and twelve learned English between the ages of 6–9. For Spanish, 22 were early and 4 were late bilinguals.
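The AoA grouping described above can be sketched as a small Python helper. This is an illustrative sketch only, not the authors' procedure: the function names are hypothetical, and because the text specifies "before the age of 5" and "6–9" without saying where exactly age 5 falls, the sketch assumes that an AoA of 5 years or less counts as early.

```python
def aoa_subgroup(age_years):
    """Map an age of acquisition (in years) to one of three AoA subgroups.

    Assumption: age <= 5 is treated as early; the boundary case is not
    spelled out in the text.
    """
    if age_years < 0:
        raise ValueError("age of acquisition cannot be negative")
    if age_years <= 5:
        return "early"
    if age_years <= 9:
        return "late (6-9)"
    return "late (10-16)"


def aoa_category(age_years):
    """Collapse the two late subgroups into a single 'late' category."""
    return "early" if aoa_subgroup(age_years) == "early" else "late"


# Hypothetical ages, for illustration only.
example = [aoa_category(age) for age in (0, 3, 7, 12)]
```

Collapsing the subgroups in a separate function keeps the finer three-way coding available in case the late subgroups ever need to be analyzed separately.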

#### *2.2. Stimuli*

The stimuli were 240 English emotion-label and emotion-laden words selected from the Affective Norms for English Words (ANEW) database (Bradley and Lang 1999) and their Spanish translations obtained from the Spanish adaptation of ANEW (Redondo et al. 2007). Spanish translations that were cognates or cross-language homographs were excluded. Within each language, emotion words varied along the dimension of valence (positive, negative, and neutral), with 80 words for each condition. Ratings of valence in English differed significantly between positive (*M* = 7.27; CI [7.11, 7.44]), negative (*M* = 2.84; CI [2.61, 3.07]), and neutral (*M* = 5.56; CI [5.31, 5.8]) words, *F* (2, 237) = 404, *p*<sup>s</sup> < 0.001. Ratings of arousal differed between positive (*M* = 5.53; CI [5.36, 5.70]) and neutral (*M* = 4.19; CI [3.99, 4.38]; *p*<sub>Tukey</sub> < 0.001) and between negative (*M* = 5.70; CI [5.45, 5.95]) and neutral words (*p*<sub>Tukey</sub> < 0.001), with no difference between positive and negative words (*p*<sub>Tukey</sub> = 0.51). Similarly, for Spanish, stimuli ratings of valence differed significantly between positive (*M* = 7.22; CI [7.06, 7.38]), negative (*M* = 2.23; CI [2.10, 2.36]), and neutral (*M* = 5.12; CI [5.01, 5.24]) targets, *F* (2, 237) = 1277, *p*<sup>s</sup> < 0.001. Arousal ratings were significantly different between Spanish positive (*M* = 6.01; CI [5.85, 6.16]) and neutral (*M* = 4.57; CI [4.47, 4.68]; *p*<sub>Tukey</sub> < 0.001), as well as between negative (*M* = 6.20; CI [6.04, 6.35]) and neutral targets (*p*<sub>Tukey</sub> < 0.001), with no difference between the positive and negative (*p*<sub>Tukey</sub> = 0.15). Spanish and English arousal and valence ratings did not differ significantly across positive, negative, and neutral categories (all *p*<sup>s</sup> > 0.05).
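To make the reported degrees of freedom concrete, the following minimal Python sketch computes a one-way ANOVA by hand. It is not the authors' analysis code: the function name is hypothetical and the toy ratings are made up. It only illustrates that three valence conditions with 80 words each (240 words in total) yield the degrees of freedom (2, 237) reported for the valence comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over item groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual items around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy ratings, invented for illustration (NOT the ANEW norms).
toy_ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_ratings)
```

With three groups of 80 items, `df_between` is 3 − 1 = 2 and `df_within` is 240 − 3 = 237, whatever the rating values are.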

Stimuli were also matched according to word length, grammatical category, frequency, and concreteness (all *p*<sup>s</sup> > 0.05; see Table 2 for summary of stimuli characteristics). Word frequencies were selected from the SUBTLEX-ESP database (Cuetos et al. 2011) for Spanish and SUBTLEX-US (Brysbaert et al. 2012) for English. Concreteness ratings were derived from Brysbaert et al. (2014) for English and from Hinojosa et al. (2016) for Spanish words. Concreteness was controlled for, such that for each language, half the words were concrete and half were abstract. Thus, the final list included 240 English words (40/40 positive concrete/abstract, 40/40 negative concrete/abstract, 40/40 neutral concrete/abstract). The Spanish translations followed the same procedure.

**Table 2.** Means for valence, arousal, concreteness, word length (number of letters) and word frequency for English and Spanish word stimuli. Square brackets represent SE.


<sup>1</sup> Based on ANEW and REDONDO valence norming: 1-very negative, 9-very positive. <sup>2</sup> Based on ANEW and REDONDO arousal norming: 1-not at all arousing, 9-highly arousing. <sup>3</sup> Based on SUBTLEX-US and SUBTLEX-ESP (SUBTLWF)-frequency per 1 million of occurrences. <sup>4</sup> Based on concreteness ratings from Hinojosa et al. (2016) for Spanish: 1-very abstract, 9-very concrete; English concreteness values based on Brysbaert et al. (2014)—the scale was recoded to be comparable with Spanish: 1-abstract, 5-concrete. <sup>5</sup> Range = 2–5 syllables. Range = 4–11 letters.

An additional set of 160 nonwords (80 English and 80 Spanish) was created using Wuggy (http://crr.ugent.be/programs-data/ (accessed on 1 March 2020)), experimental software that creates nonwords by changing letters from the provided set of language-specific items. The resulting nonwords were pronounceable in English/Spanish, orthographically legal, and matched with the experimental stimuli in terms of length.

Lists were counterbalanced using a Latin square design, and participants were randomly assigned to each list. Two experimental lists were needed to counterbalance the design and to ensure that a participant did not see the same emotion word in both English and Spanish. Each list included 160 nonwords and 240 emotion words, 120 of which were English and 120 were Spanish (see Supplementary Materials for a complete list of stimuli).
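The two-list counterbalancing can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the procedure actually used: the item identifiers are hypothetical placeholders (not the real stimuli), and the alternation within each valence-by-concreteness cell is an assumed way of keeping the cells balanced across languages, which the text does not describe in detail.

```python
from itertools import product

VALENCES = ("positive", "negative", "neutral")
CONCRETENESS = ("concrete", "abstract")
PER_CELL = 40  # 3 valences x 2 concreteness levels x 40 = 240 translation pairs
N_NONWORDS = 160  # every list contains all nonwords (80 English + 80 Spanish)


def build_items():
    """Create placeholder translation pairs (hypothetical IDs, not real stimuli)."""
    items = []
    for valence, conc in product(VALENCES, CONCRETENESS):
        for k in range(PER_CELL):
            items.append({"id": f"{valence}_{conc}_{k}", "index_in_cell": k})
    return items


def build_lists(items):
    """Two-list Latin square: a pair is English in one list, Spanish in the other."""
    list_a, list_b = [], []
    for item in items:
        # Assumed scheme: alternate within each cell so that each list gets
        # 20 English and 20 Spanish words per valence-by-concreteness cell.
        if item["index_in_cell"] % 2 == 0:
            lang_a, lang_b = "en", "es"
        else:
            lang_a, lang_b = "es", "en"
        list_a.append((item["id"], lang_a))
        list_b.append((item["id"], lang_b))
    return list_a, list_b


items = build_items()
list_a, list_b = build_lists(items)
trials_per_list = len(list_a) + N_NONWORDS
```

Under these assumptions each participant sees 400 letter strings (240 words plus 160 nonwords), and no translation pair appears in both languages within the same list.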

#### *2.3. Procedure*

Stimulus presentation was controlled by E-Prime 2.0 (Schneider et al. 2002), which automatically randomized stimuli for each participant. Participants were seated approximately 60–70 cm from a 21-inch computer screen with both index fingers resting on the Chronos response box. They were instructed to read the letter strings appearing on the computer screen and to respond, as fast and as accurately as possible, by pressing a "YES" button on the response box if the string of letters was a Spanish or English word or "NO" if the string of letters was a nonword. Response sides were counterbalanced across participants. On each trial, a fixation cross was displayed for 800 ms, followed by the stimulus. Stimuli were presented centrally in black letters (font: Arial, size: 20) against a white background and remained on the screen until participants responded. After each trial, a "BLINK NOW" message in black capital letters on a white screen appeared for 1000 ms, allowing participants to blink and relax their muscles.

#### *2.4. EEG Recordings*

EEG was recorded from 64 scalp sites using a Biosemi Active Two headcap (10/20 layout) and referenced to electrode Cz. The common mode sense (CMS) active and the driven right leg (DRL) passive electrodes were used as ground electrodes. To minimize artifacts related to eye movements, bipolar horizontal and vertical electrooculography (EOG) activity was recorded with additional electrodes attached under and next to the eyes. Electrode impedances were kept below 5 kΩ. The EEG signals were recorded continuously at a sampling rate of 2048 Hz. Preprocessing steps were performed using *MATLAB* (MATLAB R2022a, The Mathworks, Inc., Natick, Massachusetts, United States), EEGLAB (v. 2021.1; Delorme and Makeig 2004), and the ERPLAB Toolbox (v.9.00; Lopez-Calderon and Luck 2014). The data were first visually inspected for abnormalities, with sections of data showing excessive muscular artifacts being manually rejected. Abnormal channel activity was detected with the help of the trimOutlier plugin (Lee & Myakoshi SCCN, INC, UCSD) and by plotting the channels in EEGLAB. No more than six channels were rejected in each dataset (*M* = 6; min = 0, max = 6).

Each dataset was next filtered offline with a 0.1 Hz high-pass (slope 12 dB/octave) and 30 Hz low-pass (slope 24 dB/octave) IIR Butterworth filter. Subsequently, to correct for vertical and horizontal EOG artifacts, Independent Component Analysis (ICA; Makeig et al. 1995) was run on the EEG data. The mean number of rejected ICs per participant was 2.32 (SD = 0.97; min = 1, max = 4). The continuous EEG signal was next segmented into epochs of 900 ms, starting 200 ms prior to stimulus onset. The pre-stimulus period of 200 ms was used for baseline correction. Remaining ocular and muscular artifacts were detected using the Artifact Detection function (peak-to-peak moving window; threshold: +/−100 µV; window size: 200 ms; window step: 100 ms) in ERPLAB and further subjected to visual inspection. Epochs containing ocular/muscular artifacts or amplitudes exceeding ±100 µV were rejected (see Table 3 for summary of the percentage of accepted epochs per condition).
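The logic of a moving-window peak-to-peak check can be illustrated with a short sketch. This is not ERPLAB's implementation: the function name `exceeds_peak_to_peak` is hypothetical, the 1000 Hz sampling rate in the example is an assumption made for readability, and the sketch assumes that the threshold applies to the difference between the maximum and minimum voltage within each window on a single channel.

```python
def exceeds_peak_to_peak(epoch_uv, srate_hz, threshold_uv=100.0,
                         window_ms=200.0, step_ms=100.0):
    """Return True if any moving window has a peak-to-peak range above threshold.

    epoch_uv: non-empty list of voltages (in microvolts) for one channel and epoch.
    """
    win = max(1, int(round(window_ms * srate_hz / 1000.0)))
    step = max(1, int(round(step_ms * srate_hz / 1000.0)))
    # Slide the window across the epoch; epochs shorter than one window
    # are checked once as a whole.
    last_start = max(0, len(epoch_uv) - win)
    for start in range(0, last_start + 1, step):
        segment = epoch_uv[start:start + win]
        if max(segment) - min(segment) > threshold_uv:
            return True
    return False


# Hypothetical 900 ms epochs sampled at an assumed 1000 Hz.
clean_epoch = [0.0] * 900
blink_epoch = list(clean_epoch)
for i in range(400, 450):
    blink_epoch[i] = 150.0  # simulated blink-like deflection

clean_flag = exceeds_peak_to_peak(clean_epoch, srate_hz=1000)
blink_flag = exceeds_peak_to_peak(blink_epoch, srate_hz=1000)
```

In this sketch an epoch would be marked for rejection if the check returns `True` on any channel; flagged epochs would still be reviewed visually, as described above.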


**Table 3.** Percentage of accepted epochs per condition in the EPN and LPC analyses.

#### *2.5. Statistical Analyses*

#### 2.5.1. Behavioral Data Analysis

RTs exceeding 3.0 standard deviations above the mean were excluded and analyzed as errors (2%). All analyses were performed on correct responses. Trimmed RT and accuracy data were analyzed with a linear mixed-effects model (LMM) using the buildmer package (v. 1.9, Voeten 2020; see also Matuschek et al. 2017) in R (v. 4.1.0, R Core Team 2021) and jamovi (v. 2.0, The Jamovi Project 2021). The variables of interest were language of the target stimulus (English/Spanish) and valence (positive, negative, neutral). Among bilingual participant factors, we originally planned to assess the effect of dominance. However, as is typical of our student population, while many bilinguals are heritage speakers who were exposed to Spanish since birth on account of growing up in a Spanish-speaking household, the majority became dominant in English by virtue of residing in the US and attending US schools (see Section 2.1). Because of the uneven number of participants in each dominance group, we entered participants' dominance score as a continuous variable.
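The trimming rule can be sketched as below. The function name `trim_rts` and the example values are hypothetical, and the sketch assumes a single cutoff computed over one set of RTs; the text does not state whether the mean and standard deviation were computed per participant, per condition, or over the whole sample.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, n_sd=3.0):
    """Split RTs into those kept and those above mean + n_sd * SD."""
    cutoff = mean(rts_ms) + n_sd * stdev(rts_ms)
    kept = [rt for rt in rts_ms if rt <= cutoff]
    excluded = [rt for rt in rts_ms if rt > cutoff]
    return kept, excluded, cutoff


# Hypothetical RTs (ms): 19 typical responses and one very slow response.
rts = [780, 800, 820] * 6 + [800, 5000]
kept, excluded, cutoff = trim_rts(rts)
```

Here only the 5000 ms response falls above the cutoff. Note that with very few observations no value can lie 3 SD above the mean, so a rule of this kind presupposes a reasonably large set of trials.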

Overall, the following fixed effects were included in each model: (1) language (English, Spanish); (2) valence (positive, negative, neutral); (3) dominance score; and (4) their interactions. The fixed effects were coded using deviation coding. As determined by buildmer, maximal models with a full random-effect structure were first computed. These included random intercepts by subject and by item and random slopes by subject and by item (Barr et al. 2013; see also Matuschek et al. 2017). Maximally converged models were then run in jamovi with the gamlj module (General Analyses for Linear Models in jamovi, Version 2.4.7). The final structure, summary and variance components for each of the models are available in the Supplementary Materials.
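As an illustration of deviation coding, the sketch below builds sum-to-zero contrast codes for the two categorical predictors. The function name `deviation_contrasts` is hypothetical, and both the ±1 scaling and the choice of which level receives −1 are assumptions; the software used by the authors may order or scale the codes differently.

```python
def deviation_contrasts(levels):
    """Sum-to-zero codes: k levels give k-1 columns; the last level is -1 throughout.

    Under this coding each coefficient estimates the difference between
    one level's mean and the grand mean (in a balanced design).
    """
    k = len(levels)
    codes = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            codes[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            codes[level] = [-1] * (k - 1)
    return codes


language_codes = deviation_contrasts(["English", "Spanish"])
valence_codes = deviation_contrasts(["positive", "negative", "neutral"])
```

Because every column sums to zero across levels, main effects in a model with interactions are evaluated at the average of the other factor rather than at a reference level.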

#### 2.5.2. Electrophysiological Data Analysis

The ERP data were segmented into three time windows defined *a priori* based on previous research: 200–300 ms and 300–400 ms for the EPN, and 500–700 ms for the LPC (see, e.g., Kissler and Bromberek-Dyzman 2021; Scott et al. 2009). The two different time windows for the early component were chosen to address possible latency shifts in emotion effects recorded for L1 vs. L2, given the L2 delay effect reported in previous emotion word processing studies (e.g., Conrad et al. 2011; Opitz and Degner 2012). Based on the previous literature (e.g., Chen et al. 2015; Conrad et al. 2011; Kissler et al. 2007), the following electrodes were selected for this early component analysis: F7/8, PO3/PO4, P1/P2, P3/P4, P6, P7/P8, CP1, CP5, FC5/FC6, T7/8, O1, Oz. For the LPC effect, which is most salient at the centro-parietal sites, the following electrodes were chosen: CP1/CP2, Pz, P1/P2, P3/P4, P5/P6, PO3/PO4, CPz, Oz, O1/O2. Mean EPN and LPC amplitudes were analyzed with repeated measures (RM) ANOVAs with a 2 (language of the target stimulus: English, Spanish) by 3 (valence: positive, negative, neutral) within-subjects design and dominance score as a covariate. *p*-values were adjusted using the Greenhouse–Geisser correction for violations of sphericity, and the Bonferroni correction was applied for multiple testing in all post hoc comparisons.
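The dependent measure in these analyses, mean amplitude within a time window averaged over an electrode cluster, can be sketched as follows. The function name `mean_amplitude`, the data layout (a dictionary of per-channel voltage lists), the 1 ms sampling step, and the constant example voltages are all assumptions made for illustration; the authors computed these values with ERPLAB.

```python
def mean_amplitude(erp_uv, times_ms, channels, window_ms):
    """Average voltage over a time window and a set of channels.

    erp_uv: dict mapping channel name -> list of voltages aligned with times_ms.
    window_ms: (start, end) in ms, inclusive at both ends.
    """
    start, end = window_ms
    idx = [i for i, t in enumerate(times_ms) if start <= t <= end]
    # Average over time within each channel first, then across channels.
    per_channel = [sum(erp_uv[ch][i] for i in idx) / len(idx) for ch in channels]
    return sum(per_channel) / len(per_channel)


# Hypothetical averaged waveforms for a 900 ms epoch starting at -200 ms.
times_ms = list(range(-200, 700))
erp_uv = {"O1": [-1.0] * 900, "Oz": [-3.0] * 900}
early_epn = mean_amplitude(erp_uv, times_ms, ["O1", "Oz"], (200, 300))
```

One such value per participant, language, and valence condition would then enter the 2 × 3 repeated measures ANOVA for each time window.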

To fully address our research question regarding the effect of AoA on ERP emotion effects, we would need to compare the amplitudes elicited by Spanish/English emotion words between early/late learners of Spanish/English. However, as reported earlier (see Section 2.1), the majority of our bilinguals (22) were early Spanish learners, with only 4 participants in the late AoA Spanish group. Because of the unequal group size for the AoA of Spanish/English, this variable was not entered into the overall analysis.

#### **3. Results**

#### *3.1. RT Data*

We found a fixed effect of language (*β* = −83.28, *SE* = 14.7, *df* = 30.09, *t* = −5.67, *p* < 0.001), with slower responses to Spanish (*M* = 963 ms, 95% CI [878, 1048]) than English targets (*M* = 777 ms, 95% CI [707, 847]), and of valence (*β* = 41.52, *SE* = 9.22, *df* = 439.72, *t* = 4.5, *p* < 0.001), with faster responses to positive (*M* = 810 ms, 95% CI [736, 884]) than to negative (*M* = 911 ms, 95% CI [838, 985]) and neutral words (*M* = 889 ms, 95% CI [815, 963]). In addition, the analysis revealed a language × valence interaction (*β* = −28.5, *SE* = 9.22, *df* = 439.71, *t* = −3.09, *p* < 0.01). This interaction showed that positive Spanish targets were responded to faster (*M* = 870 ms, 95% CI [782, 958]) than negative (*M* = 1033 ms, 95% CI [945, 1121]) and neutral Spanish targets (*M* = 986 ms, 95% CI [898, 1074]). Regardless of valence, Spanish targets elicited slower responses than English targets (negative Spanish: *M* = 1033 ms, 95% CI [945, 1121] vs. negative English: *M* = 790 ms, 95% CI [716, 864]; neutral Spanish: *M* = 986 ms, 95% CI [898, 1074] vs. neutral English: *M* = 791 ms, 95% CI [717, 865]; positive Spanish: *M* = 870 ms, 95% CI [782, 958] vs. positive English: *M* = 749 ms, 95% CI [675, 823]). Dominance failed to yield significant effects (see Figure 1 for summary of the RT data).

**Figure 1.** Mean RTs in milliseconds for English and Spanish positive, negative, and neutral words recorded in the LDT task. Error bars depict 95% confidence interval.


#### *3.2. Accuracy Data*

The accuracy analysis showed a fixed effect of language, *b* = 0.57, *SE* = 0.24, *z* = 2.38, *p* < 0.001, whereby English targets (*M* = 98%, 95% CI [97, 98]) were responded to with greater accuracy than Spanish targets (*M* = 89%, 95% CI [88, 90]). The analysis also yielded a fixed effect of valence, *b* = −0.62, *SE* = 0.18, *z* = −3.52, *p* < 0.001, such that positive targets (*M* = 96%, 95% CI [95, 97]) were responded to with greater accuracy than both negative (*M* = 92%, 95% CI [91, 93]) and neutral targets (*M* = 92%, 95% CI [91, 93]). Mirroring the RT data, where English words elicited faster responses than Spanish words, regardless of valence, the accuracy analysis also showed higher response accuracy for English relative to Spanish targets (negative English: *M* = 97%, 95% CI [96, 99] vs. negative Spanish: *M* = 87%, 95% CI [86, 89]; neutral English: *M* = 97%, 95% CI [95, 98] vs. neutral Spanish: *M* = 87%, 95% CI [85, 88]; positive English: *M* = 99%, 95% CI [97, 100] vs. positive Spanish: *M* = 93%, 95% CI [92, 95]) (see Table 4 for the summary of the RT and accuracy data). Unlike the RT analysis, where no effects of dominance were obtained, here we found a fixed effect of dominance, *b* = −0.05, *SE* = 0.01, *z* = −3.88, *p* < 0.001 and a dominance × language interaction, *b* = 0.05, *SE* = 0.01, *z* = 3.19, *p* < 0.01, with English-dominant bilinguals responding more accurately to English (99%) than to Spanish targets (89.6%, *p* < 0.001). Conversely, Spanish-dominant and balanced bilinguals responded more accurately to Spanish targets (Spanish-dominant: 98.7%; balanced: 98.6%) than did English-dominant bilinguals (89.6%, *p* < 0.001).

**Table 4.** Means of response latencies in milliseconds and accuracy results (percentage of correct responses; SE in parentheses) for Spanish and English emotion words recorded in the lexical decision task (LDT).


#### *3.3. EEG Results*

#### 3.3.1. EPN

In the early EPN time window, the RM ANOVA revealed a main effect of language, *F*(1,24) = 5.92, *p* = 0.017, η<sub>p</sub><sup>2</sup> = 0.21, with more pronounced amplitudes following English targets (*M* = −0.98 µV, 95% CI [−1.9, −0.1]) than Spanish targets (*M* = 0.68 µV, 95% CI [−0.68, 2.02]) (see Figure 2). Likewise, in the late EPN time window, there was a main effect of language, *F*(1,25) = 4.77, *p* = 0.04, η<sub>p</sub><sup>2</sup> = 0.160, with English targets eliciting larger amplitudes (*M* = −1.02 µV, 95% CI [−3.1, 1.03]) than Spanish targets (*M* = 0.09 µV, 95% CI [−1.54, 1.72]) (Figure 3). No effect of valence was found. Likewise, dominance failed to yield significant effects in either early or late EPN time windows.



**Figure 2.** Amplitudes recorded for EPN 200–300 ms as a function of language of the target stimulus and valence. Representative electrodes O1, PO3, and CP1 illustrate differences in the EPN responses between English and Spanish negative, positive, and neutral targets. Bar plots show posterior EPN activity averaged across the posterior electrodes and the entire time window for EPN 200–300 ms. Error bars are standard errors.

**Figure 3.** Amplitudes recorded for EPN 300–400 ms as a function of language of the target stimulus and valence. Representative electrodes O1, P3, and P6 illustrate differences in the EPN responses between English and Spanish negative, positive, and neutral targets. Bar plots show posterior EPN activity averaged across the posterior electrodes and the entire time window for EPN 300–400 ms. Error bars are standard errors.

#### 3.3.2. LPC

In the LPC time window, there was a main effect of language, *F*(1,24) = 7.07, *p* = 0.01, η<sub>p</sub><sup>2</sup> = 0.23, which revealed that Spanish targets evoked a more pronounced positivity (*M* = 0.1 µV, 95% CI [−1.06, 1.26]) than English targets (*M* = −2.14 µV, 95% CI [−3.99, −0.28]) (see Figure 4). Neither valence, *F*(2,48) = 1.63, *p* = 0.93, η<sub>p</sub><sup>2</sup> = 0.003, nor language × valence interaction, *F*(2,48) = 0.07, *p* = 0.93, η<sub>p</sub><sup>2</sup> = 0.05, turned out to be significant. Similar to the results reported in the early time windows, dominance did not show any significant effects.

**Figure 4.** Amplitudes recorded for LPC 500–700 ms as a function of language of the target stimulus. Representative electrodes Oz, P2, and O1 illustrate differences in the LPC responses between English and Spanish negative, positive, and neutral targets. Bar plots show LPC activity averaged across the centro-parietal electrodes and the entire time window for LPC 500–700 ms. Error bars are standard errors.

#### **4. Discussion**

The current study aimed to shed light on the dynamics of emotion word processing in immersed Spanish-English/English-Spanish bilinguals who reside in a bilingual community and are routinely exposed to both languages in everyday personal and professional interactions. While the majority of the participants learned Spanish as their L1 or grew up speaking both languages, regardless of their L1, all bilinguals uniformly reported significantly higher proficiency in English than in Spanish. The participants were presented with English/Spanish emotion-label and emotion-laden (positive, negative) and neutral words along with nonwords and asked to make a lexical decision, while their RTs and ERP (EPN and LPC) responses were recorded. We asked the following research questions: (1) Does L1/L2 emotion processing differ qualitatively and/or quantitatively in highly proficient immersed bilinguals who are routinely exposed to both languages and reside in a bilingual community? (2) How do language dominance and AoA modulate L1/L2 emotion word processing?

#### *4.1. Behavioral Results*

The behavioral data revealed that English targets were responded to significantly faster than Spanish targets, and this effect held true for both emotionally valenced and neutral stimuli. The accuracy data further confirmed the RT results, showing that English targets were responded to with significantly greater accuracy than Spanish targets. Dominance did not emerge as significant in the RT data. The lack of a modulating effect of dominance in our RT data is compatible with the study by Ferré et al. (2017). They looked at the effects of language status, task type, and word concreteness on emotional content processing by Catalan-Spanish bilinguals who were early bilinguals highly fluent in both languages but dominant in Catalan. The task was either explicit (affective decision task, Exp.1) or automatic (LDT, Exp.2). Results showed effects of valence and concreteness for both the explicit and implicit tasks (Experiments 1 and 2). In the LDT, negative words took longer to process and elicited more errors than positive words, and this effect held true regardless of the language of target stimuli. Hence, despite the participants' dominance in Catalan, the fact that they were all highly proficient in both languages seemed to contribute the most to their performance. Along similar lines, regardless of our bilingual participants' varying dominance, their responses were faster and more accurate for English, their more proficient language, than for Spanish.

In a subsequent experiment (Exp.3), Ferré et al. (2017) employed an LDT with a group of Catalan-Spanish bilinguals who were all late learners of English and dominant in Catalan. While the pattern of results was highly comparable for both Spanish and Catalan, the two languages that the bilingual participants grew up speaking and were immersed in, the effects diverged for English, the less proficient language acquired later in life and in a formal setting. Here, again, dominance did not seem to play a role but age/context of acquisition and proficiency were relevant. While our participant population was more varied, in that it included not only early but also late English-Spanish/Spanish-English bilinguals who were dominant either in their L1 or L2, the common characteristic of the participants in both studies was their high proficiency in the language(s) they performed best at, as well as a highly immersive context offering a rich bilingual and bicultural experience.

In addition, we found a robust emotion effect in our behavioral data such that positive targets were recognized significantly faster than the negative and neutral. This emotion effect was modulated by language of the target stimulus and present only in Spanish. Accordingly, positive Spanish targets were responded to faster compared to negative and neutral targets. In addition, on error rates, a significant valence effect was found with positive targets eliciting significantly fewer errors than either negative or neutral targets. This effect was again constrained by language, such that positive Spanish targets provoked the smallest, and negative/neutral the greatest, number of errors.

Our RT data replicate the facilitatory effect widely reported in LDT studies where RTs are faster for positively valenced over neutral words (e.g., Chen et al. 2015; Conrad et al. 2011; Hofmann et al. 2009; Kanske and Kotz 2007; Kousta et al. 2009; Kuchinke et al. 2005; Mueller and Kuchinke 2016; Recio et al. 2014) and over negative words (Briesemeister et al. 2011; Kanske and Kotz 2007; Kuchinke et al. 2005). Our accuracy data showing that positive words elicited significantly fewer errors than negative or neutral words are again consistent with the data reported in previous studies that employed the LDT (Briesemeister et al. 2011; Chen et al. 2015; Conrad et al. 2011; Ferré et al. 2017; Kousta et al. 2009; Kuchinke and Lux 2012).

Results from the behavioral data showing an enhanced emotion effect for Spanish, as opposed to English, suggest that L1 and L2 emotion word processing might diverge for immersed bilinguals. This absence of the emotion effect for English in our Spanish-English bilinguals might be viewed as indicative of the reduced L2 emotional resonance discussed earlier. For example, in an LDT with Chinese-English bilinguals, Chen et al. (2015) showed a diminished emotional impact of L2 as compared to L1 valenced words in both early and late ERP components. In their study, positive words elicited a larger EPN than neutral words and a smaller LPC than both neutral and negative words, but this emotion effect was only present for L1. Notably, our participants were predominantly L1 Spanish bilinguals dominant in English, so they differed substantially from the bilingual group in Chen et al.'s (2015) study, which employed Chinese-English bilinguals dominant in Chinese and residing in their L1 environment.

The increased RTs to negative Spanish, but not negative English, targets in our data are also compatible with studies showing a selectively attenuated response to L2 negative stimuli (e.g., Jończyk et al. 2016; Wu and Thierry 2012). Wu and Thierry presented Chinese native speakers fluent in English with pairs of English words, some of which had a concealed sound repetition if translated into Chinese. Participants were asked to decide if the word pairs were related in meaning. While sound repetition priming elicited the expected effects for positive and neutral words, English words with a negative valence failed to automatically activate their Chinese translations, suggesting an inhibitory mechanism whereby negative emotional content in L2 might be suppressed. Attenuated processing of L2 negative emotion words was further corroborated in Jończyk et al.'s (2016) experiment employing a context richer than single words. Late fluent Polish-English bilinguals residing in the UK read English and Polish sentences and indicated whether each sentence, which ended with either a semantically and affectively congruent or incongruent adjective, made sense. Results showed an increased N400 response to L1 Polish emotionally-valenced sentences and a reduction in the N400 amplitude for English sentences ending with negatively-valenced words, independent of semantic congruity.

The fact that our bilingual participants displayed this reduced response to emotion words in English, the language in which they became more proficient, would seem to indicate that one's native language continues to be intrinsically more emotional even for highly immersed bilinguals. Those results are inconsistent with the suggestion that an increase in L2 proficiency might lead to a similar emotional sensitivity in bilinguals' two languages (e.g., Costa et al. 2014; Jończyk et al. 2019). In their study adopting the "trolley dilemma" (Thomson 1985), Costa et al. (2014) found that the more proficient bilinguals became in their L2, the more closely their L2 performance resembled that of L1 when making moral decisions, as opposed to less proficient bilinguals who would tend to be more moral in their L1 and more utilitarian in their L2.

However, as discussed earlier, emotion effects in L1 and L2 are likely affected not just by proficiency but by a complex interplay of such bilingual characteristics as AoA, the context of acquisition, frequency of daily usage, length of residence in the L2 speaking country, and possibly many other factors. In fact, some researchers have suggested that in order for bilinguals to have comparable emotional responses in L1 and L2, they need to be early AoA learners in addition to being highly proficient (see Harris et al. 2006). Indeed, the study by Harris (2004) has shown that L1 and L2 reprimands and taboo words elicited a comparable GSR in early but not sequential bilinguals, pointing to the possibility that AoA might be crucial in modulating the affective response of the autonomic nervous system (but see Ponari et al. 2015).

Overall, the behavioral analysis showed faster responses in bilinguals' more proficient (English) language, regardless of their dominance. However, in the accuracy analysis, dominance did appear significant, with Spanish-dominant and balanced bilinguals obtaining higher accuracy for Spanish than English targets and English-dominant bilinguals showing higher accuracy for English than for Spanish stimuli. The emotion effect was only observed for Spanish targets. Given that the majority of the bilingual participants were early learners of Spanish, either learning Spanish as L1 or simultaneously with English, this result is consistent with the idea that the first learned language might still evoke a stronger emotion effect than an L2, even if a bilingual person becomes more proficient or more dominant in their L2. As noted earlier, the study by Harris et al. (2003) with highly proficient Turkish speakers of English showed that while reactions to taboo words were identical in both L1 and L2, certain words (childhood reprimands) evoked a larger skin conductivity response in L1 only as compared to L2, suggesting that the language status might override proficiency in certain contexts.

#### *4.2. Electrophysiological Results*

#### 4.2.1. L1 vs. L2 Emotion Word Processing

In the electrophysiological data, we found a significant effect of language in both early and late EPN time windows, such that English words elicited a larger negativity than Spanish words. No emotion effects were present for either Spanish or English targets in either early or late EPN time windows. In the LPC 500–700 ms time window, a robust main effect of language was again present, showing the reverse of the pattern recorded for the EPN. Here, Spanish targets evoked a larger LPC positivity than English targets. A significant effect of language found on both early and late components in our study is generally compatible with findings from Vélez-Uribe and Rosselli (2021), although the pattern of our data diverges from theirs. Participants in Vélez-Uribe and Rosselli's (2021) study were Spanish-English balanced and unbalanced bilinguals highly comparable to our bilingual population, i.e., living immersed in a bicultural environment and receiving their education primarily in English. The EPN amplitude was found to be larger for Spanish than English targets across all valence categories, regardless of the participants' dominance. Vélez-Uribe and Rosselli (2021) suggest that an enhanced EPN in response to Spanish, as opposed to English, words might be reflective of the overall higher proficiency of the bilingual participants in English compared to Spanish, thus evoking a larger negativity in the less proficient language. However, in later time windows, their study showed larger LPC amplitudes for English than Spanish words, regardless of the bilingual group.

Overall, discrepancies between our results and those of Vélez-Uribe and Rosselli (2021) might potentially be attributed to task-related demands. While the explicit valence rating task employed by Vélez-Uribe and Rosselli (2021) might have favored the more proficient language by encouraging deeper semantic processing in early time windows, the implicit lexical decision task used in the present study might be merely indicative of the automatic attention capture that the EPN typically reflects, without necessarily coinciding with early availability of the emotional content. More globally, the lack of emotion effects in English despite an overall enhanced EPN response to English stimuli might possibly be related to our experimental design. Specifically, we used a fully randomized, mixed experimental design likely to have further weakened the strength of L1 and L2 valence effects. Typically, bilingual ERP studies into L1 and L2 emotion effects employ a blocked design for each language (e.g., Conrad et al. 2011; Jończyk et al. 2016; Kissler and Bromberek-Dyzman 2021; Opitz and Degner 2012). Since our bilingual participants were habitual codeswitchers who routinely engage in conversations where lexical items from Spanish and English are used interchangeably, for the sake of ecological validity we purposefully designed a study with a mixed design. Presence of both language stimuli was announced in the instructions and emphasized throughout the experimental set-up and practice block, with participants specifically told that they would see both Spanish and English words/nonwords.

Such a design, nevertheless, might have inadvertently led to brain responses related to an enhanced cognitive control which is called for when bilinguals have to process mixed language stimuli. Several ERP components have been identified as sensitive to codeswitching in contexts where bilingual participants are exposed to mixed language stimuli (see Van Hell et al. 2018 for an overview). One component of interest to our study is an early frontal positivity (200–300 ms), which has been linked to attention shifts from the expected to unexpected language as well as from a narrow to a broad focus of attention (Beatty-Martínez and Dussias 2017). In a series of ERP experiments, Beatty-Martínez and Dussias (2017) examined whether bilinguals' codeswitching experience would have a modulating effect on the processing of codeswitched stimuli. To that effect, two groups of Spanish-English bilinguals were recruited: the first group routinely exposed to codeswitched speech by virtue of being immersed in a dual-language context and the second group consisting of bilinguals living in a single-language context devoid of the codeswitching experience. Participants were presented with preamble-target sentence pairs, with the first sentence providing supporting context and the second containing a target codeswitch involving an English noun with a Spanish determiner. While the two bilingual groups differed in their sensitivity to the switched targets, only non-codeswitchers manifested an early positivity for switched vs. non-switched conditions.

Beatty-Martínez and Dussias (2017) interpret these results as supporting Green and Wei's (2014) *Control Process* (CP) model which links language users' codeswitching behaviors to distinct control states in bilinguals. Whereas bilinguals in unilingual and bilingual contexts experience a competitive relationship between their languages on account of having to actively select one language only, this is not the case for bilinguals in dense codeswitching contexts where a cooperative relationship between their languages is present. Since the early positive component is an index of attentional control, Beatty-Martínez and Dussias (2017) propose that codeswitches trigger a shift of attention from a narrow focus, typical of a competitive control state, to a broad focus characterizing a cooperative control state. The early positivity can hence be viewed as an index of control, such that in contexts encouraging the activation and selection of both languages, attention would already be broad and no shift away from a narrow focus would be necessary (Kaan et al. 2020).

Crucially, Beatty-Martínez and Dussias (2017) acknowledge that the early positivity effect could also reflect the overlapping N2 and P3 waves, which are present in this time window (200–300 ms) and suggest that the early frontal positivity might be a combination of P2-N2 and P3 components. Because of this overlap, lack of emotion effects in our EPN data might be attributed to the contamination from the competing codeswitch effects present in the mixed design trials, especially since the P3 component has also been associated with evaluation of the affective valence (see Zhang et al. 2014). Of interest is the question whether our bilinguals, who are habitual codeswitchers and should display a cooperative relationship between their languages, would still experience a codeswitch cost.

In a more recent ERP study relevant to this question, Kaan et al. (2020) examined whether a pro-active selection of both languages primed by the bilingual context would attenuate the early frontal positivity response for a codeswitch vs. no-switch control. They presented Spanish-English bilinguals with English sentences that were English only or contained a codeswitch from English to Spanish. In one half of the study, participants read the sentences together with an English monolingual who accompanied them; in the other half, they did so with another Spanish-English bilingual. Consistent with the codeswitching literature, switches elicited an enhanced fronto-central positivity; however, the effect was attenuated in the bilingual condition where a Spanish-English bilingual accompanied a participant. These findings suggest that bilinguals expecting to operate in the bilingual context can accommodate codeswitches, in line with the dynamic control model of language processing.

However, the experimental setup in Kaan et al.'s (2020) study was very elaborate, including the presence of another monolingual/bilingual person and a joint reading task to ensure a strongly priming bilingual context. In their Experiment 1, which included bilinguals with a self-reported regular exposure to codeswitching but which lacked the manipulation of the bilingual context, Kaan et al. did find an enhanced frontal positivity in switch vs. non-switch trials. Kaan et al. (2020) suggest that, despite their participants' codeswitching experience, the use of switches in a written isolated context might not have been strong enough to engage the broad attentional focus that would eliminate the switch cost effect. Along the same lines, while our bilinguals came from the dense switching environment, their codeswitching practices are primarily executed in the spoken language mode and carried out in everyday conversations with Spanish-speaking family members and bilingual peers. The written language they are predominantly exposed to by virtue of their academic career is switch-free English.

In a bilingual ERP study directly relevant to our experimental setup, Christoffels et al. (2007) had German-Dutch bilinguals name pictures in their L1 and L2 in either blocked or mixed language conditions. The bilinguals were dominant in their L1 and switched languages routinely in their everyday lives. Switching costs manifested in the 275–375 ms and 375–475 ms time-windows. In the first time window (275–375 ms), an increased negativity was found for non-switch trials relative to blocked ones for both L1 and L2. The second time window affected mainly participants' L1 and resulted in more enhanced ERP modulations for blocked vs. mixed language conditions. Noteworthy is the fact that Christoffels et al.'s time windows overlap with those selected for our EPN measures (200–300 ms, 300–400 ms). Given that the lowest amplitudes in Christoffels et al.'s data were found for switch and the highest for blocked trials, our attenuated EPN responses might have been modulated by the mixed language condition where switch trials were predominant.

Another component of interest to our results, which has been consistently reported for codeswitched vs. control words, is the LPC (Moreno et al. 2002; Ng et al. 2014; Van Der Meij et al. 2011). LPC modulation in response to a mixed language design was found in Kaan et al.'s (2020) study with Spanish-English bilinguals described above, with switch trials eliciting a larger positivity than non-switch ones. Importantly, LPC switch effects have been found to be particularly prominent in higher proficiency bilinguals (Van Der Meij et al. 2011) and more robust for switches into the non-dominant language (Litcofsky and Van Hell 2017). Litcofsky and Van Hell (2017) used a self-paced reading paradigm with intrasentential codeswitches in both language directions with highly proficient Spanish-English bilinguals who were habitual codeswitchers. The participants were asked to read sentences which switched from L1 to L2 or in the opposite direction. Switched words elicited higher positivities than non-switched ones in the 500–900 ms LPC time window. While no significant differences were found between switched and non-switched sentences for switches into the dominant language, switches into the weaker language elicited a large posterior positivity. According to Litcofsky and Van Hell (2017), this switching cost asymmetry might relate to the fact that, when switching into the nondominant language, bilinguals would need to exercise more cognitive effort to activate their weaker language (cf. Green 1998). Consistent with those findings, enhanced LPC responses to Spanish vs. English targets found in our data might be partially attributed to the switch costs reported in the literature for the nondominant language. Since English was the more proficient language for our bilingual participants and the majority of them reported being dominant in English, a mixed design condition might have contributed to higher LPC amplitudes for Spanish, the weaker language.

Overall, since language control modulates the amplitude of the early P2, P3, and N2 and late LPC components and has been recorded in time windows overlapping with those we measured, our mixed design might have resulted in diminishing emotion effects due to the competing codeswitch effects in the data. While this is certainly a limitation in our study, it offers a valuable insight to take into account when planning future L1 and L2 studies with highly immersed bilinguals.

Crucial in addressing absence of the valence effect in our data are findings from Delaney-Busch et al. (2016), who examined how stimulus characteristics, such as valence and arousal as well as experimental task demands, affect the LPC response. Two ERP experiments were conducted, each with a different group of participants but with the identical set of stimuli. In Experiment 1, a semantic-monitoring task was used where neither valence nor arousal of the stimulus words were relevant for its successful completion. In this task, participants were asked to press a button if a word presented on the screen belonged to the category of animals. While judging the word category membership encourages deep semantic processing in that it requires participants to access semantic features of the target, the dimensions of valence and arousal are task-irrelevant. Results showed no effect of valence but a significant effect of arousal, such that high-arousal words elicited a larger LPC amplitude than low-arousal ones. On the other hand, in Experiment 2, where participants were instructed to make an explicit judgment regarding the valence of each stimulus word, the LPC showed a significant effect of valence with negative words eliciting the largest response, but no effect of arousal.

According to Delaney-Busch et al. (2016), these results can explain inconsistencies in the ERP literature where no effect of valence might be present on the LPC if stimuli are matched on arousal, as opposed to the strong LPC effect that would be recorded for valenced words which are high in arousal and hence likely to differ from low-arousal neutral ones (see also Recio et al. 2014). Importantly, as discussed earlier (Section 2.2), both positive and negative targets in our study were matched on arousal, with their ratings ranging from medium (4.5) to medium-high (7.5) on a 1–9 scale and the average arousal rating of *M* = 5.8 across all Spanish and English stimuli. Neutral words were medium arousal, ranging from 3.0 to 5.5, with an averaged mean for Spanish and English stimuli of *M* = 4.4. While arousal ratings for valenced vs. neutral stimuli differed significantly in the statistical analysis, the arousal level difference between the emotion and neutral words in our study was substantially smaller than that reported in the bilingual literature. For example, in Chen et al.'s (2015) study, ratings of arousal for positive (*M* = 5.40) and negative (*M* = 5.41) words were more than twice as high as those for neutral ones (*M* = 2.65). Differences in arousal ratings coupled with task demands might hence explain absence of emotion effects in our ERP data.
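The size of the arousal gap can be made concrete with a small calculation. The sketch below only restates the means quoted in the paragraph above as valenced-to-neutral ratios; the function name `arousal_ratio` is a hypothetical helper introduced for illustration, and the ratio itself is one possible way of expressing the gap, not a statistic reported in either study.

```python
def arousal_ratio(valenced_mean: float, neutral_mean: float) -> float:
    """Ratio of mean arousal for valenced words to mean arousal for neutral words."""
    if neutral_mean <= 0:
        raise ValueError("neutral_mean must be positive")
    return valenced_mean / neutral_mean


# Means quoted in the text (1-9 arousal scale).
chen_positive = arousal_ratio(5.40, 2.65)  # Chen et al. (2015), positive vs. neutral
chen_negative = arousal_ratio(5.41, 2.65)  # Chen et al. (2015), negative vs. neutral
present_study = arousal_ratio(5.8, 4.4)    # present study, valenced vs. neutral

print(f"Chen et al. positive/neutral: {chen_positive:.2f}")
print(f"Chen et al. negative/neutral: {chen_negative:.2f}")
print(f"Present study valenced/neutral: {present_study:.2f}")
```

On these figures the valenced words in Chen et al. (2015) were rated roughly twice as arousing as the neutral ones, whereas in the present stimulus set the ratio is closer to 1.3.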

To sum up, with regard to our first research question, the present study revealed significant differences in behavioral and electrophysiological responses to English and Spanish words, with the emotion effect only present for Spanish in the RT data, the overall attenuation of the effect in the ERP data, and the divergent pattern of results for Spanish and English in early vs. late time windows.

#### 4.2.2. Bilingual Characteristics Modulating Emotion Word Processing

Our second research question looked at the potential influence of bilingual participant characteristics, such as dominance and AoA, on emotion word processing. Given unequal numbers of participants in the early/late AoA groups, this variable was not explored. In turn, dominance failed to show a significant effect in either early or late time windows, with the EPN and LPC responses varying solely as a function of the target language. Accordingly, while participants' more proficient language (English) evoked more pronounced EPN amplitudes than the less proficient (Spanish), the reverse effect was found for the LPC time window.

Hence, similar to Vélez-Uribe and Rosselli's (2021) study, the results reported here seem to point to a crucial role that proficiency plays in bilingual emotion word processing. Interestingly, while both proficiency and dominance appeared significant in Vélez-Uribe and Rosselli's study, in our case dominance did not emerge as important. To further explore possible causes of the absence of the dominance effect in our data, we ran an *a posteriori* correlation analysis between participants' Spanish/English proficiency ratings and their dominance score. Ideally, we should expect proficiency ratings to be highly correlated with the dominance score in each of the participants' languages. As per the BDS coding, the higher the value on the scale, the more dominance in English it indicated. Conversely, the lower (more negative) the value on the BDS, the higher the dominance in Spanish. Scores of approximately 0 indicated a balanced bilingual. We found a significant negative correlation between proficiency in Spanish and participants' dominance score [*r*(26) = −0.66, *p* < 0.001], suggesting that higher proficiency in Spanish was also associated with Spanish dominance. In contrast, analysis with English proficiency failed to yield significant results [*r*(26) = 0.26, *p* = 0.098], implying that regardless of their dominance participants were all comparably highly proficient in English. A follow-up ANOVA run with dominance as a grouping variable and proficiency in English/Spanish as dependent variables confirmed these results. English proficiency failed to reach significance in the analysis, with Spanish proficiency only marginally significant, *F*(2,23) = 4.15, *p* = 0.01. Post-hoc comparisons of English proficiency ratings between Spanish-dominant (*M* = 6.45, 95% CI [5.88, 7.02]), English-dominant (*M* = 6.68, 95% CI [6.35, 7.02]), and balanced bilinguals (*M* = 6.29, 95% CI [5.67, 6.92]) were all non-significant, whereas comparisons for Spanish yielded a significant difference only between English-dominant (*M* = 4.33, 95% CI [3.42, 5.25]) and Spanish-dominant bilinguals (*M* = 6.50, 95% CI [5.82, 7.18]; *t*Tukey(23) = −2.76, *p* < 0.05), such that English-dominant bilinguals were significantly less proficient in Spanish than those who were Spanish-dominant. Overall, these results indicate that language proficiency might be a better predictor of L1/L2 emotion word processing than dominance.
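The logic of the correlation check described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline used in the study: the values are made up for demonstration, the `balanced_band` cutoff for "approximately 0" is a hypothetical choice, and `pearson_r` and `dominance_label` are helper names introduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dominance_label(bds_score, balanced_band=5.0):
    """Map a BDS score to a group; the band width is a hypothetical cutoff."""
    if bds_score > balanced_band:
        return "English-dominant"   # higher scores: more English dominance
    if bds_score < -balanced_band:
        return "Spanish-dominant"   # lower (negative) scores: more Spanish dominance
    return "balanced"               # scores near 0


# Made-up illustrative values, NOT the study's data.
bds_scores = [-30.0, -12.0, 2.0, 15.0, 28.0, 40.0]
spanish_proficiency = [7.0, 6.5, 6.0, 5.0, 4.5, 4.0]

r = pearson_r(spanish_proficiency, bds_scores)
print(f"r = {r:.2f}")
print([dominance_label(score) for score in bds_scores])
```

With this coding, a negative *r* between Spanish proficiency and the dominance score means that higher Spanish proficiency goes together with scores toward the Spanish-dominant end of the scale, which is the direction reported above.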

Indeed, an LDT study by Ponari et al. (2015) seems to point to the superiority of proficiency over other participant characteristics, such as dominance, AoA or L1/L2 status in affective processing of L2 words. Bilingual participants in the study were recruited from diverse L1 families, including sign and non-Latin-script languages with a varying degree of typological distance from English. All bilinguals were highly proficient L2 speakers of English. Results of an LDT on negative, positive, and neutral words showed a comparable emotion facilitation effect for bilingual participants and native speakers of English, regardless of the bilinguals' varied L1 backgrounds, AoA, the degree of immersion, or the frequency and domain of L2 use. Likewise, based on the review of the functional neuroimaging studies using PET and fMRI to explore cerebral language organization in bilinguals during comprehension and production tasks, Abutalebi et al. (2001) suggest that proficiency is the most important factor affecting the bilingual language system, much more so than age of acquisition.

Despite its critical importance in bilingual studies, language proficiency has been notoriously difficult to objectively measure and conceptualize. Generally defined as the ability to use a language fluently, proficiency is viewed as a multidimensional construct subsuming linguistic components, such as phonology, orthography, morphology, syntax, and lexicon, in addition to pragmatic, sociolinguistic, and discourse-level features (De Souza and Silva 2015).

The weak correlation between our participants' dominance and proficiency points to a larger question identified in the bilingual literature, namely, the employment of self-assessment proficiency measures. In a review of 140 empirical papers published in the journal *Bilingualism: Language and Cognition* between 1998 and 2011, Hulstijn (2012) notes that over half of them included self-assessment of language proficiency (LP) rather than an objective LP test as an independent variable; yet, participants' LP scores were seldom applied to explaining variance obtained in the dependent variables (see also De Souza and Silva 2015).

Despite these criticisms, self-ratings have been extensively used to assess bilingual language proficiency, and multiple studies have shown highly robust correlations between self-ratings and such objective proficiency measures as reading/auditory comprehension, reading fluency, grammaticality judgment speed/accuracy, picture naming, receptive vocabulary, and sound awareness (see Marian et al. 2007). Indeed, the Language History Questionnaire (LHQ; Li et al. 2006, 2019) employed in the present study has been widely used in the bilingual literature to examine language proficiency and the background of bi/multilingual language users, and its scores have been validated with objective measures of proficiency, such as, for example, verbal fluency (Li et al. 2019). Apart from the self-assessment module, where participants rate their proficiency in reading, writing, speaking, and listening, the LHQ examines participants' AoA, language of instructed education, length of using the languages, the frequency of daily language use and language mixing, the current country of residence, as well as language preference and cultural identity.

More nuanced than the reliability of self-assessment ratings, however, is the issue of the correlation between self-ratings of proficiency and dominance, as well as bilingual participants' ability to classify themselves into dominance groups. In the study more directly relevant to our results, Gollan et al. (2012) looked at the usefulness of proficiency self-ratings for establishing participants' spoken language dominance. In order to obtain objective measures of spoken proficiency, 52 young and 20 aging Spanish-English bilinguals were interviewed in each language using a structured oral proficiency interview and completed a picture naming test in each language. In addition, participants self-rated their language proficiency using a 10-point scale ranging from (1) novice low to (10) superior. Based on participants' performance on each of the measures, Gollan et al. (2012) calculated an index score that reflected the degree of balanced bilingualism. This index was obtained by dividing the lower score obtained in whichever language by the higher one for each measure. For example, a participant who rated themselves as superior (10) in English and intermediate high (6) in Spanish, would be classified as 60% bilingual. Thus, the index scores reflected the degree to which knowledge of each language was similar, regardless of the direction of dominance.
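The index described above reduces to a single division, which the following minimal sketch makes explicit. The function name `balance_index` is a hypothetical label introduced here, not a term from Gollan et al. (2012), and the input check is an added assumption.

```python
def balance_index(score_a: float, score_b: float) -> float:
    """Lower score divided by higher score; 1.0 means perfectly balanced."""
    if score_a <= 0 or score_b <= 0:
        raise ValueError("scores must be positive")
    low, high = sorted((score_a, score_b))
    return low / high


# Example from the text: self-ratings of 10 (English) and 6 (Spanish).
print(f"{balance_index(10, 6):.0%}")  # prints 60%
```

Because the lower score is always the numerator, the index is the same whichever language is stronger, which is why it captures the degree of balance but not the direction of dominance.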

Results revealed a significant correlation between self-reported proficiency in English and objective measures, such as the oral proficiency and naming tests. Similarly, correlations between self-reported level of proficiency in Spanish, which was the nondominant language for most participants, and the objective measures of proficiency were high. On the other hand, the correlations between self-rated and objective index scores were only marginally significant, suggesting that while bilinguals were fairly accurate in assessing which of their two languages was dominant, they were much less accurate in estimating the degree of difference between proficiency in each language. For example, bilinguals who rated their proficiency as equal in English and Spanish were later shown to perform better in English on both the interview and naming tasks. Similarly, both Spanish- and English-dominant bilinguals, especially the young participants, tended to overestimate their abilities in their dominant language. While some amount of overestimation in proficiency self-ratings might have been present in our study, the fact that our bilinguals consistently self-rated as more proficient in English and responded consistently faster to English than Spanish targets seems to indicate a high degree of overlap between subjective (the LHQ) and objective (LDT) measures of language proficiency. In turn, low correlations between proficiency ratings and dominance scores might be partially a product of unequal comparison groups, with the majority of our bilinguals (16) reporting dominance in English, and only 5 in Spanish.

More generally, our results that do not fit neatly into the existing L2 emotion processing literature can be attributed to the uniqueness of our bilingual population consisting of habitual codeswitchers whose L1 has ceased to be their dominant language. It has been suggested that neurocognitive mechanisms of language use and control in such bilinguals might differ qualitatively from those in non-habitual codeswitchers (e.g., Green 2011; Green and Wei 2014). In line with this assumption, Pliatsikas et al. (2017) suggest that the major mechanism shaping cortical regions in the bilingual brain is a continuous L2 usage in an immersive environment. Pliatsikas et al. (2017) acquired brain scan images from 20 sequential (late) learners of English varying in their L1 backgrounds and residing in the English-speaking country for an average of almost 11 years. Significant subcortical reshaping of the basal ganglia and thalamus was visible, mirroring the data obtained earlier in simultaneous (early) bilinguals. Since the participants in Pliatsikas et al.'s (2017) study were all late bilinguals, the authors suggest that structural changes in the bilingual brain are primarily modulated by the amount of L2 immersion. In line with this suggestion, the time spent in the UK turned out to be a significant predictor for the expansion of the right globus pallidus, a nucleus in the basal ganglia. Analyses with proficiency and AoA showed no significant effects, indicating that brain restructuring in bilinguals depends on the active and continuous usage of L2 in the immersive context. Thus, immersion emerges as a crucial factor to consider when comparing L1 and L2 emotion effects in bilingual participants.

Finally, regardless of an individual bilingual's dominance, L1/L2 learning history, proficiency, AoA, or immersive experience, bilinguals' two languages might differ in their sensitivity to emotional content depending on the context of use, as would be the case when one language is primarily used for professional purposes and another at home in a less formal and more emotion-laden context. In this view, words develop emotional resonances depending on the intensity of the emotional context in which they were first learned and subsequently used throughout the bilingual's life experiences. The importance of context-dependent emotional learning is captured by the *emotional context of learning hypothesis* (Caldwell-Harris 2014). Briefly, the hypothesis postulates that emotional resonance is affected by the context in which language is learned and that language will be perceived as more emotional if it has been acquired and used in emotional settings. This is akin to the *contextual-learning hypothesis* (Barrett et al. 2007) emphasizing the interplay between learning and experience. An individual's personal experiences will shape emotional processing in each language and modulate the vividness of emotional reaction. For non-immersed bilinguals, who are typically unbalanced, more proficient in their L1 and who often learned L2 in a more formal, emotionally neutral context, L1 and L2 emotional resonance might substantially differ. Immersed bilinguals recruited in our study have likely experienced both languages in emotionally-grounded contexts at various stages of their linguistic and cognitive development where linguistic information was strongly linked with emotional experiences. Such varying individual experiences in each language might thus be another factor constraining the strength of the emotion effect as measured by behavioral and electrophysiological data.

An important limitation in our study is a lack of equal groups of AoA early/late English/Spanish bilinguals, which prevented us from assessing the relevance of AoA in a full analysis. To adequately assess the effect of each of the bilingual characteristics on emotion processing in L1 and L2, we would need a fully-crossed design with multiple groups of participants varying in terms of their AoA, L1/L2 status, dominance, proficiency, the degree of their immersive experience, and possibly other relevant factors, such as frequency of codeswitching or the emotional context of language usage. The border town from which our Spanish-English/English-Spanish participants were recruited is an essentially rich bicultural and bilingual community where both cultures and languages are tightly interwoven and both languages are spoken interchangeably on a regular basis. Comparing the results from such a population against those reported for L1-dominant bilinguals traditionally employed in emotion processing studies is hence challenging.

#### **5. Conclusions**

In conclusion, the present study examined behavioral and electrophysiological correlates of L1 and L2 emotion processing in immersed, highly proficient Spanish-English/English-Spanish bilinguals residing in a bilingual community characterized by dense codeswitching practices. We wanted to see whether there would be any qualitative or quantitative differences between L1 and L2 emotion word processing and whether bilingual participant characteristics, such as dominance or AoA, would constrain the L1/L2 emotion effects. Behavioral data showed faster and more accurate responses to English than Spanish targets, reflecting the fact that all of the bilinguals participating in the study were more proficient in English than in Spanish. However, the emotion effect was only present for Spanish, which was the first language for the overwhelming majority of our participants. Electrophysiological data showed a significant effect of language, such that early and late EPN responses were more pronounced for English than Spanish, with the reverse effect found on the LPC component, where Spanish targets elicited a higher positivity than those that were English. Dominance did not turn out to be a significant predictor of bilingual performance. Overall, emotion word processing in highly proficient immersed bilinguals might reflect a complex interaction of a number of participant factors, such as proficiency, AoA, the length of the immersive experience, or individual histories with each of the languages and how they were grounded in the emotional context. Further research with more diverse bilingual populations and a wider range of tasks is needed to more accurately assess the dynamic interaction of the various participant characteristics in the course of L1/L2 emotion word processing.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/languages8010042/s1. Experimental Stimuli. ACC\_LGTARGETXVALENCEXDSCORE analysis. RT\_LGTARGETXVALENCEXDSCORE analysis.

**Author Contributions:** Conceptualization, A.B.C.; data curation, A.B.C. and B.L.G.; formal analysis, A.B.C. and B.L.G.; investigation, A.B.C. and B.L.G.; methodology, A.B.C.; project administration, A.B.C. and B.L.G.; software, A.B.C. and B.L.G.; supervision, A.B.C.; validation, A.B.C. and B.L.G.; visualization, A.B.C.; writing—original draft, A.B.C.; writing—review and editing, A.B.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the TAMIU Advancing Research and Curriculum Initiative (TAMIU ARC) awarded by the US Department of Education Developing Hispanic Serving Institutions Program (Award # P031S190304) and by the NSF BCS Division Of Behavioral and Cognitive Science. MRI: Acquisition of a Biosemi Event Related Potentials Active Two Acquisition System to Enhance Research and Training at Texas A&M International University (Award # 1229123).

**Institutional Review Board Statement:** The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Texas A&M International University (protocol code 2019-02-13; date of approval: 11 March 2019).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available from the corresponding author upon request.

**Acknowledgments:** We want to thank the following TAMIU students for their help in administering the experiment: Alexandra Reyes, Ruby Salas, Rebeca Salazar, Devon Nuñez, Mariella Soto Ruiz, Jessica Garza, Alexandra Rodriguez, and Samantha Andrade.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


