*Review* **A Selective Review of Event-Related Potential Investigations in Second and Third Language Acquisition of Syntax**

**Tanja Angelovska <sup>1,\*</sup> and Dietmar Roehm <sup>2</sup>**


**\*** Correspondence: tanja.angelovska@plus.ac.at

**Abstract:** The aim of this contribution is to highlight the role and relevance of neurolinguistic accounts for second and third language syntactic acquisition/processing. This chapter begins with a brief historical overview of the field of experimental psychology and the birth of the EEG methodology. We then provide a general introduction to the ERP methodology and the language-related ERP components, explaining what they show and how they are to be interpreted. Special focus is placed on the clear distinction between behavioral measurements and real-time measures, and the leading role of ERPs is elaborated on. We then provide a selective narrative review of existing L2 and L3 syntax acquisition studies that use the EEG methodology and that we consider relevant for deriving implications for instructed language settings. We discuss results from EEG studies on second and third language syntactic acquisition/processing and, finally, highlight several conclusions important for the field.

**Keywords:** EEG; neurolinguistics; second language acquisition; syntactic processing; third language acquisition

#### **1. Introduction**

From a layperson's perspective, language comprehension seems to be an imperceptible and apparently effortless process. However, the human language processing system is constantly confronted with unexpected, meaning-related conflicting events that must be resolved if comprehension is to proceed successfully. Successful language comprehension depends not only on the involvement of different domain-specific linguistic processes, but also on their respective time-course. A large part of the recent work in psycho- and neurolinguistics has focused on trying to determine which processes play a role and how these processes interact in time. The following section begins with a brief overview of the field followed by an overview of the ERP methodology and the language-related ERP components, highlighting the importance of real-time measurements in contrast to behavioral measures.

#### **2. A Brief Overview of the EEG Methodology**

The beginnings of experimental psychology lead us back to Wilhelm Wundt. With his paradigm-shifting approach, psychology was established as an empirical discipline whose aim is to analyze conscious processes precisely, to measure elementary perceptions, to break down associated conscious processes and complex interrelationships, and to find out the regularities behind such relationships. However, the first successful EEG results about the human brain originate from Berger (1929), who, as a result, is recognized as the founder of the human electroencephalogram. He not only extended earlier research with animals to the human brain, but he also delivered a thorough description of the conditions under which these rhythmic EEG processes were observed. His biggest achievement is his first description of an objective EEG correlate of the mental state. In other words, cognitive changes are reflected through the observed rhythmic activation in EEG. For instance, an alpha rhythm (a 10 Hertz oscillation) is shown in adults when they are relaxed and awake with their eyes closed. The so-called Berger effect (alpha blockade) emerges when the person opens the eyes and the mental state is challenged (i.e., alpha waves vanish or are reduced during problem solving). This effect was the decisive starting point for the area of psychophysiological EEG research (Altenmüller and Gerloff 1999). The EEG began to be used for detecting electrical potentials in the well-structured form of brain waves. The new discipline focused on the relationship between the frequencies of these brain waves and human behavior.

**Citation:** Angelovska, Tanja, and Dietmar Roehm. 2023. A Selective Review of Event-Related Potential Investigations in Second and Third Language Acquisition of Syntax. *Languages* 8: 90. https://doi.org/10.3390/languages8010090

Academic Editor: John W. Schwieter

Received: 16 May 2022 Revised: 7 February 2023 Accepted: 10 March 2023 Published: 22 March 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In 1951, the electromechanically based summation method was introduced as an EEG research technique (Dawson 1954). With the development of digital computers in the early sixties, several labs managed to apply this technique successfully to electrophysiological data. This milestone provided the basis for an improvement in the signal-to-noise ratio in EEG analysis and led to the discovery of small endogenous event-related potentials (ERPs), which are bound to sensory and cognitive stimuli (e.g., the presentation of a word). After the importance of these potentials was recognized in the sixties, EEG-based research focused exclusively on ERP components (Altenmüller and Gerloff 1999). This tendency was strengthened through the development of digital computer techniques, which allowed ERPs to be calculated with special computer programs and represented as time-dependent functions (Barlow 1957; Brazier 1960). It was shown that ERPs vary systematically with stimulus properties (e.g., tone pitch, color, intensity) with respect to several measurable parameters (amplitude strength, latency, polarity, topography). However, it took another two decades to achieve a breakthrough in EEG and language.

Kutas and Hillyard (1980) reported, in a *Science* publication, the first clear language-related ERP correlate in a ground-breaking study investigating the processing of English sentences in three different conditions. The conditions were created based on the sentence-final word, which was either syntactically and semantically well-formed (1a), included a plausibility violation (1b), or showed a mere physical deviation (letter size; 1c):


(Kutas and Hillyard 1980)

The activity at the critical sentence-final word was measured. While the physical deviation led to a late positive potential, the semantic violation resulted in a late negative component with an amplitude maximum at around 400 ms after the beginning of the presentation of the critical word (=N400). An N400 is, however, not only a correlate of semantic plausibility; it is also influenced by several other factors such as word frequency, sentence context, repetition in relation to lexical status (open-/closed-class words) and the degree of unexpectedness of a word, as well as a plethora of other lexical-semantic and contextual factors (for an overview see Kutas and Federmeier 2011). Furthermore, it was found that the N400 is part of the normal brain activity during the perception and processing of words and other meaningful (or potentially meaningful) stimuli (e.g., signs in sign language, numbers, pictures, faces, sounds and odors).

Despite the vivid rise in EEG research and related publications on sentence processing following Kutas and Hillyard (1980), the search for a syntactic ERP correlate (as a counterpart of the lexical-semantic N400) took another decade. It was not until the 1990s that Osterhout and Holcomb (1992) described a late positive component as a correlate of a syntactic violation, which emerged with a wide parietal distribution and a latency of 600 ms after the stimulus onset. They labelled this component the P600. Indeed, they found that this component shows up not only when the sentence is clearly ungrammatical (as in example 2), but also in so-called garden-path sentences (example 3), in which a temporary structural ambiguity (here, the verb "persuaded" with an ambiguity between a finite and a non-finite reading) leads to a wrong initial syntactic analysis, which can be revised later on the basis of the apparently incompatible information (here, the infinitive marker *to*).


Hence, it was suggested that the P600 is an index of syntactic reanalysis and repair processes. In addition, the P600 was found in sentences with a more complex structure (e.g., wh-questions) and was discussed as a correlate of syntactic integration costs (Kaan et al. 2000). Furthermore, Burkhardt (2005, 2006) reported a P600 as a correlate of the establishment of new discourse referents in a mental model. Because the P600 was also found in non-linguistic domains (e.g., for violations of mathematical rules; Lelekov et al. 2000), it was suggested that the P600 always emerges when a stimulus is hard to integrate into the structure of the preceding context. Since then, many other ERP effects have been found as correlates of language processing in different domains (e.g., for congruency violations, scrambling, subject preference, etc.; for an overview see Bornkessel-Schlesewsky and Schlesewsky 2009; Kaan 2007).

Prior to discussing the specifics of ERPs (cf. Roehm 2004), we begin with an example of a linguistically ambiguous sentence that leads to comprehension difficulties. Such an example was given in (3) and is repeated as (4) (from Osterhout and Holcomb 1992, 1993):

(4) The broker persuaded to sell the stock was sent to jail.

When sentence (4) is processed sequentially, the verb *persuaded* is initially analyzed as the finite main verb (as in "The broker persuaded the manager to sell the stock."). This decision must be revised when the reader/listener reaches *to*, as it becomes evident that *persuaded* is a non-finite verb in a reduced relative clause. Hence, the comprehension difficulty is a result of an ambiguity, as the reader/listener is prone to misanalysis pertaining to properties of syntactic structure. The comprehension difficulty is evident as a processing cost imposed on the reader/listener. This type of enhanced processing cost gives insights into the architecture and mechanisms of the language processing system (Kimball 1973; Fodor et al. 1974; Frazier 1987; Clifton et al. 1994). As in the investigation of other cognitive domains (e.g., memory and attention), the simplest account of this "processing difficulty" is obtained through behavioral measures (for example, reaction time measurements). However, using these measures as a means of characterizing underlying mechanisms of linguistic analysis presupposes that the locus of the processing problem can be straightforwardly established. For an implausible sentence such as (5), the reader/listener will certainly show longer reaction times in the critical region and lower acceptability ratings than for a plausible sentence that ends with *butter*.

(5) He spread the warm bread with socks.

In contrast to example (4), where the processing cost was due to a structural ambiguity (the ambiguity of the verb *persuaded*), in (5) the processing costs are clearly based on semantic oddness and thus can be attributed to the lexico-semantic processing domain. Evidently, behavioral measures are purely quantitative measures of processing difficulty and cannot disentangle underlyingly different linguistic domains from one another. They merely provide rather unspecific global measures of processing difficulty, representing only the result of the comprehension process locally (i.e., on the word level) and/or globally (i.e., on the sentence level). We can conclude that behavioral measures (such as self-paced reading, speeded grammaticality judgments, or lexical decisions) do not allow conclusions about the precise time course of underlying processing mechanisms (Schütze 1996).

While chronometric procedures allow only for indirect measures of psychological processes in the form of reaction times (RTs) to a given stimulus, the measurement of brain waves with electroencephalography directly reveals the psychophysiological processes in the brain. In other words, RTs are a measure of the outcome of cognitive processes, whereas brain waves measure the cognitive processes as they unfold in real time. To fully represent the linguistic processing domains, including their temporal processing characteristics, we need comprehension measures that provide not only quantitative estimations but also allow for qualitative characterizations and an ongoing assessment of the comprehension process (or rather, its underlying processing characteristics). The comprehension process is the cornerstone of both first and second language (L2) processing. While behavioral measures only assess the overall performance of learnt L2 knowledge (i.e., the final product based on one's displayed competence), real-time measures (such as EEG) track the online processing, internalization, and proceduralization of L2 knowledge in a moment-by-moment manner. Comparing the EEGs of second language learners with those of native speakers enables us to gain insights into the similarities and differences in the process itself and to observe how far the EEGs of L2 learners are native-like.

#### **3. Neurolinguistic Measures for Syntactic Processing: Event-Related Brain Potentials (ERPs) and Language-Related ERP Components**

The recording of the human electroencephalogram (EEG) is one of the methods that provide a direct reflection of underlying brain processes, with the advantage of an excellent time resolution in the millisecond range. Using EEG, we obtain a direct (although damped) reflection of the summed electrophysiological activity of the brain by means of electrodes applied to the surface of the human scalp. Throughout the presentation of words/sentences, the event-related brain potential (ERP) technique provides a continuous record of language comprehension processes as they unfold in real time. As mentioned above, ERPs provide a multi-dimensional characterization of processing difficulties during language comprehension, in which various language-related components are identified based on parameters such as polarity, amplitude, latency and topography. The high temporal resolution of EEG signals is of primary importance if the very rapid and complex processes that make up language comprehension and production are to be captured adequately (Bornkessel-Schlesewsky and Schlesewsky 2009). However, one must keep in mind that ERPs represent the average of EEG samples obtained in several trials (typically, between 30 and 40) belonging to the same experimental condition. Thus, the averaged waveforms represent estimates of the time-locked neural activity engendered by the presentation of stimuli belonging to different experimental conditions. Differences between ERP waveforms derived from different conditions therefore represent differences in the neural activity engaged by the items belonging to each condition (cf. Roehm 2004).
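The averaging step described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the waveform shape, the 4 ms sampling step, the noise level, and all function names are hypothetical choices made for illustration, and the trial count of 40 is taken from the upper end of the range mentioned in the text. It shows that a point-by-point average across trials of one condition lies much closer to the time-locked signal than any single trial does.

```python
import math
import random


def simulate_trial(signal, noise_sd, rng):
    """One hypothetical single-trial epoch: time-locked signal plus Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]


def average_erp(trials):
    """Point-by-point average across trials belonging to the same condition."""
    n_trials = len(trials)
    return [sum(samples) / n_trials for samples in zip(*trials)]


def rms_error(estimate, signal):
    """Root-mean-square deviation of an estimate from the true signal."""
    squared = [(e - s) ** 2 for e, s in zip(estimate, signal)]
    return math.sqrt(sum(squared) / len(squared))


# Assumed epoch: 0-796 ms sampled every 4 ms, with a negative deflection
# of -5 microvolts centred on 400 ms (an N400-like shape, purely illustrative).
times_ms = list(range(0, 800, 4))
true_signal = [-5.0 * math.exp(-((t - 400) / 60.0) ** 2) for t in times_ms]

# Assumed background EEG noise with a standard deviation of 20 microvolts.
rng = random.Random(0)
trials = [simulate_trial(true_signal, noise_sd=20.0, rng=rng) for _ in range(40)]

erp = average_erp(trials)
single_trial_error = rms_error(trials[0], true_signal)
averaged_error = rms_error(erp, true_signal)

print(f"single trial RMS error: {single_trial_error:.2f} microvolts")
print(f"40-trial average RMS error: {averaged_error:.2f} microvolts")
```

With independent noise, the residual noise in the average is expected to shrink roughly with the square root of the number of trials, which is why small potentials only become visible after averaging.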

Only through the application of the ERP methodology does it become evident that the syntactic processing difficulty in (4) and the semantic violation in (5) elicit distinct ERP components, namely a parietal positivity (Osterhout and Holcomb 1992, 1993) with a maximum at approx. 600 ms (P600 or 'syntactic positive shift', SPS) vs. a centro-parietal negativity with a maximum at approx. 400 ms post critical word onset (N400; Kutas and Hillyard 1980). Based on such findings, the N400 was initially seen as an unambiguous general marker of lexical-semantic processes (Kutas and Federmeier 2000) and used as a 'diagnostic tool' in cases where the nature of the observed processing difficulty could not be established straightforwardly. However, later ERP findings revealed that the N400 can be found in several areas which are clearly independent of the lexical-semantic domain (e.g., Bornkessel et al. 2002; Osterhout 1997). The fact that the N400 component cannot be attributed to a single specific language processing domain therefore shows that the desired one-to-one mapping between ERP components and linguistic processes is at least problematic (see below).

Most of the EEG-based language research includes ERPs. However, complementary analyses of stimulus-related oscillatory EEG activity (rhythmic activity in the EEG) have only recently come to be used. Oscillatory activity is an inherent property of neurons and neuronal assemblies. It is generally agreed upon that the timing of oscillatory dynamics encodes information (Buzsaki and Draguhn 2004). Oscillations in the theta, gamma, and beta frequency ranges have been linked to mental effort and have recently been related to language processing (Lewis et al. 2016). Theta and gamma oscillations (theta: 3 to 7 Hz; gamma: 25 to 100 Hz, typically 40 Hz) are associated with the encoding and retrieval of declarative knowledge, whereas beta oscillations (13 to 30 Hz) have been treated as a potential index of syntactic integration and, more recently, as a reflection of predictive processing during language comprehension. The next section includes a discussion about the language-related ERP components.
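As a rough illustration of how oscillatory activity can be related to the frequency bands quoted above, the following sketch finds the dominant frequency of a synthetic signal with a naive discrete Fourier transform and looks up the matching band(s). The band limits are those given in the text; the signal, the sampling rate, and the function names are hypothetical, and real analyses would normally use dedicated time-frequency methods rather than this simplified approach.

```python
import cmath
import math

# Band limits in Hz as quoted in the text. Note that the quoted beta and
# gamma ranges overlap between 25 and 30 Hz.
BANDS_HZ = {
    "theta": (3.0, 7.0),
    "beta": (13.0, 30.0),
    "gamma": (25.0, 100.0),
}


def bands_for(frequency_hz):
    """Return the names of all quoted bands that contain the frequency."""
    return [
        name
        for name, (low, high) in BANDS_HZ.items()
        if low <= frequency_hz <= high
    ]


def dominant_frequency(samples, sampling_rate_hz):
    """Frequency (Hz) of the strongest non-DC component, via a naive DFT."""
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sampling_rate_hz / n


# Hypothetical signal: one second of a pure 20 Hz oscillation at 200 Hz sampling.
sampling_rate_hz = 200.0
signal = [math.sin(2 * math.pi * 20.0 * i / sampling_rate_hz) for i in range(200)]

peak_hz = dominant_frequency(signal, sampling_rate_hz)
print(peak_hz, bands_for(peak_hz))
print(28.0, bands_for(28.0))  # falls inside both quoted ranges
```

Returning a list rather than a single label is a deliberate choice here, because a frequency between 25 and 30 Hz belongs to both the beta and the gamma range as they are quoted.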

Several language-related ERP components have been identified and viewed as correlates of linguistic domains (e.g., semantic and syntactic processing). As discussed above, the N400 is a negative ERP deflection peaking around 400 ms after the onset of a potentially meaningful stimulus. In language comprehension, it has been interpreted as reflecting lexical–semantic processing (e.g., the integration of word-associated information into a meaningful context). Thus, the N400 is considered a reflection of 'contextual integration'. However, it is not only a reflection of the relative ease of semantic integration; N400 amplitudes also vary as a function of frequency (Van Petten and Kutas 1990) and repetition (Rugg 1990; Van Petten et al. 1991). The P600 is a late positive component with a broad parietal distribution and a typical latency between 600 and 900 ms post critical word onset (Osterhout and Holcomb 1992). It was first observed as a correlate of outright syntactic violations (following an early left anterior negativity, the so-called ELAN) and interpreted as an index of repair processes in sentences such as (6) (from Osterhout and Holcomb 1992).

(6) \* The broker hoped to sell the stock was sent to jail.

The ELAN (Friederici 1995), an early left-anterior negativity between 150 and 200 ms, has been observed in several ERP studies in situations where the brain is confronted with phrase structure violations due to outright word category violations (such as the word "to" in example 6). It is typically interpreted as a highly automatic correlate of initial structure-building processes (first-pass parsing processes responsible for local phrase structure building; Friederici 1995, 1999; Hahne and Friederici 1999). A slightly later left-anterior negativity labeled LAN, with a maximum between approx. 300 and 500 ms after onset of the critical item, has been reported for agreement violations with legal words (e.g., subject-verb agreement or wrong pronoun case) (cf. Coulson et al. 1998; Kutas and Hillyard 1983; Osterhout and Mobley 1995; Gunter et al. 1997), as well as with morphologically marked pseudowords (Münte et al. 1997). It has been suggested (e.g., Münte et al. 1997) that the LAN specifically reflects the actual detection of a morphosyntactic mismatch. However, because a LAN can also be found in grammatically correct sentences, others have claimed that it indexes some aspect of working memory usage (Kluender and Kutas 1993a, 1993b; King and Kutas 1995; Rösler et al. 1998).
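The latency windows mentioned for the ELAN, LAN and P600 can be used to sketch how an effect is typically quantified: a difference wave (violation minus control) is computed and its mean amplitude is taken within each window. The example below is a simplified illustration only. The waveforms are synthetic, the function names are hypothetical, and the N400 is omitted because the text gives only its approximate peak latency, not a window. Topography, which is essential for telling components apart, is not modeled.

```python
import math

# Latency windows in ms after critical-word onset, as quoted in the text.
COMPONENT_WINDOWS_MS = {
    "ELAN": (150, 200),
    "LAN": (300, 500),
    "P600": (600, 900),
}


def difference_wave(violation, control):
    """Violation minus control, sample by sample."""
    return [v - c for v, c in zip(violation, control)]


def mean_amplitude(wave, times_ms, window_ms):
    """Mean amplitude of the samples whose latency falls inside the window."""
    start, end = window_ms
    selected = [amp for t, amp in zip(times_ms, wave) if start <= t <= end]
    return sum(selected) / len(selected)


# Hypothetical averaged waveforms (microvolts), sampled every 4 ms:
# a flat control condition and a violation condition with a late
# positivity centred on 700 ms.
times_ms = list(range(0, 1000, 4))
control = [0.0 for _ in times_ms]
violation = [4.0 * math.exp(-((t - 700) / 100.0) ** 2) for t in times_ms]

effect = difference_wave(violation, control)
window_means = {
    name: mean_amplitude(effect, times_ms, window)
    for name, window in COMPONENT_WINDOWS_MS.items()
}

for name, value in window_means.items():
    print(f"{name}: {value:.3f} microvolts")
```

In this synthetic case only the 600 to 900 ms window shows a sizeable positive mean amplitude, which corresponds to the polarity and latency pattern described for the P600.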

Following the declarative/procedural model (Ullman 2001), the declarative memory system underlies the mental lexicon (storing rote-learned linguistic mappings) and the procedural memory system underlies the mental grammar (necessary for the acquisition and use of rule-governed computations in language). Indeed, it has been shown that declarative memory processes are likely to evoke N400 responses (cf. Morgan-Short et al. 2010), while procedural memory processes (especially with respect to morphosyntactic relations) primarily show ELAN/LAN responses. Moreover, it has been argued that ERPs can reveal implicit processing of (morpho-)syntactic knowledge that may be distinct from explicit (morpho-)syntactic processing as evidenced in grammaticality judgment tasks (GJT), and thus may allow differentiating cross-linguistic differences and similarities (Tokowicz and MacWhinney 2005). With respect to classroom-like instructed subjects, N400 and P600 effects have been found with L2 learners even when they lack immersion exposure (Morgan-Short et al. 2010) that is regarded essential for implicit learning and the consolidation of knowledge.

Findings that language-related ERP components can be used as diagnostic indicators of linguistic or cognitive domains, such as the N400 as a correlate of lexical–semantic processing or declarative memory contributions to language, versus the LAN/P600 as correlates of morphosyntactic processing or procedural memory, seem highly appealing. However, this possibility has been recently challenged by a continually increasing number of findings (Bornkessel-Schlesewsky and Schlesewsky 2009).

Having reviewed the general EEG methodology and the ERP components, in the next section, we discuss findings from event-related potentials relevant for the second language acquisition process.

#### **4. EEG and Second Language Acquisition of Syntax**

The field of neurocognitive SLA has followed two main aims in the past: (1) to compare the neurocognitive representation of L2 processing with that of native speakers and, subsequently, (2) to explain possible differences or similarities as a function of factors such as proficiency or age of onset in the L2 (Angelovska and Roehm 2020). The most comprehensive and informative narrative review of L2 ERP research, encompassing studies from 2008 to 2013 (Morgan-Short 2014), has identified several findings which are relevant for language teachers. The recurring focus of EEG studies in the field of second language acquisition of syntax is on investigating whether the ERP patterns of L2 learners and native speakers differ, which type of training (implicit or explicit) is more beneficial for comprehension and/or production of the target language, and which factors moderate such effects. L2 proficiency has been considered one of the most influential factors. For instance, regarding L2 proficiency and (morpho-)syntactic processing in late L2 learners, Morgan-Short et al. (2010) reviewed a body of ERP studies indicating that the L2 learners' neural activity during the processing of (morpho-)syntactic violations can be modulated by several inter-related factors, such as the similarity or dissimilarity of syntactic structures in L2 and L1 and the L2 learners' level of L2 proficiency. This finding is relevant for second language educators interested in deriving ERP-based implications for teaching. We build upon this review, focusing on newer results and including studies of relevance for instructed settings. We divide this review of relevant studies according to whether the target language is a natural or an artificial language.

#### *4.1. EEG Studies with an Artificial Language as a Target L2*

Grey et al. (2017) examined the role of learners' language experience in an EEG study which included target language training and both behavioral and neural correlates as measures. They included Mandarin–English bilinguals and English monolinguals, who received an explicit grammar instruction session of only 13 minutes using an artificial language as the target language. Learners were trained in comprehension and production and were asked to judge the grammaticality of sentences. Although monolinguals and bilinguals did not differ on the behavioral measures, they differed regarding the ERPs, whereby proficiency proved to matter considerably. More specifically, at the low-proficiency level, only bilinguals showed the expected P600 effect, whereas at high proficiency both monolinguals and bilinguals showed a P600. An anterior positivity, which is not typically found in native speakers, was found for monolinguals only. The general conclusion is that when acquiring an additional target language, and even after a minimal amount of instruction, bilinguals show ERP patterns similar to those of native speakers.

SLA researchers still debate the question of which type of instruction (explicit or implicit) is more effective for learners. Morgan-Short et al. (2010) tested low- and high-proficiency learners who were instructed with explicit (classroom-like) and implicit (immersion-like) training when acquiring an artificial language as a target language. Regarding the behavioral measures, no group differences were found. However, group differences were evident regarding the ERP patterns. The implicit group showed significant native-like patterns on noun–adjective and noun–article agreement and was considerably more successful than the explicit group, which showed gains only for noun–article agreement. Morgan-Short et al. (2010) interpret these results in line with the declarative/procedural model, maintaining the view that with increasing L2 experience and proficiency, learners will become more dependent on L1 neurocognitive mechanisms, whereas if learners are less proficient, they would rely more on declarative memory for lexical and semantic processing.

In a subsequent longitudinal EEG study, Morgan-Short et al. (2012) confirmed the results from Morgan-Short et al. (2010) regarding whether explicit and implicit training would differentially affect neural and behavioral performance measures of syntactic processing. The confirmatory findings highlight the benefits of the implicit training. To be precise, the implicitly trained group showed native-like patterns (an anterior negativity followed by a P600 accompanied by a late anterior negativity) at a high proficiency level and only an N400 at a low proficiency level. The explicitly trained group demonstrated only an anterior positivity followed by a P600 at high proficiency and no significant effect at low proficiency. The results clearly speak for language learners' reliance on native-like mechanisms, as evident through electrophysiological signatures, only when they are exposed to implicit instruction. An important conclusion relevant for second language pedagogy is that language teaching practitioners should focus on increasing the amount of exposure to implicit training conditions for their learners, thereby maximizing the immersion-like and communicative experience for their learners.

Pili-Moss et al. (2019) investigated whether learners' declarative and procedural abilities predict comprehension and production accuracy and automatization during L2 practice. Their results point to positive associations between comprehension accuracy and declarative learning ability, which did not change across practice. Furthermore, they found that with increasing procedural learning ability, automatization increased as well. Likewise, learners with a higher declarative learning ability showed stronger automatization. A more recent synthesis of studies with neurocognitive measures within the domain of second language research with artificial languages can be found in Morgan-Short (2020).

We can derive two conclusions based on the reviewed EEG studies with an artificial language as a target language: (1) in both studies (Grey et al. 2017) increasing language experience and proficiency proved to be the main factors accounting for more native-like ERP signatures; and (2) implicit instruction seems to be inevitable and more beneficial for the second language learners to obtain native-like mechanisms as automatization is increased, as found in Morgan-Short et al. (2012) and Pili-Moss et al. (2019).

#### *4.2. EEG Studies with a Natural Language as a Target L2*

With respect to the EEG components in L2 acquisition, the P600 is often considered a measure of native-like processing, while smaller or delayed N400 and P600 effects were found for L2 learners compared to native speakers (Hahne 2001; Sabourin and Stowe 2008). However, these findings depend on several factors which are included in the analyses and have been shown to moderate these effects. In what follows, we review selected ERP studies on syntactic acquisition with relevance to classroom implications. We begin by reviewing studies that consider the effects of linguistic factors (e.g., similarity, prosody, etc.) on the outcomes of L2 processing. We then consider the role of proficiency and working memory as variables in ERP studies, and finally we review studies which focused on the training type and context, which is decisive for deriving conclusions of relevance for instructional contexts.

Kotz et al. (2008) investigated how native and non-native readers of English respond to syntactic ambiguity and anomaly. The learners they focused on were highly proficient L1 Spanish and L2 English individuals who had acquired English informally around the age of five years. Their results indicate that early acquisition of L2 syntactic knowledge leads to comparable online sensitivity towards temporary syntactic ambiguity and syntactic anomaly in early and highly proficient non-native speakers. Guo et al. (2009) aimed at finding out which kind of information (semantic or syntactic) is more important in sentence processing when verb sub-categorization violations are included. Using ERPs, they investigated strategies employed by 17 native English speakers and 28 L2 English learners with Chinese as L1. They found a P600 effect to verb sub-categorization violations for the native speaker group and an N400 effect for the L2 learners. Even though they interpret these findings as electrophysiological evidence for different strategies used by native speakers and L2 learners in sentence processing (i.e., shallower processing), one must be aware that the proficiency measure used in their study is not robust (self-rating scores) and that the learners had no stay-abroad experience. Both studies, Kotz et al. (2008) and Guo et al. (2009), offer relevant pieces of evidence for considering the additional type of information when interpreting ERP results.

Nickels and Steinhauer (2018) considered the role of prosodic information available in syntactic processing, a rarely addressed aspect of L2 instruction, as neurocognitive studies on prosody–syntax interactions are rare. They compared the ERPs of Chinese and German learners of English as L2 with those of native English speakers (control group). Their findings showed that the L1 background and L2 proficiency influence online processing, whereby the garden-path effects were triggered by prosody. They used linear mixed-effects models, treating L2 proficiency as a continuous variable. Their main finding is that both L1 background and proficiency are predictors of syntactic processing. Similarly, Mickan and Lemhöfer (2020) investigated the ERP signatures of beginning, intermediate and advanced learners when processing L1–L2 word order conflicts. They found N400-like signatures for beginners. In contrast, the intermediate and the advanced groups showed native-like P600 signatures for the violated sentences without conflict between L1 and L2, and delayed P600 signatures for the violated sentences with L1–L2 conflicts, although behaviorally both groups were at a native level. It was shown that processing L2 word order that conflicts with the L1 is an obstacle even at an advanced proficiency level, and that progression through developmental stages depends not only on proficiency, but also on how similar the learners' L1 and L2 are.<sup>1</sup>

Tokowicz and MacWhinney (2005) investigated explicit and implicit processes during L2 sentence comprehension using ERPs and a grammaticality judgment task among 20 L2 Spanish learners with L1 English. The experimental sentences varied the form of three syntactic constructions: (a) tense-marking, which is formed similarly in the L1 and the L2; (b) determiner-number agreement, which is formed differently in the L1 and the L2; and (c) determiner-gender agreement, which is unique to the L2. They found P600 sensitivity for tense-marking violations in the L2, but not for violations of determiner-number agreement or determiner-gender agreement. These findings show that L2 learners' processing depends on the similarity between the L1 and the L2.

We can conclude that Nickels and Steinhauer (2018), Mickan and Lemhöfer (2020) and Tokowicz and MacWhinney (2005) converge on one main conclusion: not only proficiency but also the similarity between the L1 and the L2 must be considered in order to predict how learners process syntactic information in an L2.

We now turn to the role of proficiency and working memory in L2 sentence processing. Dallas et al. (2013) examined whether young adult L2 English learners with L1 Chinese would show differential results in relation to their proficiency level and working memory capacity. Using ERPs, they investigated the learners' real-time processing of sentences containing filler-gap dependencies. Their findings revealed that sensitivity to plausibility variations increases with rising L2 proficiency, regardless of learners' working memory capacity. However, working memory has proved to be a strong predictor of learning in more explicit settings with behavioral measures (e.g., Linck and Weiss 2015). Another study with important findings on the learning context and working memory is Faretta-Stutenberg and Morgan-Short (2018), which examined the role of the acquisitional context. In two longitudinal studies, they tested young adult (university-level) native speakers of English studying L2 Spanish at the intermediate level in an "at-home" (*n* = 29) and a "study-abroad" context (*n* = 20). They used measures of learners' working memory capacity and declarative and procedural learning ability, together with EEG data collected while learners completed a syntactic grammaticality judgment task. Results revealed gains for at-home intermediate learners in their ability to detect phrase structure violations, as well as N400 and P600 effects at the end of the semester. However, the study-abroad learners showed larger behavioral gains, which were accounted for by their procedural learning ability and working memory capacity. It seems that more immersion-like learning conditions are more beneficial when acquiring a target language.

Bowden et al. (2013) included young-adult learners without immersion experience and at a low proficiency level. They tested two groups of late-learned L2 Spanish speakers (low vs. advanced proficiency) and a control group of L1 Spanish speakers. Regarding semantic processing, N400s were found in all three groups, whereas LAN/P600 responses to syntactic word-order violations were evident only in the native speakers and in the advanced L2 group, whose outcomes were statistically indistinguishable. The authors concluded that L2 semantic processing relies on neural processes similar to those in L1, and that proficiency and exposure do not moderate these effects. On the other hand, L2 syntactic processing differs from L1 syntactic processing at low proficiency levels; however, with increasing proficiency and exposure (immersion experience), it can become native-like. In other words, the language learning process qualitatively changes the syntactic processing of L2 learners.

Both studies (Faretta-Stutenberg and Morgan-Short 2018; Bowden et al. 2013) provide evidence for the benefits of immersion settings for the acquisition of syntax, which has implications for how educators should plan learning conditions.

We now analyze the effects of different training types in L2 acquisition. Within a pre- and post-test design, Davidson and Indefrey (2009) examined L1 Dutch learners' development of L2 German (adjective declension and both article–noun and adjective–noun gender agreement) during a single instructional session, which consisted of explicit training on German. Learners received grammatical information about adjective declension and gender, together with short feedback (correct/incorrect). The post-test was delivered one week after the instruction. The control group of L1 German speakers completed the final test only. The findings reveal that the L2 learners improved, showing a P600 response only for declension. Although this result suggests that even a relatively short training session may yield instructional benefits and the development of a P600 response to adjectival declension in L2 learners, we do not know whether the result would be confirmed if learners underwent a different type of instruction. We also lack information on the amount of instruction needed for the P600 response to appear in all cases.

Batterink and Neville (2013) examined whether second language learners recruit some of the same language-processing mechanisms as those employed by native speakers of the same language. They investigated L2 syntactic processing of article-noun agreement, subject-verb agreement, and word order with 24 French native speakers and 67 L1 English speakers learning French as L2, who were assigned to one of two training types (implicit vs. explicit). The implicitly trained group (*n* = 44) did not receive any explicit rule provision, while the explicitly trained group (*n* = 23) received formal instruction on the underlying grammatical rules. Their findings show that the two trained groups did not differ in their comprehension scores; however, the native speaker group achieved significantly higher comprehension scores. On the grammaticality judgment task, the explicitly trained group performed better than the implicitly trained group. Regarding the ERP results, the native speaker group showed a biphasic response consisting of an early negativity effect followed by a later P600 effect, although the early negativity effect did not prove significant in the noun agreement condition. Similarly, both the implicitly and the explicitly trained high proficiency groups showed P600 effects in all three agreement conditions, but the low proficiency implicitly trained group did not show a P600 in any of the conditions. The only condition in which training predicted the improvement effects was the word order violation condition, where implicit training predicted the P600 amplitude. The authors concluded that even short implicit training alone is predictive of a P600 effect for L2 learners, suggesting that implicit training leads L2 learners to recruit the same neural mechanisms as those employed by native speakers, even at early acquisitional stages of syntactic processing and regardless of the qualitative differences.

We can conclude that the results regarding the beneficial effects of explicit versus implicit training are still inconclusive. Generally, some studies (e.g., Guo et al. 2009; Morgan-Short et al. 2012) found an N400 instead of the expected P600 for late bilinguals in their early stages of L2 acquisition. When late learners demonstrate native-like processing, this depends mainly on proficiency, although the problem of disentangling age of onset from proficiency has already been acknowledged (Friederici et al. 2002). Lower proficiency learners have demonstrated a weaker P600 in comparison to higher proficiency learners and native speakers (e.g., Bowden et al. 2013). Early and highly proficient bilinguals' syntactic processing resembles native processing (P600) when both L1 and L2 share the syntactic property. With regard to the learning conditions, we can conclude that ERP studies on natural L2 training under differential input conditions in instructed classroom settings are limited to non-existent. These types of studies are very extensive and difficult to conduct in such "noisy" settings as natural language classrooms and often carry several confounding factors that are hard to control for. For example, classroom foreign language learners are of different ages and proficiency levels and differ in other individual variables of both an affective and a cognitive nature. Certainly, the instructed SLA domain would profit from ERP results which test language acquisition in designs where the target language is a natural language not biasing the ongoing explicit-instruction trend. However, it remains a challenge to create experimental designs that would meet the criteria of both lab and instructional conditions.

#### **5. EEG and L3 Acquisition of Syntax**

Research on L3 morphosyntactic processing using EEG is rather scarce and has a clear focus on transfer. Eye-tracking results from L3 studies (Abbas et al. 2021) reveal that, during L3 processing, learners have access to both their L1 and their L2, either of which can be the source of transfer into the L3, albeit with different time-courses; these results have been corroborated by EEG findings. The first study to use ERPs to compare bilinguals' and monolinguals' processing of an additional language was by Grey et al. (2017). L1 Mandarin–L2 English bilinguals and L1 English monolinguals learned an artificial language, Brocanto2. The behavioral data showed no group differences in comprehension and production accuracy or RTs. GJT accuracy improved after a second training compared to the first session, but there were no significant differences between groups. The bilingual group exhibited a P600 response to ungrammatical sentences in the first session, whereas after the second training both bilinguals and monolinguals showed a P600 response. Moreover, monolinguals exhibited an anterior positivity in the 400–700 ms time window that was absent in bilinguals, who instead had a visually apparent but non-significant anterior negativity. These findings suggest that, even in the absence of behavioral differences between bilinguals and monolinguals, bilinguals show distinct ERP patterns that more closely resemble those of native speakers.

Andersson et al. (2019) examined how native Swedish speakers (controls) and German (+V2 in the L1) and English (−V2 in the L1) learners of Swedish produce, judge, and process grammatical V2 and ungrammatical verb-third (V3) word orders in Swedish sentences with sentence-initial adverbials. Behaviorally, the learners differed from the native speakers only on judgements, but all were highly proficient. Neurocognitively, all groups showed a similar increased posterior negativity, as well as a posterior P600 effect, but German learners displayed more native-like anterior ERP effects than English learners. Interestingly, learners and native speakers did not differ in P600 amplitude. German learners had an anterior negative effect similar to that of the native Swedish speakers, whereas English learners had a larger frontal positivity. This could suggest that the English learners used more attentional resources, while German learners used a less demanding processing strategy because their L1 has a structure similar to that of Swedish. The results suggest crosslinguistic influence in that the presence of a similar word order in the L1 can facilitate online processing in an L2, even if no offline behavioral effects are discerned. Although the AoA of Swedish for the L1 German learners is reported as 21;5 (SD = 2;5) and their exposure as 3;4 (SD = 2;10), it has not been specified whether English was the learners' L2. Most German schools teach English from the 5th grade, which corresponds to an AoA of English of about 11 years, making it likely that the L1 German speakers acquired English before Swedish. Therefore, it is not clear whether this was the first study of L3 Swedish word order processing.

Gonzalez Alonso et al. (2020) used artificial languages (ALs), Mini-English and Mini-Spanish, to distinguish several transfer models using ERPs. One group of L1 Spanish–L2 English participants learned Mini-English as L3 and the other Mini-Spanish; both groups were at lower intermediate to advanced proficiency in English. The mini grammars used English and Spanish lexicons, respectively, while the morphemes were the same in both ALs. Grammatical gender, which is present only in Spanish, was featured in both ALs. There were no differences between conditions (grammatical, gender violation and number violation) in accuracy and RTs. There was an early fronto-lateral negativity for gender violations in Mini-English, as well as a broad positivity over all regions for gender violations in Mini-Spanish. The early positivity could be interpreted as a P300, since gender violations might have been attended to as target stimuli and detected more easily because this grammatical feature is present in Spanish.

What these L3 acquisition studies of syntax have in common is that they found typology effects in EEG measures: syntactic processing resembles native processing if the languages under observation are typologically similar.

#### **6. Conclusions**

We began with a brief historical overview of the field of experimental psychology and the birth of the EEG methodology. We then provided a full description of the ERP methodology and the language-related ERP components, explaining what they show and how they are interpreted. A special focus was given to the clear distinction between real-time measurements and behavioral measures, and the leading role of ERPs was elaborated on.

We provided a narrative review of selected classroom-relevant EEG studies on the acquisition of syntax in L2 and L3. Based on our review, we can conclude that an adult second-language learner's brain is highly dynamic, even during the earliest stages of L2 learning (Osterhout et al. 2008). Furthermore, proficiency matters in explaining the variation in EEG patterns, in that only higher proficiency is associated with the P600 effects typical of monolingual native speakers. Further, when exposed to implicit instruction, less proficient learners employ declarative knowledge, whereas more proficient bilinguals rely on native-like mechanisms. Late bilinguals do not show native-like neurocognitive signatures in the early stage of L2 acquisition. Early and highly proficient bilinguals' syntactic processing resembles native processing if the L1 and L2 are typologically similar. Other factors shown to be predictors are working memory, timing, and instructional setting. Early acquisition of L2 syntactic knowledge leads to a higher mastery of syntax. Thus, for language policy makers, it is relevant to plan an early introduction of syntactic instruction in foreign languages in the respective national curricula. The results from the L3 studies using EEG methodology confirm findings from SLA.

Although we found very few EEG studies that examined ecologically valid learning situations and can thus tell us about second language education itself, the findings of the EEG studies conducted so far in lab settings offer second-language educators important implications about the nature of second/third language knowledge, its consolidation and proceduralization, and the mechanisms involved in these processes. Based on such findings, second language educators can learn more about how particular language domains are processed and acquired and what teaching techniques they should employ to foster that particular type of knowledge generation and consolidation.

Bearing in mind the importance of these findings for actual language teaching, we hope that teachers will deepen their understanding of neurocognitive processing in the L2 and make use of research-informed implications. We allow ourselves a note of caution in this context, as deriving concrete pedagogical implications is sometimes rather challenging, especially when the reported results show differential ERP effects across studies that also differ in the amount of input exposure and in the individual differences they control for, which could account for language learner variation. Moreover, the number of EEG studies comparing various instructional interventions is limited to non-existent. This likewise makes it impossible to derive any conclusions about the effectiveness of such interventions, as neurocognitive assessment measures are lacking. L2 instructional interventions focus on providing learners with meaningful input. This is relevant from an acquisitional point of view. However, as pointed out by Angelovska and Roehm (2020), it is not feasible to suggest implications of practical relevance for teaching if EEG results about the effectiveness of the different grammar intervention types are lacking. In addition, it remains unclear: (a) whether and to what extent classroom instructional grammar interventions can compensate for immersion-context acquisition opportunities; (b) what duration those interventions should have; and (c) what type of feedback should be given and to what extent it would be sufficient if we want our learners to generate more native-like neurocognitive responses. Angelovska and Roehm (2020) called for longitudinal L2 EEG studies.

Finally, with continued interdisciplinary approaches and sophisticated research designs, the potential of ERP results to inform central questions of second and third language acquisition needs to be further explored.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Note**

<sup>1</sup> According to an anonymous reviewer, with similar behavioral outcomes for both groups, it is unclear whether the neural differences between groups suggest an "obstacle" per se or simply a different neurocognitive mechanism of accomplishing the same outcome.

#### **References**


Brazier, Mary A. B. 1960. Some uses of computers in experimental neurology. *Experimental Neurology* 2: 123–42. [CrossRef]

Burkhardt, Petra. 2005. *The Syntax-Discourse Interface: Representing and Interpreting Dependency*. Amsterdam: John Benjamins.

Burkhardt, Petra. 2006. Inferential bridging relations reveal distinct neural mechanisms: Evidence from event-related brain potentials. *Brain and Language* 98: 159–68. [CrossRef]

Buzsaki, Gyorgy, and Andreas Draguhn. 2004. Neuronal Oscillations in Cortical Networks. *Science* 304: 1926–29. [CrossRef]

Clifton, Charles, Jr., Lyn Frazier, and Keith Rayner, eds. 1994. *Perspectives on Sentence Processing*. Hillsdale: Lawrence Erlbaum Associates.


Morgan-Short, Kara, Karsten Steinhauer, Cristina Sanz, and Michael T. Ullman. 2012. Explicit and Implicit Second Language Training Differentially Affect the Achievement of Native-Like Brain Activation Patterns. *Journal of Cognitive Neuroscience* 24: 933–47. [CrossRef] [PubMed]

Morgan-Short, Kara. 2014. Electrophysiological Approaches to Understanding Second Language Acquisition: A Field Reaching Its Potential. *Annual Review of Applied Linguistics* 34: 15–36. [CrossRef]


