1. Introduction
Multilingual speakers display an interesting characteristic in their everyday communication: the practice of code-switching, where elements from different languages seamlessly integrate into a single sentence or discourse (cf.
Deuchar 2012). For example, the Spanish verb
hacer, meaning “do” or “make,” can serve as both a lexical verb of creation and a causative verb. In Spanish–German code-switching,
hacer is also employed as a light verb, losing much of its semantic content (1). However, this usage is confined to code-switching contexts (
González-Vilbazo 2005;
González-Vilbazo and López 2011).
(1) | Vamos | a | hacer | schreiben | la | Matharbeit. |
| go.1PL | PREP | do.INF | write.INF | DET.FEM | maths.homework |
| We will write the maths homework. |
| (Adapted from González-Vilbazo 2005, p. 202) |
In the following sections, we explore studies on code-switching at the interfaces, focusing specifically on its intersections with information structure (or information packaging), i.e., the way in which information is formally packaged within a sentence. The interface between grammar and discourse relates to how the organization of information within sentences (information structure) influences and is influenced by the syntactic form of sentences (syntax) and how these sentences fit into larger discourse contexts (discourse).
1 For an overview, see
Erteschik-Shir (
2007). The notion of information structure subsumes various dichotomies, such as topic/comment, focus/presupposition, theme/rheme, or background or given/new information, which divide a sentence into two parts based on pragmatic notions (
Féry et al. 2007;
Krifka 2008). Topic refers to what the sentence is about, whereas the rest of the sentence is the comment (
Lambrecht 1996;
Zubizarreta 1998). Focus roughly refers to the new or non-presupposed information of a sentence, whereas the rest of the sentence is given, presupposed, or shared information (
Chomsky 1971;
Jackendoff 1972;
Zubizarreta 1998). The literature presents conflicting views on the nature of focus, its distinction from topic, the types of foci, and their effects. For instance, a distinction is often made between broad and narrow focus. In broad focus, all information is new and no single element is highlighted, whereas in narrow focus, one element of the sentence is highlighted. Furthermore, a distinction is often made between neutral focus and contrastive (or corrective) focus. In contrastive focus, one alternative is selected and contrasted with another alternative (cf.
Erteschik-Shir 2007). Discrepancies also exist regarding the potential conflation of topic/comment with focus/presupposition (cf.
Parafita Couto 2005 for an overview). Cross-linguistically, information structure can be expressed in syntax (e.g., word order), morphology (e.g., through topic and focus markers), and/or prosody (e.g., through intonation). Interestingly, in code-switching, we may find two languages that differ in their expression of information structure. This prompts an exploration of how such differences impact or limit the phenomenon of code-switching. In this paper, we acknowledge the diverse perspectives on information structure held by each author, striving to identify overarching patterns.
Within the realm of linguistic theory, an important question revolves around the dynamic interplay between syntactic and information-structural theories when explaining the mechanisms that underlie various grammatical phenomena (cf.
Féry and Ishihara 2022). However, there exists a relative scarcity of literature that addresses the intricate relationship between code-switching and information structure. As such, in this paper, we undertake the task of providing an encompassing overview of the existing (albeit limited) body of research on code-switching across language combinations. Our focus is directed towards specific manifestations of code-switching, including light verbs (as discussed by
González-Vilbazo and López 2011,
2012), subject pronoun–verb switches (as explored, for instance, by
Bustin et al. 2024), ellipsis (examined by
Merchant 2015;
González-Vilbazo and Ramos 2019;
Delbar 2022), and intonation (studied by
Olson and Ortega-Llebaría 2010).
Code-switching has been studied from different perspectives, and different theoretical models have been proposed, with differing and sometimes contradictory predictions regarding the (un)acceptability of code-switches. In
Section 2, we discuss the main theoretical approaches to code-switching. In
Section 3, we provide an overview of different studies, contextualizing the findings associated with information structure and code-switching, which stem from diverse theoretical frameworks (e.g., generativism and the Matrix Language Frame model). We acknowledge the limitations of the existing research, which often features examples of code-switching studied in isolation and out of context; that is, some studies do not explicitly consider information structure. Moreover, many of these studies involve a relatively small number of participants. In light of these limitations,
Section 4, following this overview, presents a discussion of the theoretical and methodological considerations that should guide future studies in this area.
Overall, this contribution surveys the existing literature on information structure and code-switching at the interfaces of syntax, prosody, and discourse. We examine the available sources to delineate the existing landscape and to identify key themes, gaps, and potential avenues for further research.
2. Theoretical Approaches to Code-Switching
In the study of code-switching, the focus is on understanding the boundaries that allow or disallow the mixing of languages in speech. The systematic study of code-switching is relatively recent: it originated in the early 1970s, in opposition to earlier assertions that language switching is random (cf.
Weinreich 1953;
Labov 1971). This early research laid the groundwork by providing a descriptive foundation that outlined constraints on language mixing. For instance,
Gumperz and Hernández-Chávez (
1970) noted “linguistic constraints” on pronominal pronouns in Spanish–English code-switching, while
Aguirre (
1976) investigated grammatical judgments on code-switching, marking the overt emergence of theories of code-switching. As such, linguists at this time, such as
Lipski (
1978),
Pfaff (
1979), and
Poplack (
1980), argued that code-switching between languages is not arbitrary but is governed by structural constraints. In these early years, researchers concentrated on identifying the specific structural constraints characterizing this bilingual speech phenomenon.
Pfaff (
1979) conducted a pioneering study on code-switching, emphasizing that the mixture of Spanish and English is socially motivated but constrained by clear linguistic principles. Based on an analysis of conversational data from around 200 speakers, Pfaff argued against the need for a third grammar and proposed that Spanish and English grammars are intertwined, based on various constraints. She identified structural constraints favoring surface structures common to both languages. For instance, Spanish auxiliaries and English participles can be combined, while the mixing of adjectives and nouns within noun phrases is restricted due to differing surface orders in Spanish and English. This observation aligns with
Poplack’s (
1980) “Equivalence Constraint,” which suggests that code-switching tends to occur where the juxtaposition of elements from both languages does not violate syntactic rules. It is important to highlight that the constraints postulated by Poplack focused on surface switch points rather than on the material being switched. More recent studies suggest that exclusively considering switch points is not adequately explanatory and that the morphosyntax of the entire clause needs to be taken into account (cf.
Deuchar 2020;
Parafita Couto et al. 2023).
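The surface-order intuition behind the Equivalence Constraint can be illustrated with a toy check. The sketch below is a deliberate simplification and not Poplack's own formalization: it assumes a hand-written table of surface orders (the names SURFACE_ORDER and switch_allowed are hypothetical) and treats Spanish adjectives as uniformly postnominal, ignoring the prenominal adjectives that Spanish also allows. A switch between two categories is permitted only if both languages linearize them in the same way.

```python
# Toy illustration of the Equivalence Constraint (after Poplack 1980).
# The order table is a hypothetical simplification for illustration only.

# Surface order of two categories in each language.
SURFACE_ORDER = {
    "spanish": {
        ("AUX", "PARTICIPLE"): ("AUX", "PARTICIPLE"),
        ("ADJ", "NOUN"): ("NOUN", "ADJ"),  # simplification: postnominal only
    },
    "english": {
        ("AUX", "PARTICIPLE"): ("AUX", "PARTICIPLE"),
        ("ADJ", "NOUN"): ("ADJ", "NOUN"),
    },
}


def switch_allowed(pair, lang_a="spanish", lang_b="english"):
    """Return True if both languages linearize the category pair identically."""
    return SURFACE_ORDER[lang_a][pair] == SURFACE_ORDER[lang_b][pair]


print(switch_allowed(("AUX", "PARTICIPLE")))  # True
print(switch_allowed(("ADJ", "NOUN")))        # False
```

The two calls mirror Pfaff's observation that Spanish auxiliaries combine with English participles, whereas adjective and noun mixing is restricted. As noted above, a check of this kind looks only at the switch point and says nothing about the morphosyntax of the rest of the clause.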
Given instances that contradict Poplack’s constraints,
Poplack (
2001) proposed replacing the idea of constraints on code-switching with that of general principles. One such principle involves an asymmetry between the contributions of the two languages in code-switching. This was already observed by
Joshi (
1982), who presented evidence of such asymmetry based on Marathi–English data. He distinguished between the bilingual’s “matrix” and “embedded” languages and argued that the matrix language serves as the source for both open and closed class items (e.g., determiners, prepositions, tense-marked verbs) in bilingual speech, whereas the embedded language can only contribute open class items (e.g., nouns, adjectives). This differentiation between the matrix and embedded language in code-switching served as the foundation for
Myers-Scotton’s (
1993,
2002) Matrix Language Frame (MLF) model. Myers-Scotton encapsulates the mentioned asymmetry in her “Asymmetry Principle,” characterizing it as the “morphosyntactic dominance of one variety in the frame” of a clause within bilingual speech (
Myers-Scotton 2002, p. 9). The language providing the morphosyntactic framework is termed the “matrix language,” while the other language becomes the “embedded language,” with its material constituting switches away from the matrix language. While there are no restrictions on the grammatical categories of constituents from the matrix language, only certain broadly defined “open class” items can be borrowed from the embedded language. An exception to this limitation occurs when a “chunk” of embedded language items appears together, as in embedded language islands.
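The asymmetry between matrix and embedded language can be made concrete with a minimal sketch. It assumes that the matrix language is already known and that each word has been hand-tagged as open or closed class; the Token type, the function name, and the tags assigned to example (1) are illustrative assumptions rather than part of the MLF model itself. Embedded language islands are not modeled.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    form: str
    lang: str        # "es" or "de" in this example
    word_class: str  # "open" or "closed" (hand-assigned, illustrative)


def closed_class_from_embedded(clause: List[Token], matrix_lang: str) -> List[Token]:
    """Return embedded-language tokens that are closed-class items.

    An empty list means the clause respects the asymmetry as simplified here.
    Embedded language islands are deliberately not modeled.
    """
    return [t for t in clause
            if t.lang != matrix_lang and t.word_class == "closed"]


# Example (1) from the text; the open/closed tags are assumptions.
example_1 = [
    Token("Vamos", "es", "closed"),
    Token("a", "es", "closed"),
    Token("hacer", "es", "closed"),
    Token("schreiben", "de", "open"),
    Token("la", "es", "closed"),
    Token("Matharbeit", "de", "open"),
]

print(closed_class_from_embedded(example_1, matrix_lang="es"))  # []
```

With Spanish as the matrix language, the German material in (1) consists only of open class items, so the check returns an empty list. Treating German as the matrix language instead would flag every Spanish closed class item, which illustrates why identifying the matrix language matters for the model's predictions.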
The 4-M model, introduced by
Myers-Scotton and Jake (
2000), builds upon the MLF model and expands its scope. It attempts to explain the variation in the frequency of different morpheme types in code-switched data. The model classifies morphemes into four types: content morphemes, early system morphemes, late system morphemes, and outsider system morphemes. Content morphemes participate in the thematic structure of a clause by receiving or assigning thematic roles. Examples include nouns and pronouns occurring in argument positions. Early system morphemes, such as determiners and affixes modifying their heads, are characterized by their involvement in the lexical–conceptual level. They contribute semantic and pragmatic features like definiteness, plurality, and aspect. Early system morphemes are conceptually activated and depend on their directly elected content morpheme heads for form and meaning information. Late system morphemes, according to the System Morpheme Principle, are supplied by the matrix language. They include elements like agreement morphology and case affixes that make relationships within the clause more transparent, especially between arguments and predicates. Outsider system morphemes do not interact with structure-building processes at the functional or positional levels unless they are involved in case assignment. We will return to the distinction between content morphemes and system morphemes in our discussion of information structure and subject pronoun–verb switches.
Constraint-based approaches, including the Matrix Language Frame model, have been challenged by proponents of a Null Theory, according to which no restrictions should apply exclusively to code-switching.
Pfaff (
1979, p. 314) had already argued that there is no need to postulate the existence of a third grammar to explain mixed-language utterances.
Di Sciullo et al. (
1986, p. 7) characterized code-switching as an ordinary aspect of language use, requiring no special stipulation, while
Joshi (
1982, p. 150) asserted that there is no third grammar in code-switched speech. More recently,
López (
2020) distinguishes between two types of Null Theories: Broad Null Theory, utilizing Universal Grammar constraints to explain apparent code-switching constraints, and Narrow Null Theory, exclusively relying on grammatical features of the participating languages.
2 These challenges to code-switching-specific constraints were often voiced within the generative linguistic framework, which views language structure as multi-layered and hierarchical. The initial explicit Chomskyan approaches to code-switching emerged during the Government and Binding era (
Chomsky 1981a,
1981b). Examples include studies by
Joshi (
1982),
Bentahila and Davies (
1983),
Woolford (
1983), and
Di Sciullo et al. (
1986). These approaches signaled a departure from Poplack’s linear perspective, adopting a more hierarchical approach. For instance,
MacSwan (
1999,
2005,
2009) proposes a code-switching model within Minimalism where lexical items are inserted at the beginning of structure building, suggesting a universal structure-building process for all speakers. Lexical items are chosen from each contributing language’s lexicon to compose the Numeration. The construction of structures involves merging elements found in the Numeration. During the derivation, lexical items merge according to the features that require checking and valuation before transferring to the interfaces upon completing a phase. Notably, the bilingual structure’s computational process aligns with that of monolingual structures. As such,
MacSwan (
2005, p. 75) illustrates the I-language of a bilingual speaker by feeding a computational system (CHL) with lexical items from different languages (Lx and Ly). For a thorough exploration of the differentiation between “constraint-based” and “constraint-free” theories of code-switching, readers are directed to
MacSwan (
2014), who introduced this distinction.
Some studies have attempted to evaluate the predictions of different theoretical models, in particular the theoretical predictions of the MLF model and the Minimalist Program (
Parafita Couto and Gullberg 2019;
Herring et al. 2010;
Parafita Couto et al. 2015;
Vaughan Evans et al. 2020, among many others). The results of these studies mostly either align with both accounts or remain inconclusive, making it challenging to distinguish between the predictions of the two theories in naturalistic data. In contrast, several of the studies discussed in this paper contrast the two theories using (semi-)experimental data.
Indeed, the tension may arise from the fact that linguistic communities and individual speakers differ in their language use. By recognizing and describing this variability, researchers can paint a more comprehensive picture of multilingual language practices. Understanding the spectrum of choices among communities and speakers adds necessary depth to theoretical development, offering insights that go beyond the confines of conflicting theoretical perspectives.
Muysken (
2000,
2013) stands out as a pioneer in recognizing the impact of various factors on code-switching patterns. His typology distinguishes four types of language mixing, each reflecting varying degrees of contribution from the lexical items and structures of the two languages. The specific outcome is influenced by numerous factors, including typological distance, political distance, and community norms. Similarly,
Bhatt and Bolonyai (
2011) incorporate socio-cognitive constraints into their Optimality Theory (OT) framework of bilingual grammar. Furthermore,
Goldrick et al. (
2016) propose Gradient Symbolic Computation (GSC) as a formalism to explain the systematic nature of code-switching patterns, integrating psycholinguistic concepts of bilingual co-activation with generativist perspectives on grammar. Finally, a more recent perspective from
Aboh and Parafita Couto (
2024) advocates for a paradigm shift that recognizes the intricate interplay of linguistic features, hybridity, community norms, and multilingualism to foster a more holistic understanding of language. They propose studying diverse ecologies
3 where multilingual acquisition occurs, integrating factors like experience and cognition into the development of models of human language capacity. We come back to this in the discussion.
This concise theoretical overview lays the groundwork for the following section. For a fuller recent overview of theoretical approaches to code-switching, see
Parafita Couto et al. (
2023). In the next section, we provide an overview of studies that have examined the intersection of code-switching and information structure, each within a particular theoretical framework. These studies investigate the relationship between code-switching phenomena and the principles that organize information structure within mixed utterances. By bringing together the findings obtained through these different theoretical lenses, we aim to contribute to a deeper understanding of the interplay between code-switching and information structure.
3. Studies on Code-Switching at the Interfaces
Interface phenomena involve the interaction of different linguistic domains (e.g., syntax, morphology, phonology, pragmatics). Studies on code-switching at the interfaces are informative in two ways. First, by taking into account the strategies that the participating languages use to encode information structure, researchers can test hypotheses that cannot be tested with monolingual data alone. Second, considering the information structure of the sentences gives a more complete picture of when code-switches are acceptable. This section includes studies on light verbs (
González-Vilbazo and López 2011,
2012), subject pronoun–verb switches (
Bustin et al. 2024), intonation (
Olson and Ortega-Llebaría 2010), ellipsis (
Delbar 2022;
González-Vilbazo and Ramos 2019;
Merchant 2015), topic and focus particles, and code-switching between sign languages (
Stoianov et al. 2023).
3.1. Light Verbs
An interesting phenomenon of code-switching at the interfaces concerns light verbs, exemplified by the following Spanish–German example from
González-Vilbazo and López (
2012), which uses a generativist (Minimalist Program) framework:
4 In (2), a form of the Spanish light verb,
hacer, “do,” is used with the German lexical verb,
nähen, “to sew,” in the infinitive. This type of code-switch has been attested in numerous other language pairs. In fact,
González-Vilbazo and López (
2012) suggest that it occurs in most if not all code-switching pairs. The paper proposes the analysis in
Figure 1 for this construction (
González-Vilbazo and López 2012, p. 35), where the light verb in little
v takes a VP as its complement, which has a lexical V as its head.
5 In this construction, the little
v and the lexical verb come from two different lexica, which allows for the study of the properties of little
v.
González-Vilbazo and López (
2011,
2012) argue that the light verb can only come from one of the languages, which in Spanish–German code-switching is Spanish.
6 The XP, however, can come from either language.
Following Chomsky’s phase theory, González-Vilbazo and López argue that
vP is a phase and that little
v is the head of the phase.
7 As the head of the phase, little
v determines some grammatical properties of the VP. In particular, González-Vilbazo and López propose that little
v determines the word order (XP V/V XP) of the VP, its prosodic phrasing, and the encoding of information structure (in particular, focus/background). Code-switching between two typologically different languages lends itself well to testing this hypothesis. Based on conversational data and oral and written grammaticality judgment tasks from over 85 Spanish–German bilinguals living in Spain,
González-Vilbazo and López (
2011,
2012) showed that in constructions with a Spanish light verb and a German lexical verb, word order, prosodic phrasing, and the encoding of focus/background follow Spanish.
First, regarding the word order of the VP, in German, the word order of constructions with an auxiliary verb is OV, whereas in Spanish, it is VO. As shown in (3), the light verb constructions in Spanish–German code-switching follow Spanish VO word order rather than German OV word order. That is, the word order is governed by the Spanish light verb in little
v.
Second, regarding the prosodic structure of the VP, Spanish and German differ in their prosodic phrasing. In German, the lexical verb and its complement can appear in one prosodic phrase (with one pitch accent, on the object), whereas in Spanish, they appear in two separate prosodic phrases (with two pitch accents). In the light verb construction in Spanish–German code-switching, the lexical verb and its complement are in two separate prosodic phrases, as in Spanish (4a). Example (4b), where the lexical verb and its complement appear in one prosodic phrase with one accent, is ungrammatical.
(4) | a. | Juan | ha | hecho | (ϕ verKAUfen) | (ϕ die | BÜcher). | |
| | Juan | has | done | sell | the | books | |
| | Juan has sold the books. | |
| b. | #Juan | ha | hecho | (ϕ verkaufen | die | BÜcher). | |
| | Juan | has | done | sell | the | books | |
| | Juan has sold the books. (González-Vilbazo and López 2012, p. 43) |
Of particular relevance to the current paper, little
v determines the encoding of background information, i.e., whether it is deaccented or dislocated. In Spanish, background information (i.e., given information, or topic) is typically moved to the left periphery of the sentence and doubled by a clitic.
8 In German, background information can be topicalized, where the object is moved to SpecC, or it can be deaccented. In Spanish–German code-switching, the background information is dislocated as in Spanish, which is exemplified in (5a), in which the object,
die Uhren, “the watches,” is moved to the left periphery of the sentence and doubled using a Spanish clitic with the same φ-features. Specifically, the Spanish clitic is feminine and plural, like
die Uhren, “the watches” in German. Example (5b), in which the object is deaccented and appears in a single prosodic phrase with the lexical verb (as in German), is not acceptable in the context provided. This sentence would only be possible if the verb were in contrastive focus. Moreover, example (5c), in which the object is moved to the beginning of the sentence, but which lacks a clitic (as in German), is not possible in this context, in which
die Uhren, “the watches” is background information. This sentence would also only be possible in a context in which the object is in contrastive focus.
(5) | [Context: What happened to the watches?] | |
| a. | Juan | die | Uhren | las | hizo | verKAUfen. | |
| | Juan | the | watches | CL.ACC | did | sell | |
| | The watches, Juan sold them. | |
| b. | #Juan | hizo | verKAUfen | die | Uhren. | |
| | Juan | did | sell | the | watches | |
| | Juan sold the watches | |
| c. | #Die | Uhren | hizo | Juan | verkaufen. | |
| | the | watches | did | Juan | sell |
| | The watches, Juan sold them. (González-Vilbazo and López 2012, p. 44) |
In sum, these examples show that the encoding of information structure (focus/background) in the light verb construction in Spanish–German code-switching follows Spanish. This can be explained if little v determines the properties of the VP. An important contribution of this study is its demonstration that little v determines more properties than has been argued in the literature so far. Crucially, this could not have been shown on the basis of monolingual data alone.
González-Vilbazo and López (
2011) argue that the Matrix Language Frame model cannot account for the light verb construction. According to the Matrix Language Frame model, the language of the inflection of the verb is the matrix language and determines the word order of the sentence. For instance, in (6), the light verb
hizo determines that the matrix language of the sentence is Spanish, and thus the word order within the VP is VO.
However,
González-Vilbazo and López (
2011) show that the word order is only VO when there is a light verb. Sentences with code-switches between a Spanish lexical verb and a German complement clause (7) or a DP (8) follow German word order:
9 (7) | a. | Juan | dijo | dass | Johannes | klug | ist. | |
| | Juan | said | that | Johannes | clever | is | |
| | Juan said that Johannes is clever. | |
| b. | *Juan | dijo | dass | Johannes | ist | klug. | |
| | Juan | said | that | Johannes | is | clever | |
| | Juan said that Johannes is clever. (González-Vilbazo and López 2011, p. 847) |
In (7), the lexical verb is in Spanish and the complement clause is in German. The sentence is only acceptable if the verb in the complement clause is in final position (7a), as in German. Sentence (7b), in which the verb is in a non-final position, is unacceptable. In (8), a Spanish lexical verb is followed by a German DP. The contrast between (8a) and (8b) shows that the only acceptable word order is Adjective–Noun, as in German. Importantly, the Matrix Language Frame model cannot explain the difference in word order after lexical verbs versus light verbs. González-Vilbazo and López, on the other hand, have a straightforward and elegant explanation for these examples based on phase theory. Because the lexical verb and its complement belong to different phases, the lexical verb does not determine the properties of the complement, unlike light verbs.
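The logic of the phase-based account can be restated as a small decision rule. The sketch below is only an illustration of the reasoning summarized above, not an implementation of González-Vilbazo and López's analysis: the function name and the VP_ORDER table are hypothetical, and the rule reduces "determines the properties of the complement" to a single choice of governing language.

```python
# Order of verb and object inside VP, as described in the text for
# auxiliary and light verb constructions.
VP_ORDER = {"spanish": "VO", "german": "OV"}


def governing_language(verb_is_light: bool, verb_lang: str, complement_lang: str) -> str:
    """Return the language whose grammar fixes the complement's internal properties."""
    if verb_is_light:
        # A light verb sits in little v, the phase head, so it governs its VP.
        return verb_lang
    # A lexical verb and its complement belong to different phases, so the
    # complement keeps the grammar of its own language.
    return complement_lang


# Spanish light verb with a German lexical verb: Spanish VO order inside VP.
print(VP_ORDER[governing_language(True, "spanish", "german")])  # VO
# Spanish lexical verb with a German complement clause, as in (7): German grammar.
print(governing_language(False, "spanish", "german"))           # german
```

The second call corresponds to the verb-final requirement in (7a): because dijo is a lexical verb, the German complement clause keeps German word order.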
More recent work on light verbs, however, has shown that the use of
hacer as a light verb is not always available. Specifically,
Balam et al. (
2020) showed that light verbs with
hacer are produced and accepted in Belize and New Mexico, but not in Puerto Rico, showing variation across code-switching communities. This means that the production and acceptability of
hacer + Vinf are shaped by community-specific code-switching practices, in the same way that differences are observed in monolingual language contexts. As pointed out by an anonymous reviewer, these practices lead to the evolution of structural choices over time. Regarding the specific case of Belize,
Balam et al. (
2021) propose that a previously established Spanish–Maya template (such as
hacer lit’í, “to stand on one’s toes”) in earlier generations probably played a role in the spread and standardization of this hybrid structure as the community shifted from Spanish–Maya to Spanish–English bilingualism. Such findings emphasize the need to study code-switching from a language ecological perspective (cf.
Mufwene 2014), recognizing the influence of community-specific practices on bilingual grammars.
3.2. Subject Pronoun–Verb Switches
Code-switching between subject pronouns and finite verbs (9a) has been claimed to be relatively infrequent in corpora and dispreferred in judgments (cf.
Lipski 1978,
2019;
Timm 1975;
van Gelderen and MacSwan 2008). In contrast, switches between a full lexical DP and a finite verb are considered acceptable, as seen in (9b):
This has been confirmed in acceptability judgment tasks (
Fernández-Fuertes et al. 2016;
Lipski 2017). However,
Lipski (
2017) found that accuracy for switches involving lexical nouns and switches involving subject pronouns was similar in a concurrent memory-loaded repetition task, suggesting that both types of switch are acceptable. Several studies on code-switches between a subject pronoun and a verb have shown an effect of grammatical person (
Bellamy et al. 2022;
Fernández-Fuertes et al. 2016). Specifically, switches of third person pronouns are typically rated higher in an acceptability judgment task than switches of first and second person pronouns. Importantly, however, these previous studies on switches between a subject pronoun and a finite verb do not consider the information structure of the sentences. Sentences containing these switches are either presented in isolation, or the subject pronouns have referents that were previously mentioned. For instance, in
Fernández-Fuertes et al. (
2016), the lexical NPs and the subject pronouns were always given information. Specifically, they appeared as an answer to a
wh-question, as in
¿Qué hace el chico?, “what does the boy do?”—
The boy bebe agua, “The boy drinks water.” However, the information structural status of subject pronouns may play a role in the acceptability of these switches. For instance,
González-Vilbazo and Koronkiewicz (
2016) found a difference between prosodically stressed subject pronouns in contrastive focus (as indicated by capital letters in their acceptability judgment task) and other subject pronouns. Specifically, switches of stressed subject pronouns (in contrastive focus) were rated as highly acceptable, whereas unmarked subject pronouns were not, suggesting an effect of information structure.
Information structure, in particular, focus, is highly relevant because it allows us to distinguish between the predictions of different theoretical models of code-switching, as shown in
Bustin et al. (
2024). This study focused on Spanish–English code-switches between a pronominal subject and a finite verb, as in (10), and used the notion of focus to test two code-switching models: a Minimalist Approach to code-switching (
MacSwan 1999,
2000;
van Gelderen and MacSwan 2008) and the MLF/4-M model (
Myers-Scotton 1993;
Myers-Scotton and Jake 2000).
Focus plays an important role in pronoun use in Spanish, which is a null subject language (unlike English). In broad focus, where the entire sentence is new information, and no single element is highlighted, null pronouns are used (11a). However, in contrastive focus, where one element is highlighted and contrasted with alternatives, overt pronouns are used (11b) (for more on the use of null and overt pronouns and its relation to information structure, see, for instance,
Jiménez-Fernández 2016;
Frascarelli and Jiménez-Fernández 2019):
(11) | a. | Diegoi | escucha | música | mientras | proi | corta | el | césped. |
| | Diego listens to music while he cuts the grass. |
| b. | Diegoi | escucha | música | mientras | ÉL*i/j | corta | el | césped. |
| | Diego listens to music while he cuts the grass. |
| | (Bustin et al. 2024, suppl. materials) |
This distinction is relevant for the predictions of the MLF/4-M model. The MLF/4-M model allows for code-switching when the pronouns in both languages (here Spanish and English) are obligatory. According to this model, null pronouns in Spanish are system morphemes, whereas overt pronouns in contrastive focus are content morphemes. English always requires overt pronouns, which are classified as content morphemes. According to the MLF/4-M model, then, in contrastive focus sentences, code-switching between a subject pronoun and a finite verb is permitted, as the pronouns in both languages are content morphemes and obligatory. In broad focus sentences, this type of code-switching is not allowed as the (null) pronoun in Spanish is a system morpheme. The Minimalist Approach to code-switching as proposed by
MacSwan (
1999,
2000) and
van Gelderen and MacSwan (
2008) does not distinguish between pronouns in broad focus and contrastive focus sentences. According to this approach, code-switching between a subject pronoun and a finite verb is generally predicted to be unacceptable. The predictions of the two models are summarized in
Table 1 adapted from
Bustin et al. (
2024, p. 316).
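The two sets of predictions can be derived mechanically from the assumptions stated above. The following sketch restates the prose rather than reproducing Table 1: the dictionaries and function names are hypothetical, and the Minimalist prediction is hard-coded as described for van Gelderen and MacSwan (2008).

```python
FOCUS_TYPES = ("broad", "contrastive")

# Morpheme type of the subject pronoun under the MLF/4-M model, per the text.
SPANISH_PRONOUN = {"broad": "system", "contrastive": "content"}
ENGLISH_PRONOUN = {"broad": "content", "contrastive": "content"}


def mlf_4m_allows_switch(focus: str) -> bool:
    """Switch allowed only if the pronoun is a content morpheme in both languages."""
    return SPANISH_PRONOUN[focus] == "content" and ENGLISH_PRONOUN[focus] == "content"


def minimalist_allows_switch(focus: str) -> bool:
    """van Gelderen and MacSwan (2008), as summarized in the text: never allowed."""
    return False


def distinguishing_conditions():
    """Focus types for which the two models make different predictions."""
    return [f for f in FOCUS_TYPES
            if mlf_4m_allows_switch(f) != minimalist_allows_switch(f)]


for focus in FOCUS_TYPES:
    print(focus, mlf_4m_allows_switch(focus), minimalist_allows_switch(focus))
print(distinguishing_conditions())  # ['contrastive']
```

Only the contrastive focus condition separates the two models, which is why manipulating focus is what makes the comparison testable.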
It should be noted that the Minimalist Program could predict code-switching between a subject pronoun and finite verb to be acceptable if overt subject pronouns in contrastive focus were considered DPs.
Van Gelderen and MacSwan (
2008) argue that subject pronouns are base-generated under D and then move to a D position in the Spec of T. Because this creates a mixed language head, violating the PF Interface Condition, code-switching between subject pronouns and finite verbs is unacceptable. However,
Bustin et al. (
2024) propose that if subject pronouns in contrastive focus were considered DPs, they would move to the Spec of TP, like lexical subjects, and would be acceptable. Alternatively, it could be argued that subject pronouns in contrastive focus move to the left periphery of the sentence and appear as a lexical head.
The contrasting predictions of the two models, as presented in
Table 1, allowed
Bustin et al. (
2024) to experimentally test the two models, using a concurrent memory-loaded repetition task, adapted from
Lipski (
2017).
10 In this task, the participants first heard a sequence of four numbers, then saw a picture, and then heard a sentence, followed by a beep. After the beep, the participants repeated back the sequence of numbers and the sentence. Crucially, the picture provided a context for the sentence: a picture in which one person performed two actions was used to elicit a contrastive focus interpretation, whereas a picture in which two people performed the two actions elicited a broad focus interpretation. Accuracy in repeating back the sentence was measured. In this study, accurate repetition was taken to reflect acceptability, whereas modifications to the structure were taken to indicate that the structure was not acceptable.
The participants completed the task in both unilingual Spanish and code-switching mode. As there is variation in pronoun use among Spanish varieties and among Spanish–English bilinguals, unilingual mode was used to ensure that the participants had the expected pronoun use in Spanish, that is, null pronouns in broad focus and overt pronouns in contrastive focus. The participants for the code-switching analysis were 38 adult Spanish–English early bilinguals, who showed the expected pronoun use in Spanish.
The results revealed a higher accuracy for sentences with a Spanish overt subject pronoun and an English finite verb in contrastive focus than in broad focus. These results provide support for considering the information structure of the utterances. Moreover, the difference between the two focus conditions runs counter to the Minimalist Program as proposed by
van Gelderen and MacSwan (
2008), which predicted code-switching to be unacceptable in both contexts. Finally, accuracy was higher for the subject pronoun–finite verb switches than for the distractors in this study, which included switches after a preposition, complementizer, conjunction, or auxiliary verb. This provides additional support for the permissibility of these code-switches.
All in all, the results showed that code-switching between a subject pronoun and finite verb is more complicated than previously thought. The study provides empirical support for a distinction between overt and null pronouns and shows the importance of considering the information structure of sentences. Specifically, the manipulation of information structure provided a better understanding of the permissibility of code-switching between a subject pronoun and a verb. Moreover, by looking at Spanish unilingual pronoun use, Bustin et al. were able to carefully select participants and test the predictions of the two models.
In the code-switching literature, the information structure of sentences is usually not taken into account. However, it might explain the low frequency of code-switches between a subject pronoun and a finite verb in corpora and their low acceptability in studies that do not manipulate information structure and often present sentences in isolation. Additionally, other studies do not control for Spanish unilingual pronoun use, possibly obscuring the results. In short, when we look only at the syntax and not at the interfaces, we miss an important aspect of the permissibility of code-switching.
3.3. Intonation
In unilingual mode, intonation can be used in languages such as English and Spanish to convey the information structure of an utterance. However, relatively little is known about the use of intonation to mark information structure in bilingual, code-switching mode. One of the few studies on the intonation of code-switching is
Olson and Ortega-Llebaría (
2010), who studied the effect of intonation on the perception of narrow contrastive focus (where one element in the utterance is highlighted and contrasted with previous information) in code-switched utterances. Their hypothesis was that code-switched utterances would be interpreted as narrow contrastive focus utterances more frequently than non-code-switched utterances, especially when there were no intonational cues.
Their perception experiment included three groups of participants with six participants per group: late bilinguals with English as their L1, late bilinguals with Spanish as their L1, and early (simultaneous) bilinguals. The latter group was exposed to or used code-switching. The experimental stimuli consisted of unilingual Spanish and code-switched utterances in broad focus (i.e., the entire utterance consists of new information), as in (12a) and (12b), respectively:
These stimuli were manipulated for pitch range and peak alignment, as narrow contrastive focus in Spanish is associated with a larger pitch range and earlier peak alignment than broad focus. Pitch range here refers to the difference between the peak that is associated with the stressed syllable and the preceding valley, and peak alignment refers to the location of the peak with respect to the stressed syllable.
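The two measures defined above can be illustrated with a small calculation over a pitch contour. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function `pitch_measures`, the `F0Sample` type, the search window, the choice to express alignment relative to the offset of the stressed syllable, and the sample values are all hypothetical, and the valley is simply taken to be the lowest F0 value before the peak.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class F0Sample:
    time_ms: int   # time of the sample in milliseconds
    f0_hz: float   # fundamental frequency in Hz


def pitch_measures(
    contour: List[F0Sample],
    syll_onset_ms: int,
    syll_offset_ms: int,
    search_end_ms: int,
) -> Tuple[float, int]:
    """Return (pitch range in Hz, peak alignment in ms).

    Assumptions (hypothetical, for illustration only):
    - the peak is the F0 maximum between the stressed-syllable onset
      and search_end_ms;
    - the valley is the F0 minimum anywhere before that peak;
    - alignment is peak time minus syllable offset, so a negative value
      means the peak falls before the end of the stressed syllable.
    """
    region = [s for s in contour if syll_onset_ms <= s.time_ms <= search_end_ms]
    if not region:
        raise ValueError("no F0 samples in the peak search region")
    peak = max(region, key=lambda s: s.f0_hz)

    before_peak = [s for s in contour if s.time_ms < peak.time_ms]
    if not before_peak:
        raise ValueError("no F0 samples before the peak")
    valley = min(before_peak, key=lambda s: s.f0_hz)

    return peak.f0_hz - valley.f0_hz, peak.time_ms - syll_offset_ms


# Invented contour; stressed syllable assumed to span 100-220 ms.
example_contour = [
    F0Sample(0, 180.0), F0Sample(50, 170.0), F0Sample(100, 175.0),
    F0Sample(150, 200.0), F0Sample(200, 230.0), F0Sample(250, 220.0),
    F0Sample(300, 190.0),
]
pitch_range_hz, alignment_ms = pitch_measures(example_contour, 100, 220, 300)
print(pitch_range_hz, alignment_ms)  # 60.0 -20
```

Under these assumptions, a larger first value and a smaller (earlier) second value would correspond to the cues associated with narrow contrastive focus in Spanish.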
The participants listened to an utterance and had to indicate whether the utterance was a response to the question in (13a), which elicits a broad focus utterance, or to the question in (13b), which elicits a narrow contrastive focus utterance. The participants were instructed to press a key if the utterance was a response to the question in (13b).
(13) | a. | ¿Qué | pasa? | | | | (eliciting a broad focus utterance) |
| | What happens? | |
| b. | ¿Miras | el | padre | de | María? | (eliciting a narrow contrastive focus utterance) |
| | Do you see Maria’s father? |
The results revealed that code-switched utterances were interpreted as narrow contrastive focus utterances more frequently than non-code-switched utterances, by all the groups. Importantly, this difference was larger (and significant only) when there were no intonation cues for narrow contrastive focus such as a larger pitch range. This means that code-switching by itself was perceived as a cue for narrow contrastive focus.
This study is one of the few studies looking at the interaction of prosody, focus, and code-switching, and it provides support for code-switching as a way to highlight or emphasize part of an utterance. However, it is based on a highly controlled experiment with manipulations of two utterances, which raises the question of ecological validity.
11 Further research is needed on the interaction of prosody, focus, and code-switching in both experimental and naturalistic settings.
In this section, we saw that code-switching and intonation are used to highlight information. In the next section, we discuss what happens when elements or sounds are omitted.
3.4. Ellipsis
Ellipsis involves intricate connections with information-structural concepts, as highlighted by
Winkler (
2022). Generally, ellipsis pertains to the deliberate omission of linguistic elements, encompassing both structure and sound. This omission is intrinsically linked to the idea of givenness, i.e., the unspoken or deleted segment is previously mentioned or understood. The remnants of the ellipsis site, which occur to the left or right of the omitted material, are frequently connected to the notion of contrastive topic and focus. In contemporary linguistic theory, a fundamental question revolves around the synergy between syntactic and information-structural theories in explaining the licensing mechanisms governing different types of ellipsis (
Winkler 2022). Notably, discourse factors and information structure exert a profound influence on the form and interpretation of ellipsis. However, although research on both code-switching and ellipsis has grown remarkably over the past two decades, ellipsis within code-switching has not received commensurate attention (
González-Vilbazo and Ramos 2019).
The study of ellipsis in code-switching is interesting because it allows us to test theories of ellipsis, and it sheds light on the identity relationship between the ellipsis and its antecedent. There are different theories of ellipsis, including the copy theory and deletion theory. Most studies on code-switching provide support for the deletion theory, according to which the syntactic structure of the ellipsis is deleted at Spell-Out or PF. Regarding the identity relation between the antecedent and the ellipsis, the question is whether it is semantic, syntactic, or a hybrid. Most studies on code-switching and ellipsis suggest that the relationship has to be hybrid, i.e., both semantic and morphosyntactic.
One recent study showing a hybrid relationship is
González-Vilbazo and Ramos (
2019), who report on a study on sluicing and code-switching with six participants. In sluicing or TP ellipsis, the entire
wh-clause except for the
wh-phrase is deleted. González-Vilbazo and Ramos tested connectivity effects in Spanish–German code-switching, in particular regarding case. In their study, they used verbs that assign different case in Spanish and German. For instance, the verb,
amenazar, “to threaten,” assigns accusative case in Spanish, but the German equivalent,
drohen, assigns dative case. In code-switched sentences, the
wh-phrase has the case assigned by the verb in the subordinate clause, that is, dative in (14) and accusative in (15):
(14) | Juan | amenazó | a alguien | aber | ich | weiss |
| Juan | threatened | someone.ACC | but | I | know |
| nicht | {*wen/ | wem} | er | gedroht | hat. |
| not | who.ACC | who.DAT | he | threatened | has |
| Juan threatened someone, but I don’t know who he threatened. |
| (González-Vilbazo and Ramos 2019, p. 11) |
(15) | Juan | amenazó | a alguien | aber | ich | weiss |
| Juan | threatened | someone.ACC | but | I | know |
| nicht | {wen/ | *wem} | Juan | amenazó. | |
| not | who.ACC | who.DAT | Juan | threatened | |
| Juan threatened someone, but I don’t know who he threatened. |
| (González-Vilbazo and Ramos 2019, p. 11) |
In sluicing, however, the
wh-phrase has to have accusative case, as shown in (16):
(16) | Juan | amenazó | a alguien | aber | ich | weiss | nicht |
| Juan | threatened | someone.ACC | but | I | know | not |
| {wen/ | *wem}. | | | | | |
| who.ACC | who.DAT | | | | | |
| Juan threatened someone, but I don’t know who. |
| (González-Vilbazo and Ramos 2019, p. 12) |
This example shows that the elided verb was the Spanish verb, amenazar, and sheds light on the identity relation between the ellipsis and the antecedent. Specifically, it shows that the identity relation is not only semantic but also morphosyntactic. If the relation were only semantic, the use of wem in (16) should be acceptable, as the ellipsis and antecedent are semantically identical. However, the case assigner of the wh-phrase needs to be identical, which supports a hybrid analysis of ellipsis.
González-Vilbazo and Ramos (
2019) cite further support from
Nee (
2012) on sluicing in Spanish–Zapotec code-switching. Some verbs in Spanish and Zapotec assign different cases. For instance, in Spanish,
hablar, “to talk,” selects a PP (
habló con), whereas the Zapotec equivalent,
gunien, is a transitive verb. Code-switched data show that (17) is grammatical because the Spanish verb,
habló, selects for a PP, which in this case is a Zapotec PP (
tu cun). Example (18) is unacceptable because it involves P-stranding, which is not permitted in Spanish (or in Zapotec). Finally, the sentence in (19) is acceptable because
gunien is a transitive verb and selects a DP. Nee concludes that the identity relation between the antecedent and the ellipsis is not only semantic but also includes lexical identity.
Another code-switching study that shows that the identity relation between the antecedent and the ellipsis is not purely semantic is
Merchant (
2015) on VP ellipsis in Greek–English code-switching. Merchant provides the example in (20), which contains a question–answer pair:
(20) | a. | Píres | tin | tsánda | mazí | su? | (Merchant 2015, p. 204) |
| | took.2S | the | bag | with | you | |
| | Did you take the bag with you? | |
| b. | Yes, I did. | |
The response in (20b) is an example of ellipsis in code-switching, but interestingly, the data in (21) show that parallel responses without ellipsis (and the intended meaning “yes, I did take the bag with me”) are not possible:
(21) | a. | *Yes, | I | did | píra | tin | tsánda | mazí | mu. |
| | yes | I | did | take.ACT.PERF.PAST.1S | the | bag | with | me |
| (Merchant 2015, p. 204) |
| b. | *Yes, | I | did | pern | tin | tsánda | mazí | mu. |
| | yes | I | did | take[stem.form] | the | bag | with | me |
Example (21a) is unacceptable, because the English auxiliary,
did, cannot appear with a Greek inflected form, here
píra “take.ACT.PERF.PAST.1S.” The inflected form can only be created when a Greek root appears with a T node. On the other hand, (21b) is unacceptable because
pern, “take,” is a bare stem form of the verb, and bare stem forms cannot appear as free-standing words in Greek, which
Merchant (
2015) proposes is a morphological issue.
Merchant (
2015) argues that these examples show that the identity relation between the antecedent and the ellipsis needs to be morphological as well. To explain these and other examples, he proposes morphological elliptical repair. Together, all these code-switching studies show that the identity relation between the ellipsis and the antecedent needs to include morphosyntax, in addition to semantics.
All in all, most of the studies on code-switching in ellipsis show support for the deletion theory and suggest a hybrid relation between the ellipsis and the antecedent. However,
Delbar (
2022) did not find evidence for a hybrid relation in a study on NP ellipsis and gender agreement in Belgian Dutch–French code-switching. This study explored whether the choice of the grammatical gender showed a morphosyntactic link between the French elided noun and the Belgian Dutch antecedent, based on data from 23 Belgian Dutch–French bilinguals who participated in a two-alternative forced choice task. The materials for the task included sentences such as those in (22):
(22) | a. | Ik | eet | den | roden | appel | et | tu |
| | I | eat | the.M | red.M | apple.M | and | you |
| | manges | le | <appel> | vert. | | | |
| | eat | the.M | <apple.F> | green.M | | | |
| b. | Ik | eet | den | roden | appel | et | tu |
| | manges | la | <pomme> | verte. | | | |
| | I eat the red apple and you eat the green one. (Delbar 2022, p. 40) |
In (22),
appel, “apple,” in Belgian Dutch is masculine, whereas
pomme in French is feminine. If there were a syntactic identity relationship between the antecedent and the ellipsis, (22a) with masculine agreement in the ellipsis site would be expected to be acceptable. However, the participants in the two-alternative forced choice task typically preferred the feminine, that is, the gender of the French equivalent as in (22b) (i.e., gender agreement with the elided noun). Delbar concludes that there is no evidence for a syntactic identity relation. However, it is important to note that some studies on gender in code-switching have shown that participants may prefer the gender of the translation equivalent, influenced by the profile of the participants and the specific rate of switching within the community (see
Bellamy and Parafita Couto 2022, for an overview). Consequently, as an anonymous reviewer highlighted, Delbar’s findings could be relevant both to our understanding of gender agreement and the syntax of ellipsis. At the same time, the different findings for this study and previous studies on ellipsis could be due to a range of factors, including different types of ellipsis, methodological considerations (such as task effects and the number of participants), and language pairs. Delbar suggests replicating the study with Spanish–German bilinguals or using more implicit techniques such as EEG. We come back to methodological considerations in the discussion.
3.5. Information Status Particles
We have discussed several studies that included languages that use syntax and/or phonology (intonation) to encode information structure. In other languages, however, focus and topic can be expressed morphologically as well. There has been limited research on the use of these markers in code-switched utterances, but the literature contains some examples of such code-switches. For instance, in the English–Ewe example in (23), an Ewe topic marker is used. Moreover, the Fongbe–French example in (24) shows a novel combination involving the topic marker,
ɔ́, with both a French conjunction (e.g.,
donc, “so”) and a noun (e.g.,
langue, “language”), and the doubling of Fongbe–French complementizers
ḍò -que, “that–that” (Aboh, personal communication).
(23) | English–Ewe | | |
| To | me | dee | mé-le | serious | o. | (Ameka, personal communication) |
| to | me | TOP | 3SG:NEG-be.at:PRES | serious | NEG |
| | To me, he is not serious. | |
(24) | Fongbe–French | |
| Donc | ɔ́ | nyɛ̀ | mɔ́ | ḍɔ́ | que | langue | (Meechan and Poplack 1995, p. 187) |
| so | TOP | I | see | tell | that | language |
| ɔ́ | é | dò | ‖ | importante. | | |
| DEF | she | be | ‖ | important | | |
| | So, me I see that language is important. |
The Ewe and Fongbe topic markers here may be used to highlight a switch, as in
Poplack’s (
1980) flagged switches. They may also be used to trigger another code-switch as in
Clyne’s (
2003) Triggering Hypothesis.
Ameka (
2009) highlights that Kwa languages such as Akye, Akan, Ewe, Ga, Likpe, and Yoruba, while not prototypically “topic-prominent” like Chinese or “focus-prominent” like Somali, have dedicated structural positions in the clause and morphological markers to signal the information status of various components within information units. These languages can be considered “discourse configurational languages” (
Kiss 1995), as they have distinct positions in the left periphery of the clause for scene-setting topics, contrastive topics, and focus. Ameka also discusses the morpho-syntactic properties of various information packaging constructions and the variations across these languages. This highlights the need for further research on the use of topic and focus markers in code-switched utterances to understand their effects.
3.6. Code-Switching between Sign Languages
There have been very few studies on code-switching between sign languages at the interfaces. An exception is
Stoianov et al. (
2023), who showed that reiterative code-switching between two sign languages (Cena, a young sign language used in a rural community in northeastern Brazil, and Libras, the national sign language of Brazil) is used to mark information structure. In reiterative code-switching, sometimes called doubling, two signs with the same meaning (one from each language) appear one after another. To uncover the reasons for doubling,
Stoianov et al. (
2023) collected sign language data using an elicited production task. In this task, the participants described 30 short video clips of different intransitive, transitive, and ditransitive events (e.g., a woman looks at a man) to a partner, who was asked to select an image that depicted the event from three images. The task yielded 38 cases of reiterative code-switching produced by ten participants.
The results of the study showed that reiterative code-switches were particularly frequent in descriptions of reversible events, i.e., events with two animate arguments (e.g., push, look at) in which it is not clear who did what to whom. For instance, the results showed that the sign for WOMAN in Cena was frequently followed by the sign for WOMAN in Libras, although the reverse also occurred.
These cases of reiteration were interpreted as marking focus. The example in (25), which is produced in response to a video in which a woman gives a shirt to a man, shows a case of reiterative code-switching to mark contrastive focus. The participant initially produced the sentence in A1. As only the participant (A1) is recorded, B1’s reaction is not clear. However, in A2, WOMAN appears in sentence-initial position (in Libras) and is reiterated (in Cena).
(25) | A1: | MAN WOMAN GIVE MARRIED GIVE | (Stoianov et al. 2023, p. 408) |
| | A man gives something to a woman. |
| B1: | [unknown] | |
| A2: | WOMAN (Libras) WOMAN (Cena) GIVE MAN | |
| | A woman gives something to a man. | |
Example (25) is thus an example of contrastive focus, in which an element is highlighted and contrasted with alternatives (in this case, other human agents who appeared in the task). Stoianov et al. argued that reiterative code-switching, which occurred mostly in descriptions of reversible events, is used to focalize an argument (often the agent) and to disambiguate the sentence.
All in all, the paper argues that the data shed light on strategies facilitating the successful transference of potentially ambiguous information in a developing language like Cena, where conventionalized strategies have not yet taken root. Reiterative code-switches introduce a novel tool for this purpose, underscoring language users’ ability to navigate ambiguity effectively by creatively employing the linguistic tools available to them. Like the studies discussed in the previous sections, this study shows the importance of information structure in accounting for code-switching. Reiterative code-switching or doubling here functions similarly to other strategies, such as intonation or word order, to focalize an element.
3.7. Summary
In summary, this section covered diverse studies on code-switching and information structure, encompassing topics such as light verbs, subject pronoun–verb switches, intonation, and ellipsis. Light verbs have been analyzed systematically in code-switching, especially in Spanish–German (
González-Vilbazo and López 2011,
2012). These studies suggest that little
v, acting as the head of the phase, dictates word order, prosodic phrasing, and information structure in code-switching constructions.
The examination of subject pronoun–verb switches started with their infrequent occurrence in corpora and their dispreferred status. However, the study by
Bustin et al. (
2024) underscores the role of information structure, particularly focus, in distinguishing between different theoretical models. That is, several studies on subject pronoun–verb switches argue that these switches are infrequent, and according to some models, unacceptable. However, the majority of these studies do not consider information structure. As we have shown, subject pronoun–verb switches are acceptable in certain contexts. Moreover, careful consideration of information structure (in particular, focus) allowed
Bustin et al. (
2024) to experimentally test the predictions of theoretical models.
We then addressed the role of intonation in conveying information structure in both unilingual and bilingual contexts.
Olson and Ortega-Llebaría’s (
2010) study experimentally explores the impact of intonation on the perception of narrow contrastive focus in code-switched utterances. Their findings suggest that code-switching itself can function as a cue for narrow contrastive focus, especially in the absence of intonational cues.
Subsequently, the discussion shifted to ellipsis, emphasizing its connection to information structure. We underscored the limited research on ellipsis within code-switching and its potential to test theories of ellipsis. Various types of ellipsis, such as VP ellipsis, sluicing, and NP ellipsis, were explored, with a focus on studies providing support for the deletion theory and indicating a hybrid relationship between ellipsis and its antecedent.
Finally, we explored information status particles, investigating how languages express focus and topic morphologically. Once again, we highlighted the scarcity of research on these markers in code-switched utterances, with limited examples from language combinations such as English–Ewe and Fongbe–French.
Stoianov et al.’s (
2023) study on reiterative code-switching between two sign languages, Cena and Libras, was also discussed, revealing its use in marking information structure, particularly in disambiguating reversible events.
It is worth noting the growing interest in this area. For instance, ongoing research by
Jiménez-Fernández et al. (
2024) explores Topic Preposing in the grammar of Puerto Rican bilingual speakers engaging in English–Spanish code-switching, both in matrix and embedded sentences. Their hypothesis suggests that when English serves as the matrix language, preposed topics may be less accepted compared to when Spanish serves as the matrix language, due to the rigid SVO order of English in contrast with Spanish. However, their results showed that bilingual participants from Puerto Rico generally found code-switched examples moderately acceptable regardless of the matrix language. Unilingual English clauses maintained the rigid word order of English, while unilingual Spanish examples demonstrated greater acceptability in line with the language’s flexibility (
Jiménez-Fernández and Miyagawa 2014;
Jiménez-Fernández 2023). They are currently expanding this research to a different linguistic ecology, focusing on Spanish–English bilinguals in Gibraltar and the Virgin Islands, aiming to bridge the gap in comparative research within this domain.
Together, these studies highlight the importance of considering information structure in code-switching research, shedding light on the permissibility and acceptability of various linguistic phenomena. The findings challenge traditional frameworks and underscore the need for a more nuanced understanding of code-switching, incorporating factors such as focus, background, prosody, and community-specific practices.
4. Discussion
In this discussion, we will first examine how the interplay between information structure and code-switching enhances our understanding of multilingual grammars and language competence more generally. Second, we will address the theoretical and methodological considerations that should guide future studies in this field.
(i) The interplay between information structure and code-switching, multilingual grammars, and language competence. As we navigate through the available research on code-switching and information structure, it becomes evident that code-switching research stands at a crucial juncture, poised to make significant strides in understanding this intricate phenomenon and advancing our theoretical models. As we have shown, research on interfaces (and information structure in particular) informs code-switching research, and vice versa. The studies that we have discussed in this paper approach the topic of code-switching at the interfaces in different ways. On the one hand, there are studies that use code-switching data to inform linguistic theory, and in particular, information structure theories. For instance,
González-Vilbazo and López (
2012) used code-switching data to show that little
v determines the grammatical properties of VP, including information structure properties. Moreover,
González-Vilbazo and Ramos (
2019) and
Delbar (
2022) used code-switching data to contribute to theories of ellipsis. Finally,
Olson and Ortega-Llebaría (
2010) used data on code-switching and focus to shed light on the role of intonation in bilingual grammars. On the other hand, there are studies that use information structure to test the predictions of code-switching models. For instance,
Bustin et al. (
2024) used theory on the information structure of subject pronouns to test the predictions from the MLF model and a minimalist approach to code-switching. These two types of approaches to code-switching and information structure are not entirely separate. For instance, the findings from
Bustin et al. (
2024) also contribute to theories of information structure (in particular, subject pronouns in contrastive focus), even though it was not the main objective of the study. Thus, the approaches inform each other and help advance the field. However, given the limited evidence base to date, the future of the field hinges on methodological considerations, empirical depth, and a comprehensive exploration of different theoretical perspectives. The challenges that lie ahead are both methodological and theoretical.
(ii) Theoretical and methodological considerations to guide future studies. Methodological considerations are paramount, requiring the development of more robust approaches that can overcome the challenges. One of the issues is that the empirical base is limited, often featuring isolated examples without contextualization. A lack of contextualization and attention to information structure might lead us to overstate the unacceptability of certain code-switches, such as those between subject pronouns and finite verbs. Moving forward, our understanding of the (un)acceptability of code-switches would benefit from a contextualization of examples from naturalistic data (rather than isolated examples out of context) and/or a link to a corpus of naturalistic speech (such as those found at bangortalk.org.uk or talkbank.org). Moreover, some acceptability judgment tasks might benefit from providing utterances in an explicit context (cf.
Schütze 1996), as done in some experiments (e.g.,
Bustin 2020;
Bustin et al. 2024;
Fernández-Fuertes et al. 2016). Ideally, the contexts created for experiments would be modeled on code-switching patterns observed in naturalistic data (what has been termed the field-to-cognition approach, cf.
Beatty-Martínez et al. 2018;
Valdés Kroff et al. 2018). It is also recommended that future research provide more detailed information on the methodology employed and, where possible, make their data available. Improved transparency regarding methodology is essential, along with an emphasis on context, as evidenced by the omission of contextual details in the majority of early studies.
As previously emphasized by
Gullberg et al. (
2009) and
Parafita Couto et al. (
2021), despite a substantial number of observations on code-switching, the support for a specific theoretical stance is limited across language combinations and both production and experimental data. There exists tension between code-switching theories, and one approach to address this is by reconciling the differences through a careful examination of patterns of variation.
To overcome the current limitations, particularly regarding information structure, we must strive to build a stronger overall evidence base through comparative studies across language combinations, linguistic communities, and individual multilingual speakers, and through a wider range of research methods. First, we welcome comparative studies of a particular phenomenon across language combinations. For instance, replications of studies such as those by
González-Vilbazo and Ramos (
2019) on ellipsis and by
Bustin et al. (
2024) on subject pronoun–verb switches with different language combinations could strengthen their proposals. Moreover, research on code-switching and interfaces would benefit from more comparative studies of a particular phenomenon across linguistic communities, while keeping the language combination constant. As we discussed in
Section 3, recent studies on light verbs in Spanish–English code-switching show differences between communities. The comparison across linguistic communities provides a unique opportunity to determine the interplay of linguistic and social factors.
We also argue for more studies across individual multilingual speakers, as some experimental studies have highlighted issues such as participants not being habitual code-switchers and the constraint of a limited number of participants. Future research, therefore, should include more information about the participants, their bilingual/multilingual experience, and their code-switching practices. In recent years, more tools have in fact become available to assess the participants’ bilingual profile and their code-switching practices, such as the Bilingual Language Profile (
Birdsong et al. 2012), the Assessment of Code-Switching Experience Survey (ACSES) (
Blackburn 2013), and the Bilingual Codeswitching Profile (BCSP) (
Olson 2024). More information on the participants’ background and their code-switching practices would not only help in the selection of participants, but also help explain variation in results within and across studies. Some recent studies, such as
Bustin et al. (
2024), have in fact shown variation in results across participants based on their linguistic profile (e.g., language dominance). Regarding the participants, we also argue for studies that include an examination of the participants’ speech in unilingual mode. Given that these participants are bilingual, their unilingual speech may be affected by cross-linguistic influence and not follow the expectations for monolingual speech (cf.
Ebert and Koronkiewicz 2018), which in some cases are crucial for the predictions of theoretical models. For instance,
Bustin et al. (
2024) tested the participants in both unilingual and bilingual (code-switching) modes, and their analysis of the code-switching data included only participants who showed the expected pronoun use in unilingual mode. Especially given the variation across participants, we also suggest that future research increase the number of participants.
In addition to linguistic profile, variation based on situational factors (e.g., register variation) might be relevant as well, and future studies could examine the effect of such factors.
We also suggest that a broader array of tasks, encompassing both production and comprehension tasks in addition to acceptability judgment tasks, could enhance the scope of investigation. Some experimental studies have raised concerns about task-related effects. Moreover, a number of recent studies on code-switching have found that results from judgment tasks were not in line with those from other tasks. For instance,
Bustin’s (
2020) findings for oral and written acceptability judgment tasks were less clear than the findings for the more implicit concurrent memory-loaded repetition task. Particularly for some of the more complex structures involving information structure, acceptability judgment tasks might not be sensitive enough to capture subtle effects. We therefore recommend including a wider range of tasks, as well as more implicit tasks (e.g., using eye-tracking or EEG).
Moving forward, we also call for more controlled experiments examining various interfaces, including discourse factors, intonation, and the role of discourse particles (cf.
Carrasco Santos 2023 for a recent syntactic–prosodic–sociopragmatic approach to Spanish–English code-switching in Puerto Rico). As a first step, existing corpora could be analyzed for the use of prosody, syntax, and/or morphology in the expression of information structure in code-switched utterances. Subsequent experimental work could further investigate the interplay between different strategies in the expression of information structure and code-switching. Going back to subject pronoun–verb code-switches, most of the extant literature does not mention the prosody of the utterances, and sound files are generally not available. However, as discussed in
Section 3, recent studies such as
González-Vilbazo and Koronkiewicz (
2016) suggest that prosody plays a role in the acceptability of these switches (even though the study was limited, in that prosody was indicated by capitalizing words). Moreover, although
Bustin et al. (
2024) did not include a prosodic analysis, their participants seemed to add contrastive stress in the contrastive focus condition of their concurrent memory-loaded repetition task. A good example of a study that includes both syntax and prosody is
González-Vilbazo and López (
2012) on light verbs. Future studies could include a more detailed prosodic analysis of the data, including repair phenomena (pauses, hesitations, repetitions, etc.) at the boundary of code-switches.
Furthermore, the evolution of theoretical perspectives hinges on a more extensive and diverse pool of code-switching data, also including code-switching (or code-blending) in bimodal bilinguals and signers, to refine our linguistic models. Building a stronger and more coordinated evidence base will allow us to explore the multifaceted nature of code-switching more comprehensively and enhance our theoretical models of language competence. This collective effort will enable us to better understand the intricate interactions at play and ultimately advance our understanding of this complex linguistic phenomenon.
We specifically call for more extensive research on information structure and code-switching, exploring, for instance, various types of ellipsis and conducting studies on topic/focus markers. Additionally, there is a need for investigations across different language pairs, aiming to replicate previous findings and expand research across diverse communities. We recommend integrating prosody, syntax, morphology, and semantics, whether studies are based on corpora or controlled production/comprehension tasks. Regarding theoretical implications, such studies constitute a promising avenue for advancing our theoretical understanding of multilingual grammars. Theories should be capable of accommodating diverse data sets, as illustrated by Bustin et al.’s discussion of how the Minimalist Program could explain their findings. As noted by an anonymous reviewer, another area for further research is the pragmatics of code-switching and its use as a rhetorical and persuasive device.
In conclusion, this review has provided a panoramic view of the existing literature on code-switching, particularly focusing on its interfaces with syntax, prosody, and discourse. Our exploration has revealed key themes, highlighted gaps in knowledge, and identified avenues for further research. A comprehensive understanding of multilingual grammars requires theoretical approaches that draw insights from diverse methodologies, including experimental studies, corpus analyses, and ethnographic investigations (
Aboh and Parafita Couto 2024;
Parafita Couto et al. 2021). As highlighted in this overview, an inclusive approach that also considers the nuances of information structure is crucial. By integrating these various perspectives, we will not only enhance the depth and scope of our linguistic inquiries but also enrich our collective knowledge of multilingual grammars, recognizing the significance of information structure in shaping language dynamics across diverse multilingual landscapes. We anticipate that future research will build upon these insights.