Compactness of Native Vowel Categories in Monolingual, Bilingual, and Multilingual Speakers: Is Category Compactness Affected by the Number of Languages Spoken?

Kogan, Vita V.

doi:10.3390/languages9070238

Open Access Article

Laboratory Fonetica e Fonologia, University of Lisbon, 1649-004 Lisboa, Portugal

Languages 2024, 9(7), 238; https://doi.org/10.3390/languages9070238 (registering DOI)

Submission received: 25 September 2023 / Revised: 15 June 2024 / Accepted: 27 June 2024 / Published: 30 June 2024

(This article belongs to the Special Issue Investigating L2 Phonological Acquisition from Different Perspectives)

Abstract

Phonetic category compactness pertains to the degree of variation or dispersion within a specific category. Previous research has shown that more compact native (L1) categories in production are associated with more accurate perception and production of non-native sounds. The understanding of the factors influencing L1 category compactness remains limited. Some proposals suggest that compactness may be influenced by individual differences in cognitive processes. Alternatively, category compactness could be linked to linguistic factors, such as the number of languages spoken or the density of the phonological system. This study investigates the latter hypothesis. It examined category compactness in perception for three L1 Spanish vowels, /i/, /e/, and /a/, across four participant groups: 13 monolinguals, 31 functional monolinguals, 24 bilinguals, and 19 multilinguals. To measure compactness in perception, the study employed a perceptual categorization task consisting of synthesized variants of /i/, /e/, and /a/. Participants were asked to label these variants as either acceptable or unacceptable members of their L1 /i/, /e/, and /a/ categories. The findings revealed significant differences in category compactness between monolingual and bi/multilingual speakers. More specifically, bilingual and multilingual speakers had larger/less compact L1 vowel categories than monolinguals. The substantial variability in compactness across all groups suggests that compactness may be influenced by a range of other individual differences, besides the number of languages spoken.

Keywords:

speech perception; L2 phonetics and phonology; category compactness; variability in perception; bilingualism and multilingualism

1. Introduction

Individual differences within the same native language (L1) have garnered increasing attention, particularly in the context of second language (L2) acquisition (Kartushina and Frauenfelder 2013, 2014; Kogan and Mora 2022; Saito et al. 2024). One consistent source of L1 variability identified by previous research is phonetic category compactness in production, which pertains to the degree of variation or dispersion within a specific category. It has been reported that more compact categories are associated with more successful L2 perception and production (Kartushina and Frauenfelder 2013, 2014). Current research seeks to identify the factors that may influence category compactness among individuals. Several potential factors have been proposed, including variations in cognitive processing such as phonological short-term memory and acoustic memory (Kogan 2020; Kogan and Mora 2022), general auditory abilities (Saito et al. 2024), and linguistic factors such as the number of languages spoken (Flege and Bohn 2021). The current study investigates the latter proposition. It is among the first attempts to investigate category compactness in perception, in contrast to previous studies, which focused on production, and to examine whether category compactness varies across monolingual, bilingual, and multilingual speakers.

1.1. Individual Differences in L1 Speech

Significant research has been devoted to investigating individual differences in native speech perception and production, with a primary focus on comparing native speakers of different L1s. Typically, these studies aim to uncover how the phonological system of one’s native language interacts with the processing and acquisition of other languages at a phonetic level. More recently, with the application of such postclassical theories as exemplar-based theories and computational models to L1 and L2 speech acquisition, greater attention has been directed toward understanding the individual differences within the same L1 speech community. While these differences may not appear as striking as those observed between speakers of different L1s, they can nonetheless play a significant role in shaping the task of acquiring an L2, either facilitating or hindering the learning process (Holliday 2015; Huffman and Schuhmann 2020; Kartushina and Frauenfelder 2013, 2014; Kartushina et al. 2016; Kogan and Mora 2022; Zhai et al. 2023).

Exemplar-based theories propose that phonetic categories are defined as collections of instances of speech sounds or exemplars that individuals have encountered. These categories are not abstract, unchanging entities; instead, they consist of specific examples of speech sounds, each associated with its unique acoustic, lexical, and social context (Coleman 2003; Ettlinger and Johnson 2010; Pierrehumbert 2001). Since phonetic categories arise from a complex interplay of neurocognitive, conative, and social factors, individuals who share the same native language can possess distinct L1 phonetic categories. This diversity leads to individual differences in L1 perception and production, resulting in differences in the acquisition of novel non-native categories, e.g., different L1-to-L2 assimilation patterns (e.g., Escudero and Williams 2012).

A computational speech production model known as DIVA (Guenther 1995; Guenther et al. 1998), although different in its underlying assumptions and mechanisms from exemplar-based theories, also suggests that there exists individual variability in the size of L1 phonetic categories or the auditory spaces occupied by specific phonemes. According to the model, the auditory spaces are formed by monitoring sounds from the speaker’s L1 and learning the region of auditory space that encompasses examples of each phoneme. Individuals with more acute perception could be more likely to reject poorly produced tokens representing a particular phoneme and thus might learn auditory goal regions that are spaced further apart and more sharply defined. This hypothesis was corroborated by Perkell et al. (2004) in their study involving 19 native English speakers. They administered two tasks: (1) a task involving reading aloud and (2) a task focused on discriminating vowel sounds on a continuum. The findings revealed that the participants with a greater ability to distinguish acoustically similar L1 vowel contrasts (e.g., in cod vs. cud) displayed less variability in vowel production within categories and greater acoustic distinctions between categories. The authors proposed that greater perceptual accuracy is indicative of more specific speech targets, characterized by smaller target areas in acoustic space. This, in turn, leads to more consistent speech production, as smaller target areas result in a more rigorous rejection of non-prototypical productions identified as “speech errors”. Franken et al. (2017) arrived at similar conclusions—although with a smaller effect size—while examining native Dutch speakers. They assessed individual differences in the dispersion of phonetic categories in production and correlated these differences with perceptual discrimination abilities. 
Their findings indicated that individuals with superior discrimination skills have more distinct vowel production targets, namely, targets with reduced variability within phonemes (compactness) and greater between-phoneme distances.

To summarize, individuals sharing the same native language may exhibit variations in the manner in which their native sounds are represented within a psychoacoustic space, which might influence the processing and acquisition of L2 sounds.

1.2. Phonetic Category Compactness and L2

Compactness in production. Phonetic category compactness refers to the degree of clustering or proximity of phonetic tokens within a particular category in the acoustic space. In simpler terms, it describes how closely related the sounds belonging to a specific phonetic category are to each other. A phonetic category is considered compact if the sounds within it are closely grouped together, making them easier to distinguish from sounds belonging to other categories. Arguably, individuals with more compact native phonetic categories have a perceptual advantage when learning L2 sounds because the chances that a novel sound would fall within an acoustic space already assigned to an existing L1 category are minimal. For example, a native speaker of Spanish—a language that only has five vowels, /a/, /o/, /e/, /i/, and /u/—who possesses the compact vowel category /i/ might be more successful in distinguishing between acoustically similar L2 English /i/ and /ɪ/. Conversely, native speakers of Spanish with the less compact category /i/ might find themselves in a situation where both L2 English /i/ and /ɪ/ fall within the large acoustic space occupied by the native Spanish /i/, making it difficult to distinguish between /i/ and /ɪ/.

Kartushina and Frauenfelder (2013) were the first to introduce the concept of category compactness, which they defined as the distribution of speech sound tokens within a specific category based on individual speakers’ productions. In their study, they quantified the compactness index of L1 Spanish vowel /e/ across 14 Spanish learners of French and established a connection with the accuracy of perceiving similar L2 French vowels, specifically /e/ and /ɛ/. Their findings demonstrated a direct link: the more compact the L1 Spanish category /e/ in production was, the better participants perceived the distinction between the challenging French /ɛ/ and their native Spanish /e/. These findings suggested that the distribution of individual native vowel productions transfers into the shared phonetic space and affects the perception of similar L2 sounds.

In a subsequent study (Kartushina and Frauenfelder 2014) involving the same language pair, the authors expanded upon their findings by showing that L1 compactness in production also correlates with improved production in L2. Spanish speakers with more compact distributions for the Spanish /e/ vowel were more proficient at producing the acoustically similar French /e/ and /ɛ/ vowels.

Both studies suggest that individuals with more compact vowels in their L1 potentially possess larger “empty” acoustic spaces in between their native categories that can be readily utilized for acquiring novel sounds, thereby increasing their capacity to accurately perceive and produce unfamiliar L2 sounds. In other words, having more compact L1 categories allows for the auditory perceptual space to be less densely populated by phonetic categories from the individual’s native language, and thus, the likelihood that a similar L2 sound is perceived as not belonging to the L1 category is higher, facilitating L1–L2 discrimination and the establishment of L2 speech sounds for perception and production.

Huffman and Schuhmann (2020) demonstrated that the relationship between L1 category compactness and accuracy in L2 production extends beyond the F1–F2 space to encompass voice onset time (VOT). Tracking learners’ L2 VOT over a semester in a beginner Spanish course, they found a correlation between learners’ initial compactness in L1 English stop VOT production and their L2 Spanish VOT production accuracy by semester’s end. Furthermore, they observed that speakers with more accurate L2 VOTs also exhibit greater compactness in their L2 VOT productions.

On the other hand, Holliday (2015) observed the opposite effect with less compact categories in production during early learning stages associated with improved L2 phonetic learning up to a year later. Holliday (2015) investigated Mandarin speakers learning Korean. Beginning learners, after six weeks of instruction, showed varying levels of success in acquiring Korean VOT. Surprisingly, those with more variability in their production categories at the initial analysis displayed better production of Korean stop VOT contrasts twelve months later. This suggests that having a wider range of exemplars to draw from during the emulation of new L2 targets may be beneficial, especially in the early stages of learning. Leung (2014) arrived at a similar conclusion, observing a connection between the range of exemplars available for a given category and the robustness of that category during the acquisition process.

Lastly, Zhai et al. (2023) found no significant correlation between speakers’ L1 category compactness and L2 production compactness or accuracy. They investigated whether the level of compactness in Japanese L1 production categories correlates with the compactness and accuracy of speakers’ L2 English production categories. They analyzed the F2 and F3 values of L1 /ɾ/ productions from 30 Japanese speakers to determine individual L1 compactness. Subsequently, they used the F2 and F3 values of the same speakers’ L2 English /l/ and /ɹ/ productions to assess their L2 production accuracy. Their findings showed no connection between L1 compactness and L2 production.

To summarize, while some studies demonstrated a positive correlation between compactness in L1 categories and proficiency in acquiring and producing L2 sounds, others suggest a more nuanced relationship, wherein initial variability in production categories may facilitate later phonetic learning.

Compactness in perception. Early research on L1 category typicality or the perceptual magnet effect (e.g., Kuhl 1991; Iverson and Kuhl 1995; Lively and Pisoni 1997; Frieda et al. 2000) paved the way for the investigation of category compactness in perception. These studies demonstrated the existence of individual differences in the perceived outer limits of L1 phonetic categories. When listening to synthesized variants of a native vowel category, speakers of the same L1 differed in the extent to which a given sound token maps onto a category. Kogan and Mora (2022) employed the same methodology to assess the relationship between L1 category compactness in perception and the discrimination of an acoustically similar unfamiliar vowel contrast. They assessed the perceptual compactness index of a Spanish L1 vowel, /i/, and demonstrated that this measure is related to the discrimination of a non-native Russian contrast, /i/-/ɨ/, with more compact L1 categories facilitating the perception of novel sounds. The authors concluded that having more compact L1 categories in perception might support the early acquisition of L2 speech, allowing for more rapid processing and the building of novel non-native categories.

More data are needed to tease apart the effects that category compactness in production and in perception may have on the accuracy and precision of developing L2 phonetic categories.

1.3. Phonetic Category Compactness and the Influencing Factors

Little is known about the factors that might influence category compactness. Kartushina et al. (2016) observed that L1 compactness appears to remain relatively stable as an individual characteristic: in their study, the L1 compactness index stayed the same as new sounds were acquired. At the same time, the compactness of newly acquired L2 sounds tends to increase with training, indicating that as exposure and proficiency grow, L2 sounds become more compact. Kartushina et al. (2016) proposed that individual variability in L1 compactness may be attributed to one or a combination of the following factors: (1) individual differences in the precision of articulatory gestures, (2) individual differences in auditory phonetic representations, and/or (3) individual differences in cognitive processing. Kogan and Mora (2022) investigated the latter proposal, measuring phonological short-term memory and acoustic memory in relation to L1 category compactness in perception. Their findings indicated that only acoustic memory significantly contributed to the L2 discrimination ability interacting with L1 compactness, albeit with a weak effect size. Specifically, individuals with greater acoustic memory relied less on category compactness when discriminating a non-native vowel contrast. Conversely, individuals with lower acoustic memory benefited significantly from having more compact L1 categories on the same task.

Recently, Flege and Bohn (2021) proposed that among other factors, L1 compactness could be influenced by language-specific characteristics, such as the number of sounds present in a phonological system and the number of languages spoken. Previous research investigating differences in category compactness across languages has not consistently shown a clear relationship between category compactness and the number of phonemes within a language’s inventory. For instance, Bradlow (1995) compared English, which boasts a large vowel inventory, with Spanish, which has a smaller one. The study reported no substantial differences in the tightness or compactness of within-category clustering between these two languages.

Another study by Franken et al. (2017) compared Dutch and English, two languages with a similar number of vowels but distinct local phoneme densities. Dutch front vowels, for instance, occupy a densely populated acoustic space with neighboring phonemes, while this is not the case in English. The authors hypothesized that vowel categories situated in close proximity to other sounds might necessitate more precise (compact) articulatory targets to prevent confusion with neighboring phonemes. However, the findings did not establish a consistent relationship between category compactness and local phoneme density. With all vowels, whether they occupied a denser or sparser region within the vowel space, participants with heightened auditory acuity produced vowels that were distinctly spaced apart and displayed less within-category variability, signaling that L1 compactness is an individual characteristic.

It is important to highlight that both studies concentrated on monolingual speakers who exclusively spoke one language or another. It would be particularly intriguing to explore the distinctions in L1 category compactness among individuals with phonological systems of varying sizes, such as monolingual versus bi/multilingual speakers, since there is currently no research that investigates this question.

1.4. Phonetic Space of Bi/Multilingual Speakers

Previous research investigating L1 category compactness has focused on monolingual L2 learners. The understanding of how vowel inventories are organized in terms of category compactness among bilinguals and multilinguals remains limited. If it is assumed that the phonetic space is shared across languages, questions arise regarding whether bi/multilingual speakers possess more compact phonetic categories compared to monolingual speakers, or if their phonetic space is characterized by overlapping categories that could potentially impact speech perception and production. According to the Speech Learning Model (SLM) and its revised version (SLM-r) proposed by Flege (1995) and Flege and Bohn (2021), similar sounds in a speaker’s L1 and L2 tend to converge and combine into a composite L1-L2 category. This convergence might result in larger phonetic categories, which would imply that the bi/multilingual phonetic space is populated with less compact sound categories. A substantial body of empirical research on bilingual speech development has supported this prediction (Baker and Trofimovich 2005; Flege 1991; Flege and Hillenbrand 1984; Kornder and Mennen 2021; Thornburgh and Ryalls 1998). For instance, Flege and Hillenbrand (1984) discovered that both late learners of L2 English with L1 French and late learners of L2 French with L1 English produced French /t/ with Voice Onset Time (VOT) values that notably deviated from monolingual French short-lag VOT values yet remained insufficiently long for the long-lag English categories. Similar findings were documented in Major’s (1992) study involving American-English speakers residing in a Portuguese-speaking environment for an extended period. Although these studies do not measure category compactness directly, based on the findings they present, one might assume that bi/multilingual speakers would possess less compact categories in comparison to monolingual speakers.

Conversely, the phenomenon known as “dissimilarity shift” (Chang 2012), where native phonetic categories deviate from those of the L2 to enhance perceptual accuracy, has also been observed in bilingual acquisition studies (Flege et al. 2003; Flege and Eefting 1987a, 1987b). In terms of production, bilingual speakers have been reported to intentionally compact and/or move their L1 and L2 sounds away from each other relative to the productions of monolingual speakers of the respective languages. This deliberate manipulation creates greater contrast between similar sounds within a shared L1–L2 phonetic space, effectively exaggerating their dissimilarities and potentially resulting in smaller, more compact phonetic categories.

It must be noted that whether dissimilarity shift does or does not take place might depend on the level of proficiency attained by the bi/multilingual speaker. Flege and Eefting (1987a) examined the production of Dutch and English /t/ stops among Dutch speakers of English. Their findings indicated that only proficient English speakers (those exhibiting the most native-like English accent, as evaluated by native speakers) had a native Dutch VOT shorter than the typical Dutch /t/, which itself featured a shorter VOT compared to the longer VOT observed in English. Similarly, dissimilarity shifts were observed in proficient Spanish speakers of English. These individuals produced Spanish /p/, /t/, and /k/ with shorter VOTs than monolingual Spanish speakers (Flege and Eefting 1987b). In both studies, the authors attributed the shorter VOTs to the necessity of enhancing phonetic contrast with the longer VOTs found in English. This dissimilarity shift is not limited to consonants, extending to vowels as well. Flege et al. (2003) conducted a comparison between early and late English–Italian bilinguals and concluded that a more distinct production of vowels (in their case, English /e/ produced with more movement in comparison to monolingual English production) was a characteristic feature of early and more proficient bilinguals but not late ones. It seems that as proficiency levels increase, sound categories tend to become more compact, with dissimilarity shift serving as the underlying mechanism for this phenomenon.

The present study investigated whether bilingual or multilingual speakers exhibit merged, and thus less compact, sound categories or if their sound categories become more compact due to dissimilarity shift. In other words, the study measured and compared category compactness in perception among monolingual, bilingual, and multilingual speakers to understand whether the addition of non-native (L2, L3, Lx) vowels to the shared phonetic space influences the category compactness of L1 vowels. The bilingual and multilingual participants were proficient speakers of their corresponding L2s, so the expectation was that there would be more compact categories in bilingual and multilingual speakers, with multilingual individuals demonstrating the highest level of compactness. The study’s hypothesis posited that the speaker type, denoting the number of languages spoken by an individual, would influence compactness with more compact categories found in more crowded vowel spaces of bilinguals and multilinguals.

2. Materials and Methods

2.1. Participants

Spanish-speaking participants were recruited through CloudResearch, formerly known as TurkPrime, an online research participant-sourcing platform. The data were collected without supervision; the participants were instructed to find a quiet room and use headphones to complete the experiment. Out of the initial 109 participants, 89 completed the perception categorization task. Two participants were excluded from the study due to their dominance in languages other than Spanish, specifically Catalan and English. Eighteen participants were excluded because their datasets were incomplete or showed an unrealistic completion time or unrealistic response times (RTs). All participants were asked to self-assess their proficiency in languages other than Spanish using an adapted version of the CEFR self-assessment grid of reference levels (Council of Europe 2001; Appendix A). Based on this information, the following four experimental groups were determined: monolingual, functional monolingual, bilingual, and multilingual speakers. There were 13 European Spanish monolinguals, 31 European Spanish functional monolinguals, 24 bilinguals, and 19 multilinguals. The term “functional monolinguals” refers to individuals who have basic proficiency in a foreign language but were raised in households where only one language was spoken, meaning they had limited exposure to languages other than Spanish and primarily used Spanish in their daily lives (Best and Tyler 2007). In this study, functional monolinguals reported possessing introductory knowledge (not exceeding level A1) of several languages, including Basque, Catalan, French, Galician, German, Italian, Romanian, and Valencian (listed alphabetically). The terms “bilingual” and “multilingual” are used broadly here to refer to individuals who speak languages other than their native Spanish at the B2 level or higher. For the purposes of this study, participants were categorized as multilingual if they were proficient in at least two languages in addition to their native Spanish (three languages minimum, as opposed to the two languages of bilingual speakers). The smallest number of L2 languages reported by multilingual participants was two, while the largest was four. Bilingual and multilingual participants indicated proficiency at the B2 level or higher in several languages, including Basque, Catalan, English, French, Galician, German, Italian, Portuguese, Sicilian, and Valencian.

In order to qualify as a participant in the study as either a bilingual or multilingual speaker, individuals were required to enumerate only non-native languages in which they possessed proficiency at the B2 level or higher. The B2 proficiency level indicates the ability to communicate comfortably and spontaneously with clarity and detail (Council of Europe 2001). Out of the 43 non-monolingual participants, only 5 indicated having a proficiency level of C1 or higher in one or more languages. Specifically, three participants were bilingual in Spanish and English, one in Spanish and Galician, and one in Spanish and Catalan. It is worth noting that participants had no incentive to provide inaccurate information about their language proficiency since they would have been eligible for study participation regardless, either as part of the monolingual or functional monolingual groups.

To assess language dominance among non-monolingual participants, a short version of the Bilingual Language Profile questionnaire (Birdsong et al. 2012; Appendix B) was administered. Except for the two aforementioned participants who were dominant in Catalan and English, all non-monolingual participants demonstrated dominance in the Spanish language. Based on the questionnaire data, the average age of acquisition (AoA) for bilingual and multilingual speakers was 8.09 years (SD = 6.03). All non-monolingual participants provided information on the years of formal education they received in each language, with an average of 8.11 years (SD = 3.1). Additionally, the average number of years spent living in a country or region where the target language is spoken was 16.81 years (SD = 13.49). For multilingual speakers, these measures were based on the L2 they acquired earliest.

2.2. Instruments

Vowel Identification Task

To assess the compactness of the native Spanish vowels /i/, /e/, and /a/ in perceptual space, a vowel categorization task was administered. In this task, participants were asked to evaluate whether a presented sound was a member of their native category or not. Participants were exposed to synthesized variants of the Spanish vowels /i/, /e/, and /a/, one variant at a time, and tasked with determining whether each variant belonged to the respective Spanish category. Each token was assessed as an instance of one vowel category only; for example, participants heard a sound and had to decide whether or not it resembled Spanish /i/. To illustrate, the instructions for /i/ were as follows: “In this task, you will be presented with various vowels spoken by a male speaker. Your objective is to ascertain whether each heard vowel resembles the sound /i/ as heard in the word ‘sin’ (pronounced as: ‘sin’/sin/)”. Participants were instructed to select either “yes” or “no” as quickly as possible by clicking a corresponding button on the screen, and their response time (RT) was recorded for each response. If the response was not provided within 3 s, the screen moved on to the next sound, and the response was assigned a value of 0 (excluded from statistical analysis). Each participant rated each variant four times, for a total of 128 trials per vowel, presented in three blocks (one block per vowel). The order of blocks was counterbalanced using a Latin square design. A male voice was selected to mitigate potential interference from preferences for specific female voices, which previous research has identified as a confounding variable in perception studies (e.g., Reiterer et al. 2020).
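The trial structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cyclic construction of the Latin square, the rule that assigns a row of the square to each participant, and the shuffling of trials within a block are not specified in the text, and the function names are hypothetical.

```python
import random

VOWELS = ["i", "e", "a"]   # one block per vowel
N_VARIANTS = 32            # synthesized variants per vowel
N_REPETITIONS = 4          # each variant is rated four times


def latin_square(items):
    """Cyclic Latin square: row r is `items` rotated left by r positions."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


def block_order(participant_index):
    """Assumed assignment rule: participants cycle through the rows."""
    square = latin_square(VOWELS)
    return square[participant_index % len(square)]


def trials_for_block(vowel, rng):
    """Build the 32 x 4 = 128 trials of one block.

    Shuffling within the block is an assumption, not stated in the text.
    """
    trials = [(vowel, variant)
              for variant in range(1, N_VARIANTS + 1)
              for _ in range(N_REPETITIONS)]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    rng = random.Random(0)
    for vowel in block_order(participant_index=1):
        print(vowel, len(trials_for_block(vowel, rng)))
```

With three blocks, a cyclic square yields three block orders, so each vowel appears once in each serial position across the three orders.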

To create the stimuli, the current study followed the procedure described in Kuhl (1991) and Bosch et al. (2000). Klatt’s synthesizer was employed (Klatt 1980) to generate 32 variations for each vowel. These variants were distributed across a psychoacoustic space scaled in mel frequencies (F1 × F2), as shown in Figure 1. The prototypical Spanish /i/, /e/, and /a/ vowels were chosen based on values reported by Chládková and Escudero (2012) for a European Spanish male speaker. These prototypical values were set at F1 = 286 Hz (386 mels) and F2 = 2367 Hz (1665 mels) for /i/, F1 = 488 Hz (596 mels) and F2 = 2023 Hz (1531 mels) for /e/, and F1 = 770 Hz (836 mels) and F2 = 1342 Hz (1257 mels) for /a/. The amplitude was set to 70 dB.
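The text does not state which Hz-to-mel conversion was used. As a sketch, the widely used formula mel = 2595 · log10(1 + f/700) is assumed below; it reproduces the reported values for /i/ and /e/ to within 1 mel, but it should be read as an assumption rather than as the conversion actually applied in the study.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mels with the assumed formula 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel, useful for passing formant values to a synthesizer."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    # Prototype formants (Hz) for /i/ and /e/ as reported in the text.
    for label, f1, f2 in [("i", 286, 2367), ("e", 488, 2023)]:
        print(label, round(hz_to_mel(f1)), round(hz_to_mel(f2)))
```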

The 32 variants formed eight vectors around the prototypical /i/, /e/, and /a/ vowels. The pilot study with seven native speakers of Spanish confirmed that 4-step vectors approximately covered the perceptual spaces for the Spanish /i/, /e/, and /a/ vowels. The vowel variants were created by manipulating F1, F2, or both simultaneously, with a consistent difference of 30 mels in F1 values and 50 mels in F2 values between each variant. Additionally, all vowel tokens shared fixed values for the third through sixth formants: F3 = 3010 Hz, F4 = 3300 Hz, F5 = 3850 Hz, and F6 = 4990 Hz. Bandwidths were set at B1 = 60 Hz, B2 = 90 Hz, B3 = 150 Hz, B4 = 200 Hz, B5 = 200 Hz, and B6 = 1000 Hz. The stimuli had a duration of 500 ms. The fundamental frequency started at 112 Hz, rose to 132 Hz within the initial 100 ms, and gradually decreased to 92 Hz over the subsequent 400 ms, producing a natural rise–fall contour.
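One layout consistent with 32 variants arranged on 4-step vectors is sketched below: eight directions around the prototype (F1 only, F2 only, and both together, each in either direction), with steps of 30 mels in F1 and 50 mels in F2. The exact geometry is shown in Figure 1, which is not reproduced here, so the direction set is an assumption and the function name is hypothetical.

```python
STEP_F1_MEL = 30   # step size along F1 (mels)
STEP_F2_MEL = 50   # step size along F2 (mels)
N_STEPS = 4        # steps per vector

# Assumed directions: F1 only, F2 only, and both formants together.
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1),
              (1, 1), (1, -1), (-1, 1), (-1, -1)]


def variant_grid(proto_f1_mel, proto_f2_mel):
    """Return (F1, F2) mel pairs for the variants around one prototype."""
    variants = []
    for d1, d2 in DIRECTIONS:
        for step in range(1, N_STEPS + 1):
            variants.append((proto_f1_mel + d1 * step * STEP_F1_MEL,
                             proto_f2_mel + d2 * step * STEP_F2_MEL))
    return variants


if __name__ == "__main__":
    grid_i = variant_grid(386, 1665)   # prototype /i/ in mels, from the text
    print(len(grid_i), grid_i[:4])
```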

2.3. Procedure

The experimental design entailed a single testing session conducted entirely in Spanish, which was completed within approximately 30 min. Each participant was provided with a unique URL to access the experiment via the PsyToolkit platform (Stoet 2010, 2016).

Upon visiting the website, participants first encountered the information sheet and informed consent screens. They were also required to specify the type of technology they were using to ensure optimal settings. For instance, the use of mobile phones and tablets was prohibited, and participants were instructed to use headphones for the experiment.

Before commencing the audio segment of the experiment, thorough sound checks were performed to ensure both sound quality and a comfortable volume level. Upon completing the experiment, each participant received an automatically generated individual code, which they used to claim their compensation of GBP 10.

3. Results

All statistical analyses were conducted using R version 3.5.0 (R Core Team 2018), with the assistance of several essential packages, including car (Fox and Weisberg 2019), ggplot2 (Wickham et al. 2007), and psych (Revelle 2017).

3.1. Overview of the Data

To compute the perceptual compactness index for each vowel for each participant, the methodology described by Kartushina and Frauenfelder (2013) and Chang et al. (2023) was adopted, which assesses category compactness in production. Their formula (1) was applied, which calculates the area of an ellipse to represent the distribution of productions in the F1/F2 space, although in the current study, the formula was used for perceptual data. First, 1 standard deviation from the mean F1 and F2 of the tokens selected by the participants as acceptable representatives of their internal L1 categories was calculated. These standard deviations were used to determine the major and minor axes of an ellipse that depicted the compactness of a specific category. Next, the area of the ellipses was calculated using the following formula:

Area = πab, (1)

where a represents the standard deviation of F1, and b denotes the standard deviation of F2 of a given vowel.
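As a minimal sketch of Formula (1), the index for one vowel can be computed from the F1 and F2 values of the tokens a participant accepted. The text does not specify whether the sample or the population standard deviation was used; the sample version (`statistics.stdev`) is assumed here, and the token values in the example are invented purely for illustration.

```python
import math
import statistics


def compactness_index(f1_values, f2_values):
    """Ellipse area pi * a * b, with a = SD of F1 and b = SD of F2.

    Uses the sample standard deviation (an assumption); needs at least
    two accepted tokens, otherwise statistics.stdev raises an error.
    """
    a = statistics.stdev(f1_values)
    b = statistics.stdev(f2_values)
    return math.pi * a * b


# Hypothetical accepted tokens for one participant and one vowel (mels).
accepted_f1 = [356.0, 386.0, 416.0]
accepted_f2 = [1615.0, 1665.0, 1715.0]

area = compactness_index(accepted_f1, accepted_f2)
print(round(area, 1))
```

A larger area means that the accepted tokens are spread more widely in the F1/F2 space, that is, a less compact category. The global index described below is simply the sum of the three per-vowel areas.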

Among the three native Spanish categories measured in the present study, /i/ displayed the highest mean value across all participants, indicating the largest category (i.e., the lowest compactness) on average, while concurrently exhibiting the lowest variability. The mean value for /i/ was 72.47 kHz² (SD = 17.2 kHz²), whereas the mean value for /e/ was 68.1 kHz² (SD = 20.3 kHz²), and the mean value for /a/ was 64.9 kHz² (SD = 23.9 kHz²). This observation aligns with corpus research on Spanish, which has documented a lower frequency of occurrence for /i/ (6–7%) in comparison to /e/ (15%) and /a/ (12–13%) (Sandoval et al. 2008).

The global L1 category compactness index for each participant, the sum of all three categories, exhibited a range from 91.74 kHz² to 32.93 kHz², with a mean value of 201.08 kHz² (SD = 39.81 kHz²). Descriptive statistics for each speaker type (monolingual, functional monolingual, bilingual, and multilingual) are provided in Table 1. At first glance, monolingual speakers exhibited the lowest global index, i.e., the most compact categories (mean = 181.45 kHz²), when compared to the other speaker groups, i.e., functional monolinguals (mean = 194.27 kHz²), bilinguals (mean = 211.54 kHz²), and multilinguals (mean = 214.7 kHz²) (Figure 2).

3.2. Compactness Index and Speaker Type

To examine the impact of speaker type (a variable with four levels: monolinguals, functional monolinguals, bilinguals, and multilinguals) on the compactness index, a mixed-effects linear regression model was fitted with the compactness index as the response variable and speaker type and vowel as two predictors; participant was also included as a random intercept. Initially, the interaction between speaker type and vowel was included as a third predictor, but it was dropped because it did not reach significance. The model with two predictors showed a significant effect of speaker type for bilingual speakers compared to monolinguals (β = 38.71, SE = 42.26, p = 0.03*) and for multilingual speakers compared to monolinguals (β = 87.20, SE = 44.59, p = 0.05*); more specifically, bilingual and multilingual speakers exhibited a larger compactness index than monolingual participants. The fixed effects alone explained 4.6% of the variance (marginal R-squared), and the fixed and random effects combined explained 29.9% (conditional R-squared). Although the compactness index was larger for multilingual speakers than for bilingual speakers, the difference between these two groups was not statistically significant (p = 0.69).

To delve deeper into the relationship between compactness and various contributing factors, a linear regression model was fitted for bilingual and multilingual speakers. This model included language proficiency, age of acquisition, mode of language instruction, and experience in the target language country. These variables were motivated by previous research indicating that the quality and quantity of L2 input are important factors in determining whether novel L2 categories will be formed, which might influence the compactness index (Flege and Bohn 2021; Kartushina and Martin 2019; Leung 2012; Piske and Young-Scholten 2008; Ramon-Casas et al. 2009). None of the predictors reached statistical significance.

3.3. Reaction Time and Speaker Type

Participants' reaction time (RT) was recorded for each response to an auditory token (the variants of the three Spanish vowels). The average RT across all vowels was 581.17 ms (SD = 657.16 ms).

RTs were captured to investigate the relationship between processing time and compactness, with the hypothesis that individuals with more compact phonetic categories exhibit faster auditory processing and shorter RTs (Ishkhanyan et al. 2019; Meunier et al. 2003). An additional linear mixed-effects regression model was fitted, with RT as the response variable, compactness as the fixed effect, and participant as the random effect. The analysis revealed no significant relationship between the variables (β = −0.01, SE = 0.006, p = 0.0671). Nonetheless, the observed trend was intriguing: RT decreased as the compactness index increased, indicating faster processing among individuals with larger categories, that is, among non-monolingual speakers. Despite the lack of statistical significance, we discuss this relationship below, as this is the first instance in perceptual research in which RTs and compactness have been empirically related.

4. Discussion

The present study delved into the compactness of native phonetic categories, exploring how they are influenced by the speaker’s language profile. In other words, the study investigated whether monolingual, bilingual, and multilingual speakers differ in terms of the size or compactness of their native vowel categories in perceptual space.

4.1. Compactness in Perception and Production

The present study is the first attempt to capture category compactness variability in perception, differing from previous research, which has primarily focused on production. Participants displayed significant individual variability in terms of the compactness of their native Spanish categories for /i/, /e/, and /a/ in perception. To provide context, Kartushina and Frauenfelder (2013) reported a compactness range in production for L1 Spanish /e/ spanning from 16.28 kHz² to 57.69 kHz². In this study, the range for the same vowel extended from 13.33 kHz² to 132.65 kHz². It appears that, on average, phonetic categories in perception tend to be larger than those in production. It must be noted, however, that a more nuanced measure of category compactness, namely a goodness-of-fit task, could provide lower compactness indices that would align better with the production data. Nevertheless, Meunier et al. (2003), who compared perception and production in L1 English, French, and Spanish, also observed larger categories in perception than in production.

One plausible explanation for this phenomenon is the so-called “hyperspace effect”, as documented by Johnson et al. (1993). In their investigation, participants tended to select stimuli from more peripheral regions in perception than those they themselves produced. This effect was observed when native English listeners were asked to identify the best examples of various English vowel categories from a two-dimensional array of vowel stimuli characterized by differing F1 and F2 frequencies. Notably, the F1 and F2 values in their production of English vowels were also acoustically analyzed. Participants consistently favored stimuli with more peripheral frequency values as exemplars of a vowel, even if those values differed from their own production—the same phenomenon observed by Frieda et al. (2000) and Newman (2003).

While this study does not compare specific compactness indices for categories in perception versus production, it does point to a wider acoustic range in category perception than in production. It seems that category size may be reduced (i.e., compactness increased) during production to minimize articulatory effort, a fundamental constraint on speech production that does not apply to perception (Lindblom 1992).

Lastly, it is crucial to acknowledge the methodological disparities between perception and production studies. Production studies typically involve the articulation of vowels within specific contexts, whereas in the present study, vowels were perceived in isolation.

The L1 Spanish category /i/ exhibited the greatest within-participant variability, yet it demonstrated the least variability between participants. On average, participants assigned a broader range of variants to the phonetic category /i/ compared to /e/ and /a/. Even highly deviant variants of /i/ (meaning the tokens positioned farthest from the prototype selected in this study to represent the vowels of interest) tended to be accepted as satisfactory by many participants. This observation can potentially be explained by participants' exposure to a more diverse set of /i/ variants compared to /e/ and /a/ variants. A database of phonological inventories, Phoible, reports /i/ as occurring in 92% of the world’s languages, rendering it the most widespread vowel sound in the vocalic inventories of languages (Moran et al. 2014). Given that each language has a distinct distribution of each vowel category, it is plausible to assume that there exists a greater number of /i/ variants “in the wild” compared to /e/ (found in 61% of world languages) and /a/ (found in 86% of world languages). However, it is important to note that this hypothesis necessitates further exploration and empirical testing.

The limited variability of /i/ across participants in terms of perceptual compactness might be attributed to the inherent characteristics of a corner vowel, as proposed by Stevens (1972, 1989).

4.2. Category Compactness in Monolingual, Bilingual, and Multilingual Speakers

Measurements of native category compactness from monolingual, functional monolingual, bilingual, and multilingual speakers were collected. The initial hypothesis posited that the speaker type, denoting the number of languages spoken by an individual, would influence compactness and lead to more compact categories being found in more crowded vowel spaces of bilinguals and multilinguals. The findings revealed the opposite trend: the size of native categories increased across speaker groups. A significant difference in compactness was observed between monolingual and bilingual speakers and between monolingual and multilingual speakers, with bilinguals and multilinguals having significantly larger native categories in comparison to their monolingual peers. These findings might suggest the presence of what is commonly known as “composite” or “merged” categories in the bi/multilingual participants (Flege and Hillenbrand 1984; Flege 1991; Flege and Eefting 1987b; Sundara et al. 2006). Composite categories are particularly characteristic of sounds with acoustic properties that are similar across languages, as in the case of /i/, /e/, and /a/. The SLM (Flege 1995) predicts the merging of similar categories within the shared phonetic space of non-monolingual speakers. The SLM proposes that L1 and L2 phonetic categories interact because they share the same psychoacoustic space and are connected through the mechanism known as “equivalence classification”. According to this concept, the presence of linked phonemes perceived as phonetically similar may lead to a single, merged category used for both languages. If this indeed occurs, the model suggests that the L1 sound may assimilate to the L2 sound, resulting in the category in the phonetic space taking up an intermediate position relative to monolingual categories. These composite categories should exhibit a greater category size (and be less compact) since they merge acoustic distributions from at least two languages.

Despite numerous studies examining phonetic interactions in bilingual contexts, the existence of composite categories remains a subject of debate. Earlier in this paper, various scenarios were reviewed in which composite categories were either formed or not. These scenarios depend on factors such as proficiency levels, early versus late bilingualism, quantity and quality of input, individual differences, and the inherent characteristics of the sounds themselves (acoustically similar vs. acoustically distinct).

In this study, all non-monolingual participants reported a proficiency level of B2 in the L2s they spoke (except the five participants mentioned earlier, who reported proficiency level C1). While this proficiency level is relatively advanced, it may not guarantee the accurate formation of novel phonetic categories. As a result, it could be that the participants possess composite L1 categories, though this assumption is based only on the indirect measures in this study and requires further empirical validation. It is consistent with the finding that bilingual and multilingual speakers exhibited significantly larger L1 phonetic categories than monolingual speakers.

It is possible that the larger (potentially composite) L1 categories of bi/multilingual speakers in the present study are linked to the limitations of the self-reported proficiency measure. Such measures are generally considered reliable (Brown et al. 2014; Mistar 2011). Nevertheless, a more objective assessment method would be preferable. It could be the case that the bi/multilingual participants had a much lower proficiency level after all, which would explain the formation of composite categories, a more typical scenario for less proficient learners.

In addition to proficiency level, the quality and quantity of L2 input are also crucial factors in determining whether novel L2 categories or composite categories are formed (Piske and Young-Scholten 2008). Flege and Bohn (2021) emphasize the significance of input quality, in conjunction with input quantity, and highlight that, for instance, regular exposure to accented speech may result in impoverished sound representations, even for very proficient and early bilingual speakers (Ramon-Casas et al. 2009). A study by Kartushina and Martin (2019) also demonstrates that even solid gains in L2 phonetics that ensure near-native processing might be subject to change over time. They reported a return drift that bilingual Basque-Spanish speakers experienced after they were no longer in touch with their L3 English, i.e., their vowel production in English shifted towards native norms. The authors emphasize that aspects such as the frequency and circumstances of L1/L2 use are essential to understanding the acoustic properties of speech perception and production across languages. Exposure to a single L2 variety versus multiple varieties of the same language also plays a role in how non-native sounds are perceived, with exposure to multiple varieties resulting in more robust sound categories (Leung 2012).

In this study, certain aspects of the quality and quantity of L2 input were captured, such as age of acquisition, years of formal language instruction, and experience of living in the target language country. However, none of these variables yielded statistically significant results concerning L1 category compactness in bilingual and multilingual speakers.

It is important to note, however, that regarding the mode of language instruction, while many participants reported receiving formal education in their second language, this does not necessarily imply a lack of exposure to naturalistic language input beyond academic settings. For example, individuals learning Catalan, Basque, or Galician in school may have also been exposed to these languages in their home environments or through residing in regions where those languages are spoken.

To obtain a more accurate understanding of each participant’s language history and to disentangle the various factors underlying their language experience, a more comprehensive language questionnaire is necessary. Such a questionnaire would enable the retrieval of detailed language histories and facilitate a more thorough analysis of the numerous variables contributing to an individual’s language experience. The lack of such measures is a clear limitation of the present study.

At the individual level, the global compactness index (the sum of three compactness indices from the three vowels) exhibited considerable variation within each group, including monolinguals, functional monolinguals, bilinguals, and multilinguals, suggesting that it may be influenced by factors not captured in this study. Flege and Bohn (2021) have theorized that category compactness may be influenced by individual differences in cognitive and auditory processes. In future research, it would be beneficial to control for factors such as phonological short-term memory, auditory memory, attention control, and inhibitory control. For instance, Kogan and Mora (2022) demonstrated that acoustic short-term memory, responsible for processing non-linguistic stimuli, interacts with category compactness. Their findings revealed that individuals with greater acoustic memory were less influenced by compactness when distinguishing between two acoustically similar non-native sounds. Conversely, individuals with poorer acoustic memory heavily relied on category compactness for the same task. These variations in task performance may stem from different processing strategies: some individuals adopt a top–down approach, utilizing compact L1 phonetic categories, while others employ a bottom–up strategy, relying on acoustic memory to discern subtle phonetic–acoustic differences between sounds.

Other individual differences, such as motivation and the extent of one’s social networks, may also play a role in influencing compactness. Kornder and Mennen (2021) and Schmid (2002) provide in-depth analyses of how attitudinal factors, including identity and the motivation to sound like a native speaker, profoundly shape both L1 and L2 phonology. In addition, environmental variables such as population density, exposure to various accents, and lifestyle might influence compactness. For example, Lev-Ari (2017, 2018) reported that individuals with smaller social networks are more likely to adjust their perceptual boundaries between sounds like /d/ and /t/ after exposure to a speaker with atypical pronunciation compared to those with larger social networks. This suggests that the size of one’s social network modulates susceptibility to the influence of each interlocutor, potentially implying a connection between smaller social networks and more precise and compact phonetic categories.

Finally, recent accounts have underscored that the development of bi/multilingual language skills may constitute a unique linguistic phenomenon. This suggests the necessity of adopting alternative methodologies and departing from reliance on monolingual norms when evaluating bi/multilingual individuals (Rothman et al. 2022; Sakai and Moorman 2017).

4.3. Perception in Non-Monolingual Speakers and Future Directions

In this study, individuals who spoke more than one language demonstrated less compact or larger categories compared to monolingual speakers. This discovery naturally prompts the question of how effective sound processing is in non-monolingual listeners. Less compact and potentially overlapping vowel categories may pose challenges in accurate and rapid sound perception. For instance, research examining perception efficiency in languages with extensive vowel inventories and overlapping vowel categories indicates that speakers of these languages encounter difficulties in distinguishing between acoustically similar sounds. Meunier et al. (2003) documented a notably overlapping vowel space in the production of L1 English, which subsequently impacted L1 perception. Native English speakers, for instance, demonstrate a higher tendency to miscategorize their native vowels compared to native speakers of languages like Spanish (which has fewer vowel categories, with no overlap between vowel categories, as documented in Meunier et al. (2003)). Similarly, native speakers of Danish, another language characterized by a sizable and potentially overlapping vowel inventory, have been observed to grapple with effective vowel perception. Consequently, they rely on alternative contextual cues when listening, as evidenced by Ishkhanyan et al. (2019).

In the present study, comparisons between monolingual and non-monolingual listeners revealed no significant differences in RTs, suggesting that larger phonetic categories might not influence sound processing. Although the relationship between RTs and compactness was not significant, its direction seemed counterintuitive, with faster RTs indicating efficient perceptual processing at the phonetic level despite larger phonetic categories. A more expected result would have been longer RTs for non-monolingual participants, who also have larger phonetic categories. Previous research has shown that non-monolingual individuals exhibit slower response times and a greater frequency of errors than their monolingual counterparts, especially in tasks that require rapid responses (Gollan et al. 2005). This processing cost is observable even in the dominant language of bilinguals (Ivanova and Costa 2008).

One possible explanation for our results is the nature of the task we employed: our bilingual and multilingual participants listened to isolated vowels, which may be easier to process than vowels embedded in context (Bosch and Ramon-Casas 2011; Llompart 2019). By contrast, they could have faced greater challenges in a task that involves a lexical component, wherein the target vowels are phonologically represented within words. An illustrative example of this can be found in Bosch and Ramon-Casas (2011), where Catalan-dominant and Spanish-dominant bilinguals demonstrated similar proficiency in the accurate production of distinct Catalan vowels, /e/ and /ɛ/. However, differences between these groups emerged when the same vowels were embedded within words. The Spanish-dominant participants struggled to consistently and effectively apply their phonetic knowledge to the phonological representation of words.

Exploring the processing demands imposed by crowded vowel systems in non-monolingual speakers and how they navigate this challenge is of both theoretical and practical significance and should be explored in future research.

5. Conclusions

This study investigated whether the compactness of L1 phonetic categories depends on the number of languages an individual speaks. Specifically, comparisons were made between monolingual individuals and their bi/multilingual counterparts, as the latter group would presumably have a more densely populated vowel space. This raised the question of whether bi/multilingual speakers adjust their vowel categories, potentially moving towards greater compactness. The findings revealed the opposite pattern: individuals who spoke more languages tended to exhibit larger and less compact categories.

These findings could point to the development of composite categories in bi/multilingual participants, particularly because the tested vowels /i/, /e/, and /a/ shared similar acoustic distributions across languages. It is plausible that the participants’ language proficiency was not sufficient to form distinct boundaries between similar categories. These results might also suggest that, in reality, participants in this study may not have been as proficient in their L2 languages as they reported.

Alternatively, it is essential to recognize that factors other than the number of languages and/or phonemes present in an individual’s sound inventory influence L1 phonetic compactness. These factors likely involve individual cognitive, emotional, and social aspects that interact with various dimensions of the perceptual system in a complex manner. Further research on individual differences in L1 category compactness would be valuable in elucidating this relatively novel concept.

Funding

This research received no external funding.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

		A1	A2	B1	B2	C1	C2
COMPRENDER	Comprensión auditiva	Reconozco palabras y expresiones muy básicas que se usan habitualmente, relativas a mí mismo, a mi familia y a mi entorno inmediato cuando se habla despacio y con claridad.	Comprendo frases y el vocabulario más habitual sobre temas de interés personal (información personal y familiar muy básica, compras, lugar de residencia, empleo). Soy capaz de captar la idea principal de avisos y mensajes breves, claros y sencillos.	Comprendo las ideas principales cuando el discurso es claro y normal y se tratan asuntos cotidianos que tienen lugar en el trabajo, en la escuela, durante el tiempo de ocio, etc. Comprendo la idea principal de muchos programas de radio o televisión que tratan temas actuales o asuntos de interés personal o profesional, cuando la articulación es relativamente lenta y clara.	Comprendo discursos y conferencias extensos e incluso sigo líneas argumentales complejas siempre que el tema sea relativamente conocido. Comprendo casi todas las noticias de la televisión y los programas sobre temas actuales. Comprendo la mayoría de las películas en las que se habla en un nivel de lengua estándar.	Comprendo discursos extensos incluso cuando no están estructurados con claridad y cuando las relaciones están sólo implícitas y no se señalan explícitamente. Comprendo sin mucho esfuerzo los programas de televisión y las películas.	No tengo ninguna dificultad para comprender cualquier tipo de lengua hablada, tanto en conversaciones en vivo como en discursos retransmitidos, aunque se produzcan a una velocidad de hablante nativo, siempre que tenga tiempo para familiarizarme con el acento.
COMPRENDER	Comprensión de lectura	Comprendo palabras y nombres conocidos y frases muy sencillas, por ejemplo las que hay en letreros, carteles y catálogos.	Soy capaz de leer textos muy breves y sencillos. Sé encontrar información específica y predecible en escritos sencillos y cotidianos como anuncios publicitarios, prospectos, menús y horarios y comprendo cartas personales breves y sencillas.	Comprendo textos redactados en una lengua de uso habitual y cotidiano o relacionada con el trabajo. Comprendo la descripción de acontecimientos, sentimientos y deseos en cartas personales.	Soy capaz de leer artículos e informes relativos a problemas contemporáneos en los que los autores adoptan posturas o puntos de vista concretos. Comprendo la prosa literaria contemporánea.	Comprendo textos largos y complejos de carácter literario o basados en hechos, apreciando distinciones de estilo. Comprendo artículos especializados e instrucciones técnicas largas, aunque no se relacionen con mi especialidad.	Soy capaz de leer con facilidad prácticamente todas las formas de lengua escrita, incluyendo textos abstractos estructural o lingüísticamente complejos como, por ejemplo, manuales, artículos especializados y obras literarias.
HABLAR	Interacción oral	Puedo participar en una conversación de forma sencilla siempre que la otra persona esté dispuesta a repetir lo que ha dicho o a decirlo con otras palabras y a una velocidad más lenta y me ayude a formular lo que intento decir. Planteo y contesto preguntas sencillas sobre temas de necesidad inmediata o asuntos muy habituales.	Puedo comunicarme en tareas sencillas y habituales que requieren un intercambio simple y directo de información sobre actividades y asuntos cotidianos. Soy capaz de realizar intercambios sociales muy breves, aunque, por lo general, no puedo comprender lo suficiente como para mantener la conversación por mí mismo.	Sé desenvolverme en casi todas las situaciones que se me presentan cuando viajo donde se habla esa lengua. Puedo participar espontáneamente en una conversación que trate temas cotidianos de interés personal o que sean pertinentes para la vida diaria (por ejemplo, familia, aficiones, trabajo, viajes y acontecimientos actuales).	Puedo participar en una conversación con cierta fluidez y espontaneidad, lo que posibilita la comunicación normal con hablantes nativos. Puedo tomar parte activa en debates desarrollados en situaciones cotidianas explicando y defendiendo mis puntos de vista.	Me expreso con fluidez y espontaneidad sin tener que buscar de forma muy evidente las expresiones adecuadas. Utilizo el lenguaje con flexibilidad y eficacia para fines sociales y profesionales. Formulo ideas y opiniones con precisión y relaciono mis intervenciones hábilmente con las de otros hablantes.	Tomo parte sin esfuerzo en cualquier conversación o debate y conozco bien modismos, frases hechas y expresiones coloquiales. Me expreso con fluidez y transmito matices sutiles de sentido con precisión. Si tengo un problema, sorteo la dificultad con tanta discreción que los demás apenas se dan cuenta.
HABLAR	Expresión oral	Utilizo expresiones y frases sencillas para describir el lugar donde vivo y las personas que conozco.	Utilizo una serie de expresiones y frases para describir con términos sencillos a mi familia y otras personas, mis condiciones de vida, mi origen educativo y mi trabajo actual o el último que tuve.	Sé enlazar frases de forma sencilla con el fin de describir experiencias y hechos, mis sueños, esperanzas y ambiciones. Puedo explicar y justificar brevemente mis opiniones y proyectos. Sé narrar una historia o relato, la trama de un libro o película y puedo describir mis reacciones.	Presento descripciones claras y detalladas de una amplia serie de temas relacionados con mi especialidad. Sé explicar un punto de vista sobre un tema exponiendo las ventajas y los inconvenientes de varias opciones.	Presento descripciones claras y detalladas sobre temas complejos que incluyen otros temas, desarrollando ideas concretas y terminando con una conclusión apropiada.	Presento descripciones o argumentos de forma clara y fluida y con un estilo que es adecuado al contexto y con una estructura lógica y eficaz que ayuda al oyente a fijarse en las ideas importantes y a recordarlas.
ESCRIBIR	Expresión escrita	Soy capaz de escribir postales cortas y sencillas, por ejemplo para enviar felicitaciones. Sé rellenar formularios con datos personales, por ejemplo mi nombre, mi nacionalidad y mi dirección en el formulario del registro de un hotel.	Soy capaz de escribir notas y mensajes breves y sencillos relativos a mis necesidades inmediatas. Puedo escribir cartas personales muy sencillas, por ejemplo agradeciendo algo a alguien.	Soy capaz de escribir textos sencillos y bien enlazados sobre temas que me son conocidos o de interés personal. Puedo escribir cartas personales que describen experiencias e impresiones.	Soy capaz de escribir textos claros y detallados sobre una amplia serie de temas relacionados con mis intereses. Puedo escribir redacciones o informes transmitiendo información o proponiendo motivos que apoyen o refuten un punto de vista concreto. Sé escribir cartas que destacan la importancia que le doy a determinados hechos y experiencias.	Soy capaz de expresarme en textos claros y bien estructurados exponiendo puntos de vista con cierta extensión. Puedo escribir sobre temas complejos en cartas, redacciones o informes resaltando lo que considero que son aspectos importantes. Selecciono el estilo apropiado para los lectores a los que van dirigidos mis escritos.	Soy capaz de escribir textos claros y fluidos en un estilo apropiado. Puedo escribir cartas, informes o artículos complejos que presentan argumentos con una estructura lógica y eficaz que ayuda al oyente a fijarse en las ideas importantes y a recordarlas. Escribo resúmenes y reseñas de obras profesionales o literarias.

Appendix B

An adapted version of the Bilingual Language Profile questionnaire (Birdsong et al. 2012).

References

Baker, Wendy, and Pavel Trofimovich. 2005. Interaction of Native- and Second-Language Vowel System(s) in Early and Late Bilinguals. Language and Speech 48: 1–27.
Best, Catherine T., and Michael D. Tyler. 2007. Nonnative and second-language speech perception: Commonalities and complementarities. In Language Experience in Second Language Speech Learning: In Honor of James Emil Flege. Edited by Ocke-Schwen Bohn and Murray J. Munro. Amsterdam: John Benjamins, pp. 13–34.
Birdsong, David, Libby M. Gertken, and Mark Amengual. 2012. Bilingual Language Profile: An Easy-To-Use Instrument to Assess Bilingualism. Austin: COERLL, University of Texas at Austin. Available online: https://sites.la.utexas.edu/bilingual/ (accessed on 9 April 2022).
Bosch, Laura, and Marta Ramon-Casas. 2011. Variability in vowel production by bilingual speakers: Can input properties hinder the early stabilization of contrastive categories? Journal of Phonetics 39: 514–26.
Bosch, Laura, Albert Costa, and Núria Sebastián-Gallés. 2000. First and second language vowel perception in early bilinguals. The European Journal of Cognitive Psychology 12: 189–221.
Bradlow, Ann R. 1995. A comparative acoustic study of English and Spanish vowels. The Journal of the Acoustical Society of America 97: 1916–24.
Brown, N. Anthony, Dan P. Dewey, and Troy L. Cox. 2014. Assessing the Validity of Can-Do Statements in Retrospective (Then-Now) Self-Assessment. Foreign Language Annals 47: 261–85.
Chang, Charles B. 2012. Rapid and multifaceted effects of second-language learning on first-language speech production. Journal of Phonetics 40: 249–68.
Chang, Charles B., Kevin Tang, and Andrew Nevins. 2023. Individual differences in vowel compactness persist under intoxication in first and second languages. Paper presented at the 20th International Congress of Phonetic Sciences, Prague, Czech Republic, August 7–11; Edited by Radek Skarnitzl and Jan Volín. Prague: Guarant International, pp. 1182–86. Available online: https://drive.google.com/file/d/15U2l2y4_-9lyZAgmiccQYXYj9zBi_CAu/view (accessed on 23 September 2023).
Chládková, Kateřina, and Paola Escudero. 2012. Comparing vowel perception and production in Spanish and Portuguese: European versus Latin American dialects. The Journal of the Acoustical Society of America 131: EL119–EL125.
Coleman, John. 2003. Discovering the acoustic correlates of phonological contrasts. Journal of Phonetics 31: 351–72.
Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Press Syndicate of the University of Cambridge. Available online: https://rm.coe.int/16802fc1bf (accessed on 15 June 2023).
Escudero, Paola, and Daniel Williams. 2012. Native dialect influences second-language vowel perception: Peruvian versus Iberian Spanish learners of Dutch. The Journal of the Acoustical Society of America 131: EL406–12.
Ettlinger, Marc, and Keith Johnson. 2010. Vowel Discrimination by English, French and Turkish Speakers: Evidence for an Exemplar-Based Approach to Speech Perception. Phonetica 66: 222–42.
Flege, James E. 1995. Second language speech learning: Theory, findings, and problems. In Speech Perception and Linguistic Experience: Issues in Cross-Language Research. Edited by Winifred Strange. Baltimore: York Press, vol. 92, pp. 233–77. Available online: https://www.researchgate.net/profile/James-Flege/publication/333815781_Second_language_speech_learning_Theory_findings_and_problems/links/5d071d2692851c900442d6b2/Second-language-speech-learning-Theory-findings-and-problems.pdf (accessed on 15 June 2023).
Flege, James Emil. 1991. Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a second language. The Journal of the Acoustical Society of America 89: 395–411.
Flege, James Emil, and James Hillenbrand. 1984. Limits on phonetic accuracy in foreign language speech production. The Journal of the Acoustical Society of America 76: 708–21.
Flege, James Emil, and Ocke-Schwen Bohn. 2021. The Revised Speech Learning Model (SLM-r). In Second Language Speech Learning: Theoretical and Empirical Progress. Edited by Ratree Wayland. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108886901.002
Flege, James Emil, and Wieke Eefting. 1987a. Cross-language switching in stop consonant perception and production by Dutch speakers of English. Speech Communication 6: 185–202.
Flege, James Emil, and Wieke Eefting. 1987b. Production and perception of English stops by native Spanish speakers. Journal of Phonetics 15: 67–83.
Flege, James Emil, Carlo Schirru, and Ian R.A. MacKay. 2003. Interaction between the native and second language phonetic subsystems. Speech Communication 40: 467–91.
Fox, John, and Sanford Weisberg. 2019. An R Companion to Applied Regression. Thousand Oaks: Sage. Available online: https://us.sagepub.com/en-us/nam/an-r-companion-to-applied-regression/book246125 (accessed on 9 January 2022).
Franken, Matthias K., Daniel J. Acheson, James M. McQueen, Frank Eisner, and Peter Hagoort. 2017. Individual variability as a window on production-perception interactions in speech motor control. The Journal of the Acoustical Society of America 142: 2007–18.
Frieda, Elaina M., Amanda C. Walley, James E. Flege, and Michael E. Sloane. 2000. Adults’ Perception and Production of the English Vowel /i/. Journal of Speech, Language, and Hearing Research 43: 129–43.
Gollan, Tanar H., Rosa I. Montoya, Christine Fennema-Notestine, and Shaunna K. Morris. 2005. Bilingualism affects picture naming but not picture classification. Memory & Cognition 33: 1220–34.
Guenther, Frank H. 1995. Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review 102: 594–621.
Guenther, Frank H., Michelle Hampson, and Dave Johnson. 1998. A theoretical investigation of reference frames for the planning of speech movements. Psychological Review 105: 611–33.
Holliday, Jeffrey J. 2015. A longitudinal study of the second language acquisition of a three-way stop contrast. Journal of Phonetics 50: 1–14.
Huffman, Marie K., and Katharina S. Schuhmann. 2020. The relation between L1 and L2 category compactness and L2 VOT learning. Proceedings of Meetings on Acoustics 42: 060011.
Ishkhanyan, Byurakn, Anders Højen, Riccardo Fusaroli, Christer Johansson, Kristian Tylén, and Morten H. Christiansen. 2019. Wait for it! Stronger influence of context on categorical perception in Danish than Norwegian. Paper presented at the 41st Annual Conference of the Cognitive Science Society, Montreal, QC, Canada, July 24–27; Edited by Ashok Goel, Colleen Seifert and Christian Freksa. Austin: Cognitive Science Society, pp. 1949–55.
Ivanova, Iva, and Albert Costa. 2008. Does bilingualism hamper lexical access in speech production? Acta Psychologica 127: 277–88.
Iverson, Paul, and Patricia K. Kuhl. 1995. Mapping the perceptual magnet effect for speech using signal detection theory and multidimensional scaling. The Journal of the Acoustical Society of America 97: 553–62.
Johnson, Keith, Edward Flemming, and Richard Wright. 1993. The Hyperspace Effect: Phonetic Targets Are Hyperarticulated. Language 69: 505–28.
Kartushina, Natalia, Alexis Hervais-Adelman, Ulrich Hans Frauenfelder, and Narly Golestani. 2016. Mutual influences between native and non-native vowels in production: Evidence from short-term visual articulatory feedback training. Journal of Phonetics 57: 21–39.
Kartushina, Natalia, and Clara D. Martin. 2019. Third-language learning affects bilinguals’ production in both their native languages: A longitudinal study of dynamic changes in L1, L2 and L3 vowel production. Journal of Phonetics 77: 100920.
Kartushina, Natalia, and Ulrich Hans Frauenfelder. 2013. On the role of L1 speech production in L2 perception: Evidence from Spanish learners of French. Proceedings of Interspeech, 2118–22.
Kartushina, Natalia, and Ulrich Hans Frauenfelder. 2014. On the effects of L2 perception and of individual differences in L1 production on L2 pronunciation. Frontiers in Psychology 5: 1246.
Klatt, Dennis H. 1980. Software for a cascade/parallel formant synthesizer. The Journal of the Acoustical Society of America 67: 971–95.
Kogan, Vita. 2020. The Effect of First Language Perception on the Discrimination of a Non-Native Vowel Contrast: Investigating Individual Differences. Ph.D. thesis, University of Barcelona, Barcelona, Spain. Available online: https://diposit.ub.edu/dspace/handle/2445/151460 (accessed on 20 March 2024).
Kogan, Vita V., and Joan C. Mora. 2022. L1-based perceptual individual differences in the acquisition of second language phonology: Investigating the compactness of native phonetic categories. Laboratory Phonology, 24.
Kornder, Lisa, and Ineke Mennen. 2021. Longitudinal Developments in Bilingual Second Language Acquisition and First Language Attrition of Speech: The Case of Arnold Schwarzenegger. Languages 6: 61.
Kuhl, Patricia K. 1991. Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Perception & Psychophysics 50: 93–107.
Leung, Alex Ho-Cheong. 2012. Bad influence? – an investigation into the purported negative influence of foreign domestic helpers on children’s second language English acquisition. Journal of Multilingual and Multicultural Development 33: 133–48.
Leung, Alex Ho-Cheong. 2014. Input multiplicity and the robustness of phonological categories in child L2 phonology acquisition. In Proceedings of the International Symposium on the Acquisition of Second Language Speech (New Sounds 2013). Montreal: Concordia University, pp. 401–15.
Lev-Ari, Shiri. 2017. Talking to fewer people leads to having more malleable linguistic representations. PLoS ONE 12: e0183593.
Lev-Ari, Shiri. 2018. Social network size can influence linguistic malleability and the propagation of linguistic change. Cognition 176: 31–39.
Lindblom, Björn. 1992. Phonological units as adaptive emergents of lexical development. In Phonological Development: Models, Research, Implications. Edited by Charles Albert Ferguson, Lise Menn and Carol Stoel-Gammon. Timonium: York Press, pp. 131–63.
Lively, Scott E., and David B. Pisoni. 1997. On prototypes and phonetic categories: A critical assessment of the perceptual magnet effect in speech perception. Journal of Experimental Psychology: Human Perception and Performance 23: 1665–79.
Llompart, M. 2019. Bridging the Gap between Phonetic Abilities and the Lexicon in Second Language Learning. Ph.D. thesis, Ludwig Maximilian University, Munich, Germany. Available online: https://edoc.ub.uni-muenchen.de/24192/1/Llompart_Garcia_Miguel.pdf (accessed on 3 June 2024).
Major, Roy C. 1992. Losing English as a First Language. The Modern Language Journal 76: 190–208.
Meunier, Christine, Cheryl Frenck-Mestre, Taïssia Lelekov-Boissard, and Martine Le Besnerais. 2003. Production and perception of vowels: Does the density of the system play a role? Paper presented at the 15th International Congress of Phonetic Sciences, Barcelona, Spain, August 3–9; Edited by Maria Josep Solé, Daniel Recasens and Jordi Romero. Barcelona: Université Autonome de Barcelone, pp. 723–26. Available online: https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2003/papers/p15_0723.pdf (accessed on 20 March 2024).
Mistar, Junaidi. 2011. A study of the validity and reliability of self-assessment. TEFLIN Journal 22: 46.
Moran, Steven, Daniel McCloy, and Richard Wright. 2014. PHOIBLE. Available online: http://phoible.org (accessed on 2 September 2022).
Newman, Rochelle S. 2003. Using links between speech perception and speech production to evaluate different acoustic metrics: A preliminary report. The Journal of the Acoustical Society of America 113: 2850–60.
Perkell, Joseph S., Frank H. Guenther, Harlan Lane, Melanie L. Matthies, Ellen Stockmann, Mark Tiede, and Majid Zandipour. 2004. The distinctness of speakers’ productions of vowel contrasts is related to their discrimination of the contrasts. The Journal of the Acoustical Society of America 116: 2338–44.
Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. Typological Studies in Language 45: 137–58. Available online: http://www.phon.ox.ac.uk/jpierrehumbert/publications/exemplar_dynamics.pdf (accessed on 1 September 2023).
Piske, Thorsten, and Martha Young-Scholten, eds. 2008. Input Matters in SLA. Bristol: Multilingual Matters.
R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available online: https://www.R-project.org (accessed on 27 February 2022).
Ramon-Casas, Marta, Daniel Swingley, Núria Sebastián-Gallés, and Laura Bosch. 2009. Vowel categorization during word recognition in bilingual toddlers. Cognitive Psychology 59: 96–121.
Reiterer, Susanne M., Vita Kogan, Annemarie Seither-Preisler, and Gašper Pesek. 2020. Foreign language learning motivation: Phonetic chill or Latin lover effect? Does sound structure or social stereotyping drive FLL? Psychology of Learning and Motivation 72: 165–205.
Revelle, William R. 2017. Psych: Procedures for Personality and Psychological Research. Available online: https://cran.r-project.org/package=psych (accessed on 27 February 2022).
Rothman, Jason, Fatih Bayram, Vincent DeLuca, Grazia Di Pisa, Jon Andoni Duñabeitia, Khadij Gharibi, Jiuzhou Hao, Nadine Kolb, Maki Kubota, Tanja Kupisch, et al. 2022. Monolingual comparative normativity in bilingualism research is out of “control”: Arguments and alternatives. Applied Psycholinguistics 44: 316–29.
Saito, Kazuya, Magdalena Kachlicka, Yui Suzukida, Ingrid Mora-Plaza, Yaoyao Ruan, and Adam Tierney. 2024. Auditory processing as perceptual, cognitive, and motoric abilities underlying successful second language acquisition: Interaction model. Journal of Experimental Psychology: Human Perception and Performance 50: 119–38.
Sakai, Mari, and Colleen Moorman. 2017. Can perception training improve the production of second language phonemes? A meta-analytic review of 25 years of perception training research. Applied Psycholinguistics 39: 187–224.
Moreno Sandoval, Antonio, Doroteo T. Toledano, Raúl de la Torre, Marta Garrote, and José M. Guirao. 2008. Developing a phonemic and syllabic frequency inventory for spontaneous spoken Castilian Spanish and their comparison to text-based inventories. Paper presented at the Language Resources and Evaluation Conference, Marrakech, Morocco, May 26–June 1; Edited by Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis and Daniel Tapias. Paris: European Language Resources Association (ELRA), pp. 1097–100. Available online: https://aclanthology.org/L08-1 (accessed on 27 February 2022).
Schmid, Monika S. 2002. First Language Attrition, Use and Maintenance. Amsterdam: John Benjamins Publishing Company.
Stevens, Kenneth N. 1972. The quantal nature of speech: Evidence from articulatory-acoustic data. In Human Communication: A Unified View. Edited by Edward E. David and Peter B. Denes. New York: McGraw-Hill, pp. 51–66.
Stevens, Kenneth N. 1989. On the quantal nature of speech. Journal of Phonetics 17: 3–45.
Stoet, Gijsbert. 2010. PsyToolkit: A software package for programming psychological experiments using Linux. Behavior Research Methods 42: 1096–104.
Stoet, Gijsbert. 2016. PsyToolkit. Teaching of Psychology 44: 24–31.
Sundara, Megha, Linda Polka, and Shari Baum. 2006. Production of coronal stops by simultaneous bilingual adults. Bilingualism: Language and Cognition 9: 97–114.
Thornburgh, Dianne F., and John H. Ryalls. 1998. Voice onset time in Spanish-English bilinguals: Early versus late learners of English. Journal of Communication Disorders 31: 215–29.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, Dewey Dunnington, and Teun van Den Brand. 2007. ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. Available online: https://doi.org/10.32614/cran.package.ggplot2 (accessed on 27 May 2024).
Zhai, Alex, Meghan Clayards, and Heather Goad. 2023. Individual variability in L1 category compactness on L2 production compactness and accuracy. Paper presented at the 20th International Congress of Phonetic Sciences, Prague, Czech Republic, August 7–11; Edited by Radek Skarnitzl and Jan Volín. Prague: Guarant International, pp. 2880–84. Available online: https://drive.google.com/file/d/15U2l2y4_-9lyZAgmiccQYXYj9zBi_CAu/view (accessed on 5 February 2024).

Figure 1. The 32 synthesized vowels of /i/ distributed across a mel-scaled F1 × F2 psychoacoustic space.

Figure 2. The global compactness index for monolinguals, functional monolinguals, bilinguals, and multilinguals (kHz²).

Table 1. Descriptive statistics for global compactness index for monolinguals, functional monolinguals, bilinguals, and multilinguals (kHz²).

Speaker Type	Min	Max	Mean	SD
monolingual	91.75	226.64	181.45	43.29
functional monolingual	112.93	302.14	194.27	37.68
bilingual	168.56	329.27	211.54	38.52
multilingual	178.69	292.70	214.70	33.99

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
