Discrimination of Degrees of Foreign Accent across Different Speakers

Pérez-Ramón, Rubén

doi:10.3390/languages9030072

by Rubén Pérez-Ramón

School of International Liberal Studies, Waseda University, Tokyo 169-8050, Japan

Languages 2024, 9(3), 72; https://doi.org/10.3390/languages9030072

Submission received: 28 September 2023 / Revised: 22 January 2024 / Accepted: 24 January 2024 / Published: 23 February 2024

(This article belongs to the Special Issue Speech Analysis and Tools in L2 Pronunciation Acquisition)

Abstract

Second-language learners often encounter communication challenges due to a foreign accent (FA) in their speech, influenced by their native language (L1). This FA can affect rhythm, intonation, stress, and the segmental domain, which consists of individual language sounds. This study looks into the segmental FA aspect, exploring listeners’ perceptions when Spanish interacts with English. Utilizing the SIAEW corpus, which replaces segments of English words with anticipated Spanish-accented realizations, we assess the ability of non-native listeners to discriminate degrees of accent across male and female voices. This research aims to determine the impact of voice consistency on detecting accentedness variations, studying participants whose L1 is Japanese or Spanish. Results show that, while listeners are generally able to discriminate degrees of foreign accent across speakers, some segmental transformations convey a clearer distinction depending on the phonological representations of the native and accented realisations in the listener’s system. Another finding is that listeners tend to discriminate degrees of accent better when words sound more native-like.

Keywords: foreign accent; vowels; perception; discrimination

1. Introduction

A foreign accent (FA) refers to the distinct pronunciation characteristics and prosodic patterns that manifest in a non-native speaker’s utterance of a second language (L2) (Derwing and Munro 2015; Uzun 2023). These characteristics stem from the influence of the speaker’s native language and can vary widely in prominence and nature. Interspeaker differences can pose a challenge for listeners and, more particularly, students of a second language. These differences manifest in two key areas: variations in accent among non-native speakers who share the same first language (L1), and variations among non-native speakers from different L1 backgrounds built up from a range of factors, including individual exposure to the target language, learning environment, and personal motivation (Boduch-Grabka and Lev-Ari 2021; Mackay et al. 2006; Moyer 2007). Such complexity in FA can impact the effectiveness of language instruction and comprehension, emphasizing the importance of a nuanced understanding of FA in the pedagogy and practice of second-language learning.

The study of FA is integral to the field of second language acquisition, as it not only provides insights into the cognitive processes underlying language learning but also has practical implications (Ioup 1984). The presence of an FA may impact a listener’s perception of the speaker’s credibility, fluency, or even identity, thus influencing social interactions and opportunities (Adank et al. 2013). Furthermore, understanding and addressing FA can aid in the development of effective language teaching methodologies, assisting learners in achieving a higher level of communicative competence in their second language. The phenomenon of FA, therefore, occupies a significant role in both linguistic research and educational practice, linking the intricate mechanisms of language learning with broader social dynamics.

Traditionally, FA has been meticulously explored through a wide range of viewpoints in an attempt to capture its complex and manifold characteristics. A significant portion of this scholarly research has adopted a comprehensive or holistic approach, delving into the implications of FA in a broad sense (Flege et al. 1995; Munro and Derwing 1995; Piske et al. 2001). Such studies have been instrumental in shedding light on the intricate relationships between FA and various factors that may influence it. For example, researchers have analyzed the impact of the age at which learning begins (Asher and García 1969; Chen et al. 2020), the period of study or exposure to the language, and the level of proficiency in a second language (L2) (Kang 2020). These variables have been shown to play vital roles in the way accented speech is both produced and perceived.

A complementary strand of literature has taken a more focused approach, examining the role of specific individual features in FA. Within this context, prosodic features, which include elements such as intonation (Mennen 2015; Van Els and De Bot 1987), rhythm (Polyanskaya et al. 2017), and speech rate (Munro and Derwing 1995), have been identified as pivotal in giving rise to accented speech. Furthermore, the segmental domain (that is, sounds and phonemes) has been recognized as another key factor in the production of foreign-accented speech (Sereno et al. 2016). Researchers have delved into both vowels (Chan et al. 2017; Georgiou 2019) and consonants (Chen et al. 2020; Neuhauser 2011), uncovering how the mispronunciation of such sounds can be perceived as accented speech.

The advancement of foreign accent (FA) research has been significantly fueled by the utilization of innovative speech manipulation tools. These tools have been indispensable in providing a platform for analyzing and manipulating speech patterns, thereby enabling researchers to dissect and understand the subtle gradations of accented speech. The Pattern Playback machine, a pioneering instrument in its time, allowed researchers to convert visual patterns into sound, thereby laying the groundwork for systematic speech analysis. More recently, Praat (Boersma and Weenink 2023) has emerged as an essential tool in the field, providing a way to manipulate speech in a controlled manner. In recent years, computational advancements such as diphone synthesis (Taylor et al. 1998), Hidden Markov Model synthesis (Kayte et al. 2015) or Deep Neural Networks (Qian et al. 2014) have contributed significantly to the advancement of the field.

In the present study, we will explore the implications of segmental foreign accent. Particularly, by using a corpus that stems from novel manipulation techniques such as the splicing technique (García Lecumberri et al. 2014) and the gradation technique (Pérez-Ramón et al. 2020), our research aims to provide insights into whether learners of a second language are able to discern a range of degrees of FA when produced by different speakers. The significance of exploring segmental foreign accents is well-founded, having been substantiated by prior investigations (García Lecumberri et al. 2014; Pérez-Ramón et al. 2022b, 2023). Previous findings align with certain outcomes of holistic studies of accent. For instance, it has been observed that a segment pronounced with a noticeably strong FA does not necessarily lead to a decrease in overall intelligibility (Munro and Derwing 2020). Moreover, there are more specific conclusions that have emerged, such as the intriguing implication that the phonological representations of certain mispronounced segments may exert a more pronounced influence on intelligibility than others (Imai et al. 2005; Pérez-Ramón et al. 2022b).

For the purpose of our research, we will focus on the pronunciation of English vowels as articulated by native Spanish speakers. This focus is underpinned by the substantial differences between the vocalic systems of English and Spanish. While the Spanish vowel system is relatively simple, encompassing only five vowels [a, e, i, o, u] without any distinctions in length, the English vowel system is far more complex. As considered in this study, which uses the Southern British English variety, it comprises [ɜː, æ, ɑː, ɪ, iː, ɒ, ɔː, ʊ, uː] and uses duration as a distinguishing feature, which can lead to intelligibility conflicts (Nygard 2006). Furthermore, individual difficulties with vowels such as [ɪ] and [æ] have been found (Flege et al. 1995), and Franklin and Stoel-Gammon (2014) offer a comprehensive analysis of the intelligibility conflicts that may arise from the Spanish-like realisation of the vowels chosen for this study. These divergent structures compel Spanish speakers to adapt their existing vowel system to map onto the more complex English system, which can lead to recognizable confusion and, more importantly, significant intelligibility interference (Franklin and Stoel-Gammon 2014). This interference involves not only a perceivable accent but also specific pronunciation patterns that may obscure the intended message, impeding clear communication. The intricacy of the English system juxtaposed with the simplicity of the Spanish one has led to challenges in the phonological translation between the two. Analyzing these specific challenges in pronunciation helps to provide deeper insights into not only the nature of language acquisition but also the unique phonetic characteristics that differentiate these two languages.

The primary question driving our research is whether learners of an L2 are capable of discerning various degrees of accent across different speakers. To explore this question more deeply, we will conduct an AXB discrimination experiment. Within this framework, listeners will be exposed to two voices, one male and one female, that have been manipulated and fine-tuned using the previously mentioned techniques. These voices will pronounce the core vowel in English one-syllable words with five levels of Spanish accent, ranging from a completely non-native, Spanish accent to a completely native Southern British English accent.

The experiment involved two distinct groups of participants: one group for whom Spanish is their L1 and another group with Japanese as their L1. These two groups were selected because both of them share a similar vowel system of five vowels, which contrasts with the much more populated system of English. The most salient difference between Spanish and Japanese vowels in terms of spectral data would be the fact that the Japanese /u/ is not rounded ([ɯ]), as in Spanish ([u]). Otherwise, these two cohorts of speakers share a similar distribution of the five-vowel system in the vocalic space. Both groups, however, follow different strategies when it comes to the pronunciation of vowels. The Spanish system does not differentiate between long and short vowels, a feature that is used in English; at the same time, the Japanese language has a moraic rhythm, in which long vowels are pronounced as two time groups and can distinguish words (e.g., oba おば vs. obaa おばあ) (Bion et al. 2013). Additionally, in Japanese, vowels [ɯ] and [i] can be devoiced in certain contexts (e.g., suki すき [sɯ̥ki]). This design will allow us to evaluate the significance of having a matching first language when it comes to discerning accents.

Section 2 of this study will outline the methodology, and Section 3 will present the experiment’s results, accompanied by a comprehensive statistical analysis. Section 4 will explore possible answers to the research questions posed and Section 5 will conclude with insights into the implications of segmental foreign accents in language teaching, as well as recommendations for future research in this field.

2. Materials and Methods

In this section, we will thoroughly describe the experimental design, the audio samples employed, and the background of the participants.

2.1. Audio Tokens

The audio samples for this experiment were extracted from a pre-existing corpus known as the SIAEW corpus (Pérez-Ramón et al. 2022b). This corpus comprises a structured collection of words, focusing on sounds that Spanish speakers find particularly challenging to pronounce when speaking English. The composition of the SIAEW corpus is defined to cover a broad range of phonetic difficulties experienced by Spanish speakers. It includes 9 distinctive English vowels [ɜː, æ, ɑː, ɪ, iː, ɒ, ɔː, ʊ, uː], 9 consonants in initial position [h, ɹ, kʰ, tʰ, v, ʃ, z, ʤ, j] and 3 consonants in final position [b, d, g]. Because the vowel [ɜː] gives rise to two different transformations, these 21 sounds yield 22 transformations. For each of them, 4 monosyllabic words were selected, bringing the total to 88 words. In this study, we will only utilize the vowel tokens (see Table 1).

As can be seen in Table 1, the English sound [ɜː] can convey two different mispronunciations among Spanish speakers: on the one hand, the [ɜː→e] confusion and, on the other, [ɜː→o]. This dual transformation and the pronunciation chosen by Spanish speakers of English are highly influenced by the orthography of the carrier word. It is also important to note that, again because of orthography, Spanish speakers could be prone to introduce some kind of rhoticity in these words in the form of a trill, a tap or other realisations. However, since words ending in a rhotic + consonant cluster are highly infrequent in Spanish and only found in loan words, the decision was made to restrict the transformations to a more natural CVC syllabic structure.

The SIAEW corpus encompasses the recordings of each of these words by a male and a female speaker, both deemed bilingual with no trace of accent in either English or Spanish by native evaluators. For each word, the sound has been manipulated in order to replace the English target vowel with its most typical Spanish mispronunciation, as depicted in Table 1. As an example, for the word firm [fɜːm], the vowel [ɜː] was trimmed from the original recording and replaced with a [e] pronounced by the same speaker, resulting in a token pronounced as [fem]. The goal of this technique was to isolate the emergence of a Spanish accent to a single segment, while the remaining sounds of the word retained their original pronunciation. This way, perceptual changes could be directly attributed to the mispronunciation of the target segment. The complete procedure for the generation of segmental foreign accents can be found in García Lecumberri et al. (2014).

Other than these two tokens (i.e., the native realisation of the word and the Spanish accented version), the SIAEW corpus offers another three tokens per word that represent mid-points between the two extremes in terms of foreign accent. These three tokens have been generated following the gradation technique (Pérez-Ramón et al. 2020), which allows the researcher to blend the native and accented realisations of the segment in a weighted manner, resulting in a segment that deviates acoustically from both endpoints to the desired degree. It allows the user to generate continua of n steps between the two ends, enabling deep research of different degrees of foreign accent. Particularly, the three mid-points included in the SIAEW corpus have been calibrated to convey an equidistant amount of accent as perceived by native listeners (Pérez-Ramón et al. 2022a). In summary, the SIAEW corpus includes, for each of the 88 words (of which we will use only the 40 pertaining to vocalic target segments, as depicted in Table 1) and the two voices (male and female), 5 audio files that represent a fully foreign-accented token, three tokens calibrated to convey 25%, 50% and 75% of accentedness, respectively, and a fully native-like token. It is important to note that the fully native-like token was also created using the same technique as the other tokens. Specifically, the target segment was trimmed and replaced with a segment produced using the same strategy. This approach was employed to eliminate any potential bias arising from artefacts linked to our manipulation techniques. The final number of audio samples included in the present experiment was, therefore, 2 voices × 10 vowel transformations × 4 words × 5 accent steps = 400 audio samples.
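The token count above can be checked with a short enumeration. This is only an illustrative sketch: the identifiers (`T01`, `word_1`, and so on) are hypothetical placeholders rather than corpus file names, and the mapping of step numbers to accentedness percentages (step 1 = fully accented, step 5 = fully native-like) is an assumption based on the description of the continuum.

```python
from itertools import product

VOICES = ("male", "female")
N_TRANSFORMATIONS = 10           # vowel transformations used in this study
WORDS_PER_TRANSFORMATION = 4
# Assumed mapping: step number -> calibrated percentage of accentedness.
ACCENT_BY_STEP = {1: 100, 2: 75, 3: 50, 4: 25, 5: 0}


def enumerate_tokens():
    """List every (voice, transformation, word, step) combination."""
    tokens = []
    for voice, t, w, step in product(
        VOICES,
        range(1, N_TRANSFORMATIONS + 1),
        range(1, WORDS_PER_TRANSFORMATION + 1),
        ACCENT_BY_STEP,
    ):
        tokens.append({
            "voice": voice,
            "transformation": f"T{t:02d}",   # placeholder label
            "word": f"T{t:02d}_word_{w}",    # placeholder label
            "step": step,
            "accent_pct": ACCENT_BY_STEP[step],
        })
    return tokens


tokens = enumerate_tokens()
print(len(tokens))  # 2 * 10 * 4 * 5 = 400
```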

2.2. Participants

Two cohorts of participants were recruited for this experiment. The first group consisted of 21 native Spanish speakers (mean age: 33.2; females: 13) who were recruited through internet services such as Amazon Mechanical Turk and social networks.

The second group was a set of 21 Japanese native speakers (mean age: 20.1; females: 13) recruited from the pool of students of Waseda University in Tokyo.

The linguistic requirements for recruiting were the same for both groups, that is, an English proficiency of B1 to C1 according to the Common European Framework of Reference (Council of Europe 2001) and not being bilingual in any other language.

2.3. Experiment

The main task of the experiment was designed as an AXB task in which participants had to decide whether the pronunciation of the X token was closer to that of A or to that of B. The X token was always one of the female voice tokens, while A and B were male voice tokens: one at the same step as X and the other two steps up or down (Table 2).

The order of A and B was randomised and balanced across trials in order to prevent a possible effect of presentation order. This means that, for a given step n of X, either the {n – n – n ± 2} or the {n ± 2 – n – n} trial was presented, but not both.
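The trial structure described above can be sketched as follows. This is a minimal illustration, not the script used in the study: the function names, the word labels and the exact 50/50 split of the odd token between positions A and B are assumptions. With five steps and a two-step gap, each word yields six (X, odd) step pairs, so 40 words would give 240 trials per listener under this reading; the paper does not state that total explicitly.

```python
import random


def step_pairs(n_steps=5, gap=2):
    """List (x_step, odd_step) pairs: the odd token lies `gap` steps above or below X."""
    pairs = []
    for x in range(1, n_steps + 1):
        for odd in (x - gap, x + gap):
            if 1 <= odd <= n_steps:
                pairs.append((x, odd))
    return pairs


def build_trials(words, seed=0):
    """Build one AXB trial per word and step pair, balancing the odd token's position."""
    rng = random.Random(seed)
    combos = [(w, x, odd) for w in words for x, odd in step_pairs()]
    # Assumption: exactly half of the trials place the odd token in A, the rest in B.
    positions = ["A", "B"] * (len(combos) // 2) + ["A"] * (len(combos) % 2)
    rng.shuffle(positions)
    trials = []
    for (word, x, odd), pos in zip(combos, positions):
        match_tok, odd_tok = ("male", x), ("male", odd)
        trials.append({
            "word": word,
            "A": odd_tok if pos == "A" else match_tok,
            "X": ("female", x),               # X is always the female voice
            "B": odd_tok if pos == "B" else match_tok,
            "odd_position": pos,
            "correct_answer": "B" if pos == "A" else "A",
        })
    rng.shuffle(trials)  # randomise presentation order
    return trials


trials = build_trials([f"word_{i:02d}" for i in range(1, 41)])  # placeholder words
print(step_pairs())
print(len(trials))
```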

The experiment was delivered to the participants through the online platform Gorilla (Anwyl-Irvine et al. 2020). All listeners were allowed to complete it at home, provided that the location was quiet and that they used headphones. A silence of 1200 ms was introduced between the three tokens of each trial. Only after hearing the three sounds were participants presented with a screen that asked them to choose the sound that was more similar to X, together with two large buttons labelled “A” and “B”, respectively. Every 40 trials, participants were advised to take a two-minute break before proceeding. Trials were randomised across participants. On average, the completion of the experiment took 44 min, and participants were paid upon its conclusion. A practice block with a total of 10 tokens was presented before the main task so that participants could adjust the volume and sound settings of their devices and get used to the pace of the experiment.

3. Results

In this section, the responses provided by the participants will be analyzed. Unless otherwise specified, the models were generalized linear mixed models fitted using the lme4 package (Bates et al. 2015) in R (R Core Team 2023), and post hoc pairwise comparisons were computed using the emmeans package (Lenth 2023).

3.1. Pre-Processing

Since the experiment was self-paced and participants completed the task at their homes or other locations uncontrolled by the researcher, trials in which reaction time was over 5000 ms were removed from the database. This led to the removal of 2.62% of the answers provided by the participants.
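The exclusion rule can be expressed in a few lines. The sketch below assumes each trial is a dictionary with a hypothetical `rt_ms` field, and the toy data are invented purely for illustration; they are not data from the study.

```python
RT_CUTOFF_MS = 5000  # trials slower than this are discarded


def drop_slow_trials(trials, cutoff_ms=RT_CUTOFF_MS):
    """Keep trials whose reaction time does not exceed the cutoff.

    Returns the kept trials and the percentage of trials removed.
    """
    kept = [t for t in trials if t["rt_ms"] <= cutoff_ms]
    removed_pct = 100.0 * (len(trials) - len(kept)) / len(trials) if trials else 0.0
    return kept, removed_pct


# Toy example (hypothetical values).
toy = [{"rt_ms": 850}, {"rt_ms": 5000}, {"rt_ms": 5001}, {"rt_ms": 12000}]
kept, removed_pct = drop_slow_trials(toy)
print(len(kept), removed_pct)
```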

In our analysis, we explored two potential influences on participant responses: the position of the ‘different’ token in each triad (either ‘A’ or ‘B’) and the direction of the triad (either ‘upwards’—e.g., A = 1, X = 1, B = 3—or ‘downwards’—e.g., A = 4, X = 4, B = 2). Despite the considerable number of trials (4785 in ‘A’ position and 5031 in ‘B’ position; 4941 ‘upwards’ and 4875 ‘downwards’), our models, accounting for interactions between cohorts, step pairs, token positions, and trial directions with participant IDs as a random intercept, revealed no significant impact from the token position for either cohort. However, a notable exception was found in the Spanish cohort, where the direction of the trial influenced responses for the comparison of steps 3 and 1 (z = 2.938, p = 0.0033). Given that this was an isolated case, we decided not to include the direction factor in further analyses, allowing us to focus on more substantial findings. No other factors were expected to interfere with the interpretation of the results.

3.2. Overall Results

Overall, results show similar behaviour of both cohorts for every pair of steps. Japanese listeners seem to be slightly more proficient in distinguishing pairs 1-3, 2-4 and 4-2, and steps 5-3 are slightly better distinguished by listeners of both cohorts (Figure 1).

The effects of cohort, steps compared, and the interaction between both factors on the correct response variable were examined using a generalized linear mixed-effects model. The model included the id of the participants as a random factor to account for individual variability. The results of a Type II Wald chi-square test are presented in Table 3.
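For readers who prefer a formula, the model just described can be written schematically as below. The logit link and binomial response are assumptions based on the binary correct/incorrect outcome, since the link function is not stated explicitly; the subscripts i (participant), j (trial), c(i) (cohort of participant i) and s(j) (step pair of trial j) are notation introduced here for illustration only.

```latex
% Requires amsmath for \operatorname and \text.
\operatorname{logit} \Pr(\text{correct}_{ij} = 1)
  = \beta_0
  + \beta^{\text{cohort}}_{c(i)}
  + \beta^{\text{steps}}_{s(j)}
  + \beta^{\text{cohort} \times \text{steps}}_{c(i),\, s(j)}
  + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Here u_i is the by-participant random intercept that accounts for individual variability.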

The variable steps compared was found to be highly significant. Additionally, the interaction between cohort and steps compared was also significant, suggesting that the effect of steps depends on the cohort answering. The main effect of the cohort was found to be only marginally significant.

Post hoc analysis of the model revealed a significant difference between Japanese and Spanish in steps 1-3 (z = 2.351, p = 0.0187) and a marginally significant difference in the comparison of steps 4-2 (z = 1.939, p = 0.0525). The other contrasts were not found to be statistically significant at the 0.05 level. Additionally, differences across step pairs were also analyzed for each cohort. No significant difference was found between any pair of steps for the Japanese cohort; however, more variability was found for the Spanish cohort. Trials in which the target token (i.e., the X token pronounced by the female voice) was 1, 2 or 4 were significantly less accurately discriminated from the male voice than trials with 3 or 5 as the target token (Table 4).

3.3. Results by Vowel

One of the main advantages of the manipulation of tokens via the splicing technique and the gradation technique is that it brings the possibility of examining the contribution of each individual segment to the overall perception of foreign accents analyzed in the previous section. In this section, a comprehensive analysis of the results for each [non-native]→[native] vowel continuum will be provided.

The effects of cohort, steps compared, vowel, and their interactions were examined on the response variable using a generalized linear mixed-effects model, with the id of the participants included as a random factor. The results of a Type II Wald chi-square test are summarized in Table 5.

The results provide insights into the relationships among steps compared, vowel, and cohort. Both the main effects of steps compared and vowel were found to be highly significant. Additionally, significant pairwise interactions were detected between cohort and steps compared, cohort and vowel, and steps compared and vowel. The three-way interaction among cohort, steps compared, and vowel was marginally significant.

Given these findings, we further examined the interaction between cohort and steps compared for each vowel individually (Figure 2). No significant effect was detected in various continua (namely [a]→[æ], [a]→[ɑː] and [o]→[ɒ]). However, significant differences emerged in the steps compared factor for the remaining continua, all with a significance level of p < 0.001 except for [o]→[ɔː] (p < 0.05) for which the cohort factor was also found significant (p < 0.01). Finally, for the [i]→[iː] continuum, a significant effect of cohort (p < 0.01), steps compared (p < 0.001) and the interaction of both factors (p < 0.001) was also found.

The discrimination abilities of the two cohorts were evaluated using d-prime values for the [o]→[ɔː] and [i]→[iː] continua. Both cohorts demonstrated discrimination abilities above chance, as indicated by positive d-prime values. However, the Japanese listeners generally exhibited superior discrimination, particularly in the [o]→[ɔː] continuum, with d-prime values of 0.868 and 0.850 for the 3-1 and 5-3 step pairs, respectively, compared to the Spanish listeners’ 0.375 and 0.646 for the same pairs. In the [i]→[iː] continuum, the Japanese cohort again showed better discrimination, notably with a d-prime value of 1.11 for the 4-2 step pair. In contrast, their performance for the 3-5 step pair resulted in a d-prime value of 0, indicating discrimination no better than random guessing. This unexpected result warrants further exploration and will be discussed in the following section.
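To make the interpretation of these values concrete, the sketch below computes d-prime with the basic signal-detection definition, z(hit rate) minus z(false-alarm rate). This is an assumption for illustration: the paper does not say which d-prime model was applied, and AXB designs are often analysed with task-specific models that give different values. One possible convention, also assumed here, is to count an “A” response as a hit when A matches X and as a false alarm when B matches X. The clipping constant `eps` is likewise an illustrative choice.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate, eps=1e-3):
    """Basic d' = z(H) - z(F), with rates clipped away from 0 and 1."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    h = min(max(hit_rate, eps), 1 - eps)
    f = min(max(false_alarm_rate, eps), 1 - eps)
    return z(h) - z(f)


print(d_prime(0.5, 0.5))  # equal hit and false-alarm rates give d' = 0
print(d_prime(0.8, 0.2))  # positive d' indicates above-chance discrimination
```

Under this definition, a d-prime of 0 arises whenever hits and false alarms are equally frequent, which is why it is read as chance-level discrimination.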

A detailed pairwise analysis of each factor can be found in Appendix A for every individual sound, and the main conclusions will be outlined here. The significance of the steps compared factor arises from the fact that comparisons at the native end of the continuum, i.e., steps 4-2 and especially 5-3, yield higher scores than the other pairs. This finding implies that listeners are better able to discern accents across voices when these voices sound near-native than when they carry a strong foreign accent.

Moreover, among the continua exhibiting significant effects of cohort, the Japanese cohort generally displayed a greater ability to distinguish accents across voices compared to the Spanish group. This finding, where participants with an L1 different from the accent in the experimental words were more skilled at discerning accents, will be further explored and discussed in the following section.

4. Discussion

In this paper, we have explored the capacity of non-native English speakers to distinguish varying degrees of foreign accent across different voices. Specifically, our analysis focused on two distinct cohorts: one in which the participants’ L1 matched the accent of the English words (Spanish), and another in which the L1 was mismatched (Japanese). Our investigation looked into their ability to discern five distinct degrees of foreign accent imposed over nine English nuclear vowels. The results yielded a complex picture, revealing that listeners can indeed detect differences in foreign accents across voices. However, this ability is not uniform across all scenarios. Two key conditions emerged that appeared to enhance the discrimination capabilities of the participants: (i) when the listener’s L1 does not match the speaker’s, thereby providing a potentially contrasting perspective, and (ii) when the speech samples sounded closer to native-like, potentially facilitating more refined discrimination.

Previous research has looked into the challenges that listeners encounter when trying to distinguish between segmental contrasts that are considered “difficult” to acquire. Specifically, these difficulties arise when the two segments being compared are identified as a single phonemic unit in the listener’s native language (Højen and Flege 2006; Pérez-Ramón et al. 2020). This phenomenon leads to increased challenges for individuals learning a second language, as they often struggle to establish robust and distinct categories that differ from or interfere with the categories embedded in their L1. Essentially, the cognitive framework formed by a person’s L1 can obscure the subtleties between phonemic units in an L2, causing them to perceive distinct sounds as identical (Escudero et al. 2009; Tuninetti and Tokowicz 2018). This phenomenon is further illuminated by Tyler and Best’s research on perceptual assimilation, which examines how native English speakers perceive various non-native vowel contrasts. Their findings suggest a consistent influence of native-language attunement on speech perception (Tyler 2014) that can result in the assimilation of non-native sounds to native phonological categories, affecting discrimination abilities.

Central to the overarching aim of this research is the observation that students who are learning a second language are generally exposed to a diverse array of accents. This exposure plays a crucial role in how learners interpret and process language. Learners are known to extract information from these accents in unique ways, often influenced by the preconceptions and expectations they harbour regarding their instructors. Specifically, there has been empirical evidence to demonstrate that these expectations extend to different areas of language expertise. For instance, Chinese students learning English often anticipate that native English-speaking teachers will demonstrate mastery in pronunciation. In contrast, they anticipate that non-native English-speaking teachers will display a deeper understanding of grammatical rules and strategic language use (Sung 2014). Similar findings have been uncovered for Vietnamese and Japanese students of English (Walkinshaw and Oanh 2014). In this case, participants regarded the speech of non-native teachers as not only more accented but also more comprehensible, i.e., easier to understand. This body of research brings to light the issue of interspeaker accent discrimination.

In our study, we have provided evidence that learners possess the ability to recognize varying degrees of foreign accents, a capability that manifests not only within individual speakers but also, and more notably, across different speakers. This ability to discriminate is significantly linked to the intensity of the FA as applied to nuclear vowels of monosyllabic words. Particularly, we observed that when the vowels exhibit a more native-like quality, listeners become more attuned to subtle variations in accent across speakers. This sensitivity to accentedness is not an isolated phenomenon but aligns with existing research that often reveals that small deviations from the norm can be enough to convey a significant increment in the perceived degree of foreign accent (Pérez-Ramón et al. 2020), enhancing the discriminatory capabilities of the listener.

Our investigation revealed varying abilities among non-native English speakers in discerning degrees of foreign accent, influenced by both the listener’s native language and the native-likeness of the speech samples. This variability can be partially explained through the lens of top-down and bottom-up processing. When listeners encountered non-native-like tokens, they likely engaged in top-down processing, utilizing contextual information and their prior linguistic knowledge to comprehend the speech (Xie and Myers 2017). This approach enables a talker-specific pathway to comprehension, which was particularly evident in the Japanese cohort’s ability to discriminate vowel length contrasts, a feature prominent in their L1. Conversely, in scenarios where speech samples were more native-like and thus provided fewer contextual cues for non-native listeners, listeners likely relied on bottom-up processing, depending more on the raw auditory information (Gerrits and Schouten 2004) to distinguish between different degrees of accent. Such a shift in cognitive processing strategy might explain the varied performance across different speech samples and cohorts.

A secondary finding of our research is the difference in discrimination capabilities across cohorts of listeners. Prior research has shown that the way individuals perceive non-native phonetic segments is deeply influenced by the extent to which these segments align with the phonological structures present in the listener’s own language (Hu 2021; Imai et al. 2005; Pérez-Ramón et al. 2020). This suggests that the phonological representations developed over time play a pivotal role in the interpretation of unfamiliar sounds.

It is known that listeners are more adept at distinguishing between segments when those segments have contrastive features in their native language (L1). Previous research has demonstrated that, for example, the contrast /ʃ/-/ʒ/ is perceived differently by English and Chinese listeners since the latter lack this specific phonological pair in their native consonant system (Chen and van de Weijer 2022). In another study (Schoonmaker-Gates 2015), the production of Spanish plosives by native and non-native speakers was rated for accentedness by English learners of Spanish. The conclusions drawn from this study suggest that contrasts in a second language can be acquired over time, but they are not straightforward for non-proficient speakers.

In our study, this can be seen in the differences between Spanish and Japanese listeners. Specifically, the ability to discriminate between degrees of accent across speakers is significantly superior in instances where the duration of a vowel serves as a differentiating cue, particularly in the [i→iː] and [o→ɔː] continua. Given that Japanese uses vowel duration as a distinctive feature (de Weers and Munro 2018; Hui and Arai 2020), it is to be expected that Japanese listeners would more readily discriminate varying degrees of accent in contrasts primarily defined by vowel length.

It remains unclear why only these two contrasts were especially clear for Japanese listeners. One possible explanation is that Japanese speakers of English realise the [ɜː] vowel as neither [e] nor [o], but [a] (D’Angelo et al. 2021; Lengeris 2009). Their expectations for the realisation of these words may therefore have obscured the subtle differences between degrees of accentedness across speakers.

While the primary focus of our study is on the nuances of accent discrimination in language perception, our findings may hold important implications for language education. In the formal education setting, students are often exposed to instructors with a variety of accents and linguistic backgrounds, ranging from native speakers to non-native speakers of different proficiency levels. This diversity in accent and teaching approach, as previous research suggests, can significantly influence the learning process (Algethami 2017). Studies have shown that exposure to diverse accents can enhance students’ listening skills and linguistic flexibility, preparing them for real-world language use (Burke et al. 2018; Sumner 2009). Furthermore, adapting to varied teaching methodologies in response to these accents could foster cognitive adaptability and language processing skills, which are vital for language acquisition (Clarke and Garret 2004; Cristia et al. 2012).

Our study primarily explored the interspeaker discrimination abilities of non-native speakers when faced with accented speech in their target language. The pedagogical ramifications stemming from our findings are considerable. A key takeaway is the need for educators to be deeply attuned to the pervasive influence of a student’s L1. Such an understanding is crucial in effectively guiding their progress in L2 acquisition and in crafting feedback that is both insightful and culturally sensitive. We suggest that future research broaden its scope to include an in-depth analysis of consonant discrimination, thus complementing the vowel-focused findings of this paper. Additionally, a more extensive exploration of the sociolinguistic intricacies of accent discrimination could further enrich our insights and foster more holistic and understanding approaches to language teaching.

5. Conclusions

The present study has shown that learners of English as a second language, with either a matched or an unmatched interlanguage, are generally able to discriminate different degrees of foreign accent across speakers. Comparing Spanish and Japanese listeners, we found that certain phonological characteristics of a listener’s native language can significantly influence their perceptual capabilities. Notably, vowel duration served as a defining cue for Japanese listeners, in alignment with their native phonological structures. As learners are exposed to a multitude of accents, both their native phonological frameworks and their linguistic experiences shape their discrimination abilities.

Funding

This article is a part of the outcome of research performed under a Waseda University Grant for Special Research Projects (Project number: 2023C-559).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The author would like to thank two anonymous reviewers for their efforts and comments. ChatGPT was used to proofread part of the initial draft, and the output was reviewed and tailored by the author to align with the tone and purpose of the manuscript.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FA	Foreign accent
L1	First language
L2	Second language

Appendix A

This appendix presents the estimated marginal means (emmeans) pairwise comparisons for each continuum. Only significant results (p < 0.1) are provided.

Table A1. Significant pairwise comparison (estimated marginal means) for the [e→ɜː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(2-4)–(5-3)	−1.24	0.387	−3.212	0.0166
Spanish	(1-3)–(5-3)	−1.96	0.434	−4.519	0.0001
Spanish	(2-4)–(5-3)	−1.57	0.439	−3.575	0.0047
Spanish	(3-1)–(5-3)	−1.26	0.448	−2.800	0.0574
Spanish	(3-5)–(5-3)	−2.05	0.434	−4.730	<0.0001
Spanish	(4-2)–(5-3)	−1.55	0.438	−3.539	0.0054

Table A2. Significant pairwise comparison (estimated marginal means) for the [o→ɜː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(1-3)–(3-1)	1.11	0.348	3.197	0.0174
Japanese	(1-3)–(4-2)	1.04	0.348	2.992	0.0331
Japanese	(3-1)–(3-5)	−1.02	0.340	−3.015	0.0308
Japanese	(3-5)–(4-2)	0.952	0.339	2.805	0.0566

Table A3. Significant pairwise comparison (estimated marginal means) for the [a→æ] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(1-3)–(3-1)	1.11	0.348	3.197	0.0174
Japanese	(1-3)–(4-2)	1.04	0.348	2.992	0.0331
Japanese	(3-1)–(3-5)	−1.02	0.340	−3.015	0.0308
Japanese	(3-5)–(4-2)	0.952	0.339	2.805	0.0566

Table A4. Significant pairwise comparison (estimated marginal means) for the [a→ɑː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Spanish	(3-1)–(4-2)	0.841	0.332	2.536	0.1137

Table A5. Significant pairwise comparison (estimated marginal means) for the [i→ɪ] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Spanish	(1-3)–(3-1)	−0.941	0.326	−2.890	0.0445
Spanish	(3-1)–(3-5)	1.23	0.328	3.767	0.0023
Spanish	(3-1)–(4-2)	0.991	0.326	3.039	0.0287
Spanish	(3-5)–(5-3)	−1.20	0.329	−3.644	0.0036
Spanish	(4-2)–(5-3)	−0.954	0.327	−2.917	0.0412

Table A6. Significant pairwise comparison (estimated marginal means) for the [i→iː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(1-3)–(3-1)	−1.28	0.361	−3.544	0.0053
Japanese	(1-3)–(4-2)	−1.66	0.392	−4.240	0.0003
Japanese	(1-3)–(5-3)	−1.39	0.367	−3.796	0.0020
Japanese	(2-4)–(3-1)	−1.25	0.362	−3.458	0.0072
Japanese	(2-4)–(4-2)	−1.64	0.393	−4.161	0.0005
Japanese	(2-4)–(5-3)	−1.37	0.368	−3.712	0.0028
Japanese	(3-1)–(3-5)	1.50	0.361	4.150	0.0005
Japanese	(3-5)–(4-2)	−1.88	0.392	−4.799	<0.0001
Japanese	(3-5)–(5-3)	−1.61	0.367	−4.393	0.0002
Spanish	(1-3)–(3-1)	−1.18	0.330	−3.571	0.0048
Spanish	(1-3)–(3-5)	−0.953	0.323	−2.952	0.0372
Spanish	(1-3)–(5-3)	−1.38	0.337	−4.090	0.0006
Spanish	(2-4)–(3-1)	−1.03	0.329	−3.133	0.0214
Spanish	(2-4)–(5-3)	−1.23	0.336	−3.663	0.0034

Table A7. Significant pairwise comparison (estimated marginal means) for the [o→ɒ] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(2-4)–(3-1)	1.07	0.323	3.315	0.0118

Table A8. Significant pairwise comparison (estimated marginal means) for the [o→ɔː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
No significant results.

Table A9. Significant pairwise comparison (estimated marginal means) for the [u→ʊ] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(1-3)–(4-2)	0.964	0.321	3.006	0.0317
Japanese	(2-4)–(4-2)	0.913	0.320	2.856	0.0492
Japanese	(3-1)–(3-5)	−1.57	0.353	−4.440	0.0001
Japanese	(3-5)–(4-2)	1.79	0.354	5.047	<0.0001
Spanish	(3-1)–(5-3)	−0.968	0.332	−2.919	0.0410

Table A10. Significant pairwise comparison (estimated marginal means) for the [u→uː] continuum.

Cohort	Contrast	Estimate	SE	z Ratio	p Value
Japanese	(2-4)–(3-1)	−1.46	0.361	−4.055	0.0007
Spanish	(2-4)–(3-1)	−1.15	0.353	−3.271	0.0137
Spanish	(2-4)–(3-5)	−0.975	0.340	−2.865	0.0479
Spanish	(2-4)–(5-3)	−1.36	0.370	−3.683	0.0032
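
As a reading aid for the tables in this appendix, the following minimal sketch recomputes the z ratio of a row from its estimate and standard error. It also converts the estimate into an odds ratio, under the assumption (not restated in this appendix) that the estimates are log-odds differences from a binomial mixed model. Python is used purely for illustration, and the function names are hypothetical rather than part of the emmeans output. The tabulated p values cannot be recovered this way: they are larger than the unadjusted two-sided values, which suggests that an adjustment for multiple comparisons was applied.

```python
import math


def z_ratio(estimate: float, se: float) -> float:
    """Wald z ratio: the estimate divided by its standard error."""
    return estimate / se


def unadjusted_p(z: float) -> float:
    """Two-sided normal p value for a z ratio, with no multiplicity adjustment."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def odds_ratio(estimate: float) -> float:
    """Odds ratio for a contrast, ASSUMING the estimate is on the log-odds scale."""
    return math.exp(estimate)


# First row of Table A1 (Japanese cohort), values as printed in the table.
estimate, se, z_printed, p_printed = -1.24, 0.387, -3.212, 0.0166

z = z_ratio(estimate, se)
print(f"recomputed z ratio : {z:.3f} (printed: {z_printed})")
print(f"unadjusted p       : {unadjusted_p(z_printed):.4f} (printed: {p_printed})")
print(f"odds ratio         : {odds_ratio(estimate):.3f}")
```

The small gap between the recomputed and printed z ratios is expected, because the estimate is reported to only three significant figures.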

References

Adank, Patti, Andrew J. Stewart, Louise Connell, and Jeffrey Wood. 2013. Accent imitation positively affects language attitudes. Frontiers in Psychology 4: 280.
Algethami, Ghazi. 2017. The effects of explicit pronunciation instruction on the degree of perceived foreign accent in the speech of EFL learners. Research in Language (RiL) 15: 253–63.
Anwyl-Irvine, Alexander L., Jessica Massonnié, Adam Flitton, Natasha Kirkham, and Jo K. Evershed. 2020. Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods 52: 388–407.
Asher, James J., and Ramiro García. 1969. The optimal age to learn a foreign language. The Modern Language Journal 53: 334–41.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software 67: 1–48.
Bion, Ricardo A., Kouki Miyazawa, Hideaki Kikuchi, and Reiko Mazuka. 2013. Learning phonemic vowel length from naturalistic recordings of Japanese infant-directed speech. PLoS ONE 8: e51594.
Boduch-Grabka, Katarzyna, and Shiri Lev-Ari. 2021. Exposing individuals to foreign accent increases their trust in what nonnative speakers say. Cognitive Science 45: e13064.
Boersma, Paul, and David Weenink. 2023. Praat: Doing Phonetics by Computer [Computer Program]. Available online: http://www.praat.org/ (accessed on 29 August 2023).
Burke, Kali, Michelle K. Tulloch, and Marieke van Heugten. 2018. Listener flexibility to lexical alterations for foreign- and native-accented speech. The Journal of the Acoustical Society of America 144: 1867.
Chan, Kit Ying, Michael D. Hall, and Ashley A. Assgari. 2017. The role of vowel formant frequencies and duration in the perception of foreign accent. Journal of Cognitive Psychology 29: 23–34.
Chen, Wenjun, and Jeroen van de Weijer. 2022. The role of L1-L2 dissimilarity in L2 segment learning–Implications from the acquisition of English post-alveolar fricatives by Mandarin and Mandarin/Wu speakers. Frontiers in Psychology 13: 1017724.
Chen, Xi, Yi Liu, and Jinghong Ning. 2020. Factors Affecting Degree of Foreign Accent in Initials and Finals: A Study on Pakistan Learners of Cantonese. Available online: https://www.researchgate.net/publication/338688719_Factors_Affecting_Degree_of_Foreign_Accent_in_Initials_and_Finals_A_Study_on_Pakistan_Learners_of_Cantonese (accessed on 31 July 2023).
Clarke, Constance M., and Merrill Garret. 2004. Rapid adaptation to foreign-accented English. The Journal of the Acoustical Society of America 116: 3647–58.
Council of Europe. 2001. Council for Cultural Co-operation. Education Committee. Modern Languages Division. In Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press.
Cristia, Alejandrina, Amanda Seidl, Charlotte Vaughn, Rachel Schmale, Ann Bradlow, and Caroline Floccia. 2012. Linguistic Processing of Accented Speech Across the Lifespan. Frontiers in Psychology 3: 479.
D’Angelo, James, Toshiko Yamaguchi, and Yasuhiro Fujiwara. 2021. Features of Japanese English. In English in East and South Asia: Policy, Features, and Language in Use. London: Routledge, pp. 122–36.
Derwing, Tracey M., and Murray J. Munro. 2015. Pronunciation Fundamentals: Evidence-Based Perspectives for L2 Teaching and Research. Amsterdam: John Benjamins Publishing Company, vol. 42.
de Weers, Noortje, and Murray J. Munro. 2018. The Role of Duration in Japanese Speakers’ Productions of English Vowels. Paper presented at 9th Pronunciation in Second Language Learning and Teaching Proceedings, Salt Lake City, UT, USA, September 1–2.
Escudero, Paola, Titia Benders, and Silvia C. Lipski. 2009. Native, non-native and L2 perceptual cue weighting for Dutch vowels: The case of Dutch, German, and Spanish listeners. Journal of Phonetics 37: 452–65.
Flege, James Emil, Murray J. Munro, and Ian R. A. MacKay. 1995. Factors affecting strength of perceived foreign accent in a second language. The Journal of the Acoustical Society of America 97: 3125–34.
Franklin, Amber D., and Carol Stoel-Gammon. 2014. Using multiple measures to document change in English vowels produced by Japanese, Korean, and Spanish speakers: The case for goodness and intelligibility. American Journal of Speech-Language Pathology 23: 625–40.
García Lecumberri, María Luisa, Roberto Barra Chicote, Rubén Pérez Ramón, Junichi Yamagishi, and Martin Cooke. 2014. Generating segmental foreign accent. Paper presented at Interspeech 2014, Singapore, September 14–18.
Georgiou, Georgios P. 2019. Bit and beat are heard as the same: Mapping the vowel perceptual patterns of Greek-English bilingual children. Bilingualism: Language and Cognition 22: 394–409.
Gerrits, Ellen, and Martin E. H. Schouten. 2004. Categorical perception depends on the discrimination task. Perception & Psychophysics 66: 363–76.
Højen, Anders, and James Emil Flege. 2006. Early learners’ discrimination of second-language vowels. The Journal of the Acoustical Society of America 119: 3072–84.
Hu, Chieh-Fang. 2021. Adaptation to an unfamiliar accent by child L2 listeners. Language and Speech 64: 491–514.
Hui, C. T. Justine, and Takayuki Arai. 2020. Pitch and duration as auditory cues to identify Japanese long vowels for Japanese learners. Acoustical Science and Technology 41: 796–99.
Imai, Satomi, Amanda C. Walley, and James Emil Flege. 2005. Lexical frequency and neighborhood density effects on the recognition of native and Spanish-accented words by native English and Spanish listeners. The Journal of the Acoustical Society of America 117: 896–907.
Ioup, Georgette. 1984. Is there a structural foreign accent? A comparison of syntactic and phonological errors in second language acquisition. Language Learning 34: 1–15.
Kang, Okim, Meghan Moran, Hyunkee Ahn, and Soon Park. 2020. Proficiency as a mediating variable of intelligibility for different varieties of accents. Studies in Second Language Acquisition 42: 471–87.
Kayte, Sangramsing, Monica Mundada, and Jayesh Gujrathi. 2015. Hidden Markov model based speech synthesis: A review. International Journal of Computer Applications 130: 35–39.
Lengeris, Angelos. 2009. Perceptual assimilation and L2 learning: Evidence from the perception of Southern British English vowels by native speakers of Greek and Japanese. Phonetica 66: 169–87.
Lenth, Russell V. 2023. emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.8.7. Available online: https://CRAN.R-project.org/package=emmeans (accessed on 31 July 2023).
Mackay, Ian R. A., James Emil Flege, and Satomi Imai. 2006. Evaluating the effects of chronological age and sentence duration on degree of perceived foreign accent. Applied Psycholinguistics 27: 157–83.
Mennen, Ineke. 2015. Beyond segments: Towards a L2 intonation learning theory. In Prosody and Language in Contact: L2 Acquisition, Attrition and Languages in Multilingual Situations. Edited by Elisabeth Delais-Roussarie, Mathieu Avanzi and Sophie Herment. Berlin: Springer, pp. 171–88.
Moyer, Alene. 2007. Do language attitudes determine accent? A study of bilinguals in the USA. Journal of Multilingual and Multicultural Development 28: 502–18.
Munro, Murray J., and Tracey M. Derwing. 1995. Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning 45: 73–97.
Munro, Murray J., and Tracey M. Derwing. 2020. Foreign accent, comprehensibility and intelligibility, redux. Journal of Second Language Pronunciation 6: 283–309.
Neuhauser, Sara. 2011. Foreign Accent Imitation and Variation of VOT and Voicing in Plosives. Paper presented at International Congress of Phonetic Sciences (ICPhS), Hong Kong, China, August 17–21; pp. 1462–65.
Nygaard, Lynne C., Sabrina K. Sidaras, Jessica E. Duke, and Stig T. Rasmussen. 2006. Acoustic correlates of accentedness and intelligibility of Spanish-accented English vowels. Journal of the Acoustical Society of America 120: 3170.
Pérez-Ramón, Rubén, Martin Cooke, and María Luisa García Lecumberri. 2020. Is segmental foreign accent perceived categorically? Speech Communication 117: 28–37.
Pérez-Ramón, Rubén, Martin Cooke, and María Luisa García Lecumberri. 2022a. Generating iso-accented stimuli for second language research: Methodology and a dataset for Spanish-accented English. Paper presented at Annual Conference of the International Speech Communication Association, INTERSPEECH, Incheon, Republic of Korea, September 18–22; vol. 2022, pp. 1846–50.
Pérez-Ramón, Rubén, María Luisa García Lecumberri, and Martin Cooke. 2022b. The SIAEW Corpus of Spanish Iso-Accented English Words [Data Set]. Zenodo.
Pérez-Ramón, Rubén, María Luisa García Lecumberri, and M. Cooke. 2023. The role of lexical context and language experience in the perception of foreign-accented segments. Poznan Studies in Contemporary Linguistics 59: 609–34.
Piske, Thorsten, Ian R. A. MacKay, and James Emil Flege. 2001. Factors affecting degree of foreign accent in an L2: A review. Journal of Phonetics 29: 191–215.
Polyanskaya, Leona, Mikhail Ordin, and Maria Grazia Busa. 2017. Relative salience of speech rhythm and speech rate on perceived foreign accent in a second language. Language and Speech 60: 333–55.
Qian, Yao, Yuchen Fan, Wenping Hu, and Frank K. Soong. 2014. On the training aspects of deep neural network (DNN) for parametric TTS synthesis. Paper presented at 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4–9; pp. 3829–33.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 31 July 2023).
Schoonmaker-Gates, Elena. 2015. On voice-onset time as a cue to foreign accent in Spanish: Native and nonnative perceptions. Hispania 98: 779–91.
Sereno, Joan, Lynne Lammers, and Allard Jongman. 2016. The relative contribution of segments and intonation to the perception of foreign-accented speech. Applied Psycholinguistics 37: 303–22.
Sumner, Meghan. 2009. The benefit of variation in cross-language perception of voice onset time (VOT). The Journal of the Acoustical Society of America 125: 2767.
Sung, Chit Cheung Matthew. 2014. An exploratory study of Hong Kong students’ perceptions of native and non-native English-speaking teachers in ELT. Asian Englishes 16: 32–46.
Taylor, Paul, Alan W. Black, and Richard Caley. 1998. The architecture of the Festival speech synthesis system. Paper presented at the Third ESCA/COCOSDA Workshop (ETRW) on Speech Synthesis, Blue Mountains, Australia, November 26–29; pp. 147–52.
Tuninetti, Alba, and Natasha Tokowicz. 2018. The influence of a first language: Training nonnative listeners on voicing contrasts. Language, Cognition and Neuroscience 33: 750–68.
Tyler, Michael D., Catherine T. Best, Alice Faber, and Andrea G. Levitt. 2014. Perceptual assimilation and discrimination of non-native vowel contrasts. Phonetica 71: 4–21.
Uzun, Tarık. 2023. Foreign accent, identity and accent discrimination: A literature review. International Journal of Language Academy 11: 252–66.
Van Els, Theo, and Kees De Bot. 1987. The role of intonation in foreign accent. The Modern Language Journal 71: 147–55.
Walkinshaw, Ian, and Duongthi Hoang Oanh. 2014. Native and non-native English language teachers: Student perceptions in Vietnam and Japan. Sage Open 4: 2158244014534451.
Xie, Xin, and Emily B. Myers. 2017. Learning a talker or learning an accent: Acoustic similarity constrains generalization of foreign accent adaptation to new talkers. Journal of Memory and Language 97: 30–46.

Figure 1. Discrimination overall results (i.e., all vowels together). Error bars represent ±1 standard error. On the x-axis, the first number of each pair is the X sound (female voice) and the second number is the dissimilar step in the AXB discrimination task.

Figure 2. Discrimination results for each vowel individually. Error bars represent ±1 standard error. On the x-axis, the first number of each pair is the X sound (female voice) and the second number is the dissimilar step in the AXB discrimination task.

Table 1. Collection of vowels and their assigned words present in the SIAEW corpus.

English Sound	Spanish Confusion	Words
ɜː	e	bird, burn, firm, learn
ɜː	o	word, world, worm, worse
æ	a	back, cat, clap, pact
ɑː	a	fast, raft, shark, stark
ɪ	i	clip, mist, pick, sin
iː	i	beam, seem, steam, team
ɒ	o	cost, dot, pot, spot
ɔː	o	clause, fall, orb, storm
ʊ	u	look, nook, put, should
uː	u	choose, mood, moon, spoon

Table 2. Summary of the types of trials presented to participants in the AXB experiment.

Male Voice (A)	Female Voice (X)	Male Voice (B)
Step n	Step n	Step $n + 2$
Step n	Step n	Step $n - 2$
Step $n + 2$	Step n	Step n
Step $n - 2$	Step n	Step n
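
The trial types in Table 2 can be enumerated programmatically. The sketch below is illustrative only: it assumes a five-step accent continuum (inferred from the contrast labels such as (1-3) and (5-3) used in the results tables), and the function and key names are hypothetical. Each pair is written as (X step, dissimilar step), following the labelling convention of the figure captions.

```python
from itertools import product

N_STEPS = 5  # ASSUMPTION: five-step continuum, inferred from the contrast labels
OFFSET = 2   # the dissimilar male token lies two steps away from X (Table 2)


def step_pairs(n_steps: int = N_STEPS, offset: int = OFFSET) -> list:
    """All (X step, dissimilar step) pairs that stay inside the continuum."""
    pairs = []
    for x in range(1, n_steps + 1):
        for other in (x - offset, x + offset):
            if 1 <= other <= n_steps:
                pairs.append((x, other))
    return pairs


def axb_trials(n_steps: int = N_STEPS, offset: int = OFFSET) -> list:
    """One trial per pair and per position (A or B) of the male token matching X."""
    trials = []
    for (x, other), match_pos in product(step_pairs(n_steps, offset), ("A", "B")):
        a, b = (x, other) if match_pos == "A" else (other, x)
        trials.append(
            {"A_male": a, "X_female": x, "B_male": b, "same_step_as_X": match_pos}
        )
    return trials


for pair in step_pairs():
    print(pair)
print(len(axb_trials()), "trial types per word")
```

Under these assumptions the enumeration yields six step pairs, which is consistent with the five degrees of freedom reported for the steps-compared factor.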

Table 3. Analysis of deviance for correct answers using Type II Wald chi-square tests.

Correct Responses ∼ Cohort × Steps Compared + (1|id)
Factor	Chi-Square	Degrees of Freedom	p Value
cohort	2.8097	1	0.0937
steps compared	44.9893	5	<0.001
cohort:steps	12.4690	5	0.0289
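
The p values in Table 3 can be cross-checked from the chi-square statistics and degrees of freedom alone. The following is a minimal sketch using only the Python standard library; the helper name `chi2_sf` is hypothetical, and the closed-form sums it uses hold only for positive integer degrees of freedom. It is offered as a sanity check and is not the procedure used to produce the table.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite sum of x^j / (1 * 3 * ... * (2j + 1))
    tail = math.erfc(math.sqrt(half))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


# Rows of Table 3: (factor, chi-square, degrees of freedom)
rows = [
    ("cohort", 2.8097, 1),
    ("steps compared", 44.9893, 5),
    ("cohort:steps", 12.4690, 5),
]
for factor, chi_sq, df in rows:
    print(f"{factor:15s} p = {chi2_sf(chi_sq, df):.4g}")
```

The recomputed values should agree with the printed p values up to rounding.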

Table 4. Pairwise comparisons of trials within the Spanish cohort.

Contrast	Estimate	Standard Error	z Ratio	p Value
(1-3)–(3-1)	−0.30423	0.1019	−2.985	0.0337
(1-3)–(5-3)	−0.51510	0.1038	−4.963	<0.001
(2-4)–(3-1)	−0.35404	0.1018	−3.479	0.0067
(2-4)–(5-3)	−0.56491	0.1037	−5.450	<0.001
(3-1)–(4-2)	0.37370	0.1018	3.672	0.0033
(3-5)–(5-3)	−0.35351	0.1044	−3.385	0.0093
(4-2)–(5-3)	−0.58457	0.1037	−5.639	<0.001

Table 5. Analysis of deviance for correct answers using Type II Wald chi-square tests.

Correct Responses ∼ Cohort × Steps Compared × Vowel + (1|id)
Factor	Chi-Square	Degrees of Freedom	p Value
cohort	2.4686	1	0.11614
steps compared	40.1330	5	$1.404 \times 10^{-7}$
vowel	183.5956	9	$<2.2 \times 10^{-16}$
cohort:steps compared	11.8557	5	0.03682
cohort:vowel	19.6130	9	0.02046
steps compared:vowel	210.1820	45	$<2.2 \times 10^{-16}$
cohort:steps compared:vowel	61.3548	45	0.05272

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
