Production of Acoustic Correlates of Stress by L2 Spanish-Speaking Immigrants to Spain

Timothy L. Face

doi:10.3390/languages8040258

Abstract

Little work has examined the L2 acquisition of Spanish stress, and especially the production of its acoustic correlates, and the work that has is largely limited to inexperienced learners. This study examines the production of stress by L1 English/L2 Spanish speakers who are highly experienced with their L2, having lived much of their adult lives as immigrants in Spain. Data were collected from the reading of a short story; an extended reading with a plot was used so that participants would not focus on their pronunciation, thus producing speech closer to spontaneous speech while still allowing for control over what they produced. Intensity, duration, pitch and deaccenting were examined, and the results from the L2 learners were compared to those of a control group of native speakers from Spain who performed the same task. While only one L2 learner’s stress production could be classified as completely native-like, as a group, their stress production approximated native speaker norms to a greater degree than has been found for most other aspects of L2 Spanish pronunciation in previous research. Nonetheless, L2 learners seemed to transfer duration patterns from their L1 into their L2 Spanish and also deaccented stressed syllables nearly twice as often as native speakers.

Keywords:

second language acquisition; phonology; stress; Spanish; duration; intensity; pitch; deaccenting; immigrants

1. Introduction

Lexical stress is important in Spanish, as it can be the only phonological feature that distinguishes two or more words, such as in the word pair hablo ‘I speak’ vs. habló ‘s/he spoke’ and in the word triplet depósito ‘deposit (n.)’ vs. deposito ‘I deposit’ vs. depositó ‘s/he deposited’, each of which is identical in pronunciation with the exception of which syllable is stressed. American English, on the other hand, does not have this type of contrast, with words distinguished solely based on which syllable bears lexical stress. While there are a few near contrasts, such as the noun contrast, with its initial stress, and the verb contrast, with its final stress, the difference in lexical stress is accompanied by a vowel quality difference as well (in the case of this example, the initial vowel in the noun contrast is [ɒ] while the initial vowel in the verb contrast is [ə]). Given this difference between the two languages, with stress alone being contrastive in Spanish but not in English, and the greater communicative importance of lexical stress in Spanish, it is predicted that American English speakers learning Spanish as a second language (L2) will have difficulty acquiring Spanish lexical stress.

There are relatively few studies on L2 Spanish stress compared to those on segmental acquisition. Most existing studies fall into one of two categories, investigating either the influence of the lexicon on Spanish stress placement (e.g., Bullock and Lord 2003; Carlson 2006; Lord 2007; Tight 2007) or the perception of stress (e.g., Face 2005; Lord 2001; Ortega-Llebaria et al. 2013; Saalfeld 2012). Only a few recent studies have looked at the L2 production of Spanish stress, with Mirisis (2019) examining stress placement and Kim (2015, 2020) examining acoustic correlates. The existing studies on L2 Spanish stress focus on university-level language learners. While these learners are sometimes labeled ‘advanced’ due to their course placement, they often have relatively little experience with Spanish. Considering that L2 learning can last a lifetime, even some of these so-called ‘advanced’ learners may be fairly early in their acquisition process. As a result, existing studies do not tell us what can realistically be expected in more experienced L2 learners of Spanish. The present study begins to investigate this question by examining three correlates of stress (i.e., pitch, duration and intensity) in the speech of a group of highly experienced L2 learners of Spanish who grew up in the United States, speak American English as their first language (L1), moved to Spain in adulthood and have spent much of their adult lives as immigrants living in Spain. Studying this group allows for examination of a later stage of L2 Spanish development that is largely unstudied and allows for consideration of what may be the realistic ultimate attainment for L2 learners of Spanish.

2. Previous Research

Quilis (1971) set out to examine the acoustic correlates of stress in Spanish speech production and found that pitch plays the largest role in marking stress, but that duration also plays a role. Intensity had been commonly claimed to be the primary marker of stress in Spanish based on impressionistic studies—and, in fact, acento de intensidad ‘intensity stress’ is a common way to refer to stress in Spanish—but Quilis found that it played little to no role in marking stress. These findings were tested from a perceptual perspective by Enríquez et al. (1989), who found that the same cues were responsible for listeners’ perception of stress, with pitch being most important, duration also contributing to stress perception, but intensity having no effect on the perception of stress.

Work on intonation over the past few decades has provided a lot of information about how pitch works in intonation languages like Spanish. Pitch is part of the intonation system, and while pitch accents are associated with stressed syllables, the resulting pitch movements are actually part of the intonation system and not the direct result of lexical stress. Given this, Ortega-Llebaria (2006) set out to examine stress production in cases where pitch accents were absent from words bearing lexical stress. To do so, she examined parenthetical phrases, which lack pitch accents and, therefore, are produced with a flat intonation pattern, with no notable rises and falls in pitch. In these cases, she found that duration is the primary marker of stress. While intensity also seems to mark stress in some cases, it does so less consistently. Nonetheless, in the vast majority of cases, stressed syllables are associated with a pitch accent (Cruttenden 1993), and therefore pitch movement is most often present on stressed syllables, meaning that listeners may relate it to stress even though its origin is the intonation system.

Before considering previous work on the L2 acquisition of Spanish stress, it is useful to consider how the Spanish stress system may pose challenges to speakers of American English. First, since American English reduces unstressed vowels, there is less communicative burden on the acoustic correlates of stress. Learners, therefore, may struggle to adequately use pitch and duration to communicate stress in Spanish. Second, American English tends to have larger duration differences between stressed and unstressed vowels than Spanish does (Hammond 2001) and so learners may need to reduce the duration of stressed vowels in comparison with their L1 (cf. Stevens 2011). Lastly, deaccenting (i.e., the lack of a pitch accent associated with a stressed syllable) is much more common in English than Spanish, so learners will need to learn to put pitch accents on stressed syllables more often in their L2 Spanish.

Kim (2020) is the most relevant previous study to the current study, as it is the only study (besides her 2015 pilot study) to consider the acoustic correlates of stress in the L2 Spanish speech of a group of L1 American English speakers. The study’s principal goal was to examine the acoustic correlates of stress in the speech production and perception of heritage speakers of Spanish, but Kim included a group of L2 learners as a comparison group, and the production findings for that group are relevant to the present study. Kim examined stress minimal pairs where segmentally identical words differed only in stress placement, with one being a paroxytone (i.e., having stress on the penultimate syllable of the word) and one being an oxytone (i.e., having stress on the final syllable of the word), such as canto ‘I sing’ and cantó ‘s/he sang’. Each word was inserted into carrier phrases that placed them in three different prosodic contexts: prenuclear position (i.e., prior to the last accented word of the phrase), nuclear position (i.e., the last accented word of the phrase), and unaccented context (i.e., where it would not receive a pitch accent). Duration, average intensity, average fundamental frequency (F0), and F0 peak displacement were measured. F0 is the rate of vibration of the vocal folds and is the physical parameter corresponding to pitch (though F0 and pitch are often used interchangeably, including in the present paper). By measuring F0 peak displacement, Kim recognizes that it is the F0 movement and how this aligns with the stressed syllable that is important in marking stress, with this being considered more important than the specific F0 value. While there are dialectal differences, in many varieties of Spanish, including that which Kim investigates in her study of heritage speakers, pitch accents are generally realized as rises in pitch through a stressed syllable, but the location of the peak varies based on context. 
In prenuclear position, the rise in pitch continues through the syllable and into the next; that is, the peak is ‘displaced’ from the stressed syllable. In nuclear position, on the other hand, the rise in pitch is contained within the stressed syllable, reaching its peak in that syllable; that is, there is no displacement of the peak. The L2 learners in Kim (2020) produced many words in prenuclear position without F0 peak displacement. Kim notes that this may be due to the influence of English. In the unaccented context, where there are no pitch accents and stress must be communicated via other acoustic correlates, the L2 learners showed a large overlap between stress patterns. In this context, the unstressed vowel was longer than the stressed vowel about half of the time. Taking into account these findings, as well as the other findings she reports in the study, Kim says that overall, the L2 learners “did not use the cues in a consistent manner” (p. 246).

In summary, the one study to examine this issue found that L2 learners were inconsistent in their use of the correlates of L2 Spanish stress. Based on previous work on Spanish stress, we know that pitch and duration are the most important correlates of stress, with intensity playing a much smaller role. It is precisely with pitch and duration that American English-speaking learners of Spanish might be expected to struggle with stress correlates in their L2. This is because American English uses pitch accents on stressed syllables less often than Spanish does and, when they are used, they often have a different phonetic realization, and because there is a greater difference in duration between stressed and unstressed vowels in American English than in Spanish. It must be noted that stress differences may be communicated in other ways as well. For example, the centralization of unstressed vowels is typical in American English and may lead L2 Spanish learners to reduce unstressed vowels in Spanish as well. Studies of L2 Spanish vowels, such as Menke and Face (2010) and Cobb and Simonet (2015), for example, have examined vowel quality differences across stress contexts. This is generally examined in studies on vowels rather than stress and has been studied for the current population in Menke and Face (2018). Since the present study is concerned with the acoustic correlates of stress in Spanish, which are generally claimed to be pitch, duration and intensity, and since Spanish does not have significant differences in vowel quality based on stress context (e.g., Quilis and Esgueva 1983), vowel quality is not considered in the present study.

3. Methods

The methods for this study were designed to address the following research questions:

RQ #1: How do highly experienced L2 learners of Spanish who have spent much of their adult lives living in Central Spain use pitch, duration and intensity as correlates to Spanish stress, and how does this usage compare to the use of these correlates by L1 Spanish speakers living in the same region?

RQ #2: To what extent do individual L2 learners approximate native Spanish speaker norms in their use of pitch, duration and intensity as correlates to Spanish stress?

The participants in this study were 12 native speakers of American English, 9 females and 3 males, who speak Spanish as a second language and live in central Spain, in either Madrid or Toledo. They were recruited for this study through personal contacts and an American club in the region. All were born and raised in the United States and immigrated to Spain as adults. While all began learning Spanish before moving to Spain, none of them began learning Spanish prior to adolescence, with all of them learning in school, beginning in middle school, high school or college. The age range of the participants was 41–84 years old, with a mean age of 62.9 years old. All had at least an undergraduate college education. Length of full-time residency in Spain ranged from 11 to 60 years, with a mean of 36 years.1 Table 1 shows this information for each participant. All participants spoke both Spanish and English in their daily lives, and their self-reported estimates of the share of their daily language use that was in Spanish ranged from 40% to 90%, with a mean of 67.5%. Multiple participants commented that the percentage of English and Spanish used has changed over time, based on jobs held (e.g., more English when working for an international company than a Spanish company; more Spanish when working than when not working), the presence of a Spanish-speaking significant other, easier access to English presently than previously due to technology, etc. It is their long length of residence in Spain and experience immersed in a primarily Spanish-speaking culture that gives these learners a profile that distinguishes them from learners in other studies. While some speakers had studied other languages at some point in their lives, none were fluent in or actively used any language other than English and Spanish. A comparison group of five native Spanish speakers born, raised and living in central Spain was also included.2 These speakers had an age range of 52–71 years old, with a mean age of 59.8 years old. 
One had advanced proficiency in English and intermediate proficiency in German, while the others were monolingual.3

Table 1. Demographic information for each L2 Spanish-speaking participant.

Participants completed a background questionnaire inquiring as to their language background and use. Following the questionnaire, they were recorded having a 10–15 min conversation with the researcher and reading a short story. The short story is the source of the data for this study. Reading connected prose falls somewhere near the middle of the interlanguage speech continuum proposed by Tarone (1983); while not a naturalistic task, it allows subjects to immerse themselves in a coherent text, which is less devoid of meaning than shorter reading tasks, such as word lists. By eliciting learner productions of language in this way, it was possible to control for the linguistic context while still providing subjects with a task that involved extended speaking and a developing plot so as not to permit close focus on their pronunciation, and thus more closely approach spontaneous speech than would be the case with reading a word list or a list of disconnected sentences. The story was Aniversario, by Luis Romero, with slight modifications (e.g., changing names) in order to elicit additional tokens of certain sounds, and contained 1343 words. Recordings were made with a Zoom H2n digital recorder.

In order to examine stress production, 40 words from the story were selected that had /a/, /e/ or /o/ as the stressed vowel, had the identical vowel in either the preceding or following syllable, with half in each position, and occurred in prenuclear position. The identical vowels were the targets of analysis. Example words with /a/ are hermana ‘sister’ (with the unstressed /a/ after the stressed /a/) and garbanzos ‘chickpeas’ (with the unstressed /a/ before the stressed /a/). As /a/, /e/ and /o/ are the most common vowels in Spanish, this allowed for plenty of words meeting the criteria for analysis. Having identical vowels in adjacent syllables, with one being stressed and one being unstressed, allowed for a more reliable comparison of duration and intensity between the vowels, since some vowels are inherently longer and/or have higher intensity than others. In addition, because both vowels occur within the same word, any variation in speech rate across the task would not affect the comparison of the stressed and unstressed vowels. By selecting words that occurred in prenuclear position, the final lengthening that is typical at the end of a phrase, which would have skewed the duration measurements, was avoided, as were differences in F0 pattern between prenuclear and nuclear positions. Of the 680 possible tokens for analysis (40 words for each of the 17 speakers), 45 were excluded, leaving 635 to be analyzed. In 42 of the excluded cases, the speaker inserted a boundary after the target word, thus placing it in nuclear position. In the other three cases, the speaker mispronounced the word in such a way that made the word unusable for analysis.
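
The token counts reported above can be checked with a few lines of arithmetic. The following is only a sketch: the figure of 680 possible tokens is assumed here to come from 40 target words read by each of the 17 speakers (12 L2 learners and 5 native speakers), and the function and constant names are illustrative rather than part of the study.

```python
# Token accounting for the stress analysis, using the figures reported in the text.
# Assumption: 680 possible tokens = 40 target words x 17 speakers (12 L2 + 5 native).
L2_SPEAKERS = 12
NATIVE_SPEAKERS = 5
TARGET_WORDS = 40


def token_counts(boundary_exclusions: int = 42, mispronunciations: int = 3) -> dict:
    """Return the possible, excluded and analyzed token counts."""
    possible = (L2_SPEAKERS + NATIVE_SPEAKERS) * TARGET_WORDS
    excluded = boundary_exclusions + mispronunciations
    return {
        "possible": possible,
        "excluded": excluded,
        "analyzed": possible - excluded,
    }


if __name__ == "__main__":
    print(token_counts())
```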

For each target vowel, analysis was carried out using Praat (Boersma and Weenink 2023). Its duration and intensity peak were measured and, for the stressed syllable, its F0 movement (rise, fall, flat, rise-fall, fall-rise) or deaccenting was noted. The duration ratio of the two target vowels in each word was calculated by dividing the duration of the stressed vowel by the duration of the unstressed vowel. The duration ratio indicated, then, the duration of the stressed vowel as a proportion of the duration of the unstressed vowel. A value greater than 1.0 indicated that the stressed vowel was longer than the unstressed vowel, while a value less than 1.0 indicated that the stressed vowel was shorter. Similarly, the intensity ratio of the two target vowels in each word was calculated by dividing the intensity peak of the stressed vowel by the intensity peak of the unstressed vowel. The intensity ratio provided the intensity peak of the stressed vowel as a proportion of the intensity peak of the unstressed vowel, with a value greater than 1.0 indicating that the stressed vowel had a higher intensity peak than the unstressed vowel, and a value less than 1.0 indicating that the stressed vowel had a lower intensity peak. Statistical analyses for group comparisons were carried out using SPSS, with mixed models used for duration ratio and intensity ratio and a Chi-square used for F0 movement. For individual comparisons of each learner to the native speaker group, outliers for duration ratio and intensity ratio for each speaker were identified using the outliers function in SPSS, with pairwise case deletion, and removed from the data.4 For duration ratio, 3 outliers (1.6% of tokens) were removed from the native speaker group and 16 outliers (3.5% of tokens) were removed from the L2 learner group. For intensity ratio, 6 outliers (3.3% of tokens) were removed from the native speaker group and 10 outliers (2.2% of tokens) were removed from the L2 learner group. 
Then, the number of tokens for each learner that fell within the native speaker range was noted. For F0 movement on the stressed syllable, the percentage of deaccented tokens for each speaker was calculated as was the percentage of F0 rises on accented tokens. Each learner’s percentage was compared to the percentage ranges of the native speakers.
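
The ratio measures and the native-range comparison described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in Praat and SPSS: the `Token` class, the function names and any measurement values passed to them are hypothetical, and the outlier-removal step that preceded the range calculation in the study is deliberately omitted.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Token:
    """One target word: measurements for its stressed and unstressed vowels.

    Field names and units are assumptions made for this sketch.
    """
    stressed_dur_ms: float
    unstressed_dur_ms: float
    stressed_int_db: float
    unstressed_int_db: float


def duration_ratio(token: Token) -> float:
    """Stressed vowel duration divided by unstressed vowel duration."""
    return token.stressed_dur_ms / token.unstressed_dur_ms


def intensity_ratio(token: Token) -> float:
    """Stressed vowel intensity peak divided by unstressed vowel intensity peak."""
    return token.stressed_int_db / token.unstressed_int_db


def native_range(native_values: Iterable[float]) -> Tuple[float, float]:
    """Lowest and highest native speaker values (outliers assumed already removed)."""
    values = list(native_values)
    return min(values), max(values)


def count_within(learner_values: Iterable[float], value_range: Tuple[float, float]) -> int:
    """Number of learner tokens that fall inside the native speaker range."""
    low, high = value_range
    return sum(low <= value <= high for value in learner_values)
```

A ratio above 1.0 means the stressed vowel is longer (or more intense) than the unstressed vowel, so a hypothetical token with a 120 ms stressed vowel and a 100 ms unstressed vowel yields a duration ratio of 1.2.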

4. Results

4.1. Group Comparisons

Beginning with duration, both the L1 Spanish speakers and the L1 English-speaking learners of Spanish typically produced a longer stressed vowel than unstressed vowel. This was the case 71% of the time for the L1 Spanish speakers and 83% of the time for the L1 English speakers. Not only do the L1 English speakers produce a longer stressed vowel more often than the L1 Spanish speakers, but the duration ratio is greater as well. The mean duration ratio for the L1 Spanish speakers is 1.22 while for the L1 English speakers it is 1.404. This means that, on average, the L1 Spanish speakers produce stressed vowels that are 22% longer than the unstressed vowels, while the L1 English speakers produce stressed vowels that are 40.4% longer than the unstressed vowels. The results of a mixed effects model for duration ratio (the dependent variable), with speaker and word as random factors, and L1 (Spanish or English), vowel (/a/, /e/ or /o/) and stressed vowel (whether the stressed vowel is the first or the second vowel) as independent variables are presented in Table 2. Here it can be seen that the difference in duration ratio between the two groups is statistically significant. While vowel is not significant, stressed vowel is. In this case, the duration ratio is greater when the stressed vowel is the second of the two target vowels in a word. Stressed vowel also has a significant interaction with L1, indicating that while stressed vowel is significant on its own, the size of its effect also differs significantly between the two speaker groups. The duration ratio is significantly greater in general when the stressed vowel is the second of the two vowels than when it is the first, but this difference is significantly greater for the L1 English speakers than for the L1 Spanish speakers.

Table 2. Mixed effects model for duration ratio with speaker and word as random factors.

Although less frequently than for duration, both groups also produced higher intensity on the stressed vowel than on the unstressed vowel, and at nearly identical rates. The L1 Spanish speakers produced higher intensity in the stressed vowel 66% of the time and the L1 English speakers 67% of the time. The mean intensity ratios of the two groups were also nearly identical, with the L1 Spanish speakers having a mean intensity ratio of 1.021 and the L1 English speakers having a mean of 1.019. In other words, on average, the stressed vowel of the L1 Spanish speakers was 2.1% higher in intensity than the unstressed vowel, while the stressed vowel of the L1 English speakers was 1.9% higher in intensity than the unstressed vowel. The results of a mixed effects model for intensity ratio (the dependent variable), with speaker and word as random factors, and L1 (Spanish or English), vowel (/a/, /e/ or /o/) and stressed vowel (whether the stressed vowel is the first or the second vowel) as the independent variables, are presented in Table 3. Here it can be seen that the small difference in intensity ratio between the two groups is not significant. Vowel is significant, with /o/ having a greater intensity ratio than /a/ and /e/, and stressed vowel is significant, with the intensity ratio being greater when the stressed vowel is the second of the two identical vowels compared to when it is the first. Neither vowel nor stressed vowel interacts significantly with L1, meaning that these factors are consistent regardless of speaker group.

Table 3. Mixed effects model for intensity ratio with speaker and word as random factors.

With respect to F0 movement, both groups produced cases of deaccenting, where no pitch accent was present on the stressed syllable. This was nearly twice as common for the L1 English speakers, however, as they deaccented 18.9% of the time while the L1 Spanish speakers deaccented 9.9% of the time. Removing the cases of deaccenting, it is of interest to see how the groups compare in terms of F0 movement when a pitch accent is present. Table 4 shows the number of times each group produced each of the F0 movements. As can be seen, both groups are nearly categorical in their production of a rising F0 on the stressed syllable, and a Chi-square test indicates that the difference between the groups in the distribution of F0 movements is not statistically significant.
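
For readers unfamiliar with the test, the Pearson Chi-square statistic for a table of group by F0 movement counts can be computed as in the sketch below. The study ran this test in SPSS; the function here is a hand-rolled illustration, and the counts in `HYPOTHETICAL_COUNTS` are invented placeholders, not the values from Table 4.

```python
from typing import List, Tuple


def chi_square_statistic(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson Chi-square statistic and degrees of freedom for a contingency table.

    Rows are speaker groups and columns are F0 movement categories.
    Assumes every row and every column contains at least one observation.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and F0 movement.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# HYPOTHETICAL placeholder counts (rise, fall, flat, rise-fall, fall-rise).
# These are NOT the counts reported in Table 4.
HYPOTHETICAL_COUNTS = [
    [160, 3, 2, 2, 1],   # L1 Spanish speakers (invented)
    [350, 8, 6, 4, 3],   # L1 English speakers (invented)
]

if __name__ == "__main__":
    print(chi_square_statistic(HYPOTHETICAL_COUNTS))
```

The statistic is then compared against the Chi-square distribution with the returned degrees of freedom to obtain a p-value, a step left to a statistics package.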

Table 4. F0 movement on stressed syllable by speaker group after cases of deaccenting removed.

4.2. Individual Comparisons

While the group results present a larger view of the very experienced L1 English/L2 Spanish speakers in this study, it is also of interest to see the individual variation within that group. To that end, each L1 English speaker’s results were compared with the L1 Spanish speakers’ speech productions. For both duration ratio and intensity ratio, each speaker from both groups was examined individually and their outliers identified within SPSS and removed from the data. Then, the native speaker range for each of these measures was calculated and each L1 English speaker’s productions were examined to see how many of their productions fell within the native speaker range. Table 5 presents the number of tokens that each L1 English speaker produced that are within the native speaker range for both duration ratio and intensity ratio. It can be seen that for intensity ratio, 7 of the 12 speakers produce all of their tokens within the native speaker range, while for duration ratio only 2 of the 12 do so. Speaker 6 is the only speaker to produce all tokens in the native speaker range for both measures. For intensity ratio, the lowest percentage of tokens within the native speaker range is still over 94% (Speaker 7, with 33 of 35 productions within the native speaker range), while for duration ratio there are three speakers with fewer than 87% of their productions in the native speaker range, with the lowest being Speaker 11 (33 of 39 productions in the native speaker range, for 84.6%).
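
The percentages quoted above follow directly from the token counts given in the text. The helper below is a small illustrative check (the function name is not from the study), using only the two token counts stated in this paragraph.

```python
def percent_within(tokens_within: int, tokens_total: int) -> float:
    """Percentage of a speaker's tokens inside the native speaker range, to one decimal."""
    return round(100 * tokens_within / tokens_total, 1)


if __name__ == "__main__":
    # Speaker 7, intensity ratio: 33 of 35 tokens within the native speaker range.
    print(percent_within(33, 35))
    # Speaker 11, duration ratio: 33 of 39 tokens within the native speaker range.
    print(percent_within(33, 39))
```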

Table 5. Tokens for each L1 English speaker within the native speaker range for duration ratio and intensity ratio.

For F0 movement, both the percentage of tokens that were deaccented and, for tokens that were not deaccented, the percentage of rises in F0 during stressed syllables were calculated and compared with the native speaker range. These results are shown in Table 6. While as a group the L1 English speakers deaccented nearly twice as often as the native speakers, only 3 of the 12 L1 English speakers had a percentage of deaccenting that was outside of the native speaker range, with the highest percentage of tokens deaccented being 30% (Speaker 10, with 12 of 40 tokens deaccented). When an accent was present, 6 of the 12 L1 English speakers had a percentage of rises within the native speaker range, but this number may be somewhat deceiving since 3 of the 6 speakers outside of the native speaker range only had one token without a rise (the same as two of the native speakers), but the lower number of tokens for those speakers, in comparison with the native speakers, meant that the percentage fell below the native speaker range.

Table 6. Percentages of stressed syllables deaccented and percentage of F0 rises when not deaccented for each L1 English speaker.

5. Discussion

When considering the group results, it is notable that the results for duration ratio show that while both groups have significantly longer stressed vowels than unstressed vowels, the difference is greater for the L1 English speakers than for the L1 Spanish speakers. The L1 English speakers produce stressed vowels on average 40% longer than unstressed vowels while for the L1 Spanish speakers stressed vowels are only 22% longer. The significant interaction between L1 and stressed vowel (i.e., that the duration ratio is greater when the second vowel is stressed) highlights this, where, in this position, the duration ratio shows that L1 English speakers produce stressed vowels 63% longer, on average, than unstressed vowels, while L1 Spanish speakers produce stressed vowels 37% longer than unstressed vowels. It appears that the larger duration differences between stressed and unstressed vowels in English transfer into the L2 Spanish for the L1 English speakers in this study. For intensity ratio, there is not a significant difference between groups, and while stressed vowels have a higher average intensity than unstressed vowels for both groups, the difference is minimal. This agrees with previous research claiming that intensity plays little to no role in communicating Spanish stress. With respect to F0 movement, there is a significant difference in the degree of deaccenting. The L1 English speakers deaccent nearly twice as often as the L1 Spanish speakers (18.9% vs. 9.9%). This is consistent with the general pattern in the two languages, where English shows more deaccenting than Spanish. In cases where a pitch accent is present, both groups are nearly categorical in producing F0 rises on stressed syllables. This is consistent with previous studies claiming that F0 is the most consistent marker of Spanish stress. 
Overall, the group results show that the L1 English speakers in this study consistently produce F0 rises to mark stressed syllables when they produce pitch accents, though they deaccent considerably more than the L1 Spanish speakers do. Duration is also a clear marker of stress for this group, and while this is true for the L1 Spanish speakers as well, the L1 English speakers demonstrate a significantly larger difference, likely due to the influence of their L1. Finally, intensity differences between stressed and unstressed syllables are minimal for both groups. In sum, F0 rises and longer stressed syllables are the acoustic cues that mark stress for both groups, with the differences between groups being in how often an accent is present (more for the L1 Spanish speakers) and in the degree of difference in duration between stressed and unstressed vowels (larger for the L1 English speakers).

When considering the individual results, one of the questions of interest is what ultimate attainment looks like for these very experienced L2 learners. While there is no way to know for sure that they have reached the endpoint of their acquisition, it is likely that they have come close even if they have not. After all, they have spent decades immersed in the Spanish culture, using the Spanish language on a daily basis. With all of their experience with the language, what do we see in their production of stress? Do they produce stress the way that native speakers do? If not, how are they different? It is worth noting that most of the L1 English speakers in this study did not produce the correlates of stress in a truly native-like way. For intensity ratio, only 7 of the 12 L1 English speakers produced all of their tokens within the native speaker range, and for duration ratio only 2 of the 12 did so. For F0 movement, the number of speakers within the native speaker range was higher. For deaccenting, 9 of the 12 were within the native speaker range for percentage of deaccenting, and for F0 rises on accented syllables, 6 of the 12 were within the native speaker range for percentage of rises (though, as noted above, of the 6 not in the native speaker range, 3 only had one token without an F0 rise). Speaker 6 was the only speaker to have all tokens within the native speaker range for duration ratio, intensity ratio, percentage of deaccenting and percentage of F0 rises on accented syllables. On the one hand, finding that only one speaker was completely native-like in producing the acoustic correlates of stress in the present study is consistent with work on L2 phonology showing that native-like speech production in the L2 is truly exceptional (e.g., Ioup et al. 1994; Kinsella and Singleton 2014; Moyer 1999; among others). 
On the other hand, even though only one speaker was completely native-like in their production of the acoustic correlates of stress, as a whole the group was not that different from the native speakers. The speakers that most differed from native speakers on each measure were still similar to the native speakers in their productions. For duration ratio, the speaker with the fewest tokens in the native speaker range still produced more than 84% of the tokens in that range. For intensity ratio, every speaker had more than 94% of tokens in the native speaker range. For F0 movement, the speaker with the most deaccenting showed a 30% rate of deaccenting, which is considerably more than the 20.5% that is the top of the native-speaker range, but still within 10 percentage points of it, while for percentage of F0 rises, the lowest percentage was 90.3%, which is still a very high percentage. In previous studies on Spanish ultimate attainment, speech production similar to that of native speakers has only been found for laterals (Face 2021a). In studies of several other sound classes, native-like pronunciation has not been found for any speaker, with most not even being close to native-like (Face 2018a, 2018b, 2021b; Face and Menke 2020; Menke and Face 2018). With the L1 English speakers in the present study being closer to native-like production of the acoustic correlates of stress than has been found for other aspects of L2 Spanish pronunciation, one must wonder whether these L2 learners would sound different to the native speaker’s ear in how they produce stress. The human ear tends to be more forgiving than close linguistic scrutiny (e.g., Stölten et al. 2015), suggesting that not achieving native-like values in acoustic measurements may not prevent native-speaking listeners from perceiving some highly experienced L2 learners as native speakers. 
While stress is just one aspect of their speech production, it could be that, at least in this aspect of their speech, their production is close enough to that of native speakers so as not to mark them as having a foreign accent.

6. Conclusions

The US-born, long-time immigrants to Spain who participated in the present study are like native Spanish speakers in their lack of use of intensity and their use of rising pitch accents to mark stress in their L2 Spanish, although their rate of deaccenting is nearly double that of the native speakers. Like native speakers, they also use longer duration to mark stressed vowels, but have a significantly longer duration difference between stressed and unstressed vowels than native speakers do. When considering individual learners, there is considerable variation, as would be expected. The majority of the speakers were native-like on most of the measures (i.e., intensity ratio, deaccenting percentage and percent of F0 rises on accented stressed syllables), but only two speakers were completely native-like in their use of duration to mark stress (i.e., duration ratio). The larger duration difference between stressed and unstressed vowels in English seems to transfer to the L2 Spanish of these learners and persist even after decades of experience living in Spain and using Spanish on a daily basis. When looking across all measures, one of the immigrant speakers was fully native-like in their production of the acoustic correlates of Spanish stress. Even though there were notable differences from native speakers, the other immigrants were much closer to native-like performance than has typically been found for highly experienced L2 learners on other aspects of the Spanish sound system.

While the present study adds to our knowledge of the phonological acquisition of L2 Spanish, especially with respect to the under-studied population of highly experienced learners, it also has limitations that are worthy of mention. The number of participants in the study is small. This is unavoidable, in that it results from participants being drawn from a unique population that is extremely small; nevertheless, it presents limitations. For example, the degree of variability between the speakers may not be representative of the variability across a larger population. In addition, it is possible that some of the results that were not statistically significant would be significant with a larger group of participants. Lastly, in the present study, only one of the 12 L2 learners was native-like in the production of the acoustic correlates of stress examined here, a result that supports the view that native-like pronunciation is exceptional. However, it is possible that with a larger number of participants there would be more native-like pronunciation and that this would show that the native-like production of stress is not truly exceptional. Another limitation is that there is no measure of Spanish language proficiency, which means that there could be considerable differences in proficiency between participants in spite of their long residence in Spain. In addition, this limits the possibility of further distinguishing this group of participants from less experienced speakers in prior studies.

Beyond the limitations just mentioned, there are several remaining questions about the L2 Spanish production of acoustic correlates of stress that future studies should address. First, while the L2 Spanish speakers in the present study were nearly categorical in their production of F0 rises on stressed syllables that were accented, it is unknown whether they produce the same rising pitch accents as native speakers. Multiple pitch accents are characterized by an F0 rise, with factors such as the alignment of the beginning and/or end of the rise with specific points (e.g., the beginning or end of the stressed syllable) distinguishing them. While the present study examined the phonetic presence of an F0 rise, future studies should investigate the phonological nature of the pitch accents used by experienced L2 learners and how they compare to those produced by native speakers. Second, the importance of the rate of deaccenting is unclear. It is an interesting finding that the immigrants in the present study deaccented nearly twice as frequently as the native speakers, yet, when looking at individual results, three quarters of them were within the native speaker range. It could be that one of the native speakers is an outlier who made more of the immigrants appear to have native-like rates of deaccenting. Alternatively, it could be that the rate of deaccenting is more variable among native speakers than limited previous research has made it seem. Given that English deaccents more than Spanish, investigating these possibilities would help clarify the importance of the rate of deaccenting in L2 Spanish. Third, it would be useful to understand why the greater difference in duration between stressed and unstressed vowels in English compared to Spanish is so persistent in the L2 Spanish of L1 English speakers. One might expect that this would not be the case, as L2 learners eliminate factors such as unstressed vowel reduction that enhance the duration difference.
It would also be worth investigating if this persistence is seen in other aspects of duration and timing or if it is specific to stress production. Lastly, while there are differences in the production of acoustic correlates between these L2 Spanish-speaking immigrants and native speakers, these may or may not be sufficient to mark them as having a foreign accent. Since the human ear is more forgiving than linguistic analysis, perception studies would be valuable to determine whether any or all of the differences in the production of the acoustic correlates of stress are socially meaningful, which may potentially be the difference between whether speakers are seen as group members by native-speaking listeners or are marked as outsiders.

Funding

This research received no external funding.

Institutional Review Board Statement

The Institutional Review Board of the University of Minnesota determined that this study, as part of the research project “Development and Ultimate Attainment of Spanish L2 Phonology by Adult Speakers of American English” (study number 0609E92986) is exempt from review under federal guidelines 45 CFR Part 46.101(b) category #2 SURVEYS/INTERVIEWS; STANDARDIZED EDUCATIONAL TESTS; OBSERVATION OF PUBLIC BEHAVIOR.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are not available to anyone other than the author due to a lack of permission from the participants to share data based on their speech.

Acknowledgments

I thank Mandy Menke for her support of this research and her collaboration on other pieces of the larger project of which this research is part. Without her enthusiasm for this project, this research would never have come to fruition. I would also like to thank the four anonymous reviewers, whose comments were extremely useful and were instrumental in making this a stronger paper.

Conflicts of Interest

The author declares no conflict of interest.

Notes

1	These numbers are the amount of time the participants have lived in central Spain full time. Many spent a period of time going back and forth between the United States and Spain before making Spain their full-time home. An extreme case of this is the participant with the shortest full-time residency (11 years), who had also spent 29 summers and two full years in Spain prior to moving to Spain permanently upon retiring from his job in the United States.
2	While only five L1 Spanish speakers are included, a larger number was not deemed necessary due to this group being a point of comparison rather than the target of study, and as the five speakers included are representative of the speakers of this region.
3	The speaker with some proficiency in English and German was included in the comparison group because of her prototypical Castilian accent and the fact that many speakers in this region have some degree of proficiency in one or more languages other than Spanish. In addition, while not examined for all sounds, when this speaker was statistically compared to other members of the control group, she did not demonstrate any differences from the others.
4	It could be argued that outliers are relevant data and should be included in the analysis, a point with which, in general, I would agree and for that reason the outliers were not removed for the group comparisons. However, since for the individual comparisons the speech production of the L2 learners is being compared with native-speaker ranges, including outliers could drastically skew the ranges and artificially show most or all L2 tokens being produced within the native-speaker range. For this reason outliers were removed for the individual comparisons.

References

Boersma, Paul, and David Weenink. 2023. Praat: Doing Phonetics by Computer [Computer Program]. Version 6.3.17. Available online: http://www.praat.org/ (accessed on 10 August 2023).
Bullock, Barbara, and Gillian Lord. 2003. Analogy as a learning tool in second language acquisition: The case of Spanish stress. In Romance Linguistics: Theory and Acquisition. Edited by Ana Teresa Pérez-Leroux and Yves Roberge. Amsterdam: John Benjamins, pp. 281–97.
Carlson, Matthew T. 2006. The development of fine-grained phonological knowledge in adult second language learners of Spanish. Southwest Journal of Linguistics 25: 75–105.
Cobb, Katherine, and Miquel Simonet. 2015. Adult second language learning of Spanish vowels. Hispania 98: 47–60.
Cruttenden, Alan. 1993. The de-accenting and re-accenting of repeated lexical items. In Proceedings of an ESCA Workshop on Prosody. Working Papers 41. Edited by David House and Paul Touati. Lund: Lund University Department of Linguistics, pp. 16–19.
Enríquez, Emilia V., Celia Casado, and Andrés Santos. 1989. La percepción del acento en español. Lingüística Española Actual 11: 241–69.
Face, Timothy L. 2005. Syllable weight and the perception of Spanish stress placement by second language learners. Journal of Language and Learning 3: 90–103.
Face, Timothy L. 2018a. Ultimate attainment of Spanish rhotics by native English-speaking immigrants to Spain. Lengua y Migración 10: 57–80.
Face, Timothy L. 2018b. Ultimate attainment in Spanish spirantization: The case of U.S.-born immigrants in Spain. Spanish in Context 15: 27–53.
Face, Timothy L. 2021a. Ultimate attainment of Spanish laterals by native English-speaking immigrants to Spain. Journal of Second and Multiple Language Acquisition 9: 167–80.
Face, Timothy L. 2021b. What does advanced L2 pronunciation look like? Evidence from the ultimate attainment of Spanish consonants. In Advancedness in Second Language Spanish: Definitions, Challenges, and Possibilities. Edited by Mandy R. Menke and Paul A. Malovrh. Amsterdam: John Benjamins, pp. 144–69.
Face, Timothy L., and Mandy R. Menke. 2020. L2 acquisition of Spanish VOT by English-speaking immigrants in Spain. Studies in Hispanic and Lusophone Linguistics 13: 361–89.
Hammond, Robert M. 2001. The Sounds of Spanish: Analysis and Application (with Special Reference to American English). Somerville: Cascadilla Press.
Ioup, Georgette, Elizabeth Boustagui, Manal El Tigi, and Martha Moselle. 1994. Reexamining the critical period hypothesis: A case study in a naturalistic environment. Studies in Second Language Acquisition 16: 73–98.
Kim, Ji Young. 2015. Perception and production of Spanish lexical stress by Spanish heritage speakers and English L2 learners of Spanish. In Selected Proceedings of the 6th Conference on Laboratory Approaches to Romance Phonology. Edited by Erik W. Willis, Pedro Martín Butragueño and Esther Herrera Zendejas. Somerville: Cascadilla Proceedings Project, pp. 106–28.
Kim, Ji Young. 2020. Discrepancy between heritage speakers’ use of suprasegmental cues in the perception and production of Spanish lexical stress. Bilingualism: Language and Cognition 23: 233–50.
Kinsella, Ciara, and David Singleton. 2014. Much more than age. Applied Linguistics 35: 441–62.
Lord, Gillian. 2001. The Second Language Acquisition of Spanish Stress: Derivational, Analogical or Lexical? Doctoral dissertation, The Pennsylvania State University, State College, PA, USA.
Lord, Gillian. 2007. The role of lexicon in learning second language stress patterns. Applied Language Learning 17: 1–14.
Menke, Mandy R., and Timothy L. Face. 2010. Second language Spanish vowel production: An acoustic analysis. Studies in Hispanic and Lusophone Linguistics 3: 181–214.
Menke, Mandy R., and Timothy L. Face. 2018. Spanish vowel production by English-speaking immigrants in Spain. Paper presented at the Current Approaches to Spanish and Portuguese Second Language Phonology Conference, Indiana University, Bloomington, IN, USA.
Mirisis, Christina A. 2019. L2 acquisition of Spanish stress in segmentally identical words. Hispanic Studies Review 4: 98–120.
Moyer, Alene. 1999. Ultimate attainment in L2 phonology: The critical factors of age, motivation, and instruction. Studies in Second Language Acquisition 21: 81–103.
Ortega-Llebaria, Marta. 2006. Phonetic cues to stress and accent in Spanish. In Selected Proceedings of the 2nd Conference on Laboratory Approaches to Spanish Phonetics and Phonology. Edited by Manuel Díaz-Campos. Somerville: Cascadilla Proceedings Project, pp. 104–18.
Ortega-Llebaria, Marta, Hong Gu, and Jieyu Fan. 2013. English speakers’ perception of Spanish lexical stress: Context-driven L2 stress perception. Journal of Phonetics 41: 186–97.
Quilis, Antonio, and Manuel Esgueva. 1983. Realización de los fonemas vocálicos españoles en posición fonética normal. In Estudios de fonética. Edited by Manuel Esgueva and Margarita Cantarero. Madrid: Consejo Superior de Investigaciones Científicas, pp. 159–251.
Quilis, Antonio. 1971. Caracterización fonética del acento español. Travaux de Linguistique et de Littérature 9: 53–72.
Saalfeld, Anita K. 2012. Teaching L2 Spanish stress. Foreign Language Annals 45: 283–303.
Stevens, John J. 2011. Vowel duration in second language Spanish vowels: Study abroad vs. at-home learners. Arizona Working Papers in SLA & Teaching 18: 77–104.
Stölten, Katrin, Niclas Abrahamsson, and Kenneth Hyltenstam. 2015. Effects of age and speaking rate on voice onset time. Studies in Second Language Acquisition 37: 71–100.
Tarone, Elaine. 1983. On the variability of interlanguage systems. Applied Linguistics 4: 143–63.
Tight, Daniel G. 2007. Lexical subregularities and the stress preferences of L2 Spanish learners. Hispania 90: 565–78.

Table 1. Demographic information for each L2 Spanish-speaking participant.

Speaker	Gender	Age	Length of Residence (Years)
1	F	84	60
2	F	68	48
3	M	63	36
4	M	72	11
5	F	47	23
6	F	48	26
7	F	41	15
8	F	63	37
9	F	63	36
10	M	65	25
11	F	77	55
12	F	64	37

Table 2. Mixed effects model for duration ratio with speaker and word as random factors.

	Num. df	Denom. df	F	Sig.
Intercept	1	627	4550.106	<0.001
L1	1	627	22.449	<0.001
Vowel	2	627	1.231	0.293
Stressed Vowel	1	627	110.412	<0.001
L1 * Vowel	2	627	1.723	0.179
L1 * Stressed Vowel	1	627	4.465	0.035

Table 3. Mixed effects model for intensity ratio with speaker and word as random factors.

	Num. df	Denom. df	F	Sig.
Intercept	1	627	323,470.331	<0.001
L1	1	627	0.352	0.553
Vowel	2	627	5.229	0.006
Stressed Vowel	1	627	16.563	<0.001
L1 * Vowel	2	627	0.444	0.642
L1 * Stressed Vowel	1	627	0.094	0.759

Table 4. F0 movement on stressed syllable by speaker group after cases of deaccenting removed.

	Rise	Fall	Rise-Fall	Flat
L1 Spanish	162	2	0	0
L1 English	356	3	4	4

χ² (3, N = 531) = 3.806, p = 0.283.
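
The test statistic reported under Table 4 can be recomputed from the counts in the table. The following is a minimal sketch in Python and not the analysis script used in the study: the function names are hypothetical, the statistic is assumed to be an uncorrected Pearson chi-square, and the closed-form tail probability is valid only for 3 degrees of freedom.

```python
import math

# Observed counts from Table 4 (rows: L1 Spanish, L1 English;
# columns: Rise, Fall, Rise-Fall, Flat).
observed = [
    [162, 2, 0, 0],
    [356, 3, 4, 4],
]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, N) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of group and F0 movement.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, n


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    z = x / 2.0
    return math.erfc(math.sqrt(z)) + 2.0 / math.sqrt(math.pi) * math.sqrt(z) * math.exp(-z)


stat, df, n = pearson_chi_square(observed)
p_value = chi_square_sf_df3(stat)
print(f"chi2({df}, N = {n}) = {stat:.3f}, p = {p_value:.3f}")
```

Under these assumptions, the printed values should agree with those reported under Table 4 up to rounding.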

Table 5. Tokens for each L1 English speaker within the native speaker range for duration ratio and intensity ratio.

Speaker	Duration Ratio	Intensity Ratio
	NS Range: 0.581–2.076	NS Range: 0.94–1.114
1	34/34	33/34
2	34/35	37/37
3	38/40	38/39
4	36/38	38/38
5	32/33	35/35
6	37/37	40/40
7	31/36	33/35
8	34/36	35/35
9	35/37	38/39
10	33/38	37/39
11	33/39	37/37
12	33/34	35/35
Total	410/437 (93.8%)	436/443 (98.4%)
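
The individual comparisons drawn from Table 5 amount to simple counting over the table. The sketch below, written in Python for illustration only, copies the counts from Table 5 and derives which speakers had every token inside the native-speaker range and the lowest per-speaker percentage on each measure. The variable and function names are hypothetical.

```python
# Tokens inside the native-speaker (NS) range out of tokens measured,
# copied from Table 5: speaker -> ((duration in, total), (intensity in, total)).
table5 = {
    1: ((34, 34), (33, 34)),
    2: ((34, 35), (37, 37)),
    3: ((38, 40), (38, 39)),
    4: ((36, 38), (38, 38)),
    5: ((32, 33), (35, 35)),
    6: ((37, 37), (40, 40)),
    7: ((31, 36), (33, 35)),
    8: ((34, 36), (35, 35)),
    9: ((35, 37), (38, 39)),
    10: ((33, 38), (37, 39)),
    11: ((33, 39), (37, 37)),
    12: ((33, 34), (35, 35)),
}


def percent(pair):
    """Percentage of tokens inside the NS range for one (in, total) pair."""
    inside, total = pair
    return 100.0 * inside / total


# Speakers whose tokens all fall inside the NS range on each measure.
duration_all = [s for s, (dur, _) in table5.items() if dur[0] == dur[1]]
intensity_all = [s for s, (_, inten) in table5.items() if inten[0] == inten[1]]

# Lowest per-speaker percentage on each measure.
lowest_duration = min(percent(dur) for dur, _ in table5.values())
lowest_intensity = min(percent(inten) for _, inten in table5.values())

print("All duration tokens in NS range:", duration_all)
print("All intensity tokens in NS range:", intensity_all)
print(f"Lowest duration percentage: {lowest_duration:.1f}")
print(f"Lowest intensity percentage: {lowest_intensity:.1f}")
```

Counting from the table in this way keeps the prose summaries (2 of 12 speakers for duration ratio, 7 of 12 for intensity ratio) tied directly to the tabulated data.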

Table 6. Percentages of stressed syllables deaccented and percentage of F0 rises when not deaccented for each L1 English speaker.

Speaker	% Deaccented	% Rises When Accented
	NS Range: 0–20.5%	NS Range: 96.8–100%
1	8.8 (3/34)	100 (31/31)
2	13.2 (5/38)	97.0 (32/33)
3	17.5 (7/40)	100 (33/33)
4	18.4 (7/38)	90.3 (28/31)
5	10.8 (4/37)	100 (33/33)
6	17.5 (7/40)	100 (33/33)
7	25.0 (9/36)	96.3 (26/27)
8	19.4 (7/36)	96.6 (28/29)
9	17.9 (7/39)	93.8 (30/32)
10	30.0 (12/40)	96.4 (27/28)
11	20.5 (8/39)	93.5 (29/31)
12	27.8 (10/36)	100 (26/26)
Total	18.9 (86/453)	97.0 (356/367)
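
The comparison of each speaker with the native-speaker ranges in Table 6 can be sketched in the same way. The Python sketch below is illustrative only and uses hypothetical names. It assumes that percentages are rounded to one decimal place before being compared with the range limits, as in the table; without that rounding, Speaker 11 (8/39) would fall marginally above the 20.5% limit.

```python
# NS range limits as reported in Table 6 (percentages).
NS_DEACCENT_MAX = 20.5
NS_RISE_MIN = 96.8

# Counts copied from Table 6:
# speaker -> (deaccented, stressed syllables, F0 rises on accented syllables).
table6 = {
    1: (3, 34, 31),
    2: (5, 38, 32),
    3: (7, 40, 33),
    4: (7, 38, 28),
    5: (4, 37, 33),
    6: (7, 40, 33),
    7: (9, 36, 26),
    8: (7, 36, 28),
    9: (7, 39, 30),
    10: (12, 40, 27),
    11: (8, 39, 29),
    12: (10, 36, 26),
}

deaccent_in_range = []
rise_in_range = []
one_token_short = []  # outside the rise range with a single non-rising token

for speaker, (deaccented, total, rises) in table6.items():
    accented = total - deaccented
    # Assumption: round to one decimal before comparing, as in the table.
    pct_deaccented = round(100 * deaccented / total, 1)
    pct_rises = round(100 * rises / accented, 1)
    if pct_deaccented <= NS_DEACCENT_MAX:
        deaccent_in_range.append(speaker)
    if pct_rises >= NS_RISE_MIN:
        rise_in_range.append(speaker)
    elif accented - rises == 1:
        one_token_short.append(speaker)

print("Deaccenting within NS range:", deaccent_in_range)
print("F0 rises within NS range:", rise_in_range)
print("Outside rise range by one token:", one_token_short)
```

The rounding step is a design choice made here so that the comparison matches the precision at which the range limits are reported; a different convention could change the classification of borderline speakers.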

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
