Article

Unfolding Prosody Guides the Development of Word Segmentation

Center of Linguistics, School of Arts and Humanities, University of Lisbon, 1600-214 Lisbon, Portugal
*
Author to whom correspondence should be addressed.
Languages 2024, 9(9), 305; https://doi.org/10.3390/languages9090305
Submission received: 7 October 2023 / Revised: 7 August 2024 / Accepted: 7 September 2024 / Published: 19 September 2024
(This article belongs to the Special Issue Phonetic and Phonological Complexity in Romance Languages)

Abstract

Prosody is known to scaffold the learning of language, and thus understanding prosodic development is vital for language acquisition. The present study explored the unfolding prosody model of prosodic development (proposed by Frota et al. 2016) beyond early production data, to examine whether it predicted the development of early segmentation abilities. European Portuguese-learning infants aged between 5 and 17 months were tested in a series of word segmentation experiments. Developing prosodic structure was evidenced in word segmentation as proposed by the unfolding model: (i) a simple monosyllabic word shape crucially placed at a major prosodic edge was segmented first, before more complex word shapes under similar prosodic conditions; (ii) the segmentation of more complex words was easier at a major prosodic edge than in phrase-medial position; and (iii) the segmentation of complex words with an iambic pattern preceded the segmentation of words with a trochaic pattern. These findings demonstrated that word segmentation evolved with unfolding prosody, suggesting that the prosodic units developed in the unfolding process are used both as speech production planning units and to extract word-forms from continuous speech. Therefore, our study contributes to a better understanding of the mechanisms underlying word segmentation, and to a better understanding of early prosodic development, a cornerstone of language acquisition.

1. Introduction

Understanding early prosodic development is vital for language acquisition. Decades of accumulated evidence have shown that infants exhibit a precocious sensitivity to prosody from birth, and that prosody scaffolds the learning of language (e.g., Christophe et al. 2008; Gleitman and Wanner 1982; Jusczyk 1997). One of the abilities that infants need to develop to learn a language is word segmentation, that is, the ability to segment potential word-forms from continuous speech. Prosody has been suggested to facilitate early word segmentation, for example by promoting the extraction of word-forms placed at major prosodic edges. Despite the widely recognized importance of prosody for language acquisition, there is not yet a comprehensive model of prosodic development that integrates high-level (phrasal) and low-level (word) prosody, as well as perception and production. The unfolding prosody model of prosodic development (Frota et al. 2016), initially proposed on the basis of early production data, suggests that prosodic development proceeds by the gradual unfolding of key prosodic domains of the ambient language, which starts with a simple unit and advances through the expansion and nesting of more complex prosodic structures. In the current study, we explore the unfolding prosody model beyond early production data, to examine whether it successfully predicts the development of word segmentation abilities.
All languages display some kind of prosodic structure, i.e., hierarchically organized prosodic units, which may be implemented by specific rules and realized in language-particular ways (Frota 2012; Shattuck-Hufnagel and Turk 1996). Units like syllables, feet, prosodic words, phonological phrases, and intonational phrases form prosodic groupings at different levels, and typically each lower-level unit is included in a higher-level structure. Prosodic structure has been suggested to provide a scaffolding to language, given that different prosodic units interface and align with different components of grammar (Frota and Vigário 2018; Nespor and Vogel 2007; Selkirk and Lee 2015). It has also been seen as the governing framework of adult speech recognition and adult speech production planning (McQueen and Dilley 2020; Shattuck-Hufnagel 2019). Unsurprisingly, prosody has also been shown to play a key role in language development, providing information that helps infants discover many aspects of the organization of their native language. This process is known as prosodic bootstrapping (Höhle 2009; Morgan and Demuth 1996; see Gervain et al. 2020 for a recent review). Consequently, a clear understanding of early prosodic development seems fundamental for the understanding of the abilities driving language learning and the functions of prosody in language acquisition.
Existing accounts of prosodic development have been mostly production-oriented, and mainly focused on prosodic structure up to the word level (Demuth 2018a, 2018b; Kehoe 2018; Vihman 2018, for reviews). By focusing on the development of the production of word shapes, many current proposals do not integrate phrasal prosody or early prosodic development as manifested in infants’ speech perception and processing. In recent state-of-the-art overviews, word prosody and phrasal prosody are treated separately (Chen et al. 2020; Fikkert et al. 2020). In Demuth’s (2018a) overview of the development of prosodic structure, which is almost fully based on production data, the discussion comprises the acquisition of feet, prosodic word structures, and higher levels related to the production and prosodization of grammatical morphemes. The author highlights that it is ‘still not clear how and when the higher levels of the PP [phonological phrase] and IP [intonational phrase] are acquired’ (p. 689). Further, two fundamental disagreements persist in the literature. First, in some accounts it is argued that prosodic structure develops in a bottom-up fashion (Fikkert 1994; Demuth and Fee 1995), while others propose a top-down development (Santos 2005). Second, it is undetermined whether prosodic structure is fully available to the child, or whether acquisition begins without access to the relevant hierarchy of prosodic units (Demuth and Fee 1995). Indeed, a comprehensive account of the acquisition of prosodic structure is yet to be offered.
The model of prosodic unfolding (Frota et al. 2016) explicitly integrates high-level and low-level prosody. Put forward to account for early production data, it contends that key units of the prosodic structure of the ambient language match early on in development, and development proceeds by unfolding the different prosodic levels. Unfolding is defined as a process of expansion and nesting of prosodic units of different types, between the higher phrasal level of the intonational phrase and the lower syllable level. In the particular case of European Portuguese child speech produced between one and two years of age, prosodic phrasing was found to evolve in three major steps, as depicted in (1). In the first step, the initial production unit is constrained to a one syllable/one prosodic word (PW)/one intonational phrase (IP) unit. In the second, unfolding proceeds by expansion at the PW-level, that is the production unit is enlarged to comprise a one-PW phrase where the word may have a more complex shape. At this stage, there is a distinction between the syllable and the PW. Finally, unfolding proceeds through expansion at the phrase (IP)-level, that is the prosodic unit is enlarged at the phrase-level to comprise a prosodic phrase with more than one PW and thus a more complex phrasing pattern, as in adult speech. At this stage, there is a distinction between the PW-level and the phrase-level.
The key evidence for this developmental pattern in European Portuguese children’s speech came from word truncation data, intonation contours (namely, pitch accent distribution at the word and phrase levels and pitch reset), and pause distribution. For example, the bisyllabic target word Tatá ([taˈta], the name the child uses for herself) or maçã ([mɐˈsɐ̃], ‘apple’) is first either truncated to a monosyllabic word shape (e.g., [ˈta], [ˈmɐ], respectively) produced with a nuclear pitch contour, as shown in (1a) for Tatá, or uttered with a nuclear pitch contour in each syllable (e.g., the declarative falling contour, H+L* L%). The falling pitch accent in each syllable indicates that each syllable was produced as a PW. The presence of a nuclear contour, together with a pause separating the two syllables, indicates that the one-syllable PW was produced as a prosodic phrase. In the second step (1b), the bisyllabic target displays one pitch accent, showing that both syllables were integrated into the same PW, which receives a nuclear contour and is separated from other PWs by a pause and pitch reset. Finally, more than one PW is integrated in a prosodic unit with a single intonation contour (the typical statement contour of European Portuguese, H* H+L* L%), without a pause and/or pitch reset setting the PWs apart into different prosodic phrases (1c). This approach is in line with earlier suggestions, based on observations from several languages, that the domain of the intonation contour (in current prosodic phonology terms, the IP) serves as a production unit at the beginnings of child speech (Boysson-Bardies et al. 1981), and that the first word combinations are successions of one-word phrases (Behrens and Gut 2005). The development from the first to the second step is suggested to occur before 1;06, and the development from the second to the third step around 1;09 (Frota et al. 2016). 
Prosodic unfolding is proposed to capture the learning of prosodic structure, that is, how sound sequences are organized into prosodic groupings of varying size and complexity, which might largely precede the learning of syntactic and pragmatic aspects of language (and might facilitate such learning).
(1) [Figure: schematic of the three steps of prosodic unfolding in European Portuguese child speech, (1a)–(1c)]
The model of prosodic unfolding is thus intended to capture the development of key units of prosodic structure between the syllable and the IP (such as the foot and the PW, and possibly phrase-level structures between the PW and the IP). Differences in prosodic unfolding across languages are expected to be driven by language particular prosodic properties. For example, in a language like English, unfolding might start from a foot-PW-IP unit or include an additional step with the development of a foot-sized PW and phrase. This would be consistent with the relevant role played by the foot in the phonology and prosodic structure of English, and the presence of a minimal constraint on the format of early words, that match a binary foot (Hammond 1999; Demuth 2006). These prosodic properties contrast with the lack of evidence for the foot in the phonology of European Portuguese (Vigário 2016; see also Vigário and Martínez-Paricio 2023), together with the absence of minimality effects on early word shapes (Vigário et al. 2006). How language specific prosodic properties might impact prosodic unfolding is an empirical question that remains to be addressed. Another open empirical question is whether the prosodic unfolding view might account for early prosodic development beyond production data, namely as it is manifested in infants’ and toddlers’ speech perception and processing. Like other accounts, prosodic unfolding was proposed and further tested (Matos 2021) exclusively on the basis of production data. It has also been related to the development of prosody-based speech planning in acquisition (Shattuck-Hufnagel 2020). Importantly, its predictions for infants’ speech perception and speech processing have not yet been investigated. The main goal of the current study is to address the implications of the prosodic unfolding view for infants’ speech perception and speech processing. 
To that end, we focused on whether and how prosodic unfolding might guide the development of word segmentation abilities.
Early word segmentation has a central role in language acquisition, especially in relation to word learning and the development of syntax (Nazzi et al. 2003; Newman et al. 2006; Singh et al. 2012). It has long been recognized that prosody plays a part in early word segmentation (e.g., Christophe et al. 2003; Jusczyk and Aslin 1995), and that the age at which segmentation abilities emerge may vary across languages. The word-like units that infants begin to extract from continuous speech seem to be modulated by the rhythmic structure of their native language. Infants learning stress-timed languages, like English, German, and Dutch, start by segmenting units that begin with a stressed syllable at 7.5 months, but fail with iambic words until 10.5 months (e.g., Höhle and Weissenborn 2005; Jusczyk et al. 1999; Kooijman et al. 2009). By contrast, infants learning syllable-timed languages, such as French, Catalan, and Spanish, start by segmenting syllable-sized units (Bosch et al. 2013; Nishibayashi et al. 2015, among others) at 6 months. However, several studies failed to find evidence for bisyllabic (iambic) word segmentation by Parisian French-learning infants before 16 months of age, with successful segmentation being constrained by passage-word order of presentation, differences in infant directed speech style, or duration of the familiarization phase, unlike for English-learning infants (e.g., Nazzi et al. 2014). The advantage in segmenting both monosyllabic and bisyllabic trochaic word-forms in stress-timed languages, and segmenting monosyllabic word-forms in syllable-timed languages, has been explained by the Rhythmic Segmentation Hypothesis (Mersad et al. 2010): Given infants’ precocious sensitivity to language rhythm, infants learn a language-particular rhythmic-based segmentation procedure that they use to identify and extract word-forms from continuous speech. 
Word stress has also been suggested to be a language-specific cue that might be useful in infant word segmentation, especially in languages like English where it reliably aligns with the beginning of a word (Jusczyk et al. 1993, 1999). In addition, intonation (high pitch) has been reported to be crucial to German-learning infants’ perception of stressed syllables as word onsets (Zahner et al. 2016). Another aspect of prosody has been shown to play an important role in early segmentation abilities, namely the position of the target word at a major prosodic edge. English-learning infants at 8 months of age found words at utterance edges easier to segment than words within the utterance (Seidl and Johnson 2006). Moreover, they were able to segment target monosyllabic words at 6 months of age only if located at utterance edges (Johnson et al. 2014), and recognize syllable sequences only when they aligned with an IP boundary rather than straddling it (Shukla et al. 2011). Thus, major prosodic edges seem to provide stronger cues to word boundaries, offering a perceptual advantage that may underlie the early emergence of word segmentation. However, given the language-particular nature of the cues to prosodic edges (Frota 2012; Johnson and Seidl 2008; Wellmann et al. 2012), the function of prosodic edges as facilitators of word segmentation needs to be tested with different languages.
European Portuguese (EP) is a language with an uncommon combination of prosodic properties. The language has been described as displaying a mixed rhythm, with a set of phonetic and phonological properties that are usually found either in stress-timed (e.g., English, Dutch) or syllable-timed languages (e.g., French, Spanish; Frota and Vigário 2001). Word stress in EP also displays a conflicting set of cues (Frota et al. 2020). The frequency distribution of stress patterns fails to provide clear-cut data for a predominant pattern in the language, and the main correlates of stress are an unusual combination of vowel quality cues and duration cues. The language-specific stress cues underlie a processing advantage for iambic stress in adults and infants (Frota et al. 2020; Lu et al. 2018), and children start producing iambic word shapes sooner (and more frequently) than trochaic ones (Vigário et al. 2006). Additionally, differently from other Romance languages, EP offers strong cues to higher prosodic phrase edges as well as to PW edges, but not to lower phrase edges such as the phonological phrase (Frota 2014; Frota and Prieto 2015; Vigário 2003). Pitch range, lengthening, and pause distribution are robust cues to high-level phrasing, with the rightmost PW in the IP/utterance being the most prominent one and bearing the nuclear pitch contour; the PW-level is signaled by edge-specific phonotactics and prominence-related cues (Frota 2000; Vigário 2003). Languages with mixed rhythm, like EP, constitute a challenge to the Rhythmic Segmentation Hypothesis, which does not offer a clear prediction for the development of word segmentation in EP-learners. The position of word stress is also challenging, given that it is not a reliable indicator of either side of the word. In turn, the strong cues to major prosodic edges and to the PW could provide support to word segmentation.
The role of prosody in early segmentation abilities in European Portuguese-learning infants has been examined in two previous studies that focused on monosyllabic word-forms. Evidence for word segmentation was found as early as four–five months of age, but only for target words located at utterance edges, while segmentation within the utterance was still developing by ten months of age (Butler and Frota 2018). These findings highlight the crucial role played by the salient cues found at major prosodic edges. They also underscore a protracted development of monosyllabic word segmentation regardless of prosodic edges, which contrasts with the findings from syllable-timed languages and may be due to the mixed rhythmic nature of EP. Moreover, 12-month-old infants were able to successfully segment at utterance-internal major prosodic edges, but not at other utterance-internal positions (Severino 2016). Overall, EP-learning infants demonstrated segmentation abilities in the first year of life at major prosodic edges only, namely at utterance-final and utterance-internal intonational phrase edges. It is unknown how segmentation abilities develop beyond monosyllabic word-forms.
To the best of our knowledge, there are no studies investigating the potential interaction between the rhythmic factors that constrain the timing and development of the segmentation of different word shapes (i.e., monosyllabic, bisyllabic trochaic, or iambic word shapes), and the prosodic edge factor that constrains the location in the utterance of the word-forms that are segmented. Models of developing word segmentation that include prosody have separately explored the role of stress (Börschinger and Johnson 2014) and the role of prosodic boundaries (Ludusan and Dupoux 2016; Ludusan et al. 2022). Moreover, such models have not considered the paths of prosodic development. The unfolding prosody model of prosodic development offers a framework to explore whether and how high-level and low-level prosody shape the development of word segmentation abilities. In the current study, we examined whether developing prosodic structure might constrain early word segmentation in European Portuguese-learning infants. We hypothesized that the development of word segmentation abilities is modulated by prosodic development as predicted by the unfolding prosody model. Specifically, monosyllabic word segmentation is expected to precede bisyllabic word segmentation under similar prosodic phrasing conditions, and segmentation next to a major prosodic edge is expected to be easier than in phrase-medial positions. Moreover, the language-specific prominence patterns both at the phrase-level, where prominence is rightmost, and at the word-level, where the iambic pattern is reported to be easier to process, are expected to favor the segmentation of iambic word shapes over trochaic word shapes. 
The development of word segmentation is thus predicted to target first monosyllabic word-forms placed at a major prosodic edge (IP/utterance), followed by more complex word shapes also next to a major prosodic edge, namely if they display an iambic prominence pattern, and, finally, word shapes placed at other, less prominent, prosodic positions.
The predictions derived from the unfolding prosody model were investigated through a series of seven word segmentation experiments in EP-learning infants aged between 5 and 17 months. A modified version of the visual familiarization paradigm with a passages-first order was used, given that previous studies on the prosodic edge factor also first familiarized infants with passages containing a target word and then tested their segmentation abilities, presenting isolated word-forms that either were or were not present in the familiarization passages. Also, in line with prior work, word segmentation would be demonstrated by a difference in looking times to targets (familiar word-forms presented at the prosodic edge or prosodic medial position) and distractors (unfamiliar word-forms not present in familiarization).

2. Materials and Methods: Overview across Experiments

2.1. Participants

All the children who participated in this study were full-term, typically developing infants raised in monolingual EP homes, recruited from the wider Lisbon area. Overall, 140 children were tested and included in analyses. Data collection took place before the COVID-19 pandemic, and after December 2022 (for six of the children). This ensured that all children who participated were born and had lived up to the moment of data collection either in pre- or post-pandemic times, excluding any effects of COVID-19-related factors on early language development (such as those reported by Frota et al. 2022, specifically on the development of early word segmentation abilities).
This study was carried out in accordance with the recommendations of the European Union Agency for Fundamental Rights and the Declaration of Helsinki. The experimental procedures and informed consent protocols were approved by the ethics committees “Comissão de Ética para a Saúde do Centro Hospitalar Lisboa Norte” (Ref.ª DIRCLN-16JUL2014-208) and “Comissão de Ética para a Saúde da Administração Regional de Saúde de Lisboa e Vale do Tejo” (Proc.015/CES/INV/2014) as part of the EBELa project, and by the Ethics Committee of the School of Arts and Humanities, University of Lisbon (number 13_CEI2019) as part of the PLOs project. Informed written consent was obtained from caregivers prior to data collection.

2.2. Materials

The word segmentation experiment from Butler and Frota (2018) was used. That first study investigated the effect of major prosodic edges on the emergence of word segmentation abilities by presenting passages with monosyllabic target word-forms in two prosodic conditions (at the utterance-edge and in utterance-medial position), followed by lists of isolated words. The utterance-edge condition examined was the utterance final edge, given that this is the most prosodically prominent position in the language, due to the salient prosodic cues that characterize the rightmost PW in the utterance (namely, the presence of the nuclear pitch contour cued by pitch range and lengthening). In the series of experiments reported in the current study, the same passages were used as in Butler and Frota (2018), but the monosyllabic targets were replaced by bisyllabic targets. The bisyllabic targets were pseudowords with a CV.CV structure, the most common format for bisyllabic words in the language, both in adult- and child-directed speech (Frota et al. 2012; Vigário et al. 2006). There were two passages for each bisyllabic pseudoword: one with the pseudoword located at the utterance-edge, the most prominent position, and one with the pseudoword located in utterance-medial position, the less prominent position. In utterance-medial position, the pseudoword was aligned with a lower phrase boundary in half of the cases, and matched a lower phrase-internal word in the other half. The passages consisted of six short sentences each (between 10 and 12 syllables in length). There were no major prosodic boundaries (i.e., IP boundaries) within the sentences, and thus all utterances corresponded to a single IP. The target pseudoword was placed after a lexical word half of the time, and after a functional word the other half. This controlled variability in number of syllables and placement of the targets followed previous studies (e.g., Johnson et al. 2014; Polka and Sundara 2012; Seidl and Johnson 2006), and ensured that, overall, the stimuli were closer to the variability found in speech. Crucially, the only experimental contrast between the passages was the utterance-edge versus utterance-medial location of target pseudowords (the stimuli passages and word-forms used in the 7 experiments are provided in Appendix A).
To replicate Butler and Frota’s (2018) study as closely as possible, the passages and word lists were recorded by the same female native speaker of EP as in the initial study, and in the same infant-directed speech style, using the same Sony unidirectional microphone (sampling frequency, 22,050 Hz). A sound file was created in Audacity for each passage, with a 500 ms interval between each sentence. Each word list was created from different exemplars of a target word-form recorded in isolation, with varying intonation patterns, to ensure that the children’s task was not simply matching the acoustic patterns of the isolated word-forms with those of word-forms previously heard in the passages. Instead, children would have to extract the word-form presented in the passage and recognize new exemplars of the word-form presented in the word list. Word lists included 15 exemplars of each word-form, with a 500 ms interval between each exemplar. The passages and all the sound stimuli used are available at http://labfon.letras.ulisboa.pt/babylab/infant_word_segmentation/unfolding_prosody_supporting_materials.htm.
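The assembly of a word list described above can be illustrated with a minimal sketch. It assumes that audio is held as plain lists of samples and that silence is inserted only between consecutive exemplars (the source does not say whether a gap also follows the last item). The function names and the stand-in exemplar data are hypothetical; the actual files were built in Audacity.

```python
SAMPLE_RATE_HZ = 22_050  # sampling frequency reported for the recordings
GAP_MS = 500             # silent interval reported between exemplars

def silence(duration_ms, sample_rate=SAMPLE_RATE_HZ):
    """Return a list of zero-valued samples lasting duration_ms."""
    return [0.0] * round(sample_rate * duration_ms / 1000)

def join_with_gaps(clips, gap_ms=GAP_MS, sample_rate=SAMPLE_RATE_HZ):
    """Concatenate clips, inserting silence between consecutive clips only."""
    out = []
    for i, clip in enumerate(clips):
        if i > 0:
            out.extend(silence(gap_ms, sample_rate))
        out.extend(clip)
    return out

# Hypothetical stand-in data: 15 exemplars of 0.4 s each (real durations vary).
exemplars = [[0.1] * 8820 for _ in range(15)]
word_list = join_with_gaps(exemplars)
word_list_duration_s = len(word_list) / SAMPLE_RATE_HZ
```

With these stand-in durations, the 14 gaps alone contribute 7 s, so the silent intervals account for a large share of each test trial's maximum length.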

2.3. Procedure

A modified version of the visual familiarization paradigm was used, following Butler and Frota (2018). The procedure was the same across the seven experiments. Participants were seated on a caregiver’s lap in front of a computer monitor, with loudspeakers hidden behind the monitor. The stimuli were presented at approximately 70 (±5) dB intensity. A camera was placed above the monitor to record the experimental session. The experiment began with a baby-friendly, attention-getting image. After the child fixated that image for two consecutive seconds, a trial started. Each trial consisted of a red display paired with a sound file. Trial duration was controlled by the child, i.e., the trial stopped if the infant looked away from the screen for more than two seconds, or the sound file ended. Then, the attention getter was presented again.
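The infant-controlled trial rule described above can be sketched as follows. This is an illustration only, not the logic of the software used in the study: it assumes look-aways are available as sorted (start, end) intervals in seconds, and it assumes that time spent looking away is excluded from the looking-time total, which the text does not specify. All names are hypothetical.

```python
MAX_LOOK_AWAY_S = 2.0  # look-away threshold stated in the procedure

def trial_end_time(look_aways, sound_duration_s, max_away_s=MAX_LOOK_AWAY_S):
    """Return when the trial stops: at the first look-away exceeding the
    threshold (stopping once the threshold is reached), or at sound offset."""
    for start, end in look_aways:
        if start >= sound_duration_s:
            break
        cutoff = start + max_away_s
        if end - start > max_away_s and cutoff < sound_duration_s:
            return cutoff
    return sound_duration_s

def looking_time(look_aways, end_s):
    """Time spent looking at the screen up to end_s (assumption: look-away
    time is subtracted from the elapsed trial time)."""
    away = sum(min(end, end_s) - start
               for start, end in look_aways if start < end_s)
    return end_s - away

# Example: a 1 s glance away at 3 s, then a long look-away starting at 8 s.
aways = [(3.0, 4.0), (8.0, 11.5)]
end = trial_end_time(aways, sound_duration_s=20.0)
looked = looking_time(aways, end)
```

In this example the short glance does not end the trial, whereas the second look-away does once it has lasted two seconds.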
A segmentation experiment consisted of two phases: familiarization and test. Each participant was familiarized with two passages, one with a target word-form in utterance-edge position, and the other with a different target word-form in utterance-medial position. The two passages were presented alternately until 25 s of looking time accumulated for each passage. Once this criterion was reached, the familiarization phase ended, and the test phase immediately followed. This ensured that all infants were exposed to the familiarization stimuli for the same amount of time. In the test phase, four different word lists were used, each containing exemplars of a different target word-form. Two of the lists had the two target words heard during familiarization (one presented in utterance-edge position, and one presented in utterance-medial position), and the other two lists had target words unfamiliar to the children (i.e., not presented during familiarization). Each of the word lists was randomly presented three times, split into three blocks of four test trials so that each word list was heard once before any word list was presented for a second time, and each word list was heard twice before any word list was presented for a third time. After the presentation of the 12 (4 × 3) test trials, the experiment ended. Importantly, the methods followed were very similar to those implemented in earlier studies on other languages, namely passage size/length and complexity (Jusczyk et al. 1999; Nazzi et al. 2014), number of exemplars of each target word in a word list in the test phase, and structure of the test phase (Johnson et al. 2014; Nazzi et al. 2014; Seidl and Johnson 2006). The duration of the familiarization phase was the same as in Seidl and Johnson (2006). The LOOK software (Meints and Woodford 2005) was used to control stimuli presentation and to record participants’ time looking to the screen, which was coded online by an experimenter. 
The experimenter, who was blind to the experimental conditions and wore headphones playing masking music, was hidden from the participants’ view. In this paradigm, any consistent difference in looking times to familiar word-forms in utterance-edge position, familiar word-forms in utterance-medial position, and unfamiliar word-forms is taken as an indication of segmentation abilities (e.g., Johnson et al. 2014; Seidl and Johnson 2006).
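The blocked randomization of the test phase described above can be sketched in a few lines. This is a minimal illustration under the assumption that order is randomized independently within each block; the list labels and function name are hypothetical and do not reflect how the LOOK software implements trial ordering.

```python
import random

def blocked_test_order(word_lists, n_blocks=3, rng=None):
    """Return a trial order in which every word list occurs once per block,
    shuffled independently within each block."""
    rng = rng or random.Random()
    order = []
    for _ in range(n_blocks):
        block = list(word_lists)
        rng.shuffle(block)
        order.extend(block)
    return order

# Hypothetical labels for the four test lists.
LISTS = ["edge_familiar", "medial_familiar", "unfamiliar_1", "unfamiliar_2"]
trials = blocked_test_order(LISTS, rng=random.Random(0))  # seeded for reproducibility
```

Shuffling within blocks, rather than shuffling all 12 trials at once, is what guarantees that each list is heard once before any list repeats.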

3. Segmentation of Bisyllabic Iambic Word-Forms

The two former studies on early segmentation abilities in EP-learning infants (Butler and Frota 2018; Severino 2016) demonstrated that in the first year of life, segmentation of monosyllabic word-forms was successful only at major prosodic edges (utterance/IP-final edge). In particular, segmentation of monosyllabic targets at major prosodic edges was found from four to five months of age. If unfolding prosody guides the development of word segmentation as hypothesized in the current study, it is predicted that, in EP, segmentation of monosyllabic word-forms placed at a major prosodic edge critically precedes segmentation of more complex word shapes under similar prosodic conditions. Secondly, segmentation of more complex word shapes is expected to be easier at a major prosodic edge than in intonational phrase-medial position. In addition, given that both phrase and word prominence patterns suggest that the iambic pattern is easier to process and acquire (Frota 2014; Frota et al. 2020; Lu et al. 2018; Vigário et al. 2006), the first type of complex word shapes to be segmented are expected to display an iambic prominence pattern. The following series of experiments were run to test these predictions. The finding that successful bisyllabic segmentation develops first at major prosodic edges, but later than the segmentation of simple monosyllabic word shapes, would provide supporting evidence for the unfolding prosody model beyond production.

3.1. Experiment 1

In a previous study, evidence for segmentation of monosyllabic words was found early in development, from four to five months of age. In Experiment 1, we focused on 5–7-month-old infants to examine whether they already demonstrated segmentation abilities for bisyllabic iambic word shapes.

3.1.1. Participants

Twenty infants were tested and included in the analysis (mean age 6 months 13 days; age range 4 months 25 days–7 months 14 days; eleven girls and nine boys). One additional infant was tested but excluded due to fussiness.

3.1.2. Materials

Four bisyllabic pseudowords with iambic stress were used in Experiment 1: FISSÁ [fiˈsa], CANÉ [kɐˈnɛ], PINÓ [piˈnɔ], and SUTÉ [suˈtɛ]. The onset consonants chosen are frequent in the language, and the pseudowords end with an open vowel matching the common format of CV.CV words with an iambic stress pattern. Acoustic measurements of the target pseudowords in the utterance-edge and utterance-medial conditions can be found in Table 1. Both word duration and pitch range were measured, given that they constitute robust cues to high-level prosodic phrasing in the language. Pitch range measured the difference between the lowest pitch value and the highest pitch value in the word. As expected, the pseudowords in the utterance-edge condition display pre-boundary lengthening together with a greater pitch range in comparison to the pseudowords in the utterance-medial condition. The larger pitch range is manifested by a pitch fall, due to the presence of a low boundary tone at the utterance/IP edge (annotated as L% in Table 1, according to the labelling conventions within the auto-segmental metrical framework of intonational phonology; Frota 2014; Ladd 2008). By contrast, no boundary tones are found in IP-medial position.

3.1.3. Procedure

The pseudowords were paired for passage presentation, and half of the infants were familiarized with FISSÁ and CANÉ, and the other half with PINÓ and SUTÉ. Consequently, two of the four pseudowords were familiar targets in the test phase of the experiment, whereas the other two were unfamiliar pseudowords. During passage presentation, the position of the target word in the utterance was counterbalanced (i.e., half of the participants heard FISSÁ at the utterance-edge, and for the other half FISSÁ was presented in the utterance-medial position, and so on).
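The counterbalancing described above can be illustrated with a minimal sketch. The assignment rule (alternating by participant index) and the function name `assign` are hypothetical; the text only states that familiarization pair and target position were balanced across infants, not how individual infants were allocated.

```python
from collections import Counter

# Pseudoword pairs used for familiarization in Experiment 1 (from the text).
PAIRS = [("FISSÁ", "CANÉ"), ("PINÓ", "SUTÉ")]


def assign(n_infants=20):
    """Return one familiarization plan per infant (hypothetical scheme)."""
    plans = []
    for i in range(n_infants):
        pair = PAIRS[i % 2]           # alternate the familiarized pair
        swap = (i // 2) % 2 == 1      # alternate which word sits at the edge
        edge, medial = (pair[1], pair[0]) if swap else pair
        unfamiliar = PAIRS[(i + 1) % 2]  # the other pair is unfamiliar at test
        plans.append({"edge": edge, "medial": medial, "unfamiliar": unfamiliar})
    return plans


plans = assign()
edge_counts = Counter(p["edge"] for p in plans)
medial_counts = Counter(p["medial"] for p in plans)
print(edge_counts)
```

With 20 infants, this scheme places each of the four pseudowords at the utterance-edge for five infants and in utterance-medial position for five others.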

3.1.4. Results and Discussion

The participants’ online-coded looking time was extracted, and, as in Butler and Frota’s (2018) original study, the looking time data were analyzed by means of ANOVA.
Average looking times in the test phase for the three experimental conditions (familiar targets in utterance-edge position, familiar targets in utterance-medial position, and unfamiliar word-forms) are plotted in Figure 1. A repeated measures ANOVA with a within-subject factor of condition (edge, medial, unfamiliar) found no significant differences in looking time (F(1.330, 25.271) = 0.982, p = 0.356, η2 = 0.049; degrees of freedom were corrected using Greenhouse–Geisser estimates, given that the assumption of sphericity was violated, χ2(2) = 0.496, p = 0.002). The results thus indicate that 5–7-month-old infants are not segmenting bisyllabic iambic word shapes.
The findings from Experiment 1 contrast with earlier results from same-age infants showing segmentation abilities for monosyllabic word-forms in the utterance-edge condition (Butler and Frota 2018). The present results suggest that segmentation of monosyllabic word-forms in the utterance-edge condition precedes segmentation of the more complex bisyllabic word shapes, also in the utterance-edge condition. These findings are in line with the predictions of the unfolding prosody model. However, it is unknown when and how segmentation abilities for bisyllabic iambic word-forms emerge. To further investigate this question, a group of older infants was tested in Experiment 2.
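For readers less familiar with the analysis, the following is a minimal sketch of a one-way repeated measures ANOVA of the kind reported above. It is not the analysis script used in the study: the function name `rm_anova` and the toy looking times are invented for illustration, and no sphericity correction is applied.

```python
def rm_anova(data):
    """One-way repeated measures ANOVA.

    data: one row per participant, one value per condition.
    Returns (F, df_condition, df_error, partial_eta_squared).
    """
    n = len(data)      # number of participants
    k = len(data[0])   # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total variability into condition, participant, and error.
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f = (ss_cond / df_cond) / (ss_error / df_error)
    eta_p2 = ss_cond / (ss_cond + ss_error)
    return f, df_cond, df_error, eta_p2


# Invented looking times (s) for three infants: edge, medial, unfamiliar.
toy = [[1, 2, 3], [2, 4, 3], [3, 3, 6]]
f, df1, df2, eta = rm_anova(toy)
print(f, df1, df2, eta)
```

With 20 infants and three conditions, the uncorrected degrees of freedom are 2 and 38. When sphericity is violated, a Greenhouse–Geisser correction multiplies both values by an estimated ε between 0.5 and 1 for three conditions, leaving F itself unchanged.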

3.2. Experiment 2

In Experiment 2, 8–10-month-old infants were tested to examine whether segmentation abilities for bisyllabic iambic word-forms have already emerged by this age, especially at the utterance-edge.

3.2.1. Participants

Twenty infants were tested and included in the analysis (mean age 9 months 6 days; age range 7 months 18 days–10 months 6 days; nine girls and eleven boys). Five additional infants were tested but excluded due to fussiness (4) and camera failure (1).

3.2.2. Materials and Procedure

The materials and procedure were the same as in Experiment 1.

3.2.3. Results and Discussion

Average looking times in the test phase for the three experimental conditions (familiar targets in edge position, familiar targets in medial position, and unfamiliar word-forms) are shown in Figure 1. A repeated measures ANOVA with the within-subject factor of condition (edge, medial, unfamiliar) revealed no effect (F(2, 38) = 3.236, p = 0.171, η2 = 0.089), showing that segmentation abilities for bisyllabic iambic word-forms have not emerged yet.
Previous findings on the segmentation of bisyllabic words with an iambic stress pattern by American English-learning 7.5-month-old infants showed a mis-segmentation strategy whereby the stressed syllable was treated as a word onset (Jusczyk et al. 1999). French-learning infants have also been shown not to segment bisyllabic words at 6 months and 8 months of age, while segmenting syllables embedded in bisyllabic words (Goyet et al. 2013; Nazzi et al. 2014; Nishibayashi et al. 2015). These earlier findings raise the question whether the EP-learning infants in Experiment 2 might be unable to segment whole bisyllabic words, but would show successful segmentation of the embedded stressed syllable. In other words, whether they would be processing the stressed syllable as a monosyllabic target and thus apply a monosyllabic segmentation strategy, especially in the prosodic edge condition. This possibility was explored in Experiment 3.

3.3. Experiment 3

In Experiment 3, another group of 8–10-month-old infants were tested to investigate the possibility that at this age infants are using a monosyllabic segmentation strategy and extracting the stressed syllable instead of segmenting the whole bisyllabic iambic word.

3.3.1. Participants

Twenty infants were tested and included in the analysis (mean age 8 months 28 days; age range 7 months 23 days–10 months 5 days; nine girls and eleven boys). Four additional infants were tested but excluded due to fussiness (2) and living in a bilingual household (2).

3.3.2. Materials and Procedure

The materials and procedure were the same as in Experiments 1 and 2, with one difference. In the test phase, infants were presented with monosyllabic word-forms instead of the bisyllabic targets. Four word-lists were used, each containing exemplars of the stressed syllable of each of the four bisyllabic iambic targets (FISSÁ, CANÉ, PINÓ, and SUTÉ) produced as a monosyllabic word: SÁ [ˈsa], NÉ [ˈnɛ], NÓ [ˈnɔ], and TÉ [ˈtɛ]. Two of the lists had monosyllabic pseudowords that matched the stressed syllable of the target words heard during familiarization, and the other two had monosyllabic pseudowords unfamiliar to the infants.

3.3.3. Results and Discussion

A repeated measures ANOVA with a within-subject factor of condition (edge, medial, unfamiliar) showed no significant differences in looking time (F(2, 38) = 1.259, p = 0.296, η2 = 0.062; edge M = 6.44, SD = 0.51; medial M = 5.81, SD = 0.55; unfamiliar M = 6.79, SD = 0.37). Thus, Experiment 3 did not reveal that infants were using a monosyllabic segmentation strategy.
The failure to segment the stressed syllable embedded in the bisyllabic iambic target, together with the unsuccessful segmentation of the whole bisyllabic word shown in Experiment 2, might indicate that the embedded syllable, by not matching the full properties of a monosyllabic word, is not recognized as such, suggesting that it is already being processed as part of a larger unit. However, the larger unit is not yet identified and extracted from continuous speech. To further explore the development of segmentation abilities for bisyllabic iambic words, a group of older infants was tested in Experiment 4.

3.4. Experiment 4

The aim of Experiment 4 was again to assess bisyllabic segmentation as in Experiments 1 and 2, to determine whether segmentation abilities for the more complex bisyllabic iambic word shape would emerge by the end of the first year of life.

3.4.1. Participants

Twenty infants were tested and included in the analysis (mean age 12 months 1 day; age range 10 months 29 days–12 months 29 days; ten girls and ten boys). Three additional infants were tested but excluded because they were distracted during the experiment.

3.4.2. Materials and Procedure

The materials and procedure were the same as in Experiments 1 and 2.

3.4.3. Results and Discussion

Average looking times in the test phase for the three experimental conditions (edge, medial and unfamiliar) are shown in Figure 1. A repeated measures ANOVA with the within-subject factor of condition found no effect (F(2, 38) = 2.448, p = 0.621, η2 = 0.025), showing that segmentation abilities for bisyllabic iambic word-forms have not emerged by the end of the first year of life.
The present findings are not totally unexpected. Several studies on word segmentation in French-learning infants have reported that successful bisyllabic segmentation before 12 months of age is constrained by different factors, such as passage-word order, distributional information, and duration of familiarization (Goyet et al. 2013; Nazzi et al. 2006, 2014; Nishibayashi et al. 2015). In Experiment 5, we investigated the possibility that, under similar experimental conditions as in Experiments 1, 2, and 4, successful segmentation of bisyllabic iambic word-forms in EP-learning infants would develop after 12 months of age.

3.5. Experiment 5

3.5.1. Participants

Twenty infants participated in this experiment (mean age 14 months 17 days; age range 13 months 2 days–17 months 17 days; six girls, fourteen boys). Two additional infants were tested but excluded due to caregiver interference (1) and for being premature (1).

3.5.2. Materials and Procedure

The materials and procedure were the same as in Experiments 1, 2, and 4.

3.5.3. Results and Discussion

The average looking times in the test phase for the three experimental conditions (familiar targets in utterance-edge position, familiar targets in utterance-medial position, and unfamiliar word-forms) are shown in Figure 1. As can be seen, looking times to bisyllabic iambic familiar targets in utterance-edge position were the longest, followed by looking times to familiar targets in utterance-medial position, and looking times to unfamiliar pseudowords were the shortest. A repeated measures ANOVA with the within-subject factor of condition (edge, medial, unfamiliar) revealed a significant effect (F(2, 38) = 6.304, p = 0.004, η2 = 0.249). Paired t-tests were carried out, comparing the three experimental conditions to each other (the corrected p value of 0.02 was used due to multiple comparisons). There were significant differences between edge and unfamiliar (t(19) = 3.318, p = 0.004, Cohen’s d = 0.74) and medial and unfamiliar (t(19) = 2.695, p = 0.014, Cohen’s d = 0.60), but not between edge and medial (t(19) = 1.339, p = 0.196, Cohen’s d = 0.30).
These results demonstrate evidence for segmentation of bisyllabic iambic word-forms both in utterance-edge and utterance-medial position. However, a larger difference was found between looking times to target words at the prominent edge position and unfamiliar words than between looking times to target words in the less prominent phrase-medial position and unfamiliar words, which was reflected in the magnitude of the effect sizes, with a large effect size for the former and a medium effect size for the latter (Lakens 2013). This suggests that the segmentation of bisyllabic iambic targets was easier at a major prosodic edge than in intonational phrase-medial position.
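The reported effect sizes can be related to the test statistics with a short sketch. Two assumptions are made here that the text does not state explicitly: that η2 is partial eta squared, and that Cohen's d for the paired comparisons was computed as t divided by the square root of the number of infants. The function names are illustrative only.

```python
import math


def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


def cohens_dz(t, n):
    """Cohen's d for a paired comparison, assuming d = t / sqrt(n)."""
    return t / math.sqrt(n)


# Values reported for Experiment 5 (20 infants).
eta_p2 = partial_eta_squared(6.304, 2, 38)
d_edge = cohens_dz(3.318, 20)          # edge vs. unfamiliar
d_medial = cohens_dz(2.695, 20)        # medial vs. unfamiliar
d_edge_medial = cohens_dz(1.339, 20)   # edge vs. medial
print(round(eta_p2, 3), round(d_edge, 2), round(d_medial, 2), round(d_edge_medial, 2))
```

Under these assumptions, the computed values round to the reported η2 = 0.249 and Cohen's d = 0.74, 0.60, and 0.30.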
To further explore any differences between segmentation abilities at the prosodic edge and in medial position, an analysis was carried out on developing segmentation abilities for bisyllabic iambic word-forms across the experiments, with each condition (edge, medial, unfamiliar) analyzed separately. The data from Experiments 1, 2, 4, and 5 were analyzed through Linear Regression models with looking time to familiar targets in utterance-edge position, familiar targets in utterance-medial position, and unfamiliar word-forms as the dependent variables, and experiment (age group) as the predictor variable. Although the overall fit of the models was modest, the results showed that the predictor variable significantly affected the dependent variable only for looking time to familiar targets in edge position (edge: R2 = 0.06, F(1, 78) = 4.565, p = 0.036; medial: R2 = 0.04, F(1, 78) = 3.591, p = 0.062; and unfamiliar: R2 = 0.01, F(1, 78) = 0.772, p = 0.382). In other words, segmentation abilities at a major prosodic edge developed further across age groups than segmentation abilities in intonational phrase-medial position. These results are depicted in Figure 2.
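The regression analysis can be sketched as a simple least-squares fit of looking time on age group. This is an illustration only: the numeric coding of age group (1 to 4), the toy data, and the function name `simple_regression` are assumptions, as the text does not specify how the predictor was coded.

```python
def simple_regression(x, y):
    """Least-squares fit of y on a single predictor x.

    Returns (slope, intercept, r_squared, F, (df_model, df_error)).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)

    slope = sxy / sxx
    intercept = my - slope * mx
    ss_model = slope * sxy            # variability explained by the predictor
    ss_error = ss_total - ss_model
    r_squared = ss_model / ss_total
    f = ss_model / (ss_error / (n - 2))
    return slope, intercept, r_squared, f, (1, n - 2)


# Invented data: age group coded 1-4, one looking time (s) per group.
slope, intercept, r2, f, dfs = simple_regression([1, 2, 3, 4], [2, 4, 5, 9])
print(slope, intercept, r2, f, dfs)
```

With four experiments of 20 infants each, a model of this form has 1 and 78 degrees of freedom, matching the F(1, 78) values reported above.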
The present findings show that, unlike monosyllabic word segmentation that emerges at the prosodic edge by 4–5 months of age (Butler and Frota 2018), the segmentation of bisyllabic iambic word-forms emerges much later, after 12 months of age. In addition, although bisyllabic segmentation was found in both edge and medial prosodic conditions, our findings show that it was easier (and further developed) at the prosodic edge. These findings are in line with two predictions of the unfolding prosody model on the development of early word segmentation in European Portuguese: (i) segmentation abilities for monosyllabic word-forms placed at a major prosodic edge precede the segmentation of more complex word shapes under similar prosodic conditions; and (ii) segmentation of more complex word shapes is easier at a major prosodic edge than in intonational phrase-medial position. Another prediction related to the unfolding prosody model is that the first type of complex word shapes to be segmented are expected to display iambic prominence. In EP, a complex word shape with iambic prominence next to a major prosodic edge yields a prosodic configuration where the stressed syllable is simultaneously the boundary syllable, providing the strongest cues for the salient edge position. The expectation that the segmentation of iambic word-forms precedes the segmentation of trochaic word-forms in EP is addressed in the next section.

4. Segmentation of Bisyllabic Trochaic Word-Forms

The findings from Experiments 1 to 5 established that in EP segmentation of bisyllabic iambic target words emerges after 12 months of age. Moreover, and as expected, segmentation was found to be easier at a major prosodic edge, the utterance-edge-final position. The following experiments examine the development of segmentation abilities for bisyllabic trochaic word-forms. The finding that bisyllabic iambic target words are segmented before bisyllabic trochaic words would provide additional evidence for the unfolding prosody model, and would strengthen previous findings on a processing advantage for iambic prominence in the language.

4.1. Experiment 6

In Experiment 6, segmentation abilities for bisyllabic trochaic word-forms are examined in 11–12-month-old EP-learning infants. At this age, infants were shown not to be able to segment bisyllabic iambic words (Experiment 4 above).

4.1.1. Participants

Twenty infants were tested and included in the analysis (mean age 11 months 22 days; age range 10 months 25 days–12 months 28 days; six girls and fourteen boys). Four other infants were tested but excluded due to fussiness (2), sleepiness (1), and living in a bilingual household (1).

4.1.2. Materials and Procedure

The materials and procedure were the same as in the previous experiments, with a crucial difference. Instead of bisyllabic iambic targets, bisyllabic pseudowords with trochaic stress were used in Experiment 6. The four trochaic pseudowords were FESSA [ˈfɛsa], CANO [ˈkanu], PENO [ˈpɛnu], and SOTA [ˈsɔta]. Thus, the same CV.CV format was used, as well as the same set of frequent onset consonants in the language. Acoustic measurements of the trochaic targets in the utterance-edge and utterance-medial conditions are reported in Table 2. We focused on word duration and pitch range measures, as they constitute robust cues to high-level prosodic phrasing patterns in the language. As for iambic targets, the trochaic pseudowords in the edge condition show pre-boundary lengthening as well as a larger pitch range in comparison to trochaic pseudowords in the medial condition. The larger pitch range is due to the pitch fall, which is the acoustic manifestation of the low boundary tone at the utterance/IP edge (annotated as L% in Table 2). Again, no boundary tones were found in IP-medial position.
Following a similar procedure as in the previous experiments, the trochaic pseudowords were paired for passage presentation. Thus, half of the infants were familiarized with FESSA and CANO, and the other half with PENO and SOTA. Accordingly, two of the four pseudowords were familiar targets in the test phase of the experiment, and the other two were unfamiliar pseudowords. Moreover, the position of the target word in the utterance was counterbalanced in passage presentation (i.e., half of the infants heard FESSA at the utterance-edge, and the other half heard FESSA in utterance-medial position, and so on).

4.1.3. Results and Discussion

A repeated measures ANOVA with the within-subject factor of condition (edge, medial, unfamiliar) showed no significant differences in looking time (F(2, 38) = 0.086, p = 0.919, η2 = 0.005; edge M = 8.62, SD = 2.47; medial M = 8.32, SD = 3.25; unfamiliar M = 8.28, SD = 3.54). The results reveal that 11–12-month-old infants are not segmenting bisyllabic trochaic word shapes. Therefore, segmentation abilities for bisyllabic trochaic word-forms do not emerge before segmentation abilities for bisyllabic iambic word-forms, in line with our predictions. However, it may be the case that segmentation abilities for bisyllabic targets, whether iambic or trochaic, emerge together after 12 months of age. This possibility is investigated in Experiment 7.

4.2. Experiment 7

The goal of this experiment is to determine whether successful segmentation of bisyllabic trochaic word-forms in EP-learning infants develops at the same time as segmentation of bisyllabic iambic word-forms. Thus, Experiment 7 focused on the same age group as Experiment 5, that is the age group that demonstrated successful segmentation for iambic targets.

4.2.1. Participants

Twenty infants participated in this experiment (mean age 14 months 9 days; age range 13 months 1 day–17 months 1 day; nine girls and eleven boys). The infants in this group and the infants in the bisyllabic iambic segmentation experiment did not differ in age (t(38) = −0.875, p = 0.387).

4.2.2. Materials and Procedure

The materials and procedure were the same as in Experiment 6.

4.2.3. Results and Discussion

The average looking times in the test phase for the three experimental conditions—familiar targets in utterance-edge position, familiar targets in utterance-medial position, and unfamiliar word-forms—are plotted in Figure 3 (left). A repeated measures ANOVA with the within-subject factor of condition (edge, medial, unfamiliar) revealed no effect (F(2, 38) = 0.330, p = 0.721, η2 = 0.017), showing that segmentation abilities for bisyllabic trochaic word-forms have not emerged yet. These results contrast with the findings from Experiment 5 on bisyllabic iambic segmentation.
The looking times from the current experiment on trochaic segmentation were compared to the looking times from the experiment on iambic segmentation (Figure 3, right) by means of a 3 (condition: edge, medial, unfamiliar) × 2 (stress pattern: trochaic, iambic) mixed ANOVA. There was no main effect of condition (F(2, 76) = 1.801, p = 0.172, η2 = 0.045) or of stress pattern (F(1, 38) = 0.587, p = 0.448, η2 = 0.015). Importantly, a significant interaction was found between condition and stress pattern (F(2, 76) = 3.773, p = 0.027, η2 = 0.090). Multiple comparisons (Bonferroni corrected) showed a significant difference between edge and unfamiliar for the iambic stress pattern only (p = 0.01). No other differences were found (p > 0.1). These results confirm that infants behaved differently in Experiments 5 and 7, successfully segmenting iambic targets (Experiment 5) but not trochaic targets (Experiment 7). Moreover, these results corroborate the finding that segmentation of more complex word shapes, i.e., bisyllabic iambic targets, is further developed at a major prosodic edge than in intonational phrase-medial position.
The expectation that segmentation of iambic word-forms precedes the segmentation of trochaic word-forms was thus borne out, providing additional support for the unfolding prosody model and strengthening earlier findings on a processing advantage for iambic prominence in the language.

5. Summary of All Experiments and Segmentation Results

In the current study, a series of word segmentation experiments was carried out to examine whether the unfolding prosody model successfully predicted the development of word segmentation abilities in EP-learning infants. A summary of all the experiments and segmentation results is provided in Table 3. The word segmentation experiment from Butler and Frota (2018) was used, with the original monosyllabic targets being replaced by bisyllabic targets.
We have shown that word segmentation abilities develop as follows: (i) monosyllabic word-forms placed at a major prosodic edge are segmented first; (ii) segmentation of more complex word shapes next to a major prosodic edge follows, being easier/further developed than in intonational phrase-medial position; and (iii) the first type of complex word shapes to be segmented display an iambic prominence pattern. These findings are in line with the predictions of the unfolding prosody model on the development of early word segmentation in European Portuguese.

6. General Discussion

The goal of the present study was to examine whether developing prosodic structure might modulate early word segmentation in European Portuguese-learning infants. We put forward the hypothesis that the development of word segmentation abilities is shaped by prosodic development as predicted by the unfolding prosody model. Using a modified version of the visual familiarization paradigm with passages-first order, we tested three predictions: (i) segmentation of monosyllabic word-forms placed at a major prosodic edge (i.e., the utterance-edge-final position) precedes segmentation of more complex word shapes under similar prosodic conditions; (ii) segmentation of more complex word shapes is easier at a major prosodic edge than in phrase-medial position; and (iii) segmentation of complex word shapes with an iambic prominence pattern is favored over segmentation of word shapes with a trochaic pattern.
We have shown that, unlike monosyllabic word segmentation, which was found to emerge by 4–5 months of age for word-forms located at the utterance-edge-final position (Butler and Frota 2018), evidence for bisyllabic segmentation is found much later, after 12 months of age, confirming the first prediction. We have also found evidence for more robust segmentation of bisyllabic word-forms at a major prosodic edge than in intonational phrase-medial position in 13–17-month-old infants, confirming the second prediction. Finally, we have shown evidence that by 14 months segmentation abilities for bisyllabic iambic word-forms are well-developed, but the segmentation of bisyllabic trochaic word-forms has not emerged yet, confirming the third prediction. In the next paragraphs, we discuss the present findings in more detail.
The unfolding prosody model contends that prosodic development proceeds by the gradual unfolding of key prosodic domains of the ambient language, through a process of expansion and nesting of prosodic units of different types between the higher (utterance/IP) phrase-level and the lower syllable level (Frota et al. 2016), as illustrated in (1) above on the basis of early production data from EP. Our finding that monosyllabic word-forms placed at a major prosodic edge are segmented first, whereas segmentation of more complex word shapes also located at a major prosodic edge emerges much later, is in line with the initial step of prosodic unfolding as proposed for EP. In this first step, the initial production unit is constrained to a one syllable/one prosodic word/one intonational phrase unit. Similarly, the first word-form that is segmented is a one syllable–one prosodic word that is crucially placed at the utterance/IP edge. This suggests that the initial prosodic unit plays a role not only as a planning and production unit in early child speech, but also as the unit that infants use to start to identify and extract word-forms from continuous speech in a prosody-based segmentation procedure. Earlier findings from other languages seem to support this approach. On the production side, it has been suggested that the domain of the intonation contour (in other words, the IP) serves as a production unit at the beginnings of child speech (Boysson-Bardies et al. 1981). In addition, the beginnings of word production in some languages, like EP and possibly French, seem to be constrained to a one-syllable unit, while in languages like English or Dutch the shape of early words seems to be constrained to a binary foot (Demuth 2006, 2018a). On the perception and speech segmentation side, it has been shown that in some languages, like French and EP, infants start by segmenting syllable-sized units (Butler and Frota 2018; Nazzi et al. 2014; Nishibayashi et al. 2015), whereas in others, like English, German, and Dutch, they start by segmenting more complex, foot-sized units (Höhle and Weissenborn 2005; Jusczyk et al. 1999; Kooijman et al. 2009). Interestingly, at least for some of the languages studied from both perspectives, a similar prosodic unit seems to be shaping the early production and word segmentation outcomes. The extent to which developing prosodic structure is evidenced in speech perception, through sensitivity to language-particular cues to word segmentation, prior to appearing in speech production, is an empirical question that demands future experimental research on prosodic unfolding in various languages.
The present finding that segmentation of more complex word shapes is more robust and further developed at a major prosodic edge than in intonational phrase-medial position is in line with the second step of prosodic unfolding. In EP, this second step is characterized by the development of a more complex prosodic word shape that is produced as a major prosodic phrase. Correspondingly, stronger segmentation abilities were found for bisyllabic word-forms located at the utterance/IP edge. This suggests that the now enlarged production unit, which comprises a one-PW phrase where the word has a more complex shape, plays a role both as a planning and production unit in early child speech, and as a unit used to extract word-forms from the speech stream. Previous findings from other languages seem to provide some support for this view. It has long been reported that first word combinations are produced as successions of single-word phrases (Behrens and Gut 2005). Although most infant word segmentation studies did not control for prosodic structure, in languages that show an initial syllable-based segmentation strategy, such as French, as demonstrated by 6-month-old infants, bisyllabic segmentation has been shown to develop much later, after 8 or even 12 months of age (Goyet et al. 2013; Nazzi et al. 2006, 2014; Nishibayashi et al. 2015). Findings reported in Nazzi et al. (2014) showed a gap of approximately 10 months between French-learning infants’ ability to segment monosyllabic words and bisyllabic (iambic) words, using the same segmentation paradigm for the different age groups that was also used in Jusczyk et al. (1999) with English-learning infants. Unlike French-learning infants, English-learning infants do not demonstrate the developmental gap between monosyllabic and bisyllabic (trochaic) word segmentation, while EP-learning infants show a developmental pattern more similar to the French-learning infants.
However, bisyllabic word segmentation in different languages needs to be revisited in studies that specifically address the role of prosodic structure. Studies on early child speech that examine the prosodic status of the words produced and the properties of higher-level prosodic phrasing are also required to further address the potential of the prosodic unfolding approach across languages.
We have also found that segmentation abilities for complex word shapes with an iambic prominence pattern are well-developed before segmentation abilities for word shapes with a trochaic pattern emerge. This result is in line with earlier findings on phrase-level and word-level prominence in EP. Prominence above the word-level has been characterized as indisputably rightmost (Frota 2014; Vigário 2010). Moreover, several studies on the perception and acquisition of word-level prominence have provided converging evidence for an advantage of the iambic pattern, which is easier to process and to acquire (Correia 2009; Frota et al. 2020; Lu et al. 2018; Vigário et al. 2006). Crucially, a prosodic word with iambic stress located at the utterance/IP edge results in the alignment of prominence at the word and phrase-levels. In this configuration, the prominent syllable is also the boundary syllable, providing the strongest cues for the salient prosodic edge position. It is thus not surprising that a one-PW phrase with iambic prominence is easier to produce, and an iambic PW placed at a major prosodic edge is easier to segment. The interplay between prominence patterns and prosodic structure in unfolding prosody is expected to be language-specific. For example, in a language like Hungarian, where stress generally falls on the initial syllable of the prosodic word and phrase-level stress is argued to be initial on the IP (Szendröi 2003; Varga 2002; Vogel and Kenesei 1987), a word-form with initial prominence placed at the initial IP-edge is expected to provide the strongest cues, favoring word segmentation. Studies on other languages are required to address these predictions. This would include studies not only looking at the edge-final position, but also the edge-initial position, in particular in languages where utterance onsets are very salient. Moreover, future research should also address when the segmentation of bisyllabic trochaic words emerges in EP-learning infants.
The findings from the current study, together with previous findings on early word segmentation in EP, demonstrate that segmentation abilities for (simple and complex) word-forms located at the utterance/IP edge develop before, and are stronger than, segmentation abilities for word-forms in phrase-medial position. In Butler and Frota (2018), successful segmentation of monosyllabic word-forms at the prosodic edge was shown by 4–5 months of age, whereas phrase-medial segmentation was still developing by 11 months of age. Indeed, 12-month-old infants were still not able to segment phrase-medial monosyllabic words (Severino 2016), suggesting that this ability develops after 12 months of age. In the present study, phrase-medial segmentation was found for bisyllabic iambic words after 12 months. However, segmentation abilities in phrase-medial position were not as robust as segmentation abilities at a major prosodic edge. These results are in line with the third step of prosodic unfolding. This third step is characterized by the development of a more complex phrasing pattern, with phrases that comprise more than one prosodic word. Since this is a later development in prosodic unfolding, the role of the complex phrase unit in the planning and production of child speech, and in word segmentation, is manifested later. On the production side, multiword prosodic phrases have been reported to succeed the production of sequential single-word phrases (Behrens and Gut 2005; D’Odorico and Carubbi 2003). On the word segmentation side, studies on American English-learning infants have demonstrated segmentation abilities for monosyllabic words at the utterance/IP edge at 8 and 6 months of age, but not at phrase-medial position (Johnson et al. 2014; Seidl and Johnson 2006). Moreover, American English-learning infants were able to segment words placed either at the utterance-initial or utterance-final edge, with an advantage for the utterance-initial edge (Johnson et al. 2014).
Importantly, both utterance edges have been reported to be salient in English (Seidl and Johnson 2006), and word-initial prominence predominates in the language (Jusczyk et al. 1993). However, there are as yet no studies of positional edge effects for bisyllabic target words in English. The studies on European Portuguese-learning infants have only examined edge-final segmentation, given that this position offers the most prominent edge cues in the language. Overall, it seems that words aligned with major prosodic edges are easier to segment and to produce than words in phrase-internal position, consistent with the unfolding prosody approach. However, cues to prosodic edges are known to be language-particular (Frota 2012; Johnson and Seidl 2008), and future research should address the facilitatory effect of major prosodic edges on the ease of production and segmentation across languages.
The set of findings from the current study strongly supports the addition of developing prosodic structure to models of word segmentation (Börschinger and Johnson 2014; Ludusan and Dupoux 2016; Ludusan et al. 2022), so that both high-level and low-level prosody and the paths of evolving word and phrase complexity can be taken into account. Another major implication of our findings concerns the perception–production link in language acquisition. It has been proposed that the relationship between speech perception and production is central to speech processing in early language development, and that the auditory analysis of speech is coupled with motor information involved in speech production (Best et al. 2016; Kuhl et al. 2014). Previous work has related babbling abilities to speech discrimination abilities (Majorano et al. 2014), and babbling abilities to word segmentation abilities (Hoareau et al. 2019). In pre-babbling infants, it has been shown that oral-motor movements involving different articulators impact speech perception in a selective way, suggesting an early mapping between auditory and motor speech representations, well before the beginnings of babbling and speech production (Choi et al. 2019, 2021). However, very little is known about the perception–production link beyond segmental phonology. Our results have shown that developing prosodic structure shapes both speech perception, as manifested in the patterns of word segmentation, and speech production. Importantly, the units of prosodic unfolding seem to be used both to extract word-forms from continuous speech and as speech production planning units (in the sense of Shattuck-Hufnagel 2019, 2020). This early prosodic mapping between perception and production seems to hold even before infants start producing speech, as in the initial step of prosodic unfolding. The present findings thus suggest a perception–production link for prosody in early acquisition that opens new avenues for research.
On the methodological side, a strength of the current study is the use of the exact same methods to examine developing monosyllabic and bisyllabic word segmentation abilities as a function of prosody across time. This ensures that our findings across age groups cannot be attributed to differences in the stimuli, paradigm, or other methodological issues. However, the fact that we did not explore other methodological possibilities is also a limitation of this study. For example, Nazzi et al. (2014) obtained different results with different durations of the familiarization phase, with longer durations facilitating word segmentation. Future research should examine whether a longer familiarization phase might support an earlier emergence of bisyllabic word segmentation in EP-learning infants.
In conclusion, we have shown that developing prosodic structure guides the development of word segmentation in 5–17-month-old infants, as predicted by the unfolding prosody model. The present study brings new evidence, beyond early speech production during the second year of life, in support of prosodic development as a process of expansion and nesting of prosodic units of different types between the higher phrasal levels of the utterance/intonational phrase and the lower syllable level. Therefore, our study contributes to a better understanding of the mechanisms underlying word segmentation, and to a better understanding of early prosodic development, a cornerstone of language acquisition.

Author Contributions

Conceptualization, S.F. and M.V.; methodology, S.F. and M.V.; validation, S.F.; formal analysis, C.S.; investigation, C.S.; data curation, C.S.; writing-original draft preparation, S.F.; writing-review and editing, S.F., C.S. and M.V.; supervision, S.F.; funding acquisition, S.F. All authors have read and agreed to the published version of the manuscript.

Funding

Fundação para a Ciência e Tecnologia: PTDC/LLT-LIN/29338/2017, UIDB/00214/2020, PTDC/LLT-LIN/1115/2021; European Regional Development Fund from the EU, Portugal 2020 and Lisboa 2020: PTDC/LLT-LIN/29338/2017.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of “Centro Hospitalar Lisboa Norte” (Ref.ª DIRCLN-16JUL2014-208, 2014) and “Comissão de Ética para a Saúde da Administração Regional de Saúde de Lisboa e Vale do Tejo” (Proc.015/CES/INV/2014, 2014), as well as by the Ethics Committee of the School of Arts and Humanities, University of Lisbon (number 13_CEI2019, 2019).

Informed Consent Statement

Informed consent was obtained from all subjects (their caregivers/legal guardians) involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to ethical reasons.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Disyllabic iambic word-forms (here and elsewhere, the accent mark signals the stressed vowel):
FISSÁ, CANÉ, PINÓ, SUTÉ
Monosyllabic word-forms (Experiment 3, word lists):
SÁ, NÉ, NÓ, TÉ
Disyllabic trochaic word-forms:
FÉSSA, CÁNO, PÉNO, SÓTA
Passages: Utterance-edge position
1.
Os vizinhos brincam com o teu FISSÁ/FÉSSA.
(The neighbors play with your FISSÁ/FÉSSA.)
Estão sempre a falar-nos do FISSÁ/FÉSSA.
(They are always talking about FISSÁ/FÉSSA.)
Elas viajavam muito de FISSÁ/FÉSSA.
(They travelled a lot by FISSÁ/FÉSSA.)
Os anões adoram bolachas e FISSÁ/FÉSSA.
(Dwarves love cookies and FISSÁ/FÉSSA.)
Quero agradecer tudo ao FISSÁ/FÉSSA.
(I want to thank it all to FISSÁ/FÉSSA.)
A Dora anda no seu grande FISSÁ/FÉSSA.
(Dora walks in her big FISSÁ/FÉSSA.)
2.
O José deu um abraço ao CANÉ/CÁNO.
(Joseph gave a hug to CANÉ/CÁNO.)
As meninas comeram bastante CANÉ/CÁNO.
(The girls ate a lot of CANÉ/CÁNO.)
Todos eles já conheciam o CANÉ/CÁNO.
(They all already knew CANÉ/CÁNO.)
Vou comprar chocolates e CANÉ/CÁNO.
(I’m going to buy chocolates and CANÉ/CÁNO.)
Não gostei daquela foto do CANÉ/CÁNO.
(I did not like the photo of CANÉ/CÁNO.)
Um dos professores deu esse CANÉ/CÁNO.
(One of the teachers gave this CANÉ/CÁNO.)
3.
Os vizinhos compraram o teu PINÓ/PÉNO.
(The neighbors bought your PINÓ/PÉNO.)
Estão sempre a falar-nos do PINÓ/PÉNO.
(They are always talking about PINÓ/PÉNO.)
Elas viajavam muito de PINÓ/PÉNO.
(They travelled a lot by PINÓ/PÉNO.)
Os betos devoram bolachas e PINÓ/PÉNO.
(The betos devour biscuits and PINÓ/PÉNO.)
Quero agradecer tudo ao PINÓ/PÉNO.
(I want to thank it all to PINÓ/PÉNO.)
A Dina anda no seu grande PINÓ/PÉNO.
(Dina walks in her big PINÓ/PÉNO.)
4.
O José abraçou um belo SUTÉ/SÓTA.
(Joseph hugged a beautiful SUTÉ/SÓTA.)
As meninas comem aquele SUTÉ/SÓTA.
(The girls eat that SUTÉ/SÓTA.)
Todos eles já conheciam o SUTÉ/SÓTA.
(They all already knew SUTÉ/SÓTA.)
Quero ter chocolates e SUTÉ/SÓTA.
(I want to have chocolates and SUTÉ/SÓTA.)
Gostei daquela imagem do SUTÉ/SÓTA.
(I liked that image of SUTÉ/SÓTA.)
Um dos alunos bateu no SUTÉ/SÓTA.
(One of the students hit the SUTÉ/SÓTA.)
Passages: Utterance-medial position
1.
A Marta pôs o seu FISSÁ/FÉSSA na mesa.
(Marta put her FISSÁ/FÉSSA on the table.)
Fizemos festas ao FISSÁ/FÉSSA vermelho.
(We cuddled the red FISSÁ/FÉSSA.)
Nunca comi FISSÁ/FÉSSA com morangos.
(I never ate FISSÁ/FÉSSA with strawberries.)
O Tó desenhou um FISSÁ/FÉSSA bonito.
(Tony drew a handsome FISSÁ/FÉSSA.)
Conheço FISSÁ/FÉSSA doce do Algarve.
(I know the sweet FISSÁ/FÉSSA of the Algarve.)
Eles disseram FISSÁ/FÉSSA muitas vezes.
(They said FISSÁ/FÉSSA many times.)
2.
Ela gosta do CANÉ/CÁNO e do Pedro.
(She likes CANÉ/CÁNO and Pedro.)
Não sei se o CANÉ/CÁNO chegou de Paris.
(I do not know if CANÉ/CÁNO arrived from Paris.)
O João deu um CANÉ/CÁNO grande ao Zé.
(João gave a big CANÉ/CÁNO to Joe.)
A Marina tomou CANÉ/CÁNO quente.
(Marina took a hot CANÉ/CÁNO.)
Acho que a CANÉ/CÁNO se divertiu muito.
(I think that CANÉ/CÁNO had a lot of fun.)
Trouxeram CANÉ/CÁNO branco e macio.
(They brought soft and white CANÉ/CÁNO.)
3.
A Marta viu o meu PINÓ/PÉNO na mesa.
(Marta saw my PINÓ/PÉNO on the table.)
Demos festas ao PINÓ/PÉNO felpudo.
(We cuddled the plushy PINÓ/PÉNO.)
Nunca comi PINÓ/PÉNO com açúcar.
(I never ate PINÓ/PÉNO with sugar.)
O Miguel fez um PINÓ/PÉNO complicado.
(Miguel made a complicated PINÓ/PÉNO.)
Adoro PINÓ/PÉNO preto de Tavira.
(I love black PINÓ/PÉNO from Tavira.)
Eles disseram PINÓ/PÉNO tantas vezes.
(They said PINÓ/PÉNO so many times.)
4.
Ela gosta do SUTÉ/SÓTA e do Paulo.
(She likes SUTÉ/SÓTA and Paul.)
Não sei se o SUTÉ/SÓTA chegou do Porto.
(I do not know if SUTÉ/SÓTA arrived from Porto.)
O Marco deu um SUTÉ/SÓTA grande ao Zé.
(Mark gave a big SUTÉ/SÓTA to Joe.)
A Maria tomou SUTÉ/SÓTA com limão.
(Maria took SUTÉ/SÓTA with lemon.)
Creio que a SUTÉ/SÓTA se divertiu muito.
(I think that SUTÉ/SÓTA had a lot of fun.)
Trouxeram SUTÉ/SÓTA verde e macio.
(They brought green and soft SUTÉ/SÓTA.)
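The way the passages above are assembled, with one target word-form slotted into a fixed set of carrier sentences either utterance-finally or utterance-medially, can be sketched in a few lines of Python. This is only an illustration: the function names, the `{w}` placeholder convention, and the two-sentence carrier lists are assumptions made for the example (the sentences themselves are copied from set 1 of this appendix), and nothing here reflects how the authors actually prepared or recorded the stimuli.

```python
# Illustrative sketch only: names and the "{w}" slot convention are hypothetical.

# Two carriers per condition, copied from set 1 of the appendix.
EDGE_CARRIERS = [
    "Os vizinhos brincam com o teu {w}.",
    "Estão sempre a falar-nos do {w}.",
]
MEDIAL_CARRIERS = [
    "A Marta pôs o seu {w} na mesa.",
    "Fizemos festas ao {w} vermelho.",
]


def build_passage(carriers, word):
    """Insert the target word-form into every carrier sentence."""
    return [carrier.format(w=word) for carrier in carriers]


def target_is_final(sentence, word):
    """True if the target is the last word of the sentence (utterance-edge position)."""
    return sentence.rstrip(".").split()[-1] == word


if __name__ == "__main__":
    for word in ("FISSÁ", "FÉSSA"):  # iambic and trochaic counterparts
        for sentence in build_passage(EDGE_CARRIERS, word):
            print("edge:  ", sentence)
        for sentence in build_passage(MEDIAL_CARRIERS, word):
            print("medial:", sentence)
```

Keeping the carriers fixed and swapping only the target makes it easy to check that the iambic and trochaic versions of a passage differ in nothing but the target word-form.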

References

  1. Behrens, Heike, and Ulrike Gut. 2005. The relationship between prosodic and syntactic organization in early multiword speech. Journal of Child Language 32: 1–34.
  2. Best, Catherine T., Louis M. Goldstein, Hosung Nam, and Michael D. Tyler. 2016. Articulating what infants attune to in native speech. Ecological Psychology 28: 216–61.
  3. Bosch, Laura, Melània Figueras, Maria Teixidó, and Marta Ramon-Casas. 2013. Rapid gains in segmenting fluent speech when words match the rhythmic unit: Evidence from infants acquiring syllable-timed languages. Frontiers in Psychology 4: 106.
  4. Boysson-Bardies, Bénédicte, Nicole Bacri, Laurent Sagart, and Michel Poizat. 1981. Timing in late babbling. Journal of Child Language 8: 525–39.
  5. Börschinger, Benjamin, and Mark Johnson. 2014. Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars. Transactions of the Association for Computational Linguistics 2: 93–104.
  6. Butler, Joseph, and Sónia Frota. 2018. Emerging word segmentation abilities in European Portuguese-learning infants: New evidence for the rhythmic unit and the edge factor. Journal of Child Language 45: 1294–308.
  7. Chen, Aoju, Núria Esteve-Gibert, Pilar Prieto, and Melissa A. Redford. 2020. Development of Phrase-Level Prosody from Infancy to Late Childhood. In The Oxford Handbook of Language Prosody. Oxford: Oxford University Press, pp. 552–62.
  8. Choi, Dawoon, Alison G. Bruderer, and Janet F. Werker. 2019. Sensorimotor influences on speech perception in pre-babbling infants: Replication and extension of Bruderer et al. (2015). Psychonomic Bulletin & Review 26: 1388–99.
  9. Choi, Dawoon, Ghislaine Dehaene-Lambertz, Marcela Peña, and Janet F. Werker. 2021. Neural indicators of articulator-specific sensorimotor influences on infant speech perception. Proceedings of the National Academy of Sciences 118: e2025043118.
  10. Christophe, Anne, Ariel Gout, Sharon Peperkamp, and James Morgan. 2003. Discovering words in the continuous speech stream: The role of prosody. Journal of Phonetics 31: 585–98.
  11. Christophe, Anne, Séverine Millotte, Savita Bernal, and Jeffrey Lidz. 2008. Bootstrapping lexical and syntactic acquisition. Language and Speech 51: 61–75.
  12. Correia, Susana. 2009. The Acquisition of Primary Word Stress in European Portuguese. Ph.D. dissertation, Universidade de Lisboa, Lisbon, Portugal.
  13. Demuth, Katherine. 2006. Crosslinguistic perspectives on the development of prosodic words. Introduction. Language and Speech 49: 129–35.
  14. Demuth, Katherine. 2018a. The Development of Prosodic Phonology. In The Oxford Handbook of Psycholinguistics. Edited by Shirley-Ann Rueschemeyer and M. Gareth Gaskell. Oxford: Oxford University Press, pp. 675–89.
  15. Demuth, Katherine. 2018b. Understanding the development of prosodic words: The role of the lexicon. In The Development of Prosody in First Language Acquisition. Edited by Pilar Prieto and Núria Esteve-Gibert. Amsterdam: John Benjamins, pp. 207–24.
  16. Demuth, Katherine, and E. J. Fee. 1995. Minimal Prosodic Words in Early Phonological Development. Master’s dissertation, Brown University, Providence, RI, USA. Dalhousie University, Halifax, NS, Canada.
  17. D’Odorico, Laura, and Stefania Carubbi. 2003. Prosodic characteristics of early multi-word utterances in Italian children. First Language 23: 97–116.
  18. Fikkert, Paula. 1994. On the Acquisition of Prosodic Structure. Doctoral dissertation, Holland Institute of Generative Linguistics, Leiden University, Leiden, The Netherlands. Available online: https://hdl.handle.net/2066/17308 (accessed on 9 February 2023).
  19. Fikkert, Paula, Liquan Liu, and Mitsuhiko Ota. 2020. The Acquisition of Word Prosody. In The Oxford Handbook of Language Prosody. Oxford: Oxford University Press, pp. 540–52.
  20. Frota, Sónia. 2000. Prosody and Focus in European Portuguese: Phonological Phrasing and Intonation. New York: Routledge.
  21. Frota, Sónia. 2012. Prosodic structure, constituents and their implementation. In The Oxford Handbook of Laboratory Phonology. Edited by Abigail C. Cohn, Cécile Fougeron and Marie K. Huffman. Oxford: Oxford University Press, pp. 255–65.
  22. Frota, Sónia. 2014. The intonational phonology of European Portuguese. In Prosodic Typology II: The Phonology of Intonation and Phrasing. Edited by Sun-Ah Jun. Oxford: Oxford University Press, pp. 6–42.
  23. Frota, Sónia, and Marina Vigário. 2001. On the correlates of rhythmic distinctions: The European/Brazilian Portuguese case. Probus 13: 247–73.
  24. Frota, Sónia, and Marina Vigário. 2018. Syntax-phonology interface. In Oxford Research Encyclopedia in Linguistics. Edited by Mark Aronoff. Oxford: Oxford University Press.
  25. Frota, Sónia, and Pilar Prieto, eds. 2015. Intonation in Romance: Systemic similarities and differences. In Intonation in Romance. Oxford: Oxford University Press, pp. 392–418.
  26. Frota, Sónia, Jovana Pejovic, Cátia Severino, and Marina Vigário. 2020. Looking for the edge: Emerging segmentation abilities in atypical development. Speech Prosody 2020: 814–18.
  27. Frota, Sónia, Jovana Pejovic, Marisa Cruz, Cátia Severino, and Marina Vigário. 2022. Early Word Segmentation Behind the Mask. Frontiers in Psychology 13: 879123.
  28. Frota, Sónia, Marina Vigário, Fernando Martins, and Marisa Cruz. 2012. FrePOP—Frequency Patterns of Phonological Objects in Portuguese: Research and Applications. Extended: 2,000,000 Words. Lisboa: Laboratório de Fonética (CLUL/FLUL). Available online: http://frepop.letras.ulisboa.pt (accessed on 2 October 2022).
  29. Frota, Sónia, Nuno Matos, Marisa Cruz, and Marina Vigário. 2016. Early prosodic development: Emerging intonation and phrasing in European Portuguese. In Issues in Hispanic and Lusophone Linguistics: Interdisciplinary Approaches to Intonational Grammar in Ibero-Romance. Edited by Meghan E. Armstrong, Maria Del Mar Vanrell and Nicholas C. Henriksen. Philadelphia: John Benjamins, pp. 295–324.
  30. Gervain, Judit, Anne Christophe, and Reiko Mazuka. 2020. Prosodic Bootstrapping. In The Oxford Handbook of Language Prosody. Edited by Carlos Gussenhoven and Aoju Chen. Oxford: Oxford University Press, pp. 563–73.
  31. Gleitman, Lila R., and Eric Wanner. 1982. Language Acquisition: The State of the Art. Cambridge: CUP Archive.
  32. Goyet, Louise, Léo-Lyuki Nisibayashi, and Thierry Nazzi. 2013. Early syllable segmentation of fluent speech by infants acquiring French. PLoS ONE 8: e79646.
  33. Hammond, Michael. 1999. The Phonology of English. A Prosodic Optimality-Theoretic Approach. Oxford: Oxford University Press.
  34. Hoareau, Mélanie, H. Henny Yeung, and Thierry Nazzi. 2019. Infants’ statistical word segmentation in an artificial language is linked to both parental speech input and reported production abilities. Developmental Science 22: e12803.
  35. Höhle, Barbara. 2009. Bootstrapping mechanisms in first language acquisition. Linguistics 47: 359–82.
  36. Höhle, Barbara, and Jurgen Weissenborn. 2005. Word segmentation in German-learning infants. Paper presented at the International Workshop Early Word Segmentation: A Crosslinguistic Approach Taking Advantage of Europe’s Linguistic Diversity, Paris, France, February 25–26.
  37. Johnson, Elizabeth K., Amanda Seidl, and Michael D. Tyler. 2014. The edge factor in early word segmentation: Utterance-level prosody enables word form extraction by 6-month-olds. PLoS ONE 9: e83546.
  38. Johnson, Elizabeth K., and Amanda Seidl. 2008. Clause segmentation by 6-month-old infants: A crosslinguistic perspective. Infancy 13: 440–55.
  39. Jusczyk, Peter W. 1997. The Discovery of Spoken Language. Cambridge: MIT Press.
  40. Jusczyk, Peter W., and Richard N. Aslin. 1995. Infants’ detection of the sound patterns of words in fluent speech. Cognitive Psychology 29: 1–23.
  41. Jusczyk, Peter W., Anne Cutler, and Nancy J. Redanz. 1993. Infants’ preference for the predominant stress patterns of English words. Child Development 64: 675–87.
  42. Jusczyk, Peter W., Derek M. Houston, and Mary Newsome. 1999. The beginnings of word segmentation in English-learning infants. Cognitive Psychology 39: 159–207.
  43. Kehoe, Margaret. 2018. Prosodic phonology in acquisition: A focus on children’s early word productions. In The Development of Prosody in First Language Acquisition. Edited by Pilar Prieto and Núria Esteve-Gibert. Amsterdam: John Benjamins, pp. 165–84.
  44. Kooijman, Valesca, Peter Hagoort, and Anne Cutler. 2009. Prosodic structure in early word segmentation: ERP evidence from Dutch ten-month-olds. Infancy 14: 591–612.
  45. Kuhl, Patricia K., Rey R. Ramírez, Alexis Bosseler, Jo-Fu Lotus Lin, and Toshiaki Imada. 2014. Infants’ brain responses to speech suggest analysis by synthesis. Proceedings of the National Academy of Sciences 111: 11238–45.
  46. Ladd, D. Robert. 2008. Intonational Phonology. Cambridge: Cambridge University Press.
  47. Lakens, Daniël. 2013. Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology 4: 863.
  48. Ludusan, Bogdan, Alejandrina Cristia, Reiko Mazuka, and Emmanuel Dupoux. 2022. How much does prosody help word segmentation? A simulation study on infant-directed speech. Cognition 219: 104961.
  49. Ludusan, Bogdan, and Emmanuel Dupoux. 2016. The role of prosodic boundaries in word discovery: Evidence from a computational model. The Journal of the Acoustical Society of America 140: EL1.
  50. Lu, Shuang, Marina Vigário, Susana Correia, Rita Jerónimo, and Sónia Frota. 2018. Revisiting Stress “deafness” in European Portuguese—A Behavioral and ERP Study. Frontiers in Psychology 9: 2486.
  51. Majorano, Marinella, Marilyn M. Vihman, and Rory A. DePaolis. 2014. The relationship between infants’ production experience and their processing of speech. Language Learning and Development 10: 179–204.
  52. Matos, Nuno. 2021. Medir o Tempo: Um Estudo Sobre os Padrões Duracionais em Português Europeu nos Primeiros Três anos de Idade [The Measures of Time: A Study of Duration Patterns in the First Three Years of Life]. Ph.D. dissertation, Universidade de Lisboa, Lisbon, Portugal. Available online: http://hdl.handle.net/10451/54763 (accessed on 19 September 2023).
  53. McQueen, James M., and Laura C. Dilley. 2020. Prosody and spoken-word recognition. In The Oxford Handbook of Language Prosody. Oxford: Oxford University Press, pp. 509–21.
  54. Meints, Kerstin, and Alan Woodford. 2005. Lincoln Infant Lab Package 1.0: A New Programme Package for IPL Preferential Listening, Habituation and Eyetracking. Available online: https://www.lincoln.ac.uk/media/responsive2017/collegeofsocialscience/schoolofpsychology/Handbook_08-12-2009.pdf (accessed on 24 June 2013).
  55. Mersad, Karima, Louise Goyet, and Thierry Nazzi. 2010. Cross-linguistic differences in early word form segmentation: A rhythmic-based account. Journal of Portuguese Linguistics 10: 37–65.
  56. Morgan, James L., and Katherine Demuth. 1996. Signal to Syntax: Bootstrapping from Speech to Grammar in Early Acquisition. Mahwah: Lawrence Erlbaum Associates.
  57. Nazzi, Thierry, Galina Iakimova, Josiane Bertoncini, Séverino Frédonie, and Carmela Alcantara. 2006. Early segmentation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences. Journal of Memory and Language 54: 283–99.
  58. Nazzi, Thierry, Karima Mersad, Megha Sundara, Galina Iakimova, and Linda Polka. 2014. Early word segmentation in infants acquiring Parisian French: Task-dependent and dialect-specific aspects. Journal of Child Language 41: 600–33.
  59. Nazzi, Thierry, Sara Paterson, and Annette Karmiloff-Smith. 2003. Early word segmentation by infants and toddlers with Williams Syndrome. Infancy 4: 251–71.
  60. Nespor, Marina, and Irene Vogel. 2007. Prosodic Phonology, 2nd ed. Berlin: Mouton de Gruyter.
  61. Newman, Rochelle, Nan Bernstein Ratner, Anne Marie Jusczyk, Peter W. Jusczyk, and Kathy Ayala Dow. 2006. Infants’ ability to segment the conversational speech signal predicts later language development: A retrospective analysis. Developmental Psychology 42: 643–55.
  62. Nishibayashi, Léo-Lyuki, Louise Goyet, and Thierry Nazzi. 2015. Early speech segmentation in French-learning infants: Monosyllabic words versus embedded syllables. Language and Speech 58: 334–50.
  63. Polka, Laura, and Megha Sundara. 2012. Word segmentation in monolingual infants acquiring Canadian English and Canadian French: Native language, cross-dialect, and cross-language comparisons. Infancy 17: 198–232.
  64. Santos, Raquel. 2005. Strategies for word stress in Brazilian Portuguese. In Developmental Paths in Phonological Acquisition. Edited by Marina Tzakosta, Claartje Levelt and Jeroen van de Weijer. Special issue of Leiden Papers in Linguistics. Leiden: Universität Leiden, vol. 2, pp. 71–91.
  65. Seidl, Amanda, and Elizabeth K. Johnson. 2006. Infant word segmentation revisited: Edge alignment facilitates target extraction. Developmental Science 9: 565–73.
  66. Selkirk, Elisabeth, and Seunghun J. Lee. 2015. Constituency in sentence phonology: An introduction. Phonology 32: 1–18.
  67. Severino, Cátia. 2016. Perception of Phrasal Prosody in the Acquisition of European Portuguese. Ph.D. dissertation, Universidade de Lisboa, Lisbon, Portugal.
  68. Shattuck-Hufnagel, Stefanie. 2019. Toward an (even) more comprehensive model of speech production planning. Language, Cognition and Neuroscience 34: 1202–13.
  69. Shattuck-Hufnagel, Stefanie. 2020. The Role of Phrase-Level Prosody in Speech Production Planning. In The Oxford Handbook of Language Prosody. Oxford: Oxford University Press, pp. 521–38.
  70. Shattuck-Hufnagel, Stefanie, and Alice E. Turk. 1996. A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research 25: 193–247.
  71. Shukla, Mohinish, Katherine S. White, and Richard N. Aslin. 2011. Prosody guides the rapid mapping of auditory word forms onto visual objects in 6-mo-old infants. Proceedings of the National Academy of Sciences 108: 6038–43.
  72. Singh, Leher, Steven Reznick, and Liang Xuehua. 2012. Infant word segmentation and childhood vocabulary development: A longitudinal analysis. Developmental Science 15: 482–95.
  73. Szendröi, Kriszta. 2003. A stress-based approach to the syntax of Hungarian focus. The Linguistic Review 20: 37–78.
  74. Varga, László. 2002. Intonation and Stress: Evidence from Hungarian. London: Palgrave Macmillan.
  75. Vigário, Marina. 2003. The Prosodic Word in European Portuguese. Berlin: Mouton de Gruyter.
  76. Vigário, Marina. 2010. Prosodic structure between the prosodic word and the phonological phrase: Recursive nodes or an independent domain? The Linguistic Review 27: 485–530.
  77. Vigário, Marina. 2016. Segmental phenomena and their interactions: Evidence for prosodic organization and the architecture of grammar. In Manual of Grammatical Interfaces in Romance. Edited by Susann Fischer and Christoph Gabriel. Series Manuals of Romance Linguistics. Berlin and Boston: De Gruyter, vol. 10, pp. 41–73.
  78. Vigário, Marina, and Violeta Martínez-Paricio. 2023. Is the foot a prosodic domain in European Portuguese? Languages. under review.
  79. Vigário, Marina, Maria João Freitas, and Sónia Frota. 2006. Grammar and frequency effects in the acquisition of prosodic words in European Portuguese. Language and Speech 48: 175–203.
  80. Vihman, Marilyn. 2018. The development of prosodic structure: A usage-based approach. In The Development of Prosody in First Language Acquisition. Edited by Pilar Prieto and Núria Esteve-Gibert. Philadelphia: John Benjamins, pp. 185–206.
  81. Vogel, Irene, and István Kenesei. 1987. The interface between phonology and other components of grammar: The case of Hungarian. Phonology Yearbook 4: 243–63.
  82. Wellmann, Caroline, Julia Holzgrefe, Hubert Truckenbrodt, Isabell Wartenburger, and Barbara Höhle. 2012. How each prosodic boundary cue matters: Evidence from German infants. Frontiers in Psychology 3: 580.
  83. Zahner, Katharina, Muna Pohl, and Bettina Braun. 2016. The limits of metrical segmentation: Intonation modulates infants’ extraction of embedded trochees. Journal of Child Language 43: 1338–64.
Figure 1. Segmentation of bisyllabic iambic word-forms. Mean looking times (ms) for the experimental conditions: prosodic edge (target word in utterance-edge-final position), medial (target word in utterance-medial position), and unfamiliar word (word not presented in the familiarization phase). Error bars represent the standard error of the mean.
Figure 2. Developing segmentation abilities for bisyllabic iambic word-forms across experiments.
Figure 3. Segmentation of bisyllabic word-forms with trochaic stress (left) and iambic stress (right) in 13–17-month-olds. Mean looking times (ms) for the experimental conditions: prosodic edge (target word in utterance-edge-final position), medial (target word in utterance-medial position), and unfamiliar word (word not presented in the familiarization phase). Error bars represent the standard error of the mean.
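As a reading aid for the figure captions, the sketch below shows how a condition mean and its standard error (the quantity plotted as error bars) are conventionally obtained from per-infant looking times: the sample standard deviation divided by the square root of the number of infants. The function name and the example values are hypothetical, and this is not the authors' analysis script.

```python
import math
import statistics


def mean_and_sem(looking_times_ms):
    """Return (mean, standard error of the mean) for one experimental condition."""
    n = len(looking_times_ms)
    if n < 2:
        raise ValueError("need at least two observations to estimate the SEM")
    mean = statistics.fmean(looking_times_ms)
    # Sample standard deviation (n - 1 in the denominator) divided by sqrt(n).
    sem = statistics.stdev(looking_times_ms) / math.sqrt(n)
    return mean, sem


if __name__ == "__main__":
    # Hypothetical looking times in ms, invented purely for illustration.
    mean, sem = mean_and_sem([1000.0, 2000.0, 3000.0])
    print(f"mean = {mean:.1f} ms, SEM = {sem:.1f} ms")
```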
Table 1. Acoustic and phonological properties of the iambic stimuli (a negative value in pitch range indicates a pitch fall).

| Iambic | Medial Mean | Medial SD | Edge Mean | Edge SD | t-Test |
|---|---|---|---|---|---|
| Word duration (ms) | 484.13 | 9.45 | 595.75 | 6.74 | −10.19, p < 0.001 |
| Pitch range (Hz) | −35.28 | 3.62 | −62.22 | 4.31 | 4.89, p < 0.001 |
| Tonal event | | | L% | | |
Table 2. Acoustic and phonological properties of the trochaic stimuli (a negative value in pitch range indicates a pitch fall).

| Trochaic | Medial Mean | Medial SD | Edge Mean | Edge SD | t-Test |
|---|---|---|---|---|---|
| Word duration (ms) | 512.75 | 11.37 | 636.42 | 10.80 | −11.52, p < 0.001 |
| Pitch range (Hz) | 38.17 | 12.55 | −75.54 | 4.60 | 7.91, p < 0.001 |
| Tonal event | | | L% | | |
Table 3. Summary of all experiments and segmentation results. The original experiment from Butler and Frota (2018) is also described.

| Exp | Stimuli | Segmentation | Mean Age | Age Range | # Participants | Result p-Value | Result Effect Size (η²) | Edge/Unfamiliar | Medial/Unfamiliar |
|---|---|---|---|---|---|---|---|---|---|
| Exp. 1 | Bisyll. iambic | Whole word | 6;13 | 4;25–7;14 | 20 | 0.356 | 0.049 | - | - |
| Exp. 2 | Bisyll. iambic | Whole word | 9;06 | 7;18–10;06 | 20 | 0.171 | 0.089 | - | - |
| Exp. 3 | Bisyll. iambic | Embedded ˈσ | 8;28 | 7;23–10;05 | 20 | 0.296 | 0.062 | - | - |
| Exp. 4 | Bisyll. iambic | Whole word | 12;01 | 10;29–12;29 | 20 | 0.621 | 0.025 | - | - |
| Exp. 5 | Bisyll. iambic | Whole word | 14;17 | 13;02–17;17 | 20 | 0.004 | 0.249 | p = 0.004 | p = 0.014 |
| Exp. 6 | Bisyll. iambic | Whole word | 11;22 | 10;25–12;28 | 20 | 0.918 | 0.005 | - | - |
| Exp. 7 | Bisyll. trochaic | Whole word | 14;09 | 13;01–17;01 | 20 | 0.721 | 0.017 | - | - |
| Exp. 5/Exp. 7 | Iambic/trochaic | Whole word | 14;12 | 13;01–17;17 | 40 | 0.027 | 0.090 | p = 0.010 | n.s. |
| Butler & Frota | Monosyllabic | Syllable | 7;19 | 4;19–10;08 | 40 | < 0.01 | 0.15 | p < 0.001 | n.s. |
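The ages in Table 3 appear to follow the months;days convention common in infant research (so 14;17 would be 14 months and 17 days); that reading is an assumption here, as is the average month length used below. Under those assumptions, a small helper like the following makes the ages comparable as plain numbers. The function names are hypothetical.

```python
def parse_age(age):
    """Split a 'months;days' string such as '14;17' into (months, days)."""
    months, days = age.split(";")
    return int(months), int(days)


def age_in_days(age, days_per_month=30.44):
    """Approximate age in days; 30.44 is an assumed average month length."""
    months, days = parse_age(age)
    return months * days_per_month + days


if __name__ == "__main__":
    # Mean age and age range of Exp. 1, as listed in Table 3.
    for label in ("4;25", "6;13", "7;14"):
        print(label, "->", round(age_in_days(label)), "days (approx.)")
```

Converting to a single unit is only a convenience for sorting or plotting; the months;days strings remain the values reported in the table.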
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Frota, S.; Severino, C.; Vigário, M. Unfolding Prosody Guides the Development of Word Segmentation. Languages 2024, 9, 305. https://doi.org/10.3390/languages9090305