Word Order, Intonation, and Prosodic Phrasing: Individual Differences in the Production and Identification of Narrow and Wide Focus in Urdu

Jabeen, Farhat

doi:10.3390/languages7020103


by

Farhat Jabeen

Linguistics and Literary Studies, Bielefeld University, 33615 Bielefeld, Germany

Languages 2022, 7(2), 103; https://doi.org/10.3390/languages7020103

Submission received: 25 January 2021 / Revised: 9 March 2022 / Accepted: 9 April 2022 / Published: 20 April 2022

(This article belongs to the Special Issue Phonology-Syntax Interface and Recursivity)


Abstract


This study investigates speaker-based variation in the use of word order and intonation to mark narrow and wide focus in Urdu. The identification of focus type and position, as well as the prosodic phrasing of declarative sentences produced in the target focus conditions, is also discussed. The results of a semi-spontaneous production experiment indicated no preference for a linear position, as the focused nouns were mostly placed in situ (89%). The analysis of phonetic cues showed significant inter- and intraspeaker variation in participants’ use of longer noun duration, higher F0 peak, and wider F0 range in the narrowly focused nouns, as compared with their counterparts produced in wide focus. In the identification survey conducted online, the consistent use of phonetic cues in speech production was found to influence the correct identification of narrow focus and the position of focused nouns. Another online survey, concerning the prosodic phrasing of sentences produced in narrow and wide focus, showed participants’ slight preference for a recursive Intonational Phrase boundary on the left edge of the narrowly focused nouns. The results of both surveys show that Urdu speakers vary in their identification of focus as well as their choice of prosodic phrasing in the target contexts. This research highlights the role of individual variation in the use of word order and phonetic cues to mark narrow and wide focus in Urdu. It also illustrates that the identification of focus type and phrasing is far from uniform. These findings have implications for the analysis of intonation in general, as this study shows that the production and identification of intonation and prosodic phrasing are not invariable: speakers and listeners differ in their use of the available linguistic means (word order vs. intonational categories) as well as in the selection and manipulation of phonetic cues.

Keywords:

intonation; word order; focus identification; narrow focus; wide focus; prosodic phrasing; Accentual Phrase; Intermediate Phrase; Intonational Phrase; interspeaker variation; intraspeaker variation; Urdu; Hindi

1. Introduction

This paper reports on speaker related variation in the use of word order and intonational categories in the context of wide and narrow focus in Urdu. The phonetic cues reported here include the use of proportionate noun duration, F0 range, and F0 peak scaling in this understudied language. Urdu is an Indo-Aryan language spoken mainly in Pakistan, India, and the diaspora. Urdu and Hindi share the same ISO code and are mutually intelligible. The main differences between them are manifest in their orthography and formal lexicon (Ohala 1983). Urdu is written in the Perso-Arabic script and borrows its formal lexicon from Persian and Arabic. Hindi, on the other hand, is written in the Devanagari script and uses a Sanskrit-based lexicon in formal contexts. Politically, Urdu is associated with Muslims, whereas Hindi is associated with Hindu identity. In syntactic and semantic analyses, research findings based on one language are generally considered to be applicable to the other as well (cf. Bhatt and Dayal (2007); Butt (1993); Butt and King (1997)). Therefore, the following section on word order and information structure discusses the target phenomena for both Urdu and Hindi together. Most of the phonetic, phonological, and prosodic experiments, however, do not attempt this generalization and restrict their claims to either Hindi (cf. Féry et al. (2016); Kügler (2020); Patil et al. (2008)) or Urdu (cf. Urooj et al. (2019)). Accordingly, the prosodic analyses for Urdu and Hindi are discussed in separate subsections in what follows.

1.1. Word Order and Information Structure in Urdu and Hindi

Urdu and Hindi are SOV languages. Kidwai (2000) claimed the following to be the canonical word order for these languages: “Subject—Indirect Object—Direct Object—Adjunct(s)—Verb—Auxiliaries” (p. 3). However, she reported that arguments can scramble to different positions in the sentence. This scrambling is far from random, as each linear position denotes information structure. Gambhir (1981) is one of the earliest and most elaborate analyses of word order and information structure in Hindi. Her work has heavily influenced the subsequent analyses of word order and information structure in Hindi as well as in Urdu. Butt and King (1996b) built on Gambhir’s analysis to offer an account of linear position and information structure in Urdu. Using data from personal exposure to the Urdu language in Lahore, Pakistan and Hindi in New Delhi, India, Butt and King (1996b) explain that the sentence initial position in Urdu is used for topicalization and scene setting, whereas the immediately preverbal position is used to mark narrow focus. The sentence final position is reserved for heavy NP shifting, news announcements, de-emphasis, background information, and to denote emphasis on new information (Butt and King 1996b; Kidwai 2000). Below, (1) illustrates the association of information structure with various linear positions.

(1)	XP_Topic XP_Focus Complex Predicate XP_{Background/de-emphasis etc.}

In the current research, we are concerned with understanding focus realization in Urdu by exploiting word order variation and intonational categories. We use the notion of focus in the sense of Alternative Semantics, introduced by Rooth (1992) and subsequently developed and modified by Krifka (2008). According to this theory, focus refers to the introduction of alternatives in discourse. In such an analysis, wide focus is induced by a wh-question eliciting all new information, as shown in (2-a). As per Butt and King, the answer to this question should be produced with a canonical word order in Urdu and Hindi, as given in (2-b). On the other hand, the congruent answer to the question in (2-c) is narrowly focused and the relevant noun phrase should be placed at the immediately preverbal position. This results in the noncanonical word order presented in (2-d).

(2)	a.	What happened?
	b.	nɑ:z=ne seb kʰɑ.jɑ
		Naz=Erg apple.M.Sg eat.Perf.M.Sg
		‘Naz ate an apple’.
	c.	Who ate an apple?
	d.	seb nɑ:z=ne_Focus kʰɑ.jɑ
		apple.M.Sg Naz=Erg eat.Perf.M.Sg
		‘Naz ate an apple’.
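As a rough illustration, the positional template in (1) can be applied mechanically to a list of constituents. The sketch below is not part of the analyses cited above: the function name `label_positions`, the role labels, and the ASCII transliterations are assumptions made for illustration only. It encodes the claim attributed to Butt and King (1996b) that the sentence initial position hosts topics, the immediately preverbal position hosts narrow focus, and the postverbal position hosts background material.

```python
def label_positions(constituents, verb_index):
    """Label each constituent following the template in (1).

    `constituents` is a list of strings in linear order and `verb_index`
    is the position of the (complex) predicate. The labels are
    illustrative names, not terminology taken from the source.
    """
    labels = []
    for i, word in enumerate(constituents):
        if i == verb_index:
            role = "predicate"
        elif i > verb_index:
            role = "background"   # postverbal material
        elif i == verb_index - 1:
            role = "focus"        # immediately preverbal position
        elif i == 0:
            role = "topic"        # sentence initial position
        else:
            role = "unlabelled"   # the template in (1) is silent here
        labels.append((word, role))
    return labels


# Example (2-d), transliterated in ASCII for convenience.
print(label_positions(["seb", "naz=ne", "khaya"], verb_index=2))
```

Note that this only restates the positional claim; it says nothing about whether speakers actually use these positions.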

While the relation between word order and information structure in Urdu has been discussed in earlier studies, their impact on intonation is not well understood. There is only one existing study that investigates this with reference to the position of wh-phrases in Urdu. In order to understand the interaction between word order, information structure, and intonation, Butt et al. (2016) conducted a production experiment with wh-phrases at the immediately preverbal and postverbal positions. The authors hypothesized that wh-questions with immediately preverbal wh-phrases were information seeking, whereas questions with immediately postverbal wh-phrases were rhetorical. In a forced choice task, Butt et al. (2016) presented their participants with either information-seeking or rhetorical contexts and asked them to choose between the questions with either an immediately preverbal or postverbal wh-phrase. For the prosodic analysis, participants were further required to produce their selected sentence aloud. The authors reported that participants did not show a preference for the position of wh-phrases in either information-seeking or rhetorical contexts. However, they found that the position of wh-phrases affected the intonation of wh-questions. The immediately preverbal wh-phrases were produced with the highest F0 peak in the sentences, whereas their postverbal variants carried no F0 rises. These findings support the claim that intonation and word order in Urdu are interconnected and cannot be properly understood in isolation. The role of information structure is still not clear in this regard.

1.2. Prosody of South Asian Languages

In his analysis of the prosody of a number of South Asian (SA) languages, Khan (2016, 2018) reported that the intonation systems of many SA languages consist of “Repetitive Rising Contours”. While there are exceptions to this phenomenon, as shown by Das and Mahanta (2019) in their analysis of Boro, this generally holds true for the SA languages analyzed so far. In her analysis of Bengali, Hindi, Tamil, and Malayalam, Féry (2010) proposed that these rising F0 contours mark phrasal prominence and that these Indian languages are intonational “Phrase” languages. In what follows, I present a brief overview of the prosody of three SA languages, i.e., Bengali, Tamil, and Hindi.

1.2.1. Bengali

In his analysis of Bengali intonation, Khan (2008, 2014) proposed three levels in its prosodic hierarchy: Accentual Phrases (AP), Intermediate Phrases (ip), and Intonational Phrases (IP). APs are produced with a low, high or a rising pitch accent. The ips carry a boundary marking tone along with longer duration, optional pitch reset, and a pause. The IP boundaries also carry a boundary tone on the right edge and are followed by a longer pause, as compared with the pauses following ips. Khan (2014) reported that the consecutive F0 peaks within an IP are downstepped, with reference to the immediately preceding peak.

The prosodic marking of focus in Bengali includes a low pitch accent on the left edge of the word followed by a high phrase boundary on the right edge (Hayes and Lahiri 1991). Furthermore, Lahiri and Fitzpatrick-Cole (1999) showed that the components of a complex predicate are phrased together in one phonological phrase when produced in wide focus. However, when one of these components is narrowly focused, the target word is produced as a phonological phrase with a rising contour of its own. The realization of a narrowly focused word in Bengali with a rising F0 contour has also been confirmed by Khan (2014). Although Khan (2014) and Hayes and Lahiri (1991); Lahiri and Fitzpatrick-Cole (1999) differed in their phonological analyses for these rising contours, it is clear that the presence of narrow focus affects the intonation and the phonological status of the target word in Bengali.

1.2.2. Tamil

Keane (2014)’s analysis of Tamil intonation is slightly different from that of Bengali, as she proposed only two levels of phrasing in the language, i.e., APs and IPs. The APs are produced with a rising F0 contour and successive rises within an IP are downstepped. Keane (2014) reported that the IPs in Tamil are marked by phrase final lengthening on the right edge and pitch reset. She also showed that there is no difference in the intonation of Tamil sentences produced in wide, and a subset of narrow, focus. However, she reported that the narrowly focused word was always produced as an AP on its own instead of forming a phrase with the following word, as observed in the context of wide focus. Hence, Keane (2014) claimed that narrow focus in Tamil affected the phonological status of the target word. This is reminiscent of Hayes and Lahiri (1991)’s analysis of Bengali and shows that, at least in these two SA languages, narrow focus leads to a particular prosodic phrasing.

1.2.3. Hindi

Patil et al. (2008) is the first systematic investigation of word order, information structure, and prosody in Hindi. They analyzed SOV and OSV sentences and found no difference in the F0 peak scaling and duration of narrowly focused nouns and their counterparts produced in wide focus. Moreover, the authors did not find any difference on the basis of word order in their data. Patil et al. (2008) reported that compression following narrow focus was a reliable cue to distinguish between noun phrases produced in wide and narrow focus. Féry et al. (2016)’s analysis of narrowly focused and given noun phrases produced in SOV and OSV sentences offered further insights into focus marking, as the narrowly focused noun phrases were produced with a higher intensity than their given counterparts. Furthermore, Féry et al. (2016) reported the downstepping of consecutive F0 peaks in a sentence and found no effect of word order variation on the intonation of their target sentences.

In their analysis of structural and prosodic prominence in Hindi, Luchkina et al. (2015) offered promising results. The authors asked their participants to mark prominent words in the texts presented to them on a screen. They found that the use of emphatic discourse markers led to a significantly higher rate of the perceived prominence of the immediately following words. As for the position, the immediately preverbal and post-posed words were perceived as significantly more prominent when compared with the fronted words. Moreover, referentially unused words were found to be perceived as more prominent than the referentially given ones. The authors further reported that the participants’ perception of prominence was positively associated with higher vowel intensity and F0 maxima. On the basis of this, Luchkina et al. (2015) claimed that prominence in Hindi was marked by using syntactic, lexical, as well as prosodic means.

The use of postfocal compression in the narrow focus context in Hindi was investigated in detail by Kügler (2020). He reported a production and a perception study regarding the prosodic realization of narrow focus on direct and indirect objects in Hindi. His analysis of the production data showed that narrowly focused indirect objects were more often followed by compression, as compared with the focused direct objects. However, Kügler (2020) reported speaker based variation, as some speakers did not use postfocal compression following indirect object focus. Moreover, those who did use postfocal compression differed in degree from 10 to 30 Hz. In order to investigate if this difference in the presence vs. absence and the degree of postfocal compression was perceptually relevant, he ran a perception study where participants were required to listen to a fragment of the target sentence and complete the sentence using one of the two available object contrasts. The analysis showed that the presence of postfocal compression significantly improved the correct identification of target sentences. Kügler (2020) used these results to claim that postfocal compression is a cue to the perception of narrow focus in Hindi.

The studies reported in this section show that in two SA languages, Bengali and Tamil, narrow focus has a phrasing effect and the focused word is produced as a prosodic phrase on its own. In Hindi, however, narrow focus is marked by higher intensity and postfocal compression and the phrasing effect of focus has not yet been confirmed in this language.

1.3. Prosodic Phrasing in Urdu

Putting aside the role of word order and information structure, the prosody of Urdu has been analyzed in its own right using the autosegmental–metrical framework. Similar to Keane (2014)’s analysis of Tamil, Jabeen (2019c) proposed two levels of prosodic phrasing in Urdu: APs and IPs. Hayes and Lahiri (1991) have also proposed two levels of prosodic phrasing in Bengali, although they used “Phonological Phrases” instead of APs. In what follows, I summarize the analysis of Urdu prosodic phrasing.

1.3.1. Accentual Phrases

As reported by Khan (2016, 2018), the intonation of an Urdu declarative sentence produced in wide focus is a sequence of rising F0 contours. These rising contours have been analyzed as pitch accents (Urooj et al. 2019) or as phrase boundary tones (Jabeen 2019a; Jabeen and Delais-Roussarie 2019). Urooj et al. (2019) claimed that the rising contour found in Urdu may be grouped into “five types of Accentual Phrases” (p. 9). These APs may carry a monotonal or bitonal pitch accent (L*, H*, or L*+H) and, optionally, contain AP boundary tones on the left (aL) and/or the right edge (Ha, La). In our analyses (Jabeen 2019a; Jabeen and Delais-Roussarie 2019), we argue against Urooj et al. (2019)’s analysis of the rising F0 contours as pitch accents and propose that what they have analyzed as pitch accents are in fact phrase boundary tones. Moreover, Urooj et al. (2019) had to propose an optional low AP boundary on the left edge because the low tone does not always align with the lexically stressed syllable. If the idea of pitch accents is discarded, however, there is no need to propose an ad hoc low AP boundary as it becomes the default conclusion, as discussed in what follows.

The fact that the low tone in the rising contour does not always align with the stressed syllable has been reported by Jabeen (2019a). She found that, in disyllabic trochaic words, the low tone always aligned with the stressed syllable. However, when the stressed syllable was placed at the medial position in trisyllabic words (CV.ˈCV.CV), the low tone rarely aligned with it (15% in lab speech data and 11% in a corpus of read speech). Jabeen also reported the variable alignment of a high tone on the right edge of Urdu words. She found that, in words from the read speech corpus, the high tone mostly aligned with the case markers following bisyllabic nouns (93%). However, in trisyllabic nouns, the high tone was realized on the last syllable of the noun (97%). Jabeen argued that the alignment of these tones cannot be explained with reference to the position of lexical stress in Urdu. Using regression analysis of the temporal alignment of tones in the rising contour, she showed that the low tone occurred at a fixed distance from the left edge of a word and the temporal realization of this tone was not affected by the number of syllables in the target words. Moreover, Jabeen found that the alignment of the high tone varied on the basis of the number of syllables. The high tone in monosyllabic words was realized significantly earlier before the right edge, as compared with the high tone in bisyllabic and trisyllabic words. Based on the fixed alignment of low tones and the disregard for the position of the stressed syllable, Jabeen (2019a) argued that the low and the high tones in the rising contour in Urdu are phrase boundary tones. Féry (2017) has offered similar evidence for her analysis of phrase boundary tones in SA languages such as Hindi. Building on Jabeen (2019a)’s findings, Jabeen and Delais-Roussarie (2019) proposed that the rising contour delimits an Accentual Phrase in Urdu. 
According to their analysis, the low tone marks the left edge (Lp) and the high tone denotes the right boundary (Hp) of an AP, as (3) illustrates.

(3)	Lp Hp
	ʃɑ.ˈhi:.nɑ:=ne
	Shahina=Erg

The discussion above shows that Urooj et al. (2019) and Jabeen (2019a) agree that Urdu has Accentual Phrase boundary tones. The main difference between their analyses lies in the inventory of tones. Jabeen’s analysis of the Urdu intonational inventory contains only AP boundary tones and interprets the difference in the realization of these tones as an issue of phonetic implementation.

Using data from a speech corpus, Jabeen and Delais-Roussarie (2020) reported that Urdu words consisting of six or more moras trigger a size constraint resulting in double rises (L1 H1 L2 H2) within an AP. They reported that the temporal alignment of low and high tones in single rises (L H) was similar to those of the edge tones (L1 H2) in double rises. The alignment of L2 and H1, however, differed from their counterparts L1 and H2 in words with double rises. Based on this, the authors concluded that the double rises in Urdu form one long Accentual Phrase, where L1 and H2 are Accentual Phrase boundary tones and the medial H1 and L2 are inserted for rhythmic reasons (Lp H L Hp).
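The size constraint described above can be restated as a small rule. The following is a minimal sketch, assuming that the mora count of a word is already known; the function name `ap_tones` and the constant name are hypothetical, and the threshold of six moras is taken from the description of Jabeen and Delais-Roussarie (2020) given here.

```python
# Threshold reported above: words of six or more moras show double rises.
DOUBLE_RISE_MIN_MORAS = 6


def ap_tones(mora_count):
    """Return the tonal string of one Accentual Phrase.

    Short words get a single rise delimited by the AP boundary tones
    (Lp Hp); long words get a double rise in which the medial H and L
    are the rhythmically inserted tones (Lp H L Hp).
    """
    if mora_count < 1:
        raise ValueError("a word must contain at least one mora")
    if mora_count >= DOUBLE_RISE_MIN_MORAS:
        return ["Lp", "H", "L", "Hp"]
    return ["Lp", "Hp"]


print(ap_tones(4))  # single rise
print(ap_tones(6))  # double rise
```

Counting moras is deliberately left outside the function, since it depends on assumptions about Urdu syllable weight that are not spelled out in this section.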

Jabeen and Delais-Roussarie (2020) found that narrow focus leads to the realization of a rising contour on words that are produced without a rise in a wide focus context. (4) illustrates that a narrowly focused case marker is produced with a rising contour of its own. Based on their similar temporal alignment, Jabeen and Delais-Roussarie (2020) claimed that the rises realized in narrowly focused nouns have the same phonological status as their counterparts produced in wide focus (shown in (3)). Therefore, the focused case marker in (4) is labeled as an AP on its own. This is similar to focus induced prosodic phrasing in Tamil and Bengali, as discussed earlier.

(4)	Lp Hp Lp Hp
	ʃɑ.ˈhi:.nɑ=ne_Focus

It has been observed that the sentence final constituents in Urdu declaratives do not carry the high AP boundary (Hp) found in nonfinal APs. Jabeen (2019c) proposed that, their phonetic realization notwithstanding, the sentence final constituents form APs similar to their nonfinal counterparts. She argued that the Hp in sentence final APs is replaced by the low IP boundary (L%) associated with declaratives in Urdu. Her analysis is supported by the fact that the Lp is realized on the left edge of final APs. Stronger evidence in favor of Jabeen’s analysis of sentence final constituents comes from her analysis of variable word order in Urdu declaratives. Jabeen (2019c) showed that the phonetic realization of sentence final APs depends on the role of the syntactic constituent placed there. In her analysis of six possible word orders for an Urdu transitive sentence (SOV, SVO, OSV, OVS, VSO, VOS), she reported that the sentence final subjects were very likely to carry a complete rising F0 contour followed by a low IP boundary. Final objects also carried more instances of a rising contour as compared to the verb placed at the same position. These findings confirm that the sentence final constituent in Urdu is indeed an AP and the surface realization of this AP is affected by the role of the syntactic constituent placed there.

1.3.2. Intonational Phrases

In Jabeen (2019c)’s analysis of Urdu intonation, the APs are grouped into Intonational Phrases marked by a boundary tone. Her inventory of IP boundary tones includes two monotonal (L%, H%) and two bitonal (LH%, HL%) units. Declarative sentences are produced with a low boundary, whereas a high IP boundary is produced in questions. However, Jabeen (2019c, 2020) has shown that the use of boundary tones in polar and wh-questions is determined by the position of the questioned word and the wh-phrase, respectively. Jabeen (2019c) reported that, within an IP, high tones are downstepped with reference to the scaling of the immediately preceding F0 peak. Furthermore, the last syllable on the right edge of an IP is characterized by phrase final lengthening and nonmodal voice quality.
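Downstep, as described above, is defined relative to the immediately preceding F0 peak, so it can be checked pairwise. The sketch below is only an illustration: the function name `is_downstepped` and the peak values in Hz are invented, and real data would first require peak extraction from the F0 track.

```python
def is_downstepped(peaks_hz):
    """Return True if every F0 peak is lower than the one before it."""
    return all(later < earlier
               for earlier, later in zip(peaks_hz, peaks_hz[1:]))


# Hypothetical peak values (Hz) for the successive APs of one IP.
print(is_downstepped([250.0, 232.0, 215.0]))  # every peak is lowered
print(is_downstepped([250.0, 262.0, 215.0]))  # second peak is raised
```

A sequence in which one peak is higher than its predecessor fails the check; whether such a peak counts as upstep in the phonological sense is a separate analytical question.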

Recursive IPs in Urdu

Jabeen (2019b) used a production experiment to investigate if Intonational Phrases can be recursively organized in Urdu. The participants were asked to produce numeric sequences, as shown in (5), and directed to convey the position of hyphens in their speech.

(5)	213 - 4567 - 76

Jabeen (2019b) found inter- and intraspeaker variation in participants’ use of a juncture to mark the presence of hyphens in the target numeric sequences. She proposed that the presence of hyphens in her target sentences was indicated by Intonational Phrase boundaries. Jabeen (2019b) based this analysis on the use of longer syllable duration, higher F0 peak scaling, and nonmodal voice quality on the left edge of the short pause inserted to indicate the position of a hyphen. Furthermore, she argued that the overall downstep between F0 peaks within an utterance showed that the IPs inserted to mark hyphens’ position were embedded within a large Intonational Phrase. Below, (6) presents Jabeen (2019b)’s proposed prosodic phrasing in the numeric sequences. Furthermore, (6-a) shows that each numeric sequence produced with a rising contour is analyzed as an AP. Finally, (6-b) illustrates the embedded structure of Intonational Phrases.

(6)	a.	(2)_AP (13)_AP - (45)_AP (67)_AP - (76)_AP	AP phrasing
	b.	((213)_IP - (4567)_IP - 76)_IP	IP phrasing
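The embedded structure in (6-b) is a tree in which a phrase of a given level contains phrases of the same level. A minimal sketch of such a structure is given below, assuming a simple node type; the class `Phrase` and the helpers `render` and `depth` are hypothetical names introduced for illustration and do not come from Jabeen (2019b).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Phrase:
    """A prosodic phrase: a level label plus child phrases or bare strings."""
    level: str
    children: List[Union["Phrase", str]] = field(default_factory=list)


def render(node, sep=" - "):
    """Print a phrase in the bracket notation used in (6)."""
    if isinstance(node, str):
        return node
    inner = sep.join(render(child, sep) for child in node.children)
    return f"({inner})_{node.level}"


def depth(node, level):
    """Maximum nesting depth of phrases carrying the given level label."""
    if isinstance(node, str):
        return 0
    own = 1 if node.level == level else 0
    return own + max((depth(child, level) for child in node.children),
                     default=0)


# The numeric sequence in (5) with the IP phrasing in (6-b).
utterance = Phrase("IP", [Phrase("IP", ["213"]), Phrase("IP", ["4567"]), "76"])
print(render(utterance))
print(depth(utterance, "IP") > 1)  # a depth above 1 means the IP is recursive
```

Representing the level as a plain label keeps the sketch neutral about how many levels the prosodic hierarchy has.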

The existence of a recursive IP boundary, as evidenced by the use of upstepped F0 peaks, has also been proposed by Jabeen (2020) in her analysis of Urdu polar questions. She showed that, when the questioned word was placed at a nonfinal position, it was produced with an upstepped F0 peak and the polar question carried a low IP boundary at the end of the sentence. However, when the questioned word was at the sentence final position, the polar question was realized with an upstepped H%. Jabeen (2020) proposed that the upstepped high tone on the nonfinal questioned words marked a recursive IP boundary. She argued that this analysis was supported by the fact that the upstepped high tone was retained when the questioned word was final in the sentence. If the upstepped high tone marked the boundary of a regular Accentual Phrase, this would mean that the boundary of a lower phrase level (AP) overrides the boundary of a higher phrase level (the L% of IPs found in polar questions with nonfinal questioned words). However, the analysis of IP boundaries in declaratives has already shown that it is the low IP boundary that overrides the high Accentual Phrase (Hp) boundary at the sentence final position (Jabeen 2019c). Therefore, the replacement of a higher phrasing level by a lower level in polar questions is not only unlikely, it is also negated by the existing analysis of IPs in Urdu declaratives.

To summarize, the analysis of the F0 contour of numeric sequences (Jabeen 2019b) and polar questions (Jabeen 2020) provides evidence in favor of recursive IP phrasing in Urdu.

1.3.3. Focus Realization in Urdu

There are only a few studies available regarding the intonation of focus in Urdu. As Urdu is claimed to have a fixed (immediately preverbal) focus position (Butt and King 1996b), the role of word order needs to be considered in any attempt to study the prosodic realization of focus in this language. To date, Jabeen (2017) is the only study that has investigated the intonation of narrow and wide focus at the preverbal position in Urdu. Based on the data collected from eleven speakers of Urdu, she reported the prosodic marking of narrow focus to be optional. Moreover, Jabeen found variation in prosodic marking, as speakers used a combination of cues to convey narrow focus. In her data, a narrowly focused noun phrase (noun and case marker) may be produced with the F0 peak aligning with the end of the noun, as compared with its alignment with the immediately following case marker observed in wide focus. She also found that the narrowly focused nouns were followed by the postfocal compression of the rising F0 contours but the use of this cue was discretional, as speakers could choose not to compress their F0 on postfocal constituents. Jabeen’s findings offer an interesting insight into the variable marking of narrow focus in Urdu. However, she did not delve further into speaker based differences and it is not clear if the use of phonetic cues varies between speakers, or if a single speaker also varies in their production of narrow focus.

A recent attempt to investigate the prosody of different types of focus in Urdu has been carried out by Jabeen and Braun (2018), who analyzed the prosodic realization of narrow vs. corrective focus. The results of their production experiment showed that the immediately preverbal nouns in the corrective focus context were produced with longer duration and wider F0 range, as compared with their variants in narrow focus. Moreover, the correctively focused nouns were followed by a steeper fall in F0 in the postfocal region. The F0 peaks in the narrow focus context were found to align with the end of the noun phrase (noun followed by case marker), whereas their counterparts in corrective focus aligned with the end of the focused noun. Jabeen and Braun (2018) used the insights generated from the production data to investigate if Urdu speakers were sensitive to the manipulation of syllable duration in narrow and corrective focus contexts. They set up an online perception experiment where the stimuli consisted of sentences carrying immediately preverbal nouns with manipulated durations. Jabeen and Braun (2018) used the duration ratio from the production experiment data to either shorten or elongate the syllables in the target nouns. The modified target sentences were presented to Urdu speakers in narrow and corrective focus contexts. The results of their experiment indicated Urdu speakers’ asymmetric perception of syllable duration. The participants rated both elongated and shortened syllables as equally natural in the narrow focus context. However, they rated longer syllable duration as significantly more natural in the corrective focus context. The results of this experiment establish that Urdu speakers are indeed sensitive to duration manipulation in different focus contexts.
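To make the duration manipulation concrete, the sketch below scales a list of syllable durations by a fixed ratio. It is an illustration under stated assumptions rather than a description of the procedure in Jabeen and Braun (2018): the function name `scale_durations`, the ratio of 1.25, the millisecond values, and the choice to shorten by dividing by the same ratio are all hypothetical.

```python
def scale_durations(durations_ms, ratio, direction):
    """Elongate or shorten syllable durations by a duration ratio.

    Assumption: elongation multiplies each duration by `ratio` and
    shortening divides by it. The source only states that a ratio from
    the production data was used in both directions.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if direction == "elongate":
        return [d * ratio for d in durations_ms]
    if direction == "shorten":
        return [d / ratio for d in durations_ms]
    raise ValueError("direction must be 'elongate' or 'shorten'")


# Hypothetical syllable durations (ms) of a disyllabic target noun.
noun = [100.0, 200.0]
print(scale_durations(noun, 1.25, "elongate"))
print(scale_durations(noun, 1.25, "shorten"))
```

In practice the resynthesis itself would be carried out in a speech analysis tool; the function only computes the target durations.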

1.4. Individual Variation in Using Prosodic Cues

While introducing the complexities of prosodic variability, Breen et al. (2018) point to a number of factors contributing to variation in speech. Some of the factors discussed by them include morphosyntax, speech planning, individual variation, as well as differences in both speakers and listeners’ experiences. Speaker based variation at the segmental level is a well documented phenomenon (Ernestus 2012). While it is known that intonation is far from uniform (Kim 2019; Peppé et al. 2000), individual variation in the production and perception of intonational categories needs to be investigated in more detail. Baumann et al. (2006); Cangemi et al. (2015); Schuppler and Ludusan (2020) are some of the works providing a glimpse into speaker related variation in German prosody. Baumann et al. (2006) used a production experiment to show the variable use of categorical and gradient prosodic cues in the realization of wide, narrow, and contrastive focus. Cangemi et al. (2015) offered similar findings and reported differences both in the selection and manipulation of prosodic cues in their target focus contexts. Moreover, they observed intraspeaker variation in the use of cues to mark different focus types. With the help of a follow-up perception experiment, Cangemi et al. (2015) showed that speaker based variation is not a redundant property, as German listeners are sensitive to this variation. Their analysis of participants’ responses showed that listeners identified the correct information structure more often when a speaker had used all the target prosodic cues. Furthermore, Cangemi et al. (2015) found variation among listeners in the correct identification, as one listener appeared more successful (75% correct responses) at the task than another listener (56% correct responses). Notice that, even when a speaker used all the prosodic cues consistently, a quarter of the data still comprised incorrect identifications. 
Moreover, Ouyang and Kaiser (2015) reported speaker related differences in American English regarding the use of prosodic cues to mark new information and corrective focus. Kügler (2020) has also reported individual variation in the production of postfocal compression in Hindi, as some of his participants did not use this prosodic cue at all, whereas others differed in the degree of postfocal compression. This shows that speaker related variation is not a feature of a particular language and is, in fact, a widespread phenomenon.

1.5. Research Questions

As mentioned earlier, no existing study has investigated speaker based variation in the production and identification of focus in Urdu. The current study aims to fill this gap by investigating speaker related variation in the use of word order and intonation to mark narrow and wide focus in Urdu. It addresses the following questions:

1. Do speakers from a homogenous group in terms of age, gender, and education differ in their use of word order and intonation? (Interspeaker variation)
2. Does an individual speaker vary in their use of word order and intonation? (Intraspeaker variation)
3. To what extent are speakers’ productions correctly identified with reference to focus type and the position of focused nouns?
4. Do the differences in word order and intonation affect the prosodic phrasing of declaratives produced in wide and narrow focus?

Questions 1–2 are answered using a production task eliciting ditransitive declaratives from twelve speakers of Urdu. Question 3 is addressed by running an online survey to identify focus type and position in the target sentences. Question 4 is tackled by analyzing the scaling of prefocal F0 peaks, lengthening of prefocal syllables, and postfocal compression. Prosodic phrasing is further investigated with the help of an online survey.

Prosodic marking of different focus types in Hindi has been variously reported to carry higher F0 peak scaling, wider F0 range, higher intensity, and longer duration (Féry et al. 2016; Genzel and Kügler 2010; Puri 2013). However, none of these cues are used consistently and it is interesting to investigate if narrow focus marking in Urdu is similar to that in Hindi. The inconsistent use of other cues notwithstanding, I predict that narrowly focused nouns are followed by postfocal compression, as this has been found to be a robust cue for the perception of narrow focus in Hindi (Kügler 2020). Furthermore, existing analyses have shown that the presence of narrow focus leads to focus induced prosodic phrasing in Bengali and Tamil (Hayes and Lahiri 1991; Keane 2014; Khan 2014). Therefore, it is posited that narrow focus results in a prosodic phrase boundary on the right or left edge of the focused nouns in Urdu. If so, the intonational cues to a focus induced phrase boundary should be absent in wide focus. In what follows, these hypotheses are tested with a production experiment.

In the following section, I present the experimental set up, the stimuli, and the analytical methods in detail. This is followed by the experimental findings regarding inter- and intraspeaker variation in Section 3 and a brief discussion of the findings in Section 4. The identification of focus types and positions in an online survey is reported in Section 5. The cues to prosodic phrasing in the production of narrow and wide focus are presented in Section 6 and the results of a survey regarding prosodic phrasing are given in Section 7. The implications of these results are discussed in Section 8 and the conclusion is offered in Section 9.

2. Methods: Production Experiment

2.1. Material

The dataset consisted of five ditransitive sentences comprising proper nouns followed by their corresponding monosyllabic case markers to indicate their grammatical role: subject ne, oblique se, or object ko. This was done to ensure that the same noun was used in a given grammatical role and to facilitate the comparison of the same nouns in different word orders. All the nouns were bisyllabic with CV structure. The target sentences were presented in the context of a wh-question eliciting either wide or narrow focus on each of the target noun phrases. As per Krifka (2008), the information elicited by a wh-question and provided by a congruent answer was considered narrowly focused in the experiment. Wide focus was elicited using a generic question ‘What happened?’. Below, (7) presents an example of questions invoking different focus contexts. The full set of target sentences is given in Appendix A.

(7)	a.	What happened?	Wide focus
	b.	Who asked Neelam for a kebab?	Subject focus
	c.	Whom did Laila ask for a kebab?	Oblique focus
	d.	What did Laila ask from Neelam?	Object focus
	e.	lɛ.lɑ=ne ni:.ləm=se kə.bɑ:b mã:.gɑ
		Laila=Erg Neelam=Obl kebab.Nom.M ask.Perf.M.Sg
		‘Laila asked Neelam for a kebab’.

2.2. Participants

The participants formed a homogenous group in terms of gender, age, and education. Twelve female speakers of Urdu were recorded for this experiment. They were all graduate students aged between 19 and 29 years. All the participants used Urdu for everyday and official communication. The speakers were paid a small remuneration for participating in the experiment.

2.3. Data Collection

The recordings took place in Faisalabad, Pakistan. The target sentences were presented to the participants using MS PowerPoint. The participants were shown a set of nouns and a verb on each slide. They were asked to read the question given at the top of the screen, formulate an answer using the given chunks of text, and produce the sentence loudly and clearly. All the participants saw the same visual sequences of target words. The position of words on each slide was randomly organized. As the focus types (wide and narrow) and focus location (one of the three noun phrases) were randomly interspersed, no filler sentences were used in the stimuli. The target words were presented in the Perso-Arabic script used to write Urdu. In order to eliminate the influence of the word order of the questions on the target sentences and to limit the set of variables, the wh-questions were presented in English. This mixed use of languages in the investigation of word order in Urdu has also been employed by Butt et al. (2016), who presented their target contexts in English and the target sentences in Urdu. This strategy may arguably influence the results of the experiment, but if the target questions were presented in Urdu, the word order of the wh-questions would need to be accounted for in the statistical analysis. Therefore, I decided to eliminate this potentially confounding variable.

To ensure that the participants paid attention to what was being asked in the context question, the question words were presented in bold face. In the wide focus context, the entire question was written in bold. Figure 1 presents an example of a target sentence as presented to the participants to induce narrow focus on the object. The average duration of the experiment was nine minutes.

2.4. Data Analysis

2.4.1. Word Order

The author listened to the sentences and manually labeled them for their word order as well as the linear position (sentence initial, medial, immediately preverbal) of focused nouns. The resulting data was saved in a comma separated file, subsequently used for statistical analysis.

2.4.2. Intonational Analysis

F0 Contour and Peak Scaling

Praat (Boersma and Weenink 2013) [v. 6.0.56] was used for the analysis of target phonetic cues. Each sentence was annotated at syllable level by a graduate student. Following the prosodic analysis of Urdu offered by Jabeen (2019a) and Jabeen and Delais-Roussarie (2019, 2020), the annotation of tonal targets was carried out by the author of this study. To investigate the intonation contour, F0 valleys (L) and peaks (H) were marked in the entire sentence. The F0 contour was labeled manually. Due attention was paid to microprosody and tonal targets were not marked where F0 perturbation could have been caused by the preceding or following consonants. The timing and scaling of F0 troughs and peaks were measured using a modified Praat script originally written by Elvira-García (2018). F0 on Ls and Hs in each sentence was measured in semitones with reference to the minimum F0 in the entire sentence. The resulting semitone values were used to plot the F0 contour of sentences and to analyze the scaling of F0 peaks.
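
The semitone conversion described above can be illustrated with a short sketch. It is not the Praat script used in the study. It assumes the standard conversion of 12 times the base-2 logarithm of the frequency ratio, and the function names and the F0 values in Hz are hypothetical.

```python
import math

def to_semitones(f0_hz, reference_hz):
    """Convert an F0 value in Hz to semitones relative to a reference in Hz."""
    if f0_hz <= 0 or reference_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_hz / reference_hz)

def contour_in_semitones(targets, sentence_min_hz):
    """Express each labeled tonal target relative to the sentence minimum F0.

    targets: list of (label, f0_hz) pairs, label being "L" or "H".
    sentence_min_hz: minimum F0 of the entire sentence (the reference).
    """
    return [(label, to_semitones(f0_hz, sentence_min_hz)) for label, f0_hz in targets]

# Hypothetical L and H targets for one sentence (values invented for illustration).
example_targets = [("L", 180.0), ("H", 240.0), ("L", 190.0), ("H", 225.0), ("L", 170.0)]
contour = contour_in_semitones(example_targets, sentence_min_hz=170.0)
```

Because every value is expressed relative to the minimum of its own sentence, the lowest point of each contour lies at 0 semitones, which makes peak scaling comparable across speakers with different pitch registers.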

F0 Range

The semitone difference between the F0 peak and the immediately preceding L on focused nouns was used to calculate the F0 range of the target nouns.

Postfocal Compression

In this study, I use the term “postfocal compression” to refer to the scaling of rising contours in the postfocal region. Compression was measured as the difference in the scaling of F0 peak on the focused noun and the F0 scaling at the end of the immediately following noun. A higher difference in the scaling of the focused and the following noun is assumed to indicate postfocal compression. Instead of setting an arbitrary boundary to indicate the presence/absence of compression, the scaling of F0 was treated as gradient and analyzed as a continuous variable.
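
The two difference measures, F0 range and postfocal compression, can be sketched as follows. This is a minimal illustration that assumes all inputs are already expressed in semitones; the function names and example values are hypothetical and do not come from the study's scripts.

```python
def f0_range(peak_st, preceding_low_st):
    """F0 range of a noun: its H peak minus the immediately preceding L.

    A negative value means F0 fell over the noun instead of rising.
    """
    return peak_st - preceding_low_st

def postfocal_compression(focused_peak_st, following_noun_end_st):
    """Difference between the F0 peak on the focused noun and the F0 at the
    end of the immediately following noun. Larger values are taken to
    indicate more compression; the measure is kept continuous.
    """
    return focused_peak_st - following_noun_end_st

# Hypothetical semitone values for one sentence.
rising_range = f0_range(peak_st=5.0, preceding_low_st=1.5)
falling_range = f0_range(peak_st=1.0, preceding_low_st=2.0)
compression = postfocal_compression(focused_peak_st=6.0, following_noun_end_st=2.0)
```

Keeping both measures as signed, continuous differences avoids an arbitrary cut-off for the presence or absence of a cue.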

Proportionate Duration

The duration of the target noun was measured as a proportion of the duration of the entire sentence (noun duration/sentence duration).
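
As a minimal sketch, proportionate duration can be computed from annotation boundaries as shown below. The interval times in seconds and the function name are hypothetical; the study's own measurements were taken from the Praat annotations.

```python
def proportionate_duration(noun_start_s, noun_end_s, sentence_start_s, sentence_end_s):
    """Return noun duration divided by sentence duration (a value in 0..1)."""
    noun_duration = noun_end_s - noun_start_s
    sentence_duration = sentence_end_s - sentence_start_s
    if sentence_duration <= 0 or noun_duration < 0:
        raise ValueError("intervals must have non-negative, non-zero extent")
    return noun_duration / sentence_duration

# Hypothetical boundaries: a 0.5 s noun in a 2.0 s sentence.
share = proportionate_duration(0.5, 1.0, 0.0, 2.0)
```

Normalizing by sentence duration reduces the effect of differences in overall speech rate between speakers and between repetitions.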

2.4.3. Statistical Analysis

The statistical analysis was carried out separately for each target variable. Linear Mixed Effects Regression (LMER) models were run to analyze proportionate noun duration, F0 peak scaling, F0 range, and postfocal compression as dependent variables, with focus type as the fixed factor. In order to investigate whether the use of phonetic cues varied among individual speakers, an interaction between focus type and participants was also added to the LMER models. Items were used as random effects in all the regression models (Baayen et al. 2008). The analysis was carried out using the R software for data analysis (R Core Team 2014) [v. 4.0.4]. The regression analysis was performed using the ‘lmerTest’ (Kuznetsova et al. 2017) and ‘lme4’ (Bates et al. 2015) packages. To obtain type III ANOVAs, the ‘car’ (Fox and Weisberg 2019) package was used. The results of these analyses are reported in what follows.
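
One way to write out a model of this kind is sketched below. The exact model specification is not given in the text, so the notation is an assumption: y is one of the dependent variables, f indexes focus type, s indexes participants, and i indexes items. The main effect of participants is included here only because models with an interaction term usually contain it. In lme4's formula notation this would correspond roughly to `cue ~ focus * participant + (1 | item)`, where `cue`, `focus`, `participant`, and `item` are hypothetical column names.

```latex
\[
y_{fsi} = \beta_0 + \beta_f + \gamma_s + (\beta\gamma)_{fs} + u_i + \varepsilon_{fsi},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{fsi} \sim \mathcal{N}(0, \sigma^2)
\]
```

Under this reading, with twelve participants and two focus types, the interaction term has (12 − 1) × (2 − 1) = 11 degrees of freedom.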

3. Results: Production Experiment

3.1. Word Order

Table 1 shows the placement of narrowly focused nouns at different positions.

First and foremost, none of the focused nouns were placed after the verb. Among the three preverbal positions available in a ditransitive sentence in Urdu, there is no clear preference for any given position. This also illustrates that subjects are mostly placed at the initial position, the indirect and direct objects occur most frequently at the medial position, and the locatives are preferably placed at the immediately preverbal position. Moreover, all the null marked objects are immediately preverbal and there is no variation here.

The data presented in Table 2 illustrates that an analysis in terms of canonical word order (subject–indirect object–direct object/null marked object–verb) can shed better light on the use of word order. This shows that focused nouns were mostly placed in situ. When a focused noun was moved from its canonical position, it was placed either at the sentence initial (indirect and direct objects) or medial position (subjects and locatives). The focused noun occurred at the immediately preverbal position when that was also its canonical position (null marked nouns and locatives). This appears to contradict the claim of Butt and King (1996b) regarding the immediately preverbal position being the fixed focus position in Urdu. However, this issue will be taken up again in Section 7.2. The preference for in situ placement is so clear that no meaningful discussion of interspeaker variation can be offered here.

Given the lack of preference for a particular linear position for placing focused noun phrases, I henceforth set aside the claims regarding the immediately preverbal position being the fixed focus position. In what follows, the position of nouns refers to their linear order produced in target focus contexts.

3.2. F0 Contour in Wide Focus

The production of target sentences in wide focus was found to be similar across speakers. The sentence initial and medial noun phrases were produced with a rising F0 contour each. A difference was observed in the production of immediately preverbal noun phrases. When null marked objects were placed at that position, only 36% of them were produced with a rising contour. The remaining null marked nouns were phrased together with the following verb. However, 71% of locative nouns at the same position were produced with a rising contour of their own. An example of this difference is presented in Figure 2.

3.3. F0 Peak Scaling

The scaling of F0 peaks in sentence initial focused noun phrases in Figure 3 shows a lot of variation among speakers in the overall scaling of F0 peaks as well as in the use of this cue to mark different focus types. Speakers 6 and 12 show a big difference in the scaling of peaks, with higher F0 peaks produced in narrow as compared with wide focus. On the other hand, speakers 7 and 9 produced lower F0 peaks in the context of narrow focus. The remaining eight speakers (1, 2, 3, 4, 5, 8, 10, and 11) barely differ between focus types regarding their use of F0 peak scaling. The statistical analysis showed a significant interaction between focus type and participants in the scaling of F0 peaks at the sentence initial (χ² = 32.3, DF = 11, p = 0.0006) position. As the number of individual data points was too few to warrant a reliable regression analysis, no further tests were carried out for individual speakers.

Figure 3 further illustrates that, at the sentence medial position, five speakers (5, 6, 7, 9, 12) exhibit a large difference in the scaling of F0 peaks in the target focus contexts. Only speakers 6 and 12, however, have produced higher F0 peaks in the context of narrow focus, whereas speakers 5, 7, and 9 have lowered the F0 peaks in narrow focus as compared with wide focus. This trend is also found in the data for four out of the remaining seven speakers, although the difference in their F0 peak scaling is very small. The results of the regression analysis showed a significant interaction (χ² = 24.5, DF = 11, p = 0.01) between focus type and participants in the use of F0 peak scaling at the medial position.

The use of F0 peak scaling in the immediately preverbal nouns, as shown in Figure 3, is similar to what has been observed at other positions in the sentence. Seven speakers (1, 3, 4, 6, 10, 11, 12) use higher F0 peaks to mark narrow focus as compared with wide focus. Conversely, three speakers (5, 7, 9) use lower peaks in the narrow focus context. The remaining two speakers (2, 8) do not use F0 peak scaling to distinguish between focus types. The interaction between participants and focus type at this position was highly significant (χ² = 42.0, DF = 11, p < 0.0001), thus alluding to a strong role of interspeaker variation.

Figure 3 also allows us to analyze intraspeaker variation in the use of F0 peak scaling in narrow and wide focus. Three speakers (6, 11, 12) always use higher F0 peaks in the context of narrow focus. Comparatively, three speakers (1, 3, 4) use this cue only at sentence initial and immediately preverbal positions. Speaker 10 uses it at both medial and immediately preverbal positions, whereas speaker 5 uses it only at the initial position. The remaining four participants (2, 7, 8, 9) do not use this cue at any position to distinguish between the target focus types.

3.4. F0 Range

The use of F0 range in sentence initial nouns is shown in Figure 4. Six speakers (3, 5, 6, 7, 9, 12) show a difference of more than one semitone in F0 range in wide and narrow focus contexts. However, while four speakers (3, 5, 6, 12) use a wider F0 range in narrow focus, two speakers (7, 9) narrowed their F0 range in this context. The remaining speakers exhibit minor differences (half a semitone or less) in their use of F0 range in wide and narrow focus. The LMER analysis showed a strong interaction (χ² = 43.8, DF = 11, p < 0.0001) between focus type and participants in the use of F0 range at the sentence initial position.

The middle plot in Figure 4 presents the mean F0 range of sentence medial nouns. Overall, speakers do not show a large difference in F0 range in wide and narrow focus. Meagre as the differences in F0 range are, three speakers (4, 6, 11) use a wider F0 range in narrow focus. By contrast, four speakers (5, 7, 9, 12) use a narrower F0 range in narrow focus, as compared with wide focus. No significant interaction (p = 0.9) was found between focus type and participants in the use of F0 range at this position.

The bottom plot in Figure 4 illustrates the use of F0 range in immediately preverbal nouns. The negative values indicate that the target nouns were produced without a rising contour and F0 gradually fell over the target nouns. Having said that, speaker 4 produced a steeper fall on the target nouns in wide focus as compared with their counterparts produced in narrow focus. Seven of the participants (1, 2, 3, 8, 9, 10, 12) produced narrowly focused nouns with a wider F0 range, whereas only one speaker (11) uses a narrower range in that context. The remaining three speakers mark only a minute change in F0 range in the target focus contexts. The statistical analysis failed to find a significant interaction (p = 0.4) between participants and focus type.

The analysis of intraspeaker variation in the use of F0 range shows that four speakers (1, 3, 4, 6) consistently use a wider F0 range to mark narrow focus at all the three target positions. The remaining participants use this cue either inconsistently at one or the other positions, or use a narrower F0 range to mark narrow focus, as compared with wide focus.

3.5. Proportionate Duration

Figure 5 (top) shows the proportionate duration of target nouns at the sentence initial position produced in wide and narrow focus. It is clear that, while some speakers do elongate the nouns in the context of narrow focus as compared with wide focus, this pattern is not present among all speakers. Eight of them use longer duration in the narrow focus context, whereas four speakers (2, 5, 8, 10) do not differ much in the duration of target nouns in the two focus contexts. This shows that there is interspeaker variation in the use of noun duration as well as the extent of elongation in the target focus contexts. The regression analysis showed a significant interaction (χ² = 21.8, DF = 11, p = 0.02) between participants and focus type in the use of proportionate noun duration at this position.

The proportionate duration of sentence medial nouns is shown in Figure 5 (middle). The overall pattern of duration manipulation is similar to the one observed in sentence initial nouns. In their production of medial nouns, seven speakers (1, 3, 4, 6, 9, 11, 12) produced longer nouns in the context of narrow focus as compared with wide focus. Five speakers (2, 5, 7, 8, 10) either do not manipulate duration or mark very small differences to distinguish between wide and narrow focus. The analysis of interaction between focus type and participants in the use of relative duration at the medial position failed to reach significance (p = 0.07).

The proportionate duration of immediately preverbal nouns is given in the bottom plot in Figure 5. A majority of speakers (1, 3, 4, 5, 6, 9, 11, 12) render the narrowly focused nouns with longer duration. Contrarily, four speakers (2, 7, 8, 10) make only minute changes in the duration of the target nouns in narrow focus. The analysis showed a significant interaction (χ² = 23.7, DF = 11, p = 0.01) between participants and focus type, in the use of proportionate duration at the immediately preverbal position.

As for intraspeaker variation, seven speakers (1, 3, 4, 6, 9, 11, 12) use longer noun duration to mark narrow focus at initial, medial, as well as immediately preverbal positions. On the other hand, two speakers (2, 10) never elongate the narrowly focused nouns at any linear position. The use of longer duration by other speakers (5, 7, 8) is not consistent, as they use it at one or the other position. This shows that individual speakers optionally use longer proportionate duration to mark narrow focus in Urdu.

3.6. Summary

Table 3 summarizes the overall use of phonetic cues by different speakers to mark narrow focus. It shows that, while speaker 6 used all the three target cues to mark narrow focus at initial, medial, and immediately preverbal positions, speaker 2 used none of these except wider F0 range only once. The variation observed in Table 3 mainly stems from the sparse use of higher F0 peak scaling and a wider F0 range at the sentence initial and medial positions. The use of longer proportionate duration, however, seems to be quite robust, as most of the participants used it at least once in a target focus position.

4. Discussion: Production Experiment

4.1. Phonetic Cues

The data presented above shows inter- and intraspeaker variation in the selection and manipulation of phonetic cues to distinguish between wide and narrow focus. This variability has been found in a homogenous group of Urdu speakers who share the same gender (female), age (19 to 29 years), and education level (university graduates). It is clear that the variation found in focus marking does not result from any of these factors. Moreover, individual speakers’ use of a given phonetic cue also varies when the focused noun is placed at different positions in the sentence. This indicates that Urdu speakers pick and choose among the available phonetic cues to mark narrow focus.

A reviewer suggested that the variable use of phonetic cues by Urdu speakers may be an indication of change in progress. Considering that the current study is the first of its kind in Urdu, it is not possible to draw a comparison to determine if this variability is new to Urdu or an established norm. It has also been argued that the multilingual background of my respondents results in an unstable use of phonetic cues. However, it is important to remember that most of the population in Pakistan/South Asia is multilingual and the intonation of many South Asian languages, analyzed hitherto, is similar in the use of “Repetitive Rising Contours” (Khan 2016). Considering this, the variable use of phonetic cues by Urdu speakers may not be attributed to multilingualism. Moreover, their linguistic background does not explain why a single speaker varied in their use of phonetic cues at different linear positions in the sentence. Importantly, individual variation in the use of phonetic cues to mark narrow focus has also been reported in Hindi (Kügler 2020), a language closely related to Urdu. With the help of a perception experiment, Kügler (2020) further showed that the use of greater postfocal compression (30 Hz) led to more successful identification of focus type as compared with smaller postfocal compression (10 Hz) or the complete absence of this cue. These findings not only support the existence of variation in the use of phonetic cues to mark focus, they also indicate that some strategies are communicatively more successful than the others. Finally, the variable use of phonetic cues has also been reported in British and American English (Kim 2019; Ouyang and Kaiser 2015; Peppé et al. 2000) as well as in German (Baumann et al. 2006; Cangemi et al. 2015). The presence of individual variation reported in these languages shows that this phenomenon is not specific to Urdu or Hindi. 
The findings of the current study support the argument that the variable use of phonetic cues should be seen as a norm and not as an anomaly.

4.2. Word Order

The results of the production experiment showed that the participants did not consistently place narrowly focused nouns at the immediately preverbal position and instead used any of the three preverbal positions available in a ditransitive sentence. Hence, the only preference, if any, is to avoid placing the narrowly focused nouns after the verb. No speaker based variation has been found in the use of a specific position to place the focused noun. Unlike the use of phonetic cues, participants have been unanimous in their lack of word order manipulation to mark narrow focus. It appears that, in this task, the preference for the canonical word order is stronger than the option for narrowly focused words to be placed at the immediately preverbal position. The lower preference for the latter makes sense, considering that word order is not the only means to convey narrow focus, as phonetic cues are also available to perform the job. Similar findings have been reported for Russian, another language with flexible word order (cf. Luchkina and Cole (2019)). Further discussion on this is offered in Section 8.1.

Importantly, caution should be observed while considering the results reported here. This data was collected in a laboratory setting and thus represents a type of formal, albeit semi-spontaneous, speech. It is possible that in informal communication in their everyday life, these participants do manipulate word order to mark narrow focus. Moreover, Butt and King (1996b)’s claims regarding the fixed focus position were based on their personal exposure to Urdu and Hindi and their observations might have been based on colloquial usage. Therefore, a corpus based study of word order variation in different contexts can shed better light on this phenomenon.

5. Identification of Focus Type and Position

The production experiment reported above illustrated that Urdu speakers vary in their selection of phonetic cues to mark narrow focus. Moreover, their manipulation of a given phonetic cue is not consistent either. It has been shown in Table 3, above, that speaker 6 is the only one to use longer noun duration, higher F0 peak scaling, and wider F0 range to mark narrow focus consistently. On the other hand, speaker 2 used none of these cues in the narrow focus context. Given the variation observed in the production data, it is pertinent to ask if Urdu listeners are sensitive to the presence/absence of these phonetic cues. Therefore, I ran an online survey in order to investigate the importance of phonetic cues, as produced by speakers 2 and 6 from the production experiment, to distinguish between wide and narrow focus and the position of focused nouns. The details of the survey are given below.

5.1. Methods: Focus Identification

5.1.1. Apparatus and Stimuli

As reported in Section 1.4 earlier, Cangemi et al. (2015) have shown that the data from a production experiment may be used to investigate the identification of focus types by listeners. Moreover, Jabeen and Braun (2018) have successfully used the results of their production data to design an experiment to investigate the perception of syllable duration in Urdu. Partially drawing on the methodology used in these studies, I used the data from the production experiment as stimuli for the identification task. I conducted an online survey using PsyToolkit (Stoet 2010, 2017). The survey was divided into three parts: The training phase, the test phase, and the collection of personal information. The training phase consisted of three sentences similar to the target focus conditions. The test phase contained a pseudo-randomized set of forty sentences (five sentences in each focus condition) produced by speakers 2 (using no phonetic cues to mark focus) and 6 (using all the phonetic cues investigated here to mark focus). In the section for personal information, participants were asked about their biological sex, age, mother tongue, and education level.
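
The size of the test set follows from two speakers, four focus conditions, and five sentences per condition. A minimal sketch of how such a list could be assembled is given below. The condition labels, the sentence indices, and the plain seeded shuffle are assumptions for illustration; the constraints behind the pseudo-randomization used in the survey are not specified here.

```python
import itertools
import random

SPEAKERS = (2, 6)  # speaker IDs from the production experiment
# Condition labels are assumed for illustration only.
FOCUS_CONDITIONS = ("wide", "narrow_initial", "narrow_medial", "narrow_preverbal")
SENTENCES_PER_CONDITION = 5

def build_test_list(seed=0):
    """Cross speakers, focus conditions, and sentences, then shuffle reproducibly."""
    trials = [
        {"speaker": speaker, "focus": focus, "sentence": sentence}
        for speaker, focus, sentence in itertools.product(
            SPEAKERS, FOCUS_CONDITIONS, range(1, SENTENCES_PER_CONDITION + 1)
        )
    ]
    # A simple seeded shuffle stands in for the survey's pseudo-randomization.
    random.Random(seed).shuffle(trials)
    return trials

test_list = build_test_list()
```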

The training and test phases of the survey consisted of multiple choice tasks. For every item in the survey, participants were asked to listen to the target sentence and identify which word in the given sentence was prominent. This was followed by a list of the words (three nouns and a verb) from the target sentence. The last option in the list allowed them to indicate that none of the words in the list was prominent. The order of options in the list remained the same for each participant throughout the experiment. The question and the relevant options were presented in the Perso-Arabic script. Figure 6 presents an example target sentence as displayed on screen in the survey. The first option in the figure is the subject (Sara), the second option is the object (Gardener), the third is the locative (Multan), and the fourth is the verb (Summon). The last item in the list is ‘None of these’ and was interpreted as wide focus in the analysis. The mean duration of the survey was seven minutes. The experimental set up blocked multiple participations from a given respondent.

5.1.2. Participants

A total of 127 respondents participated in the survey. However, not all of them responded to all the items. The minimum number of responses to an item in the test phase was 71, whereas the maximum number of responses for an item was 85. Regardless of the number of data points provided by them, all the participants were included in the analysis. Only sixty participants chose to provide their personal information. Among these, forty-seven were male. Eighty percent of these participants were aged between twenty-one and thirty-five years. The remaining participants were between thirty-six and fifty years.

5.1.3. Data Analysis

The analysis was carried out using R (R Core Team 2014). The author analyzed the identified focus type and position from the survey. Table 4 illustrates the assignment of correct vs. incorrect labels to participants’ identified focus type and position, with reference to the target focus type and position elicited in the production experiment. The data visualizations were produced by using ‘ggplot2’ (Wickham 2016). The results of the online survey are presented in what follows.
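
The labeling of responses as correct or incorrect can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Table 4: responses are assumed to be already coded by the linear position of the selected word, a ‘None of these’ response is read as wide focus as described for the survey, and a verb response never matches a target. All names and example trials are hypothetical.

```python
def is_correct(target_focus, response):
    """Compare an identified prominence with the elicited focus.

    target_focus: "wide", "initial", "medial", or "preverbal".
    response: "initial", "medial", "preverbal", "verb", or "none".
    """
    identified = "wide" if response == "none" else response
    return identified == target_focus

def percent_correct(trials):
    """trials: list of (target_focus, response) pairs; returns a percentage."""
    if not trials:
        raise ValueError("no trials to score")
    hits = sum(1 for target, response in trials if is_correct(target, response))
    return 100.0 * hits / len(trials)

# Hypothetical responses to four items.
example_trials = [
    ("wide", "none"),          # wide focus recognized
    ("wide", "preverbal"),     # prominence heard on the preverbal noun
    ("medial", "medial"),      # narrow focus and its position recognized
    ("initial", "verb"),       # verb chosen, no match
]
score = percent_correct(example_trials)
```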

5.2. Results: Focus Identification

The overall identification of focus type and position is presented in Table 5. The sentences produced by speaker 6, who consistently used longer noun duration, higher F0 peak, and wider F0 range to mark narrow focus, have a high frequency of correct identifications. By comparison, speaker 2’s lack of use of any of the above mentioned phonetic cues to mark narrow focus resulted in a very high percentage of incorrect identifications. This shows that, while the consistent use of phonetic cues does not guarantee the correct identification of focus type and position in Urdu, the lack of these cues does lead to incorrect identification.

Further detailed analysis showed that the percentage of the correct identification of sentences produced by speaker 6 was negatively affected by wide focus. When the sentences produced in wide focus were removed from the analysis, this improved the percentage of the correct identification of focus position in sentences produced by speaker 6, as presented in Table 6.

Figure 7 illustrates the details of the correct identification of focus type and position for each target speaker. It shows that the percentage of the correct identification of initial and medial narrow focus produced by speaker 2 is quite low. However, her data was correctly identified well above the chance level, in the context of immediately preverbal narrow focus.

The identification of focus type and position in sentences produced by speaker 6 presents a different picture. The survey respondents successfully identified around two-thirds of the sentences carrying narrow focus at medial and immediately preverbal positions but were slightly less successful with sentence initial narrow focus. The correct identification of sentences with wide focus was below the chance level, at only 15%. Further analysis showed that these sentences were frequently identified to carry prominence at immediately preverbal position (49%). Interestingly, the identification of immediately preverbal prominence in sentences with wide focus was not limited to the sentences with locative nouns produced with a rising contour. The null marked nouns, produced without an F0 rise of their own, were also identified to be prominent in this context.

5.3. Discussion: Focus Identification

The results of the survey offer an interesting insight into the use of phonetic cues to identify focus type and position in Urdu. The identification of sentences produced by speaker 2 clearly indicates that the lack of longer syllable duration, higher F0 peak scaling, and wider F0 range in narrow focus, as compared with wide focus, results in the incorrect identification of focus position. Moreover, the mostly correct identification of the narrowly focused sentences produced by speaker 6 shows that the presence of these phonetic cues facilitates but does not guarantee correct identification by Urdu speakers. Cangemi et al. (2015) have reported similar findings for German listeners, where the consistent use of prosodic cues resulted, at best, in 75% correct identification of focus type. I argue that, overall, the lack of phonetic cues plays a more important role in the incorrect identification, as compared with the consistent use of these cues for the correct identification, of focus type and position by Urdu speakers.

Importantly, this survey did not aim to investigate the range of the manipulation of phonetic cues that is perceptually relevant. As I have not manipulated the phonetic cues in the target sentences, it is not possible to determine exactly which cues help in the correct identification of focus type and position. However, as Jabeen and Braun (2018) have shown that Urdu speakers are sensitive to duration manipulation in other focus contexts, it is expected that the survey participants in the current study paid attention to duration differences observed in narrow and wide focus contexts. Moreover, the identification of prominence at the immediately preverbal position in wide focus shows that the perception of this structural prominence is based on linear position rather than the use of phonetic cues. The results of this survey allude to a complex interplay between the use of phonetic cues and linear position in the identification of prominence in Urdu. This interplay has also been observed in other languages, such as Russian (Luchkina and Cole 2019) as well as New Zealand English and Samoan (Calhoun et al. 2019).

6. Prosodic Phrasing

In this section, I report on the presence of upstepped F0 peaks and prefocal lengthening on the left edge and postfocal compression on the right edge of narrowly focused nouns from the production experiment. The use of these cues is discussed with reference to sentences produced in wide focus. I argue that the manipulation of F0 peak scaling and syllable duration leads to variable prosodic phrasing in sentences with narrow and wide focus. The detailed analysis of these cues is presented in what follows.

6.1. Prefocal Upstepping

During the acoustic analysis of the target sentences, it was observed that, while the narrowly focused nouns did not carry the highest F0 peaks in the sentence, the peaks immediately before the focused noun were upstepped. Figure 8 presents an example of the string identical sentences produced in wide and narrow focus. Notice that, in wide focus, the second F0 peak is scaled lower than the initial peak, with a difference of 1.2 semitones. However, when the same sentence is produced with narrow focus at the immediately preverbal noun nə.mɑz ‘prayer’, the medial F0 peak, on the left edge of the narrowly focused noun, is upstepped as compared with the sentence initial peak, with a difference of 2 semitones. Crucially, there is no pitch reset after the upstepped F0 peak, as the peak on the focused noun is scaled lower than the sentence initial peak. Further details of upstepping before narrowly focused nouns placed at sentence medial and immediately preverbal positions are presented below. The analysis for sentence initial nouns is lacking here, as these nouns do not have a preceding F0 peak.
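The semitone differences reported above express the ratio between two F0 peaks on a logarithmic scale. The following Python sketch illustrates the standard conversion (12 times the base-2 logarithm of the frequency ratio). It is not part of the original analysis: the function name and the peak values in Hz are hypothetical and were chosen only for illustration.

```python
import math


def semitone_difference(f0_a_hz, f0_b_hz):
    """Return the interval from f0_b_hz to f0_a_hz in semitones.

    Positive values mean that the first peak is scaled higher than the second.
    """
    if f0_a_hz <= 0 or f0_b_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_a_hz / f0_b_hz)


# Hypothetical peak heights in Hz (assumption, not measured data).
initial_peak_hz = 240.0
medial_peak_hz = 224.0

difference = semitone_difference(medial_peak_hz, initial_peak_hz)
label = "upstepped" if difference > 0 else "not upstepped"
print(f"medial vs. initial peak: {difference:.2f} semitones ({label})")
```

A positive difference between a prefocal peak and the preceding peak would correspond to the upstepping described above, whereas a negative difference corresponds to the lower scaling seen in the wide focus example.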

Figure 9 (top) presents the scaling of sentence initial F0 peaks when preceding the medial narrowly focused nouns and their variants produced in wide focus. It can be seen that speakers vary in their scaling of F0 peaks in the target contexts. Five speakers (1, 3, 6, 10, 12) upstepped the F0 peaks on the left edge of narrowly focused nouns, as compared with the peaks produced in wide focus. Most of the remaining speakers lowered the prefocal peaks, whereas speaker 8 shows no difference in her scaling of sentence initial F0 peaks in wide and narrow focus. This difference among speakers in the scaling of prefocal peaks in the target focus types led to a significant interaction (χ² = 36.3, DF = 11, p = 0.0001) between participants and focus type.

The scaling of sentence medial F0 peaks on the left edge of the target nouns placed at the immediately preverbal position is given in the bottom chart in Figure 9. Similar to the pattern observed for sentence medial focused nouns, speakers optionally use upstepped F0 peaks on the left edge of the narrowly focused nouns. Seven speakers (1, 3, 5, 6, 10, 11, 12) use higher F0 peak scaling before the narrowly focused nouns in comparison with wide focus. On the other hand, three participants (7, 8, 9) have lower F0 peaks on the left edge of focused nouns. The remaining two speakers (2, 4) do not differ in their scaling of F0 peaks in the target contexts. The speaker based variation in prefocal peak scaling showed significant interaction with focus type (χ² = 59.1, DF = 11, p < 0.0001).

As for intraspeaker variation in the scaling of an immediately prefocal F0 peak when compared with its variant produced in wide focus, five speakers (1, 3, 6, 10, and 12) consistently upstepped and two speakers (7, 9) consistently downstepped their prefocal F0 peaks. The remaining speakers were inconsistent in their scaling of F0 peaks.

6.2. Prefocal Lengthening

Lengthening of the syllable immediately preceding a prosodic phrase boundary has been found to cue prosodic phrasing in Urdu (Jabeen 2019c). In what follows, I present the analysis of prefocal syllable lengthening in sentences with medial and immediately preverbal narrow focus vs. wide focus. The analysis includes all the target sentences, irrespective of the grammatical role of a noun placed at initial, medial, or immediately preverbal position.

Figure 10 (top) shows the proportionate duration of the syllable on the left edge of sentence medial nouns² produced in narrow and wide focus. It shows that four speakers (1, 4, 5, 10) elongate the target syllable in narrow focus as compared with wide focus. The degree of elongation, however, varies between speakers, as the first speaker shows the greatest extent of elongation. Six participants (3, 6, 7, 8, 11, 12) do not manipulate the duration of the target syllable in the given focus contexts. The remaining two speakers shortened the target syllable in narrow focus, as compared with wide focus.

The prefocal syllable duration in sentences with immediately preverbal narrow focus and wide focus³ is presented in Figure 10 (bottom). Three speakers (1, 2, 6) elongated the prefocal syllable in sentences with an immediately preverbal focus as compared with wide focus, although the degree of elongation is not the same for all of them. Speaker 2 shows a bigger difference in proportionate syllable duration than speaker 6. The data from the rest of the speakers varies in this context, as some of them do not manipulate duration at all, whereas others shortened the prefocal syllable.

The analysis of intraspeaker variation in prefocal lengthening offers interesting insights. Only two speakers (1, 10) consistently use this cue at both medial and immediately preverbal positions and elongate the prefocal syllable in narrow focus as compared with the syllable in wide focus. The remaining speakers vary in their use of duration, as they either elongate the prefocal syllable or shorten it at different positions, in comparison with the variants produced in wide focus. Having said that, elongation is found more frequently before the narrowly focused nouns at the medial position than before the immediately preverbal focused nouns.

6.3. Postfocal Compression

The lowering of F0 after narrow focus has been reported to be an important cue to identify the position of narrow focus in Hindi (Kügler 2020). Given that Urdu and Hindi are very similar languages, it is pertinent to ask if Urdu speakers also use postfocal compression in their production of narrow focus.

A reviewer questioned the usefulness of measuring postfocal compression in sentences with medial narrow focus, as the F0 is scaled low in the immediately preverbal nouns regardless of their focus context. However, I found that the postfocal region was not always compressed and, sometimes, the noun phrases following narrow focus did carry a rising F0 contour, albeit with reduced F0 range. Hence my use of the term “compression” instead of the more widely used “deaccentuation” or “de-phrasing”. In this data, I found that 27% of the nouns immediately following the narrowly focused noun at the medial position were produced with a rising F0 contour. Comparatively, 48% of their counterparts in wide focus were produced with a rise. Therefore, I argue that the analysis of compression in this context is nontrivial.

The postfocal region in this dataset exhibited two trends: Participants either produced reduced rising contours following the narrowly focused noun or the F0 in the postfocal region was completely compressed. Figure 11 illustrates this difference. It shows that a string identical sentence with sentence initial narrow focus was produced with an F0 peak following the narrowly focused noun (Figure 11a) by one speaker, while another speaker compressed the postfocal region (Figure 11b).

For a sentence with initial narrow focus, I measured F0 scaling on the last syllable of NP2, as this is where an F0 peak occurs in the wide focus context. The scaling of F0 (in semitones) in the postfocal NP2 was compared with its counterpart produced in wide focus. Similar measurements were taken on NP3 to analyze postfocal compression after sentence medial narrow vs. wide focus. The immediately preverbal nouns were excluded from this analysis, as the verbs were always produced with falling F0 regardless of their focus context.

Figure 12 (top) presents the F0 scaling of the noun immediately following sentence initial narrow focus and its variant produced in wide focus. It shows that ten speakers use postfocal compression as indicated by the lower scaling of F0 in the narrow focus condition. Interestingly, speaker 6 does not lower the F0 in the postfocal region, whereas she had consistently used higher F0 peak, wider F0 range, and longer proportionate duration to distinguish narrowly focused nouns from their counterparts in wide focus. Speaker 12 also does not lower the F0 after initial narrow focus. The statistical analysis showed a significant interaction between participants and focus type with reference to postfocal compression after initial narrow focus (χ² = 21.3, DF = 11, p < 0.03).

The F0 scaling after target nouns at the sentence medial position is given in Figure 12 (bottom). Three speakers (6, 10, 12) failed to show postfocal compression in this context, while another three of them (2, 4, 8) do not vary F0 scaling in any context. The remaining six speakers use lower F0 scaling after the narrowly focused nouns. The analysis showed a significant interaction between focus type and participants in F0 scaling after sentence medial nouns (χ² = 25.6, DF = 11, p = 0.007).
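For readers who wish to relate the χ² statistics reported in this section to their p-values, the following Python sketch computes the upper-tail probability of a χ² distribution from the test statistic and the degrees of freedom. It is an illustration only and does not reproduce the statistical models used in this study: the function name is hypothetical, and the computation relies on a textbook series expansion of the regularized incomplete gamma function rather than on a statistics package.

```python
import math


def chi2_upper_tail(statistic, df, max_terms=1000):
    """Upper-tail probability P(X >= statistic) for a chi-square variable.

    Uses the series expansion of the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = statistic / 2:
        P(a, z) = z^a * exp(-z) / Gamma(a + 1) * sum_n z^n / ((a+1)...(a+n))
    """
    if statistic <= 0:
        return 1.0
    a = df / 2.0
    z = statistic / 2.0
    term = 1.0   # n = 0 term of the series
    total = 1.0
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    lower = math.exp(log_prefactor) * total
    return max(0.0, 1.0 - lower)


# Statistics and degrees of freedom as reported in the text above.
for statistic in (36.3, 59.1, 21.3, 25.6):
    print(f"chi-square = {statistic}, DF = 11, p = {chi2_upper_tail(statistic, 11):.2g}")
```

The plain series is adequate for the values reported here. For very large statistics, the subtraction from 1 loses precision and a dedicated statistics library would be preferable.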

The analysis of intraspeaker variation in postfocal compression reveals that it was used far more frequently after sentence initial as compared with medial focus. Moreover, speakers 6 and 12 never used this cue, whereas two speakers (8 and 10) used it only in the sentence initial narrow focus context.

6.4. Discussion: Prosodic Phrasing

The data presented above shows speaker based variation in the upstepping of high tone and proportionate syllable duration on the left edge of the narrowly focused words. As for postfocal compression on the right edge of focused nouns, it is used more frequently than the other two cues to prosodic phrasing. However, inter- and intraspeaker differences were found in the use of compression as well. This shows that Urdu speakers do not use these cues uniformly and consistently to mark prosodic phrasing. Furthermore, it indicates that narrow focus does not always lead to a change in prosodic phrasing as compared with the identical sentences produced in wide focus.

A pertinent question here is the level of phrasing in the prosodic hierarchy marked by the use of upstepping and lengthening on the left edge. In the current analysis, I follow Jabeen (2019c)’s proposal of two levels of phrasing above a prosodic word in Urdu: Accentual Phrase (AP) and Intonational Phrase (IP). I argue that the use of upstepped high tones and longer proportionate duration on the left edge of the focused noun indicates a recursive Intonational Phrase boundary. My proposed phrasing for a sentence with medial narrow focus is given in (8-a), whereas (8-b) shows that an identical sentence with wide focus is produced as a single IP.

(8)	a.	[ [Subject]_IP Object_Focus Object Verb]_IP	Medial narrow focus
	b.	[ Subject Object Object Verb]_IP	Wide focus

It could be argued that lengthening on the left edge of narrowly focused nouns is indicative of the left edge of an IP, as shown in (9). However, Jabeen (2019b, 2019c) has shown that lengthening is a cue to the right edge of an IP in Urdu. Hence, the alternative, illustrated in (9), is ruled out in the current discussion.

(9)	a.	[ Subject [Object_Focus Object Verb]_IP ]_IP

On the other hand, it is possible that upstepping and lengthening immediately preceding the focused nouns mark an Accentual Phrase boundary. Jabeen and Delais-Roussarie (2020) raised the same question regarding the realization of double rising contours (L1 H1 L2 H2) in Urdu noun phrases (nouns followed by case markers). Their analysis of the temporal alignment of these tones showed that they form one long Accentual Phrase. To my knowledge, there is no existing evidence for recursive APs in Urdu.

A reviewer argued that the upstepping and lengthening reported above could be cues for an Intermediate Phrase boundary. In the following section, I report the findings of an online survey regarding Urdu speakers’ preference for a recursive IP boundary vs. an Intermediate Phrase boundary on the left edge of narrowly focused nouns. The details of the survey are presented in what follows.

7. Identification of Prosodic Phrasing

This survey was conducted to investigate Urdu speakers’ perception of prosodic phrasing in sentences produced in narrow and wide focus. Overall, three possible options for prosodic phrasing are investigated here, as (10) shows.

(10)	a.	The sentences with wide or narrow focus form one IP and the use of narrow focus does not affect prosodic phrasing at the IP level.
	b.	There is a recursive Intonational Phrase boundary on the left edge of narrowly focused nouns.
	c.	There is an Intermediate Phrase boundary on the left edge of narrowly focused nouns.

The points under discussion here are the presence or absence of a prosodic phrase boundary on the immediate left of focused nouns and the nature of that boundary. In the current investigation, I do not rule out the possibility of a prosodic phrase boundary on the right edge of the focused words. However, in order to not overwhelm the respondents, I decided to limit this investigation to the left edge boundary only. I used an online survey to evaluate which of the options presented in (10) are relevant for Urdu. The survey is partially based on the methodology used by Kentner and Féry (2013) in their investigation of perceived prosodic phrasing in German nouns with different associations of coordination. They had used a subset of the data from their production experiment as stimuli for the perception of prosodic phrase boundaries and asked their participants to choose among the possible phrasing conditions given in a list. The details of this survey on prosodic phrasing are presented below.

7.1. Methods: Phrasing Identification

7.1.1. Apparatus

I designed this online survey using PsyToolkit (Stoet 2010, 2017). The overall format of this survey was similar to the one reported in Section 5, earlier. The survey consisted of three sections. The first section comprised the training part, where each participant was presented with four sentences similar in structure to the target sentences. The second section was the test phase and the third section solicited personal information such as participants’ sex, age, mother tongue, and education.

On the welcome screen, participants were introduced to the concept of prosodic phrasing with examples. They were shown an Urdu sentence with three possible ways of prosodic grouping, as shown in (11). They were informed that (11-a) is an example of one large phrase, that (11-b) shows a sentence where a phrase is embedded into a larger phrase of the same level, and that (11-c) shows a sentence with a smaller phrase embedded in a larger phrase. The example sentences were written in the Perso-Arabic script and the directions and explanations were also provided in Urdu.

(11)	a.	[Subject Object Noun Verb]	One Intonational Phrase
	b.	[ [Subject Object] Noun Verb]	IP embedded in IP
	c.	[ (Subject Object) Noun Verb]	ip embedded in IP

The participants were directed to listen to the upcoming sentences and decide the relevant grouping for the target sentences. On each page of the training and test phase, the respondents were presented with a single audio sentence and a question asking them to select the prosodic grouping for the target sentence. Every question was followed by five possibilities of prosodic phrasing. Below, (12) presents the options shown to the participants on the screen. (12-a) shows that the target sentence forms one IP and (12-b and c) illustrate the presence of a prosodic phrase boundary on the left edge of the narrowly focused noun at the sentence medial position. The relevant boundary in (12-b) is a recursive IP and in (12-c) it is an Intermediate Phrase. For sentences with an immediately preverbal narrow focus, (12-d and e) show a prosodic phrase boundary to the left of the focused noun: (12-d) offers an embedded IP and (12-e) presents an Intermediate Phrase at the left edge of the focused noun. As per the research question here, the alternatives presented in (12-b and c) are not relevant for these sentences, just as the options given in (12-d and e) are irrelevant for sentences with medial narrow focus. I decided to keep the set of alternatives for prosodic phrasing constant in order to make it easy for participants to process the task.

(12)	a.	[Subject Object Noun Verb]	One IP
	b.	[ [Subject] Object Noun Verb ]	Recursive IP
	c.	[ (Subject) Object Noun Verb ]	ip within IP
	d.	[ [Subject Object] Noun Verb ]	Recursive IP
	e.	[ (Subject Object) Noun Verb ]	ip within IP

Every time an item was displayed on the screen, the survey question was followed by a reminder that square brackets indicated a “larger” prosodic phrase, whereas parentheses referred to a “smaller” prosodic phrase. Figure 13 presents an example from the survey, as displayed to the participants.

7.1.2. Stimuli

As reported in Section 5, the sentences produced by speaker 6 were frequently identified correctly in terms of the position of narrow focus. Therefore, the stimuli in the current survey consisted of sentences with medial and immediately preverbal narrow focus as well as wide focus produced by this speaker⁴. The sentences with initial focus were eliminated, as my analysis of recursive IP phrasing can be verified only when narrow focus is placed at non-initial positions. One sentence with a wide focus and one with a medial narrow focus were not added to the survey, as they had been produced with noncanonical word order. Thus, the stimuli consisted of thirteen target sentences (medial narrow focus: 4, immediately preverbal narrow focus: 5, wide focus: 4).

7.1.3. Participants

Thirty-two respondents took part in this survey. However, they did not respond to all the items. The minimum number of responses for an item in the test phase was sixteen, whereas the maximum number was twenty. Only thirteen participants provided their personal information. Among them, nine were male. Twelve of these participants were aged between twenty-one and thirty-five years. The remaining one was under twenty years of age. All the participants had graduated with at least one university degree.

7.1.4. Data Analysis

The data was preprocessed using R. In order to analyze the log odds of different prosodic phrasings, I ran a Multinomial Logistic Regression model using the “nnet” package (Venables and Ripley 2002). The model used the position of the focused noun to predict the log odds of each prosodic phrasing. Wide focus was used as the reference level for comparison with narrowly focused nouns at medial and immediately preverbal positions. The p-value was calculated using the two-tailed Z test. The results of the analysis are presented in what follows.
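The two-tailed Z test mentioned above divides a model coefficient by its standard error and converts the resulting z value into a p-value under the standard normal distribution. The following Python sketch illustrates this step only. It is not the R code used in this study: the function name is hypothetical, and the coefficient and standard error in the example are invented for illustration.

```python
import math


def two_tailed_z_test(coefficient, standard_error):
    """Return (z, p) for a two-tailed Z test of a regression coefficient.

    The p-value equals 2 * (1 - Phi(|z|)), which can be written as
    erfc(|z| / sqrt(2)) using the complementary error function.
    """
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    z = coefficient / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical coefficient and standard error (assumption, not study data).
z_value, p_value = two_tailed_z_test(coefficient=0.9, standard_error=0.4)
print(f"z = {z_value:.2f}, two-tailed p = {p_value:.3f}")
```

In a multinomial model, this computation would be repeated for every coefficient, i.e., for each phrasing option compared against the reference phrasing at each focus position.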

7.2. Results: Phrasing Identification

The overall analysis showed that 48% of the participants perceived a prosodic phrase boundary on the left edge of focused nouns at the sentence medial as well as immediately preverbal positions. Table 7 presents a detailed look at respondents’ preferences and shows that there is considerable variation in their selection of prosodic phrasing.

The identification of prosodic phrasing differs on the basis of focus type and the position of focused nouns. Twenty-six percent of the sentences produced in wide focus were identified to constitute only one IP, as compared with sentences with narrow focus at medial (16%) and immediately preverbal (25%) positions. The sentences with medial narrow focus were most frequently identified to carry a recursive Intonational Phrase boundary on the left edge of the focused noun (25%). On the other hand, the preference for an Intermediate Phrase boundary on the left edge of the focused noun (23%) or a recursive IP boundary on its right edge (22%) is also above chance level. The analysis of immediately preverbal narrow focus does not clarify the situation either. The identification of only one IP or a recursive IP boundary, as well as an Intermediate Phrase boundary on the left edge of the focused noun, is well above the chance level. However, there is a slight preference for a recursive IP boundary on the left edge of narrowly focused nouns at both sentence medial and immediately preverbal positions. The results of multinomial regression showed that recursive IPs on the left edge of medial focused nouns were “preferred” significantly (1.41, p = 0.01), as compared with this phrasing in wide focus context. No other phrasing preference was found to be significant at any focus position. Therefore, one could only cautiously argue that participants used an upstepped F0 peak and/or longer syllable duration on the left edge of medial focused nouns to identify prosodic phrase boundaries.
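To help interpret these figures, the following Python sketch converts a log-odds value into an odds ratio and computes the chance level for a forced choice among several response options. It rests on two assumptions that are not stated explicitly in the text: that the reported value of 1.41 is a log-odds coefficient from the multinomial model, and that chance level corresponds to a uniform choice among the five options in (12). The function names are hypothetical.

```python
import math


def odds_ratio_from_log_odds(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)


def uniform_chance_level(number_of_options):
    """Proportion expected if respondents chose uniformly at random."""
    if number_of_options < 1:
        raise ValueError("at least one response option is required")
    return 1.0 / number_of_options


# 1.41 is the value reported in the text; treating it as log odds is an assumption.
ratio = odds_ratio_from_log_odds(1.41)
# Five options as in (12); equating chance with a uniform choice is an assumption.
chance = uniform_chance_level(5)
print(f"odds ratio = {ratio:.2f}, chance level = {chance:.0%}")
```

Under these assumptions, the odds of choosing a recursive IP on the left edge of the medial noun, relative to the reference phrasing, would be roughly four times higher in medial narrow focus than in wide focus.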

Although there is no overwhelming preference for one type and level of prosodic phrasing in any focus condition here, sentences with immediately preverbal narrow focus resemble more closely the sentences with wide focus, as compared with their counterparts with medial narrow focus. The sentences with wide, as well as immediately preverbal narrow, focus have been identified to carry a recursive IP boundary on the left edge of the immediately preverbal nouns with almost similar frequency. The participants’ preference for only one IP or an Intermediate Phrase boundary on the left edge of the immediately preverbal nouns is also similar in these two contexts. Recall that in Section 5.2, earlier, it was reported that 49% of the wide focus sentences produced by speaker 6 were identified to carry prominence at the immediately preverbal position. The combined findings of the two surveys on the identification of focus position and prosodic phrasing show that the immediately preverbal position is structurally salient for Urdu speakers, as their perception of prominence at this position in narrow focus is similar to the counterparts in a wide focus context. These results partially uphold the claims of Butt and King (1996b) regarding the structurally fixed focus position. It could be argued that Butt and King (1996b)’s analysis regarding the immediately preverbal focus position in Urdu may have been influenced by this structural salience. Moreover, the findings of Luchkina and Cole (2019) and Calhoun et al. (2019), discussed in the next section, attest that structural vs. prosodic prominence plays an important role in other typologically different languages as well.

Noticeably, the respondents of this survey perceived a prosodic phrase boundary on the left of immediately preverbal nouns in every focus context, as shown in the red box in Table 7. Moreover, this is the only position where the effect of focus type and position is overcome, as the participants agree in their perception of a prosodic phrase boundary here in medial and immediately preverbal narrow focus as well as in wide focus. The perception of a phrase boundary before the immediately preverbal nouns was found in sentences with locative nouns (produced with a rising contour) as well as with null marked nouns (without an F0 rise of their own) placed at this position. Having said that, the perception of salience by position does not override the identification of prominence based on the use of phonetic cues. This presents a complex interplay of prominence perception in Urdu, as prominence is influenced both by linear position and the presence of phonetic cues. This data shows that, while the narrowly focused words may not be placed at the immediately preverbal position, the words that are placed there are perceived to be prominent.

Finally, the inconclusive findings of this survey may be attributed to the fact that the stimuli for narrow focus condition did not contain postfocal compression, a significant cue for the identification of the position of narrow focus in Hindi (Kügler 2020). The stimuli for this survey on prosodic phrasing were taken from the production experiment reported earlier in the current study. I had selected a participant who used all the phonetic cues (higher F0 peak scaling, wider F0 range, and longer proportionate duration) consistently at all the target positions (initial, medial, immediately preverbal). Moreover, she had used two of the cues to prosodic phrasing in Urdu, i.e., the lengthening and upstepping of a high tone on the left edge of the narrowly focused nouns. Granted, the absence of postfocal compression might have been an important deterrent and led to the lack of preference for a particular prosodic phrasing in any of the narrow focus contexts. However, this does not explain why there is no clear preference for prosodic phrasing for the sentences produced in a wide focus context.

8. General Discussion

8.1. Word Order and Intonation

The data presented in this study allude to a strong link between the use of phonetic cues and the structural focus position for the perception of prominence in Urdu. This interplay has also been observed in crosslinguistic comparisons with typologically distinct languages. One such example is offered by Luchkina and Cole (2019)’s analysis of Russian, a language with word order variation similar to Urdu. In Russian, constituents with new information focus are placed postverbally, as compared with the immediately preverbal position in Urdu. Despite the difference in preference for the structural focus position, both languages display remarkable similarities in the variable use of phonetic cues to mark information structure. Luchkina and Cole (2019) reported that they found considerable overlap in the use of phonetic cues to mark discourse new and given information in Russian. The differences in the use of duration, intensity, F0 max, and F0 range to mark new and given information were subtle and the effect sizes for significant difference in each of these cues were small. Moreover, the authors found interspeaker variation in the selection of phonetic cues as well as in the extent of their use, as some speakers exploited a given cue to a larger extent than the others. The preference for the canonical word order and optional use of word order manipulation, the variable use of phonetic cues, and the use of structural and prosodic prominence in Russian are very similar to the findings reported for Urdu in the current study.

Similarly, Calhoun et al. (2019)’s analysis of prominence perception in Samoan and English provides another example of the effect of linear position on perceived prominence from two very different language families. Their perception experiments showed that, both in Samoan and English, the perception of prominence was strongly affected by the position of words expected to carry a nuclear accent. Moreover, Calhoun et al. (2019) found that, when there was a clash between prominence marked by structural position and prosodic cues, English as well as Samoan speakers chose in favor of the structurally induced prominence. A similar trend has also been reported for Urdu, in Figure 7 presented in Section 5.2, earlier. This tendency in Urdu, however, is weak, as 66% of the narrowly focused words at the immediately preverbal position were perceived to be prominent, as compared with the sentence initial (58%) and medial focused nouns (64%). This shows that the alignment of phonetic cues with the structurally prominent immediately preverbal position only slightly improves the perception of prominence for Urdu speakers. Moreover, unlike the findings in English and Samoan reported by Calhoun et al. (2019), Urdu speaking participants preferred to use phonetic cues to identify prominence, when presented with a mismatch between phonetic and positional prominence. The findings reported in Calhoun et al. (2019) do not align perfectly with the patterns observed in Urdu. However, a comparison of these results shows that a given phenomenon, i.e., the use of structural position aligned with the target phonetic cues to perceive prominence, manifests itself to a different extent across languages.

8.2. Syntax–Phonology Interface

Over the years, there has been extensive research investigating the mapping between syntactic structures and prosodic categories. In her “end-based theory”, Selkirk (1986) proposed an X^Max parameter dictating that the edges of syntactic constituents align with the edges of prosodic constituents. This parameter leads to a language specific preference for marking either the right or the left edge of constituents. This edge-based theory of prosody–syntax mapping was later modified to propose that the right and left edges of syntactic and prosodic constituents match with each other (Selkirk 2011). In the Match Theory, however, syntactic and prosodic phrases are not always isomorphic and the theory tolerates non-exhaustive parsing and the recursive phrasing of prosodic domains. This allows prosodic domains to be somewhat free of syntactic structures and makes room for phonological processes to determine the boundaries of prosodic domains.

Prosodic analyses incorporating embedded prosodic phrases have been frowned upon, as earlier versions of alignment theories claimed that each new level in the prosodic hierarchy equates to a new level of phrasing. However, Ladd (1986), among others, showed that a prosodic analysis based on recursive phrasing is not only feasible but also desirable in certain contexts. Furthermore, Ito and Mester (2012) have developed a detailed analysis espousing the superiority of recursive phrasing based analysis using maximal and minimal relations. In their analysis of prosodic phrasing in Japanese, they argued that some phenomena, such as accent culminativity, downstep, and initial rise, can be better explained by allowing the embedding of minimal phonological phrases into maximal phonological phrases. They further claimed that this relational account offers a better alternative to proposing distinct levels of phrasing. Importantly, Ito and Mester (2012)’s analysis did not argue against syntax–phonology mapping. Instead, their analysis allowed structure without having to propose new categories to explain phonological phenomena. Keeping this in mind, in the following subsections, I explain the prosodic organization of the data reported in this study.

8.3. Position and (Non-)Specificity

As shown by the results of both the online surveys presented above, the nouns placed at the immediately preverbal position are perceived to carry a prosodic phrase boundary on their left edge, regardless of focus type and position. This can be explained with reference to the syntactic phrasing of null marked nouns and verbs in Urdu. Butt and King (1996a) have reported that the null marked nouns at the immediately preverbal position are nonspecific and are semantically and syntactically incorporated with the following verb. Thus, the constraint dictating the matching of syntactic and prosodic phrasing (Selkirk 1984, 2011) explains, and indeed predicts, a phrase boundary on the left edge of the null marked nouns at the immediately preverbal position. However, this constraint and nonspecificity do not explain the existence of a phrase boundary on the left edge of a locative noun placed at the immediately preverbal position. As per Butt and King (1996a)’s analysis, locative nouns, and all case marked nouns, have the same scrambling opportunity and specificity structure as accusative nouns. The fact that Urdu speakers perceived a boundary on the left edge of immediately preverbal null marked as well as locative nouns shows that the perception of this phrase boundary is an effect of position in general and not the incorporated status of null marked nouns only.

8.4. Prosodic Phrasing

In this subsection, I discuss three possible analyses of prosodic phrasing in the narrow focus context in Urdu. The realization of upstepped F0 peaks and prefocal lengthening is used to propose that there is a recursive Intonational Phrase boundary on the left edge of narrowly focused nouns. I also discuss the alternative analyses of these boundaries, as marking edges of Accentual Phrases or Intermediate Phrases.

8.4.1. Accentual Phrase Boundary

Below, (13) illustrates the phrasing structure identified in the online survey on the perception of phrase boundaries in Urdu. As the phonological nature of this phrase boundary is as yet undecided, I use ‘|’ here to mark its position. Below, (13-a) shows that the participants perceived a phrase boundary on the left of the focused noun as well as the immediately preverbal noun, while (13-b) carries only one embedded phrase boundary, as the edges of the narrowly focused and immediately preverbal nouns are aligned due to the position of the focused word. The perception of a phrase boundary on the left of NP3 in (13-c) confirms the existence of a boundary induced by the structural position. A possible analysis is that both the focus induced and structural boundaries result in Accentual Phrases. However, the very low preference for a phrase boundary after NP1 in (13-b and c) confirms that the focus and position induced boundaries are different in nature from the AP boundaries proposed by Jabeen and Delais-Roussarie (2020) in their analysis of rising F0 contours in Urdu.

(13)	a.	[ NP1 | NP2_Focus | NP3 Verb ]	Medial narrow focus
	b.	[ NP1 NP2 | NP3_Focus Verb ]	Im. preverbal narrow focus
	c.	[ NP1 NP2 | NP3 Verb ]	Wide focus

Following Jabeen and Delais-Roussarie (2019, 2020), I analyze every rising F0 contour in the target sentences as an AP and the low and high tones in a rising contour mark the left and right edges of the AP (Lp Hp), respectively. Thus, both the sentences shown in Figure 8 presented in Section 6.1 earlier contain APs at the sentence initial and medial positions. The narrowly focused noun at the immediately preverbal position is produced as an AP on its own, as indicated by the rising F0 contour. Its counterpart produced in wide focus, however, forms an AP together with the verb, as shown by the lack of a rising contour on the immediately preverbal noun in this context. In addition, (14) shows the difference in the AP structures of sentences presented in Figure 8. Importantly, there is no difference in the AP phrasing of sentences with initial or medial narrow focus vs. wide focus, as the sentence initial and medial noun phrases are always produced with a rising F0 contour. Thus, it is only at the immediately preverbal position in a ditransitive sentence that the focus induced difference in AP phrasing is captured.

(14)	a.	(NP1)_AP (NP2)_AP (NP3 Verb)_AP	Wide focus
	b.	(NP1)_AP (NP2)_AP (NP3)_AP (Verb)_AP	Im. preverbal narrow focus
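The mapping from rising contours to the AP structures in (14) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not part of the analysis scripts used in this study. The function name `parse_aps` and the boolean rise flags are hypothetical, and the sketch assumes that any sentence final material without a rise is closed off as an AP of its own. It does not model dephrasing.

```python
from typing import List, Tuple


def parse_aps(words: List[Tuple[str, bool]]) -> str:
    """Group words into Accentual Phrases (APs).

    Each item is (label, has_rise). A word produced with a rising F0
    contour closes an AP. A word without a rise is grouped with the
    material that follows it (assumption made for this sketch).
    """
    phrases: List[List[str]] = []
    pending: List[str] = []
    for label, has_rise in words:
        pending.append(label)
        if has_rise:
            phrases.append(pending)
            pending = []
    if pending:
        # Sentence final material without a rise still forms an AP.
        phrases.append(pending)
    return " ".join("(" + " ".join(p) + ")_AP" for p in phrases)


# Wide focus: no rise on the immediately preverbal noun (NP3).
wide = [("NP1", True), ("NP2", True), ("NP3", False), ("Verb", False)]
# Narrow focus on NP3: the focused noun carries its own rise.
narrow = [("NP1", True), ("NP2", True), ("NP3", True), ("Verb", False)]

print(parse_aps(wide))    # (NP1)_AP (NP2)_AP (NP3 Verb)_AP
print(parse_aps(narrow))  # (NP1)_AP (NP2)_AP (NP3)_AP (Verb)_AP
```

The two calls reproduce (14-a) and (14-b), respectively. The only difference between the inputs is the rise flag on NP3, which mirrors the claim that the focus induced difference in AP phrasing is captured at the immediately preverbal position only.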

The Accentual Phrase structure presented in (14) does not capture and explain the presence of upstepped F0 peaks and syllable lengthening on the left edge of focused nouns. Moreover, the results of the survey on prosodic phrasing showed that the participants did identify a phrase boundary on the left edge of focused nouns. It is clear from the discussion above that this cannot be an AP boundary.

8.4.2. Recursive IP Boundary

In her discussion of Intonational Phrases in Urdu, Jabeen (2019c) has shown that the consecutive rising contours marking APs within an IP are downstepped. Similar findings have been reported for other SA languages, such as Hindi (Patil et al. 2008), Bengali (Khan 2014), and Tamil (Keane 2014). In fact, Khan (2016, 2018) analyses them as an areal feature of South Asian languages that have been analyzed so far. Jabeen (2019c) has also reported the presence of lengthening before an IP boundary. Furthermore, Jabeen (2019b) found evidence for recursive IPs, as indicated by the presence of upstepped F0 peaks, preboundary lengthening, and nonmodal voice quality, in the production of numeric sequences in Urdu. Considering the presence of preboundary lengthening and upstepped F0 peaks on the left edge of the narrowly focused nouns, I propose the existence of a recursive IP boundary at that position. This argument is further supported by the fact that not a single instance of upstepped F0 peaks was found in sentences produced in wide focus in the production experiment reported in this study. The nonterminal nature of this IP boundary is indicated by the lack of pitch reset after the upstep, as the F0 peaks on the medial and immediately preverbal focused nouns are scaled lower than the sentence initial peaks. Below, (15) illustrates my analysis of Intonational Phrase boundary alignment in sentences with a narrow focus at the immediately preverbal position and wide focus in Urdu. This analysis is concerned with focus related phrasing only. The phrase boundary induced by the structural position is discussed, later, in Section 8.4.4.

(15)	a.	[ NP1 NP2 NP3 Verb ]_IP	Wide focus
	b.	[ [NP1 NP2 ]_IP NP3_Focus Verb ]_IP	Im. preverbal narrow focus
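The reasoning behind (15) can be sketched as follows. This is a hedged illustration in Python, not the procedure used in this study. It relies on the standard conversion of a frequency ratio to semitones, and the function names, the zero threshold separating upstep from downstep, and the representation of a sentence as a list of labels are all assumptions made for the example.

```python
import math
from typing import List, Optional


def semitone_step(prev_peak_hz: float, next_peak_hz: float) -> float:
    """Distance in semitones from one F0 peak to the next."""
    return 12.0 * math.log2(next_peak_hz / prev_peak_hz)


def classify_step(prev_peak_hz: float, next_peak_hz: float) -> str:
    """Label the scaling of a peak relative to the preceding peak."""
    step = semitone_step(prev_peak_hz, next_peak_hz)
    if step > 0:
        return "upstep"
    if step < 0:
        return "downstep"
    return "level"


def bracket_ips(words: List[str],
                focus_index: Optional[int] = None,
                prefocal_upstep: bool = False) -> str:
    """Return a (15)-style bracketing.

    An embedded IP closes at the left edge of a noninitial focused
    noun only when a prefocal upstep was observed.
    """
    labels = list(words)
    if focus_index is not None:
        labels[focus_index] += "_Focus"
    if focus_index and prefocal_upstep:
        inner = "[" + " ".join(labels[:focus_index]) + " ]_IP"
        return "[ " + inner + " " + " ".join(labels[focus_index:]) + " ]_IP"
    return "[ " + " ".join(labels) + " ]_IP"


sentence = ["NP1", "NP2", "NP3", "Verb"]
print(bracket_ips(sentence))
# [ NP1 NP2 NP3 Verb ]_IP
print(bracket_ips(sentence, focus_index=2, prefocal_upstep=True))
# [ [NP1 NP2 ]_IP NP3_Focus Verb ]_IP
```

In this sketch, a speaker who does not upstep the prefocal peak yields the same bracketing for narrow and wide focus, as no embedded IP is posited without the cue.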

Importantly, the phrasing proposed in (15) does not alter the analysis of Accentual Phrases, as illustrated in (14), and allows the focused as well as prefocal noun phrases to be parsed as APs, similar to their counterparts produced in wide focus.

8.4.3. Intermediate Phrase Boundary

As mentioned by a reviewer, the upstepping of F0 peaks and lengthening on the left edge of narrowly focused nouns may also be analyzed as markers of an Intermediate Phrase boundary. They proposed that, in wide focus, the Intermediate Phrase boundary aligns with the sentence final IP boundary but the former surfaces sentence medially in a case of nonfinal narrow focus. In that case, they argued, upstep is a cue to sentence medial Intermediate Phrase boundary. However, I claim that the support for this analysis is weakened by (1) the results of the survey on prosodic phrasing, (2) the use of phonetic cues found in the production experiment, as well as by (3) the existing literature on the analysis of Intonational Phrase boundaries in Urdu.

In the survey on the perception of prosodic phrasing, participants showed a (slight) preference for a recursive IP boundary as compared with an Intermediate Phrase boundary on the left edge of narrowly focused nouns. Although the survey results do not offer conclusive evidence in favor of either Intermediate Phrase or a recursive IP boundary, the fact remains that a recursive IP boundary was preferred by the participants.

In her proposal of a prosodic hierarchy in Urdu, Jabeen (2019c) argued against the presence of an Intermediate Phrase as she did not find any independent evidence or a phonological process to delimit this level of phrasing. That remains the case in my analysis of narrow and wide focus as well. Jabeen (2019c) has shown that preboundary lengthening is a cue to an Intonational Phrase boundary in Urdu. One needs to choose an arbitrary cut-off point for syllable elongation in order to use the same cue to propose an Intermediate Phrase boundary on the left edge of narrowly focused nouns. Moreover, the discussion of upstepped high tones in polar questions (Jabeen 2020) presented in Section 1.3.2 (Recursive IPs in Urdu) has already shown that analyzing these tones as marking Intermediate Phrase boundaries results in unlikely structures that are not supported by empirical data. Furthermore, high tones are used to mark Intonational Phrase boundaries as well as AP boundaries in Urdu. In order to use the same cue to propose an ip boundary, one needs to offer randomly chosen ranges for the scaling of high tones, to mark three different types of prosodic phrase boundaries. Finally, the superiority of recursive phrasing over proposing multiple levels of prosodic phrasing has been convincingly advocated by Ito and Mester (2012) in their analysis of Japanese intonation. Importantly, I do not rule out the possibility of an Intermediate Phrase boundary in Urdu and it is possible that future research can help make an empirical decision regarding this level of phrasing in the prosodic hierarchy of Urdu. However, there is no cue, phonological or prosodic, that distinguishes the left edge of focused nouns as Intermediate Phrase boundaries.

8.4.4. Structural Phrase Boundary

Below, (16) illustrates the phrase boundaries perceived by Urdu speakers in sentences with medial and immediately preverbal narrow focus as well as wide focus. The phonetic realization of a focus induced boundary at the left edge of sentence medial and immediately preverbal nouns differs from the realization of a position related boundary. The former is marked by upstepped F0 peaks, while the latter is not. The use of preboundary lengthening is not applicable in this comparison, as elongation in the narrow focus context is measured with reference to the baseline provided by the wide focus context. The data presented in the current study shows that the structurally induced prosodic phrase boundary is not marked by using any of the phonetic cues analyzed here. In fact, the participants perceived a phrase boundary at this position, notwithstanding the production of the immediately preverbal noun with or without a rising F0 contour.

(16)	a.	[ [NP1 ]_IP NP2_Focus | NP3 Verb ]_IP	Medial narrow focus
	b.	[ [NP1 NP2 ]_IP NP3_Focus Verb ]_IP	Im. preverbal narrow focus
	c.	[ NP1 NP2 | NP3 Verb ]_IP	Wide focus

In the last subsection, I argued against the presence of Intermediate Phrase boundaries in Urdu, as this analysis requires setting up arbitrary cut-off points in the use of preboundary lengthening and the scaling of upstepped F0 peaks so that the same cues may be used to identify sentence final IP boundaries as well as sentence medial ip boundaries. The arguments against proposing an Intermediate Phrase boundary in Urdu are also relevant for the analysis of a phrase boundary induced by structural position. Moreover, there are no phonetic and phonological cues to distinguish the phrase boundary before the immediately preverbal noun in (16-c), from the Accentual Phrase boundaries before the medial noun (NP2) in the same sentence produced in wide focus. One possibility is to analyze the structurally induced phrase boundary perceived in wide focus as an AP boundary. However, the fact remains that the perception of a phrase boundary on the left edge of medial nouns (NP2) produced in wide focus was very low. This indicates that the boundary before immediately preverbal nouns is perceptually more salient for Urdu speakers than the other AP boundaries in the sentence. Based on this, I propose to analyze the structurally induced phrase boundary as a recursive IP boundary, as shown in (17). Admittedly, this is not an empirical but an analytical decision that allows a uniform interpretation of embedded phrasing in Urdu, analogous to the recursive IP boundary on the left edge of narrowly focused nouns at the sentence medial position.

(17)	[ [NP1 NP2 ]_IP NP3 Verb ]_IP	Wide focus

8.5. Variation in Prosodic Phrasing

As reported in Section 6, not all speakers upstepped the F0 peaks and elongated the syllables on the left edge of narrowly focused nouns. According to my analysis, failure to do so results in the lack of difference in the prosodic phrasing of sentences with narrow and wide focus. Similarly, when a noun is produced with a rising contour in both wide and narrow focus contexts, there is no difference in their AP structure as far as the target noun is concerned. However, the AP structure of sentences produced in wide and narrow focus may differ in the postfocal region. Below, (18-a) shows the AP structure of a sentence with initial narrow focus followed by postfocal compression, as compared with the Accentual Phrase structure of a sentence produced in wide focus given in (18-b). Importantly, postfocal compression may not always lead to dephrasing, as APs may still be there, albeit produced with a compressed F0 register.

(18)	a.	(NP1)_AP NP2 NP3 Verb	Initial narrow focus
	b.	(NP1)_AP (NP2)_AP (NP3 Verb)_AP	Wide focus

The analysis offered here shows that prosodic phrasing by Urdu speakers is not uniform and speakers’ variable use of phonetic cues results in variation in prosodic phrasing as well. This is further supported by the differences found in the perception of prosodic phrasing structure in the online survey. The existence of speaker related variation in the use and manipulation of phonetic cues has also been reported in Hindi (Kügler 2020), Tamil (Keane 2014), German (Baumann et al. 2006; Cangemi et al. 2015; Schuppler and Ludusan 2020), American English (Kim 2019; Ouyang and Kaiser 2015), British English (Peppé et al. 2000), Russian (Luchkina and Cole 2019), as well as New Zealand English and Samoan (Calhoun et al. 2019). These are typologically different languages with distinct intonation systems. The comparable variation in the use of phonetic cues and consequent prosodic phrasing in Urdu provides further evidence that intonation is variable.

9. Conclusions

The results reported here show that Urdu speakers use gradient (F0 peak scaling, F0 range, proportionate duration) and categorical (presence/absence of embedded IP boundary at the left edge) cues to mark narrow focus. Based on the findings presented in this study, I argue that the use of phonetic cues should be seen as a variable phenomenon. If this were put on a continuum where one end represents the use of every phonetic cue at all the positions (speaker 6) and the other end represents the complete absence of these cues (speaker 2), most of the Urdu speakers fall between the two poles of the continuum. Therefore, both speakers 2 and 6 are anomalies in this regard and the mix and match strategy exhibited by the remaining speakers appears to be the norm in the realization of wide and narrow focus in Urdu. This strategy affords the speakers a parsimonious way to signal focus type and position, without sacrificing the intended meaning.

The findings of this study also indicate Urdu speakers’ preference for the in situ placement of narrowly focused nouns. Moreover, the results of the surveys on the identification of focus type and prosodic phrasing show a complex interplay between the immediately preverbal position and the use of phonetic cues in the identification of prominence. These findings broaden our understanding of the use of phonetic cues and illustrate that not all the cues need to be used simultaneously to convey the intended information status. Moreover, as prosodic phrasing is cued by the use of relevant phonetic cues, the presence/absence of the latter affects the prosodic phrase structure of the target sentences. The current study also underlines the need to rigorously investigate individual variation and how speakers manipulate phonetic cues to achieve their communicative goals. Furthermore, it contributes to the emerging understanding of structural and prosodic prominence observed in languages with flexible word order.

Funding

The data collection for the production experiment reported in this study was funded by the DFG project FOR2111 “Questions at the Interfaces” (“Information Structure and Questions in Urdu/Hindi”, P4), Grant number BU 1806/9-2.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of the University of Konstanz as part of the DFG funded project “Questions at the Interfaces”, Grant number BU 1806/9-2.

Informed Consent Statement

Written informed consent was obtained from all participants involved in the study. The participants were informed that they could exit the experiments at any stage.

Data Availability Statement

The data from the production experiment and the two online surveys, as well as the R Markdown files with the code used to analyze data, may be retrieved here: https://osf.io/54jqw/ (accessed on 8 March 2022).

Acknowledgments

Many thanks go to the anonymous reviewers and the guest editors for their in depth feedback. Very many thanks especially to Frank Kügler for his close editing and detailed comments. I would also like to thank Petra Wagner for her valuable comments on the first draft of this article. The errors, of course, remain mine.

Conflicts of Interest

The data for the production experiment presented here was collected as part of my PhD work and partially analyzed and reported in my dissertation. The research was conducted in the absence of any financial or commercial relationship that could be considered as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IP	Intonational Phrase
ip	Intermediate Phrase
AP	Accentual Phrase
SA	South Asian

Appendix A. Data Set

a.	mo.nɑ=ne no.mi=ko nə.mɑ:z pəɽ.hɑi
	Mona=Erg Nomi=Acc prayer.Nom.F.Sg read.Perf.F.Sg
	‘Mona had Nomi say a/the prayer’.
b.	nɛ.nɑ=ne lɛ.lɑ=ko ʧə.nɑ:b=mẽ pʰɛ̃.kɑ
	Naina=Erg Laila=Acc Chenab=in throw.Perf.M.Sg
	‘Naina threw Laila in (the river) Chenab’.
c.	sɑ.rɑ=ne mɑ.li=ko mʊl.t̪ɑ:n bʊ.lɑ.jɑ
	Sara=Erg gardener.M.Sg=Acc Multan call.Perf.M.Sg
	‘Sara summoned a/the gardener to (the city of) Multan’.
d.	lɛ.lɑ=ne ni:.ləm=se kə.bɑ:b mã:.gɑ
	Laila=Erg Neelam=Obl kebab.Nom.M ask.Perf.M.Sg
	‘Laila asked Neelam for a kebab’.
e.	lɛ.lɑ=ne i.mɑm=se ʊ.d̪ʰɑ:r li.jɑ
	Laila=Erg Imam=Obl loan.Nom.M.Sg take.Perf.M.Sg
	‘Laila borrowed from Imam’.

Appendix B. Statistical Model

Below, (19-a) illustrates the Linear Mixed Effects Regression models run to analyze the interaction between participants and focus type in the production of a target dependent variable (DV). These variables included the scaling of F0 peaks in the immediately prefocal and focused nouns, F0 range of target nouns, proportionate noun duration, and postfocal compression. Furthermore, (19-b) was used for ANOVA.

(19)	a.	Model = lmer(DV ~ focus_type * participants + (1 | item), data)
	b.	Anova(Model, type = "III")
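As an illustration of how one of the dependent variables entering (19-a) may be prepared, the following sketch computes proportionate noun duration (noun duration divided by sentence duration) and arranges it in one row per token. It is written in Python for illustration only, whereas the analysis reported here was run in R. The class `Token`, its field names, and the duration values are hypothetical, and the column names simply mirror the terms of the formula in (19-a).

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Token:
    """One recorded target noun (all fields are hypothetical)."""
    participant: str
    item: str
    focus_type: str  # "narrow" or "wide"
    noun_duration_s: float
    sentence_duration_s: float


def proportionate_duration(token: Token) -> float:
    """Noun duration as a proportion of the whole sentence duration."""
    if token.sentence_duration_s <= 0:
        raise ValueError("sentence duration must be positive")
    return token.noun_duration_s / token.sentence_duration_s


def to_model_row(token: Token) -> Dict[str, Union[str, float]]:
    """One long-format row with the columns named in (19-a)."""
    return {
        "participants": token.participant,
        "item": token.item,
        "focus_type": token.focus_type,
        "DV": proportionate_duration(token),
    }


# Invented values, used only to show the shape of the data.
example = Token("speaker_1", "a", "narrow", 0.5, 2.0)
print(to_model_row(example))
```

Normalizing by sentence duration keeps the measure comparable across speakers with different speech rates, which is relevant when participants enter the model as a fixed factor.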

Notes

1	The discussion of focus realization in the following subsections is limited to studies investigating the difference between wide and narrow focus. For analyses of other focus types, see Choudhury (2015) for Bengali, Puri (2013) and Genzel and Kügler (2010) for Hindi.
2	This is the last syllable of the sentence initial noun phrase.
3	Last syllable of the sentence medial noun phrase.
4	As a reviewer pointed out, speaker 6 exhibited upstepping and elongation on the left edge of narrowly focused nouns but did not use postfocal compression on the right edge. However, the results of the identification survey showed a high rate of correct identification of focus position (63%) for the sentences produced by her. Speaker 1 would have been a likely candidate to provide data for this survey as she used prefocal upstepping and elongation as well as postfocal compression. However, there is no data available for focus identification for this speaker. Therefore, I decided to make a trade-off and to use data from speaker 6 who used all the phonetic cues consistently to mark narrow focus as well as two of the three phrasing-related cues.

References

Baayen, Harold, Doug J. Davidson, and Douglas M. Bates. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59: 390–412.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48.
Baumann, Stefan, Martine Grice, and Susanne Steindamm. 2006. Prosodic marking of focus domains–Categorical or gradient? Paper presented at Speech Prosody, Dresden, Germany, May 2–5.
Bhatt, Rajesh, and Veneeta Dayal. 2007. Rightward scrambling as rightward movement. Linguistic Inquiry 38: 287–301.
Boersma, Paul, and David Weenink. 2013. Praat: Doing Phonetics by Computer [Computer Program, v. 6.0.56]. Available online: http://www.praat.org/ (accessed on 20 November 2017).
Breen, Mara, Chigusa Kurumada, Michael Wagner, Duane Watson, and Kristine Yu. 2018. Introducing prosodic variability. Laboratory Phonology: Journal of the Association for Laboratory Phonology 9: 5.
Butt, Miriam. 1993. Object specificity and agreement in Hindi/Urdu. In Papers presented at 29th Regional Meeting of the Chicago Linguistic Society: Volume 1: The Main Session. Chicago, IL, USA: The Chicago Linguistic Society, pp. 89–103.
Butt, Miriam, and Tracy H. King. 1996a. Focus, adjacency, and nonspecificity. Paper presented at the Linguistic Society of America Annual Meeting, University of Pennsylvania, Philadelphia, PA, USA, 21 January 1996; Edited by Rajesh Bhatt. Available online: https://www.ling.upenn.edu/sassn/512/node16.html (accessed on 27 June 2019).
Butt, Miriam, and Tracy H. King. 1996b. Structural topic and focus without movement. Paper presented at First LFG Conference, Grenoble, France, August 26–28; Edited by Miriam Butt and Tracy Holloway King. Stanford: CSLI Publications.
Butt, Miriam, and Tracy Holloway King. 1997. Null Elements in Discourse Structure. Written to Be Part of a Volume that Never Materialized. Available online: http://ling.uni-konstanz.de/pages/home/butt/ (accessed on 14 April 2021).
Butt, Miriam, Farhat Jabeen, and Tina Bögel. 2016. Verb cluster internal wh-Phrases in Urdu: Prosody, syntax and semantics/pragmatics. Linguistic Analysis 40: 445–87.
Calhoun, Sasha, Emma Wollum, and Emma Kruse Va’ai. 2019. Prosodic prominence and focus: Expectation affects interpretation in Samoan and English. Language and Speech 64: 2–22.
Cangemi, Francesco, Martina Krüger, and Martine Grice. 2015. Listener-specific perception of speaker-specific productions in intonation. In Individual Differences in Speech Production and Perception. Frankfurt: Peter Lang, pp. 123–45.
Choudhury, Arunima. 2015. Interaction between prosody and information structure: Experimental evidence from Hindi and Bangla. Ph.D. Thesis, University of Southern California, Los Angeles, CA, USA.
Das, Kalyan, and Shakuntala Mahanta. 2019. Intonational phonology of Boro. Glossa 1: 1–35.
Elvira-García, Wendy. 2018. Extract F0 from Points. Barcelona: University of Barcelona, R Foundation for Statistical Computing, Available online: http://www.wendyelvira.ga/ (accessed on 31 August 2018).
Ernestus, Mirjam. 2012. Segmental within-speaker variation. In The Oxford Handbook of Laboratory Phonology. Edited by Abigail C. Cohn, Cécile Fougeron, Marie K. Huffman and Margaret E. L. Renwick. Oxford: Oxford University Press, pp. 93–102.
Féry, Caroline. 2010. The intonation of Indian languages: An areal phenomenon. In Festschrift for Ramakant Agnihotri. Edited by Imtiaz Hasnain and Shreesh Chaudhury. Singapore: Akar Publishers, pp. 288–312.
Féry, Caroline. 2017. Intonation and Prosodic Structure. Cambridge: Cambridge University Press.
Féry, Caroline, Pramod Pandey, and Gerrit Kentner. 2016. The prosody of focus and givenness in Hindi and Indian English. Studies in Language 40: 302–39.
Fox, John, and Sanford Weisberg. 2019. Companion to Applied Regression. [Retrieved v. 3.0-6]. Available online: https://r-forge.r-project.org/projects/car/ (accessed on 20 January 2020).
Gambhir, Vijay. 1981. Syntactic Restrictions and Discourse Functions of Word Order in Standard Hindi. Ph.D. Thesis, University of Pennsylvania, Philadelphia, PA, USA.
Genzel, Susanne, and Frank Kügler. 2010. The prosodic expression of contrast in Hindi. Paper presented at Speech Prosody 2010, Chicago, IL, USA, May 10–14.
Hayes, Bruce, and Aditi Lahiri. 1991. Bengali intonational phonology. Natural Language and Linguistic Theory 9: 47–96.
Ito, Junko, and Armin Mester. 2012. Recursive prosodic phrasing in Japanese. In Prosody Matters: Essays in Honor of Elisabeth Selkirk. Edited by Toni Borowsky, Shigeto Kawahara, Takahito Shinya and Mariko Sugahara. London: Equinox, pp. 280–303.
Jabeen, Farhat. 2017. Position vs. prosody: Focus realization in Urdu/Hindi. Paper presented at Phonetics and Phonology in Europe (PaPE) Conference, Köln, Germany, June 12–14.
Jabeen, Farhat. 2019a. Interpretation of LH intonation contour in Urdu/Hindi. Paper presented at International Congress of Phonetic Science, Melbourne, Australia, August 5–9.
Jabeen, Farhat. 2019b. Recursive intonation phrases in Urdu/Hindi. Paper presented at Phonetik und Phonologie (PundP) Conference, Düsseldorf, Germany, September 25–27.
Jabeen, Farhat. 2019c. Prosody and Word Order: Prominence Marking in Declaratives and Wh-Questions in Urdu/Hindi. Ph.D. Thesis, University of Konstanz, Konstanz, Germany.
Jabeen, Farhat. 2020. Focused or questioned? Intonation of polar questions and narrow focus in Urdu/Hindi. Paper presented at FASAL 10, Columbus, OH, USA, March 21–22.
Jabeen, Farhat, and Bettina Braun. 2018. Production and perception of prosodic cues in narrow and corrective focus in Urdu/Hindi. Paper presented at Speech Prosody 2018, Poznań, Poland, June 13–16.
Jabeen, Farhat, and Elisabeth Delais-Roussarie. 2019. Towards a phonological analysis of the rising contours in Urdu/Hindi. Paper presented at ICPHS satellite Workshop Intonational Phonology of Typologically Rare or Understudied Languages, Melbourne, Australia, August 4.
Jabeen, Farhat, and Elisabeth Delais-Roussarie. 2020. The Accentual Phrase in Urdu/Hindi: A prosodic unit at the interplay between rhythm and intonation. Paper presented at International Conference on Speech Prosody 2020, Tokyo, Japan, May 23–26.
Keane, Elinor. 2014. The intonational phonology of Tamil. In Prosodic Typology II: The Phonology of Intonation and Phrasing. Edited by Sun-Ah Jun. Oxford: Oxford University Press, pp. 118–53.
Kentner, Gerrit, and Caroline Féry. 2016. A new approach to prosodic grouping. The Linguistic Review 30: 277–311.
Khan, Sameer ud Dowla. 2008. Intonation Phonology and Focus Prosody of Bengali. Ph.D. Thesis, University of Southern California, Los Angeles, CA, USA.
Khan, Sameer ud Dowla. 2014. The intonational phonology of Bangladeshi Standard Bengali. In Prosodic Typology II: The Phonology of Intonation and Phrasing. Edited by Sun-Ah Jun. Oxford: Oxford University Press, pp. 81–117.
Khan, Sameer ud Dowla. 2016. The intonation of South Asian Languages: Towards a comparative analysis. Paper presented at FASAL 6, University of Massachusetts, Amherst, MA, USA, March 12–13.
Khan, Sameer ud Dowla. 2018. Building a unified intonational model for South Asian languages: InTraSAL. Paper presented at 34th South Asian Languages Analysis Roundtable, Konstanz, Germany, June 19–21.
Kidwai, Ayesha. 2000. XP-Adjunction in Universal Grammar: Scrambling and Binding in Hindi-Urdu. Oxford: Oxford University Press.
Kim, Jiseung. 2019. Individual differences in the production of prosodic boundaries in American English. Paper presented at International Congress of Phonetic Science, Melbourne, Australia, August 5–9; Available online: https://icphs2019.org/icphs2019-fullpapers/pdf/full-paper_865.pdf (accessed on 12 June 2020).
Krifka, Manfred. 2008. Basic notions of information structure. Acta Linguistica Hungarica 55: 243–76.
Kügler, Frank. 2020. Post-focal compression as a prosodic cue for focus perception in Hindi. Journal of South Asian Linguistics 10: 38–59.
Kuznetsova, Alexandra, Per B. Brockhoff, and Rune H. B. Christensen. 2017. lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82: 1–26.
Ladd, D. Robert. 1986. Intonational phrasing: The case for recursive phrasing. Phonology Yearbook 3: 311–40.
Lahiri, Aditi, and Jennifer Fitzpatrick-Cole. 1999. Emphatic clitics and focus intonation in Bengali. In Phrasal Phonology. Edited by René Kager and Wim Zonneveld. Dordrecht: Foris Publications, pp. 119–44.
Luchkina, Tatiana, and Jennifer Cole. 2019. Perception of word-level prominence in free word order language discourse. Language and Speech 64: 1–32.
Luchkina, Tatiana, Jennifer Cole, Preeti Jyothi, and Vandana Puri. 2015. Prosodic and structural correlates of perceived prominence in Russian and Hindi. Paper presented at 18th International Congress of Phonetic Sciences, Glasgow, UK, August 10–14; Available online: https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2015/Papers/ICPHS0793.pdf (accessed on 2 November 2018).
Ohala, Manjari. 1983. Aspects of Hindi Phonology. Dordrecht: Indological Publishers and Booksellers.
Ouyang, Iris, and Elsi Kaiser. 2015. Individual differences in the prosodic encoding of informativity. In Individual Differences in Speech Production and Perception. Frankfurt: Peter Lang, pp. 147–88.
Patil, Umesh, Gerrit Kentner, Anja Gollrad, Frank Kügler, Caroline Féry, and Shravan Vasishth. 2008. Focus, word order and intonation in Hindi. Journal of South Asian Linguistics 1: 55–72.
Peppé, Sue, Jane Maxim, and Bill Wells. 2000. Prosodic variation in Southern British English. Language and Speech 43: 309–34.
Puri, Vandana. 2013. Intonation in Indian English and Hindi Late and Simultaneous Bilinguals. Ph.D. Thesis, University of Illinois, Urbana Champaign, IL, USA.
R Core Team. 2014. R: A Language and Environment for Statistical Computing [v. 3.6.1]. Vienna: R Foundation for Statistical Computing.
Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1: 75–116.
Schuppler, Barbara, and Bogdan Ludusan. 2020. An analysis of prosodic boundary detection in German and Austrian German read speech. Paper presented at International Conference on Speech Prosody 2020, Tokyo, Japan, May 23–26.
Selkirk, Elisabeth. 1984. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, MA: MIT Press.
Selkirk, Elisabeth. 1986. On derived domains in sentence phonology. Phonology Yearbook 3: 371–405.
Selkirk, Elisabeth. 2011. The syntax-phonology interface. In The Handbook of Phonological Theory. Edited by John Goldsmith, Jason Riggle and Alan Yu. Hoboken: Wiley-Blackwell, pp. 435–84.
Stoet, Gijsbert. 2010. PsyToolkit—A software package for programming psychological experiments using Linux. Behavior Research Methods 3: 1–47.
Stoet, Gijsbert. 2017. PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teaching of Psychology 44: 24–31.
Urooj, Saba, Benazir Mumtaz, and Sarmad Hussain. 2019. Urdu intonation. Journal of South Asian Linguistics 10: 2–22.
Venables, William, and Brian Ripley. 2002. Modern Applied Statistics with S. New York: Springer.
Wickham, Hadley. 2016. Elegant Graphics for Data Analysis. New York: Springer.

Figure 1. An example of stimulus presentation. The red labels indicating the grammatical role of each chunk of text are added here only for the readers of this study.

Figure 2. F0 contour of two sentences produced in wide focus by the same speaker. The sections in the red box show the difference in the F0 contour of null marked (a) and locative nouns (b) at the immediately preverbal position. The latter is produced with a rising contour, whereas F0 falls gradually in the former.

Figure 3. Mean F0 peak scaling in target nouns produced in wide and narrow focus at sentence initial (top), medial (middle), and immediately preverbal (bottom) positions.

Figure 4. Mean F0 range in target nouns produced in wide and narrow focus at sentence initial (top), medial (middle), and immediately preverbal (bottom) positions.

Figure 5. Mean proportionate duration of target nouns (noun duration/sentence duration) at sentence initial (top), medial (middle), and immediately preverbal (bottom) positions.

Figure 6. An example sentence from the online survey on focus identification.

Figure 7. Correct identification of focus type and position for sentences produced by speakers 2 & 6.

Figure 8. F0 contour of a sentence produced by the same speaker in wide and narrow focus at the immediately preverbal position. The focused noun is shown in the red box. The second peak in wide focus is downstepped (1.2 St.) but its counterpart in narrow focus is upstepped (2.0 St.).

Figure 9. Mean scaling of sentence initial F0 peaks immediately before the medial nouns (top) and of medial F0 peaks immediately before the preverbal nouns (bottom) produced in wide and narrow focus.

Figure 10. Mean syllable duration before sentence medial (top) and immediately preverbal (bottom) narrow vs. wide focus.

Figure 11. Variation in F0 contour in the postfocal region following sentence initial narrow focus. The focused word, sɑ.rɑ, is enclosed in the red box. The focused word may optionally be followed by an F0 peak (a). Alternatively, F0 can be compressed after the focused noun (b).

Figure 12. Mean F0 peak scaling on the noun phrase after sentence initial (top) and medial (bottom) nouns produced in narrow and wide focus.

Figure 13. An example sentence from the survey on prosodic phrasing in Urdu.
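
Two of the measures named in the captions above can be illustrated with a minimal sketch. It assumes that proportionate duration is the noun duration divided by the sentence duration, as stated in the caption of Figure 5, and that the upstep and downstep values in Figure 8 are intervals in semitones computed with the standard conversion 12 × log2(f2/f1). The function names and all numeric values are hypothetical; they are not taken from the study's data or analysis scripts.

```python
import math


def proportionate_duration(noun_duration_s, sentence_duration_s):
    """Noun duration as a share of the whole sentence duration."""
    if sentence_duration_s <= 0:
        raise ValueError("sentence duration must be positive")
    return noun_duration_s / sentence_duration_s


def semitone_interval(f0_from_hz, f0_to_hz):
    """Signed interval in semitones between two F0 values.

    A positive result means the second peak is higher (upstep);
    a negative result means it is lower (downstep).
    """
    if f0_from_hz <= 0 or f0_to_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_to_hz / f0_from_hz)


# Hypothetical values, for illustration only.
print(proportionate_duration(0.5, 2.0))   # noun takes a quarter of the sentence
print(semitone_interval(200.0, 224.5))    # second peak higher: upstep
print(semitone_interval(224.5, 200.0))    # second peak lower: downstep
```

Expressing the interval in semitones rather than Hz makes peak differences comparable across speakers with different pitch ranges, which is presumably why the Figure 8 caption reports them that way.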

Table 1. Occurrence of narrowly focused nouns at different positions.

Role	Initial	Medial	Im. Preverbal
Subject	87%	13%	-
Indirect object	9%	91%	-
Direct object	29%	71%	-
Null-marked object	-	-	100%
Locative	-	8%	92%
Total	35%	33%	32%

Table 2. In situ placement of narrowly focused nouns.

Placement	Initial	Medial	Im. Preverbal	Total
Scrambled	16%	17%	-	11%
In situ	84%	83%	100%	89%

Table 3. Use of phonetic cues in the context of narrow focus at different positions.

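Phonetic Cues	Position	1	2	3	4	5	6	7	8	9	10	11	12
Higher F0 peak	Initial	√	-	√	√	√	√	-	-	-	-	√	√
Higher F0 peak	Medial	-	-	-	-	-	√	-	-	-	√	√	√
Higher F0 peak	Preverbal	√	-	√	√	-	√	-	-	-	√	√	√
Wider F0 range	Initial	√	-	√	√	√	√	-	-	-	-	-	√
Wider F0 range	Medial	√	-	√	√	-	√	-	-	-	-	√	-
Wider F0 range	Preverbal	√	√	√	√	-	√	√	√	√	√	-	√
Longer duration	Initial	√	-	√	√	√	√	√	-	√	-	√	√
Longer duration	Medial	√	-	√	√	-	√	√	√	√	-	√	√
Longer duration	Preverbal	√	-	√	√	√	√	-	-	√	-	√	√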

Table 4. Examples of the assignment of correct vs. incorrect labels for the identification of focus type and position by the participants of the online survey.

Target Production	Identified	Label
Initial narrow focus	Medial narrow focus	Incorrect
Initial narrow focus	Initial narrow focus	Correct
Wide focus	Preverbal narrow focus	Incorrect
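
The labelling scheme in Table 4 can be sketched as follows. This is a minimal illustration, assuming that a response counts as correct only when the identified focus type and position both match the target production, which is what the three example rows suggest. The function names are hypothetical, and the list simply reuses the rows of Table 4 rather than any survey data.

```python
def label_response(target, identified):
    """Return 'Correct' only when the identified focus matches the target production."""
    return "Correct" if target == identified else "Incorrect"


def percent_correct(pairs):
    """Percentage of correct responses over (target, identified) pairs."""
    if not pairs:
        raise ValueError("no responses given")
    n_correct = sum(label_response(t, i) == "Correct" for t, i in pairs)
    return 100.0 * n_correct / len(pairs)


# The three example rows of Table 4 (not actual survey responses).
rows = [
    ("Initial narrow focus", "Medial narrow focus"),
    ("Initial narrow focus", "Initial narrow focus"),
    ("Wide focus", "Preverbal narrow focus"),
]

for target, identified in rows:
    print(target, "|", identified, "|", label_response(target, identified))

print(round(percent_correct(rows), 1))
```

A score for position alone, as reported in Table 6, would need a looser comparison that ignores focus type; that variant is not shown here.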

Table 5. Identification of focus type and position in sentences produced by the two speakers (2 = used no phonetic cues; 6 = used all the target phonetic cues).

Speaker	Incorrect	Correct
02	76%	24%
06	49%	51%

Table 6. Identification of focus position in sentences produced by speaker 6, who had used all the target phonetic cues.

Incorrect	Correct
37%	63%

Table 7. Percentage of identified prosodic phrasing for focus positions and types. The data in the red box shows the only context where respondents largely agreed on their identification of a phrase boundary in all focus contexts.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

