Peer-Review Record

Word Order, Intonation, and Prosodic Phrasing: Individual Differences in the Production and Identification of Narrow and Wide Focus in Urdu

Languages 2022, 7(2), 103; https://doi.org/10.3390/languages7020103
by Farhat Jabeen
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 January 2021 / Revised: 9 March 2022 / Accepted: 9 April 2022 / Published: 20 April 2022
(This article belongs to the Special Issue Phonology-Syntax Interface and Recursivity)

Round 1

Reviewer 1 Report

Review of Manuscript languages-1105094

Title: Word order, intonation, and prosodic phrasing: Individual differences in the production of narrow and wide focus in Urdu

 

The manuscript describes a well-designed and well-written study of focus realization and scrambling within a particular population of Urdu speakers, defined by age, gender, and education level. The author demonstrates, using a range of methodologies, that Urdu speakers exhibit wide variation in how they prosodically mark different word orders in responses to the same stimulus question, and, curiously, that they do not appear to use the immediately pre-verbal position as a “focus position”, contra the conventional wisdom about Urdu and other Indo-Aryan languages. The author also argues that narrow focus is (most effectively) realized with longer duration, expanded pitch range, and higher pitch on the focused element (or alternatively raised pitch on the immediately pre-focal element), rather than sentence position. I am generally convinced of the results and interpretations, and I hope that this study will eventually be published in Languages.

 

At the moment, I still have some questions, suggestions, and concerns that I would like to see addressed before publication. These are not major in the sense that they do not question the validity of the study. Instead, they are mostly based on presentation, clarification, and justification for the various choices made by the author. Given these observations, I recommend that the manuscript undergo minor revisions. My specific comments are listed below.

 

  1. Recursive IPs

 

I am convinced by the data presented by the author that there is a boundary represented by the upstep seen before some constituents with narrow focus. However, it’s not clear to me how the author has established that this is an indication of IP recursion specifically. I’d feel more convinced of this argument if this internal IP boundary “looked” like an IP boundary, with the lengthening, tones, etc., expected of any IP boundary tone. (Why isn’t this L%? If this is an H%, what does H% get used for elsewhere?) These tones just look like an upstepped version of the AP tone already expected there.

 

Could we instead call it something else, like an intermediate phrase or minor phrase or something that doesn’t make the reader assume that we should see independent evidence of IP-boundary-ness here? I could imagine for example that there could be intermediate phrases in the language, and that they normally span entire IPs, but in cases like this, an IP can be broken into two or more ips, where the ip boundary is realized solely via upstep of the AP boundary tone and downtrend-reset thereafter. (Also, recursive IPs make me wonder if there could be another level of recursion below this, and so on…)

 

  2. Deaccenting

 

How do we know that the items after /iˈmɑm se/ are deaccented in Fig. 7? If they were “fully realized”, they would just have a Lp, and there could very well be one there from what I see in the pitch track. More importantly, these post-focal items look identical to the last two words in both renditions of Fig. 3, of which at least the left side is not described as having any deaccenting. In general, it would be useful to know what “deaccenting” means when talking about the position in the sentence where we expect the AP right-boundary to be overridden by the IP boundary tone anyway. In such cases, are we looking for the “elbow” of the Lp? Or something else? I ask because I can’t imagine we’d expect the full Lp…Hp sequence there in IP-final position anyway (assuming the final words are all grouped into one AP), so spelling out exactly what deaccenting looks like in this position would be useful.

 

  3. Listening task

 

Compared to the production task, for which data were collected from a wide range of speakers within a demographic group, the listening task was conducted by just two speakers of very different linguistic backgrounds, one of whom is the author of the paper. The author should note that the listening task results may need to be taken with these considerations in mind, especially when claims are made about when focus is perceived or misperceived, based on such a small sample.

 

  4. Anomalous Speaker 02

 

The author mentions that the low Kappa between intended focus and perceived focus “may be attributed to the second speaker” (p. 27). Can that be quantified by re-calculating the Kappa without Speaker 02?
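
A minimal sketch of such a re-calculation, assuming the judgments are available as (speaker, intended focus, perceived focus) triples. The speaker labels, category names, and toy data below are hypothetical and are not taken from the manuscript; the function simply implements the standard Cohen's Kappa formula.

```python
from collections import Counter


def cohens_kappa(intended, perceived):
    """Cohen's Kappa for two equal-length sequences of category labels."""
    if len(intended) != len(perceived) or not intended:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(intended)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(intended, perceived)) / n
    # Chance agreement from the marginal label frequencies.
    count_a = Counter(intended)
    count_b = Counter(perceived)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def kappa_excluding(rows, excluded_speaker=None):
    """Kappa over all rows, optionally leaving out one speaker."""
    kept = [r for r in rows if r[0] != excluded_speaker]
    return cohens_kappa([r[1] for r in kept], [r[2] for r in kept])


# Hypothetical toy data: (speaker, intended focus, perceived focus).
rows = [
    ("S01", "narrow", "narrow"),
    ("S01", "wide", "wide"),
    ("S02", "narrow", "wide"),
    ("S02", "wide", "narrow"),
    ("S03", "narrow", "narrow"),
    ("S03", "wide", "wide"),
]

print(f"all speakers: {kappa_excluding(rows):.2f}")
print(f"without S02:  {kappa_excluding(rows, 'S02'):.2f}")
```

Reporting both values side by side would show how much of the low agreement is attributable to the one speaker.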

 

  5. Stress

 

Although not central to the aim of the paper, I note that in pitch tracks of words with non-initial stress, we see evidence of that stressed syllable being the attractor of the L target of the AP. For example, the stressed syllables of the following words bear the Lp but were not word-initial:

/mʊlˈt̪ɑn/ (Fig. 2)

/iˈmɑm/ (Fig. 2; Fig. 7; Fig. 11)

/ʊˈd̪ʱɑɾ/ (Fig. 11, right side)

/nəˈmɑz/ (Fig. 15, right side)

 

Similarly some IP-final APs show similar signs, although of course here we’re just looking at the elbow in the final LP…L% contour, where the LP marks the first L target bending into a low plateau:

/bʊˈlɑjɑ/ (Fig. 2)

/ʊˈd̪ʱɑɾ/ (Fig. 7, left side; Fig. 11, left side)

/nəˈmɑz/ (Fig. 15, left side)

 

Given that some studies of Urdu intonation (Jabeen 2019a, Jabeen & Delais-Roussarie 2019, both cited by the authors) show no effect of stress on Lp attraction but others do (Urooj et al. 2019, not cited by the authors), can we at least assume that there is intralanguage variability in terms of whether speakers put the Lp on the stress vs. on the initial syllable in words with non-initial stress? Otherwise the data from this paper seem to totally contradict the author’s own reports of stress-insensitivity in the literature.

 

Urooj, Saba; Benazir Mumtaz, & Sarmad Hussain. (2019). Urdu intonation. Journal of South Asian Linguistics 10 (1), 2–22.

  6. Visualizations

 

Table 1

Instead of only presenting the position of all focused nouns (collapsing across syntactic role) as in Table 1, it would be better to see the different roles separated out into rows. Something like:

 

Syntactic role of focused element | Initial | Medial | Im. preverbal | Postverbal
Subject                           | N%      | N%     | N%            | 0.0%
Indir. object                     | N%      | N%     | N%            | 0.0%
Dir. object                       | N%      | N%     | N%            | 0.0%
Total                             | 32.8%   | 35.2%  | 31.9%         | 0.0%
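
One way such a by-role breakdown could be tabulated is sketched below, assuming each focused token is coded as a (syntactic role, sentence position) pair. The coding scheme, the label strings, and the toy tokens are hypothetical; none of the numbers come from the manuscript.

```python
from collections import Counter

# Assumed position labels, mirroring the column headers suggested above.
POSITIONS = ["initial", "medial", "im. preverbal", "postverbal"]


def position_percentages(tokens):
    """Per syntactic role, the percentage of focused tokens in each position."""
    for _, position in tokens:
        if position not in POSITIONS:
            raise ValueError(f"unknown position: {position}")
    counts = Counter(tokens)
    roles = sorted({role for role, _ in tokens})
    table = {}
    for role in roles:
        total = sum(counts[(role, p)] for p in POSITIONS)
        table[role] = {p: 100.0 * counts[(role, p)] / total for p in POSITIONS}
    return table


# Hypothetical toy tokens: (syntactic role, position of the focused noun).
tokens = (
    [("subject", "initial")] * 2
    + [("subject", "medial")] * 2
    + [("dir. object", "im. preverbal")] * 3
    + [("dir. object", "initial")]
)

table = position_percentages(tokens)
for role, row in table.items():
    cells = "  ".join(f"{p}: {row[p]:.1f}%" for p in POSITIONS)
    print(f"{role:<12} {cells}")
```

Each row sums to 100%, so differences between syntactic roles are visible directly rather than being collapsed into the totals.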

 

Pitch track text size

The figures with multiple pitch tracks are too small to read comfortably. The pitch tracks themselves are fine; just the text is too small even when quite zoomed in.

 

Focus in pitch track images

Ideally, there would be some way to show in the illustrated pitch tracks which (if any) constituent is intended (via the context question) to be under narrow focus: for example, a colored underline or a diacritic “FOC” annotated somewhere on top of the image could be useful. The red boxes seem to sometimes, but not always, illustrate the focused constituent.

 

  7. Wording and typographical comments

 

  1. p. 4: “six or moras” → “six or more moras”

  2. p. 5: “ As Urdu is claimed to have a default focus position” → “ As Urdu is claimed to have a fixed (pre-verbal) focus position”

  3. p. 7: “a homogenous groups” → “a homogenous group” (or “homogeneous”)

 

  4. p. 10: “When nominative objects were placed at that position, only 36% of them were produced with a rising contour. The remaining nominative nouns were phrased together with the following verb.” “Nominative objects” is a term more generally understood in a Hindi/Urdu-specific context, and might appear to be a contradiction to those unfamiliar with this. Maybe instead, consider “null-marked objects/nouns”?

 

  5. p. 12: “two speakers (7, 9) used this cue to mark wide focus” → this seems a bit counter-intuitive. Instead of saying that these two speakers used wider f0 to mark wide focus, it seems more intuitive to me to say that they used narrower f0 to mark narrow focus. By which I mean, is “wide focus” a thing to be marked, leaving narrow focus as a default? One would imagine that the non-default prosodic marking should be on the thing that is non-default.

 

  6. p. 13: I think it’s worth noting that the two speakers who appear to have shorter duration on their narrow focused words also have two of the smallest differences between wide and narrow focus. It seems like speakers either don’t make much difference at all in lengthening for narrow focus (speakers 1, 2, 5, 8, 10), where the averages are right on top of one another or a little one way or another, or they make a big difference, always in the direction of narrow focus getting longer duration (speakers 3, 4, 6, 9, 11, 12), but no one makes a huge duration reduction under narrow focus. I think the wording should be slightly changed to make sure this is clear.

 

  7. p. 22: “upstepped F0 peaks on left and” → “upstepped F0 peaks on the left of, and”

Author Response

Hallo

thanks a lot for your insightful comments. I found them very helpful.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper deals with variation in the production of Urdu narrow and wide focus sentences in different word orders.

 

In this paper the author asks the following research questions:

  1. Do speakers form a homogenous group?
  2. Is there intra-speaker variation as well?
  3. Is prosodic phrasing affected?
  4. The identification of focus types in relation to focused nouns

Generally, the following seem to be the experimental results:

There was a tendency to use prosodic cues more at the sentence initial and immediately preverbal positions. Generally the outcome of focus-induced intonation is a rise. There were also upstepped F0 peaks on left and post-focal deaccentuation on the right edge of the focused nouns. However, there is an issue with the methodology used to arrive at the answers for 3 and 4 and I will discuss that shortly.

Before discussing the details of the paper I would like to point out two outstanding issues. As the author admits, Urdu is a language spoken by the people of Pakistan along with their mother tongues: Punjabi, Sindhi, Saraiki, etc. It is presumably spoken by only 7–8% of the population as the first language. It is eminently surprising that instead of probing this very certain source of variation, the author chooses to cast this factor aside as insignificant. No variationist study can proceed without considering the various sources of variation in the signal.

Secondly, the technical usage of the word ‘cues’ is unclear. A listening experiment was conducted in which the author participated as one of the subjects along with one other subject, and hence the discussion on page 30 makes it seem as if the author is talking about perceptual cues. The problem is that these may very well be acoustic cues or perceptual cues, but without following proper experimental methodology none of these claims are acceptable.

Page 10, section 3.1: This discussion is not clear. What evidence was provided by Butt and King? How was it contradicted by the evidence here?

Figure 2: same speaker or different speaker?

Page 11, section 3.3.1: The question is how this is significantly different. The main purpose here is a phrase-final rise, and both speakers seem to have produced it, albeit with differences. How perceptible are these differences?

Page 12, lines 1-3:

“Interestingly, the focused noun at the sentence initial position was always produced with a rising F0 contour, a phenomenon that was not observed in the production of focused nouns at any other position in the sentence.”

Why is this so? How is it relevant to the discussion in the paper?

Figure 4, page 12:

Please explain this diagram. If this plot is supposed to reflect the sentence 'the remaining speakers barely differed between focus types regarding their use of F0 peak scaling', then it is not entirely descriptive, because they are only 1, 4 and 5.

Page 14, section 3.3.5: Please correlate these speakers with their use of Urdu.

Section 3.4.1, Figure 7: What is the difference between these highs and downstepped highs in Urdu?

Page 15, Figure 8: The diagram is not described properly. ‘Significant interaction’ leading to what interaction?

Page 30: Implications

“The results reported here show that Urdu speakers use gradient (F0 peak scaling, F0 range, proportionate duration) and categorical (embedded IP boundary at the left edge and post-focal deaccentuation) cues to mark narrow focus.”

While there is evidence of variation, the gradient nature of the variation has not been clearly demonstrated anywhere. While the gradience may perhaps be shown, how do the results of the experiments shown here show anything categorical?

“AP structure of sentences produced in wide and narrow focus may differ in the post focal region.” But is there a statistical result to show this?

 

Suggestions for improvement: I would urge the author to properly correlate the variation seen in the speech signal with the background of each speaker. Before casting away this aspect as unrealistic or suspect, the author could also take note of the fact that the variation itself could arise because of instability in the intonational cues, and hence many of the intonational aspects of this variety of Urdu may be transitional.

Secondly, the 'listening' experiment with the author and another speaker really fails to meet any standard of a perception experiment. There are no proper stimuli, no research design, no methodology, and no adequate number of participants. It only deserves a very small mention as an attempt to cross-verify impressionistic judgments. That experiment cannot be used to discuss intonational ‘cues’.

Author Response

Hallo

The response to your comments is attached here.

Best

Author Response File: Author Response.pdf

Reviewer 3 Report

The present paper is one of the most detailed analyses of focus in Urdu. It makes consistent claims about the variable use of intonational cues in the context of wide and narrow focus by Urdu speakers on the basis of both production and perception experiments. The authors present their case from the perspective of the identification of focus type and position and prosodic phrasing. They show the presence of both inter-speaker and intra-speaker variation.

The authors propose the existence of a recursive IP boundary on the left edge of the narrowly focused noun, with the presence of pre-boundary lengthening and upstepped F0 peaks in that position.

The paper claims the following cues as relevant for the realization of focus in Urdu: higher F0 peak, wider F0 range, and duration. The authors support the claim that all the cues need not be used simultaneously to convey the intended information structure.

I find the paper publishable as is. However, a couple of issues need to be addressed. One of them is the role of intensity as an intonational cue. In some recent studies (see, e.g., Fery et al. 2016. The prosody of Focus in Hindi and Indian English. Studies in Language 42(2): 302-339), intensity has additionally been found to play a significant role in the realization of focus in both Hindi and Indian English: focused constituents are produced with higher intensity than given ones. In the present paper, the authors have noted that intensity has been shown to be one of the cues in the study of focus in Russian (Luchkina and Cole 2019), but have left it out of their investigation. One could conjecture that the use of intensity could be present in those cases in which the other cues are found to be absent (Speaker 2). A second issue is that of phonological processes as cues to focus in Urdu. Fery et al. (2016) note a significant use of the process of glottal stop insertion before focussed constituents, especially the vowel-initial ones. The absence of this process in Urdu could point to a phonological difference between Urdu and Hindi. It could well be the case that the examination of phonological processes has not been considered here.

The following minor typos need to be corrected:

P. 53, l. 8 (top): ‘…been convincingly advocated by (Ito and Mester 2012)’ → ‘… by Ito and Mester (2012)’

P. 55, 7.3.3: ‘Variation in prosodic phasing’ (phrasing?)

 

 

Author Response

Thanks a lot for your thought-provoking comments. My response is attached here.

Best

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

This is an extensively revised version of the previously submitted paper. It is a much better researched paper on individual variation than the previous version. The perception experiment has proper results which can be correlated with the production results.

A couple of (minor) issues that I would like to point out are the following:

Page 5

“We argue against Urooj et al.’s analysis of the rising F0 contours as pitch accents.

A detailed look at the figures presented in Urooj et al. shows that the L*+H occurs in monosyllabic or bisyllabic words with a short vowel in the first syllable. The phonetic realization of this F0 contour can be explained by the fact that the entire rising contour needed to be produced on a short bisyllabic word. ..Therefore, we propose that what Urooj et al. have analyzed as pitch accents is in fact the phonetic implementation of the same rising F0 contour. Moreover, they had to propose an optional low AP boundary on the left edge because the low tone does not always align with the lexically stressed syllable. If the idea of pitch accents is discarded however, there is no need to propose an ad hoc low AP ….”

 

Is it possible that these phonetic differences are features of individual variation as well? If so, this aspect then ties in with the claim of variation in the rest of the paper.

 

Page 49

 

“The findings of the current study support the argument that the variable use of intonational cues should be seen as a norm and not as an anomaly”

 

I think the comment that variation in intonation should be considered the norm rather than the exception is completely unnecessary. It assumes that other people working on intonation are unaware of this fact. Also, this claim stands in stark contrast to what the authors have claimed with regard to Urooj et al.’s work, because it is not clear what the authors mean by 'phonetic difference' there. Is it an expected and predictable difference related to variation?

Secondly, it is precisely because of this variation that the authors may choose to respect the fact that the two speakers (among the three, one is a complete outlier) who have claimed their mother tongue to be Punjabi have shown consistent intonational cues throughout the experiment. It’s not clear whether any of this is related to their being Punjabi speakers, and that may be a subject of further investigation, but in all the contexts (sentence initial, sentence medial and preverbal positions) the intonational cues are strikingly similar for these two speakers. This may be noted by the readers as well. Noting this fact does not weaken the authors’ claim about individual variation; it only strengthens it, as speaker 2 bucks this trend completely.

Page 48: “However, it is important to remember that the intonation of South Asian languages is generally similar in their use of “Repetitive Rising Contours” (Khan 2016, 2020).”

This point should be deleted. Khan’s analysis of just a couple of languages cannot be held true for hundreds of languages and thousands of dialects in South Asia. India alone is home to 22 official languages belonging to three different genetically diverse groups: Indo-Aryan, Dravidian and Tibeto-Burman. The Tibeto-Burman languages of India are very far from Khan’s analysis (Das and Mahanta, 2019).

Finally, the paper is very lengthy. Some discussions which do not have much bearing on the paper (like the discussion on German variation) can be shortened.

 

References

Das K. & Mahanta S., (2019) “Intonational phonology of Boro”, Glossa: a journal of general linguistics 4(1), p.103. doi: https://doi.org/10.5334/gjgl.758

Author Response

Thanks for your comments. My response is attached here.

 

Best

Author Response File: Author Response.pdf
