# **The Classification of Arabic Dialects**

**Traditional Approaches, New Proposals, and Methodological Problems** 

> Edited by Simone Bettega and Roberta Morano Printed Edition of the Special Issue Published in *Languages*

www.mdpi.com/journal/languages

## **The Classification of Arabic Dialects: Traditional Approaches, New Proposals, and Methodological Problems**


Editors

**Simone Bettega Roberta Morano**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Simone Bettega University of Turin Italy

Roberta Morano University of Leeds UK

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Languages* (ISSN 2226-471X) (available at: https://www.mdpi.com/journal/languages/special_issues/arabic_dialects).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-6139-4 (Hbk) ISBN 978-3-0365-6140-0 (PDF)**

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.




## **About the Editors**

#### **Roberta Morano**

Roberta Morano completed a Ph.D. in linguistics at the University of Leeds, UK, in 2019, specializing in the Arabic varieties spoken in northern Oman. Her research interests focus on language documentation and ecolinguistics, especially the relationship between the environment and its representation in language. She has published several articles on the morphosyntax and sociolinguistics of Omani dialects. Her most recent publication is "Diachronic variation in the Omani Arabic vernacular of the al-'Awabi district: from Carl Reinhardt (1894) to the present day" (2022).

#### **Simone Bettega**

Simone Bettega obtained his Ph.D. in linguistics at the University of Turin, Italy, in 2016. His research focuses on the documentation and typological classification of spoken varieties of Arabic, particularly those of southeastern Arabia. He has published articles and monographs on several aspects of the syntax of spoken Arabic. His most recent publication is the volume "Gender and Number Agreement in Arabic", which was co-authored by Luca D'Anna.

## *Editorial* **Preface**

**Simone Bettega <sup>1,</sup>\* and Roberta Morano <sup>2</sup>**


To seek knowledge is to strive for systematization. All scientific disciplines, from physics to biology, from philosophy to medicine, have always been haunted by the question of classification. That categorization constitutes such a fundamental prerequisite to the progress of science is made all the more problematic by the fact that reality is, ultimately, a slippery thing. The world that we explore through our senses is a unified whole, continuous and undivided; it comes to us as an uninterrupted stream of perception and experience, and in the midst of this flow, it is hard to decide where something ends, and something new begins.

No one, arguably, knows this better than a linguist. Languages are, by definition, classificatory systems. They are all shared by a community of speakers who have all agreed on the fact that the world is to be segmented in a certain way. Leaves rest on branches; branches stem from trees; trees congregate to form woods, or sometimes forests. The act of speaking allows us to put order into fractal chaos. Unfortunately, not all communities of speakers, and therefore not all languages, interpret reality in the same way. Some have looser boundaries, others stricter ones. The idea of a forearm is a different thing to different people (different speakers, that is); and where does an elbow belong, exactly? No one better than a linguist understands the frustration that comes with the impossibility of classification, and this is because to classify languages is to classify classifications; even worse, it is to classify self-classifying classifications (and the dangers implicit in recursive categorizations have been well-known at least since the formulation of Russell's famous paradox).

Arabic dialectology is but a minor sub-branch of the general field of linguistics. Yet, its scope is undeniably vast. Varieties of Arabic are legion: they are spoken by hundreds of millions of people, scattered over a territory remarkably larger than geographical Europe, spanning from the Atlantic Ocean to the Persian Gulf, from southern Turkey to the southernmost tip of the Arabian Peninsula, and even making inroads into central Asia, sub-Saharan Africa, and a number of Mediterranean islands. Arabic dialects represent the modern descendants of one of the language families with the longest history of written attestation in the world, and display an amazing variety of forms and structures at the typological level. To put order into so vast a matter is obviously no easy task, and it should come as no surprise that the different classifications currently available to scholars of Arabic dialectology are all somewhat unsatisfactory, and have been subject to heavy critiques over the course of the years. As Owens (2013) elegantly puts it, "If till today simple models for classifying Arabic dialects elude us [ ... ], it is no doubt in large part because an originally diverse proto-situation has continued to diversify across the vast geographical region where Arabic is spoken".

We are not saying, of course, that classificatory systems for Arabic dialects do not exist: they do, and have been employed to some effect. Critiques of these systems, however, also abound. To give but a couple of examples, one can think of what is probably the most widely employed categorization in the field of Arabic dialectology, the one that distinguishes "Bedouin" varieties from "sedentary" ones. This bipartite subdivision has been called into question for conflating the synchronic and diachronic dimensions, together with sociolinguistic considerations, not always in a methodologically sound manner (for a critique, see, among many others, Palva 2006, p. 605; Watson 2011, p. 869; Vicente 2019, p. 109; and also an interesting note in Magidow 2016, p. 93). Even when we focus on systems of categorization that are narrower in scope, because they are concerned with specific areas within the Arab World, or specific subsets of phenomena, we still encounter problems. This is the case, for instance, with the traditional labels employed to classify Maghrebi varieties, which have recently been seriously called into question by Taine-Cheikh (2017), Mion (2018), and Benkato (2019).

**Citation:** Bettega, Simone, and Roberta Morano. 2022. Preface. *Languages* 7: 58. https://doi.org/10.3390/languages7010058

Received: 24 February 2022 Accepted: 28 February 2022 Published: 3 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

From all of the above, it should be clear that we are still far from finding unified and satisfactory solutions to many practical and terminological problems that have long been haunting the field of Arabic dialectology. We hope that the present volume can represent yet another, if minor, contribution to the vast collective effort of trying to better assess, organize and understand varieties of spoken Arabic. Obviously, the numerous papers that appear in the following pages differ greatly from one another in terms of scope and focus. This is no doubt because, as we have said, the subject of inquiry is vast, and exists simultaneously at both a local and supra-local, general scale. Both dimensions, we believe, are important, and both are represented among the studies presented here. Clear examples of the former are the article by Herin, Younes, Al-Wer, and Al-Srur and the one by Torzullo. The two papers tackle similar issues, and call into question some of the categories which have historically been used to classify the dialects of Northern Arabia and the Southern Levant, with a focus on Jordan. In particular, they do so by also paying attention to the recent social developments of the area, which have involved great amounts of linguistic contact and consequent dialect levelling and mixing. Such attention to the sociolinguistic landscape of the area under investigation is also present in Leitner's discussion of the labels "rural", "urban", and "gələt" in contemporary Iraqi and Iranian Arabic; here, these terms are re-examined in the light of recent phenomena of urbanization and population movement.

The dialectological situation of northern Africa, as noted above, is particularly complex, and several of the contributions that appear in this volume focus on this specific area. We start in the East, from the Egypto-Sudanic region, whose dialects are the object of Leddy-Cecere's inquiry. By also applying the methods of historical glottometry, Leddy-Cecere calls into question the validity of the traditional classification that claims a relationship to exist between the dialects of modern Egypt and Sudan. Sudanic Arabic is also treated in Manfredi and Roset's study, whose focus, however, is broader, as it encompasses the whole "Baggara Belt", a strip of land more than 2500 km long that stretches from Sudan to Nigeria and is mostly inhabited by semi-nomadic and Arabic-speaking cattle herders. While examining the internal dialectal composition of Baggara Arabic, the two authors also provide new data for the refinement of the isoglosses commonly adopted for the identification of a West Sudanic dialect subtype. Finally, Sokhey's treatment of the palatalization of /n/ in Cairene Arabic suggests this specific trait to be sociolinguistically salient, and indexical of socioeconomic status, thus warranting further inquiries into the sociolinguistic situation of the Egyptian capital.

If we move to the Maghreb proper, three of the articles presented here deal specifically with this region. Benkato and Pereira argue for broader inclusion of syntactic isoglosses in the classificatory systems of Arabic dialects, and offer a contribution in this sense by examining the emergence of a verbal copula in some dialects of Tunisia and northwestern Libya, a feature that appears to cut across the established isogloss lines of the area. Francisco proposes a re-examination of the categorization of southern Moroccan dialects in light of new data that have recently become available, questioning the validity of the labels "Bedouin", "Hilali", and "Maʕqili" when referred to these varieties. Finally, La Rosa offers a preliminary description of the dialect of the Mahdia area in Tunisia, in which Bedouin, rural and urban features seem to be conflated, an observation that could help to better assess the linguistic nature of the Tunisian Sahel.

Issues of classification, of course, do not only arise in relation to the diatopic distribution of linguistic features; diachrony plays an important role as well. In Iriarte Díez's article we find both dimensions being addressed at once: starting from a survey on the role of cognate infinitives in Lebanese Arabic, the author broadens the scope of her analysis by bringing data from the Semitic language family at large to bear. This comparison reveals the Lebanese data to be in line with what is known about other Semitic languages, with the possible (and curious) exception of Classical Arabic, whose descriptions have often adopted a dismissive attitude towards the topic of cognate infinitives. The question of diachrony, and, therefore, of origins and evolution, is more directly addressed by Al-Jallad, who isolates a number of features that appear to characterize both the modern dialects and the ancient pre-Islamic epigraphic inscriptions, to the exclusion of Classical Arabic. Stokes is also concerned with the historical developments of Arabic, when he argues that the vowels which appear before the pronominal suffixes in several modern dialects are actually derived from original case vowels, and subdivides dialects into two main groups depending on how these vowels developed. Magidow's paper, finally, takes issue with the possibility of reconstructing the linguistic history of the Arabic languages by directly relating it to attested population movements and settlement patterns, and proposes the application of a new heuristic approach, based on sociolinguistics and geography, to re-examine the extant categories of Arabic dialectology.

Yet another approach to the problem of classification is that adopted by both Turner and Youssef, who employ the tools of linguistic typology to try and bring order into the variegated reality of spoken Arabic. Turner uses definiteness as a case study, which he investigates by applying the Reference Hierarchy framework, thus showing the importance of using semantic typology as a metric for grouping dialects, rather than relying on the presence of forms alone. Youssef, on the other hand, laments how attempts at classifying Arabic varieties based on consonantal realizations have historically employed a mixture of both linguistic and non-linguistic parameters: as an alternative, he proposes to investigate the phonological nature of the dialects through the use of segmental typology. Like Turner, Youssef also underlines how typological inquiry allows for different possible categorizations of the dialects, which can support, refine, or disprove already existing classificatory models, but also suggest new viable groupings, and provide insights into diachronic processes.

In conclusion, there is no doubt that the systems that have been used to classify Arabic dialects up until this moment are not entirely satisfactory, and can be improved. What remains to be understood is which parts of these systems are worth saving, which can be safely discarded, and which hitherto unexplored methodologies can be fruitfully applied to the field of inquiry. It seems to us that the articles that make up this volume are all relevant in this sense, and that they contain promising and interesting ideas worth exploring and expanding upon. We can only hope that the readers will share our views, and that in the following pages they will be able to find answers, new questions, and the inspiration to push the boundaries of their research even further.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

Benkato, Adam. 2019. From Medieval Tribes to Modern Dialects: On the Afterlives of Colonial Knowledge in Arabic Dialectology. *Philological Encounters* 4: 2–25.

Magidow, Alexander. 2016. Diachronic Dialect Classification with Demonstratives. *Al-'Arabiyya* 49: 91–115.

Mion, Giuliano. 2018. Pré-hilalien, hilalien, zones de transition. Relire quelques classiques aujourd'hui. In *Mediterranean Contaminations. Middle East, North Africa, and Europe in Contact*. Edited by Giuliano Mion. Berlin: Klaus Schwarz, pp. 102–25.

Owens, Jonathan. 2013. A House of Sound Structure, of Marvelous form and Proportion. In *The Oxford Handbook of Arabic Linguistics*. Edited by Jonathan Owens. Oxford: Oxford University Press.

Palva, Heikki. 2006. Dialects: Classification. In *Encyclopedia of Arabic Language and Linguistics*. Edited by Kees Versteegh. Leiden and Boston: Brill, vol. 1, pp. 604–13.


## *Article* **The Emergence of a Mixed Type Dialect: The Example of the Dialect of the Bani ʕAbbād Tribe (Jordan)**

**Antonella Torzullo**

Department of Near Eastern Studies, University of Vienna, 1090 Vienna, Austria; antonella.torzullo@univie.ac.at

**Abstract:** The present article aims at questioning the status of the *šāwi* dialect of the Bani ʕAbbād tribe by providing a new analysis of the main distinctive phonological, morphological, and syntactical traits which may hint at dialect mixing. The data provided by the field research, based on a functional framework that relies on descriptive linguistics and a typological approach, show that this dialect is deeply affected by a koineizing tendency due to increasing contacts with the populations of the neighboring areas (especially ʕAmmān and Salṭ) which, in turn, leads to the gradual loss of its authentic features. Finally, this paper discusses whether the dialect of the Bani ʕAbbād should still be considered as belonging to the *yigūl* group (recently renamed Central Bedouin *ygūlu*) of the Syro-Mesopotamian sheep-raising tribes or if a new typology of mixed type dialects should eventually be adopted for the dialects displaying important markers of both Bedouin and sedentary types.

**Keywords:** spoken Arabic varieties; dialect classification; Jordanian Arabic; Arabic dialectology; Arabic linguistics

#### **1. Introduction**

*The Traditional Classification of Jordanian Dialects in the Light of Recent Developments*

The Hashemite Kingdom of Jordan is characterized by a considerable diversity of regional dialects.

Nevertheless, in some cases the typological classifications of the Arabic dialects spoken in this country prove to be problematic, because there seem to be a considerable number of transitional and mixed type dialects.

As stated by Sawaie (2011, p. 499): "records of the linguistic situation in Trans-Jordan in the early part of the 20th century are not available. Consequently, it is hard to state with certainty which dialects dominated then".

The only available source on the Arabic dialects spoken in this area in this period is Bergsträsser's (1915) *Sprachatlas von Syrien und Palästina*.

In 1936 Cantineau drew a classification of several nomadic dialects in his *Études sur quelques parlers de nomades arabes d'Orient*. In the first part of his study, he distinguished the dialects of the camel-rearing tribes from the small-cattle ones and divided the latter into two groups: the atrochaic and the trochaic dialects (Cantineau 1936, p. 114). However, after extending the scope of his research to some other tribes and completing some data he had previously collected, in the second part of his study Cantineau (1937, p. 110) classified the dialects into four groups (which he labelled as a *classement rationnel*): group A, B, C and Bc.

Further investigations were reported in 1963 when for the first time R. Cleveland typologically classified the Jordanian dialects through phonological, morphological, and syntactic characteristics and drew "the most general outlines of the situation illustrated by a very limited number of dialectal characteristics" (Cleveland 1963, p. 56).

In his work, he divided the Arabic dialects spoken in the area into four groups:

(1) *yigūl*, (2) *bəgūl*, (3) *bəkūl*, (4) *bəʔūl*.

**Citation:** Torzullo, Antonella. 2022. The Emergence of a Mixed Type Dialect: The Example of the Dialect of the Bani ʕAbbād Tribe (Jordan). *Languages* 7: 9. https://doi.org/10.3390/languages7010009

Academic Editors: Simone Bettega and Roberta Morano

Received: 19 October 2021 Accepted: 18 December 2021 Published: 5 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The groups were named according to the pronunciation of "*he says*" since this feature "indicates both an important phonetic and morphological characteristic" (Cleveland 1963, p. 171).
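Cleveland's naming criterion can be read as a two-feature decision rule: whether the imperfect "he says" carries a *b*- prefix, and how OA /q/ is realized in it. The following is only a minimal sketch of that reading, not Cleveland's own procedure; the function name `classify_he_says`, the lookup table, and the assumption that every input ends in *-ūl* are illustrative choices introduced here.

```python
import unicodedata

LONG_U = "\u016b"  # "ū" as a single precomposed code point

# Cleveland's (1963) four groups, keyed by two properties of the form:
# (has a b- prefix, reflex of OA /q/). This key layout is an assumption
# made for illustration, not part of the original classification.
CLEVELAND_GROUPS = {
    (False, "g"): "group 1 (yig" + LONG_U + "l)",
    (True, "g"): "group 2 (bəg" + LONG_U + "l)",
    (True, "k"): "group 3 (bək" + LONG_U + "l)",
    (True, "ʔ"): "group 4 (bəʔ" + LONG_U + "l)",
}


def classify_he_says(form: str) -> str:
    """Return the Cleveland group for a transcribed 'he says' form (hypothetical helper)."""
    # Compose combining diacritics so that "u" + macron becomes one code point.
    form = unicodedata.normalize("NFC", form.strip().lower())
    if len(form) < 4 or not form.endswith(LONG_U + "l"):
        raise ValueError(f"not a recognizable 'he says' form: {form!r}")
    has_b_prefix = form.startswith("b")
    reflex = form[-3]  # the consonant immediately before the final "ūl"
    try:
        return CLEVELAND_GROUPS[(has_b_prefix, reflex)]
    except KeyError:
        raise ValueError(f"form {form!r} matches none of the four groups") from None


if __name__ == "__main__":
    for example in ("yig\u016bl", "bəg\u016bl", "bək\u016bl", "bəʔ\u016bl"):
        print(example, "->", classify_he_says(example))
```

Forms outside the four types raise a `ValueError` rather than being forced into a group, which mirrors the point made in this article that some dialects do not fit the inherited labels cleanly.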

From 1969 onwards, Palva dedicated a considerable number of publications to the linguistic situation of Jordan, and in 1984 he set out a more complete classification of the Jordanian dialects in his "A General Classification for the Arabic Dialects Spoken in Palestine and Transjordan", where he added some new criteria of analysis to those applied by Cleveland, namely the reflex of the sequences *CvCaCv*- and *–aXC* (where *X* is a guttural), the gender distinction in the 2nd and 3rd persons plural in personal pronouns and verbs, the use of the adverbs *here* and *now*, and the occurrence of the compound negation *mā* ... *š*.

The new classification he outlined divides the Jordanian dialects into three main groups:

1. the urban dialects;
2. the rural dialects:
	- a. the Galilean dialects (*biqūl*);
	- b. the central Palestinian dialects (*biḳūl*), more conservative than the Galilean dialects;
	- c. the south Palestinian dialects (*bigūl*), closely related to the previous ones, with some features that show a greater Bedouin (especially Negev) influence;
	- d. the north and central Transjordanian dialects (*bigūl*), closely related to the Horan dialects;
	- e. the south Transjordanian dialects (*bigūl*), which are influenced by the Hijazi Bedouin dialects of Arabia Petraea and represent a mixed dialect type;
3. the Bedouin dialects:
	- a. the dialects of Negev (*bigūl*), which show some features typical of the sedentary dialects (namely the *b*-imperfect). The dialects of this group typologically belong to the Sinai type, and they exhibit some similarities with the Bedouin dialects of Arabia Petraea;
	- b. the dialects of Arabia Petraea (*yigūl*), which display some affinities with the Hijazi dialects;
	- c. the dialects of the Syro-Mesopotamian sheep-rearing tribes (*yigūl*), spoken in Transjordan: "(they) belong to the same type as the rest of the dialects of the sheep-rearing tribes in the Syrian and Mesopotamian peripheries of the Syrian Desert" (Palva 1984, p. 372);
	- d. the dialects of the North Arabian Bedouin type (*yigūl*), spoken in Transjordan by the Sirḥān, the Bani Ṣaxar and the Bani Xālid.

Herin et al. (2022) have recently supplemented Cantineau's, Cleveland's and Palva's groupings with new elements and obtained the following categories<sup>1</sup>:


Their "taxonomy" has the merit of accounting for the changes caused by recent dialect contact between speakers of different Bedouin sub-group varieties, but it does not take into consideration the effects of the increasing dialect mixing between the Bedouin and the sedentary population, which has contributed to blurring the linguistic boundaries set by this dichotomy. In particular, some Bedouin tribes such as the Bani ʕAbbād (or ʕAbābīd), who live in the vicinity of the big urban centers, have started to assimilate the speech habits of neighboring communities and to lose the authentic *badawī* features that characterize their dialects.

During the last century, most of the Jordanian tribes, except for some living in the southern districts of the country (like the Bdūl), abandoned their nomadic lifestyle and settled in some definite areas in semi-sedentary or sedentary conditions.

Many economic and social factors have contributed to intensifying the contacts between Bedouin and sedentary varieties and the tendency towards dialect levelling or koineization. Among these, the most relevant factors were: the increasing contacts with the inhabitants of the cities of Salṭ and ʕAmmān and those with the refugees coming from Palestine and Syria, the growing access to education, the proliferation of television series produced in Egypt, Syria, and Lebanon, and in some cases even inter-marriage with sedentary people (Rosenhouse 1984).

This is particularly evident in the dialect of the ʕAbābīd, in central Jordan: their dialect is deeply affected by a levelling process toward sedentary dialects that has left traces in phonology, morphology and syntax that will be discussed in detail in the section *Results*.

In the light of these findings, the aim of this study is to question the status of the *šāwi yigūl* dialect type that is geographically and historically attributed to this vernacular and to point out that the adoption of a more refined classification would better account for mixed dialects which display important markers of both Bedouin and sedentary types.

#### **2. Materials and Methods**

This article is based on a corpus consisting of 20,000 words<sup>2</sup> of transcribed unmonitored interviews<sup>3</sup> with both men and women.

The speakers selected for this qualitative analysis belong to the Bani ʕAbbād tribe, are distributed across different ages (the youngest girl was 5 years old at the time of the interviews and the oldest man was 94 years old) and display various educational and socio-economic backgrounds.

The data were collected during two fieldwork campaigns, the first carried out in July and August 2016 and the second from January until August 2017.

In order to limit the impact of my presence on the oral productions of the informants, the interviews<sup>4</sup> were carried out by Jordanians<sup>5</sup>: a boy from Karak, and one belonging to the semi-nomadic tribe of the ʕAǧārma, as well as an ʕAbbādi girl.

#### *2.1. The Tribe*

The Bani ʕAbbād are a confederation that is divided into two main groups: al-Ǧburiyya and al-Ǧrumiyya (Peake 1958, p. 166). As noted by Shryock (1997, p. 40), "over time the ʕAbābīd have been internally fragmented and politically weak. [ ... ] This lack of consensus is commonly attributed to the diverse genealogical origins of the tribe's clans".

According to Oppenheim (1943, p. 227), the areas most densely populated by the Bani ʕAbbād were Māḥiṣ, Wād is-Sīr, ʕArāg al-ʔAmīr, ʕĀrḍa and al-Ġōr. However, nowadays some branches of the tribe also live in Bader al-Ǧadīda, aṣ-Ṣəbīḥi, aš-Šūna aǧ-Ǧanūbiyya, al-Karāma, Wādi aš-Šitā, Yarga, Marǧ al-Ḥamām, ʕĪra, and Nāʕūr, in the periphery of ʕAmmān and in part in Salṭ (Figures 1 and 2).

Most of the land in the Balga governorate belongs to the ʕAbābīd and indeed, a common Jordanian saying goes *ʕAbbād min sīl ʕa-s-sīl* ('ʕAbbād from river to river'), alluding to the fact that their tribal territory extends from the Zarqa Torrent to the Jordan River.

The exact number of members of the tribe is hard to estimate since there are no official censuses available. Nevertheless, according to the data recorded during my fieldwork the tribe could be composed of approximately 350,000 people.

Shryock (1997, p. 43) reports that until the 1950s most of the ʕAbbādi clans were still living in tents (*byūt aš-šaʕr* in Arabic). They used to spend the winter in the Jordan Valley (locally called Ġōr or Aġwār), where they grazed sheep and goats, whereas during the spring and summer months they moved to its crest (the *šifa*), where the temperatures are cooler.

**Figure 1.** Tribal map of the Balga District, Jordan—Adapted from Peake (1958, p. 253).

**Figure 2.** Reconstruction of the Bani ʕAbbād *dīra* based on the information gathered during my fieldwork and realized by A. Cristaldi.

Unlike the desert tribes, the Bani ʕAbbād had always been engaged in farming, and in the late 1960s some members of the tribe started to prefer a more sedentary way of life. However, according to Shryock (1997, p. 46) even "in the midst of this rapid change, the Balga tribes [have always] consider[ed] themselves fully Bedouin" and their identity is closely linked with the values of *karāma* (generosity) and *ʕaṣabiyya* (inter-tribal solidarity).

As stated by Sakarna (1999, p. 8) "today, the people of ʕAbbādi tribe are no longer nomads and mostly live in settlements. They work as government employees, military individuals, farmers, and in other kinds of occupations".

#### *2.2. Functional Framework*

The research method applied in this work relies on descriptive linguistics which is "based on the empirical observation of regular patterns in natural speech" (François and Ponsonnet 2013, p. 184).

The present linguistic analysis is built on a corpus of narrative transcribed texts (20,000 words) taken from the recordings of unmonitored speech. After collecting this body of data, it was segmented through the program ELAN, and analyzed in order to "identify the distinctive component of the system and the principles that underline its organization" (François and Ponsonnet 2013, p. 184).

The functional framework used for my analysis combines typological and discourse-based approaches, since this "provides the tools necessary to address questions of the meanings underlying language variation" (Brustad 2000, p. 7).

In order to efficiently describe and explain the phonological and morphological variation of the dialect of the Bani ʕAbbād, the methodology adopted was to synthesize those concepts that most efficiently account for the data recorded during my fieldwork. It especially relied on the doctoral thesis of Herin on the dialect of the city of Salṭ (which to date constitutes the only Jordanian vernacular exhaustively described), and the work of Haspelmath and Sims (2010), *Understanding Morphology*.

While examining the syntactical structures of the data corpus collected, I referred in particular to *Eléments de syntaxe générale* and *Syntaxe Générale, une introduction typologique, Tome 1 and 2* by Creissels (1995, 2006a, 2006b, respectively), and *The Syntax of Spoken Arabic* by Brustad (2000).

#### **3. Linguistic Analysis**

Although in 1999 Sakarna analyzed some phonological aspects of the dialect of the Bani ʕAbbād, in particular those of the As-Sakarna branch, no in-depth study had been carried out on the dialect of the Bani ʕAbbād tribe when in 2016 I started to gather the material for my master's thesis *Le dialecte des Bani ʕAbbād: Analyse des traits phonologiques, morphologiques et syntaxiques discriminants*<sup>6</sup> (Torzullo 2018).

Based on the information reported in Cantineau (1937), the above-mentioned classifications provided by Cleveland (1963) and Palva (1984), and the description of the dialect of the semi-nomadic al-ʕAǧārma tribe (Palva 1976), my initial assumption was that the dialect of the Bani ʕAbbād typologically belonged to the Jordanian *ygūl*-group, which is part of the so-called *Šāwi* dialects (in French also called *petits-nomades*).

My findings show that this dialect has a number of characteristics typical of the dialects of the small-cattle nomads described by Cantineau, such as the gender distinction in the 2nd and 3rd pl. persons, but they also show that it is deeply affected by a koineizing tendency that has caused some significant changes in the structure of the vernacular and the loss of some authentic features.

This phenomenon is due to the increasing contact with the sedentary population of the neighboring areas of ʕAmmān and Salṭ, especially for economic and educational reasons: most of the members of the tribe have access to higher education, which brings them to study in the capital, where the nearest universities are to be found. Furthermore, many ʕAbābīd work in these two cities and consequently have daily exchanges with people who do not speak the same variety of vernacular, and in some cases do not even understand their original dialect. These circumstances force them to opt for a more common variety, or at least to put aside some of the original traits of their vernacular and adopt some *madani* (urban) linguistic features, in order to facilitate communication with people outside the tribe.

If, on the one hand, this convergence to a "common linguistic style" (Giles and Ogay 2007, p. 296) for the sake of intelligibility has improved the effectiveness of their contacts, on the other, this "accommodative code variation" (Giles et al. 1973, p. 179) has led over time to a penetration of sedentary elements into the phonology, morphology, and syntax of their speech and to the disappearance of distinctive *šāwi* markers, mainly among the youngest members of the tribe.

*3.1. Contact-Induced Changes in the Contemporary Dialect of the Bani ʿAbbād*

#### 3.1.1. Phonology and Phonotactics

#### Reflexes of OA /q/

Old Arabic (OA) /q/ is usually realized as a voiced velar stop /g/ in the dialect of the Bani ʿAbbād. Ex: *gawi* 'strong', *garye* 'village', *manṭega* 'region', *dagīga* 'minute', *galb* 'heart', *giṣṣa* 'story'.

However, it does not normally occur as a palatal variant /ǧ/ in the contiguity of front vowels, as it certainly did in the 1960s and 1970s in the *yigūl* dialect of the ʿAǧārma tribe, who also live in the Balga District.

In the corpus under analysis there are only four instances of this phenomenon: two of them, *ʿeǧib* 'young children' and *ǧiddāmi* 'in front of me', were produced by the oldest members of the tribe interviewed, a woman aged 90 and a man aged 94, and the other two, *ǧider* 'cauldron, copper pot' and *šiǧǧ* 'part of the tent reserved for men', were obtained through elicitation.

The words *ǧiddāmi* and *ǧider* occur alongside their variants *geddāmi* and *gider*, in which the velar is not affricated. This alternation can be considered an example of "stylistic contrasts *plain colloquial* vs. *koineized colloquial*" (Palva 1976, p. 10).

#### Reflexes of OA /k/

The affricate reflex /č/ of OA /k/ is recessive in this dialect. It rarely appears in spontaneous speech, and when it does occur it is used only by the oldest speakers: *iḥčilha* 'tell her', *tḥači* 'you (f.) speak', *čīl* 'measure, weigh', *čīf* 'how', *bičīla* 'type of Bedouin dessert', *čänna* 'daughter-in-law', *hēč* 'thus', *čalb* 'dog', *yčammilen* 'they (f.) complete', *čam* 'how much', *čitīr* 'a lot', *mičān* 'place'.

Given the small number of instances of this variant, it is possible to conclude that these lexemes are "vestigial variants" and represent some "fossilised traces of an earlier dialect system when such forms were more general" (Trudgill 1999, p. 321).

Even the morphological contrast between *-k* and *-č*, which allows the opposition of the pronominal suffix of the 2nd m. sing. person and the pronominal suffix of the 2nd f. sing. person, is poorly attested<sup>7</sup>. It occurs in the speech of the older and less-educated speakers, while the youngest and those who work or study in ʿAmmān replace the allomorph *-č* with *-ki*.

The recessive status of the affricate /č/ is particularly evident within a family I interviewed: the use of the affricate /č/ gradually disappears over the three generations of women who live in the same household<sup>8</sup>:


This example clearly illustrates the shift, through sedentarization, from a purely Bedouin trait of the dialect to a more mixed one.

#### Reflexes of the Old Diphthongs /ay/ and /aw/

The monophthongisation in /ī/ and /ū/ of the old diphthongs /ay/ and /aw/ does not appear as a prominent trait in this dialect.

In *Studies in the Arabic Dialect of the Semi-Nomadic əl-ʿAǧārma Tribe*, Palva (1976, p. 19) reports: "one of the most striking characteristics distinguishing the dialects of the nomadic type from those of the sedentary type in the Syro-Palestinian dialect area is the fluctuation /ē/-/ī/ and /ō/-/ū/".

This feature is briefly mentioned by Cantineau in *Les parlers arabes du Ḥōrān* (Cantineau 1946, p. 156), who refers to it as occurring in sporadic cases, while Bettini (2006, p. 30) only accounts for a fluctuation between the realizations /ē/ and /ī/ of /ay/ (which appears as a common trait in the dialects analysed in *Contes Féminins de La Haute Jézireh Syrienne*) and a long vowel /ō/ for the diphthong /aw/.

Behnstedt (1997, pp. 62–63), on the other hand, does not report this fluctuation at all in Map 31 of his *Sprachatlas von Syrien*.

In the corpus under analysis there are only four occurrences of monophthongisation in /ī/ and /ū/ in natural speech: *čīl* 'measure, weigh', *čīf* 'how', *ḏ̣īf* 'guest' and *hūšāt* 'fights', together with a sentence reported by a man describing the Bedouin traditions concerning the drinking of coffee:


One says: the first cup is for the guest, the second to talk about war, and the third to discuss injustice.

In addition to these instances, two examples of a fluctuation between the realizations /ē/ and /ī/ were obtained by elicitation: *līl~lēl* 'night' and *zīt~zēt* 'oil'.

According to the tendencies that appear in the data, the realizations /ē/ and /ō/ of the old diphthongs /ay/ and /aw/ have gained ground in the dialect of the Bani ʿAbbād, to the detriment of the most archaic ones. So, it is possible to observe: *bēt* 'house, tent', *bēn* 'between', *ġēr* 'other', *xēr* 'good, well', *xēl* 'horse', *ʿēla* 'family', *ʿēn* 'eye', *tōr* 'bull', *yḥōšen* 'they (f.) plough', *zōǧa* 'wife', *xōf* 'fear', *fōg* 'above', *ṭōr* 'cave'.

This state of affairs suggests that the above-mentioned statement by Palva (1976) concerning the monophthongisation in /ī/ and /ū/ does not currently hold true for this dialect. The few instances of the phonemes /ī/ and /ū/ (< /ay/ and /aw/) may thus represent the last traces of an older monophthongisation system in which such forms were more widespread; in recent decades they have been replaced by more sedentary forms.

#### Cases of Trochaism

According to Cantineau (1936, p. 114), trochaism is a typical trait of some *petits nomades* dialects, namely those of the ʿOmūr, Ṣlūt, Bani Xāled and Sirḥān.

Dialects characterized by a trochaic rhythm maintain the older Arabic inflectional *-a* as well as the *-a* of the feminine morpheme between a long syllable and a pronominal suffix (Palva 1976, p. 25), i.e., CvCC or Cv̄C + pron. suff. > CvCCA + pron. suff. and Cv̄CA + pron. suff.
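The template rule just described can be sketched in code. This is an illustration only and not part of the original analysis: the function name `attach_suffix` and the schematic notation (`C` for a consonant, `v` for a short vowel, `V` for a long vowel) are assumptions introduced here, and the sketch ignores any further phonological adjustments that real forms undergo.

```python
# Schematic syllable templates: C = consonant, v = short vowel, V = long vowel.
# Hypothetical helper for illustration; not taken from Palva (1976) or Cantineau (1936).

def attach_suffix(stem: str, suffix: str, trochaic: bool) -> str:
    """Attach a pronominal suffix to a schematic stem template.

    In a trochaic dialect, a stem ending in a long syllable (CvCC or CVC)
    keeps the vowel -a- before the suffix; otherwise the suffix is attached
    directly to the stem.
    """
    ends_in_long_syllable = stem.endswith("CC") or stem.endswith("VC")
    if trochaic and ends_in_long_syllable:
        return f"{stem}a+{suffix}"
    return f"{stem}+{suffix}"


# Compare the trochaic and atrochaic outcome for three stem templates.
for template in ("CvCC", "CVC", "CvC"):
    print(
        template,
        attach_suffix(template, "suff", trochaic=True),
        attach_suffix(template, "suff", trochaic=False),
    )
```

Only the two heavy templates receive the linking vowel in the trochaic setting; the light template `CvC` is left unchanged in both settings.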

In my data this phenomenon is poorly attested (see Tables 1 and 2), and it only rarely appears in the speech of the oldest speakers:

**Table 1.** Trochaic occurrences in verbs.



**Table 2.** Trochaic occurrences in nouns and prepositions.

Given the small number of instances of this trait, it is possible to conclude that nowadays the dialect of the Bani ʿAbbād belongs to the atrochaic group, even if the vestigial variants found in the corpus suggest that a trochaic rhythm characterized the vernacular at an earlier stage. This change in the rhythm of the syllable is most likely due to the influence of the surrounding sedentary dialects, which display an atrochaic pattern.

#### 3.1.2. Morphology

#### Personal Pronouns

The independent forms of personal pronouns in the dialect of the Bani ʿAbbād are illustrated in Table 3:


**Table 3.** Independent personal pronouns.

For the 1 s., it is possible to observe an alternation between the forms *ʾana*~*ʾani*.

With regard to the dialect of the ʿAǧārma, Palva (1976, p. 27) states: "it is impossible to decide whether [the first variant] is a traditionally genuine form or a loan from the neighbouring dialects [Bani Xāled, Sirḥān, etc.]".

The second form, *ʾani*, is regarded by Younes and Herin (2016, p. 4) and Isaksson (1999, p. 59) as characteristic of the *šāwi* tribes, in free variation with the form *ʾāni*, already mentioned by Cantineau (1937, p. 173). Behnstedt (1997, p. 501), Herin (2010, p. 46) and Al Tawil (2019, p. 138) report that this form is commonly found in the Ḥōrān (both Syrian and Jordanian) and that it also marginally appears in the dialect of Salṭ.

As for the dialect of the ʿAbābīd, it appears only in sporadic instances:


The 1 p. also exhibits two variants: *ʾaḥna*~*ʾəḥna* (49) and *ʾiḥna* (20). The form *ḥənna*, attested in the neighboring tribe of the ʿAǧārma and phonetically associated with the *gahawa* syndrome, does not occur. Palva (1976, p. 26) affirms that the form *ʾəḥna* is "a stylistic variant" of *ḥənna*, and that their respective use depends on the form found in the tribe to which the ʿAǧārma address their speech.

It is possible to suppose that *ḥənna* may also have been employed in the dialect of the Bani ʿAbbād but that, due to contact with the adjacent sedentary dialects, it disappeared, while the borrowing *ʾiḥna* penetrated the vernacular and started to become more popular, especially among younger speakers.
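The relative weight of the two 1 p. variants can be made explicit with a short calculation. The sketch below assumes that the bracketed figures given above (49 and 20) are token counts in the corpus; the variable names are illustrative only.

```python
# Token counts for the 1 p. independent pronoun, as given in the text
# (assumed here to be raw occurrence counts in the corpus).
forms = [("ʾaḥna~ʾəḥna", 49), ("ʾiḥna", 20)]

total = sum(count for _, count in forms)
shares = [round(100 * count / total, 1) for _, count in forms]

for (form, count), share in zip(forms, shares):
    print(f"{form}: {count} of {total} tokens ({share}%)")
```

On this reading the borrowed form *ʾiḥna* already accounts for a little under a third of the tokens.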

Regarding the bound pronouns, it is worth mentioning that, due to the recessive character of *-č*, the original *šāwi* morphological contrast between the pronominal suffix of the 2nd m. sing. person *-k* and that of the 2nd f. sing. person *-č* is not consistently maintained.

The feminine bound pronoun forms *-eč* and *-č* occur 12 times in total. The only instance that occurs in free speech is *xuwayāteč* 'your (f.) friends (m.)' (2), while the other words containing this bound pronoun are to be found in optative sentences, in a few words occurring in some lines of poems and in some traditional quotes: *salāma tsallmeč* '(may you) live in good health' (1), *(Aḷḷah) yisʿid ʿumreč* 'may God make you (f.) happy for all your (f.) life' (1), *salāmteč* 'I wish you (f.) good health' (1), *ʿammeč* 'your (f.) paternal uncle' (1), *yxāfeč* 'you (f.) are afraid' (3), *la-ḥāleč* 'alone (f.)' (1), *ʿīneč* 'your (f.) eye' (1), *ʾaḥebbeč* 'I love you (f.)' (1).

Thus, even if, as noted by Palva (1976, p. 47), "there is [still] a (bedouinizing) tendency which is actualized in certain speech situations associated with traditional culture", in this case religious formulae, poetry and citations of famous quotes, the allomorph *-č* is nowadays being replaced by the sedentary form *-ki*.

#### Interrogative Pronouns

The interrogative pronouns occurring in the data collected are reported in Table 4:


**Table 4.** Interrogative pronouns.

The traditional Bedouin forms (in bold in Table 4) are very marginal in the dialect of the ʿAbābīd: *min* (3), *wēš* (1), *čīf* (1), *lwēš* (6), and they are to be found only in the recordings of the older speakers.

The variants *mita*, *ʾemta* and *waymat ¯* for 'when' and *čam* for 'how much' were obtained only through elicitation and were never used by the speakers in natural speech.

As for *šlōn*, its use seems to be limited to the question *šlōn-ak/ič?* 'how are you' and it does not occur in other positions.

Sedentary forms (which are to be considered as loans from the dialects of ʿAmmān and Salṭ) are the most attested ones in the corpus, and in many cases they have already completely replaced their genuine nomadic counterparts.

#### Verbs C1=ʾ

The weak verbs C1=ʾ *ʾaxaḏ-yāxuḏ* 'to take' and *ʾakal-yākul* 'to eat' are as a rule reinterpreted as III *w/y* verbs in *šāwi* dialects (Younes and Herin 2016, p. 10) and realized as *kala* (or *čala*) and *xaḏa*.

With respect to this trait, Cantineau (1936, p. 87) writes: "cette conjugaison caractérise d'une façon remarquable les parlers de nomades moutonniers et les oppose aux parlers de sédentaires syro-palestiniens, qui ont toujours *ʾakal*, *ʾaxaḏ*".

However, as illustrated in Tables 5 and 6, the dialect of the Bani ʿAbbād shows the coexistence of two patterns for these two verbs, in both the perfective and the imperfective: one typically Bedouin (*xaḏa–kala/yāxuḏ-yākul*) and one sedentary (*ʾaxaḏ-ʾakal/yōxuḏ-yōkil*).


**Table 6.** Inflexion of imperfective.


The forms reported by Cantineau (1936, p. 87) were fully confirmed by elicitation (albeit with some minor differences), while the instances contained in the recorded corpus are mixed:


(3) *rah¯.* <sup>Q</sup>*and aš-šex t ¯ .alabha, iši h. abbha al-muhimm, banat gar ¯ aybo w an-n ¯ as illi yagrab ¯ ulo m ¯ a h ¯ . abbu inno yh. ibbha la*<sup>P</sup>*anno t¯ ari u mrattab h ¯ . abbu inno yoxid ¯ ¯ minhum u ma¯ xad¯ a minhum.*

He went to the sheikh to ask for her hand, what is important is that he loved her, the girls of his relatives and the people who were related to him, did not like [the fact] that he loved her, because he was rich and wealthy, [so] they wanted him to take one of them, [but] he did not choose one of them.


The traditional paradigms still represent the majority of occurrences in the texts (52.5%); however, it is possible to observe a growing use of the sedentary forms (47.5%).

It is interesting to notice that the morphological variation between the two forms occurs only in natural speech but not when reporting some highly traditional tales or poems.

In this respect, Henkin (2010, p. 219) reports: "of all the oral registers, it is vernacular that reacts most significantly to dialectal and demographic variables [ ... ]. In contrast, the traditional registers of oral literature, including oral narrative and oral poetry [ ... ] are less affected by everyday communicational needs".

The use of one variety over the other depends on sociolinguistic factors, namely age and level of education. The use of sedentary paradigms is increasing, and they are replacing the traditional forms mostly in the speech of the youngest generations, who have greater contact with the inhabitants of the biggest urban centers.

#### Prepositions

The preposition *ʿugub* 'after' represents a vestigial variant in the dialect of the Bani ʿAbbād, who nowadays use the form *baʿd*.

According to Herin (2010, p. 132), "dans le parler traditionnel, *baʿd*- signifie 'encore' et non 'après' qui est une influence des parlers urbains (ʿAmmān et Palestine)".

In the corpus under analysis, *ʿugub* occurs only twice, in the speech of the oldest people interviewed, i.e., a woman aged 90 and a man aged 94:

Who ruled after them? The English.

This preposition was probably more widespread some decades ago, when its sedentary counterpart *baʿd* had not penetrated so deeply into the vernacular. The current status of *ʿugub* shows once again the linguistic changes in progress in this dialect, in the direction of a de-bedouinization of the most conservative and traditional forms, which facilitates communication with those outside the tribe.

#### Conditional Conjunctions

The most frequently attested conditional conjunction in the data collected is *ʾiḏa* (39):


If you don't shake the cup, one keeps pouring you (coffee).

However, this form does not belong to the original ʿAbbādi repertoire; it is a progressive sedentary variant that has supplanted the traditional conjunction used to introduce conditional clauses, *čān*, which has never been employed by the speakers.

#### 3.1.3. Syntax

#### Genitive Exponent

According to Younes and Herin (2016, p. 12), the most frequent form of genitive exponent found in most *Šāwi* dialects is *giyy*.

This form, first reported by Cantineau (1946, p. 204), was also attested by Cleveland (1963, p. 61), who defined it as characteristic of the Jordanian Bedouin dialects.

In the dialect of the ʿAbābīd there are no instances of this local variant, and the only genitive exponent occurring in natural speech is the more general and widespread Levantine *tabaʿ*, which has to be regarded as a borrowing from the adjacent urban dialects:


The cauldron is a big pan where they put the grain of wheat

The forms *giyy* and *šiyy*<sup>12</sup> were productive at an earlier stage, as reported by a speaker from ʿArāg al-ʾAmīr, but today they are only used in the field of trade:

(13) *Dar¯ g¯ıti dar taba ¯* <sup>Q</sup>*ti, hal-kalime kanat maw ¯ gˇuda, ¯ g¯ıti aw š¯ıti, ya*Q*ni mulki w ili, a zayy h ¯ ek¯ a,¯ al-banat¯ giyyati ¯ aw h. alal¯ ati ya ¯* <sup>Q</sup>*ni ili h. atta* <sup>Q</sup>*an al-ganam yg ˙ ulu-lo ¯ giyyati ¯ , kanat maw ¯ gˇude ¯ hassa¯*QQ *bista*Q*maluha at-tu ¯ gˇgˇar¯* <sup>Q</sup>*enna, mat¯ alan ygul lak ¯* <sup>P</sup>*ana gayyati ¯ 3 alaf¯* <sup>P</sup>*ana gayyati ¯ 10 alaf, ya ¯* <sup>Q</sup>*ni flusi. ¯*

*Dār gīti*<sup>13</sup> or *dār tabaʿti* 'my house', *gīti* or *šīti*, this word existed, it means of mine, mine, like this yes, *al-banāt giyyāti* 'my daughters' or *ḥalālāti* 'my cattle' for example, one also used to say *al-ġanam giyyāti* 'my sheep', this existed but now businessmen use it, to say my 3 or 10 thousand, to indicate my money.

It is possible to conclude that the progressive variant *tabaʿ* has superseded the equivalent traditional forms of genitive exponents in the dialect of the Bani ʿAbbād.

A similar scenario is also attested by Procházka (2018), Younes (2014) and Younes and Herin (2013) for the *Šāwi* Bedouin dialects spoken in Syria<sup>14</sup> and for those spoken in Lebanon by the Abu ʿĪd and the ʿAtīǧ.

This state of affairs confirms the analysis of Palva (1982, p. 28), who states that "the genitive exponent belongs to the features in the dialects that are particularly exposed to koineization".

#### Negation

Negation in the dialect of the Bani ʿAbbād has also undergone a process of sedentarization.

It is possible to notice that the negative form *miš* has gained ground in non-verbal negation at the expense of the genuine negative particles *mā* and *mū* (Palva 1976, p. 42; Al Tawil 2021, p. 23). These two variants represent, respectively, 16.45% and 13.92% of non-verbal negation in the data:


Before one [used to] dance the *šah. ge* and not the *dabka*.

In total, 63.29% of this type of negation is constituted by the item *miš* and the remaining 5% by *muš*, which has to be regarded as an older borrowing as it is already attested in Ḥōrān (Cantineau 1946, p. 389):


It is also interesting to point out that the negative structure *mā fi-š* has begun to spread in the speech of the youngest speakers of the tribe<sup>16</sup>:


According to Palva (1976, p. 42), the nominal negation *miš* and the structure *mā fi-šš* are to be considered as K-forms borrowed from the neighbouring sedentary dialects and not as genuine Bedouin negation.

However, the high frequency of the use of *miš* suggests that this negation has already been integrated into the dialect of the Bani ʿAbbād and that the other particles, i.e., *mā* and *mū*, which used to be employed in non-verbal negations, are to be regarded as vestigial variants.
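The shares reported for non-verbal negation can be tallied as a quick consistency check. The sketch below uses only the three percentages stated above for *miš*, *mā* and *mū*; the variable names are illustrative, and the underlying token counts are not given in the text, so only percentages are manipulated.

```python
# Shares of non-verbal negation reported in the text, in percent.
reported_shares = {"miš": 63.29, "mā": 16.45, "mū": 13.92}

# Portion of the data covered by the three particles above.
accounted_for = round(sum(reported_shares.values()), 2)

# Whatever is left over is the share attributable to the fourth item, muš.
remainder = round(100 - accounted_for, 2)

print(f"miš + mā + mū: {accounted_for}%")
print(f"left for muš:  {remainder}%")
```

The three particles together cover 93.66% of the cases, which leaves 6.34% for *muš*, slightly more than the "remaining 5%" mentioned above; that figure is presumably an approximation.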

#### *b*-Imperfect

Cantineau (1936, p. 83) affirms: "Le *b*-préfixe de l'inaccompli, si caractéristique des parlers des sédentaires syro-palestiniens, fait entièrement défaut dans les parlers des nomades". This statement, however, no longer holds true for the dialect of the Bani ʿAbbād.

In fact, the use of the morpheme *b*- prefixed to the imperfect is no longer excluded from this Bedouin dialect, and it is possible to find 53 occurrences of this trait in the corpus:


The adoption of the *b*-imperfect to mark habitual actions which take place in the present can be considered an ongoing process in the dialect of the ʿAbābīd, and thus it is not rare to find speakers using both forms (with and without the *b*- prefix) in one and the same sentence:

(24) *Fa yitgawwaz az-zalama <sup>ˇ</sup>* ... ij*id ¯ a kan fi ¯* " *indhum dar¯ biskin bi-nafs ad-dar ygullak waladna m ¯ a¯ bit.la*<sup>Q</sup> *barra ya*ޑA*ni muh. arram inno yit.la*<sup>Q</sup> *yiskin xar¯ g al-mant <sup>ˇ</sup> .ega illi humma sakn ¯ ¯ın f¯ıha* ... *biz.*- *z. abt.... hay al ¯* - <sup>Q</sup>*ada. ¯*

Then the man gets married ... if they have a house, he lives in the same household, one says: our boy does not go elsewhere. That is to say that it is forbidden (for him) to go live outside the area where they live . . . Exactly . . . this is the custom.


Thus, it seems that, at this stage, this syntactic feature is used in free variation with the imperfect without *b*-, as among the Negev and Sinai Bedouins (Palva 1994, p. 462; Blanc 1970; De Jong 2000).

The *b*-imperfect can also be used to mark the progressive aspect of an action. However, in the corpus under analysis there is only one instance of this type of structure:


This sedentary trait has mostly penetrated the speech of the younger members of the tribe, who tend to conform more to the vernaculars of ʿAmmān and Salṭ, and so it reveals "a generational variation amongst the receptiveness of the borrowings" (Younes 2017, p. 136).

#### Future

The most prominent way in *Šāwi* varieties to express future and volition is the use of the pseudo-verbs *widd*- and *rād–y(i)rīd* 'to want'<sup>17</sup> (Younes and Herin 2016, p. 11). The first particle is well attested in the dialect of the Bani ʿAbbād, while the verb *rād* is completely absent from it.

It is worth noticing that the pseudo-verb *widd*- occurs together with the sedentary variant *bidd*- throughout the corpus:


Since the number of instances of these two forms is almost equivalent<sup>18</sup>, it is possible to affirm that the Bedouin and sedentary varieties of this pseudo-verb coexist in this vernacular. However, their use differs by age group: speakers of the younger generation treat the two forms as equally acceptable, while the older ones (55 years and older) seem to prefer the genuine Bedouin item over the *madani* one.

#### **4. Discussion**

#### *4.1. Bedouin or Sedentary?*

In the light of the traits analyzed above and of the commonly used divisions of Arabic dialects, the classification of the dialect of the Bani ʿAbbād is not straightforward.

Given its historical background and geographical position in the Balga district, it should typologically belong to the Bedouin Jordanian *ygūl*-type (part of the broader *Šāwi* dialect *continuum*) or the recently theorized Central Bedouin *ygūlu* group. However, can it really be unproblematically labelled as such? Is it still appropriate to refer to this vernacular as a Bedouin dialect?

Such classifications, which are based on typical but nonetheless very few features, do not in fact account for all the discrepancies found in this vernacular and do not allow an accurate representation of the actual state of the dialect.

As pointed out in the section *Linguistic Analysis*, this dialect diverges in many respects from the "essential structure" (Palva 1969, pp. 14–15) of the *ygūl* group, and it would be inaccurate to identify this dialect of the Jordan Valley with the *šāwi* label.

In fact, in addition to the elements already mentioned throughout the article, it is also possible to observe from Table 6 the lack of the formative /n/ in the 3. m. pl. of the imperfective. This constitutes a noticeable difference (*yāxḏu* vs. *yāxḏūn*) from the typical inflection found in *šāwi* dialects (Younes and Herin 2016, p. 8) and makes cataloguing this vernacular even more intricate. Furthermore, other genuine *šāwi* characteristics have been replaced by sedentary forms, while some typically *madani* items are being acquired in addition to the original Bedouin ones.

All the features taken into account for this analysis indicate that this dialect is in a situation of dialect contact and is undergoing an ongoing process of linguistic change.

Holes (1995, p. 278) notes that during the last 70 years "Jordan has experienced a long-term drift from the countryside and the desert into its towns and cities", and massive migrations from neighboring countries, namely Palestine and Syria. As in other Middle Eastern states, "these social and political developments have had, and continue to have profound effects in spoken Jordanian Arabic varieties" (Holes 1995, p. 278), which in some cases tend to overlap more and more.

As Romaine (2010, p. 321) states:

"When groups in contact need to communicate, they have a number of possible choices. One is to use a lingua franca they both share [ ... ]. A second option is for one or more parties to learn the other group's language(s). In cases involving no substantial imbalances of power between the groups, stable multilingualism may result. However, where bilingualism is asymmetrical and the more powerful group imposes its language on a subordinate group, contact often leads to language shift or loss".

The second scenario applies in this case. The members of the Bani ʿAbbād have day-to-day and long-lasting relations with the speakers of adjacent sedentary dialects and, in order to facilitate communication and mutual intelligibility, avoid the use of traditional Bedouin features and adopt forms that can be easily understood by outsiders to the tribe<sup>19</sup>.

However, the linguistic patterns of accommodation and borrowings are not identical for all the people of the community (Palva 1982).

Traditional Bedouin features are disappearing faster from the speech of the young members of the tribe. Moreover, 50 years ago there was already a marked tendency for women to use sedentary features in the everyday vernacular (Palva 1976), since these are perceived as more feminine and sophisticated than their Bedouin counterparts, which are associated with rough and tough masculinity (Holes 1995).

The borrowing of sedentary speech-habits does not mean that the Bani ʿAbbād "give up their language willingly, but continue transmitting [it], albeit in changed form over time" (Romaine 2010, p. 321).

The members of the tribe are aware of this state of affairs, and in fact one of the older members declares in one of the recordings:



These changes by sedentarization<sup>20</sup> result in a hybrid form of dialect which may be considered as a *dialect compromise* (Holes 1995) built on Bedouin and sedentary forms, which is not simple to define.

#### *4.2. Emergence of a Mixed Dialect Type by Sedentarization*

The members of the linguistic community in Jordan do not have equal command of all the varieties in use in the country. Thus, Bedouin speakers often accommodate to the more common speech patterns of the city dwellers for the sake of intelligibility. Over time this convergence materially affects the distinctiveness of a dialect in a situation of language contact (Gumperz 1969, p. 436).

Already in 1992 Palva illustrated the issue of the typological division into Bedouin and sedentary dialect types in Jordan, basing his analysis on the vernaculars spoken in the cities of Salṭ and Karak: "both dialects are labelled as *bəgūl* dialects. [ ... ] However, as a matter of fact (they) display many typically Bedouin features, so markedly different from Syro-Palestinian sedentary dialects" (Palva 1992, pp. 53–54).

In this study of the dialect of the Bani ʿAbbād the problem is the reverse, i.e., a historically Bedouin-type dialect that shows traits of sedentary dialects, and that thus exhibits important markers of both dialect types.

Because of the newly established *madani* features, it is not possible to label this dialect only as Bedouin, and because it retains some *šāwi* traits it cannot be defined as a sedentary dialect either.

The traditionally dichotomous split of Arabic dialects in terms of Bedouin versus sedentary (Versteegh 1984; Rosenhouse 1984, 2006; Cadora 1992) is no longer sufficient to describe and classify varieties like the Bani ʿAbbād dialect, i.e., languages which have experienced relatively high degrees of contact to the extent that change is additive (Trudgill 2010, p. 301).

As pointed out by Lentin (1994) and Watson (2011), the mere use of this bipolar division is inaccurate and gives rise to a number of difficulties especially when classifying dialects in a situation of language contact.

Given the linguistic developments that have occurred in the dialect of the Bani ʿAbbād, it seemed appropriate to define it as a mixed type, in order to better underline its particular nature.

However, the emergence of mixed-type dialects such as this one draws attention to the need to conceptualize new ways of grouping dialects and to adopt further criteria of analysis, especially morphological and syntactic ones.

**Funding:** This research was partially funded by INALCO, with a grant destined to a fieldwork campaign in July and August 2016.

**Institutional Review Board Statement:** Ethical review and approval were waived for this study since it guarantees the anonymity of the participants. The study was conducted according to the guidelines of the Declaration of Helsinki. The interviewees and informants have always been regarded as partners who had a word in the (re)use of the material furnished by them. In order to respect all ethical demands usual in linguistic studies, they were always informed about the aims of the research, and before recording them, their consent was asked for publishing their speech in printed sources and online. Furthermore, the interview partners have never faced any problems with the Jordanian authorities because of the cooperation with the author. First, because most interviews and recordings were made by a local person who accompanied the author and, second, because research on spoken Arabic is not a sensitive topic in Jordan.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Only a private dataset was analyzed for this study. It stems from the author's personal fieldwork in Jordan in 2016 and 2017, and it is available on request from the corresponding author and with the permission of the informants.

**Acknowledgments:** I would like to thank Stephan Procházka, Veronika Ritt-Benmimoun, Francesca Bellino and Bettina Leitner for their valuable remarks and critical comments on the draft versions of this article.

**Conflicts of Interest:** The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Notes**


#### **References**

Al Tawil, Miriam. 2019. La langue arabe parlée dans le Ḥōrān. Master's thesis, Università degli Studi di Catania, Catania, Italy.

Al Tawil, Miriam. 2021. *Morpho-Syntactic Features of Bedouin Varieties in Northern Jordan*. Naples: Maydan, Working Paper in Publication.

Behnstedt, Peter. 1997. *Sprachatlas von Syrien. I: Kartenband, Semitica Viva 17*. Wiesbaden: Harrassowitz.

Bergsträsser, Gotthelf. 1915. Sprachatlas von Syrien und Palästina. *Zeitschrift des Deutschen Palästina-Vereins* 38: 169–222.

Bettini, Lidia. 2006. *Contes Féminins de La Haute Jézireh Syrienne. Matériaux Ethno-Linguistiques d'un Parler Nomade Oriental*. Florence: Dipartimento di Linguistica.

Blanc, Haim. 1970. *The Arabic Dialect of the Negev Bedouins*. Jerusalem: Academic Press.

Brustad, Kristen. 2000. *The Syntax of Spoken Arabic: A Comparative Study of Moroccan, Egyptian, Syrian, and Kuwaiti Dialects*. Washington, DC: Georgetown University Press.

Cadora, Frederic J. 1992. *Bedouin, Village and Urban Arabic: An Ecolinguistic Study*. Leiden: Brill.

Cantineau, Jean. 1936. Études Sur Quelques Parlers de Nomades Arabes d'Orient 1. *Annales de l'Institut d'Études Orientales* 2: 1–118.

Cantineau, Jean. 1937. Études Sur Quelques Parlers de Nomades Arabes d'Orient 2. *Annales de l'Institut d'Études Orientales* 3: 119–237.

Cantineau, Jean. 1946. *Les Parlers Arabes du Ḥōrân*. Paris: Klincksieck.

Cleveland, Ray L. 1963. A Classification of the Arabic Dialects of Jordan. *Bulletin of the American Schools of Oriental Research* 171: 56–63.

Creissels, Denis. 1995. *Éléments de Syntaxe Générale*. Paris: Presses Universitaires de France—PUF.


Versteegh, Kees. 1984. *Pidginization and Creolization: The Case of Arabic*. Current Issues in Linguistic Theory, 33. Amsterdam: Benjamins.

Watson, Janet C. E. 2011. Arabic Dialects (General Article). In *The Semitic Languages: An International Handbook*. Edited by Stefan Weninger. Berlin and Boston: De Gruyter Mouton, pp. 851–96.

Younes, Igor, and Bruno Herin. 2013. Un Parler Bédouin Du Liban: Note Sur Le Dialecte Des ʿAtīǧ (Wādī Xālid). *Zeitschrift Für Arabische Linguistik* 58: 32–65.

Younes, Igor, and Bruno Herin. 2016. Šāwi Arabic. In *Encyclopedia of Arabic Language and Linguistics*, online ed. Leiden: Brill.

Younes, Igor. 2014. Notes Préliminaires Sur Le Parler Bédouin Des Abu ʿĪd (Vallée de La Békaa). *Romano-Arabica* 14: 355–87.

Younes, Igor. 2017. Dialect Contact in the Beqaa Valley. *Romano-Arabica* 17: 131–40.

## *Article* **The Classification of Bedouin Arabic: Insights from Northern Jordan**

**Bruno Herin <sup>1,</sup>\*, Igor Younes <sup>2</sup>, Enam Al-Wer <sup>3</sup> and Youssef Al-Sirour <sup>4</sup>**


**Abstract:** The goal of the present paper is to provide a re-evaluation of the classification of the Bedouin dialects of Northern Arabia and the Southern Levant, based on published or publicly available data and on first-hand data recently collected amongst some Bedouin tribes in Northern Jordan. We suggest extending previous classifications that identify three types of dialects, namely A (*ʿnizi*), B (*šammari*), and C (*šāwi*). Although intermediary or mixed types combining *šammari* features with *šāwi* features were already noted, our data suggest that further combinations are possible, either because they had so far gone unnoticed or because recent levelling and dialect mixing have blurred the boundaries between some of the varieties.

**Keywords:** Arabic dialectology; classification; Bedouin Arabic; Jordan; Masāʿīd

#### **1. Introduction**

The goal of the present paper is to provide a re-evaluation of the classification of the Bedouin dialects of Northern Arabia and the Southern Levant, based on published or publicly available data and on first-hand data recently collected by the authors amongst some Bedouin tribes in Northern Jordan. We suggest extending Cantineau's (1936, 1937) classification, which identifies three types: A (*ʿnizi*), B (*šammari*), and C (*šāwi*). Although Cantineau already noted intermediary or mixed types combining *šammari* features with *šāwi* features, our data suggest that further combinations are possible, either because they have so far gone unnoticed or because recent levelling and dialect mixing have blurred the boundaries between some of the varieties. Foundational surveys include Cleveland (1963), who, much in the same way as Blanc (1964) coined the *gilit–qəltu* dichotomy, coined the dialectonyms *biqūl*, *bikūl*, *bigūl*, *biʾūl*, and *yigūl* based on the 3.m.sg. of the imperfective of the verb \**qāl* 'he said'. Further developments can be found in Palva (1984), who divides the Bedouin dialects of the Southern Levant into four groups: the dialects of the Negev Bedouins (*bigūl*), the dialects of southern Jordan (*yigūl*), the dialects of the Syro-Mesopotamian sheep-rearing tribes (*yigūl*), and the dialects of the North Arabian Bedouins (*yigūl*).


The problem with the *biqūl–yigūl* appellation is that it fails to capture a major split in Jordan, namely between dialects that exhibit final /n/ in the imperfective endings *-īn* and *-ūn* and those which exhibit *-ī* and *-ū* (Herin 2019). Using the 3.m.pl. of the imperfective of *qāl* partially solves this problem and, combined with geography, yields the following classification: Southern *ygūlu*, Central *ygūlu*, and Northern *ygūlūn*. Central *ygūlu* is in many ways identical to the Northern *ygūlūn šāwi* C; the presence or absence of /n/ is the main difference. Only Southern *ygūlu* is an extension of the North-West Arabian type (Palva 2011). Our focus will be the hitherto under-studied Northern *ygūlūn* type, and in particular the *Misāʿīd* dialect, which exhibits many *šammari* features such as the apophonic passive (*yiḏkar* 'it is remembered') or a [d<sup>j</sup>] reflex of \*/ǧ/ (*dyibal* 'mountain'), but also *šāwi*-like traits such as [q] < /ġ/ (*qēr* 'other') and, more surprisingly, features that are reminiscent of North-West Arabian such as the resyllabification of \*ʾinC1aC2aC3a ... into ʾinC1C2v́C3a (*ʾin<sup>ə</sup>ḥkúmat* 'it was ruled'). Consequently, the major taxonomies have to be combined to represent the overall picture more accurately. Additionally, sociolinguistic developments which have affected the classification of these dialects, such as dialect contact and koineization, need to be incorporated.

**Citation:** Herin, Bruno, Igor Younes, Enam Al-Wer, and Youssef Al-Sirour. 2022. The Classification of Bedouin Arabic: Insights from Northern Jordan. *Languages* 7: 1. https://doi.org/10.3390/languages7010001

Academic Editors: Simone Bettega and Roberta Morano

Received: 9 August 2021 Accepted: 28 October 2021 Published: 23 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The data on which this paper draws were collected amongst members of the *Misāʿīd* tribe in 2019 in the municipality of *Umm al-Ǧimāl* in Northern Jordan, twenty kilometres east of Mafraq. With the help of Youssef Al-Sirour, a permanent resident of *Umm al-Ǧimāl* and an immediate member of the community under investigation, we visited local families and recorded two casual conversations. Because of the limited nature of the corpus, the present discussion should be considered provisional until more data are collected. We will first sum up Cantineau's classification, followed by those put forward by Cleveland and Palva. Based on our own observations, we suggest essential amendments to these classifications. We then present the salient features of the dialect, followed by a small sample taken from the recordings. The last part deals with the classification of the present dialect in the light of previous literature. We also highlight some methodological issues regarding data collection, levelling, and short-term accommodation.

#### **2. Cantineau's Classification**

The first scholar to draw up a comprehensive classification of the Bedouin dialects of Northern Arabia was Cantineau (1936, 1937). The first distinction relates to the occupational profile of the Bedouins located in this area, whom Cantineau called *"grands nomades"* ('great nomads') as opposed to *"petits nomades"* ('little nomads'). The former designates tribes which mostly rely, at least historically, on camel rearing, and the latter designates tribes which were mostly active in sheep rearing. This bipartite separation was further divided into three broad groups to which he attributed the letters A, B, and C. The A-group designates camel-rearers from the *ʿNiza* confederation. The B-group refers to camel-rearers from the *Šammar* confederation, whereas the C-group refers to the sheep-rearing tribes of the Syro-Mesopotamian *bādya* 'steppe'. More marginally, Cantineau also discusses three smaller subgroups: the variety of *ar-Rass* in the *Gaṣīm* region in the central-northern part of Saudi Arabia, the dialect of *al-Ǧōf* located in the far north of Saudi Arabia, and finally the dialects of the oases of the Syrian desert, *al-Qarītēn*, Palmyra and *Suxne*.

Some features of the A-group (*ʿNiza*) include the affricates [ts] and [dz] as reflexes of etymological /k/ and /g/ (Standard Arabic /q/) in the vicinity of front vowels: *ćalbati* 'my she-dog' (< *kalbati*), *ǵiddām* 'front' (< *giddām*). Etymological /ǧ/ can be realized [g<sup>j</sup>], [d<sup>j</sup>], and [ʤ]: *didyādya* 'hen' (< *daǧāǧa* 'hen'). The feminine ending *-a* exhibits no raising except in the vicinity of /i/, /ī/, or /j/, in which case it raises towards [æ]: *laḥyä* 'beard' (< *liḥya*). Etymological diphthongs /aw/ and /ay/ are not monophthongised although the distance between the two elements is reduced, yielding, respectively and approximately, [ow] and [ɛj]: *ǧowz* 'nut' and *beyt* 'tent'. An important feature is the so-called *gahawa* syndrome, understood as the insertion of an anaptyctic /a/ vowel between /ġ/, /x/, /ḥ/, /h/, or /ʿ/ and a following consonant, of the type Ø → /a/ / aX\_C, in which X is one of the aforementioned consonants and C is different from X: *ḏ̣ahr* → *ḏ̣ahar* 'back'. In addition to this, \*C1aC2aC3v sequences are resyllabified into C1C2v́C3v: *xšíba* < *xašaba* 'piece of wood'. The *gahawa* syndrome is also active in the passive participle template \*maC1C2ūC3, in which case it also combines with the resyllabification rule: *maḥṭūṭ* → *maḥaṭūṭ* → *mḥaṭūṭ* 'put'. Another important distinction introduced by Cantineau is trochaism vs. atrochaism. While these terms refer to a type of meter in Classical Greek poetry, his use of this parameter entails a particular syllabic type. Accordingly, Cantineau separates trochaic from atrochaic varieties. Trochaic varieties have the tendency to favour sequences of Cv/Cv syllables. CvC syllables are tolerated in final position or if followed by Cv or a final CvC/CvC: *iḥáṣadan* 'they (f.) harvest', *yākalan* 'they (f.) eat', *rāsa-na* 'our head', *nāgat-i* 'my she-camel'. Atrochaic dialects do not restrict sequences of CvC syllables: *iḥáṣdan* 'they (f.) harvest', *yāklan ~ yāčlan* 'they (f.) eat', *rās-na* 'our head', *nāgt-i* 'my she-camel'. The A group is strongly trochaic.
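The two rules just described can be sketched procedurally. The following is a minimal illustration, not Cantineau's own formalisation: the segment inventories, the function names, and the representation of words as lists of segments are all assumptions made for the example. Stress placement and changes in vowel quality (as in *xšíba*) are deliberately not modelled.

```python
# Illustrative sketch only: inventories and helper names are assumptions.
GUTTURALS = {"ġ", "x", "ḥ", "h", "ʿ"}
VOWELS = {"a", "i", "u", "ā", "ī", "ū", "ē", "ō"}


def is_consonant(seg):
    """Treat every segment that is not a listed vowel as a consonant."""
    return seg not in VOWELS


def gahawa(segments):
    """Insert /a/ after a guttural X in the context a X _ C, where C != X."""
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        prev = segments[i - 1] if i > 0 else None
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if (seg in GUTTURALS and prev == "a" and nxt is not None
                and is_consonant(nxt) and nxt != seg):
            out.append("a")
    return out


def resyllabify(segments):
    """Drop the first vowel of a word-initial C1 a C2 a C3 v sequence."""
    s = segments
    if (len(s) >= 6 and is_consonant(s[0]) and s[1] == "a"
            and is_consonant(s[2]) and s[3] == "a"
            and is_consonant(s[4]) and s[5] in VOWELS):
        return [s[0]] + s[2:]
    return list(s)


word = ["m", "a", "ḥ", "ṭ", "ū", "ṭ"]   # maḥṭūṭ
step1 = gahawa(word)                     # gahawa syndrome applies first
step2 = resyllabify(step1)               # then resyllabification
print("".join(word), "→", "".join(step1), "→", "".join(step2))
```

Applying the rules in this order mirrors the derivation cited above for the passive participle; a geminate guttural is left untouched because the rule requires C to differ from X.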

As far as morphology is concerned, these dialects feature the nominal suffix *-n* commonly called 'nunation' in Semitic studies, which essentially marks nouns denoting indefinite specific referents when they are complex NPs consisting of a nominal head and a modifier (Holes 2004). Another salient feature is the pronominal indexes which feature a final /n/ in the prefix conjugation: *t(v)gūlīn* 'you (f.) say', *t(v)gūlūn* 'you (m.pl.) say' and *y(v)gūlūn* 'they (m.pl.) say'. As far as bound pronouns are concerned, a noticeable trait is the allomorph *-ah* of the 3.f.sg. after a final weak root consonant: *ʿaly-ah* 'on her' and *abw-ah* 'her father'. The 2.m.sg. and 2.f.sg. in those dialects surface as *-k* and *-ć* after words ending in a short vowel: *farás-k* 'your (m.) horse'. The 2.m.pl. and 2.f.pl. forms are *-kam* and *-kin* and the 3.m.pl. and 3.f.pl. are *-ham* and *-hin*. Specific independent forms of free pronouns include 1.sg. *āna* and 1.pl. *ḥinna*. Another salient feature is the forms of the verbs *axaḏ* 'he took' and *akal* 'he ate', instead of *kala, xaḏa*.

As far as group B (*šammari*) is concerned, much of the phonology and morphology is shared with group A. Differences arise in the following features. As noted by Cantineau (1937, p. 130), "l'*imāla* de la terminaison féminine est nette et forte, à un tel point qu'elle semble résister au *tafxīm* d'une consonne précédente" ('the *imāla* of the feminine ending is clear and strong, to such an extent that it seems to resist the *tafxīm* of a preceding consonant'): *gargūre* 'she-lamb', *nāge* 'she-camel'. These dialects are also characterised by the lenition of the feminine plural ending *-āt* in pause, in which case it reduces to *-ā<sup>i</sup>*: *xams <sup>ə</sup>bṣalā<sup>i</sup>* 'five onions'. Concerning bound pronouns, *šammari* dialects exhibit *-ak* and *-ić* in the 2.m.sg. and 2.f.sg. without any vowel syncope. In addition to this, the 1.sg. allomorph *-an* surfaces in all positions: *ḏ̣rub-an* 'he hit me' (< *ḏ̣arab-an* → *ḏ̣aráb-an* → *ḏ̣rub-an*). Cantineau also notes the allomorph *-(w)o* after final long *-ā*: *ġadā-o* 'his lunch'. Our data suggest that this allomorph is selected after any long vowel, whether plain or monophthongised.

Group C dialects, also known as *šāwi* dialects, are spoken by the sheep-rearing tribes of the Syro-Mesopotamian *bādya* 'steppe' and its fringes. Distinct features include the affricates [ʧ] and [ʤ] as reflexes of /k/ and /g/ in front vowel environments. The reflex of etymological /ǧ/ is always the affricate [ʤ]. A slight raising towards [æ] of final *-a* and *-ā* is heard in non-back and non-velarised contexts: *šinīnä* 'butter milk', *iḥnä* 'we'. In terms of phonotactics, \*maC1C2ūC3 stems are not susceptible to the *gahawa* syndrome and hence, there is no resyllabification. *Šāwi* dialects are also atrochaic, in that sequences of CvC syllables are not restricted: *yihárban* 'they (f.) escape', *yāklan* 'they (f.) eat'. Specific morphological forms are 1.sg. *āni* 'I' and *iḥnä* 'we' for free pronouns and the pairs *-kum*/*-čən* and *-hum*/*-hən*.

#### **3. Cleveland's Classification of the Dialects of Transjordan**

Cleveland (1963) is an attempt to classify the dialects spoken in Jordan and Palestine, both sedentary and Bedouin. Cleveland coined new terms using the 3rd person singular of the verb *qāl* 'he said' in the imperfective in order to designate the different dialectal groups. His first cluster, which he calls *yigūl*, refers to all the Bedouin varieties which lack the *b-* prefix of the imperfective. The second group he distinguishes is *bigūl*, by which he refers to the sedentary populations of Jordan, including some locations on the west bank of the Jordan river. His third group is the *bikūl* type, which is characteristic of the sedentary rural populations of central Palestine. Lastly, the *biʾūl* group incorporates the sedentary urban populations of Palestine, including those which settled more recently in Jordan. Cleveland does not mention a *biqūl* group which would include the Druze dialect of Azraq, Northern Jordan. This dialect is as yet undocumented but research in this community is ongoing and the findings will be published in due course.<sup>1</sup> As we will see below, Cleveland's classification does not capture important differences found amongst the Bedouins. It also fails to capture the divergences amongst the indigenous sedentary dialects of Jordan, which, although all belong to the *bigūl* group, exhibit a sharp division between a southern *muʾābi* type and a northern-central *balgāwi-ḥōrāni* type.
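Cleveland's labels are built from two parameters: whether the imperfective takes the *b-* prefix, and the reflex of \*q in the verb 'to say'. A minimal sketch of this labelling scheme follows; the function name `dialectonym` and the dictionary layout are hypothetical, and only the four groups Cleveland himself names are included.

```python
# Hypothetical helper: assembles a Cleveland-style label from two parameters.
def dialectonym(b_prefix, q_reflex):
    """Return e.g. 'yigūl' (no b- prefix, *q > g) or 'biʾūl' (b- prefix, *q > ʾ)."""
    return ("bi" if b_prefix else "yi") + q_reflex + "ūl"


# Populations as characterised in the paragraph above.
GROUPS = {
    dialectonym(False, "g"): "Bedouin varieties (no b- prefix)",
    dialectonym(True, "g"): "sedentary populations of Jordan",
    dialectonym(True, "k"): "sedentary rural populations of central Palestine",
    dialectonym(True, "ʾ"): "sedentary urban populations of Palestine",
}

for label, population in GROUPS.items():
    print(label, "->", population)
```

Because the label encodes only these two parameters, any distinction that cuts across them, such as the differences within the Bedouin group discussed below, is invisible to it.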

#### **4. Palva's Classification**

Palva (1984) delves deeper into Cleveland's classification using a larger pool of variables. Palva mentions the urban Palestinian dialects, which correspond to Cleveland's *biʾūl*. As far as rural dialects are concerned, he distinguishes between Galilean dialects (*biqūl*), central Palestinian dialects (*biḳūl*), south Palestinian dialects (*bigūl*), north and central Transjordanian dialects (*bigūl*), and south Transjordanian dialects (*bigūl*). His classification of the Bedouin dialects includes those of the Negev Bedouins (*bigūl*), the dialects of southern Jordan (*yigūl*), the dialects of the Syro-Mesopotamian sheep-rearing tribes (*yigūl*), and lastly the dialects of the North Arabian Bedouins (*yigūl*). Palva's classification distinguishes well between all the subgroups of the sedentary types but lumps together sub-divisions within the Bedouin type that ought to be differentiated. In the dialects of the Syro-Mesopotamian sheep-rearing tribes, no distinction is made between the dialects of the Jordan valley and the *šāwi* type. As regards the dialects of the North Arabian Bedouins, no further distinction is made between Cantineau's A and B groups.

#### **5. Addenda to Cantineau, Cleveland, and Palva**

#### *5.1. Younes' Subgrouping of Ca*

So far, the only tribes that had been located, and in some cases investigated, spoke *šāwi*-type dialects and thus belonged to Cantineau's C group. These are for example the *Nʿēm*, *Lhēb*, and *Bani ʿAzz* who, in Lebanon, are mainly located in the Northern and Eastern parts of the country. The dialects spoken by these tribes are all unmistakably of the *šāwi* type, exhibiting features such as the /č/ and /ǧ/ reflexes of etymological /k/ and /g/, a first or second degree raising of final *-a* and *-ā* to [æ] or [ɛ], atrochaism, absence of the *gahawa* syndrome on the \*maC1C2ūC3 template, the pseudo-verb *wədd* 'want', and the lexeme *əṯəm* for 'mouth'. In recent fieldwork carried out in the central part of the Bekaa valley by one of the authors of the present study, two new Bedouin tribes were investigated: the *Abu ʿĪd* and the *ʿĪdīn*. Their presence in that part of the country had been, until then, unnoticed. The presence of *Ḥsina* clans, who are a big sub-section of the *ʿNiza* confederation and to whom the *Abu ʿĪd* and the *ʿĪdīn* are connected, was, however, already attested in Syria. The *Ḥsina* are to the *ʿNiza* what the *Ṭayy* are to the *Šammar* in that they are the first clans who migrated northwards into the Syro-Mesopotamian steppe around a millennium ago. This resulted in prolonged contact with Bedouin tribes who had migrated earlier into the area, such as the *Muwāli*, *Ḥadīdīn*, and *Nʿēm*, who had dominated the Syro-Mesopotamian steppe. The linguistic outcome of this prolonged contact was convergence towards the *šāwi* type. After investigation, it turned out that the dialect of the *Abu ʿĪd* and the *ʿĪdīn* exhibited a similar profile, with core *šāwi* features alongside *ʿnizi* features. For instance, these dialects exhibit no raising of *-a* and *-ā*, *gahawa* active in the \*maC1C2ūC3 template, the verb *yibi* 'he wants', and a more pervasive use of nunation. This state of affairs led us to coin a new term for this type of configuration, using Cantineau's terminology. Consequently, it seemed opportune to use the letter combination Ca to designate this type of dialect: upper case *C* for the *šāwi* component and lower case *a* for the *ʿnizi* component. Cantineau (1937) already used such a combination of letters for the varieties spoken in the *Gaṣīm* area in modern-day Saudi Arabia that combine predominantly *šammari* features with *ʿnizi* features: Ba.

#### *5.2. Herin's ygūlu vs. ygūlūn*

As noted in Herin (2020), one of the shortcomings of Cleveland's *yigūl* type is that it lumps together three sub-types within the Bedouin dialects of Jordan: the dialects of the Jordan valley Bedouins such as the *ʿAǧārma*, *ʿAdwān*, and *ʿAbābīd*, the dialects of Bedouins of northern Jordan such as the *Bani Ṣaxar*, *Sardiyye*, *Sirḥān*, *Āl ʿĪsa*, and *Misāʿīd*, and finally the Bedouin varieties of Southern Jordan such as the *Ḥwēṭāt*, *Bdūl* and *Zawāyda*. The Jordan valley type differs from Cantineau's C group in that it lacks the final /n/ in the imperfective endings *-īn* and *-ūn*, which is retained in the dialects of the Bedouins of northern Jordan. It appears that it would be more conclusive to use the 3.m.pl. inflexion of the imperfective of the verb *gāl* to capture some of these differences. The following general classification would arise: Southern *ygūlu*, Central *ygūlu*, and Northern *ygūlūn*.
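The gain from switching the diagnostic form can be made explicit with a small grouping exercise. This is only a sketch: the dictionary layout and function names are assumptions, and the single feature recorded per area (presence of final /n/ in the 3.m.pl. imperfective) is read off the discussion above.

```python
from collections import defaultdict

# One feature per area, as described in the text; the layout is an assumption.
BEDOUIN_JORDAN = {
    "Jordan valley": {"final_n": False},
    "Northern Jordan": {"final_n": True},
    "Southern Jordan": {"final_n": False},
}


def label_3msg(_features):
    """Cleveland-style label: every Bedouin variety lacks b- and has *q > g."""
    return "yigūl"


def label_3mpl(features):
    """Label based on the 3.m.pl. imperfective, sensitive to final /n/."""
    return "ygūlūn" if features["final_n"] else "ygūlu"


def group_by(labeller):
    """Group areas under the label that the given labeller assigns them."""
    groups = defaultdict(list)
    for area, feats in BEDOUIN_JORDAN.items():
        groups[labeller(feats)].append(area)
    return dict(groups)


print(group_by(label_3msg))
print(group_by(label_3mpl))
```

The 3.m.sg. label puts all three areas in one group, while the 3.m.pl. label separates Northern Jordan from the other two; the remaining split between the Jordan valley and Southern Jordan has to come from geography, which is why the proposal only partially solves the problem.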


#### **6. Features of the** *Misāʿīd* **Dialect**

In 2019, Bruno Herin, Enam Al-Wer, and Youssef Al-Sirour began fieldwork amongst the *Misāʿīd* tribe in Umm al-Ǧimāl, Northern Jordan. The fieldwork was facilitated by Yūsif, who is a member of the tribe, as noted above. In this exploratory phase of the research, we recorded two forty-minute sessions consisting of casual conversations and narratives. These recordings were subsequently transcribed and analysed. In the remainder of this article, we present our analysis of the salient features of this dialect based on these recordings.

#### *6.1. Phonology*

The feminine ending was mostly recorded as the unraised reflex [a]: *šidīda* 'severe, extreme', *šāša* 'piece of fabric/muslin', *mayya* 'water', *waḥda* 'one (f.)'. A first degree raising was recorded in *sāknä* 'dwelling (f.)', *ʿašīrä* 'clan', *ʿuṯmāniyyä* 'Ottoman', *lahdyä* 'speech, accent'. A second degree raising was also recorded in a handful of items such as *zġīre* 'small' and *kṯīre* 'much (f.)', and also after an emphatic sound as in *miḥmāṣe* 'coffee bean roasting pan'. The unraised reflex [a] is typical of the *ʿnizi* type (in the Syro-Mesopotamian steppes), whereas the first-degree reflex is found in both the *šāwi* varieties and the *ʿnizi* dialects, although it is contextually conditioned (e.g., in front contexts). The second-degree raising found in some items most likely represents short-term accommodation, induced by the presence of speakers of other Jordanian dialects.<sup>2</sup> It may also be indicative of the course of future developments in the dialect, viz. convergence to koineised Jordanian varieties, especially since the younger members of the tribe have frequent face-to-face contact with speakers of other Jordanian dialects through formal education and in the workplace. The raising heard in *miḥmāṣe* after a velarized consonant, on the other hand, is typical of the *šammari* type. Despite some degree of variation in the realization of the feminine ending in our data, the distribution found amongst the informants overall is consistent with the *ʿnizi* type.

In pause, a slight aspiration occurs after the feminine ending: *ʿašīräh*# 'clan', *gibīläh*# 'tribe'. This feature is found in both the A *ʿnizi* and B *šammari* groups.

The etymological diphthongs /aw/ and /ay/ are both monophthongised to /ō/ and /ē/, respectively: *fōg* 'above', *yōm* 'day', *ḥōl* 'around', *dōr* 'turn/point in time', and *bēt* 'tent', *ṯnēn* 'two', *xēl* 'horses'. Diphthongised realisations occurred in *Zbeyd* (tribal patronym), *xeyš ¯* 'jute'. These reflexes are common in the group C *šāwi* dialects. Groups A and B usually have more consistent slight diphthongised reflexes.
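As a rough illustration of the monophthongisation pattern, the sketch below rewrites /aw/ and /ay/ before a consonant. It is a simplification under stated assumptions: the input etyma (*yawm*, *bayt*) are assumed reconstructions, the vowel inventory is a working list, and the restriction to preconsonantal position is a modelling choice that leaves geminate glides (*mayya*) and the word-final verbal ending (*dyaw*) untouched, as in the recorded forms.

```python
import re

# Working vowel list for this sketch only.
VOWELS = "aeiouāīūēō"


def monophthongise(form):
    """Rewrite aw > ō and ay > ē when a consonant other than the glide follows."""
    form = re.sub(rf"aw(?=[^{VOWELS}w])", "ō", form)
    form = re.sub(rf"ay(?=[^{VOWELS}y])", "ē", form)
    return form


for etymon in ["yawm", "bayt", "mayya", "dyaw"]:
    print(etymon, "→", monophthongise(etymon))
```

The rule is categorical here, whereas the recordings also contain diphthongised realisations; modelling that variation would require speaker- and item-level information that the sketch does not attempt to capture.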

As far as the affrication of etymological /k/ and /g/ is concerned, the recorded reflexes all pattern respectively with the *šāwi* type /č/ and /ǧ/: *hīč* 'so', *čimä* 'desert truffle', *čiṯīr* 'much'. Only one instance of /ǧ/ < /g/ was recorded in *ṭīǧ* 'endure'. Other items which were expected to be realised with /ǧ/ were recorded with /g/: *šarg* 'east', *giddām* 'in front'. This, in all likelihood, is a short-term accommodation phenomenon induced by the presence of speakers of standard Jordanian. The same observation can be made about non-affricated reflexes of /k/ in items such as *kān* 'he was', *kiṯīr* 'much' (also recorded with /č/, see above), and *kibīr* 'big', all of which are normally affricated in the vernacular.

Etymological /ǧ/ was recorded as /dy/ in *dyibal* 'mountain', *dyaw* 'they (m.) came', and *idyīban* 'they (f.) brought'. The affricate /ǧ/ was also recorded: *yiǧūn* 'they (m.) come', *ǧawwa* 'inside', *ǧild* 'skin'. The /dy/ reflex is common in groups A and B whereas the affricate /ǧ/ is a hallmark of the *šāwi* type. The indigenous reflex is undoubtedly /dy/. Although a short-term accommodation effect cannot be ruled out, the presence of /ǧ/ could also be due to earlier change within the dialect, as noted by Cantineau in some camel-breeder varieties.

An interesting and somewhat unexpected feature that was occasionally recorded is the *qalqala*, understood to be the uvular realisation of etymological /ġ/: *qēr* 'other' (< *ġēr*), *qāli* 'expensive' (< *ġāli*), *muqsil* 'washing area' (< *maġsil*). To the best of our knowledge, this phenomenon is a hallmark of the Mesopotamian *šāwi* dialects.

Final /t/ in the plural feminine ending *-at¯* interestingly drops in pause: *guza ˙* ǎ*ы*# 'raids', *šagla ˙* ǎ*ы*# 'things', *RdӌϬ a* " *iyya*ǎ*ы*# (toponym), *h. alala ¯* ǎ*ы*# 'livestock heads'. This feature, as mentioned above, was already noted as commonly occurring in the B and Bc dialects.

The laryngeal stop /ʾ/ was recorded once as pharyngeal /ʿ/ in *saʿalt* 'I asked', which is a salient feature of North-West Arabian. In addition to this, /ʿ/ is often glottalised in pause: *hassāʿ*# [hassa:ƣƢ ] 'now', *māniʿ*# [ma:niƣƢ ] 'hindrance', *bēʿ*# [be:ƣƢ ] 'sale'.

Expectedly, \*C1aC2aC3v sequences are resyllabified into C1C2v́C3v: *skánaw* (< *sakanaw*) 'they settled', *Šrufāt* (tribal patronym < *Šarafāt*). Our corpus also attests resyllabification in derived templates such as Form VII \*ʾinC1aC2aC3a: *ʾinəḥkúmat* 'it was ruled' (*inḥakamat* → *inḥkamat* → *inḥkúmat* → *inəḥkúmat*).

As far as the *gahawa* syndrome is concerned, it appears to be present in the dialect. Examples are *nḥáṣid* 'we harvest' (here combined with resyllabification: *náḥṣid* → *náḥaṣid* → *naḥáṣid* → *nḥáṣid*) and *baʿad* 'after'. Our data do not attest the presence of the *gahawa* syndrome in \*taC1C2īC3 and \*maC1C2ūC3 templates, which would suggest that it patterns in this respect with the *šāwi* type. Further data are needed to confirm this observation.

As expected, the article receives primary stress, as is normally the case in all of the Bedouin varieties of the area. To the best of our knowledge, only monosyllabic words of the type C1v̄C3 and disyllabic words of the type C1vC2v(C3) can trigger the stress of the definite article. Attested instances in our data are: *ʾál-muṭar* 'the rain', *ʾán-nifal* 'the clover', *ʾál-ʿarab* 'the Bedouins'. In addition to this and quite unexpectedly, we also encountered a stressed article with a C1vC2C3v word in *ʾáṣ-ṣaḥra* 'the desert'. Further data are needed to confirm whether stress assignment on the article is licensed in other words of this type and also possibly in other templates, which, as far as we know, would be a novelty.

An unexpected stress-related feature we found in the data is the second syllable stress in plurals of the C1vC2vC3 type, as in *nigáṭ* 'points', which also surfaced as *ngaṭ* after high vowel elision in unstressed position. This is a feature found in North-West Arabian (Palva 2011).

#### *6.2. Morphology*

In the realm of verbal morphology, it appears that both the allomorphs *-aw* and *-am* of the 3.m.pl. in the perfective are found: *winn-o gṭaʿam kassaram min-ʿind giddām al-ǧamal* 'and there they had cut and broken into pieces (the engravings) in front of the camel'. The *-aw* allomorph was recorded in the following: *ḥəmaw baʿaḏ̣-ham ʿāšaw u-tikāṯaraw u-lamma tikāṯaraw, dyaw <sup>ə</sup>ṯbitaw hānä* 'they protected each other, lived and multiplied and when they multiplied they came and settled here'.<sup>3</sup> These examples suggest that the *-aw* and *-am* allomorphs are not in complementary distribution, unlike in some *šāwi* tribes along the Middle Euphrates where one of the allomorphs is used exclusively in pause.

Person prefixes in the imperfective were often recorded with an /a/ vowel: *yaṭlaʿ* 'he goes out', *takbar* 'it gets bigger', *yamši* 'he walks', *talga* 'you find'. This is a typical camel-rearing trait not found in the *šāwi* dialects.

Initial glottal stop verbs such as *akal* and *axaḏ* behave similarly to what is found in the B, Bc, and C groups: *kalēt-o* 'I ate it', unlike *ʿnizi*-type dialects which have *akalt* and *axaḏt* 'I ate/have eaten', 'I took/have taken'.

As far as derived forms are concerned, the causative Form IV template \*aC1C2aC3 yiC1C2iC3 is well attested in our data: *nəṭəlʿ-o w-unənəfḏ̣-o* 'we take it out and dust it', *yumṭar* 'it rains', *yiwṣil* 'he brings'. The presence of this feature is not diagnostic of any sub-group but in the context of dialect contact and levelling, it is a noticeable feature. The imperfective of Form V \*taC1aC2C2aC3 was recorded as ytiC1aC2C2aC3, as in *ytidarrab* 'he trains'. Given that *šāwi* dialects are known for having yiC1aC2C2aC3 (*yidarrab*), the presence of this form is another indication of the camel-rearer background of the present dialect. This, in all likelihood, should also happen in Form VI \*taC1āC2aC3 but our data lack instances of any verb of this type.

Another typical camel-rearer feature that is found in our data is what is referred to as the apophonic passive, known to be lost in the *šāwi* varieties. Only two instances were recorded: *yiḏkar* 'it is remembered' and *timadd* 'it is presented'. In the imperfective template yiC1C2aC3, the /i/ vowel of the prefix contrasts with the /a/ vowel noted above as a marker of the active forms. Further data are needed to assess the productivity of the apophonic passive in the modern-day form of the dialect.

The pronominal morphology of the dialect appears to be mixed. We recorded the first person free forms *ana* and *iḥna*, which are found in the C-*šāwi* group. Conversely, the bound plural forms *-kam* and *-ham* were found, which are camel-rearer forms. In the feminine plural, only the third person *-hin* is recorded in the data, but no second person. The first person singular bound pronoun surfaced as *-an* after a consonant: *wǧiʿat-an* 'it hurt me', *tūdyaʿ-an* 'it hurts me'. This *-an* form is typical of the B and Bc groups. In the same vein, we recorded the form *-wo* after long vowels, which is also found amongst the B and Bc groups: *ʿalē-wo* 'on him', *<sup>ə</sup>nnxallī-wo* 'we let him', *šifnā-wo* 'we saw him'. Moreover, an *-ah* allomorph in the 3rd person feminine singular was recorded after final /w/ and /y/ stems: *ʿaly-ah* 'on her', *abw-ah* 'her father', which patterns with both the A and B camel-rearer dialects. After consonants, initial consonant bound pronouns all have initial vowel allomorphs: *bilād-a-na* 'our country', *kill-a-ham* 'all of them'. This, of course, is reminiscent of the trochaic syllable type of the dialect and a distinctive feature of all the A and B camel-rearer varieties.

#### **7. Dialect Sample**

We present here a sample of the recordings to enable the reader to capture the nature of the dialect. Because the sessions consisted largely of group conversations in which turns were for the most part quick and uncontrolled, it was difficult to isolate long stretches of monologue. Another problem that quickly surfaced was the presence of several instances of mixed forms, which are due to dialect mixing and perhaps ongoing changes in the dialect itself. As explained earlier, the sessions involved participants with different dialect backgrounds, which, as we quickly realised, prompted the informants to accommodate towards other Jordanian dialects. Nevertheless, the two short excerpts exhibit salient features that can be safely attributed to the local form of speech of the *Misāʿīd* tribe.

Speaker 1: Bū Ṣāliḥ:

*Ҵal-Mis¬ҵÎd ham Ҵakbar ҵašÎrä w-al-ҵaš¬yir h¬١çl dyiw¬r-na ҵašÎrt¾n ١çl… kull al-ҵaš¬yir h¬١i ٭çl baҵad֔щ-ha h¬n s¬knä b-al-manڒaga h¬y dyÎr¬n. Ҵu-s¬bigan gabl an-n¬s k¬nat ktÄzik ҵala baҵad֔щ-ha s¬bigan gabl-ma n٭kúmat ha-l-bl¬d yaҵni […] ҵala dçr alҵut֔m¬niyyä yimkin t٭akm al-bl¬d h¬١i k¬nt an-n¬s t٭ыma baҵad֔щ-ha b-al-gщuwщwщa. yaҵni yÄázu baҵad֔щ-ham u-h¬١çl.. ٭asb gщuwщwt al- щ ҵašÎrä lli gidd¬ma-ham […] m¬-ni waҵi kit֔Îr Ҵana ҵumr-i yimkin Ҵakt֔ar min-sabaҵÎn sinä, ٭ass m¬ smaҵϷt min-ha-l-gd¬m gabl. gщ¬lщaw al-Mis¬ҵÎd m¬ ҵumra-ham Ҵinno xa١aw, Ҵilli yfukkĀn ٭¬la-ham b-l-ÄϷza*ǎ*ы, yimdyĀn min-Äazu kyifukkok ٭¬la-ham, d¬yman manڍĀrÎn sib٭¬nalщlщ¬h.* 

The *Misāʿīd* are the biggest tribe and the other tribes are our neighbors, the two other tribes ... All these tribes live next to each other here in the region, they are neighbors. In the past, people used to raid each other, before the region was under control [ ... ] I think in the days the Ottomans controlled this region, people used to protect themselves in a warlike manner. I mean they used to raid each other and these ... It depends on the strength of the tribe which is facing them [ ... ] I don't remember well, I am maybe older than seventy, it comes from what I have heard before from the elders. They said that the *Misāʿīd* never took, those who emerge during raids, they get out of the raid they emerge, always victorious God bless.

Speaker 2: Umm Ṣāliḥ:

"*axabbr-o bass*"*ana ma dagg ¯ et la wa ¯ l* ˙ *l* ˙ *a* "*i šuf∂t wal* ˙ *l* ˙ *a šuft han dagg ¯ et¯* " *a-l-¯ıd-i wGˇí* " *at-an u-dagget¯* " *ale-(h) [ ¯* ... *] bass yat.la* "*ad-damm xalas. yarbut.an* " *aly-a<sup>h</sup>* " *adi yi ¯ dӌϬ all yom¯ en¯ ma tg ¯ ¯ım-o winn-a xadӌϬ ra* ... *bass ∂n-nas m ¯ a t ¯* " *árif inno h. ar.am gab ¯ ∂l* ... " *a-l-basa¯t.a* " *i wal* ˙ *l* ˙ *a* " *a-l-basa¯t.a, z¯ınä w-a* " *la¯Gˇ* " *i* " *la¯Gˇ dӌϬ arba*" *i dӌϬ arba dӌϬ arba gab∂l l-wa¯h. ad lama yu¯Gaˇ* " *-o katf-o katf-o kyiduggok* " *ale-wo yba ¯ t.t.al yu¯Gaˇ* " *-o* " *la¯G ya ˇ* " *ni [* ... *] wal* ˙ *l* ˙ *a madri šift wal* ˙ *l* ˙ *a nas w ¯ a¯Gid ˇ t ¯ ala¯t ¯ <sup>∂</sup>ngát.* " *i billa la wal* ˙ *l* ˙ *a ma marrat ¯* " *alay-yä ma¯ d ¯ ikart-ä* ... *šuft niswan¯ b¯ı-hin t¯ alat¯ ¯ nigát. <sup>∂</sup>kbar¯.* " *aGˇ ayiz ¯* " *i* ...

I will tell him but I didn't get tattooed, by God I saw, here, I tattooed my hand [and] it hurt, I tattooed it [...] when the blood comes out, it's finished, they (f.) tied it normally for two days until it turns into a bruise ... But people before didn't know it was *ḥarām* ... Because of simpleness, by God, because of simpleness, beauty, and remedy, yes, remedy, a blow, a blow, before, when someone had a sore shoulder, they would tattoo it and the pain would stop, I mean [it's a] remedy [...] by God I don't know, I saw a lot of people with three dots [tattooed], yes, by God, this did not happen to me, I can't remember it ... I saw women with three dots [tattooed], old women yes ...

#### **8. Discussion and Conclusions**

Below (Table 1) is an overview of all the features discussed above and their distribution in the relevant dialectal groups. As mentioned earlier, Cantineau attributed a letter-code to the different groups he investigated. The two-way division is between the camel-rearer type, which sub-divides into A ( *Niza* " ) and B (*Šammar*), and the sheep-rearer C group (*šāwi*). In accordance with this classification, we decided to allocate the letter D to the North-West Arabian type. From Table 1 below, it quickly appears that the dialect of the *Misāʕīd* patterns with the camel-rearer type. More precisely, it appears to be closely connected to Cantineau's Bc type. In addition to this, our sample also reveals *šāwi*-like features such as the realisation of etymological /ġ/ as [q] (Younes and Herin 2016) and the treatment of diphthongs. Moreover, and quite surprisingly, some features that are attested in the North-West Arabian sub-group were found in the data. These are, for example, the resyllabification of \*C1aC2aC3v in derived forms such as \*inC1aC2aC3a, the second-syllable stress in plurals of the \*C1vC2vC3 pattern (which may also lead to first vowel elision), and also *sa* " *al* for *sa* " *al*. In conclusion, the dialect of the *Misāʕīd* matches for the most part the Bc sub-group, but with *šāwi*-like features and also characteristics that are reminiscent of the North-West Arabian type. The question is how to account for such a pattern. There are at least two possibilities. The first is that more complex configurations may have gone unnoticed by Cantineau, who indeed was not in a position to get large samples of data from all the tribes in the area. The second possibility is that recent dialect contact between speakers of all these sub-groups may have occurred, leading to dialect mixtures, as instantiated in our sample.

In terms of data collection and methodology, fieldwork in contexts that involve a fair amount of dialect contact can yield puzzling and conflicting linguistic output. This can also be exacerbated by short-term accommodation in the direction of the speech variety of the researcher(s). It is therefore paramount to secure the presence of an insider participant who can take the lead in carrying out data collection.

As far as the general classification of the dialects of Jordan and beyond is concerned, combining Herin and Younes' amendments to Cleveland, Palva, and Cantineau's classifications, it seems reasonable to posit the following taxonomy. We suggest that subsequent research should be framed within this canvas.

	- a. Muʔābi (southern, Karak, Ṭafīle, etc.)
	- b. Balgāwi-Ḥōrāni (central-north, Salt, ʕAǧlūn, etc.)
	- a. Nizi "
	- b. Šammari
		- i. Bc (Misāʕīd)
	- c. Šāwi
		- i. Ca (Bu¯ ¯" Id et ¯" Id¯ın in Lebanon, so far unattested in Jordan)

**Table 1.** Features of the Misāʕīd and the Bedouin sub-groupings.


**Author Contributions:** Conceptualisation, B.H. and E.A.-W.; Data collection, B.H., E.A.-W., Y.A.-S.; Transcription, I.Y.; Analysis, I.Y. and B.H.; Writing, B.H., I.Y. and E.A.-W.; Writing review and editing, E.A.-W. and B.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of the University of Essex on the 30th of November 2017.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Restrictions apply to the availability of these data. Data were obtained from the speakers and are available from the authors upon request and with the permission of the participants.

**Conflicts of Interest:** The authors declare no conflict of interest.


## *Article* **New Perspectives on the Urban–Rural Dichotomy and Dialect Contact in the Arabic** *gələt* **Dialects in Iraq and South-West Iran**

**Bettina Leitner**

Department of Near Eastern Studies, University of Vienna, 1090 Vienna, Austria; bettina.leitner@univie.ac.at

**Abstract:** This paper reevaluates the grounds on which the division into urban and rural *gələt* dialects, as spoken in Iraq and Khuzestan (south-western Iran), is built. Its primary aim is to describe which features found in this dialect group can be described as rural, which features tend to be modified or to emerge in urban contexts, and which tend to be retained. The author uses various methodological approaches to describe these phenomena: (i) a comparative analysis of potentially rural features; (ii) a case study of Ahvazi Arabic, a *gələt* dialect in an emerging urban space; and (iii) a small-scale sociolinguistic survey on overt rural features in Iraqi Arabic as perceived by native speakers themselves. In addition, previously used descriptions of urban *gələt* features as described for Muslim Baghdad Arabic are reevaluated, and a new approach and an alternative analysis based on comparison with new data from other *gələt* dialects are proposed. The comparative analysis yields an overview of what has previously been defined as rural features and additionally discusses further features and their association with rural dialects. This contributes to our general understanding of the linguistic profile of the rural dialects in this geographic context.

**Keywords:** dialect classification; dialect contact; urban; rural; *gələt*; *qəltu*; spoken Arabic

**Citation:** Leitner, Bettina. 2021. New Perspectives on the Urban–Rural Dichotomy and Dialect Contact in the Arabic *gələt* Dialects in Iraq and South-West Iran. *Languages* 6: 198. https://doi.org/10.3390/languages6040198

Academic Editors: Simone Bettega and Roberta Morano

Received: 19 October 2021 Accepted: 18 November 2021 Published: 30 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

This study aims at a critical reevaluation of the urban–rural division in the *gələt* dialects and the description of linguistic dynamics correlating with urbanization tendencies.<sup>1</sup> The urban–rural dichotomy is used in the descriptions of Arabic dialects from different regions (cf., for example, Abd-el-Jawad 1986; Abu-Haidar 1988; Ech-charfi 2020; Holes 1995; Ingham 1973; Miller 2007; Sharkawi 2014). However, to date there is little evidence for common linguistic tendencies among Arabic dialects in urban contexts (cf. Miller 2007, p. 2). Similarly, the clear-cut distinction between urban and rural *gələt* dialects<sup>2</sup> of Iraq and Iran still seems to be built on weak ground. This study tries to sum up what we do and what we do *not* know about the division of the *gələt* dialects into rural and urban ones. By including an areal perspective and new data from the *gələt* dialects of Khuzestan, we hope to arrive at a more detailed description of the characterizing factors of rural dialects in general and the linguistic consequences of urbanization for dialects of the *gələt* group more specifically.

The present work brings together hitherto used linguistic criteria for the distinction into rural vs. urban *gələt* and other features determined by the author as possibly rural. The existence of these features is compared in the *gələt* dialects described so far, including new data from the *gələt* dialects in Khuzestan and two cognate dialects (Šāwi and Khorasan Arabic).

The study also tries to retrace what processes are at work when different *gələt* dialects are in contact in urban contexts and questions the often ill-defined and synonymous use of the terms 'Bedouin' and 'rural', as well as 'sedentary' and 'urban'.

The study includes a small-scale sociolinguistic survey revealing what native speakers of (urban) Iraqi Arabic subjectively tend to perceive as typically rural or urban.

In general, the study and classification of the *gələt* dialects has received only marginal attention, especially when compared with the much better studied *qəltu* dialects (cf. Hassan 2021, p. 51). Even though a number of studies on *gələt* dialects have been published since Blanc's classification of the Iraqi dialects into *qəltu* and *gələt* in his seminal work on Baghdadi Arabic (Blanc 1964), they are still few. Among those that rely on freshly gathered data are: Hassan (2016a, 2016b, 2017, 2020, 2021), Mahdi (1985), Denz and Edzard (1966) and Abu-Haidar (2002) on South Iraqi Arabic; Ingham (1973, 1976), Leitner (2019, 2020) and Bettega and Leitner (2019) on Khuzestani Arabic; and Salonen (1980) on al-Shirqat (Širqāṭ)/Assur Arabic.

*On lacking definitions: What do 'Bedouin', 'sedentary', 'urban', and 'rural' mean in the context of the* gələt *and* qəltu *dialects?*

In his important paper on the linguistic character and development of Muslim Baghdad Arabic, Palva appears to use the labels 'urban' and 'sedentary', and 'rural' and 'Bedouin' interchangeably.<sup>3</sup> Using the terms 'urban' and 'sedentary' as well as 'rural' and 'Bedouin' as quasi-synonyms (cf. Ech-charfi 2020, p. 67) ignores the different nature of these terms and the different implications they have or had as socio-economic criteria at different times in history. It also ignores the fact that, for example, sedentary dialects can be rural as well. In Iraq, *urban* and *sedentary* as well as *rural* and *Bedouin* are indeed often closely linked, but a synonymous use of these concepts, especially for descriptions of the present-day linguistic classification, would be misleading. While, in our geographic context, *urban* and *sedentary* are concepts historically related to the *qəltu* dialects of Iraq, the concepts *rural* and *Bedouin* are historically associated with the *gələt* dialects.<sup>4</sup>

However, all four terms might denote very different things when looking at the present-day *gələt* dialects of Iraq and south-west Iran. Even though originally rural in character, many *gələt* speakers (or their ancestors) have moved to urban contexts and gradually replaced their rural identity with an urban one. Similarly, even though ultimately the *gələt* dialects are Bedouin, at present, the vast majority of their speakers lead a sedentary lifestyle. The usefulness of the latter distinction (sedentary vs. Bedouin type) has been recurrently criticized in the past years by scholars such as Janet Watson (2011, p. 859), who describes the Bedouin/sedentary split as "an oversimplification and of diminishing sociological appropriacy".

For synchronic descriptions of present-day Iraqi Arabic, it is therefore mainly the terms 'rural' vs. 'urban' that remain useful to describe the different socio-economic circumstances people live under, whereas the importance of the question of sedentary (*ḥaḍar*) vs. Bedouin (*badu*) appears to be generally decreasing. This is also reflected in the sociolinguistic interviews I conducted with native speakers of Iraqi Arabic (see Section 3.3), who more often described certain features as typical of the countryside (*rīf, aryāf*) or the city (*madīna*) than as typical of the Bedouin (*badu*; *ʕašāyir*, lit. 'tribes'). In these interviews, the participants never used the term sedentary (*ḥaḍar*) to characterize or specify the use of a feature. Still, it is important that these terms are apparently meaningful to native speakers in the present day.

In Khuzestan the urban–rural distinction has not played a role for a long time, as most inhabitants are of rural origins, and distinctions were made based on other socioeconomic factors closer to the sedentary–nomad split (cf. fn. 13). However, modern-day Khuzestan has witnessed a rapid growth of urban centers, especially in the city of Ahvaz (cf. 3.4), for which reason the term 'urban' and its socio-linguistic and socio-economic implications (e.g., increase of contact and leveling tendencies) must at least be considered as an arising category.

Regarding the historical linguistic situation in Baghdad, the predominant Muslim dialect used to be (according to Blanc 1964, p. 170, at least until the fourteenth century) of the *qəltu* type and thus was characterized, as stated above, by the features [+urban] and [+sedentary]. Starting in the fourteenth century, the city of Baghdad was populated by incoming *gələt*-speakers (especially in the seventeenth and eighteenth centuries, cf. Palva 2009, p. 32), initially carrying the features [+rural] and [+Bedouin] and remodeling the former linguistic character of Baghdad, so that over time its *gələt* character has become predominant. Nowadays, MBA (Muslim Baghdad Arabic) is a *gələt* dialect associated by Arabic dialectologists with the features [+urban] and [+Bedouin], since the incoming rural Bedouin dialects in Baghdad have been urbanized (cf. Palva 2009, p. 38).

This process, the urbanization of rural dialects, involves the loss of highly marked rural features (e.g., the *gahawa*-syndrome), motivated by the speakers' wish to adapt to the urban linguistic profile. This is, of course, also observed in other urbanized *gələt*-speaking contexts that lack the *qəltu* substrate found in MBA. The difference is that the *gələt* dialects in Baghdad have adopted some *qəltu* features (e.g., the marking of definite objects with a proclitic *l*-, cf. Palva 2009, p. 22). In this light, it appears useful to distinguish MBA from other *gələt* dialects that are nowadays spoken in (arising) urban contexts and which lack a *qəltu* substratum (e.g., Basra Arabic and Ahvazi Arabic).

This paper follows the assumption that ultimately all *gələt* dialects are originally rural and Bedouin in character, but focuses on the present-day definition of rural *gələt* features and their (lack of) prestige, analyzing which features tend to be modified most readily in urban contexts.

The above shows the multifaceted nature of the terms 'Bedouin', 'sedentary', 'urban', and 'rural' and their historical and modern-day application for the regions of Iraq and south-west Iran.

#### *Aims of This Paper*

The purpose of this paper is twofold and can roughly be divided into one part focusing on synchronic aspects and the other dealing primarily with diachronic aspects. The former (Sections 3.1, 3.3 and 3.4) is dedicated to the following overarching questions:

(i) What unites rural *gələt* dialects? Which features are marked rural features, i.e., strongly associated with rural speech by *gələt* speakers themselves?

(ii) What happens to rural dialects when their speakers move to urban contexts? Which features tend to emerge (innovations) or be dropped as a consequence of the adoption of an urban lifestyle by *gələt* speakers?

The diachronic part of this study (Section 3.2) is a critical evaluation of Palva's derivations of certain MBA features (or lack of certain features in MBA) via the *qəltu* substrate, offering alternative explanations for the development of these features.

Against this background, this paper aims at reevaluating the hitherto applied linguistic criteria for the subclassification of the *gələt* group into urban and rural dialects and sheds new light on the question of the linguistic dynamics found in urban *gələt*-speaking contexts.

Section 2 presents the methods applied to answer the questions outlined above and the linguistic features focused on. Section 3 of this paper discusses the results of this study: Section 3.1 presents the distribution of the rural features analyzed, and is followed by a reevaluation of those MBA features which Palva explained as consequences of the *qəltu* substrate (Section 3.2). Section 3.3 presents the results of a small-scale sociolinguistic survey on subjective perceptions of rural forms in Iraqi Arabic, conducted among five urban Iraqis who fled to Vienna during the past five years. Section 3 closes with a case study of the city of Ahvaz, pointing out linguistic tendencies found in *gələt* dialects spoken in arising urban contexts (Section 3.4).

Section 4 discusses the results of this analysis in the light of the questions proposed above. Section 5 concludes the study and provides an outlook on possible future studies on the urban–rural distinction in the *gələt* dialects.

#### **2. Materials and Methods**

In order to answer the above-outlined research questions, this paper starts with a comparative overview of seven phonological and morphological features and their existence in the *gələt* dialects of Khuzestan, Kwayriš/Babylon, al-Shirqat/Assur, Basra, and Muslim Baghdad. While the first three are usually associated with rural speech, the latter two are usually taken to represent urban-type *gələt*. The analysis further considers the existence of these linguistic variables in the Šāwi dialects of Syria and south-eastern Anatolia<sup>5</sup> and the Arabic dialects of Khorasan. Including the Šāwi dialects and Khorasan Arabic hopefully contributes to a better understanding of their obvious typological proximity to the *gələt* dialects.

The following features investigated for the purpose of this paper have either been listed by Blanc (1964, p. 166)<sup>6</sup> and/or Palva (2009, pp. 21–29) as typically rural (i), or are suggested by the author of this paper as possible further rural features (ii):

	- Affrication of \**q* > *g* > *ǧ* and \**k* > *č* in the vicinity of front vowels (Section 3.1.1);
	- Use of the *gahawa*-syndrome (Section 3.1.3);
	- Resyllabification of CaCaC-v(C) > CCvC-a(C) (Section 3.1.4);
	- Retention of gender distinction in the plural of pronouns and verbs (Section 3.1.5).

(ii) Further rural features suggested by the author of this paper:


#### **3. Results**

This section discusses possible rural *gələt* features and their distribution based on the available sources on *gələt* dialects (illustrated in Table 1; features I–III are phonological, while IV–VII are morphological features).


**Table 1.** Distribution of (possible) rural *gələt* features<sup>7</sup>.

#### *3.1. Rural* gələt *Features*

3.1.1. Affrication of OA (Old Arabic) \**q* > *g* > *ǧ* and \**k* > *č* in the Vicinity of Front Vowels

Generally, the phenomenon of a phonetically conditioned affrication of OA \**k* and \**g* is considered typical of Eastern Bedouin-type dialects of the Syro–Mesopotamian area (Palva 2006, p. 606). The phonetically conditioned affrication of \**k* is basically a feature shared by all *gələt* dialects but is somewhat more limited in urban varieties (Blanc 1964, p. 166): compare, e.g., MBA and Basra Arabic *akəl* (Blanc 1964, p. 166; Mahdi 1985, p. 64) and Khuzestani Arabic *ačəl*.
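The conditioning described above can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of the original analysis: "vicinity" is read as immediate adjacency, the set of front vowels (here including *ə*) is chosen purely for illustration, and the function name `affricate` is hypothetical. It models neither lexical exceptions nor the variation between affricated and non-affricated forms.

```python
import unicodedata

# Assumption: segments treated as front vowels in this sketch (i, ī, e, ē, ə).
FRONT_VOWELS = set("i\u012be\u0113\u0259")

# Plain stops and their affricated counterparts: k > č, g > ǧ.
AFFRICATES = {"k": "\u010d", "g": "\u01e7"}


def affricate(word: str) -> str:
    """Affricate k/g when directly adjacent to a front vowel (simplified rule)."""
    word = unicodedata.normalize("NFC", word)
    out = []
    for i, seg in enumerate(word):
        prev_seg = word[i - 1] if i > 0 else ""
        next_seg = word[i + 1] if i + 1 < len(word) else ""
        near_front = prev_seg in FRONT_VOWELS or next_seg in FRONT_VOWELS
        out.append(AFFRICATES[seg] if seg in AFFRICATES and near_front else seg)
    return "".join(out)


if __name__ == "__main__":
    for form in ("kitab", "ak\u0259l", "gid\u012bm", "gahwa"):
        print(form, "->", affricate(form))
```

Under these assumptions the sketch maps *akəl* to *ačəl* and leaves *gahwa* unchanged, since neither neighbour of *g* is a front vowel.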

Similarly, the affrication of \**q* is traditionally more strongly associated with the rural type (Fischer and Jastrow 1980, pp. 142–43; cf. Blanc 1964, pp. 25–28, who calls the affrication of \**q* "a hallmark of the countryside"; Palva 2009, p. 37, fn. 19). According to Blanc's informants (Blanc 1964, pp. 27–28), speakers perceive forms with *ǧ* (< \**q*) instead of *g* or *q* as rural or 'provincial', and the use of *g* or retention of \**q* as urban.

According to the descriptions of Blanc (1964, pp. 26–27), in Muslim Baghdad Arabic the general reflexes of \**q* are *g* and *q*, thus without an affricated realization in front vowel environments. More recently, Palva notes for Muslim Baghdad Arabic that "the contrast between urban and rural *gələt* is diminishing", because such features as the conditioned affrication of *g* (as well as the use of feminine plural forms in the 3rd person, cf. 3.1.5) are gaining ground in that dialect (Palva 2009, p. 37, fn. 19 and the references mentioned there).<sup>9</sup>

For Basra Arabic, Mahdi (1985, pp. 86–87, fn. 102) states that there is some variation between *g* and *ǧ*.

We also find these phenomena in the Šawi-dialects (e.g., ¯ *ciˇ t¯ır* 'much' and *gid ˇ ¯ım* 'old', Younes and Herin n.d., *EALL Online*), and in Khorasan Arabic (e.g., *citab ˇ* 'he wrote'; Seeger 2013, p. 314, and *gir ˇ ¯ıb* 'close', Seeger 2009, p. 310).

Regarding other Iraqi Arabic dialects, affrication of \**q* > *ǧ* is further attested in texts from al-Ḥilla (Denz and Edzard 1966, p. 68: *ǧiddām* 'in front of', or 70: *rfīǧi* 'my friend').<sup>10</sup>

Thus, at present these phenomena also appear in urban contexts, although apparently to a lesser degree: in MBA, the 'default form' is still unaffricated, in Basra there is variation between affricated and non-affricated forms, and we find the same variation in present-day Ahvazi Arabic (e.g., *məḍayyəǧ* ~ *məḍayyəg* 'worried', *ṣədəǧ* ~ *ṣədəg* < *ṣidqun* 'truth', *bəčān* ~ *bəkān* 'place', and the progressive marker *gāʕid* ~ *ǧāʕəd*, cf. 3.2.2; Leitner 2020, pp. 30, 32). This might point to a tendency in urbanizing contexts towards de-affrication or replacement of *ǧ* and *č* with the less marked or less 'provincial' *g* and *k*. In other words, its marked rural character (cf. 3.3) makes this phonetic feature prone to be given up in contact with another dialect or other dialects. This, of course, contradicts Palva's statement (as cited above) that the phenomenon of affricating *g* is gaining ground in MBA. This contradiction might result from analyzing data from different speech communities (Shiite vs. Sunnite; different quarters, etc.) of a city. Based on my data, however, I cannot confirm his observation, but must rather argue the contrary.

#### 3.1.2. Raising of OA \**a* in Pre-Tonic Open Syllables

Examples from the dialects analyzed, which feature the raising of \**a* not only in \*CaC¯ın patterns, are: Kwayriš *šibab¯* 'youth' (Denz 1971, p. 66); Khuzestan *s*@*wal¯* @*f* 'stories' (Leitner 2020, p. 43); Khorasan *migli ˇ tin* 'gathering' (Seeger 2002, p. 637); Šawi ¯ *siˇcaˇ¯c¯ın* 'knives' (Younes and Herin n.d., *EALL Online*); al-Shirqat: *d ¯ ibaye ¯ h.* 'slaughter animals' (Salonen 1980, pp. 9, 28, Text 1, sentence 17); MBA: *sˇcaˇ¯c¯ın*<sup>11</sup> ~ *siˇcaˇ¯c¯ın* 'knives'; and Basra: *digˇa¯gaˇ* 'chicken' (Denz and Edzard 1966, p. 80, Text VII) ~ *dyay¯* (Mahdi 1985, p. 247). The overall picture we get from the distribution of this feature is that synchronically this phonological change is found in both urban and rural dialects. Examples such as *mar.aku ¯ b* ˙ *~ mr.aku ¯ <sup>b</sup>* ˙ 'ships' and *manaqil ¯* <sup>~</sup> *mnaqil ¯* 'barbecues' from Basra Arabic (Mahdi 1985, pp. 141–42), which appear both with and without the raising and subsequent elision of *\*a* in the first (pre-tonic and open) syllable, might point to a slight tendency among urban varieties to preserve \**a*. However, this tendency was not really confirmed by the results of the sociolinguistic survey conducted for this study (cf. 3.3), in which most speakers produced forms with a raised \**a*.

Even though this phonological process is often inhibited by consonants of the guttural group (cf. Younes 2018, p. 5), we find various counterexamples, such as *mʕābed* 'temples' (Salonen 1980, pp. 10, 29, Text 1, sentence 31) < *maʕābid*, probably via raising and subsequent elision of \**a* in the first syllable. Younes argues that in the Middle East this phenomenon probably predates the appearance of the *gahawa*-syndrome (Younes 2018, pp. 7–8) and treats it as a pan-Eastern Bedouin dialect phenomenon.

#### 3.1.3. *gahawa*-Syndrome

The so-called *g(a)hawa*-syndrome is another feature described by Blanc (1964, p. 166) as typical of rural *gələt* dialects and not present in MBA. This morphonological phenomenon denotes the reshuffling of non-final syllables closed by a guttural consonant (i.e., /x, ġ, ḥ, ʕ, ʔ, h/): CvC<sub>G</sub> > CvC<sub>G</sub>v or CC<sub>G</sub>v, e.g., *gahwa* 'coffee' > *gahawa* or *ghawa*.
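The reshuffling just described can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a model of any particular dialect: the inserted vowel is fixed to *a* (as in the *gahwa* example), words are treated as strings of single-character segments, the vowel inventory is chosen for illustration, and the name `gahawa_variants` is hypothetical.

```python
import unicodedata

# Guttural consonants as listed in the text: x, ġ, ḥ, ʕ, ʔ, h.
GUTTURALS = set("x\u0121\u1e25\u0295\u0294h")

# Assumption: vowel inventory used only for this illustration.
VOWELS = set("aiueo\u0259\u0101\u012b\u016b\u0113\u014d")


def gahawa_variants(word: str, epenthetic: str = "a"):
    """Return (CvGv form, CGv form) for the first non-final syllable closed
    by a guttural, or None if the word contains no such syllable."""
    word = unicodedata.normalize("NFC", word)
    # i indexes the guttural; it needs C + v before it and C + one more segment after.
    for i in range(2, len(word) - 2):
        onset, vowel, guttural, following = word[i - 2], word[i - 1], word[i], word[i + 1]
        if (onset not in VOWELS and vowel in VOWELS
                and guttural in GUTTURALS and following not in VOWELS):
            # Insert a vowel after the guttural: gahwa > gahawa.
            full = word[:i + 1] + epenthetic + word[i + 1:]
            # Additionally drop the vowel before the guttural: gahwa > ghawa.
            reduced = word[:i - 1] + guttural + epenthetic + word[i + 1:]
            return full, reduced
    return None


if __name__ == "__main__":
    print(gahawa_variants("gahwa"))
    print(gahawa_variants("kitab"))
```

For *gahwa* the function returns the two forms cited in the text, *gahawa* and *ghawa*; for a word without a guttural-closed non-final syllable it returns `None`.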

As for MBA, the incoming Bedouin tribes and the rural population that has settled in Baghdad apparently have given up this feature in the urban context.

Although there are many traces of this resyllabification rule still found in contemporary Khuzestani Arabic (e.g., *ʔahali* 'my family' and *xaḍar* ~ *ʔaxaḍar* 'green'), it has ceased to be an active phonological process (cf. Section 3.4 and Leitner 2020, p. 50).

For Basra Arabic, too, only very few examples of this phonological rule are to be found in Mahdi's Ph.D. thesis, alongside many forms that do not show the *gahawa*-syndrome, e.g., *naʕya* 'ewe' (not *nʕaya*) and *yiʕruf* 'he knows' (not *yʕaruf*) (Mahdi 1985, pp. 51, 99, respectively). The only examples found are some originally *ʔ*-initial words like *xaḍar* < *ʔaxḍar* 'green' or *ḥawal* < *ʔaḥwal* 'cross-eyed', in which the first syllable was dropped after a vowel *a* was inserted after the guttural consonant (Mahdi 1985, p. 62).

The data on Kwayriš Arabic also show mixed results: while we find, e.g., *naxla* 'palm tree' (cf. Meißner 1903, p. XVIII), we also get *lighawa* 'the coffee' (Denz 1971, p. 55), *(a)heli* 'my family' (Meißner 1903, p. 26), and *yġalub* 'he wins' (Denz 1971, p. 68).

As for al-Shirqat, I found one word that is subject to this phonological rule in Salonen's texts: *ʔäheli* 'my family' (Salonen 1980, pp. 21 and 42, Text 8, sentence 1) and *ʔahalu* 'his family' (Salonen 1980, pp. 22 and 44, Text 9, sentence 4). Later, in another text, however, we find the same word without insertion of a vowel after the guttural: *ʔahlu* 'his family' (Salonen 1980, pp. 24 and 46, Text 13, sentence 1).

This phenomenon is attested both for the Šāwi dialects (e.g., *ǧḥaša* 'female ass', instead of *ǧaḥša*; Procházka 2003, p. 78) and for Khorasan Arabic (e.g., *yoġodi* 'he goes', instead of *yoġdi*).

Today, the productive use of the *gahawa*-syndrome appears to be very limited in most dialects and has thus ceased to be a good criterion for distinguishing rural from urban dialects as well as Bedouin-type from sedentary-type dialects. The reason for the loss of this feature is most likely related to its markedness (cf. Section 3.3).

#### 3.1.4. Resyllabification of OA CaCaC-v(C) > CCvC-a(C)

Blanc described reflexes of the OA PFV verbal forms CaCaC-v(C), e.g., *katabat*, with initial CC- as typical rural *gələt* forms (Blanc 1964, p. 166). Ingham (1982, pp. 48–49, 52) describes such forms as characteristic of the Mesopotamian *bādiya* dialects, in contrast to the Mesopotamian *ḥaḍar* dialects that have an initial syllable structure CiC-.

Basra Arabic and MBA both have forms of the structure CvCC-v(C), e.g., \**katabat* > *kətbat/kitbat* 'she wrote' (Blanc 1964, p. 98; Leitner et al. 2021, p. 69; Mahdi 1985, p. 93). In present-day Khuzestani Arabic, too, the most common reflex is CvCC-v(C) and not CCvC-v(C), e.g., *kətbat* 'she wrote'. In Khorasan, we find the structure CiCiC-v(C), e.g., *čitibat* 'she wrote' (Seeger 2013, p. 314).

Forms of the type CCvC-v(C), e.g., *ktibet* 'she wrote', are found, e.g., in Kwayriš (Meißner 1903, pp. XLI, LII)<sup>12</sup>, al-Shirqat (Salonen 1980, p. 80), in all Šāwi dialects (Younes and Herin n.d., *EALL Online*), and in certain (*ʕarab*-type<sup>13</sup>) dialects of Khuzestan (Leitner 2020, pp. 14–15). Among the urban speakers of Ahvaz, such forms are not used and are rather perceived as clearly rural.

#### 3.1.5. Retention of Gender Distinction in the Plural

Palva states that gender distinction in the plural is a feature retained only in the rural dialects of southern Mesopotamia (Palva 2009, p. 23; cf. also Blanc 1964, p. 166, who states that this phenomenon is "only marginal in M[BA]"). He explains the lack of gender distinction in the plural of MBA as an inherited *qəltu* trait, albeit admitting that sedentarization or urbanization processes would probably have led to the same development (*ibid*.). Ingham states that, based on his material from Iraq, Khuzistan, Kuwait, and Northern Najd, gender "distinction was maintained almost everywhere, except in the urban centres of Zubair, Kuwait, Basra and Baghdad" (Ingham 1982, p. 38).

My data shows that gender distinction in the plural is still maintained in all Khuzestani Arabic dialects, even in (modern) urban contexts (cf. the case study of Ahvaz below, Section 3.4), and it appears to be used in modern MBA and Basra Arabic as well (see Table 1 and fn. 8). Gender distinction in the plural was also maintained by all urban Iraqi speakers interviewed for the sociolinguistic study described in Section 3.3 and none of the interviewees associated the use of feminine plural forms with rural speech. Therefore, we might need to rethink the strict association of this feature with rural contexts.

#### 3.1.6. Prefix *tv*- in Form V and Form VI Verbs

The retention of the vowel after the prefix *t-* in Form V and VI verbs, e.g., al-Shirqat *taḥawwalaw* 'they moved' (Salonen 1980, pp. 11, 31, Text 2, sentence 12) and Kwayriš *yatalagga* 'he meets' (besides *ītlagga*, Meißner 1903, p. XLIV), might be another rural feature of the broader Mesopotamian area.

At least for Khuzestani Arabic, my data suggest that forms with a vowel are typical of rural areas; they are found, e.g., in the dialect of Ḥamīdiyya: *tačabbašət* 'I have learnt', *nətaʕašša* 'we have dinner'.<sup>14</sup>

Apart from rural Khuzestani Arabic, Kwayriš and al-Shirqat Arabic, we also find this trait in Khorasan Arabic (Volkan Bozkurt, pers. comm.), and in the Šāwi dialects (Behnstedt 1997, pp. 328–29, map 164; Behnstedt 2000, p. 444; Bettini 2006, p. 33). In contrast, it appears to be completely absent from the dialects of Basra and Baghdad, as well as urban Khuzestani Arabic (at least for the city of Ahvaz).

We might tentatively propose that the retention of the vowel in the prefix of Form V and Form VI verbs is a rural feature. However, this hypothesis definitely needs further elaboration and more data from other rural areas, especially in the form of sociolinguistic surveys.

#### 3.1.7. Imperative SG.M of Final Weak Roots: ʔvCvC

In MBA and Basra Arabic, the SG.M imperative of final weak roots ends in a vowel, e.g., *imši* 'go (IMP SG.M)' (Blanc 1964, p. 103; Mahdi 1985, p. 125). In Kwayriš, as well as in present-day Ahvazi Arabic, in addition to these forms, we also find forms lacking the final vowel, e.g., *ʔəməš* 'go (IMP SG.M)' (Leitner 2020, p. 19; Meißner 1903, p. XLVIII). The latter forms (lacking the final vowel) are also found in the Arabic varieties of Khorasan (ʔvCvC, Volkan Bozkurt, pers. comm.) and Kuwait City (ʔvCC, Yousuf B. AlBader, pers. comm.), as well as in some Šāwi dialects (e.g., Urfa Arabic, Stephan Procházka, pers. comm.; cf. Behnstedt 1997, pp. 404–5, map 202; and Bettini 2006, p. 35).

For Khuzestani Arabic, Ingham describes the imperative form lacking the final vowel as *ʕarab*-type and, conversely, the form with the final vowel as *ḥaḍar*-type (Ingham 2007, p. 577; 1973, p. 544; cf. fn. 13 at the end of this paper on these terms). He further describes the introduction of a prothetic vowel before an imperative of the structure CəCC-v (i.e., IMP SG.F, PL.F, and PL.M; or IMP SG.M with a vowel-initial object suffix)—e.g., *əkətbi* 'write (IMP SG.F)'—as a rural feature (Ingham 1973, p. 542). In my corpus, this feature is also attested for speakers from Ahvaz.

The small-scale sociolinguistic study conducted for this paper confirms—at least partly, as some speakers stated that such forms did not exist—that urban speakers associate imperative forms of final weak roots lacking the final vowel with rural speech (cf. Section 3.3).

#### *3.2. Reevaluating the Urban Character of MBA: A Question of Urban Features or Inherited* qəltu *Features?*

#### 3.2.1. Indefinite Article *fadd* ~ *fard*

The use of the indefinite article *fadd* or *fard* has been described as a shibboleth of Iraqi Arabic. Its use is, however, not limited to the nation of Iraq and we also find it in Arabic dialects in Iran (in the provinces of Khuzestan, Bushehr, and Hormozgan), and in Central Asian Arabic (cf. Leitner and Procházka 2021).

Palva (2009, p. 23) describes the indefinite article as a "sedentary feature found in the Mesopotamian dialect area" that is probably quite old. We can assume that in principle this is an old sedentary feature that has most likely developed in urban contexts allowing for contact with other languages that make use of an indefinite article, e.g., Persian. This supports the theory that new linguistic categories are more likely to arise in urban contexts and contact situations.

Other than that, looking at the emergence of this feature does not tell us much about the synchronic urban–rural distinction in the *gələt* dialects.

#### 3.2.2. Progressive Markers *da*- and *gāʕid*

Palva argues that the clitic progressive marker *da*- < *qāʕid* is a sedentary feature and writes that "the use of verb modifiers to mark different tense and aspect categories is a prominent sedentary feature very well developed in all *qəltu*-dialects [ ... ], whereas in rural *gələt* dialects these categories as a rule are unmarked" (Palva 2009, p. 20). Palva's view on the distribution of this feature is contradicted by data from several rural *gələt* dialects. In fact, Palva himself states several pages later that we do find progressive markers in many rural *gələt* dialects as well (Palva 2009, p. 28: "In addition to the *qəltu*-type verb modifier *da*-, MB[A] also makes use of the unshortened active participle *gāʿed* in the same function [ ... ]. This is an obvious imported rural *gələt*-type form ... "; cf. also Denz 1971, pp. 82–82, 110, 116 on *ǧāʕid* in Kwayriš).

This fact is supported by Hanitsch (2019, pp. 266–71), who even states that: "Die vollen Formen [of the verbal modifiers deriving from OA *qaʕad* 'to sit'] sind praktisch über das gesamte arabische Sprachgebiet hinweg anzutreffen. Besonders typisch sind sie für Dialekte nomadischen Typs oder mit nomadischem Adstrat" ['The full forms can be found across practically the entire Arabic-speaking area. They are especially typical of dialects of the nomadic type or with a nomadic adstratum'] (Hanitsch 2019, p. 267). She adds that even though this verbal modifier was especially typical of nomadic dialects, it was also found in *qəltu*-type dialects spoken in the Syro–Mesopotamian area, as well as in rural sedentary dialects in Morocco, Tunisia, and Palestine (Hanitsch 2019, p. 267).

As Palva (2009, p. 28) states, the use of this progressive marker is also well documented from the Syrian Desert (*ǧāʿid*) and Ḥōrān (*gāʿid*). It is also used in many Šāwi dialects (*ǧāʕid ~ ǧaʕd ~ ǧaʕd*; cf. Bettini 2006, p. 44; Procházka 2018b, p. 281; Younes and Herin n.d., *EALL Online*) and in Khorasan Arabic (Volkan Bozkurt, pers. comm.).

Some southern Iraqi dialects, e.g., Basra Arabic (Mahdi 1985, p. 212) as well as Najaf and Amarah Arabic (information provided by native speakers), use *ǧāy* (active participle of the verb 'to come') to mark progressive aspect.

It thus seems that only the shortened form, *da-*, is typical of urban dialects, not the use of a progressive marker per se.

#### 3.2.3. Future Marker *rāḥ*

The future marker *rāḥ* evolved by a grammaticalization process from *rāyəḥ* 'going', which is the AP of the verb *rāḥ* 'to go'. *rāḥ* is used as a future marker in Khuzestani<sup>15</sup> (Leitner 2020, p. 157), Basra (Mahdi 1985, pp. 210–11) and al-Shirqat Arabic<sup>16</sup>, and in all dialects of Baghdad (cf. Blanc 1964, pp. 117–18). In addition to the dialects primarily analyzed in this study, it is found, e.g., though not very frequently, in Bahraini Arabic (Holes 2001, p. 216; Holes 2016, p. 304; Johnstone 1967, p. 152; cf. Taine-Cheikh 2004, pp. 219–220; 231 for a good overview regarding its distribution including examples from North Africa and the Levant). In Kuwaiti Arabic it is used to express proximal intent 'to be about to' (Holes 2016, p. 304).

According to an informant from Kerbala (participant D in Section 3.3), some dialects in central Iraq, e.g., Kerbala Arabic, also use *ḥa* to mark future tense.

Kwayriš (cf. Denz 1971, p. 109, fn. 11) and Šāwi Arabic have no future marker. Khorasan Arabic uses the particle *ʕūd* to mark future tense (Volkan Bozkurt, pers. comm.).

Based on the fact that cognate forms of this future marker occur in Baghdadi Arabic (Muslim, Jewish, and Christian), as well as in Egypt, Damascus, and Beirut, Palva describes this future marker as an old urban and sedentary feature (Palva 2006, p. 612; Palva 2009, p. 21).

If the future marker *rāḥ* is indeed an old sedentary feature, it must have at one point been adopted by Bedouin Arab tribes (e.g., those that then settled in Khuzestan). Determining at what point this adoption happened is made difficult by the lack of historical data on, for example, Khuzestani Arabic. Similarly, the question of *why* this feature was adopted remains unanswered: it may have been motivated by, e.g., intensive contact with Iraqi speakers or by the spread of urban features (often connected with a certain prestige).

Regarding its current distribution, the use of this future marker is definitely a feature typical of both the sedentary-type *qəltu* and the Bedouin-type *gələt* dialects (like the indefinite article, cf. 3.2.1). It would be interesting to find out, for example, whether this future marker is still not used in Kwayriš or in other Iraqi rural areas nowadays. If it is used in present-day Kwayriš Arabic, this would support the theory of features found in urban centers spreading to rural areas. From a synchronic point of view, and based on our limited available data, we cannot, however, solidly claim that the use of the future marker *rāḥ* is an exclusively urban feature of present-day *gələt* dialects.

#### 3.2.4. Emphatic Imperative Prefix *d*-

The prefix *də-* ~ *d-* is used to express an emphatic, more energetic form of imperative, as in the following example from Khuzestani Arabic (cf. Leitner 2020, p. 166 for more examples):

(1) Ahvazi Arabic (Leitner 2020, p. 166)

*də-xall asōləf xayya!*

EMP-HORT tell\IPFV.1SG sister.DIM

'Let me tell (my story), sister (and don't interrupt)!'

In addition to Khuzestan, this prefix is also used in the described function in the Arabic varieties of Baghdad (Blanc 1964, p. 117), Basra (Mahdi 1985, p. 107), Kwayriš (Meißner 1903, p. XXXIV), Mardin (*qəltu*), and Harran-Urfa (Šāwi) in eastern Anatolia (Procházka 2018a, p. 169), in Christian-Maslawi Arabic (Hanitsch 2019, p. 61), and in some sedentary-type Bahraini Arabic village dialects (Holes 2016, p. 202). I was unable to find evidence for the use of this feature in Salonen's work on al-Shirqat (Salonen 1980), nor in Denz and Edzard's text recorded from a speaker from al-Shirqat (Denz and Edzard 1966).<sup>17</sup>

Colloquial Persian (especially the northern varieties) also uses a prefix *d-* for strong or emphasized imperatives, e.g., *de-boro* 'Go (now)!' (pers. comm., Nawal Bahrani and Babak Nikzat, May 2021) and the Mandaic imperative prefix *d-* (see Häberl 2019, pp. 694–95) may also be related.

Due to its co-occurrence in all Arabic varieties of Baghdad, Palva describes it as an old *qəltu* and with that as a sedentary feature (Palva 2006, p. 612; Palva 2009, pp. 21–22).

As for the origin of this prefix, Grigore (2019, p. 114) derives it from Ottoman Turkish (as an abbreviated form of *haydi/hayed/hadi* 'Come on!') and states that *de-* is also found in this function in contemporary varieties of Anatolian Turkish. Procházka questions this derivation, arguing that "Turkish possesses a distinct suffix to intensify imperatives (-*sana*/*sene*) and the use of *haydi* together with such forms is only optional", and instead points out that the particle might as well be of Arabic origin and a reflex of the OA demonstrative *ḏā/ḏī* (Procházka 2018a, pp. 183–84).

Whatever its ultimate source (language), its present-day distribution allows us to consider it an old areal feature of the broader Mesopotamian linguistic area, presently found in both rural and urban *gələt* dialects. One possible scenario for its distribution is that the Bedouin dialects that arrived in the Mesopotamian area between the fourteenth and eighteenth centuries adopted this feature from the (rural and urban) sedentary dialects.

In case its origin is actually Persian, the question of an ultimate sedentary or Bedouin character of this feature is in principle redundant, even though, of course, it might again be the sedentary dialects that first adopted this feature from Persian and then passed it on to the Bedouin dialects.

#### 3.2.5. Lack of Features: Feminine Plural Forms, Resyllabification Rules, and Form IV Verbs

Explaining the absence of feminine plural forms as "an inherited *qəltu* trait" (Palva 2009, p. 23) seems somewhat counterintuitive. Instead, we suggest explaining this phenomenon as a modification (reduction/loss) of dialectal features of the incoming *gələt* speakers rather than an adoption of the absence of a category. Even though the difference in this explanation might appear minor, we deem it important to acknowledge the directions of language change. This alternative interpretation is actually touched upon by Palva himself (ibid.), who states that "the natural drift combined with dialect contact would probably have led to the same development, as it has actually done as part of sedentarization process, e.g., in urban centers such as Basra, Zubair and Kuwait."

Although the loss of feminine plural forms is indeed often connected with urbanizing processes, this is not necessarily the case (cf. Ahvazi Arabic 3.4), nor is the lack of feminine plural forms perceived as an urban feature by urban speakers of Iraqi and Khuzestani Arabic (cf. 3.3).

Similarly, Palva lists the absence of the Bedouin-type resyllabification rules, such as the *gahawa*-syndrome or the rendering of OA CaCaC-v(C) > CCvC-a(C), among the inherited *qəltu*-type features, but explains this as a "phonetic adaptation by immigrant Bedouin speakers" (Palva 2009, p. 24). As was shown in Sections 3.1.3 and 3.1.4, the active use of these phonological rules has declined greatly, and their absence is not limited to *gələt* dialects that have been in contact with a *qəltu* dialect or been influenced by a *qəltu* substratum. It thus seems that this process (loss of phonological or morphonological features) is triggered by the markedness of these features (cf. Section 3.3).

Finally, Palva also lists the absence of Form IV as a productive morphological category in MBA among the inherited *qəltu* traits, even though he himself admits that this feature was absent in Jewish and Christian Baghdadi as well (Palva 2009, p. 24). This verbal pattern has ceased to be productive in most dialects of this area (and beyond), including rural dialects, and its loss is therefore better seen as a general tendency of spoken Arabic than as a specific *qəltu* feature.

#### *3.3. Markedness of Rural Features—A Small-Scale Sociolinguistic Survey among Urban Iraqis*

Inspired by the question proposed in the introduction and the features previously described as rural (cf. Table 1), the author has undertaken a small-scale sociolinguistic survey among five urban Iraqis. Audio-recordings of all the interviews were made—of course, with the participants' consent—and later partly transcribed. Some examples taken from these recordings are cited below with English translations.

The five participants, aged between 26 and 41 (three male, two female), are from different urban backgrounds (Falluja, Baghdad, Kerbala, Amarah), and all fled to Vienna within the past few years.

Participant A is a 33-year-old male graphic designer from Falluja, who fled to Vienna in 2015.

Participant B is a 41-year-old professional painter and was born and lived in Al-Adhamiyah in Eastern Baghdad until he fled to Austria in 2015.

Participant C is 26 years old and is also from Al-Adhamiyah, Eastern Baghdad. He went to study for some time near Tikrit. He, too, fled to Vienna in 2015.

Participant D is a 28-year-old (female) doctor, who was born and studied in Kerbala, where she also worked in a hospital after finishing her studies. She moved to Vienna in September 2021.

Participant E is a 39-year-old mother of four children, who was born in the city of Amarah (Maysan), but then lived for more than a decade in Kirkuk before she and her family fled to Vienna in 2015.

A short interview was conducted with each interviewee individually (mostly in Iraqi Arabic) while focusing on three aspects:

(i) Do the interviewees themselves reproduce rural features as described in Section 3.1?

(ii) What do the participants think about the features described in Section 3.1 regarding their distribution, status, and use? (Which speakers, or in which regions, would we find them? Are they rural, *rīfi*, features?)

(iii) What other features do the participants associate with rural speech?

To answer question (i), the participants were asked to translate certain words and phrases from English, German, or Standard Arabic into their native dialect to see whether they themselves reproduced rural features.

The other two questions were asked within an 'open questions' part of the interview, in which the participants could tell me what came to their mind when hearing specific words or features—e.g., the use of *ʔimiš* for 'go' (IMP SG.M) instead of *ʔimši*—or when thinking about the urban–rural distinction in the Iraqi dialects in general. Whereas for denoting 'rural' the Arabic adjective *rīfi* was used by all speakers and some of the younger speakers even used the English term 'rural' additionally (cf. the quote below), for the concept of *urban* no direct equivalent was used. Instead, the interviewees, for example, explained that *ahl il-madīna* 'the city dwellers' would use a certain form or not.

This survey yielded the following results represented in what we shall call a 'Rurality Scale'. The stronger a feature is associated by speakers with rural contexts, the farther the bar related to these features goes to the right<sup>18</sup> and, in consequence, the more likely it is that such a feature is modified or given up in urban contexts.

Of all (possibly) 'rural' features discussed in Section 3.1, only the raising of \**a* and the use of gender distinction in the plural were found in the speech of my interviewees, who all lived in cities when living in Iraq. For the translation of 'knives', for example, all participants used the form *sičāčīn* (< *sakākīn* via raising of \**a*), and the sentence 'The mothers are baking bread' was translated by all as *əl-ʔummahāt da-yixubzan*, i.e., using the third person feminine plural form of the verb.

As illustrated in Figure 1, among the 'overtly rural features' (cf. Abu-Haidar 1988, p. 75) in Iraqi Arabic are the *gahawa*-syndrome, the resyllabification of CaCaC-v(C) > CCvC-a(C), the affrication of \**q*, and, albeit to a lesser degree, SG.M imperatives of IIIy/w verbs of the structure ʔvCvC, i.e., lacking the final vowel, and the pronunciation of \**ġ* as [q].

The pronunciation of \**q* as *ǧ* in forms like \**qalīl* > *ǧilīl* and \**qarīb* > *ǧirīb* was considered a rural feature by four participants (B, C, D, and E). Speaker D even emphasized that this was a *very* rural feature as such forms were used mainly in *əl-mukānāt əl kulliš rural* 'areas that are totally rural' and therefore in general not that commonly heard. Speaker B associated this with the speech of northern tribes, where he has relatives, and said it was possibly also found among rural speakers from the south, but that he was not sure about that because he did not know any southerners. Speaker A considered it a northern feature that was, however, not limited to rural contexts but also found in urban contexts in the north.

As for the resyllabification of CaCaC-v(C) > CCvC-a(C), only the youngest participant (C) explained that forms such as *ktəbat* 'she wrote' or *glubat* 'she turned' did not exist. The other four speakers (A, B, D, and E) considered both the resyllabification of CaCaC-v(C) and the *gahawa*-syndrome as typically rural (e.g., A *bi-l-aryāf* 'in rural areas') or Bedouin (e.g., B *ʕašāyirna ygūlūha* 'our tribes say it') features. Speakers A and B again associated these features particularly with the speech of northern tribes.

As for the SG.M imperative of final weak roots, all of the interviewees used forms ending in a vowel and two of them (B and C) considered forms lacking the final vowel—e.g., *ʔəməš* 'go!'—as simply wrong or non-existent. The other three participants (A, D, and E) associated said forms with rural speech (e.g., A: *ʔimiš yistaxdimūnha b-ir-rīf ʔaktar* '*ʔimiš* is mainly used in rural areas').

Interestingly, none of my interviewees associated the use of feminine plural forms with rural speech. In fact, most of them used verbs in the feminine plural form when translating sentences like 'the girls washed ... ' or 'the mothers are baking bread', but only the speaker from Amarah (E) explicitly stated that the sentence would be wrong if I used a masculine plural verb instead (which would have been acceptable for the others).

Similarly, the retention of the vowel after the prefix *t-* in Pattern V and VI verbs or the raising of \**a* were not mentioned as rural features by any of the interviewees. All of the participants produced forms of Pattern V and VI verbs without vowel retention only, e.g., *nətʕašša* 'we eat dinner', but did produce forms in which \**a* was raised to *i*, e.g., *diǧāǧa* 'a chicken'. The retention of the vowel in the prefix of Pattern V and Pattern VI verbs thus might be limited to rural areas but does not seem to be marked as a feature of rural speech. The raising of \**a* is neither limited to rural areas nor a marked rural feature.

In addition to these features, three participants (A, C, E) mentioned the realization of \**ġ* as [q] and two (A and C) the use of *tafxīm* 'emphasis' (also described as 'heavy speech') as rural features. About the realization of \**ġ* as [q], participant A stated that *b-ir-rīf mā nistaxdim il-ġayn b-il-ʕumūm, nistaxdim il-qāf, maṯalan id-Deckel* [German for 'lid (of a pot)'] *iḥna b-il-ʕāmmiyya ingūl ʕalē qabaġ humma igūlūn ʕalē qabaq ē kulla qāf* 'in the countryside, they [lit. 'we'] don't use the [letter] *ġayn* generally, they use the [letter] *qāf* [instead]; for example, for lid [of a pot] we say *qabaġ* in our dialect, (while) they call it *qabaq*, yes, it's all *qāf*s'. About the stronger use of emphasis in the dialects spoken in the north-western Iraqi province of Anbar, the same speaker says: *lahǧathum kulliš tigīla* 'their dialect is very heavy [i.e., characterized by emphasis]'.
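The participant counts reported in this section can be tallied into a simple score per feature, which is one way to picture the logic of the Rurality Scale. The following Python sketch is illustrative only: the participant sets are read off the prose above, the score (share of the five interviewees who associated a feature with rural speech) is an assumption, and neither necessarily corresponds to the data or scaling behind Figure 1.

```python
# Participants (A-E) who associated each feature with rural speech,
# as read off the prose of Section 3.3 (an interpretation, not the Figure 1 data).
RURAL_ASSOCIATIONS = {
    "affrication of *q": {"B", "C", "D", "E"},
    "gahawa-syndrome": {"A", "B", "D", "E"},
    "resyllabification CaCaC-v(C) > CCvC-a(C)": {"A", "B", "D", "E"},
    "SG.M imperative lacking final vowel": {"A", "D", "E"},
    "*ġ realized as [q]": {"A", "C", "E"},
    "tafxīm": {"A", "C"},
    "gender distinction in the plural": set(),
    "prefix vowel in Form V/VI verbs": set(),
    "raising of *a": set(),
}
N_PARTICIPANTS = 5


def rurality_scores(associations, n_participants):
    """Return the share of participants who called each feature rural (assumed score)."""
    return {
        feature: len(speakers) / n_participants
        for feature, speakers in associations.items()
    }


scores = rurality_scores(RURAL_ASSOCIATIONS, N_PARTICIPANTS)

# Print a crude text bar chart, most strongly 'rural' features first.
for feature, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{feature:42s} {'#' * round(score * 10):10s} {score:.1f}")
```

On this reading, the features with a score of zero are exactly those that all interviewees either used themselves or did not comment on as rural.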

Furthermore, participants A and C also mentioned the use of several lexical features as typically rural, e.g., the use of *xašš* 'enter' (instead of urban *ṭabb*, participant A<sup>19</sup>; cf. Abu-Haidar 1988), *daḥḥiǧ ~ daḥḥig* 'to see' (vs. urban *bāwaʕ*, participant C), and *ġādi* 'there' (participant C).

As for the city of Baghdad, participant A (from Falluja) described its people and dialect as the most educated: *fa-tkūn lahǧathum hiyya l-lahǧa l-Karxiyya l-bēḍa lli hiyya qarība ʕa-l-fuṣḥā yaʕni bī-ha kalimāt ʕāmmiyya u bī-ha fuṣḥā* 'and their dialect is the "white" dialect of Karkh [Western part of Baghdad], which is close to the literary language, that is, it has dialectal words as well as words derived from the literary language'.

Participant E described the dialect of Baghdad as effeminate and unmanly, while she described the rural dialects as masculine.

In general, it appears that the purely geographical division into *ǧanūbi* 'southern', *b-il-waṣaṭ* 'in the center, central', and *ġarbi* 'western' often plays a bigger role than the urban–rural distinction in the participants' descriptions of the dialectal landscape of Iraq.

Finally, it has to be mentioned that two pejorative terms used by city-dwellers to describe people from rural areas were mentioned in the interviews. These terms are *šrūgi* (PL *šrūg*)<sup>20</sup> and *mʕēdi* (PL *məʕdān*). While the former mainly denotes people from the south, the latter essentially refers to the marsh-dwellers, many of whom live in the Eastern province of Maysan, but is now often used to derogatorily describe an uneducated, uncultivated person. People of both groups have moved to (the suburbs of) Baghdad during the past decades (cf. Miller 2007, p. 14) and thus more contact situations with the city dwellers have arisen. Most of my participants mentioned that they only used those terms for people who lacked education and good taste in clothing and had a more conservative lifestyle, but not for people who came from rural areas but were educated and had adapted to the city lifestyle. Only participant E, who was born in Maysan herself, said she was proud of the *məʕdān* heritage of her people and considered it an important part of Iraqi culture.

#### *3.4. Case Study Ahvaz, Khuzestan: Urbanization of a Rural* gələt *Dialect*

The city of Ahvaz, capital of the south-western Iranian province of Khuzestan, witnessed rapid urbanization and population growth in the twentieth century. In the nineteenth century, Ahvaz was no more than a village and in the early twentieth century it still had fewer than 50,000 inhabitants (see Oppenheim 1967, p. 22, fn. 1). During the Iran–Iraq War (1980–1988) numerous houses were destroyed (especially in the southern Khuzestani cities of Muḥammara/Khorramshahr and Abadan) and many families were forced to flee their hometowns. Many of the Khuzestani Arab war refugees left the province of Khuzestan altogether or went to comparably safer cities, such as Ahvaz. For this reason, during that time, the city of Ahvaz witnessed an immense population growth. According to Nejatian (2015), the number of inhabitants in Ahvaz grew from 334,399 in 1976 to 724,653 in 1991, and to 1,112,021 in 2011.
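To put the population figures cited from Nejatian (2015) into perspective, the growth factor and an average annual growth rate can be derived from them. The short Python sketch below does this arithmetic; the helper name `growth` and the use of a compound annual rate are illustrative choices, not part of the cited source.

```python
# Population of Ahvaz as cited in the text (Nejatian 2015).
CENSUS = {1976: 334_399, 1991: 724_653, 2011: 1_112_021}


def growth(population, start, end):
    """Return (growth factor, compound annual growth rate) between two years."""
    factor = population[end] / population[start]
    annual_rate = factor ** (1 / (end - start)) - 1
    return factor, annual_rate


for start, end in [(1976, 1991), (1991, 2011), (1976, 2011)]:
    factor, rate = growth(CENSUS, start, end)
    print(f"{start}-{end}: x{factor:.2f} overall, {rate:.1%} per year on average")
```

The first interval, which includes the war years, shows the population more than doubling, and its average annual rate is clearly higher than that of the following two decades.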

Looking at the loss of rural features in Ahvaz might give us a hint as to which *gələt* features are highly marked and first to be given up in arising urban contexts. Arising urban contexts are here defined as contexts which permit contact with other *gələt* dialects (urban and rural) but do not necessarily have an old sedentary or *qəltu* substratum.

Of the (possibly) rural features discussed in Section 3.1, Feature 1 (affrication of \**q*), Feature 2 (raising of \**a* in pre-tonic open syllables), and Feature 5 (gender distinction in the plural), are commonly found among all speakers of Ahvazi Arabic.

The remaining features discussed in Section 3.1 are not found in Ahvazi Arabic or are in the process of being dropped:

Feature 3: In Ahvazi Arabic as well as most present-day Khuzestani Arabic dialects, the use of forms that show the typical *gahawa*-type resyllabification is limited to certain frozen examples—e.g., *ʔahali* 'my family' and *xaḍar* ~ *ʔaxaḍar* 'green'—and not productive, e.g., Ahvazi Arabic *gahwa* (not *ghawa*) 'coffee', and *naʕya* (not *nʕaya*) 'ewe' (cf. Leitner 2020, p. 50 for more examples).

Feature 4: Ahvazi Arabic does not have forms that show the Bedouin-type resyllabification rule of CaCaC-v(C)-structures, e.g., Ahvazi Arabic *kətbat* 'she wrote', and *šəbgat* 'she hugged'. The Bedouin-type form is still typical of north-western Khuzestani towns and villages, such as Xafaǧīya (Pers. Susangerd) and Ḥuwayza.

Feature 6: Most speakers of Ahvazi Arabic do not show retention of the prefix vowel in Form V and Form VI verbs. As stated above (3.1.6), this feature is more typical of rural areas and smaller Khuzestani towns and villages such as Ḥamīdiyya.

Feature 7: In present-day Ahvazi, but not among all speakers, the rural form of the SG.M imperative of IIIw/y verbs is still found, i.e., the one lacking the final vowel—e.g., *ʔəməš* 'go! (IMP SG.M)', *ʔəḥəč* 'speak! (IMP SG.M)'. However, the form ending in a vowel that is associated with urban contexts (cf. Ingham 2007, p. 577; 1973, p. 544)—e.g., *ʔəmši* 'go (IMP SG.M)'—is also used in present-day Ahvazi Arabic. Thus, this rural feature is apparently still in the process of being dropped and substituted by the less marked urban forms.

Finally, present-day Ahvazi Arabic appears to use both typically *ʕarab*- and *ḥaḍar*-type words (cf. fn. 13 on these terms; Ingham 1973, p. 538). *ʕarab* words (associated with a rural lifestyle) include, e.g., *(le-)ġād* 'there' (however, its *ḥaḍar* equivalent *hnāk* 'there' is equally attested in Ahvaz). *ḥaḍar*-type lexical items in Ahvazi Arabic are, for instance, *taʕadda* 'to pass (i.e., go past something)'. Some lexemes that Ingham mentioned appear to have been given up completely, e.g., the word for 'meal' today is neither *marag* (which Ingham lists as *ḥaḍar*) nor *ydām* (which Ingham lists as *ʕarab*), but *ʔakəl*; and the most commonly heard word for 'mirror' in present-day Ahvazi Arabic is neither *mnədra* (*ḥaḍar*) nor *mrāya* (*ʕarab*), but *məšūfa* (PL *məšāwəf*) and *məšaffa* (PL *məšaffāt*). In turn, sometimes items from both types are used, for example, 'to look (at)' may be expressed by *əṣṭəba* (*ḥaḍar*), *bāwaʕ* (*ʕarab*), or *ʕāyan* (*ʕarab*) in present-day Ahvazi Arabic. Even though this distinction cannot be equated with the urban–rural distinction but is rather connected to occupational differences (cf. fn. 13), these processes found in the lexical domain support the assumption that Ahvazi Arabic has been subject to linguistic leveling tendencies since the times of Ingham's fieldwork in 1969 and 1971. The result of this development is a dialect which does not clearly correspond to one of these sociolinguistic categories anymore and may rather be considered as of mixed typology. The reasons for the linguistic leveling and mixture of dialectal features observed for Ahvazi Arabic lie mainly in the rapid demographic changes that this city has witnessed during recent years. Its fast growth during and after the Iran–Iraq War, especially, has allowed for much (linguistic) contact among people of different geographic origins within Khuzestan and southern Iraq, calling for linguistic accommodation and triggering leveling processes (cf. Ech-charfi 2020, pp. 70–71, 75 on leveling tendencies in other new cities, such as Amman and Casablanca).

The fact that of all possible rural features discussed in Section 3.1, Feature 2 (raising of \**a* in pre-tonic open syllables) and Feature 5 (gender distinction in the plural) are not modified or dropped in Ahvazi Arabic is partly paralleled by the results of the sociolinguistic interview (cf. Section 3.3), according to which these features are not marked as rural features. The third feature that is not dropped in Ahvazi Arabic, Feature 1 (affrication of \**q*), shows that affrication of \**q* is apparently less marked in Khuzestan. This might be related to the fact that the urban category is newer in the Khuzestani society and the dichotomy of urban vs. rural features not as strong or long established as in Iraq, where urban centers have already existed for hundreds of years.

All features that are not found (anymore) in Ahvazi Arabic as described above, except for Feature 6, are the same features that were perceived as highly marked by the participants of the sociolinguistic interview.

#### **4. Discussion**

In the light of the scarcity of linguistic descriptions of *gələt* dialects in general (of both urban and rural contexts) and the fact that some of the descriptions date back more than 100 years (e.g., Meißner 1903 on Kwayriš), we must be careful when drawing general conclusions about the present-day classification of this dialect group. The following interpretation of our results is therefore to be understood as tentative and as re-opening the floor for debating this issue.

#### *4.1. Historical and Modern MBA and the Quest for Urban* gələt

The analysis in Section 3.2 lets us safely conclude that many features of MBA need not necessarily be explained as a consequence of the *qəltu* substrate but can also be interpreted as consequences of the urbanization of *gələt* dialects.

The fact that some features which have been explained as old urban and sedentary features are nowadays also found in rural Bedouin dialects that have not (or not likely) been directly in contact with *qəltu* supports the theory that features often spread from prestigious urban centers to rural areas.

For example, the emphatic imperative prefix *d-* (cf. Palva 2009, pp. 22, 35), the use of the future marker *rāḥ* and the indefinite article *fard* are nowadays attested for several urban and rural *gələt* dialects alike and (partly) also for the Šāwi dialects and dialects of Khorasan. In these dialects, the existence of these features cannot be explained via a *qəltu* substrate, which they do not have. More likely, they have spread—probably at a very early stage—from the longer established sedentary *qəltu* to the later incoming Bedouin *gələt* dialects. Nowadays, we may consider them areal features general to the southern Mesopotamian area and beyond.

Instead of explaining the absence of certain features of MBA—the lack of the resyllabification rules and the feminine plural forms—via the *qəltu* substrate, we suggest, rather, that these developments be seen as consequences of the urbanization of *gələt* speakers and the subsequent loss of marked features (cf. subsequent subsections). Some but not all of these marked features are mostly absent in modern urban contexts, e.g., Baghdad and Ahvaz, and are still strongly associated by speakers themselves with Bedouin and rural-type dialects (at least for Iraq). Notably, this is not the case with the gender distinction in the plural, which is gaining ground in modern MBA and is retained in the modern city of Ahvaz.

As for the question why—at least at the time of Haim Blanc's descriptions of Baghdadi Arabic—the only urban or urbanized *gələt* dialects are found in Lower (and none in Upper) Iraq, we must keep in mind that most towns of Lower Iraq were built (or re-populated) no earlier than the eighteenth and nineteenth centuries, after the massive depopulation of this area following the Mongol invasions (cf. Blanc 1964, p. 170, fn. 189 and the references mentioned there). This implies that we are mostly not dealing with longstanding sedentary populations (that were later Bedouinized or marginalized by the Bedouin immigrants, as in the case of MBA<sup>21</sup>) but with Bedouin populations that were sedentarized and urbanized much later (nineteenth and early twentieth centuries). In contrast, towns in Upper Iraq have had a more continuous (*qəltu*-speaking) population. In addition to that, the topography of Upper Iraq has been described as "more conducive to polarization between sedentary and nonsedentary life", as the steppes allowed for grazing only, while the fixed banks of the river courses are well suited for permanent sedentarism (cf. Blanc 1964, pp. 170–71). This situation has definitely changed as more and more Bedouins have given up the nomadic lifestyle. The question of whether, and to what degree, we nowadays find urbanizing tendencies in the *gələt* dialects of Upper Iraq remains to be answered in future studies, as we still lack the data needed for such an analysis.

#### *4.2. Who Speaks Urban* gələt*?*

Traditionally, Muslim Baghdad Arabic (MBA) has been considered the main representative of an urban *gələt* type besides the Arabic dialect of Basra (Blanc 1964, p. 165).<sup>22</sup> However, considering the unique linguistic history of MBA, a *gələt* dialect that has a *qəltu* substratum (Palva 2009 argues for a mixed *qəltu/gələt* character of MBA; cf. Section 3.2 for a discussion of this description), it is questionable whether it is a good reference point for a general description of an urban *gələt* type. The specific character of MBA has indeed arisen in an urban context (though the sense of 'urbanization' here is not a socio-economic one); however, the contact between (Bedouin) *gələt* and (sedentary) *qəltu* speakers has also shaped the linguistic profile of this dialect.

On the urban character of MBA and the urban–rural split among the *gələt* dialects, Blanc tentatively stated that "... [MBA] is closest to the urban dialects on which some data are available (Basra, Qalʿat Ṣāleḥ) so that one dimly foresees a possible classification of urban vs. rural *gələt* dialects, as yet not solidly established" (Blanc 1964, p. 165).

Mahdi writes in the introduction to his thesis on Basra Arabic that "in studying BA, I found no justification for dividing BA [Basra Arabic] into two groups, i.e., urban and rural, [ ... ] The linguistic boundaries between the town and the surrounding countrysides are simply not apparent [ ... ] mainly because the town society is rural in origin and the towns depend basically on the surrounding villages and countryside for filling the needs and manpower. Those who live in the town are most of them originally villagers or cultivators who moved to the town for various reasons" (Mahdi 1985, p. XV). Basra was almost completely destroyed in the fourteenth century, was subsequently moved to its modern location at al-ʿUbulla (Pellat and Longrigg 2012; Oppenheim 1952, p. 178), and was subject to massive immigration by speakers from rural areas of Lower Iraq. A major difference between Basra Arabic and MBA is that the former has no *qəltu* substrate.

In a similar vein, Bruce Ingham states in the introduction to his book *Arabian Diversions* (Ingham 1997, pp. ix–x) that geographically and demographically all Khuzestani Arabic dialects are really rural in character, for which reason he prefers to use the terms *ḥaḍar* vs. *ʿarab* instead of urban vs. rural for the subclassification of these dialects (cf. fn. 13 below).

Of course, since Ingham's descriptions (based on his fieldwork carried out in the 1960s and 1970s), new urban centers have developed (e.g., Ahvaz, cf. Section 3.4) and others have grown considerably. In both Iraq and Khuzestan, the past decades have witnessed massive population movements (to a considerable extent caused by the Iran–Iraq war in the 1980s) from rural to—already existing as well as newly arising—urban areas. The question of possible linguistic effects of these urbanizing tendencies among the *gələt* dialects will be discussed in Section 4.4.

Thus, we can see that even though the term 'urban' is clearly associated with cities, this does not necessarily mean that the speakers of a city speak an urban variety and even less does it mean that their families are of an old, established urban background.

#### *4.3. About Rural* gələt *and the Markedness of Rural Features*

The analysis of the possible (phonological and morphological) rural features in Section 3.1 yields no unanimously clear picture regarding their distribution. As we can see in Table 1, only Features 6 and 7 are clearly absent in the dialects of the urban centers Basra and Baghdad.<sup>23</sup> However, even regarding these features the picture is not clear when it comes to the supposedly rural dialects of Khuzestan and Kwayriš. The distribution of all other features (1–5) does not show a clear-cut distinction between rural and urban contexts either.

While Feature 1 (affrication of \**q*) is found in all dialects analyzed but MBA, Feature 2 is found in all dialects analyzed. Even though Features 3 and 4 are virtually absent from the dialects of Basra and Baghdad, they have also been dropped or are in the process of being dropped from what have usually been described as rural *gələt* dialects. Feature 5 stands out by being found in all varieties analyzed (urban and rural), albeit in MBA it can often be substituted by masculine plural forms.

This picture is in many points corroborated by the results of the sociolinguistic survey presented in Section 3.3 and the theoretical assumption that "overt stigmatization attached to certain rural features seems to be the main reason why these features are reduced during the accommodation process to urban speech" (Abu-Haidar 1988, p. 76).

How markedness may lead to the loss of features, how non-markedness can foster retention of features, and which features are relevant in our context, will be discussed in the following Section 4.4.

#### *4.4. Linguistic Consequences of Urbanization*

In the following, some general tendencies found in urban(izing) contexts will be discussed. 'General' is to be understood in the geographical context of the Mesopotamian area only, as supra-regional tendencies in urban and rural dialects of Arabic are difficult to find, and the analysis of such supra-regional tendencies was not within the scope of this paper. One possible supra-regional tendency might be the perception of urban dialects or urban features as effeminate vs. rural dialects as masculine (cf. Section 3.3; Miller 2007, p. 13; Ech-charfi 2020, p. 75). In general, urbanizing contexts are characterized by an increase of contact (be it urban–rural, or rural–rural of two different rural backgrounds) and leveling tendencies; they also often go hand in hand with complexity reduction.

#### 4.4.1. Loss of Features

It is often the case that a language becomes structurally simpler in contexts permitting a high degree of language (or variety) contact leading to linguistic leveling (cf. Kerswill 2013, p. 521 and the references mentioned there). Based on this rule, we would expect categories such as gender distinction in the plural for verbs and pronouns to be given up in urban contexts.

Like most other *gələt* dialects, including those spoken in urban contexts such as Basra, all Khuzestani Arabic dialects have retained gender distinction in the plural. This is a feature that in general is often given up in urban-type dialects, even when of Bedouin origin (cf. Procházka 2014, p. 129). The use of feminine plural forms even seems to have gained ground in the city of Baghdad. This fact was confirmed in the sociolinguistic survey conducted for this paper, which also showed that it is not indexed with rural speech or overtly marked as rural. This explains why it can be easily reintroduced in urban contexts, such as Baghdad, with the immigration of rural speakers, and why it is not rapidly given up in newer urban contexts, such as Ahvaz. Finally, the fact that feminine plural forms exist in the literary language might raise the prestige of this feature, or at least increase the exposure of speakers to it, which in turn contributes to the readiness to adopt it (Baghdad), as well as to its resistance to being dropped (Ahvaz).

Due to increased literacy in cities, features that deviate from the literary language in a way that is clearly perceivable for speakers are often highly marked. Examples of such marked features (be it for their deviation from the literary language or for other reasons) are, e.g., the *gahawa*-syndrome, the Bedouin-type resyllabification rule, and SG.M imperative forms of final weak roots lacking a final vowel. At least for our geographic context, our small-scale sociolinguistic survey shows that these are among the features most readily given up in urban contexts and most strongly marked as rural among urban speakers. Inversely, the sociolinguistic survey also shows that those features that are apparently not indexed with rurality—especially Features 2 and 5 as described in Section 3.1—tend to be retained in both urban and rural contexts.

The current developments in Ahvazi Arabic (cf. Section 3.4) underline the proposed tendencies found in urban contexts regarding the loss of highly marked rural features.

#### 4.4.2. Innovations

Reevaluating Heikki Palva's diachronic discussion of urban *qəltu* features as found in MBA has shown that urban contexts, especially those where contact with other languages plays an important role, may encourage the development of new linguistic categories. Examples of this are the emergence of an indefinite article (which in general is absent in Arabic), as described in Section 3.2.1, and of the emphatic imperative particle *d-* (provided we consider it a Persian loan; cf. Section 3.2.4).

Markers of time (future), aspect (progressive), and indefiniteness may have arisen historically among sedentary communities but were at some later point adopted by Bedouin speakers. Most likely, this happened in urban contexts, which strongly facilitate contact situations.

#### **5. Conclusions and Outlook**

The distinction between urban and rural must not be given up completely for the *gələt* dialect group. In writing about this distinction with respect to present-day dialects, we should, however, shift the focus onto synchronic sociolinguistic differences that we can observe among speakers who live in cities vs. speakers who live in the countryside. Following this, we should focus on the study of current linguistic trends as observed in arising urban contexts, such as Ahvaz.

Approaching the question of the urban–rural split among the *gələt* dialects, it also appears highly necessary to ask what is subjectively perceived as rural or urban speech by native speakers, rather than imposing what we think to be rural or urban based on the few dialectal descriptions we have from over a hundred years ago. By this critical reevaluation, the author by no means wants to lower the value of these seminal contributions made by Arabic dialectologists, but merely proposes a new way of approaching the classification of the *gələt* dialects.

One major factor that should be considered in any new attempt at classifying the *gələt* dialects by using the urban–rural dichotomy is the different nature of older or longer established urban communities, as found in Baghdad, and the communities of new cities, such as Ahvaz. While in the former, longer established urban communities witnessed rural immigration, in the new urban contexts most inhabitants, or at least their (grand)parents, are still of rural origin themselves. This means that there is no established urban community which would define the linguistically urban character of this city in the first place. Rather, it is the "cohabitation of different ethnic groups" (Ech-charfi 2020, p. 72) and groups of different geographical origin that is shaping the new urban profiles. This should not mean, however, that the linguistic profile of such urban spaces that *do* have established urban communities and that are facing rural immigration is only defined by the linguistic traits of the old urban community. In such scenarios we often observe that new urban sociolinguistic identities are coined by combining both old urban features and part of the rural linguistic heritage (cf. Ech-charfi 2020, pp. 72, 75–76 on a similar observation in Rabat, Amman and Casablanca Arabic). At least for modern MBA, the following citation from Ech-charfi applies: "New urban identities are constructed linguistically by combining traditional urban and rural variants while rural stereotypes serve as the background against which urban identities are defined" (Ech-charfi 2020, p. 76).

Thus, in no scenario can we speak of a homogeneous urban group that is clearly distinct from the rural population, as such groups are partly (Baghdad) or completely (Ahvaz) descended from rural populations themselves (cf. Section 4.1 and Mahdi 1985, p. XV, who states that all inhabitants of Basra are rural in origin). Rather, we can only try to capture linguistic trends found in urban contexts by observing which features are most readily modified or dropped and which are adopted, and by asking speakers which features are perceived as rural. In this way we can begin to define the (socio-)linguistic profiles of the (modern-day!) urban and rural societies in Iraq and Khuzestan.

Importantly, the definition of the 'default' urban *gələt* type should not be limited to the scenario of MBA, a Bedouin type *gələt* dialect with a sedentary type *qəltu* substratum, but should be extended to include *gələt* dialects spoken in the context of newly arising urban spaces. Of course, the different nature of these two urban scenarios must be borne in mind.

In addition to the urban–rural distinction as treated in this paper, the *gələt* dialects can be divided by geographic aspects, e.g., into a southern and a northern group. For this question, the reader is referred to Hassan (2020, 2021), who discerns two geographic subgroups of Iraqi Arabic: *šrūgi* (south of Baghdad) and *non-šrūgi* (north of Baghdad) dialects, corresponding roughly to the Shiite and the Sunni groups of *gələt* speakers in Iraq, respectively (cf. fn. 20 on the derogatory nature of these terms).

The classification of the *gələt* dialects is still far from being solidly established, a situation which primarily results from the scarcity of data available on this dialect group.

We do hope, however, that this modest contribution has brought forth aspects of this classification hitherto not considered and presented some already considered aspects in a new light. Hopefully, this study will motivate other researchers to continue research on the classification, the historical development, and the modern urbanizing tendencies of this still under-researched dialect group.

One major desideratum in the investigation of the urban–rural split in the *gələt* dialects is a large-scale sociolinguistic survey showing how linguistic variables are perceived in terms of prestige, markedness, and other sociological factors, such as masculinity vs. femininity, in various varieties of the *gələt*-speaking area. Ideally, this large-scale sociolinguistic survey would include additional variables: on the one hand, new variables from the domains of syntax and lexicon (e.g., *lē-ġād* 'there' described as a typical rural *gələt* feature by Fischer and Jastrow 1980, p. 151), which could not be treated within the scope of the present study; and on the other hand, phonological features like the realization of \**ġ* as [q] and the use of *tafxīm* 'emphasis' that were mentioned as rural features by the participants of the small-scale sociolinguistic survey conducted for this study.

**Funding:** Open Access Funding by the University of Vienna.

**Institutional Review Board Statement:** The only part of this study that included ethically relevant human interaction was the sociolinguistic interviews. Since the research agenda made it impossible to anonymize gender, age, and locations, only names were anonymized, though this was done thoroughly. All subjects were informed that their anonymity is assured, why the research was being conducted, and how their data was going to be used. All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki. Approval by an Ethics Committee was not deemed necessary as this is a non-interventional study, all conventions in terms of data usage were upheld, and there are no perceivable risks for the participants due to the conduct of the study or the publication of the material.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Both publicly available and private datasets were analyzed in this study. The publicly available datasets can be found in the references cited in this article. The private dataset stems from the author's own fieldwork conducted in Khuzestan in 2016 and is available on request from the corresponding author.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Abbreviations**


#### **Notes**


#### **References**


Behnstedt, Peter. 1997. *Sprachatlas von Syrien. I: Kartenband*. Semitica Viva 17. Wiesbaden: Harrassowitz.

Behnstedt, Peter. 2000. *Sprachatlas von Syrien. II: Volkskundliche Texte*. Wiesbaden: Harrassowitz.


Younes, Igor, and Bruno Herin. n.d. 'Šāwi Arabic'. In *Encyclopedia of Arabic Language and Linguistics*. Online Edition. Leiden: Brill.

Younes, Igor. 2018. Raising and the Gahawa-Syndrome, between Inheritance and Innovation. *Zeitschrift Für Arabische Linguistik* 67: 5–11.

## *Article* **The Southern Moroccan Dialects and the Hilāli Category**

**Felipe Benjamin Francisco**

Seminar für Semitistik und Arabistik, Freie Universität Berlin, 14195 Berlin, Germany; fbenjamin@zedat.fu-berlin.de

**Abstract:** The aim of this paper is to review the classification of the southern Moroccan dialects, advancing on the general description of these varieties. Recent descriptive studies provided us with new sources on the linguistic reality of southern Morocco, shedding light on the status of dialects commonly classified as Bedouin or 'Hilāli' within the Maghrebi context. To do so, the paper highlights conservative and innovative features which characterize the dialects of the area, focusing mainly—but not exclusively—on the updated data for two distant localities in southern Morocco: Essaouira and its rural outskirts—the Chiadma territory (Aquermoud and Sīdi Īsḥāq)—and Tafilalt, in south-eastern Morocco. The southern dialects have been situated in an intermediary zone between pre-Hilāli and Hilāli categories for a long time. Discussing their situation may contribute to understanding what distinguishes them as a dialectal group and also the validity of the 'Hilāli' category in the Moroccan context.

**Keywords:** Arabic dialectology; Moroccan Arabic; Essaouira; Tafilalt; southern Morocco; Bedouin dialects

#### **1. Introduction**

Traditionally, dialectologists have divided the Maghrebi dialects into two categories, pre-Hilāli and Hilāli, within a diachronic perspective which associates linguistic features with the waves of Arabization in North Africa, from the works of W. Marçais, such as the seminal text *Comment l'Afrique du Nord a été arabisée* (Marçais [1938] 1961), to more recent scholarship (cf. Aguadé 2018). Based on these two types, Colin ([1937] 1945, 1986) proposed a sub-classification of Western Maghrebi dialects, or more precisely Moroccan dialects, grouping them into: *parlers citadins*, *parlers montagnards*, *parlers bédouins* and *parlers juifs*.

Regarding the Hilāli-Bedouin type in Morocco, authors have attempted to tackle the problem of grouping different linguistic varieties under this category. Colin (1986, p. 1196) proposed that the Moroccan Bedouin dialects could be divided according to their levels of conservatism. That is the case of some dialects of the Sahara area—but not exclusively (e.g., Casablanca, Kampffmeyer 1912)—which retain features such as the realization [g] of \*qāf and the maintenance of interdentals (e.g., /ḏ/ and /ṯ/). The same aspect was observed by Lévy (1998, p. 19), who points out that Hilāli and Maʿqili dialects found in the Atlantic plains are quite different from the Maʿqili type in the Sahara (e.g., Ḥassāniyya). In agreement with this view, Heath (2002, p. 8) drew a distinction between Hilāli central type dialects and the Saharan ones, which—according to him—are restricted to southern oases and parts of the Atlantic plains in Morocco.

Regarding the Bedouin category in Morocco, Taine-Cheikh (2017) points out: "*la situation reste complexe à décrire pour les parlers qui ne sont ni pré-hilaliens ni du type 'saharien'*" (p. 25). That is the case of the southern Moroccan dialects, for which the application of the Hilāli category remains doubtful, despite the confirmation of the [g] realization and the loss of interdentals, both commonly associated with it. We therefore pose the question of whether the findings on the southern dialects, and the revision of their classification, might contribute to shedding light on the Hilāli category within the global linguistic reality of Morocco.

**Citation:** Francisco, Felipe Benjamin. 2021. The Southern Moroccan Dialects and the Hilāli Category. *Languages* 6: 192. https://doi.org/10.3390/languages6040192

Academic Editors: Simone Bettega and Roberta Morano

Received: 3 August 2021 Accepted: 10 November 2021 Published: 23 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

More recently, the endeavor of dialectologists to classify the so-called Hilāli-Bedouin dialects has resurfaced after the terms 'Hilāli' and 'Maʿqili' were called into question. Benkato (2019) criticized dialectologists for erroneously linking Medieval historical facts—originally incorporated from Ibn Khaldun by French orientalists—with the modern linguistic reality of the Maghreb. He argued that there is a lack of evidence on the direct connection between Medieval tribes, taken as a "reliable unit of sociolinguistic analysis" (*sic*)—such as the Maʿqil—and the Arabic dialects spoken nowadays in the region (p. 21). In this way, understanding the condition of southern Moroccan dialects might contribute to understanding the validity of categories such as 'Hilāli'. Nevertheless, it is important to say that the link between historical factors and the current linguistic reality should never be totally discarded.

To explain the distribution of the "southern" linguistic features over this part of Morocco, I argue that they are associated not only with the process of Arabization of this area, but also with modern historical factors, including the trans-Saharan trade route—connecting the Sahel to the Atlantic—and its effects on population movements in this area until the nineteenth century. In this way, I also try to explain why distant localities in the south share common linguistic features and how their nature bears on the validity of the Hilāli-Bedouin category for classifying southern Moroccan dialects. This may serve as a complementary explanation of the origin of common features found in southern dialects.

In this context, the aim of this paper is to review the classification of the southern Moroccan Arabic dialects, highlighting some of the features which might single out these varieties. To do so, the study relied on the recent collection of dialectological data for the Atlantic strip, in the Essaouira region (Francisco 2019a, 2019b, 2022), and for south-eastern Morocco, represented by Tafilalt (Heath 2002; Behnstedt 2004)—without ignoring previous descriptive studies on other varieties of the region.

#### **2. Materials and Methods**

#### *2.1. Descriptive Studies on the Southern Varieties*

According to the traditional classification of Moroccan dialects, the southern dialects belong to the Hilāli or Bedouin type, given that they spread from the Atlantic plains—in the area of Mogador—to the eastern part of the country, the Mouloya basin and the Moroccan Sahara (Colin [1937] 1945, p. 230). Part of these dialects is commonly distinguished from the "truly" Bedouin dialects—Saharan or Maʿqili type—given that their classification was conceived on the basis of the maintenance and loss of conservative features (Colin 1986, pp. 1195–96). Most of the dialects from the Atlantic strip to Tafilalt lost traditional conservative features (e.g., the interdentals), due to different degrees of Arabization of the Berber tribes (Heath 2002, pp. 8–9).

Taine-Cheikh (2017, pp. 25–26) proposes the category '*parlers "hilaliens" du Sud marocain*' to group together the dialects spoken from the Atlantic coast as far as the Algerian border. Her description relied on studies for the dialects of Skoura, Sous and Essaouira, which exhibit the realization of \*qāf as [g] and the loss of interdentals. In fact, the realization of \*qāf in southern Moroccan dialects remains a complex issue for the classification of these varieties under the label 'Hilāli', given that both *g* (Bedouin) and *q* (sedentary) alternate—as phonemes and allophones—varying lexically. Moreover, the voiced *g* continues to be very usual in these varieties (Heath 2002, p. 9).

Taine-Cheikh (2017, p. 26) considers as general common features for the southern varieties the following:


Given the new data for the dialects of the region, this paper gives special attention to the dialects encountered in two geographical extremities of the south not considered in the study mentioned above. For Tafilalt<sup>1</sup>: Rissani and Erfoud (Heath 2002); Īgli, Erfoud, Maʿdid, z-Zrīgāt and z-Zāwya ž-ždīda (Behnstedt 2004) and ʿṛab Səbbāḥ (Behnstedt n.d.<sup>2</sup>); and Judeo-Arabic of Ksar es-Souk and Rich (Heath and Bar-Asher 1982). For Essaouira (Mogador), I considered the Muslim and Jewish<sup>3</sup> dialects of the city, and also the variety of Chiadma territory (Aquermoud and Sīdi Īsḥāq)<sup>4</sup> in the rural surroundings of the city (Francisco 2019a, 2019b, 2022). Essaouira data, specifically, may prove to be a valuable source for understanding the linguistic reality of southern Morocco, due to the nature of the settlement in the city, which has attracted speakers from different parts of the south since its foundation in the second half of the eighteenth century, as we may see in the next section.

Apparently, the dialects of both southern regions share most of the common features indicated by Taine-Cheikh previously, with a few exceptions, as we may see in Section 3, which consequently have implications for the classification of these varieties. That may be explained by historical facts related to the Arab settlement in these localities and also by the lasting linguistic contact between the southern dialects.

#### *2.2. Historical and Linguistic Connections in Southern Morocco*

The history of population settlement in southern Morocco may explain the linguistic proximity between south-eastern Morocco—the Tafilalt area—and the Essaouira region on the Atlantic strip, which includes the Chiadma territory. Concerning the Arabization process of both areas, well-known historical sources indicate that these territories were occupied by Maʿqil tribe members at some point, after the beginning of the second wave of Arabization in the Maghreb with the Banū Hilāl invasions in the eleventh century. Modern sources continued to narrate the movements of these groups in southern Morocco, a region which gradually became more connected by centuries-long trade routes.

It is well known that the Maʿqil tribes entered the Maghreb accompanying the Banū Hilāl (eleventh–thirteenth centuries) and settled mainly on the outskirts of the latter's territory, especially the Sous and the region corresponding to present-day Mauritania. In eastern Morocco, the Ḏwī Manṣūr<sup>5</sup> settled along the Moulouya River and the deserts of Tafilalt, from Taourirt—in northern Morocco—to the Draa Valley, as far as Sijilmassa (Ibn Khaldun 2011, p. 2361). La Chapelle (1930, p. 89) claims that they remained in Tafilalt until the nineteenth century living among other tribes under Berber rule.

In southwestern Morocco, the settlement of groups of Maʿqil origin took place later, during the Saadian rule (sixteenth–seventeenth centuries), when groups such as š-Šəbbānāt<sup>6</sup> and lə-Mnābha emigrated from the Sous and established themselves on the territories of ʿAbda and Zʿir, on the Atlantic plains in central Morocco, but also in the surroundings of Marrakesh<sup>7</sup> (Colin [1937] 1945, p. 224). Moreover, in 1765, the foundation of the port town of Essaouira (Mogador), on the limits between Chiadma (Arabic speaking) and Haha (Tachelhit speaking) territories, attracted people not only from these two neighboring territories, but also from distinct parts of the Sous, and among them š-Šəbbānāt and lə-Mnābha were once again attracted to the Atlantic plains, taking part in the formation of Mogador's population (al-Kānūnī 1932; ar-Ragrāgī 1935; as-Sūsī [1966] 2005; aṣ-Ṣiddīqī 1969).

Later in the nineteenth century, the flow of the trade of the Trans-Saharan route shifted westward to the Atlantic coast, due to the important role of the port of Essaouira<sup>8</sup> for the international trade. The city became connected with southern Moroccan cities by routes with Akka and Guelmim<sup>9</sup>. In this way, Essaouira was connected indirectly to Tafilalt and West African regions<sup>10</sup>. In the second half of the nineteenth century (1860–70s), the greater portion of the West Africa trade came into Morocco via Tindouf and Sous to Mogador (Dunn 1971, pp. 278–80). Caravans were moving between Essaouira, Tafilalt and sub-Saharan regions, connecting their populations, who probably used Arabic as a lingua franca in commercial relations<sup>11</sup>. In the nineteenth century, Essaouira used to receive annually one or two caravans composed of a thousand camels, and smaller caravans as well, trading export commodities—such as gum and ostrich feathers—but also gold and slaves for the local market (Dunn 1971, p. 271). As Lévy (1998, p. 13) points out, linguistic exchanges certainly took place due to contact of the caravans with local populations while passing by rural markets on their routes across the south.

The trans-Saharan slave trade was also very active by that time. El-Hamel (2013) shows that there was a continuous import of thousands of slaves into Morocco by well-established trade routes (Tindouf, Ijil and Twat). He estimates that, by the end of the nineteenth century, the total black population was half a million people (pp. 245–46). According to him, many were sold in the markets of Fez, Mogador and Marrakesh (p. 251), and besides that, a part of the enslaved people from sub-Saharan Africa could be found in the sugar refineries near Essaouira, in Haha and Shishawa territories (p. 152).

It cannot be ignored that Essaouira and Tafilalt were connected, despite the distance, by the caravans moving between the two regions as part of the trans-Saharan trade. This fact may be important in explaining certain singular linguistic features in both localities.

#### **3. Southern Moroccan Features (Results)**

The following selected features may help us understand more deeply what brings the southern dialects together or sets them apart, according to the innovative or conservative nature of these traits.

#### *3.1. Retention of Diphthongs: /ăw/ and /ăy/*

In general, the southern varieties present the contraction of diphthongs /ăw/ > /ū/ and /ăy/ > /ī/, as expected for Hilālī or central-type varieties, described by Heath (2002, p. 9), as in: **Essaouira** *līl* "night", *sūq* "market", *ṣīf* "summer" (Francisco 2019b, p. 143); **Chiadma** *l-yūma* "today", *zīt* "oil" (Francisco 2019a, p. 5); **Skoura** *bīḍ* "eggs", *lūz* "almonds", *žīb* "pocket" (Aguadé and Elyaacoubi 1995, p. 25); **Sous** *ʕīn* "water spring", *fūq* "over, on" (Destaing 1937, p. 27). Sometimes diphthongs are accepted as variants in pharyngeal contexts, e.g., **Essaouira** *ʕīb* ~ *ʕăyb* "shame", *ṣūf* ~ *ṣăwf* "wool" (Francisco 2019b, p. 77).

In other southern varieties, the predominant feature above occurs along with the retention of diphthongs, which can be realized phonetically as the vowels [o:] and [e:] (also represented by *ō* and *ē*), as found in: **Tafilalt** *lōn* "color", *lēl ~ līl* "night", *rmăytu* "I threw it" (Behnstedt n.d., Notes sur le parler "bedouin" des ʕṛab Səbbāḥ, p. 3), *fōk* "above, on" (Heath and Bar-Asher 1982, p. 46). For the rural area of Essaouira: **Chiadma** *zōz*<sup>12</sup> "plough drawn by oxen" (Francisco 2019b, p. 79), *ʕəndu kōma* "he has got a stack (of money)" (Francisco 2019a, p. 6), *nsăyt* "I forgot", *bġăyt* "I want", *bnăyna* "we built", *dzādăyna* "we were born", *žăyna* "we came" (Francisco 2019b, p. 108). The diphthong in defective verbs is also preserved in Ḥassānīya: *žăyna* "we came" (Cohen 1963, p. 110), *šrăyna* "we bought" (*ibid.*, p. 102). The same feature is found in Saharan-type dialects in neighbouring areas, as in southwest Algeria: **Saoura** *wēn ~ weyn* "where", *ṣōt ~ ṣowt* "voice" (Grand'Henry 1979, p. 215); **Mzāb** *nsêit* "I forgot" (Grand'Henry 1976, p. 24), *šrîna ~ šrêina* "we bought" (*ibid.*, p. 26).

Nevertheless, the retention is not attested in most of the southern dialects analyzed here. Even in the few dialects in which it is attested, Chiadma and Tafilalt, the feature still occurs along with the reduction of diphthongs to /ī/ and /ū/. The maintenance of diphthongs and of the allophones *ō* and *ē* in both localities does not appear to be a result of trans-Saharan trade connections, given that the feature is absent from the urban Essaouira dialect. Instead, the feature seems to be of Saharan origin, as attested by its occurrence in Saharan dialects, and might be evidence of the nature of the Arab settlement in both localities.

#### *3.2. The Verbal Suffix -āt (3f. perf.)*

In Morocco, the suffix *-āt* is a variant of *-ət* in the 3f. perf. conjugation of triliteral strong verbs. The suffix appears to be predominant in the majority of southern dialects: **Essaouira** *šəṛbāt* "she drank" (Francisco 2019b, p. 96); **Essaouira (J)** *okʕīt* "happened" (3f.) (Lévy 2009, p. 367), *səṛbīţ* "she drank" (Francisco 2022, in press)<sup>13</sup>; **Tafilalt** *šəṛbāt* "she drank" (Behnstedt 2004, p. 55); **Sous** *xəržāt* "she went out", *hərbāt* "she ran away" (Destaing 1937, p. 7).

The absence of *-ət* in strong triliteral verbs seems to be characteristic of the southern dialects. Exceptions are found in **Skoura** *ktəbt* "she wrote" (Aguadé and Elyaacoubi 1995, p. 151) and **Sous (Houwara)** *šərbət* "she drank" (Socin and Stumme 1894, p. 22). The same is found in the old data for Essaouira: *žəbrət* "she found" (Socin 1893, p. 164), *ḍərbət* "she hit" (p. 180)<sup>14</sup>. More recently, the suffix is seldom attested, except for a single occurrence in the Jewish dialect of **Essaouira (J)**, *xərzəţ* "she went out" (Lévy 2009, p. 368). The suffix is also found in **Tafilalt (J)** (Heath and Bar-Asher 1982, p. 64).

The prevailing opinion is that the occurrence of *-āt* (3f. perf.) in Moroccan dialects is due to analogy with weak verbs (e.g., *mšāţ* "she went") (Heath 2002, p. 223; Aguadé 2008, p. 291). Regarding the diffusion of the feature, urban centers such as Casablanca, Meknes and Marrakesh may play an important role. For instance, Aguadé interprets the occurrence of *-āt* in **Settat** *xădmāţ* "she worked" (Aguadé 2013, p. 4) as convergence towards the Casablanca variety. In my opinion, regarding southern Morocco, Marrakesh may also have diffused the suffix in the region, given that it is well attested in the city, e.g., **Marrakesh** *səmʕāt* "she listened" (Sánchez 2014, p. 121).

It is not clear whether *-ət* spread earlier than *-āt*. On the one hand, the neighboring Saharan-type varieties do not exhibit the ending *-āt*, as attested in Ḥassānīya *ktəbt* "she wrote" (Cohen 1963, p. 91) or, in the Algerian Sahara, in the Mzāb region: *kətbət* "she wrote" (Grand'Henry 1976, p. 43). On the other hand, the ending *-āt* is found in other parts of the Maghreb: **Eastern Libya** *ik'tib-at* "she wrote" (Owens 1984, p. 105). It is also found in areas neighbouring the Maghreb, as in West Sudanic<sup>15</sup> *katabat* "she wrote" (Owens and Hassan 2009, p. 713).

The fact is that *-āt*, in the Moroccan case, must be a conservative feature, just as in other parts of the Arabic-speaking world. The ending *-at*, with a short vowel, in strong verbs is found in many eastern dialects, not only inside the Arabian Peninsula but also outside it in Bedouin-type dialects (Gaash 2013, p. 49).

#### *3.3. The Clitic -ki (2s.f.)*

The occurrence of the clitic *-ki* (2s.f.) is very common in semiverbs all over Morocco. Heath (2002, p. 242) confirms it, but he did not analyze the use of *-ki* in the possessive function.

In southern Morocco, apparently, we find it with possessive and object functions in two regions exclusively: **Tafilalt**: *š*@*ftki* "I saw you", *gannki ¯* "he told you", *r*@*žn¯ıki* "your feet" (Behnstedt 2004, p. 56), *b. b. aki ¯* "your father", *m*Q*aki ¯* "with you", *wuldki* "your son", *da¯r. ki* "your house", Q*andki* "you have", *šuftki* "I saw you", *gallki* "he told you" (Behnstedt n.d., Notes sur le parler "bedouin" des Qr.ab S@bba¯h. , p. 6); and **Essaouira** *xuki ¯* "your brother", *b. b. aki ¯* "your father", Q*ăndki ul¯ ˘ıyydat? ¯* "do you have children?", *ana k ¯* @*nt hna* Q*ăndki* "I was here at your place", *dyalki ¯* "yours", *nsuww ˘* @*lki* "I will ask you", *y*@*qd.* @*r. i*Q*aw¯ unki ˘* "he will be able to help you", *haki kt ¯ abki ¯* "here is your book" (Francisco 2019b, p. 164); and **Chiadma** *ib. b. aki ¯* "your father", *xuki ¯* "your brother", *m. m. wki* "your mother". The suffix *-ki* with possessive and object functions seems not to be attested in other southern localities though, like in the vernaculars of Marrakesh and Sous, for instance.

Regarding the origin of this feature, I previously claimed (cf. Francisco 2019b, p. 164) that the occurrence of *-ki* in the possessive function in Essaouira and Tafilalt probably resulted from a morphological analogy with semiverbs, given that it is absent in Saharan-type varieties such as Ḥassānīya *nsāk ənti* "he forgot you" (2f.) (Cohen 1963, p. 151) or the variety of the Mzāb region in Algeria, which do not distinguish gender, using only *-ək/-k* (Grand'Henry 1976, p. 67). However, the historical links between southern Morocco and the West African region could provide us with a new hypothesis. It is not unrealistic to think that *-ki* entered Morocco, and remained restricted to the south, due to the slave trade connecting Tafilalt to the sub-Saharan region, given that enslaved boys and girls were brought to Morocco from parts of Western Africa, such as Nigeria and Chad (El-Hamel 2013, pp. 130–31). Moreover, the clitic is found in West Sudanic *buyūt-ki* "your houses (f.)" (Owens and Hassan 2009, p. 712), being clearly a retention. Such a hypothesis would deserve a more in-depth discussion, though.

The exclusive occurrence of *-ki* in Essaouira and Tafilalt could be explained by the linguistic link created by the caravans of Arabic speakers that connected both localities.

#### *3.4. The Suffix -u (pl. imperf.) for Defective Verbs in -i*

This is a retention (Cl. Ar. \**yamšū-na* > *yəmš-u*) in defective verbs ending in *-i*, well attested in Saharan-type dialects, e.g., Ḥassānīya *nəšru* "we buy" (Cohen 1963, p. 103), **Saoura** *imšu* "they go" (Grand'Henry 1979, p. 220), **Mzāb** *yəmšu* "they go" (Grand'Henry 1976, p. 49). In southern Morocco, we find it in **Sous (Houwara)** *ka-yĭbku* "they cry" (Socin and Stumme 1894, p. 16), along with **Sous** *ibnīw* "they build" (Destaing 1937, p. 39); and in **Essaouira** *ta-yĭšru* "they buy" and **Chiadma** *ka-yĭžru* "they run" (Francisco 2019b, p. 103), along with the variant *-īw*. Apparently, the suffix *-u* is not attested in other southern localities, such as **Skoura** *təmšīw* "you go" (Aguadé and Elyaacoubi 1995, p. 48) and **Tafilalt** *ġādyīn nəmšīw* "we will go" (Behnstedt 2004, p. 56), *tānu izīw* "they were coming" (Heath and Bar-Asher 1982, p. 74).

In the south, this feature is evidently of Saharan origin but restricted to Sous and Essaouira, probably owing to the settlement of tribes speaking Saharan dialects, as mentioned in Section 2.

#### *3.5. Future Preverb ba~bġa*

The use of the perf. verb *ba~bġa* "to want" with imperf. verbs to express the future is a structure predominant all over the south of Morocco, from the Atlantic strip to Tafilalt (Heath 2002, p. 217). It is found in **Marrakesh** *ba-yĭžri* "he will run" (Sánchez 2014, p. 182) and **Skoura** *bīt nšūfhŭm* "I will see them" (Aguadé and Elyaacoubi 1995, p. 86). In Essaouira, the verb *ba~bġa* developed into an invariant particle *b(ə)-* expressing the future: *fīn bə-tkūn ġədda?* "where will you be tomorrow?", *b-năhḍṛu dāba ʕăl-lə-bḥăṛ* "now we are going to talk about the sea", *ġədda b-yĭšru l-ḥwāyəž* "tomorrow they will buy clothes" (Francisco 2019b, p. 140). A similar particle occurs in **Sous (Houwara)** *bunnĭmši* "I will go" (Socin and Stumme 1894, p. 54).

The particle is a Hilālī feature attested in other parts of the Maghreb as well. The preverb *ba-* is found in the Sahara, in the Algerian southwest: **Saoura** *ba-iʕərrəs* "he will get married" (Grand'Henry 1979, p. 224); and the particle *b-* is also used to express the future in Bedouin Libyan dialects, e.g., **Al-Khums** *b-yʕāwəd* "he will repeat" (Benmoftah and Pereira 2017, p. 317).

Sánchez (2014, p. 183) mentions the occurrence of the future particle *bā-* in some dialects of Yemen, for which he suggests a common etymology with the verb *ba* "to want" in Marrakesh. Although the Maʕqil tribes are assumed to have come from Yemen, the verb *bġa* is employed in Ḥassānīya to express intention only, not the future (Taine-Cheikh 2004, p. 225). However, Ḥassānīya does apply the same structure with the verb *idōṛ* "to want" as an auxiliary to express the future: *idōṛ iṭīḥ* "he is going to fall" (*ibid.*, p. 224). This way of expressing the future is an innovation common to Hilālī-Bedouin dialects in southern Morocco, and also in the Sahara and other parts of the Maghreb.

#### **4. Discussion**

Taine-Cheikh (2017, p. 26) proposed that the southern Moroccan dialects were similar to the Casablanca dialect according to a list of common features presented above. On the one hand, the southern dialects exhibit traditional Hilālī features, such as the realization [g] for \*q; on the other hand, they also exhibit the loss of interdentals, which are a distinctive trait of Bedouin dialects. Considering the difficulty of classifying these dialects as Hilālī, the selection of features by Taine-Cheikh (cf. Section 2.1) attempted to delimit a group of southern dialects, but it was not able to single these varieties out from other varieties in Morocco. To give a few examples, the following features cited above (cf. Section 2.1) are widespread all over the country: the future particle *ġādi*; the absence of gender differentiation in the 2s. clitic *-k*; and the ending *-īw* for defective verbs (3pl. imperf.).

More recent data, especially from Essaouira and Tafilalt, shed new light on the reality of the southern dialects. Comparing them with the well-known dialects of the region (Sous and Skoura) and with the dialects of the neighbouring Saharan areas (Ḥassānīya and the dialects of the Algerian regions of Saoura and Mzāb) demonstrated that the varieties of the southern region are not as homogeneous as previously thought. The findings revealed a Bedouin coloring in the southern area, owing to the occurrence of retentions and innovations, some of which include Saharan variants.

The Saharan traits attested in the southern varieties co-occur with the variants presented by Taine-Cheikh (cf. Section 2.1). They are the following retentions: the maintenance of the diphthongs /ăw/ and /ăy/ in pharyngealized and plain contexts, sometimes realized as [ō] and [ē]; and the suffix *-u* (pl. imperf.) for defective verbs. Both features are widespread in Bedouin dialects beyond the Moroccan borders. Nevertheless, within the south, the dialect of Skoura is an exception, as it exhibits neither feature. Curiously, there is no record of the suffix *-u* for Tafilalt.

Concerning the conservative trait *-āt* (3f. perf.), the variant is widespread all over Morocco along with *-ət*; however, it proved to be dominant in the south, where the latter is seldom recorded nowadays, except in Skoura. Despite the absence of *-āt* in Ḥassānīya and other Saharan varieties of the region, it does seem to have a Bedouin origin, as the suffix is attested in other parts of the Maghreb (e.g., Eastern Libya) and also in neighbouring varieties, such as West Sudanic. And even though the feature occurs in other parts of Morocco, it can be considered characteristic of southern dialects.

Another representative case is the verb *ba~bġa* "to want" and the particles derived from it (*b-*, *ba-*), attached to imperfective verbs to express the future. This feature is apparently the only variant connecting all the southern dialects and distinguishing them from the dialects of northern Morocco. Since the feature has reflexes in Ḥassānīya and is attested in other varieties across the Maghreb, being recorded even in Yemen, this innovation may be evidence of a common Bedouin or Hilālī origin for the southern dialects, along with traditional traits such as [g] for \*qāf and the occurrence of interdentals at a previous stage.

Within the southern dialects, the clitic *-ki* (2f.), in the possessive and object functions, builds a bridge between Essaouira and Tafilalt, as it has not so far been attested in any other dialect across the country. The retention of the clitic, restricted to these western and eastern extremities of the south, could be explained by the contact established by caravans on the trans-Saharan trade routes linking distant parts of the south with the Sahara and sub-Saharan Africa. This shared past in the south must also have played an important role in the diffusion of the other features mentioned here.

#### **5. Conclusions**

Southern Moroccan varieties proved to exhibit more Bedouin features than previously thought. Colin (1986) tackled the Bedouin-Hilālī issue, pointing out that some dialects could not be classified fully as Bedouin-type because they do not maintain conservative features. That was the case of the southern Moroccan dialects, which were thought not to exhibit Saharan-type features. The linguistic findings presented here reveal the opposite.

In agreement with traditional dialectological scholarship, the occurrence of Saharan-type retentions in southern sites could be explained by the settlement of the Maʕqil in the area, as confirmed in historical sources. Nevertheless, it is important to note that the Maʕqil tribe settled in Tafilalt and on the Atlantic strip centuries apart. Furthermore, there is no linguistic evidence that the Maʕqil tribe, subdivided as it was into distinct groups (*buṭūn*), presented dialectal unity, or, if it did, that it preserved this unity through the centuries until modern times. Therefore, population movements caused by multiple factors, such as the trans-Saharan trade, famines and epidemics that swept southern Morocco, should not be ignored when trying to explain the spread of certain variants.

The only innovative feature that seems to unite the southern dialects, linking them with other Bedouin varieties in the Algerian Sahara and Libya, is the future construction *ba~bġa* "to want" > *b-*/*ba-* + imperf. Apparently, the origin of this feature goes back to the Yemeni particle *bā-*; nevertheless, it is not attested in Ḥassānīya, although that dialect is associated with the Maʕqil tribe, supposedly from Yemen. Despite that, Ḥassānīya attests a similar structure with the same function (e.g., *idōṛ iṭīḥ*, cf. Section 3.5), which may have developed after the one with *ba~bġa* > *b-*/*ba-*. Moreover, the occurrence of the structure (*b-*/*ba-* + imperf.) in other parts of the Maghreb may warn us that not all the features of current Saharan varieties, especially Ḥassānīya, are representative of the "purest" Bedouin or Hilālī type, either in Morocco or in the Maghreb.

The occurrence of certain conservative features in certain parts of Morocco and the Maghreb also corroborates the previous argument. That is the case of the retention of the ending *-āt* (3f. perf.) and the conservative clitic *-ki* (2f.), both absent from Saharan-type varieties.

The current linguistic situation of southern Moroccan dialects highlights the limitations of the 'Hilālī' category, including the 'Maʕqilī' label, in dealing with dialectal layers within Moroccan varieties. The difficulty in applying this category is due to the co-occurrence of features, or variants, of distinct origins in the current dialects. Trying to determine dialectal groups on the basis of the Arabization waves is no longer sufficient, given that speakers of distinct Arabic dialects have been in movement and contact for centuries in the area.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


#### **References**


as-Sūsī, Muḥammad al-Muxtār. 2005. *ʔĪlīġ, qadīman wa-ḥadīṯan*. Ar-Ribāṭ: Al-Maṭbaʕa al-Malikīya. First published 1966.

az-Zayyānī, ʔAbu l-Qāsim bin ʔAḥmad. 1886. *at-Turǧumān al-muʕarib ʕan duwal al-Mašriq wa-l-Maġrib*. Paris: Imprimerie Nationale.

Behnstedt, Peter. 2004. Von an-ʔĀṣər (Al-Qaṣr) nach Īgni (Īgli): Ein Vorbericht zu einigen arabischen Dialekten der Provinz ər-Rašīdīya (Marokko). In *Approaches to Arabic Dialects: A Collection of Articles Presented to Manfred Woidich on the Occasion of His Sixtieth Birthday*. Edited by Martine Haak, Rudolf De Jong and Kees Versteegh. Leiden and Boston: Brill, pp. 47–65.

Behnstedt, Peter. n.d. Notes sur le parler "bédouin" des ʕṛab Səbbāḥ (Tafilalt/Maroc).


Bouwman, Dinie. 2008. *Mali. EALL*. Leiden and Boston: Brill, vol. 3, pp. 135–41.


El-Hamel, Chouki. 2013. *Black Morocco: A History of Slavery, Race and Islam*. New York: Cambridge University Press.


Francisco, Felipe Benjamin. 2022. The Judeo-Arabic of Essaouira revisited. In *Semitic Dialects and Dialectology: Fieldwork—Community— Change*. Edited by Maciej Klimiuk. Heidelberg: Heidelberg University Publishing, in press.

Gaash, Amir. 2013. The Verbal and Nominal Feminine Endings -at and -it in Neo-Arabic. *Zeitschrift für arabische Linguistik* 57: 48–69.

Grand'Henry, Jacques. 1976. *Les parlers arabes de la région du Mzāb (Sahara algérien)*. Leiden: Brill.

Grand'Henry, Jacques. 1979. Le parler arabe de la Saoura (Sud-ouest algérien). *Arabica* 26: 213–28.

Heath, Jeffrey. 2002. *Jewish and Muslim Dialects of Moroccan Arabic*. London: RoutledgeCurzon.

Heath, Jeffrey, and Moshe Bar-Asher. 1982. A Judeo-Arabic Dialect of Tafilalt (Southeastern Morocco). *ZAL* 9: 32–78.

Ibn Khaldun, ʕAbd ar-Raḥmān bin Muḥammad. 2011. *Tārīx Ibn Xaldūn al-musammā bi-kitāb al-ʕibar wa-dīwān al-mubtadaʔ wa-l-xabar fī ʔayyām al-ʕarab wa-l-ʕaǧam wa-l-barbar wa man ʕāṣarahum min ḏawī aṣ-ṣulṭān al-ʔakbar*. Beirut: Dar Ibn Hazim, vol. III.

Kampffmeyer, Georg. 1912. *Marokkanisch-Arabische Gespräche, im Dialekt von Casablanca*. Berlin: Druck und Verlag von Georg Reimer.

La Chapelle, Frederic (de). 1930. Esquisse d'une histoire du Sahara occidental. *Hespéris* XI: 35–95.

Levtzion, Nehemia. 2000. Islam in the Bilad al-Sudan to 1800. In *The History of Islam in Africa*. Edited by Nehemia Levtzion and Randall L. Pouwels. Oxford: Ohio University Press, pp. 63–91.

Lévy, Simon. 1998. Problématique Historique du Processus d'arabisation au Maroc: Pour une histoire linguistique du Maroc. In *Peuplement et arabisation au Maghreb Occidental: Dialectologie et Histoire*. Edited by Jordi Aguadé, Patrice Cressier and Ángeles Vicente. Madrid and Zaragoza: Casa de Velázquez, Universidad de Zaragoza, pp. 11–26.

Lévy, Simon. 2009. *Parlers Arabes des Juifs du Maroc: Histoire, Sociolinguistique et Géographie Dialectale*. 'Estudios de Dialectología Árabe' 3. Zaragoza: Instituto de Estudios Islámicos y del Oriente Próximo.

Marçais, William. 1961. 'Comment l'Afrique du Nord a été arabisée'. In *Articles et Conférences*. Paris: Adrien-Maisonneuve, pp. 171–92. First published 1938.

Miège, Jean-Louis. 1981. Le commerce transsaharien au XIX<sup>e</sup> siècle. *Revue de l'Occident musulman et de la Méditerranée* 32: 93–119.

Owens, Jonathan. 1984. *A Short Reference Grammar of Eastern Libyan Arabic*. Wiesbaden: Harrassowitz.

Owens, Jonathan, and Jidda Hassan. 2009. West Sudanic. *EALL* 4: 708–18.


Socin, Albert. 1893. 'Zum arabischen Dialekt von Marokko'. *Abhandlungen der philologisch-historischen Classe der Königlich-Sächsischen Gesellschaft der Wissenschaften* 14: 151–203.

Socin, Albert, and Hans Stumme. 1894. Der arabische Dialekt der Houwara des Wād Sūs in Marokko. In *Abhandlungen der philologisch-historischen Classe der Königlich-Sächsischen Gesellschaft der Wissenschaften 15*. Leipzig: Bei S. Hirzel.

Taine-Cheikh, Catherine. 2004. Le(s) futur(s) en arabe: Reflexions pour une typologie. *EDNA* 8: 215–38.

Taine-Cheikh, Catherine. 2017. La classification des parlers bédouins du Maghreb: Revisiter le classement traditionnel. In *Tunisian and Libyan Arabic Dialects: Common Trends, Recent Developments, Diachronic Aspects*. Edited by Veronika Ritt-Benmimoun. Zaragoza: Instituto de Estudios Islámicos y del Oriente Próximo, pp. 15–42.

## *Article* **Auditory and Acoustic Evidence for Palatalization of the Nasal Consonant in Cairene Arabic**

**Navdeep Sokhey**

Department of Modern and Classical Languages and Literatures, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; navsokhey@vt.edu

**Abstract:** This paper introduces the palatalized nasal [n<sup>j</sup>] as an allophonic realization of coronal /n/ in Cairene Arabic. The palatalized variants of the phonemes previously described in acoustic and sociolinguistic terms include the alveolar stops [t, d] and their pharyngealized counterparts [tˤ, dˤ], which can be palatalized preceding the high, front vowel [i:]. While previous studies have anecdotally noted that the coronal nasal /n/ can undergo palatalization in the same environment, this variant has not been systematically investigated. Focusing on syllable-final /-ni:/ segments, I first use auditory measures to show that the palatalized variant occurs with some regularity (~50%) in the read speech of seven speakers of Cairene Arabic. Then, I provide acoustic evidence that this perceived difference significantly correlates with the difference in F2 values taken from the onset and midpoint of the vowel following the nasal consonant. There is also evidence of a lexical effect, such that borrowings exhibit less palatalization than non-borrowings. This study contributes data for the unexamined Cairene nasal and supports the likelihood of palatalization of coronals at the typological level.

**Keywords:** palatalization; nasal; Cairene Arabic; sociophonetics; acoustic phonetics

#### **1. Introduction**

This paper presents an acoustic and auditory study of palatalization in the nasal consonant /n/ in Cairene Arabic (CA). While the mechanics of palatalization have not been widely studied across Arabic dialects, the few studies that have explored this phenomenon have centered on the sociolinguistics, phonology and phonetics of the palatalized stops, /t, d/ and the palatalization of their pharyngealized counterparts (Haeri 1996a; Youssef 2013). There exist some informal observations that the Cairene coronal nasal can undergo palatalization, but no work has systematically examined this sound.

Palatalization of coronals generally involves fronting and raising articulatory gestures. Thus, a raised F2 and a lowered F1 are expected, owing to the fronting and raising gestures associated with a shortened front cavity. Previous acoustic literature on palatalization has used F2-F1 as a possible cue for distinguishing between palatalized and non-palatalized consonants (Kochetov 2017; Iskarous and Kavitskaya 2010; Purcell 1979). However, this measure can be problematic when studying nasals, as antiformants are known to have an obscuring effect, rendering F1 measures unusable. Studies concerned with nasals have alternatively examined duration, intensity, or nasal murmurs, and have found the F2 transition alone to be a highly effective cue for establishing palatalization (Recasens 1983; Harding and Meyer 2003; Kerdpol 2012). Upon examination of palatalization in nasals, a linear relationship was established between the formant frequencies at the vowel onset and midpoint, in which F2 at vowel onset varies in relation to the coarticulatorily produced vowel (Sussman et al. 1991; following Lindblom 1963b). Because F2 is particularly reliable in distinguishing plain from palatalized nasals, this paper uses two points along the F2 transition (one at vowel onset and the other at vowel midpoint/steady state) and provides a method to establish palatalization in the CA nasal. Auditorily coded tokens are further analyzed in comparison to triangulate the acoustic findings, therefore providing two types of results: one that indicates the strength/degree of palatalization measured in continuous terms, and the other in terms of frequency or proportions palatalized. This paper provides the first detailed phonetic description of the Cairene [n<sup>j</sup>].

**Citation:** Sokhey, Navdeep. 2021. Auditory and Acoustic Evidence for Palatalization of the Nasal Consonant in Cairene Arabic. *Languages* 6: 190. https://doi.org/10.3390/languages6040190

Academic Editors: Simone Bettega and Roberta Morano

Received: 9 July 2021 Accepted: 5 November 2021 Published: 18 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The participants in this study are seven native speakers from Cairo in their early 20s, and the data include word-final /ni:/ segments elicited from word-list readings. Results show that using the formant transition (F2Onset-F2Midvowel) as a measure effectively distinguishes between plain [n] and palatalized [n<sup>j</sup>]. By contributing new acoustic data for an unexamined variant in Arabic, my results provide a foundation for those interested in conducting further sociolinguistic and comparative work on palatalization across Arabic dialects.
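To make the transition measure concrete, the following is a minimal sketch of how F2Onset-F2Midvowel could be computed per token and summarized by auditory code. It is an illustration only, not the analysis pipeline used in this study: the function names, the token layout and every Hz value are hypothetical, and no direction or size of effect should be read off these invented numbers.

```python
from statistics import mean


def f2_transition(f2_onset_hz: float, f2_mid_hz: float) -> float:
    """Return F2 at vowel onset minus F2 at vowel midpoint, in Hz."""
    return f2_onset_hz - f2_mid_hz


def mean_transition_by_code(tokens):
    """Average the F2 transition separately for each auditory code.

    `tokens` is an iterable of (auditory_code, f2_onset_hz, f2_mid_hz).
    """
    groups = {}
    for code, onset, mid in tokens:
        groups.setdefault(code, []).append(f2_transition(onset, mid))
    return {code: mean(values) for code, values in groups.items()}


# Hypothetical tokens (invented values, for illustration only).
tokens = [
    ("plain", 2100.0, 2300.0),
    ("plain", 2000.0, 2300.0),
    ("palatalized", 2400.0, 2350.0),
    ("palatalized", 2500.0, 2350.0),
]

summary = mean_transition_by_code(tokens)
for code, value in summary.items():
    print(f"{code}: mean F2 transition = {value:.1f} Hz")
```

Keeping the per-token difference as a continuous value, and only then grouping it by the auditory code, mirrors the two kinds of results described above: a gradient measure of degree alongside a categorical count.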

#### *1.1. Articulatory Gestures and the Phenomenon of Palatalization*

The linguistic phenomenon of palatalization is not uncommon among the languages of the world, let alone within the dialects of Arabic. As a speech process involving the production of a secondary articulation, palatalization entails shifting the primary place of articulation towards the palatal region (Kochetov 2011), or the "superimposition of a raising of the front of the tongue toward a position similar to that for **i** on a primary gesture" (Ladefoged and Maddieson 1996, pp. 363–65). In coronal primary articulations, this involves a displacement of the tongue surface, which would have been realized to support movement of the tongue-tip in the non-palatalized production, to a slightly different primary constriction location (Ladefoged and Maddieson 1996, p. 365).

Two general types of palatalization are often discussed: secondary palatalization (Bateman 2007; Hall 2000; Kochetov 2011) and full palatalization (Bateman 2007). Secondary palatalization, also referred to as "tongue-raising" (Bhat 1978), refers to the addition of a secondary, palatal articulation without changing the initial place of articulation, such as [t → t<sup>j</sup>] (Bateman 2011, p. 589). This type is extremely common in the labial, coronal and dorsal places in many languages of the world (Kochetov 2011). Full palatalization (Bateman 2007) can include palatalization to a posterior coronal and to an anterior coronal (Kochetov 2011). A shift to the posterior coronal may result in a non-sibilant sound, e.g., [t, k → c], or a sibilant sound, e.g., [t, k → tʃ]. A shift to the anterior coronal can result in a non-sibilant sound, e.g., [p, k → t], or a sibilant, which is rare, e.g., [p, k → ts], and [t → ts], which is relatively more common (Kochetov 2011, p. 1671).

The most likely phonetic triggers of palatalization are the high front vowel /i/ and palatal glide /j/ (Chen 1973; Bhat 1978; Hall 2000; Hall and Hamann 2006; Hall et al. 2006; Bateman 2007, 2011, p. 596), followed 'at a considerable distance by mid front vowels' (Kochetov 2011, p. 1672). The acoustic similarities between the high front vowel and the palatal glide play a role, but it is also worth noting that different consonants in various languages have been found to have different triggers. At the typological level, however, Kochetov noted the dependencies between triggers and targets, in which coronals are commonly targeted by high vocoids and dorsals by /i/ and other front vowels (Kochetov 2011, p. 7). Bateman noted that if there were only one vowel trigger of palatalization in a language, that vowel should be /i/ due both to its high and front qualities (Kochetov 2011). These findings aid in better understanding palatalization in Cairene Arabic, as the triggers in this language variety follow the aforementioned pattern (discussed in the next section).

In describing the articulatory gesture of palatalization, it is recognized that co-articulation has to do with the occurrence of two different articulations at the same time (Catford 1988, p. 106), and that palatalization thus occurs as a type of coarticulation. In CA, palatalization occurs as phonetic coarticulation, as opposed to a phonemic shift (Youssef 2013). We will thus henceforth treat palatalization of /n/ as a gradient, phonetic feature.

As the data analyzed in this study all contain /n/ in the syllable-final /ni/ position, the articulatory gestures involved need to be unpacked. First, to produce the nasal /n/, the soft palate is lowered, and there is a complete closure in the mouth: between tongue-tip and teeth or teeth-ridge for [n], so that all airflow is shunted through the nose (Catford 1988, p. 74). The realization can be apico- and lamino- articulations against the dental zone and against the front and back of the alveolar zone (Catford 1988, p. 82). Secondly, on producing the high front [i, i:] vowel, Jones' system of a "vowel limit" posits that since the tongue is tense and the dorsal surface is pushed close enough to the hard palate, there is a point at which the approximant [i] turns into a palatal fricative [ʝ] (Catford 1988, p. 125). In other words, if the vowel is 'high enough,' it will ultimately result in a palatal-like production. Articulatorily speaking, [i] and [j] have an identical starting point, and "the highest point of the tongue in front vocoids lies on the front of the tongue, underneath the palatal zone" (Laver and John 1994, pp. 276–77). As the CA vowel [i:] is generally high and tense, it is feasible and likely for palatalization to occur as an assimilatory process (as documented in Youssef 2013, 2015).

During consonant production of the /ni/ segment, the tongue dorsum height and dorsopalatal contact size in the nasal change as a function of the adjacent vowel in the progression [i] > [u] > [a] (Recasens 1999, p. 89; Recasens 1984). As such, on contact with /i/, which is a vowel that requires the raising and fronting of the tongue, /n/ shifts from a coronal to a lamino-alveolar consonant (Recasens 1999, pp. 88–89). This results in a larger contact surface between the tongue and the palate.

Similarly, during palatalization, dentoalveolars such as /n/ undergo tongue dorsum raising and fronting, causing the palatalized coronal to become lamino-alveolar. Thus, there appears to be some overlap in the coarticulatory gestures for producing both the plain and palatalized /ni/ segments. Given the explanation above on Jones' vowel limit, we recognize that the transition from a 'regular' coarticulatory effect of the following vowel to one that is palatalized is gradient, and this is reflected in the auditory coding process, which recognizes that what is counted as palatalized /n/ can be wide ranging (see Section 2.2).

While the displacement of the tongue can have many acoustic and auditory consequences, the observation of the data presented in this study is one of a /j/-like quality in the release of /n/ into /i/, suggesting palatalization, which is further demonstrated through F2-raising in the acoustic analyses.

On coarticulation in VCV contexts, there is evidence that vocalic anticipation is blocked when [i] contributes to the raising of the tongue dorsum in [n] (Recasens 1999, p. 99). Due to the increase in tongue-dorsum constraint when producing /n/, C-to-V carryover (left-to-right) effects are said to be more prominent for [ini] than for [ana] (Recasens 1999, p. 99). However, what is observed to be occurring in CA contradicts this, as it indicates anticipatory (right-to-left) coarticulation, with the influence of /i/ on the preceding consonants. Bladon and Al-Bamerni (1976, p. 148) note that anticipatory coarticulation occurs whenever an articulator is free to anticipate later segments (following Daniloff and Hammarberg 1973), implying a high-level encoding process of scanning ahead, or due to postulating the unit of speech encoding to be an articulatory syllable (consisting of a CV sequence). The triggering effect of /i/ in CA can thus be viewed as anticipatory coarticulation.

Additionally, relevant to our discussion of nasals is their observation by early Arab and Muslim phoneticians. According to Al-Khalil (d.175/791 in Darwish 1967)<sup>1</sup> and Sībawayhi (d.177/793)<sup>2</sup>, the nasals /n, m/ were described as containing nasality (Bakalla 1981, p. 286) and as prone to assimilation, a phenomenon that is widely observed to be intrinsic to the nasal class (Bakalla 1981, p. 286). According to Sībawayhi, the Arabic nasal sounds are produced in a similar way to the modern phonetic description of the nasals, with the complete closure of the air in the oral cavity, and are further described as *munfatiḥ*, or non-velarized, while Ibn Jinni (Ibn Jinni 1954, d.392/1002 in Al Halabi) notes they are *munkhafiḍ*, or with lowering of the tongue body. Sībawayhi discusses *'ikhfā'*, or hidden, *m* and *n*, referring to homorganic assimilation in place of articulation of *n* to the following consonant, e.g., *man jā'a* → *maѪjā'a* (Bakalla 1981, p. 290). This type of assimilation is also present in the observation on palatalization in cases where *n* is followed by the approximant *y* [j], observed by Al-Saqqaf (1999) and Haeri (1996a); below, see Table 1.


**Table 1.** Triggers of palatalization in Cairene (/t, d/ from Haeri 1996a, p. 51; Youssef 2015, pp. 25–27; /n/ from my own informal observations).

<sup>a</sup> While Haeri (1996a) found palatalization in these environments, Youssef (2015) found them to block or lack WP.

While nasality is an accompanying feature that, similar to palatalization, involves lowering of the soft palate so that the air stream passes through the nasal cavity as well as through the oral cavity (Ladefoged and Maddieson 1996, p. 131), the acoustic perception in this study is of palatalization: Coders, who were trained phoneticians, searched specifically for /j/-like qualities in the syllable-final /ni/ segment. Acoustically, while an F1 bandwidth increase is indicative of nasalization, the acoustic results are difficult to interpret (Pruthi and Espy-Wilson 2007) and aerodynamic measures (not employed in this study) are better at capturing these effects. While nasalization is worthy of further examination in a more detailed study of nasals in Arabic, it is not discussed further here, as our scope is limited to the process of palatalization on one of multiple CA coronals undergoing this phenomenon. This paper establishes a premise for further study of the potential spread of palatalization onto other consonants not previously discussed.

#### *1.2. Palatalization in Cairene Arabic*

In CA, the palatalization of the /t, d/ stops and their pharyngealized counterparts has been examined by Bhat (1978), Haeri (1996a) and Youssef (2013). Haeri noted two types of palatalization in Cairene, one of which is termed weak palatalization (WP), which refers to the secondary palatalization described above. The other is termed strong palatalization (SP), and refers to full palatalization. In representing the auditory effects, WP in Cairene Arabic can be represented in IPA as [t, tˤ → t<sup>j</sup>] and [d, dˤ → d<sup>j</sup>], while SP can be described as affrication: [t, tˤ → tʃ] and [d, dˤ → dʒ].

Weak and strong palatalization in CA are triggered when stops are followed by the palatal glide /j/, long /i:/, word-final /i/, as well as by the phonetically lower word-internal or epenthetic short /i/ or by the long mid vowel /ee/ (Haeri 1996a; Youssef 2013). The environments listed in Table 1 were described by Haeri and Youssef as the observed conditions for palatalization in Cairene.

In addition to the aforementioned coronal stops, Geenberg briefly observed in her study on palatalized stops that the Cairene coronal nasal may also undergo palatalization (Geenberg 2012, p. 21). Al-Saqqaf's descriptive work on Hadramawti Arabic (Al-Saqqaf 1999) again briefly mentioned the palatalized Cairene /n/ as a comparative point to Yemeni, and noted that the palatalized Hadramawti /n/ is not limited to the high-front vowel environment, unlike in Cairene. As he noted, "n, which escaped Haeri's attention (Haeri 1992, p. 171), is also among the consonants that become palatalized in the environment of i or ī in, e.g., Ar. women's speech, e.g., inti [inɲtʃi] 'you' f.s., ya'ni [jæʕɲi] 'it means; I mean'" (Al-Saqqaf 1999, p. 95). However, neither Al-Saqqaf nor Geenberg undertook acoustic analyses of the palatalized nasal, despite its presence in multiple speech communities.

The triggers described for WP are relevant to those of the palatalized /n/, as it was observed that palatalized nasals occur in all of the environments for WP, but not necessarily for SP (Geenberg 2012). This is congruent with my own informal observations, although, like Youssef, I did not observe palatalized [n<sup>j</sup>] in all environments listed by Haeri (see Table 1 for [n<sup>j</sup>] triggers). Grammatically, final [i] in Cairene has several roles: a noun-derived adjective (e.g., ʔamrikæ:n-i:, 'American'), and the first-person possessive or object pronoun (e.g., ʔibn-i, 'my son'; istannu:-ni, 'wait (2nd, pl.) for me!'). Word-final /ni:/ may also occur in names (e.g., hæ:ni, ħosni), in other common words such as tæ:ni, 'again,' and extends to borrowed English words such as 'any,' and 'funny,' which commonly occur in the speech of educated Cairenes. Though a gradient, phonetic feature, the palatalization observed in CA is assimilatory and contains an anticipatory coarticulatory effect, since the consonants become more similar in place of articulation to the following vocoids (see Kochetov and Alderete 2011).

#### *1.3. Acoustics of Palatalization*

Acoustically speaking, F2 formant transitions have been widely used as cues to determining nasal places of articulation. Palatalization in particular is often more apparent at the consonantal release than at the formation of the primary constriction, with a higher F2 value at the release (Ladefoged and Maddieson 1996, pp. 363–64). In a study comparing the interactive effects of nasal murmurs, transitions, and release as possible cues for place of articulation, Recasens (1983) noted that examining formant transitions proved useful in determining the palatal nasal's place of articulation, while the nasal murmur was not a useful cue for identifying this sound. It should be noted that the F2 transition may not provide a sufficient place cue for other types of nasal. Bilabial and velar nasals, for example, may rely on other features such as the nasal murmur or quality of the nasal release instead of, or in addition to, using F2 as a place cue (Recasens 1983). In examining formant transitions in the event of coarticulation, formant shapes will vary according to the surrounding vowels (Öhman 1966), but will generally be directed towards the same 'locus' or 'juncture' between the vowel and the sonorant, although F2 loci are not invariant in natural speech (Fant 1973; Kewley-Port 1982; Lehiste and Peterson 1961; Öhman 1966).

In describing the palatalized stops [t, d, tˤ, dˤ] in Cairene, the typical raising/fronting of F2 in these palatalized stops was seen in Youssef's acoustic data (Youssef 2015; from a 38-year-old female speaker), with F2 rising by approx. 390–460 Hz in the palatalized stops compared to their plain counterparts. Note that both the pharyngealized and plain stops can undergo palatalization in Cairene, and that the pharyngealized stops de-pharyngealize, or weaken in their pharyngealized quality, via fronting (Haeri 1996a; Youssef 2015).

Assuming that fronting is a necessary accompanying articulatory gesture for palatalization in CA, and for the reasons stated above, which preclude the use of F1 as a measure for the nasal, this paper makes use of F2 as a cue for tracing palatalization in the presented data generated from word-list readings. The following sections define the auditory and acoustic measures used to triangulate the study of palatalized [n<sup>j</sup>] in CA, and proceed to discuss the implications of the main findings and future directions for the study of this phenomenon.

#### **2. Materials and Methods**

#### *2.1. Participants, Data Collection and Selection*

Speech data gathered from 7 participants were used for the acoustic analyses in this section. They include three women and four men aged 20–27 who are native Egyptian colloquial speakers and were residing in the greater Cairo area at the time of data collection (July 2014; due to the increasingly unstable political situation in Egypt, it became and remains nearly impossible to do social science fieldwork in the country). All speakers were born and raised in Cairo, and had either attended or completed private or public university there.

For the purpose of eliciting controlled, comparable data that can be efficiently measured and analyzed, all speakers were given a wordlist written in Arabic script in dialectal spelling. Of the 40 words, the 20 target words contained the coronal nasal preceding the high front vowel in the word-final /-ni:/ environment, which reflects the phonetic condition for palatalization in CA. A few tokens were noisy and produced unclear spectrograms, and were thus omitted. This yielded a total of 138 target tokens containing word-final /-ni:/ that were analyzed. The preceding vowel was not controlled for, but the words selected were items that I had informally observed to variably contain palatalization. To minimize overstressing/overemphasis of items containing the word-final /-ni:/, 20 other filler words that did not contain the /-ni:/ segment were included alongside the target items. Among the filler words, 6 tokens containing the nasal in the /-na/ word-final environment were produced, as well as 3 nasals in the word-medial /-ne-/ environment, while the rest contained no /n/ consonant at all. Although a small-scale study, this paper provides detailed work to support the anecdotal evidence observed by the aforementioned scholars of variation in the variable (n).

Participants were recorded in spaces with as little noise as possible; recordings took place in either my home or the participant's home, or in a rented meeting room at a local cafe. However, some street noises that permeate the bustling city of Cairo could not be entirely avoided. This is a common challenge for fieldworkers in greater Cairo, where quiet, soundproof recording studios belonging to institutions, if available, have restricted access and are not always practical or possible to use. The speakers were recorded in wav format using a Zoom H1 Recorder (48 kHz) with an external cardioid lavalier microphone (SP-CMC-2), which they were instructed to hold 5–10 inches from their mouths.

#### *2.2. Auditory Coding and Acoustic Measurements*

An auditory coding method was first employed by the author, and 75% of the data was coded by two other linguists: a sociophonetician who is not a speaker of Arabic and a linguist who is a speaker of Cairene Arabic with familiarity of phonetic variation in Arabic dialects. The tokens are coded as 'n' for the plain, unpalatalized nasal, or 'nj' for tokens that contained any degree of palatalization (from lighter to stronger degrees of palatalization). Recognizing that palatalization is a gradient, phonetic feature in CA, the coding originally allowed for three categories: 'non-palatalized,' 'somewhat palatalized,' and 'strongly palatalized,' but the last two categories were collapsed into a larger 'palatalized' category for analysis. The two speakers of Arabic coded the data with 61% agreement, a point further discussed later in this paper. When these two coders did not agree on a token, the coding of the third, non-Arabic speaking phonetician was used in a tie-breaker system to code that token. These auditorily coded items are labeled Auditory Code in the statistical model.
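The coding procedure described above can be illustrated with a minimal Python sketch. The function names (`collapse`, `percent_agreement`, `resolve_codes`) and the five toy labels are hypothetical and are not the study's data or scripts; the sketch only assumes that each token carries one label per coder.

```python
# Illustrative sketch only: function names and toy labels are assumptions,
# not the study's actual data or scripts.

def collapse(label):
    """Collapse the original three-way coding into the binary 'n' / 'nj' scheme."""
    mapping = {
        "non-palatalized": "n",
        "somewhat palatalized": "nj",
        "strongly palatalized": "nj",
    }
    return mapping[label]


def percent_agreement(coder_a, coder_b):
    """Percentage of tokens on which two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of tokens")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def resolve_codes(coder_a, coder_b, tiebreaker):
    """Keep the shared label when two coders agree; otherwise use the third coder."""
    return [a if a == b else t for a, b, t in zip(coder_a, coder_b, tiebreaker)]


# Toy labels for five tokens (hypothetical).
coder_a = ["n", "nj", "nj", "n", "nj"]
coder_b = ["n", "n", "nj", "nj", "nj"]
coder_c = ["n", "nj", "nj", "n", "n"]   # third coder, consulted only on disagreements

agreement = percent_agreement(coder_a, coder_b)
final_codes = resolve_codes(coder_a, coder_b, coder_c)
```

Simple percent agreement is used here because it is the figure reported in the text; a chance-corrected statistic would be a different design choice.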

The acoustic measurements used to determine the cue to palatalization of the nasal in this study were made by obtaining the frequency value of F2 by hand in Praat, measured in hertz (Hz), at two points in the word-final /-ni:/ segment: one point at the release of the nasal murmur into the following vowel (coded: F2Onset), and another at the midpoint of the same vowel (coded: F2Midpoint). For each word, the F2Midpoint was subtracted from F2Onset, and the resulting value is henceforth referred to as "F2Diff". This follows Lindblom (1963a, 1963b), Gibson and Ohde (2007) and Sussman et al. (1991), who found that comparing the formant transition between these two points served as a useful cue in distinguishing between palatal/palatalized and plain alveolars. A larger, negative F2Diff value indicates a steeper, upward CV transition, while a smaller value indicates a flatter, upward transition. To visually illustrate this, the two points measured (F2Onset and F2Midpoint) are marked in each spectrogram in Figure 1, which shows spectrograms from two speakers: one non-palatalizer (left), and one palatalizer (right), saying [hæ:ni:] 'name, *Hanny*'. The left, non-palatalized spectrogram shows F2 coming out of the 'n' closure at a lower point and rising rapidly into the steady-state/midpoint of the [i:] vowel. The right, palatalized spectrogram, however, shows F2 starting at a higher point at [i:] onset, with little to no upward transition into the vowel midpoint. In the acoustic analysis, taking the difference between the two points measured in each vowel offers an alternative way of normalizing for vocal tract length, so no additional vowel normalization techniques were employed.
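As a minimal sketch of the F2Diff measure defined above, the subtraction can be written as follows. The function name `f2_diff` and the Hz values are invented for illustration; they are not measurements from the study.

```python
# Illustrative sketch: the Hz values below are invented, not study measurements.

def f2_diff(f2_onset_hz, f2_midpoint_hz):
    """F2Diff = F2Onset - F2Midpoint, in Hz.

    A rising CV transition (onset lower than midpoint) gives a negative value;
    the steeper the rise, the more negative the value.
    """
    return f2_onset_hz - f2_midpoint_hz


# Hypothetical plain-like token: F2 starts low and rises steeply into [i:].
plain_like = f2_diff(1900.0, 2150.0)        # -250.0 Hz

# Hypothetical palatalized-like token: F2 starts high and barely moves.
palatalized_like = f2_diff(2100.0, 2150.0)  # -50.0 Hz
```

Because both points come from the same vowel of the same speaker, the difference is a within-token quantity, which is the sense in which the text treats it as a form of normalization.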

**Figure 1.** Spectrograms from two women: one non-palatalizer (**left**), showing a lower F2 at the 'i' vowel onset, and one palatalizer (**right**), showing higher F2 at onset, saying [hæ:ni] 'male name, Hanny'.

#### *2.3. Statistical Analyses*

Using the package AFEX, a wrapper for lmer in R (Singmann et al. 2015; R Core Team 2013), the data were first fitted with the following linear mixed effects regression model: F2Diff~AuditoryCode\*Word+(1|speaker). The dependent variable F2Diff is the subtracted value of F2Onset-F2Midpoint, and is expressed as a continuous variable (Hz). The fixed effects are the auditorily coded tokens, coded as AuditoryCode (categorically coded: n/nj) and word (categorically coded: 20 levels). The variable speaker was included as a random intercept to account for individual variation.
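One conventional way of writing out the model formula above is sketched here in LaTeX. The subscript notation is an assumption introduced for illustration: $a(i)$, $w(i)$ and $s(i)$ stand for the auditory code, word and speaker of token $i$, and the Gaussian terms are the standard assumptions of a linear mixed model rather than anything stated in the text.

```latex
$$
\mathrm{F2Diff}_{i} = \beta_0 + \beta^{A}_{a(i)} + \beta^{W}_{w(i)} + \beta^{AW}_{a(i),\,w(i)} + u_{s(i)} + \varepsilon_i,
\qquad u_{s} \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
$$
```

Here $\beta^{A}$ and $\beta^{W}$ are the main effects of AuditoryCode and Word, $\beta^{AW}$ is their interaction, and $u_{s}$ is the by-speaker random intercept.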

#### **3. Results**

#### *3.1. Auditory Results*

The auditorily coded tokens comprise 66 tokens coded as palatalized [n<sup>j</sup>] and 72 tokens coded as plain [n], yielding a total of 47.8% palatalized tokens (Table 2). This is evidence in itself that the palatalized variant is a robust realization of /n/ before /i:/ in CA. Table 2 shows the breakdown of (nj) versus (n) codes by individual speaker. While two of the women show categorical realizations (interestingly, in different directions), most participants produce both variants, and averaging across participants, there is not a compelling difference based on speaker gender (41.67% for women compared to 53.63% for men) in this data pool.
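The overall proportion reported above follows directly from the two token counts. A short sketch of the arithmetic, with a hypothetical helper name `percent_palatalized`, is given below; only the counts 66 and 72 come from the text.

```python
# Illustrative sketch: only the counts 66 and 72 come from the text.

def percent_palatalized(n_palatalized, n_plain):
    """Share of palatalized tokens, as a percentage of all coded tokens."""
    total = n_palatalized + n_plain
    if total == 0:
        raise ValueError("no tokens to summarize")
    return 100.0 * n_palatalized / total


overall = percent_palatalized(66, 72)  # 66 of 138 tokens
print(round(overall, 1))               # 47.8
```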

**Table 2.** Token count and percent palatalized by speaker and gender.


Some of the intra-speaker variation may be driven by lexical effects. Table 3 contains the proportion of palatalization in five of the seven speakers (two speakers were omitted as they were categorically either a palatalizer or non-palatalizer, revealing no by-word variation). The table and visualized data in Figure 2 show that the words produced with the fewest palatalized tokens, i.e., only 1 instance of the (nj) code in each, were *ya3ni* [jaʕni], 'I mean/meaning,' which contains the voiced pharyngeal fricative known to have a formant-lowering effect, and the borrowed words (from English) *funny* and *any.* The word with the highest proportion of palatalized tokens is *kallimiini* 'call me, 2nd, f.sg,' (80%; Table 3). This is unusual, as other words with final /i:ni/ syllables should have had similar coarticulatory patterns, but this is not the case in the current findings. It is otherwise not immediately clear what unites the words that were or were not frequently palatalized, but English borrowings certainly seem to be among the least palatalized.

Grouped by context, words ending in /i:ni/ (*kallimiini, istanniini, warriini, sallimiini*) are somewhat more likely to be palatalized (60–80%), compared to other contexts, such as the /u:ni/ (*istannuuni, bitHibbuuni, biiHibbuuni*), /ani/, /a:ni/ and /Cni/ environments (40–60%). The geminate /nn/ (*mistanni, inni*) and geminate + epenthetic /bb-I/ groups (*teHibbeni, yeHibbeni*) are somewhat less likely to be palatalized (40%) than all other contexts.

#### *3.2. Acoustic Results*

Based on the raw token counts above and thus following the methodology from the auditory analysis, the two speakers who categorically produced either (n) or (nj) 100% of the time were omitted in the acoustic analysis. This controls for any influential points that would have caused errors when examining lexical variation.

Upon examining F2Diff, it is apparent that these values distinguish tokens we heard as plain (n) vs. palatalized (nj). This is displayed in Figure 3, which demonstrates less of a difference between the two points measured in (nj), since the F2Onset starts very high and barely moves to reach the F2Midpoint height of the /i:/ vowel. In contrast, in plain (n), F2Onset starts at a lower point and moves higher into the vowel midpoint, so there is a greater height difference. Thus, a main effect of Auditory Code (n, nj) on F2Diff is observed (F = 4.06, df = 1, 112.67, *p* = 0.01, N = 99). The distinguishing cue appears to be around the 100 Hz mark in terms of the size of the F2 rise: if the magnitude of F2Diff is less than 100 Hz, the auditory quality is likely palatalized, while a magnitude greater than 100 Hz covaries with an auditorily non-palatalized nasal. The average F2Diff is −62.4 Hz for palatalized codes, and −209.5 Hz for the non-palatalized ones.
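The approximate 100 Hz boundary described above can be sketched as a simple decision rule. This is an interpretation for illustration, not a classifier used in the study: it assumes the boundary sits at −100 Hz on the signed F2Diff scale (consistent with the two reported averages), and the names `THRESHOLD_HZ` and `predict_code` are hypothetical.

```python
# Illustrative sketch: the -100 Hz cut-off is an assumed reading of the
# "around the 100 Hz mark" observation, not a fitted decision boundary.

THRESHOLD_HZ = -100.0


def predict_code(f2_diff_hz):
    """Return 'nj' for a flat transition (small F2 rise), 'n' for a steep rise."""
    return "nj" if f2_diff_hz > THRESHOLD_HZ else "n"


# The two group averages reported in the text fall on opposite sides of the cut-off.
mean_palatalized = -62.4
mean_plain = -209.5

codes = (predict_code(mean_palatalized), predict_code(mean_plain))
```

Since palatalization is treated as gradient throughout, a hard cut-off of this kind is best read as a summary of where the two auditory categories separate on average.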


**Table 3.** Token count and percent palatalized by word (includes five of seven speakers); see Appendix A for gloss.

**Figure 3.** Plot of F2Diff (in Hz) by Auditory Code (n, nj) in all speakers. N = 138.

#### **4. Discussion**

This study was motivated by the anecdotal evidence of palatalized [n<sup>j</sup> ] as an allophone of [n] and the absence of any acoustic or auditory study of this sound. The acoustic and auditory analyses on the speech productions of the seven speakers presented here indicate that the two types of production exist and are acoustically distinct from one another in CA.

The first main effect found for Auditory Code on the dependent variable F2Diff (the subtracted value of F2Onset-F2Midpoint) contributes to works surrounding the palatalized nasal in a few ways. First, it suggests that using auditory coding as an approach to measuring palatalized /n/ is reliable, as it correlates with the acoustic measure. This finding, along with previous informal commentaries, confirms that the list of palatalized consonants in Cairene Arabic must be expanded beyond the /t, d/ stops to include the palatalized nasal. Additionally, the linear relationship found in the points utilized in the F2Diff measure also confirms the methodology proposed by Sussman et al. (1991) and Lindblom (1963a) for analyzing nasals, and further supports the use of linear regression as a method of analysis.

The findings from the auditory coding revealed a rater agreement of 61%, which indicates that while auditory coding was effective, palatalized [n<sup>j</sup> ] can be difficult for listeners to code. This may be an effect of perceptual compensation—a type of perceptual bias that ordinarily leads listeners to ignore, or correct for, coarticulatory effects (Garrett and Johnson 2013). It is possible that coders ignored palatalization in instances where the coarticulation was relatively milder, leading to some discrepancies in the coding process. Despite the difficulty in auditorily coding this sound, F2Diff remained effective in distinguishing (n) from (nj) codes, which suggests that the triangulated method employed can be used to measure the degree of palatalization in future studies.

The effect on borrowed words was unexpected, since palatalization had been previously observed in my informal observations on both borrowed words included in this study (particularly in *funny*) in casual speech from speakers with similar educational backgrounds to the participants. Yet, it is known that loanwords can be realized with native or non-native sounds and that topic, speaker- and word-specific sociolinguistic factors can determine the selection of one variant over another (Hashimoto 2019). Some factors that can affect this selection include level of bilingualism (Poplack and Sankoff 1984), degree of linguistic integration (Haugen 1950), and language dominance (Aktürk-Drake 2015, 2017). More recently, Hashimoto's work on tap [ɾ]-borrowing from Māori into New Zealand English uses an exemplar-based approach (Pierrehumbert 2001; Docherty and Foulkes 2014) which posits that exemplars with native sounds and those with non-native sounds are represented in the cognitive system of a borrower and updated based on linguistic experience (Hashimoto 2019). Relevant to our discussion is the idea that exemplars with non-native sounds are stored in relation to a social category associated with the source language and its culture (Hashimoto 2019). In this study, it is possible that the context of the word-list reading task had an effect and, intersecting with the participants' mental representations of the borrowed words, in turn elicited plain [n] as the appropriate selection for the task. Future studies may be interested in further examining the effects of palatalization across speaking contexts, or specifically on loanwords, at greater length.

As for the word *kallimiini* containing the highest proportion of palatalization in the auditory coding, without similar coarticulatory effects in other words from the same environment, the motivation is unclear. There may be socially motivated reasons related to the affective quality of this particular word, potentially combined with sound symbolism. As exemplified in Japanese, a type of palatalization exists that is not phonologically conditioned, but rather contains an iconic function, and is linked with "smallness", "childishness" or "affection" (Nichols 1971; Ferguson; Ohala 1994). Such types of "expressive palatalization" occur cross-linguistically in sound symbolism, diminutive morphology, hypocoristics, and in "babytalk" (Kochetov and Alderete 2011, p. 346). A sociolinguistic perception study on palatalization of /n/ in CA would be enlightening.

Another factor is the prosodic position of the final morpheme /ni/. The list of words analyzed includes words with stress systematically placed on the penultimate syllable, which has an effect of articulatory reduction: the morpheme /ni/ is phonologically /ni:/ with a long vowel, which undergoes vowel temporal reduction, resulting in an undershoot (Lindblom 1963b, pp. 1776–79). In languages with heavy stress, vowel reduction is a characteristic feature, especially in weakly stressed syllables (Lindblom 1963b, p. 1773). In CA, word-final vowels in open syllables are never stressed, except in monosyllables such as *di* 'this, f.' (Youssef 2013, p. 242). Youssef further recognizes the syllable-final reduction in his observation that "Short /i/ triggers palatalization only when it is word final", e.g., *hҼid<sup>j</sup>i* 'he became satisfied,' and *nabҼaӃt<sup>j</sup>i* 'vegetarian' (Youssef 2013). The results observed in this paper follow this observation, and support the finding that syllable-final undershooting does not prevent the occurrence of palatalization in CA.

Additionally, as stated above, the high-front vowel /i/ is reported to be the main trigger of palatalization in CA (see Table 1). As Youssef's findings on weak palatalization of the stops /t, d/ revealed, WP is observed to occur as a phonetic co-articulatory effect of following /i/, since the articulation of the target consonant is affected by the high and front position of the tongue in the production of the following vowel (Youssef 2015). It is further reported that the vowel height of /i/ is a distinguishing articulatory feature of CA, and that there exists a vowel hierarchy in which long [i:] is higher than short [i], and word-final [i] is higher than non-final [i] (Haeri 1996a, p. 57; Youssef 2015).

In addition, the auditory results show that words ending in /i:ni/ are somewhat more likely to be palatalized than words ending in /u:ni/, /ani/, /a:ni/ and /Cni/. This supports the idea of palatalization as an assimilatory process in which surrounding vocoids act as triggers (Youssef 2015; Kochetov and Alderete 2011): in this case, having an i:Ci syllable somewhat increased the likelihood of palatalization.

My data on word-final /-ni:/ show that palatalization occurs nearly 50% of the time in this environment, which is fairly similar to the probability of palatalization Haeri found in the /t, d/ stops word-finally (63%, Haeri 1996b, p. 58). This probability is second to palatalization in the *stop + j* environment (i.e., a stop followed explicitly by the [j] glide and assimilating to its feature: *nadya* [nad.ja → nad<sup>j</sup>.ja] 'name, Nadia') in Haeri's data, with a proportion of 68% (Haeri 1996b, p. 58). If we assume that palatalization of /t, d/ occurs more frequently than that of /n/ (Sokhey 2015), and that the two have similar phonological conditions (see Table 1), then the fact that my data show a proportion of palatalized /n/ that is just below Haeri's data on /t, d/ supports the hypothesis that the word-final environment is one of the most likely triggers of palatalization of /n/. At the typological level, my findings further support the notion that coronals are the most common sounds to undergo at least secondary palatalization in the coronal range (Bateman 2011).

The lower proportion of palatalization observed in /n/ compared to /t, d/ (based on Haeri's and my data) further raises the question of whether a change in progress is taking place, whereby palatalization is spreading from /t, d/ to /n/. Sokhey (2015) hypothesized this based on synchronic, auditorily coded /t, d/ and /n/ tokens, but a historical study tracing the emergence of the palatalized nasal, and/or an acoustic study on the relationship between the palatalized nasal and stops, would be further revealing.

Sociolinguistic categories must also be taken into consideration. While the data presented here do not reveal a gendered pattern, preliminary work using free speech data reported avoidance of [n<sup>j</sup>] by men and more frequent use by women (Sokhey 2015). Given that there is strong evidence that palatalization of /t, d/ is a feature of a sociolect in CA (i.e., palatalization covaries with the larger sociolinguistic categories such as socioeconomic class and gender; Haeri 1996a; Youssef 2015), a sociolinguistic examination of the palatalized nasal is warranted. Haeri found in her work that weak palatalization of the stops /t, d/ is an innovation of upper-middle class women, and Youssef found that strong palatalization (i.e., affrication) of the same stops has been phonologized into the sociolect of a group of speakers who use it to index covert prestige in opposition to the upper classes (Youssef 2015). Decades after the first sociolinguistic work was conducted on /t, d/ in CA, it is not unlikely that weak palatalization has advanced to neighboring consonants. The Cairene nasal, as an available palatalized sound, is an ideal candidate for such ideological extensions.

#### **5. Conclusions**

This paper concludes that the palatalized Cairene [n<sup>j</sup> ] is acoustically distinguishable from the coronal /n/, and examining the CV transition proves to be a useful distinguishing cue. Borrowed English words observed in the word-list data produced fewer palatalized nasals while intervocalic /n/ with both a preceding and following high front vowel appears to render palatalization stronger. Given that the palatalized nasal is not uncommon in Cairene speech today, future studies may examine the sociolinguistic status of palatalized /n/ in relation to the palatalized /t, d/ stops, as well as determine whether palatalization has spread to other consonants. This advancement warrants further examination of the status of the proposed sociolects involving WP and SP—i.e., whether WP continues to covary with upper-classness, and if/how this affects the social status of SP and those who use it to display opposition to upper-classness.

Furthermore, a comparative study of palatalization across other dialects and/or a historical study that traces the appearance of this phenomenon would be a worthwhile and informative study on the progression of social salience that can be used not only to study Arabic dialects, but also to study other speech communities. As a phenomenon that is linked to socioeconomic class in CA, palatalization is a sociolectal feature that requires a culturally appropriate index for measuring social class in order to be studied at a larger scale. While this has not been widely done in Arabic sociolinguistics, some scholars have begun work on this: Haeri created an index for her study of palatalization using a group of socioeconomic indicators with varying degrees of importance (Haeri 1996a). Rania Habib examined socioeconomic indicators in Christian rural migrant speakers in Hims, Syria and found that income, followed by residential area, were the strongest indicators of social class, which differs from the situation in the western world (Habib 2010). In the Gulf (e.g., Bahrain and the UAE), family name and communal background have weighty impacts on the internal evaluations of social status and economic opportunities, but no Arabic sociolinguistic work has considered these variables. It is apparent that socioeconomic indicators differ between Arabic speaking communities, and the aforementioned communities are promising venues for further (and updated) investigation on the intersection of linguistic variation and social class. It is hoped that the work outlined here provides a basis for future studies involving not only palatalization, but further sociolinguistic work within other Arabic speech communities.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of The University of Texas at Austin (protocol code: 2014-04-0121; date of approval: 4 May 2014).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

**Acknowledgments:** This work was made possible by the friends and acquaintances who kindly volunteered their time to participate in this study in Cairo back in 2014. Thank you to Kristen Brustad, Abby Walker, Corinne Stokes, Aarnes Gudmestad, and Katie Carmichael for their support, encouragement and guidance throughout this project and while preparing this manuscript. Thanks to the audiences at Jil Jadid and ASAL Conferences for thoughts and reflections on the earlier iterations of this project. Any shortcomings that remain are my own.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**


#### **Notes**

<sup>1</sup> 'Abu ʿAbd al-Rahman Al-Khalil Ibn 'Ahmad. *Kitab al-ʿayn*. 1967. Ed. by ʿAbdallah Darwish. Baghdad: Matbaʿat al-ʿAni, vol. I.

<sup>2</sup> Sībawayhi, 'Abu Bishr ʿAmr b. d.177/793. In ʿUthman Sībawayhi, *Al-Kitab.* 1889–1900 Repr., Baghdad: Al-Muthanna, n.d. Cairo: Bulaq.

#### **References**


Al-Saqqaf, Abu Bakr. 1999. *The Yemeni Unity: Crisis in Integration. Le Yemen Contemporain*. Paris: Karthala, pp. 141–49.

Bakalla, Muhammad Hasan. 1981. The Treatment of Nasal Elements by Early Arab and Muslim Phoneticians. *Historiographia Linguistica* 8: 285–305.


Docherty, Gerard J., and Paul Foulkes. 2014. An evaluation of usage-based approaches to the modelling of sociophonetic variability. *Lingua* 142: 42–56. [CrossRef]

Fant, Gunnar. 1973. Stops in CV-syllables. *Speech Sounds and Features* 10: 110–39.


Ladefoged, Peter, and Ian Maddieson. 1996. *The Sounds of the World's Languages*. Oxford: Blackwell, vol. 1012.

Laver, John. 1994. *Principles of Phonetics*. Cambridge: Cambridge University Press.


Öhman, Sven E. 1966. Coarticulation in VCV utterances: Spectrographic measurements. *The Journal of the Acoustical Society of America* 39: 151–68. [CrossRef] [PubMed]

Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency. *Frequency and the Emergence of Linguistic Structure* 45: 137–57.

Poplack, Shana, and David Sankoff. 1984. Borrowing: The Synchrony of Integration. *Linguistics* 22: 99–135. [CrossRef]

Pruthi, Tarun, and Carol Y. Espy-Wilson. 2007. Acoustic parameters for the automatic detection of vowel nasalization. Paper presented at the Eighth Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27–31.


Youssef, Islam. 2015. Palatalization in educated Cairene Arabic. *Nordlyd* 42: 21–31. [CrossRef]

## *Article* **A Typological Analysis of Cognate Infinitives in Lebanese Arabic Based on Comparative Semitic Evidence**

**Ana Iriarte Díez**

Department of Near Eastern Studies, University of Vienna, 1090 Vienna, Austria; ana.iriarte.diez@univie.ac.at

**Abstract:** Despite the relatively scarce literature on the topic and the lack of terminological consensus among scholars, Cognate Infinitives (CI) have been found to share formal and functional characteristics across Semitic. The present study provides a description of the formal features of Cognate Infinitives in Lebanese Arabic (LA) based on the analysis of linguistic data gathered through a participant observation method. The novelty of this description lies in its comparative approach, which has been developed in the light of the Semitic evidence available, gathered through a review of the main literature on the topic. The results of this comparative analysis reveal that the grammatical features of Cognate Infinitives in Lebanese Arabic seem to be in line with general Semitic trends that do not, however, always find their parallel in prescriptive descriptions of Cognate Infinitives in Classical or Standard Arabic.

**Keywords:** cognate infinitive; Lebanese Arabic; typology; Semitic languages

#### **1. Introduction**

The existence of Cognate Infinitives within the Semitic continuum has been noted for centuries by various scholars, on occasion resulting in the creation of seminal studies on the topic (Goldenberg 1971; Kim 2009).

However, these scholars' attempts to describe the formal and functional nature of this linguistic feature did not always lead to a consensus as far as terminology is concerned. Different grammatical approaches brought about many distinct nomenclatures for one single linguistic form: *mafʿūl muṭlaq mubham* in Classical Arabic (Al-Zamaxšarī 1870, p. 111); Paronymous Complement in Syrian Arabic (Cowell 1964); Unmodified Cognate Complement in Rural Palestinian Arabic (Shachmon and Marmorstein 2018); Tautological Infinitive in Biblical Hebrew (Goldenberg 1971); Infinitive Absolute in Syriac (Nöldeke 2003); Paronomastic Infinitive in Akkadian (Cohen 2004), etc.

Nevertheless, and notwithstanding the lack of agreement in terminology surrounding Cognate Infinitive constructions, if we are able today to group together this myriad of grammatical labels, it is only because both the formal and functional characteristics of Cognate Infinitives seem to be clear enough to be described, even across Semitic languages.

At the formal level, a Cognate Infinitive construction (CI) is formed by two essential elements: (1) a finite verbal form that functions as the lexical head of a predicate (or 'cognate head') and (2) an infinitive that depends syntactically on and is cognate with the verbal head and stands indefinite and unqualified ('cognate infinitive'). This makes CIs different from Cognate Object constructions (CO), which, albeit similar, present an "infinitive" that is specified, modified, or qualified in a variety of ways.

The distinction between CIs and COs throughout most of the Semitic literature also extends to functional grounds. While COs are known to modify the verb adverbially, the function of CIs has been often described with vague notions such as 'emphasis', 'asseveration', 'contrast', or 'intensification'.

Within the Arabic grammatical tradition, these two concepts have been traditionally studied as two facets of one grammatical category: *al-mafʿūl al-muṭlaq*. The combined analysis of CIs and COs in Classical Arabic, which strongly influenced subsequent analyses in Modern Standard Arabic (MSA), seems to neglect the abundant Semitic evidence of analogous constructions that draw a clear grammatical line between these two structures.

**Citation:** Iriarte Díez, Ana. 2021. A Typological Analysis of Cognate Infinitives in Lebanese Arabic Based on Comparative Semitic Evidence. *Languages* 6: 183. https://doi.org/10.3390/languages6040183

Academic Editors: Simone Bettega and Roberta Morano

Received: 24 June 2021 Accepted: 29 October 2021 Published: 4 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The present study argues for the separation of these two types of cognate structures following Semitic treatments, and takes a first step in this direction by describing the formal features of Cognate Infinitives in Lebanese Arabic (LA), based on the analysis of abundant linguistic data gathered through a participant observation method during a period of four years. The novelty of this description lies in its comparative approach, for it was developed in light of the Semitic evidence available, gathered through a review of the literature on the subject.

Given the scarcity of descriptions of CIs in non-Standard Arabic varieties, adopting a traditional approach for the analysis of the data could misleadingly urge us to read the grammatical features of CIs in LA as 'exceptions' to or 'simplifications' of the Standard norm, since this is the only one that has been thoroughly documented. However, when analyzed alongside comparative Semitic evidence, the data reveal that the grammatical features of CIs in LA seem to be in line with general Semitic trends that, interestingly enough, do not always find their parallel in prescriptive descriptions of CIs in Classical or Standard Arabic.

By using the CIs in LA as a case study, this study attempts to shed light on the general benefits of a cross-Semitic analysis in the study of Arabic linguistics. As was the case with CIs, the comparative Semitic evidence available may very well help researchers elucidate a broader, more inclusive vision of the grammatical nature of specific linguistic features, and assist them in the challenging task of revisiting traditional classifications to ensure accuracy in typological descriptions of Arabic varieties.

#### **2. Materials and Methods**

The present paper draws on an extensive study of CIs in LA, built on a corpus of 133 recorded instances collected over 4 years of participant observation in Lebanon for the author's doctoral dissertation (Iriarte Díez Forthcoming),<sup>1</sup> as well as on a thorough review of the literature on CIs in Semitic languages.

Participant observation is the main method used by anthropologists. Fieldwork through this method requires "active looking, improving memory, informal interviewing, writing detailed field notes, and perhaps most importantly, patience" (DeWalt and DeWalt 2002, p. vii). The author's long stay in Lebanon (almost 10 years) facilitated the 'prolonged engagement' (Lincoln and Guba 1985) necessary for the creation of the CI corpus. This method was chosen because it was the only one able to meet the methodological challenges that the study of CIs presented, namely, (1) their marked interactional and emotional nature, (2) the variety of linguistic, social, and communicative contexts in which they occur, and (3) the extreme difficulty of eliciting them.<sup>2</sup> Furthermore, as in previous studies, participant observation considerably diminished the effects of the so-called observer's paradox (Labov et al. 1968; Milroy 1977).

Comparison is essential for any researcher to acquire a real and profound knowledge of the true nature of a linguistic feature. In fact, comparison is not only useful "to relativize a phenomenon that we tend to consider as outstanding" but also necessary "to understand the role of the specific grammar of a dialect in leading to a type of evolution" (Ibrahim 2011, p. 128). In this spirit, I carried out a thorough review of the relatively scarce literature on CIs. The purpose of this review was to gain awareness of the formal and functional variation that CIs show along the Arabic and Semitic continuums in order to be able to better evaluate (1) the morphological and syntactical factors that effectively differentiated CIs from COs, and (2) the pragmatic and discursive factors that could hold the key to a deeper understanding of the CI's function.

As for the data presentation, for the purposes of this paper, the LA examples gathered in the corpus will be numbered and preceded by the letters LA (Lebanese Arabic) (i.e., [LA.n]). This system will help the reader differentiate CI and CO examples in LA from instances in other Semitic languages, which have been kept as they appear in the source reference and will be numbered and preceded by their corresponding abbreviation (e.g., [BH.3]). The abbreviations used for the different Semitic languages and Arabic varieties are displayed in the following Table 1.



#### **3. Results**

The present section provides a preliminary description of the formal features of CIs in LA in comparison to those of analogous forms in different Semitic languages. Its aim is to illustrate the formal variation that this construction shows across varieties.

As previously mentioned, in Lebanese Arabic, Cognate Infinitive constructions are formed by a finite verbal form that functions as the lexical head of a predicate ('cognate head' or CH) <sup>4</sup> and a less finite verbal form (usually an infinitive) that depends syntactically on and is cognate with the cognate head and stands indefinite and unqualified (**'cognate infinitive' or CI**).

[LA.1] Lebanese Arabic


In contrast, Cognate Object constructions present an "infinitive" that is specified, modified, or qualified in a variety of ways (cognate object or CO).

[LA.2] Lebanese Arabic


'The car took a long detour (lit. The car circled the circle of the bride)'

In [LA.2], the suffix *–a(t)/-e(t)*, which in Arabic may be used to form a noun of single instance (also called *nomen vicis* or *ism al-marra*), modifies the CH to indicate that the action has taken place once. Cognate nouns of single instance in Lebanese Arabic are often qualified, as in [LA.2b], where the noun with the adjective 'fast' modifies the verb adverbially, explaining how the action took place. The noun of single instance may also be made definite by a genitive construction or *iḍāfa*, as example [LA.2c] illustrates.

Both CIs and COs in LA are nominal(ized) elements cognate with a verbal form and canonically stand after the verb. However, CI constructions are formed with a plain infinitive (indefinite and unqualified), expressing some sort of 'emphasis', while CO constructions make use of nouns of single instance, generally marked by the suffix *–a(t)/-e(t)*, that often appear qualified and modify the verb adverbially. This indicates that CIs and COs are distinct grammatical forms, on both formal and functional grounds.

The following subsections will address, in more detail, some of the formal features of CIs in LA in light of Semitic evidence. The morphological features include form and pattern of CIs and pattern correspondence between CIs and CHs. The syntactic features include CIs' syntactic case, CIs' position in the sentence, and the presence of enclitics.

#### *3.1. Morphological Features of CIs in LA in the Light of Semitic Evidence*

#### 3.1.1. Infinitival Form of CIs

I have previously illustrated the formal differentiation between CIs and COs in LA as far as the choice of the 'infinitival form' is concerned. When looking at CI instances in other Semitic languages, the data consistently show a choice of morphological infinitives,<sup>5</sup> which are formally and functionally distinct from the cognate nouns used in CO constructions (the latter being generally identifiable by an *-a/-e(t)* ending). Table 2 illustrates this formal distinction. In the table, both CIs and COs are marked in bold and cognate heads are underlined.


**Table 2.** CI and CO instances in different Semitic varieties.

In Biblical Hebrew (BH), not only is a morphological distinction made between CIs and COs, but the infinitive used for CI constructions is also a special form of infinitive called 'Infinitive Absolute', which contrasts with the Infinitive Construct.<sup>6</sup> The distinction between these two forms is both morphological and syntactic.<sup>7</sup>

In contrast to BH, Classical and Modern Standard Arabic seem to be the most outstanding exception to the Semitic constant that morphologically distinguishes between infinitival forms for CIs and COs. In these varieties, both CIs [CA.1] and COs [CA.2] may be formed with a *maṣdar* (infinitive), although the use of a noun of single instance (NSI) in CO constructions is also accepted [CA.3]:

[CA.1] Classical Arabic (CI instance)

qumtu **qiyām-an**

(PFV-stand.1S stand.INF-ACC)

[lit. I stood standing]

(Al-Zamaxšarī 1870, p. 111; my glossing)


(Al-Zamaxšarī 1870, p. 111; my glossing)


(Sībawayhi: 112; my glossing)

In fact, the shared use of *maṣdar* between CIs and COs was in all likelihood one of the main factors (along with syntactic case<sup>8</sup>) that led traditional Arabic grammarians to group these two phenomena under one single grammatical category, at variance with what we normally find in the descriptions of Semitic languages.<sup>9</sup>

#### 3.1.2. Pattern Correspondence between CIs and CHs

In Semitic, there is in general a high degree of correspondence between the verbal pattern of the CH and the corresponding CI. The majority of the studies on the subject (Goldenberg 1971; Cohen 2004, 2006; Kouwenberg 2010) highlight the existence of an exact pattern correspondence between CHs and CIs in Akkadian. Kim (2006, p. 197) corroborates this fact by compiling 228 examples of CIs throughout the different stages of the Akkadian language, and Finet affirms that this is also true of the Mari dialect (Finet 1952, pp. 21–22). On the other hand, Syriac and BH follow different pattern correspondences, similar to those of Lebanese Arabic.

As for Classical Arabic, Talmon's study on *mafʿūl muṭlaq* occurrences in the Qurʾān shows that 61/64 of the *maṣādir* appearing in CI constructions<sup>10</sup> share both the root and the pattern with their governing verb, reaching an 'almost perfect' pattern correspondence (Talmon 1999).

CIs in Lebanese Arabic generally share their pattern with their CHs when said heads are in pattern I (*faʿal*), II (*faʿʿal*), III (*fāʿal*), and X (*istafʿal*). In the cases of those patterns that carry passive, reflexive, or reciprocal values, such as V (*tfaʿʿal*), VI (*tfāʿal*), VII (*nfaʿal*), or VIII (*ftaʿal*), CHs take the cognate infinitive of their corresponding active pattern. The following examples illustrate pattern correspondence between CHs in patterns VII and V and CIs in their active counterparts, patterns I and II, respectively.



Detailed pattern correspondence between CIs and CHs for the verbs in my LA data is illustrated in Table 3 below.

**Table 3.** Pattern correspondence between CIs and CHs in LA and percentage of occurrence in the corpus.


These apparent 'asymmetries' seem to contrast with the 'perfect' or 'almost perfect' correspondence between CIs and CHs that is claimed to exist in certain Semitic languages.

Nevertheless, exceptions to this apparent 'perfect correspondence' in patterns between CH and CI have also been documented in Classical Arabic (CA) by grammarians, who do note that some finite verbs in specific patterns might govern a *maṣdar* of the same root but a different pattern [CA.4]:

[CA.4] Classical Arabic


(Al-Zamaxšarī 1870, p. 111; my glossing)

Ibn Yaʿīš argues that in these cases, the two forms of the verb carry the same meaning (Al-Zamaxšarī 1870, p. 111). However, Sībawayhi and Al-Mubarrad, among other grammarians, explain the lack of pattern correspondence as a consequence of the elision of the verb.<sup>11</sup>

Interestingly, however, the logic for pattern correspondence followed by LA seems to be in line with that of Biblical Hebrew and Syriac.<sup>12</sup> In these two varieties, as in LA, the CH and the CI generally share the same pattern. However, Syriac passive verbal forms (i.e., *ethpʿel*; *ethpaʿal*; *ettaphʿal*) can take the infinitive of their corresponding active pattern. In the following example [SYR.3], the CH is in *ethpʿel* (the passive counterpart of *pʿal*) and the CI is in the *pʿal* pattern.

[SYR.3] Syriac

**meh. za¯** [CI]"eth. az¯ a¯ [CH] hwat leh s ¯ .úr mtú<sup>m</sup>

"il n'avait jamais vu Tyr" [he had never seen Tyre]

(Ined. Syr. 2, 14 from Duval 1881, p. 332; English translation mine)

As for BH, Gesenius' grammar specifies that with a verb of the derived conjugations, not only the infinitive absolute of the same pattern can be used, but also that of the *Qal* pattern as "the simplest and most general representative of the verbal idea" (Cowley and Kautzsch 1910, p. 345). This is especially common with verbs in the *Niphʿal* pattern, the passive or reflexive form of the *Qal* pattern. In the following example [BH.3], the main verbs are in the *Niphʿal* form while the CIs appear in the *Qal* form.

#### [BH.3] Biblical Hebrew

lōʾ tiggaʿ bô yād, kî **sāqôl** [CI] yissāqēl [CH] ʾô **yārōh** [CI] yiyyāreh [CH]

"They are to be stoned or shot with arrows; not a hand is to be laid on them."

(NIV 2011, Exod. 19:13)

In light of this Semitic evidence, the logic of pattern correspondence between CI and CH in LA can be said to align with systems existing in other geographically adjacent Semitic languages, rather than simply 'deviating' from the idealized CA standard of 'perfect' correspondence.
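The LA correspondence described in this subsection can be restated as a simple lookup. The following is only an illustrative sketch in Python, not part of the original analysis: the function name `expected_ci_pattern` and the two tables are hypothetical, the pairings VII → I and V → II are those named in the text, and the pairings VI → III and VIII → I are assumptions based on the usual derivational relations between Arabic verbal patterns. The sketch captures a general tendency only; it does not reproduce the corpus distribution reported in Table 3.

```python
# Patterns for which the text says the CI generally shares the CH's pattern.
SAME_PATTERN = {"I", "II", "III", "X"}

# Patterns with passive, reflexive, or reciprocal value, mapped to the active
# pattern that supplies the CI.
# VII -> I and V -> II are stated in the text.
# VI -> III and VIII -> I are ASSUMPTIONS (usual derivational pairings).
ACTIVE_COUNTERPART = {"V": "II", "VI": "III", "VII": "I", "VIII": "I"}


def expected_ci_pattern(ch_pattern):
    """Return the pattern generally expected for the CI, given the CH pattern.

    Returns None when the text states no rule for that pattern.
    """
    if ch_pattern in SAME_PATTERN:
        return ch_pattern
    return ACTIVE_COUNTERPART.get(ch_pattern)


if __name__ == "__main__":
    for pattern in ["I", "II", "V", "VII", "X", "IV"]:
        print(pattern, "->", expected_ci_pattern(pattern))
```

Under these assumptions, `expected_ci_pattern("VII")` yields `"I"`, while a pattern for which the text states no rule (e.g., `"IV"`) yields `None`.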

#### *3.2. Syntactic Features of CIs in LA in Light of Semitic Evidence*

#### 3.2.1. Case Marking on CI

Syntactic case is not marked in Lebanese Arabic; therefore, CIs are not marked with any syntactic case in this variety. This is also the case in all the spoken varieties of Arabic as well as in the great majority of Semitic languages, except for Akkadian, Ugaritic, and Classical Arabic, the only Semitic languages that are known to mark syntactic case with enough attested examples of CIs to render a syntactic analysis possible.<sup>13</sup>

Although syntactic case has no bearing on LA, looking at the case of CIs in case-bearing Semitic languages may provide further evidence on the distinct natures of CIs and COs, and it may shed light on the function of the CI across Semitic.

CIs in Classical Arabic (CA), such as [CA.1] and [CA.4], appear in the accusative case, generally marked with the indefinite accusative ending *–an.* In fact, the relevance of the syntactic case for the CA description of CIs is such that Sībawayhi and Al-Mubarrad decided to name this construction after its syntactic case (i.e., *maṣdar manṣūb*, lit. 'accusative infinitive').

Ibn As-Sarrāj observes that the *maṣdar* used to 'strengthen' the meaning of the action has to be in the accusative: [if its only [function] is emphasis, then it [appears] in the accusative, for the nominative is too improbable] (Ibn As-Sarrāj 1985, p. 168; translation mine).<sup>14</sup> Although Sībawayhi had also stated that CIs are accusative, a more detailed reading of his description reveals that he conceived both accusative and nominative as acceptable options and placed the decision in the speakers' hands: [in the same way, any *maṣdar* may be in nominative of their verb if (the verb is) not (syntactically) occupied with another (subject)] (Sībawayhi 1996, p. 229; translation mine).

This flexibility in the case marking of CIs falls in line with the general situation in the Semitic languages. Both *–u(m)/-u(n)* (generally associated with nominative in Semitic) and *–a(m)/-a(n)* (generally associated with accusative in the Semitic languages) appear to mark CIs in case-bearing Semitic languages. However, while accusative seems to be the norm in CA, Akkadian<sup>15</sup> and Ugaritic present quite the opposite situation, as the following examples illustrate:

[AKK.3] Akkadian

[ša i]štu ṣeḫrēku lā āmuru/ **[am]ārum-ma** [CI] ātamar [CH]

"[That wh]ich I have not seen [si]nce I was young I have seen now"

(AbB 11, 34:5-6 from Cohen 2004, p. 108)

[UG.1] Ugaritic<sup>16</sup>

**lʾakm** [CI] ʾilʾak [CH] [**laʾāku-ma** ʾilʾaku]

"I will surely send"

(2.30, 19-20 in Sivan 2001, p. 123)

The predominance of the *-u(m)* ending for CIs' analogous forms in Akkadian and Ugaritic has triggered discussion about the ending's function. While some interpret it as a locative-adverbial case (Rosenthal 1942; Pope 1951; Huehnergard 2012), others prefer to think of it as a nominative (Driver 1956; Finet 1952; Kouwenberg 2017, p. 659),<sup>17</sup> and still others remain neutral on the grounds of a lack of sufficient evidence (Goldenberg 1971; Bordreuil and Pardee 2009).

Be that as it may, a wider review of the literature on CI case marking that observes Arabic varieties as part of the Semitic continuum suggests that CIs are syntactically marked as salient entities and bearers of adverbial meaning rather than objects. This constitutes additional evidence of the formal and functional differentiation of CIs and COs in Semitic varieties and presents the seemingly consistent accusative marking of CIs in Classical and Standard Arabic as an 'exception' within the Semitic continuum.

#### 3.2.2. Presence of Enclitics

CIs in Lebanese Arabic do not present any kind of enclitics, as is the case in the majority of the other Semitic languages. The review of the literature shows, nonetheless, that in Akkadian and Ugaritic, CIs systematically present the enclitic *-ma/-m,* respectively.

In CI constructions in Akkadian (i.e., what scholars refer to as the *parāsum(-ma) iprus* type), the enclitic particle *–ma* often appears attached to the infinitives [AKK.1] [AKK.3].<sup>18</sup> However, while the enclitic *-ma* seems to appear very frequently in Akkadian CIs, it is almost non-existent in COs.<sup>19</sup> This 'emphasizing' particle, according to Huehnergard (1997, p. 325), marks the "logical predicate of a sentence", while for Buccellati (1996, p. 387) *–ma* "is more often than not associated with emphasis of limitation";<sup>20</sup> both are functional features associated with CIs and not with COs.

The situation is similar in Ugaritic, where the so-called paronomastic infinitive (i.e., CI) appears with an enclitic *–m.* Pope (1951, p. 124) suggests that the enclitic *–m* attached to CIs in Ugaritic indicates "merely additional emphasis" and its omission or addition does not affect the meaning perceptibly. Scholars agree that the Ugaritic *–m* is related to the aforementioned Akkadian enclitic *–ma*.

[UG.3] Ugaritic

**mtm** [CI] ʾamt [CH] [mātu-ma/mūtu-ma ʾamūtu]

"verily I will die"

(1.17 VI, 38 in Sivan 2001, p. 124)

The presence of enclitics in CI constructions in Akkadian and Ugaritic—two of the oldest documented Semitic languages—points directly to the correlation between the use of CIs and the marking of logical predicates, limitation, and/or focus. Once again, a cross-Semitic analysis of formal features elucidates the functional nature of the CI that appears to be linked to information structure.

#### 3.2.3. CIs' Position in the Sentence

The syntactic position of CIs is probably the most widely studied formal feature of this construction in Semitic languages. Throughout the Semitic continuum, CIs are found both in post-verbal and pre-verbal positions.

The literature suggests that CIs consistently show a post-verbal position in most Arabic varieties, including Lebanese Arabic ([LA.4] [JA.1] [EA.1] [RPA.1] [CA.1] [CA.4]), with the exception of Sason Arabic [SSA.1, SSA.2] (Akkuş and Öztürk 2017). Functionally speaking, it is worth mentioning that this position in many varieties of spoken Arabic is oftentimes reserved for focus of contrast (Brustad 2000).


CI in a pre-verbal position. Kim (2006) argues that the pre-verbal position is the most common order in all the extinct Semitic languages while Solà-Solé (1961, p. 191) regards the post-verbal position as the original one. In Akkadian, the CI regularly precedes the main verbal form,<sup>21</sup> following the [CI+CH] order in affirmatives and [CI+neg+CH] in negative utterances. This is also the case in the few documented instances of CI in Eblaite [EB.1], Phoenician [PH.1], and Ugaritic [UG.1] [UG.2]:

[EB.1] Eblaite

**pá-kà-ru** [CI] a-pá-kà-ru [CH]

"They should join firmly"

(Lipiński 2001, p. 520)

[PH.1] Phoenician

ʾm **nḥl** [CI] tnḥl [CH] mgštk ʿlk wmgšt ʿly

"If you shall come into possession of it (the money), your share is yours and my share is mine"

(Krahmalkov 2000, p. 210)

[UG.2] Ugaritic

**ydʿm** [CI] l ydʿt [CH] [yadāʿu-ma lā yadaʿta]

"verily you (m.s.) knew not"

(2.39, 14 in Sivan 2001, p. 123)

In the realm of currently spoken languages, an interesting case is that of Sason Arabic, where, in contrast to the post-verbal position of CIs in all other spoken Arabic varieties, both CIs [SSA.1] and COs [SSA.2] are canonically placed before the CH:<sup>22</sup>

[SSA.1] Sason Arabic

şuşa **qarf** [CI] ınqaraf [CH]

'The glass broke a breaking'

[SSA.2] Sason Arabic

babe **fadu-ma** hedi [CO] ınfada [CH]

'The door opened a slow opening'

(Akkuş and Öztürk 2017, p. 3)

However, despite researchers' interest in classifying CIs according to their position, not all languages have clear preferences for pre-verbal or post-verbal positions—what is more, most of those where CIs are fairly well documented often exhibit both. Some examples of the latter are Biblical Hebrew, Syriac, Mandaic, and Northeastern Neo-Aramaic dialects (NENA)<sup>23</sup> where, according to the literature, CIs may be found both pre- and post-verbally. Examples from some of these languages are shown in Table 4:

**Table 4.** CIs in both pre-verbal and post-verbal positions in specific Semitic varieties.


Given that in BH the pre-verbal (and often clause-initial) position of the CI seems to be the most frequent<sup>24</sup> and is often regarded as the "basic structure" (Goldenberg 1971, p. 65), there is a lively scholarly debate as to whether the post-verbal position is syntactically conditioned or not.<sup>25</sup> The situation is similar in Syriac (Hoffmann 1827, p. 341; Duval 1881, p. 333; Nöldeke 2003, p. 235): while some believe that the post-verbal CI expresses a higher degree of emphasis (Nöldeke 2003, p. 236), others find that there is no difference in meaning between the two variants (Duval 1881, p. 332 based on Barhebraeus).

Immersed in this debate, both groups of scholars seem to have overlooked the possibility that pre-verbal and post-verbal CIs could be, in fact, two separate (although closely related) grammatical forms, each with its corresponding function, rather than two different manifestations of one single grammatical form. In the following section, I discuss how a comparative analysis of CIs' position across Semitic languages, such as the one I propose in this study, could shed light on the 'evolution' of both the formal and functional variation of cognate infinitives across Semitic languages.

#### **4. Discussion**

#### *4.1. CI Position across Semitic Varieties: A Practical Discussion*

In their attempt to understand CIs' syntactic variation, many scholars traditionally identified the most frequent position in the varieties they studied, then proceeded to find the formal explanations of what factors may condition the occurrence of the 'exceptions' to the rule they had themselves drafted.

It is worth noticing that different approaches in the literature have inevitably resulted from the nature of the available data in each variety, but also from the feeding influences that diverse grammatical schools may have received and, especially, from the authors' attitudes towards other related Semitic varieties.

Studies on the CI in most Semitic varieties have explored both positions as two different manifestations of the same form (Goldenberg 1971; Cohen 2004, 2006; Mengozzi and Miola 2018). As for Arabic, the few descriptions of CIs in Arabic varieties—where the grammatical tradition establishes the post-verbal position of the *maf*Q*ul mu ¯ t.laq—*hardly ever include examples of the so-called "extraposed" CIs, for they are considered to be, simply, a separate grammatical entity.

One of the few studies that actually includes such examples, Blau's *Grammar of Christian Arabic*, rules out the possibility of extraposed CIs existing in Christian Arabic, and ascribes their occurrence to a Greek-Aramaic interference resulting, presumably, from poor translations (Blau 1967, p. 605)<sup>26</sup>:

[CRA. 1] Christian Arabic

[Arabic-script example not recoverable from the source]

"hear indeed and understand not, and see indeed and perceive not!"

(Blau 1967, p. 604)

Contrary to Blau's opinion, although extraposed CIs are rarely documented, they do seem to be used in Spoken Arabic. The following are some examples from Lebanese and Najdi Arabic:

[LA.ext1] Lebanese Arabic

**weqraye** [ext.CI] kenna neqra qimet sa " a u-ness ben-nh ¯ ar bel-qe ¯ s.as. wer-rway¯ at¯ el-gr˙ amiye ¯

"Notre travail durait environ une heure et demie par jour et consistait dans la lecture d'histoires amusantes et romans d'amour" [our daily work would last around an hour and a half and it would entail reading entertaining stories and romantic novels]

(Feghali 1935, p. 10; translation mine)

[NA.1] Najdi Arabic

**hawaš¯** hawaš-t-ih ¯

rebuking.INF rebuked-I-him

'As far as rebuking is concerned, I have rebuked him'

(Ingham 1994, p. 43)<sup>27</sup>

Oftentimes, these extraposed CIs may appear followed by a *–w* that introduces the CH and contributes to the expression of topicalized enumerations in Lebanese and Egyptian Arabic:

[EA.2] Egyptian Arabic

**bos¯** wi-bosti, **hizar¯** wi-hazzarti, **li**Q**b** wi-liQbti

As for kissing, you kissed. As for flirting, you flirted. As for playing, you played.<sup>28</sup>

(Movie: *El nom f ¯ ¯ı-l-* " *asal*, 'Sleeping in Honey')


As the previous examples show, at least in Spoken Arabic, extraposed CIs seem to function as regular topics. The infinitive in this case is the chosen form for the topicalization of the finite CH.<sup>29</sup> Extraposed CIs could thus be simply considered Infinitival Cognate Topics,<sup>30</sup> as Ingham (1994, p. 43) suggested, which, as infinitives in topic positions, "can be used to encode states of affairs as topics" (Maslova and Bernini 2006, p. 83).

In this sense, it is my impression that, at least in Spoken Arabic, these extraposed CIs, for which I adopt Ingham's term "Infinitival Cognate Topics", have a more accentuated nominal character than that of (post-verbal) CIs. For this reason, perhaps, it is common to find in extraposition those infinitives that have been almost completely nominalized (e.g., *ak*@*l, ra*P@*s., qraye, ¯* etc.).<sup>31</sup> This, along with the function of topic, would differentiate them (but not necessarily isolate them) from the 'canonical' CIs described in this study.

Were the case of Spoken Arabic applicable to other Semitic languages, there would be a possibility that the existence of both pre-verbal and post-verbal positions of the CI is simply a manifestation of two closely related grammatical forms. One would be a reduplication of the verb that has been fronted, and therefore topicalized, while the other would be a reduplication of the verb that focuses on the event expressed by the CH.

In this case, although the joint analysis of Infinitival Cognate Topics and CIs that has often been adopted in the literature—originally based on an excessive concern for the 'form' to the detriment of function—might raise some doubts from a functional perspective, it would be also understandable given the lack of extensive and comprehensive data available in most of the studied varieties.

Another difficulty that might have added to the typological confusion is that the line between the pragmatic notions of topicalization and focus is not only thinner than it seems but also practically imperceptible for scholars working with written texts, who are consequently deprived of any information regarding the communicative contexts of the utterances in question. With so thin a line, it is not surprising that topic and focus sometimes overlap.

In fact, the function of topicalization (normally assigned to pre-verbal CIs) could have overlapped with that of focus (normally assigned to post-verbal CIs) under the umbrella of CIs in languages such as Old Babylonian (Cohen 2004)—which exhibited two functions but only the pre-verbal extraposed position.

#### *4.2. Data, Ideologies and Their Role in the Creation of Typologies: A Theoretical Discussion*

A review of analogous phenomena along the Semitic continuum provides us with invaluable comparative evidence that reveals that the grammatical features of CIs in LA seem to be in line with general Semitic trends that, interestingly enough, do not always find their parallel in prescriptive descriptions of CIs in Classical or Standard Arabic.

Consequently, by taking the analysis of CIs in LA in light of Semitic evidence as a case study, this paper invites the reader to ponder several theoretical and methodological questions of relevance for the field of Arabic linguistics in general and for the typology of Arabic dialects in particular.

#### 4.2.1. Data

Scholars working with written texts, particularly those working with extinct varieties, are often deprived of essential information regarding the communicative contexts of the analyzed utterances (e.g., discursive context, communicative setting, communicative intentions and priorities, speakers' stance, etc.), for the data generally tend to have a performative character, rather than an interactive one.

This leads to the concern that not only the amount but especially the quality, variety, and contextualization of the linguistic data available in certain varieties may be insufficient for a functional approach. Given that most grammatical descriptions of standard norms rely on this kind of data, typologies may also be of a predominantly form-based nature and therefore neglect important functional aspects.

The joint classification of the functionally distinct categories of CIs and COs in Classical and Standard Arabic is just one example of the potential consequences of an excessive reliance on form over function for grammatical description. As the study of CIs across Semitic languages illustrates, the subsequent grammatical labels that arise from such descriptions and that are systematically attributed to analogous linguistic phenomena across varieties may hamper the typologists' task of identifying functional cross-dialectal and cross-linguistic patterns, compromising the linguistic accuracy of typologies.

Consequently, the aforementioned considerations would compel researchers to reflect on the following methodological question: In reconsidering old typologies, what kind of data should be considered for the creation of new typologies?

#### 4.2.2. Cross-Dialectal and Cross-Semitic Approaches for Descriptive Purposes

More often than not, the amount of available data at our disposal might be neither enough nor have the desired quality. While questioning the representativeness of the data behind the literature remains an academically healthy and necessary exercise, the reality is that oftentimes the available tools are relatively scarce and limited.

This, however, does not mean that researchers cannot optimize the worth of these resources. Adopting a comparative approach, even when the purpose of the research is not comparative *per se*, may be an excellent strategy to broaden the researcher's scope particularly when facing the scarcity of available data—thus improving the quality of the description.

The study of CIs in light of a comparative analysis of the data available in Semitic varieties has:


When considering the Semitic evidence for the study of a linguistic feature in a non-standard Arabic variety, both formal and functional patterns of use become more discernible. Catching sight of these patterns provides researchers with a more holistic understanding of the feature in question and enables them to create linguistic models and descriptions with potential cross-dialectal applications.

#### 4.2.3. Ideologies

According to one characterization of the concept of 'language ideologies', these include "speakers' consciousness of their language and discourse as well as their positionality (in political economic systems) in shaping beliefs, proclamations, and evaluations of linguistic forms and discursive practices" (Kroskrity 2004, p. 498; Kroskrity 2000).

Previous studies have revealed that the ideological biases in linguistic scholarship have tangible effects on practices such as linguistic mapping and/or on the interpretation of historical linguistics (Irvine and Gal 2000).

The thorough reading of the Semitic literature carried out for this study also uncovered two different types of ideologies that may have very possibly affected the accuracy of some of the descriptions of CIs, namely (1) ideologies toward the linguistic feature and (2) underlying ideologies toward linguistic varieties and their grammatical traditions:

(1) *Ideologies toward the linguistic feature*: Given its reduplicative character, some scholars have often treated CIs as redundant, literally as mere "ornaments" (Guismondi 1913, p. 65) or as a "purely rhetorical" complementation (Krahmalkov 2000, p. 210). These ideological biases have been enough for some to consider CI features not worthy of systematic analysis. This functional stance is also quite present in the underlying implications of other qualifiers that have been traditionally used in the literature to name CIs, such as *paronomastic*, which implies some kind of pun or play on words, or *tautological*, which directly implies that this infinitive is not necessary and thus "syntactically and pragmatically insignificant" (Callaham 2006, p. 4).<sup>32</sup> These ideologies have had a rather tangible effect on grammatical descriptions of CA and MSA, where, in the name of eloquence, the use of CIs is often said to be appropriate only in cases where the meaning of the action is doubtful or vague. Consequently, expressions such as P*akala* P*aklan* (lit. 'He ate an eating') or *qa*Q*ada qu*Q*udan ¯* (lit. 'he sat a sitting'), although grammatically correct, are considered by some grammarians as 'rhetorically weak', since the meaning of the verbs P*akala* (to eat) or *qa*Q*ada* (to sit) is not in a situation of uncertainty or doubt (Hasan 2009, pp. 326–27).

In sentences such as *t.arati s-samka fi-l- ¯* Z*aww t.ayaranan ¯* (lit. 'the fish flew a flight in the air'), however, the use of the CI is justified by the bizarreness of the meaning (Hasan 2009, p. 327; translation mine). Even a fairly quick look at the available data shows that this description of CIs is not usage-based, but rather ideology-based.

(2) *Ideologies toward linguistic varieties and their grammatical traditions:* In spite of the formally and functionally similar nature of CIs across Semitic languages, certain analyses of CIs across Semitic show traces of ideological biases that can lead to typological inaccuracies.

Goldenberg (1971), for instance, wrote a seminal paper on CIs—which he referred to as Tautological Infinitives—in Biblical Hebrew and presented a classification of this feature with the help of comparative Semitic data. His rather detailed cross-linguistic classification clearly distinguishes "Tautological Infinitives (TI)" (Cognate Infinitives) from "Inner object constructions" (Cognate Object constructions).

Despite the productivity of CIs in different Arabic varieties (illustrated throughout the present study), Goldenberg seems reluctant to include Arabic as one of the languages where Tautological Infinitive constructions occur, for he considers the grammatical concept of '*maf*Q*ul mu ¯ t.laq'* a mere synonym of his notion of 'Inner Object' (CO).<sup>33</sup> Although, in absolute terms, his description does account (albeit briefly and only in the final pages) for "exceptional" examples of what could be type A and type B Tautological Infinitive constructions in different Arabic varieties, one cannot help but notice that the scanty Arabic data are treated and analyzed with a certain skepticism.<sup>34</sup>

The different treatment of Arabic varieties in this classification—despite the abundant examples of CI available in CA and MSA—could be explained by either (a) an excessive reliance on the joint grammatical traditional label of *maf*Q*ul mu ¯ t.laq*<sup>35</sup> or by (b) insufficient research on CIs in the different Arabic varieties (including the spoken varieties) and/or an overgeneralization of the scarce available data.

Be that as it may, both of these factors, probably fueled by ideological biases, may have led to an excessive reliance on formal features (in this case, on syntactic order and case) that, in turn, resulted in typological inaccuracies.<sup>36</sup>

The results of this study suggest that the use and functioning of CIs is fairly similar across Semitic languages, with clearly identifiable patterns of formal variation. The divergence between CIs in different Semitic varieties lies, to a great extent, in the different ideological approaches used for their analysis rather than in their linguistic nature.

Once we, as Arabic dialectologists, acknowledge the profound ways in which language ideologies can shape presumably "objective" linguistic analyses, it becomes imperative to pose the question: In revising the typologies of the grammatical tradition, how can we identify and set aside language ideologies to ensure accuracy in future typological descriptions of Arabic varieties and their shared features?

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of the American University of Beirut (protocol ID: KS1.06).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** I owe a huge debt of gratitude to Kristen Brustad for her always valuable comments and corrections. I thank also Stephan Procházka and the two anonymous peer reviewers for their close readings and helpful remarks. All remaining errors are my own responsibility.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


Kim (2006, p. 223), however, argues the opposite: "As far as the evidence goes, in these languages [Semitic] the tautological and non-tautological infinitives share the same form, supporting the view that the Hebrew infinitive absolute and construct developed from a single form" (Kim 2006, p. 23).


K F.F (indefinite infinitive) and implies hardly anything more accurately definable than the vague \* ("strengthening" or "emphasis") it corresponds apparently to some uses of the inf.-constr. of type C" (Goldenberg 1971, p. 77). Affirming that the claims that regard certain type A TIs in Arabic should "not be regarded as a simple intrusion of a structure completely alien to the nature of the Arabic language" (Goldenberg 1971, p. 78) is as far as Goldenberg goes when discussing the possibility of original type A and B TI examples existing in Arabic varieties.


#### **References**


Bordreuil, Pierre, and Dennis Pardee. 2009. *A Manual of Ugaritic*. Winona Lake: Eisenbrauns.


Driver, Godfrey Rolles. 1956. *Canaanite Myths and Legends*. Edinburgh: Old Testament Studies, No.3.


Guismondi, Enrico. 1913. *Linguae Syriacae: Grammatica et Chrestomathia Cum Glossario: Editio Quarta*. Rome: Excudebat carolus de Luigi.

Harbour, Daniel. 1999. Two Types of Predicate Clefts: Classical Hebrew and Beyond. In *Papers on Morphology and Syntax: Cycle Two*. Edited by Vivian Lin, Cornelia Krause, Benjamin Bruening and Karlos Arregi. Cambridge: MIT Press, pp. 159–75.

Hasan, Abbas. 2009. *Al-Nah. w Al-Waf¯ ¯ı Ma* " *a Rabt.ihi Bi-L-* "*asal¯ ¯ıb Al-Raf¯ı* " *a Wa-L-H. ayat Al-Lughawiyya Al-Muta˘ ¯ gaddida*. Cairo: Dar al-Ma "arif.



Versteegh, Kees. 2014. *The Arabic Language*. Edinburgh: Edinburgh University Press.

Waltke, Bruce K., and Michael Patrick O'Connor. 1990. *An Introduction to Biblical Hebrew Syntax*. Winona Lake: Eisenbrauns.

Watson, Janet. 2012. *The Structure of Mehri*. Wiesbaden: O. Harrassowitz.

Woidich, Manfred. 2006. *Das Kairenisch-Arabische: Eine Grammatik*. Wiesbaden: O. Harrassowitz.

## *Article* **An Innovative Copula in Maghrebi Arabic and Its Dialectological Repercussions: The Case of Copular** *yabda*

**Adam Benkato 1,\* and Christophe Pereira <sup>2</sup>**


**Abstract:** Research on copulas in Arabic dialects has hitherto largely focused on the pronominal copula, and has also mostly ignored Maghrebi dialects. Drawing on published literature as well as fieldwork-based corpora, this article identifies and analyzes a hitherto undescribed verbal copula in dialects of Tunisia and northwestern Libya deriving from the verb *yabda* ("to begin"). We show that copular *yabda* occurs mostly in predicational copular sentences, with time reference including the habitual present and generic future. It takes nominal, adjectival, and locational predicate types. We also argue for broader inclusion of syntactic isoglosses in Arabic dialectology, and show how copular *yabda* crosses the traditional isogloss lines established on the basis of phonology, morphology, or lexicon, and therefore contradicts established dialect classifications such as Bedouin/sedentary or Tunisian/Libyan.

**Keywords:** Tunisian Arabic; Libyan Arabic; copulas; syntactic isoglosses; dialect classification

#### **1. Introduction**

Arabic dialectology has largely focused until now on understanding the geographic distribution of varieties through socio-historical parameters. The traditional dialectological approach to the Arabic varieties of northern Africa ("Maghrebi" varieties) foregrounds a classification scheme which is organized not only along geographical lines but also according to ecological categories ("Sedentary" vs. "Bedouin") and sociohistorical ones ("pre-Hilali" vs. "Hilali") (Caubet 2001; Palva 2006; Pereira 2011, 2018). While certain categories used for classifying Maghrebi Arabic varieties have recently been subject to critique from historical perspectives (Kosansky 2016 on "Judeo-Arabic"; Benkato 2019 on "Bedouin"), it has also been shown that the existing linguistic evidence does not necessarily support the utility of other categories.<sup>1</sup> Similarly, it can be pointed out that the existing classifications rely almost exclusively on phonological and morphological isoglosses, and to a lesser extent on lexical ones. Though neglect of morphosyntax for drawing isoglosses is typical of dialectology in general, the problem is particularly acute in Arabic dialectology in northern Africa. This is not only because morphosyntax is almost entirely ignored, but because regional variation in phonology and morphology can often be rather limited, meaning that dialect boundaries drawn on the basis of a handful of such isoglosses are not strong.

The dialectology of Maghrebi Arabic, therefore, could benefit not only from the continued interrogation of the traditional classification system but also from drawing on a broader set of data that includes previously unexamined linguistic features, particularly morphosyntactic ones. This study, by describing a syntactic feature and examining its consequences for dialectology, aims to show how such work has the potential to change the traditional map of Arabic in northern Africa. It opens by giving a brief overview of the copula in Arabic dialects (Section 2), before proceeding to the description of a hitherto unidentified copula in varieties of Tunisia and Libya (Section 3). The study then discusses the neglect of syntax in Arabic dialectology and shows that syntactic isoglosses may conflict with isoglosses based on other linguistic features (Section 4).

**Citation:** Benkato, Adam, and Christophe Pereira. 2021. An Innovative Copula in Maghrebi Arabic and Its Dialectological Repercussions: The Case of Copular *yabda*. *Languages* 6: 178. https://doi.org/10.3390/languages6040178

Academic Editors: Simone Bettega and Roberta Morano

Received: 17 August 2021 Accepted: 21 October 2021 Published: 26 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **2. Copulas in Arabic**

Copula constructions are to be understood as constructions used to encode the identity of two participants and to express group membership, classification, location, and the ascription of a range of properties to a participant; the element linking these is a copula. It is common to assume that the copula is lexically-semantically empty (Pustet 2003, p. 5) and that its main role is in semantic composition and in carrying tense/aspect (Roy 2013). The generally-accepted major types of copula construction are predicational, equative, specificational, and identificational (Higgins 1979, pp. 204–93; Mikkelson 2011). For our purposes, an equative copula construction is one which equates the referents of the two elements besides the copula (Mikkelson 2011, pp. 1807–8), while a predicational copula construction is one whose subject is referential and whose predicate is some non-verbal element, whether nominal, adjectival, or prepositional (Mikkelson 2011, pp. 1808–9).

While languages vary greatly in terms of what elements provide copulas, and which constructions require overt copulas, we can state the following regarding how copula constructions in Arabic are *typically* viewed.<sup>2</sup> Predicational constructions with present reference usually use a zero copula (1–2). In Tripoli Arabic, copula constructions with zero copula describe facts and express general truths in thetic utterances, serving to present an entity, a proposal or a state of affairs that is new information.<sup>3</sup> In such utterances, a state of being (an inherent or permanent characteristic of a being, as in the first example) or a current activity (including location, as in the second example), considered true by the speaker, is expressed.


In examples (1–2) above, the zero copula is employed in phrases in which the speaker validates the predicative relation. The zero copula thus expresses realis/indicative. Copula constructions of all types which have temporal reference to the non-present, however, require an overt copula, usually provided by a form of the verb *kan¯* /*ykun¯* "be" (3a). Moreover, if epistemic modality is to be expressed, the overt copula *ykun¯* is required (Pereira 2010, pp. 453–67): in copula constructions with *ykun¯* , the predicative relation is to some extent uncertain and the construction thus expresses irrealis/potential (3b).


"Adnan may be in the desert"

To explicitly situate the copula construction in the future, the preverb *h. a-¯* precedes the verb *ykun¯* (Benmoftah and Pereira 2019). In the following example (4), we can compare the use of the zero copula with a generic present reference and the verb *ykun¯* with a future reference.


Moreover, many dialects make use of copula forms in addition to the zero copula and *kan¯* /*ykun¯* copula. For example, in some dialects, such as those in Egypt or Lebanon, present-tense equative constructions in which the complement is a definite noun phrase optionally use a copula based on the 3rd-person independent pronoun (Choueiri 2016) (example 5). Peripheral Arabic dialects go farther and employ the full range of the independent personal pronouns in these constructions (Akkuş 2018, pp. 459–62).


It is worth pointing out that essentially all literature on the copula in Arabic, theoretical or descriptive, has been devoted to either the "typical" copula situation or to the pronominal copula.<sup>5</sup> Other types of copulas in Arabic dialects, especially ones which derive from verbs, have hardly been described. Only very recently have scholars begun to address the existence of other copulas, in particular the use of *ga¯*Q*id*, formally the active participle of "to sit/to stay", as a present-tense predicational copula in varieties such as Maltese and others (Camilleri and Sadler 2019, 2020) (example 6).


Here, we describe for the first time the existence of an additional copula occurring in Arabic varieties of Tunisia and northwest Libya. This copula, supplied by the verb *yabda* (lexically "to begin"), occurs in certain types of predicational constructions. In Section 3, we will analyze copular *yabda* on the basis of representative examples from the well-documented varieties of Tunis (northern Tunisia), Douz (southern Tunisia), and Tripoli (northwest Libya). Since the goal of our study is dialectological in nature, deeper discussion of the grammaticalization path undergone by *yabda* to become a copula will be left aside, and we will concentrate on describing and comparing its function in these three dialects.<sup>6</sup> The geographical range of copular *yabda* and its importance for dialectology will then be discussed in Section 4.

#### **3. Copular** *yabda* **in Maghrebi Dialects**

A copular element consisting of the verb *yabda* in the imperfective conjugation occurs in predicational constructions, mainly those where an overt copula is required. Copular *yabda* mainly occurs in narrative or descriptive contexts to refer to a habitual action, event, or description. It can also be used to refer to a present state and to describe an event which is happening at the moment of speech. Moreover, *yabda* can have a future value. Finally, it is used in addition to *ykun¯* as the auxiliary of the future perfect. So far as we can tell, *yabda* never has a past reference, that is, in the perfective conjugation it is used only as a lexical verb and not as a copula. From a modal point of view, copular *yabda* seems to be used, as opposed to *ykun¯* , when the speaker considers the states and the situations to be true or when the speaker believes that the content of the interrogative sentence can be validated by the interlocutor.
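The distribution summarized in the preceding paragraph can be restated as a small decision sketch. This is only an illustration under simplifying assumptions: the names `Reference` and `yabda_available` are hypothetical, and the modal condition, which the description states tentatively ("seems to be used"), is reduced here to a single boolean.

```python
from enum import Enum


class Reference(Enum):
    """Time reference of the copula clause (labels are illustrative)."""
    HABITUAL = "habitual"
    PRESENT = "present"
    FUTURE = "future"
    FUTURE_PERFECT = "future perfect"
    PAST = "past"


def yabda_available(reference: Reference, speaker_validates: bool = True) -> bool:
    """Return True when the summary above would allow copular yabda.

    Assumption: 'speaker_validates' stands in for the modal condition that
    the speaker considers the state or situation to be true (or expects the
    interlocutor to be able to validate it).
    """
    if reference is Reference.PAST:
        # In the perfective, yabda is described only as a lexical verb ('begin').
        return False
    # For all non-past references, the sketch lets the modal condition decide.
    return speaker_validates


if __name__ == "__main__":
    for ref in Reference:
        print(ref.value, yabda_available(ref))
```

The sketch deliberately says nothing about which dialects allow each use; for instance, the future perfect use is reported only for Tripoli and Tunis, and there *yabda* alternates with the other overt copula.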

While the dialects under discussion all have parallels in the syntax of *yabda*, it should be noted that the phonological or morphological particularities of each dialect do apply to copular *yabda*, without affecting its meaning. For example, the variety of Douz marks gender in the plural verb while that of Tripoli does not; hence Douz has both a 3PL.M *yabdu* and a 3PL.F *yabdan* while Tripoli has only 3PL *yabdu*. Or, the morphophonology of the 3PL may differ: *yabdaw¯* in Tunis but *yabdu* in Tripoli.

#### *3.1. Habitual Present*

In the majority of our examples, copular *yabda* indicates the usual occurrence of a state or situation. In all the following utterances, copular *yabda* has a habitual present value. It can also be used to express a general truth. It appears in main clauses as well as temporal clauses and can occur with adjectival, nominal, or locational predicates.<sup>7</sup>

#### 3.1.1. Adjectival Predicate

The predicate can be adjectival (including passive participles). In the following examples, *yabda* refers to actions or events that take place habitually: indeed, in the first example, it describes an event that takes place every year because of the change of seasons; in the second one, every time a meat dish is cooked according to a particular method; finally, in the third example, every time the family gets together.



"And (each time you cook it) that meat is present in the aluminum foil and is thrown into the middle of the hole"


to talk about general topics and topics concerning the family and so forth"

This is also the case in Tunis Arabic, where *yabda* allows the expression of habitual facts. Without *yabda*, examples (10–11) would have an actual present meaning.<sup>9</sup>


"Nature (i.e., the weather) is cold"

Copular *yabda* also appears in temporal clauses with a habitual present value. The conjunctions (*lamma*, *k¯ıf* and *k¯ı*) refer not to a single, but rather to the habitual, occurrence of an event. The three representative dialects agree in this usage.


The predicate can also be a nominal phrase, in which case copular *yabda* allows the speaker to comment on an event or a fact as it habitually occurs.


"Sometimes she (viz. a divorced woman) is back with children and this is the other big problem"

problem-F DEF-big.F DEF-second-F

In the following utterance, contrary to the previous examples, copular *yabda* is used in Tunis Arabic in an equative construction.


This example shows that *yabda* is required because there is a semantic constraint, in this case the circumstantial *fi š-šmāl* "in the north", which limits the applicability of the claim about what *masfūf* is to a particular region. Otherwise, the equative construction with no overt copula would be used: *əl-məsfūf Ø kisiksi* "*masfūf* (is) couscous".

#### 3.1.3. Locational Predication

Locational predication can also be expressed with copular *yabda*. In this case, the copula complement consists of a prepositional phrase or a locational adverb. The locational predication can have a habitual value (22–23).


"(Each time you cook it) the heat doesn't come (to the meat) from one direction, it is from all directions".


In the following example from Tunis, copular *yabda* expresses locational predication in a temporal clause.


#### *3.2. Future*

Depending on the context, copular *yabda* situates an event or a state in the future, whether it is a question about location or state or a wish or hope about a situation. The three representative dialects agree in this usage.


"God-willing you will be near us and you will watch, you will see"

#### *3.3. Future Perfect*

Finally, followed by a verb in the perfective, *yabda* is also used as the auxiliary verb of the future perfect, indicating a state or situation that is expected or planned to occur in the future. Here, however, *yabda* and *ykūn* can both be used with a variation in meaning that requires further study. This usage only exists in two of the representative dialects: Tripoli and Tunis.


"We will have gone to my grandfather's house, for example, so gathered together we eat bazin together"


*kəmməl-na*

finish\PFV-1PL

"Normally around 2 pm we will have returned home from prayer, so around 3:15 pm we will have finished (eating lunch)".


"On the 21st I can't I will have gone back home".


"All of your old friends will have gotten married but not you".

In Douz, neither *yabda* nor *ykūn* can be used for the future perfect; instead, *ywalli* (lexically "to become") is used.


In Tripoli, the use of *yabda* or *ykūn* for the future perfect seems to break down along the following lines: *yabda* is used when the speaker considers the future state or situation as certain to occur, while *ykūn* in contrast allows for the addition of modality, expressing a supposition or a fictional or probable hypothesis. This aligns, in fact, with the use of *ykūn* for expressing epistemic modality in the present (Pereira 2010, pp. 453–67).


The distinction between *yabda* and *ykūn* seems to be similar in Tunis as well, though this requires further study.

#### **4. Copular** *yabda* **as Isogloss and the Problem of Syntactic Isoglosses**

As shown in the preceding section, copular *yabda* exists in both "northern" and "southern" Tunisian varieties, as typified for this study by the areas of Tunis and Douz, respectively. More generally, according to Tunisian colleagues and colleagues working on other Tunisian varieties, it can be considered a pan-Tunisian feature.<sup>11</sup> In Libya, the only location where copular *yabda* has been documented is Tripoli, though it would be unsurprising if other varieties of northwestern Libya, about which there is little published, also had the feature. The total geographic extent of copular *yabda* is not yet known; but it does not exist in Benghazi or eastern Libyan varieties generally, and there is essentially no documentation of eastern Algerian varieties available for comparison. It is, however, unknown in areas of central coastal Algeria, such as Algiers or Dellys.<sup>12</sup> According to the existing information, therefore, it is a shared feature of the varieties of Tunisia and Tripoli (see Table 1).<sup>13</sup>


**Table 1.** Domains of copular *yabda.*

That these dialects share a linguistic feature, in particular an innovation, is unexpected given the categories and isoglosses typically used in Arabic dialectology. Copular *yabda* crosses not only national boundaries (Tunisia/Libya) but also the pseudo-typological ones most prominent in Arabic dialectology, in particular the categories of "pre-Hilali/Hilali" or "sedentary/Bedouin". Besides the fact that these categories are outdated and problematic from a socio-historical point of view, it must also be pointed out that the collection of features on which they are based almost never includes syntactic features. In Arabic dialectology, syntax plays very little role in discussions of dialect classification. For example, in a recent handbook, the authors note that "syntax will, and we do not constitute an exception in so doing, only be taken into account in a restricted manner, although in this area too significant differences between dialects are present" (Behnstedt and Woidich 2005, p. 68). More generally, recent large projects of regional dialectology, such as the *Wortatlas der arabischen Dialekte* (Behnstedt and Woidich 2011–2021), include phonology, morphology, and lexicon, but not syntax. Even the most recent overviews of Maghrebi dialects (e.g., Aguadé 2018) do not treat syntax. Syntax has received slightly more attention from sociolinguists and contact linguists, but it is typically not used as the basis for regionally organized dialect groupings, nor has it been studied as part of intra-dialect variation in ways comparable to phonology or morphology.<sup>14</sup>

Syntax seems to be neglected in dialectology in general, regardless of language. Even recently, scholars have gone as far as stating that "there is no doubt that syntax has been the most neglected linguistic subsystem in classical dialectology" (Berger et al. 2012, p. 93). On the one hand, this goes back to the fact that traditional dialectological methods, such as the word list and questionnaires, can be unsuitable for describing syntax; on the other, syntax does not necessarily fit the diachronic documentation goal of traditional dialectology, which concentrated on phonological and lexical criteria (Glaser 1996; Werlen 1994). However, this state of affairs has changed quite significantly in certain fields, such as Germanic and Romance dialectology (Kortmann 2010; Berger et al. 2012; Glaser 2013).

Arabic dialectology has largely shared the traditional dialectological emphasis on uncovering archaisms, partially due to its goal of answering questions about the historical origin of Arabic dialects. As with other languages, Arabic dialect groupings have been made primarily on the basis of phonological, morphological, and lexical isoglosses.<sup>15</sup> For example, of the 73 isoglosses used by De Jong (2000, pp. 39–48) to group the Arabic dialects of the Sinai peninsula, only 4 can potentially be characterized as (morpho-)syntactic. Meanwhile, some of these traditional non-syntactic isoglosses may not withstand scrutiny: Embarki (2008) argues, for example, that some of the isoglosses traditionally considered to be strong markers of dialect type, such as the interdental consonants, exhibit too much variation within a single dialect to really be useful discriminants (and see again Guerrero, forthcoming).

This being the case, attention to syntax as part of dialectology has the potential to complexify and even complicate the typical dialect groupings. Indeed, it has been noted that syntactic isoglosses often cross and contradict the established isoglosses based on phonology or lexicon (Poletto 2013). As Glaser (2013, p. 204) puts it, "that geographically conditioned syntactic variation indisputably exists does not entail, however, that the distribution of syntactic variants is identical to the distribution of phonological or lexical variants". For the Arabic varieties under discussion here, this crossing and contradiction can easily be illustrated with a quick look at only a few isoglosses (Table 2).


**Table 2.** Selected Isoglosses in Tunis, Douz, Tripoli Arabic varieties.

The above table considers three phonological, two morphological, three lexical, and one syntactic variable. Each of these categories yields different isogloss lines: in some cases Douz and Tripoli agree (nos. 1, 3, 5, 6), in other cases Tunis and Douz agree (no. 2). An isogloss grouping Tunis and Tripoli can even be found, namely the lack of gender marking on plural verbs (no. 4). Of course, many of these features are shared with dialects beyond these three and so only serve to connect two of the three with each other, but not to separate them out from surrounding dialects. Copular *yabda* is not only an isogloss connecting Tunis, Douz, and Tripoli, but also one that separates them out from other Maghrebi dialects.
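The pairwise pattern described above can be tallied mechanically. The following Python sketch is an illustration only: it encodes just those isoglosses whose grouping is stated in the prose (nos. 1–6 and copular *yabda*), and the dictionary labels and the function name `pairwise_agreement` are assumptions introduced here. The remaining rows of Table 2 are left out because their grouping is not spelled out in the text.

```python
from collections import Counter
from itertools import combinations

# Varieties that pattern together for each isogloss, as stated in the prose.
# The labels are illustrative; rows of Table 2 not described in the text are omitted.
AGREEMENT = {
    "no. 1": {"Douz", "Tripoli"},
    "no. 2": {"Tunis", "Douz"},
    "no. 3": {"Douz", "Tripoli"},
    "no. 4": {"Tunis", "Tripoli"},
    "no. 5": {"Douz", "Tripoli"},
    "no. 6": {"Douz", "Tripoli"},
    "copular yabda": {"Tunis", "Douz", "Tripoli"},
}


def pairwise_agreement(agreement):
    """Count, for every pair of varieties, the isoglosses on which they agree."""
    counts = Counter()
    for group in agreement.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(group), 2):
            counts[pair] += 1
    return counts


counts = pairwise_agreement(AGREEMENT)
for pair, n in sorted(counts.items()):
    print(pair, n)

# Isoglosses shared by all three varieties in this partial list.
shared_by_all = [name for name, group in AGREEMENT.items() if len(group) == 3]
print(shared_by_all)
```

On this partial tally, Douz and Tripoli pattern together most often, while copular *yabda* is the only listed item shared by all three varieties.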

This raises the question of how much weight a syntactic isogloss should have as part of a group of multiple different isoglosses. While phonological and lexical isoglosses are typically more valued by dialectologists, and more frequently available in the published literature, Chambers and Trudgill (2004, pp. 96–100) note that there is evidence that "grammatical variables stratify speech communities much more sharply than do phonological and lexical variables", suggesting that regions delimited by grammatical isoglosses will be more strongly regarded as different dialect areas than regions separated by mostly phonological and lexical ones.<sup>16</sup> Moreover, there seems to be agreement that grammatical isoglosses delimit larger areas than phonological or lexical ones. In this regard, one would think that copular *yabda* and other syntactic isoglosses should actually have a fair amount of weight when it comes to drawing up-to-date subgroups of Maghrebi Arabic.

Proponents of the traditional dialectological view might note that *yabda* is relatively new in the history of the Arabic varieties in question and that, as an innovation, it only represents the spread of a particular feature in very recent history and therefore does not affect the traditional classification. But we would counter that copular *yabda* is not necessarily all that new, as it is already attested in Tripoli at the end of the 19th century.<sup>17</sup> Moreover, an innovative feature that is well-attested in a fairly significant region should be the concern of dialectologists, and future research should attempt to account for its history and present distribution. For example, did copular *yabda* jump between urban areas, slowly spreading into the rural areas between them? Or did it radiate out from a particular area where it was first innovated? Why has it, seemingly, not been accompanied by the spread of non-syntactic features?

If we are dealing with the spread of a syntactic innovation in the Arabic varieties of a particular region, then we indeed have to think less about the traditional classifications, which attempt to explain how the distribution of Arabic may have looked centuries ago, and more about processes of inter-dialectal contact and diffusion. And it is here that copular *yabda* may also make a contribution, since studies of inter-dialectal contact in Arabic have typically focused on what happens when different dialects come into contact in urban environments, rather than looking at the diffusion of a feature over a large region. These studies also typically focus on phonological and morphological variables, rather than syntactic ones. Meanwhile, general studies of convergence do typically focus on morphosyntax, though in most cases they deal with totally different *languages* rather than different *varieties* of a language. Copular *yabda* may represent a case of a syntactic innovation being spread through dialect contact over a large region, giving rise to a dialectal version of a "linguistic area", that is, the "outcome of diffusion of structural 'patterns' across language boundaries" (Matras 2011, p. 146). In that case, it may be one example of area formation in Arabic dialects, and indeed one that does not follow national boundaries but instead crosses them. And again here, syntax is important, since, as is clear from Table 2 above, the diffusion of copular *yabda* seems, so far as can be seen, not to have been accompanied by the diffusion of other linguistic features. It thus speaks to interaction *between* Maghrebi dialects that cannot be seen simply by looking at areas like phonology or lexicon. Future research should therefore look to morphosyntax in search of other features which (unexpectedly) link Tunisia and northwest Libya, or characterize other dialect areas in general.

#### **5. Conclusions**

In this study, we have attempted to describe the occurrence of a verbal copula in certain types of predicational, and less frequently equative, constructions in dialects of Tunisia and northwest Libya. This copula is provided by the verb *yabda*, lexically meaning "to begin", and occurs in predicational constructions which require an overt copula, both in the present and future, including constructions with temporal or modal implication. This can be illustrated succinctly with a final example, taken from social media, where the generic predicational construction with zero copula (35a) contrasts with the temporal construction requiring an overt copula (35b) which is supplied with a form of *yabda*.


Importantly, however, the *yabda* copula is attested in a number of dialects, including three dialects—Tunis, Douz, and Tripoli—which are not closely linked in the traditional dialectological classifications. As a syntactic isogloss, *yabda* crosses the isoglosses drawn from other linguistic levels, ignoring national and typological boundaries, exhibiting behavior seen in syntactic isoglosses more generally. While our study has only been able to use currently existing material to suggest what the rough area contained by the *yabda* isogloss may be, additional data from locales in between these three representative locations may be able to help us define that area more precisely, and, in addition, potentially show if there are transitional areas as well. More importantly, copular *yabda* requires explanations that do not draw on the traditional historical classifications for Arabic dialects, but look to diffusion, area formation, and above all contact. We suggest that syntactic features should play a larger role in Arabic dialectology, and including more of them in the lists of isoglosses drawn on for classification has the potential to complexify and even reshape our understanding of the distribution of Arabic dialects and the processes which continue to shape them.

**Author Contributions:** Both authors wrote and revised the article together and have joint responsibility for all sections. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank Zeineb Sellami for sharing and discussing unpublished data with us, as well as Marwa Benshenshin and the anonymous reviewers for their valuable suggestions and corrections.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Notes**


attribution). We have therefore left these attestations out of our analysis. This is not to suggest, however, that copular *yabda* has not now spread to some regions of southern Libya; but there are unfortunately no studies which can confirm this as of yet.


#### **References**


Heine, Bernd, and Tania Kuteva. 2002. *World Lexicon of Grammaticalization*. Cambridge: Cambridge University Press.

Hengeveld, Kees. 1992. *Non-Verbal Predication: Theory, Typology, Diachrony*. Berlin: Mouton de Gruyter.


Ritt-Benmimoun, Veronika. 2014. *Grammatik des Arabischen Beduinendialekts der Region Douz (Südtunesien)*. Wiesbaden: Harrassowitz.


## *Article* **Connecting the Lines between Old (Epigraphic) Arabic and the Modern Vernaculars**

**Ahmad Al-Jallad**

Faculty of Theology, University of Groningen, 9712 CP Groningen, The Netherlands; a.m.al-jallad@rug.nl

**Abstract:** This paper investigates three linguistic features—wawation, the 1CS genitive clitic pronoun, and the relative pronoun—that are shared between the ancient epigraphic forms of Arabic and modern dialects, to the exclusion of Classical Arabic. I suggest that these features represent the earliest linguistic layer of the modern dialects.

**Keywords:** historical linguistics; Arabic dialectology; Arabic epigraphy

#### **1. Introduction**

It has been widely recognized that the diverse forms of spoken Arabic today do not descend in a linear manner from the literary Arabic of medieval prose and poetry conventionally termed Classical Arabic, or from the language of the Quranic Consonantal Text (QCT), Old Ḥigāzī (for the most recent appraisal, see Holes 2018a, pp. 1–28; Al-Jallad 2020b, chps. 4 and 5). Indeed, when viewed through the lens of the comparative method, many modern Arabic vernaculars exhibit features that are more archaic than their Classical Arabic counterparts. Na'ama Pat-El (2017) has skillfully identified a number of such features in her article "Neo-Arabic and Comparative Semitics". Clive Holes has also done pioneering work on pre-Islamic relics in the modern vernaculars of the Gulf, especially in the realm of the lexicon (Holes 2018b, pp. 112–32). Van Putten and Benkato (2017) isolated relics of an earlier stratum of Arabic in loans in Awjila Berber that is distinct from the present-day dialects of Libya. And I have suggested that the phonology of the emphatics of pre-Hilalian Maghrebian Arabic may be connected to the pre-Islamic dialects of the Levant (Al-Jallad 2015). The existence of these features implies that an unidentified stratum of Arabic that failed to achieve written form in the early Islamic period contributed to the formation of modern vernaculars.

This essay explores the possibility that such ancestors may be attested in the pre-Islamic epigraphic record. Before approaching this question, however, it is important to recognize two things. First, the modern vernaculars never existed in a vacuum; they have experienced considerable contact with the literary register, which has contributed significantly to their lexicons and to their grammatical structure. Second, interdialectal contact has led to an amalgamation of grammatical features in living speech, ones that originate in different times and places. An obvious example of this is the verb *šāf* "he saw", which is nearly pan-Arabic today. *šāf*, although presently widespread in the Maghreb, was likely a late introduction through inter-dialectal contact (Aguadé 2018, p. 57). It is absent in Maltese, which became isolated from the Arabic sprachraum by the 13th century, and is not used in several pre-Hilalian dialects. These only know *ṛa*. The same applies to the Levant. There, *šāf* is the primary verb used to express "to see" in Lebanon, yet Cypriot Arabic, which originates on the Levantine coast and became isolated from the Arabic-speaking world by the 13th c. CE, does not use this etymon. Instead, it employs two verbs for "to see": *ra* (Proto-Arabic \*raʾaya; Classical Arabic raʾā; Borg 2004, p. 214) and *kišʿe* (Qəltu qašaʿ; Borg 2004, p. 388). The latter is fossilized as a presentative in Damascene Arabic, *šaʿ* (Souag 2016). While it is clear that Cypriot Arabic shares a common ancestor with the dialects of the Levant, in the intervening centuries since its isolation, a new verb for "to see" spread as a result of contact with other dialects, in this case perhaps northern Arabian ones.

**Citation:** Al-Jallad, Ahmad. 2021. Connecting the Lines between Old (Epigraphic) Arabic and the Modern Vernaculars. *Languages* 6: 173. https://doi.org/10.3390/languages6040173

Academic Editors: Simone Bettega and Roberta Morano

Received: 15 September 2021 Accepted: 15 October 2021 Published: 20 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Likewise, Cypriot Arabic does not know the pseudo-verb *bVdd-* "to want" and instead makes use of a verb derived from the root *rwd*, *piri* (< \*birīd "he wants"; Borg 2004, p. 256; cf. Classical Arabic yurīdu). This it shares with the Qəltu dialects, while the modern dialects of the Levantine coast employ *bVdd*. The latter may also find its source in the North Arabian dialects, where "to want" can be expressed with the prepositional phrase *(i)b-widd-*PN, or simply with *widd-*PN, literally meaning "in PN's wish" and "PN's wish", respectively. If we employ an archaeological metaphor, a dialect area, such as the Levant, can be regarded as an archaeological section. The layers would reflect different chronological strata of contact-based features and local innovations. While *šāf* and *bVdd* may reflect relatively late layers, this paper is interested in identifying the very earliest linguistic strata in the modern vernaculars.

Almost all who have discussed Arabic's past begin its historical period with the Quran and the nearly contemporary oral poems, passed on traditionally from *rāwī* to *rāwī* until achieving written form in the 8th–9th centuries at the earliest. The Quran itself is far from a linguistic unity. It minimally comprises a consonantal text, *rasm*, which reflects the local dialect of the Ḥigāz, while the reading traditions imposed upon it draw on various 7th and 8th c. varieties. The combination of these two linguistic types sometimes produces features that may never have been used in spoken language (Van Putten 2021, §3.4; Al-Jallad 2020b, pp. 57–72). Likewise, the oral poems can provide us with a glimpse of the performance language of that particular tradition, but we cannot know how much the odes changed over time as they were passed from generation to generation. Finally, their linguistic unity is little more than an assumption rather than a demonstrable fact. No one has yet, as far as I know, engaged in a truly comparative examination of the poetic tradition's language on its own terms.

Another corpus suitable for comparison exists: pre-Islamic epigraphy.<sup>1</sup> These texts, which are carved in nearly half a dozen scripts, offer both advantages and disadvantages. To begin with the latter, the inscriptions do not belong to a living tradition. While the researcher has the work of early Islamic philologists to rely upon when approaching the Qaṣīdah odes and the Quran, the meaning of the pre-Islamic inscriptions must be reconstructed. However, with a proper comparative approach, and with due attention to archaeological and historical contexts, one can be confident about the meaning and grammar of a large part of the corpus. Nevertheless, the consonantal Semitic scripts that encode these ancient Arabic vernaculars provide us with a very limited view of their phonologies and morphology.

These materials come with advantages as well. We can be sure that their language was not filtered through later, Classicizing traditions. They reflect a register of Arabic used at the time they were produced, and since many are simple graffiti, they likely reflect something close to the vernacular of their writers. The pre-Islamic inscriptions, moreover, stretch much further into the past than the pre-Islamic odes, as far back as the middle of the first millennium BCE if not earlier, and cover a wider geographic area, spanning from the Syrian desert to the Yemeni frontier.

How, then, can this corpus aid in the understanding of the linguistic history of the Arabic vernaculars? The answer is not straightforward. In some cases, we may posit a direct developmental trajectory between a phenomenon attested in the ancient sources and its modern counterpart, but in others, similarities may point towards parallel developments in the history of the language. The following pages will identify three features that the modern dialects share with the ancient epigraphy to the exclusion of normative Classical Arabic. I would suggest that these are reflective of the earliest linguistic layer of present-day vernacular Arabic.

#### **2. Wawation**

Proto-Arabic inherited the Proto-Semitic case system with only a few changes, including the emergence of a new declension (Huehnergard 2017; Al-Jallad and van Putten 2017; Al-Jallad forthcoming), but the case system began to disappear in several ancient dialects of Arabic at approximately the turn of our era, mainly concentrated in the Nabataean realm (Corriente 1976; Blau 2006). The first stage of this process appears to have been the loss of final short vowels and then the loss of *nunation* (tanwīn), which resulted in a new set of final vowels in triptotic nouns. While a couple of inscriptions attest a functional declensional system in this state, the majority situation generalizes the nominative ending in all syntactic positions.<sup>2</sup> This feature, conventionally termed *wawation*, is encountered not only in the Nabataean inscriptions, but wherever one finds triptotic Arabic names in the Aramaic inscriptions of the first millennium BCE and the first half of the first millennium CE. Perhaps the earliest attestation of this feature in the Aramaic script is found in the 5th c. BCE votive inscription of Qaynu son of Guśam king of Qaydar at Tell Maskhūṭah, Egypt (Rabinowitz 1956). Wawation is attested continuously throughout the centuries in northern Arabic dialects, appearing on the anthroponyms and tribal names in the Namārah inscription and even in 6th c. CE Arabic inscriptions from Syria and North Arabia (Al-Jallad forthcoming).

Tell Maskhūṭah (5th c. BCE)

C *zy qynw br gšm mlk qdr qrb l-hnʾlt*

"That which **Qaynu** son of Guśam has offered to han-ʾIlat (the goddess)"

Namārah inscription (S. Syria) (328 CE)

*w-mlk ʾl-ʾšryn w-nzrw w-mlwk-hm w-ḥrb mdḥgw*

"He ruled the two Syrias and **Nizāru** and their kings and waged war upon **Maḏḥigu**"

Ḥarrān inscription (S. Syria) (568 CE)

*ʾnʾ šrḥyl br ṭlmw*

"I am Šaraḥīl son of **Ẓālimu**"

The distribution of ancient *wawation* is as follows: with a few exceptions, it appears on triptotic anthroponyms and on Arabic proper nouns. It does not attach to names terminating with the feminine ending *-at*, nor does it attach to diptotic names belonging to patterns such as *fuʿal*, *ʾafʿal*, and *fVʿlān* or names defined by the article. It is reasonable to assume that this distribution applied to nouns as well, although it is impossible to prove as there are so few examples of Arabic prose written in the Classical Nabataean script. JSNab 17, an Arabic inscription carved in the Nabataean script from Madāʾin Ṣāliḥ (dated 267 CE; Fiema et al. 2015), marks all triptotic nouns with wawation, including definite forms: *ʾlḥgrw* = ʾal-Ḥiǧr, the ancient name of Madāʾin Ṣāliḥ, and *ʾlqbrw* = ʾal-qabru 'the grave' (Fiema et al. 2015). While wawation does not apply to anthroponyms with the definite article, as in the name *marʾalqays* (= imruʾulqays), which is always written *mrʾlqys* and never *mrʾlqysw*, its application appears to have been extended in the realm of nominal morphology, at least in some varieties.
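The distribution just described can be restated as a small decision rule. The Python sketch below is only a schematic restatement under simplifying assumptions: the class `Name`, its boolean fields, and the flag `extended_to_definite` (standing for varieties such as that of JSNab 17) are labels invented here for illustration, and the "few exceptions" mentioned above are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A name or noun in consonantal transliteration (illustrative model)."""
    consonants: str            # written form without any final w
    triptotic: bool = True     # False for diptotic patterns such as fuʿal
    feminine_at: bool = False  # True if the form ends in feminine -at
    has_article: bool = False  # True if the form is defined by the article


def takes_wawation(name: Name, extended_to_definite: bool = False) -> bool:
    """Return True if the form is expected to be written with final w."""
    if not name.triptotic or name.feminine_at:
        return False
    if name.has_article and not extended_to_definite:
        return False
    return True


def spell(name: Name, extended_to_definite: bool = False) -> str:
    """Append w to the consonantal form when the rule applies."""
    suffix = "w" if takes_wawation(name, extended_to_definite) else ""
    return name.consonants + suffix


# Forms cited in the text, with features assigned as described there.
print(spell(Name("qyn")))                                  # triptotic name
print(spell(Name("hnʾlt", feminine_at=True, has_article=True)))
print(spell(Name("mrʾlqys", has_article=True)))
print(spell(Name("ʾlqbr", has_article=True), extended_to_definite=True))
```

The last call corresponds to the extended system of JSNab 17, where definite triptotic nouns are also written with final *w*.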

The *u* termination is also encountered in the modern Arabic vernaculars of southwest Arabia, concentrated in the Yemeni Tihāmah, extending as far north as the dialect of Balqarn (Behnstedt 2016, p. 81; Greenman 1979; Alqahtani 2015). Nouns terminating in a non-etymological *u* have a distribution virtually identical to anthroponyms terminating in *waw* in the ancient inscriptions: it is restricted to triptotic nouns and does not occur on nouns with the feminine ending -*at*. The striking congruence of both of these systems motivated Blau (2006) to compare them directly. While he stops short of suggesting a genealogical relationship between the dialects of Southwest Arabia and the ancient North Arabian dialects, the particular sequence of changes required to produce a nearly identical distribution at both ends of the ancient Arabic sprachraum does suggest that the feature may share a common ancestor.

The Southwest Arabian dialects, however, attest an important difference. There are some dialects where wawation is in complementary distribution with tanwīn. The former appears in pause and the latter in context. Nöldeke was the first to hypothesize that the Nabataean *w* had developed from *-un*, but in these Tihāmī dialects we see the process in action. The asymmetric situation is rare, isolated to a few dialects of the ʿAsīr (Behnstedt 2016, p. 81). Rather, most dialects of the area have generalized one form. Those on the Tihāmī coast have generalized *u* while most in the ʿAsīr have only the nunated ending, either *un* or *in*. Thus, as Blau (2006) suggested, the following relative chronology appears secure (Figure 1):


**Figure 1.** Stages in the development of wawation.

Those dialects exhibiting the *baytu/baytVn* opposition appear to be more archaic than the Nabataean situation at first glance, but this may simply be an accident of attestation. Since most of the nouns attested in Nabataean occur in an Aramaic linguistic setting, it may be the case that their attested forms are pausal. While there is no direct evidence for the preservation of nunation in Nabataean inscriptions, a clue might be found in the Nahal Hever papyri, which are first c. CE legal documents from the Dead Sea area. The Arabic noun for "contract" is attested with an otiose final *nūn*, *ʿqdn*. Although Yardeni (2014) suggested that this could possibly be a first person pronominal suffix, it would make little sense in this context. Rather, one could carefully hypothesize that it be interpreted as an ad hoc writing of the context form, with nunation. An even earlier example of functional nunation is attested in a widely known yet unpublished inscription from the Taymāʾ area. The text, carved in an oasis North Arabian alphabet, was authored by the king of Dūmat (mod. Dawmat al-Jandal) and can be dated to the middle of the 6th c. BCE based on its reference to the Babylonian king Nabonidus. All non-pausal, non-construct, and non-diptotic nouns terminate in a *nūn*.<sup>3</sup>

#### **The Bsrn inscription**

*ʾn : bsrn : ʿbd : nbwnʾd : mlk : bbl : nẓrt : h-ġnm : b-mʾtn : frsn : w-mʾtn : rkb : ʾbl*

'I am Bsrn servant of Nabonidus king of Babylon; I have guarded the spoils with a cavalry unit and a unit of cameleers'

The phrase *mʾt frs* "cavalry unit" is widely attested in the Safaitic inscriptions, which are about half a century later (Macdonald 2014). The appearance of *nūn*s in this inscription suggests that the two words do not form a genitive construction but rather a noun and adverb, *bi-miʾatin farasan*. The final word of the inscription, *ʾbl*, lacks a *nūn*, perhaps suggesting that it is a pausal form.

This distribution could indicate that both the ancient northern Arabic dialects and those of southwest Arabia share a common ancestor that had undergone the changes described above. Over the passage of time, each group altered the asymmetric *pausal* vs. *context* distribution by generalizing one form. The *u* termination was eventually favored in Nabataean and the Tihāmah while the nunated form was favored elsewhere. Some varieties of Nabataean further generalized wawated forms to the definite declension as well, producing the situation we find in JSNab 17.

If the genealogical connection between these two dialect groups is correct, then it may suggest that an ancient dialect of Arabic similar to what is attested in the Bsrn inscription moved south sometime in the first millennium CE and replaced the pre-Arabic languages of the ʿAsīr and Tihāmah.<sup>4</sup> We should further note that Nabataean Arabic and the dialects of southwest Arabia differ in the form of the definite article, *al* and *am* respectively. Thus, it is possible that the definite article of the ancestral dialect to both was *han-*, as attested in the Tell al-Maskhūṭah inscription. This morpheme split into *ʾal-* in the north and *ʾam-* in the south (on the chronology of the Arabic article, see Al-Jallad 2021) (Figure 2).


**Figure 2.** Evolution of wawation in Nabataean, ʿAsīrī, and Tihāmī Arabic.

Wawation is today not only attested in southwest Arabia. It is also found in the Levant and Mesopotamia, where it is realized as *u* or *o*, depending on the dialect. There it has a much more restricted distribution: the feature is found on high frequency kinship terms, such as Levantine Arabic *ʿammu* "paternal uncle", *ḫālu* "maternal uncle", *sīdu* "grandfather", *ǧaddu* "idem.", and on feminine nouns, *ḫāltu* "maternal aunt", etc. In northern Mesopotamia the *u/o*-termination applies only to masculine kinship terms, while feminine nouns terminate in *-a*; in Mardin, feminine vocative nouns terminate in -*e*. This distribution speaks against viewing the suffix as a third person masculine singular clitic; there would be no reason that it should be restricted to masculine nouns. Grigore (2007, p. 203) suggested that, at least for the dialect of Mardin, the termination could have a Kurdish source, but Procházka favors a Semitic origin as its distribution extends far beyond the areas in which Persian or Kurdish influence would seem possible (Procházka 2020, pp. 95–96). If I may go further, I would suggest, given the broader Arabic context, that the *u/o*-termination is a reflex of wawation as attested in Nabataean and in the southwestern Arabic dialects. The distribution in the Mesopotamian dialects matches the situation in Nabataean: it does not apply to nouns terminating in the feminine ending. The etymology of the feminine *-a* remains unclear. Perhaps Grigore (2007, p. 203) is correct to see a connection with Kurdish. While the masculine wawated form would have had an Arabic origin, speakers could have understood it as the same morpheme as the Kurdish vocative ending in a bilingual setting. The absence of any marking on feminine kinship terms perhaps motivated the borrowing of the Kurdish feminine ending to produce an etymologically mixed paradigm nearly identical with the Kurdish vocative paradigm.

The Levantine dialects appear to have extended the domain of wawation through analogy, appending the suffix to the female counterparts of male kinship terms; a similar extension of nunation occurred in Classical Arabic as Van Putten (2017) convincingly reconstructs the feminine ending as diptotic in Proto-Arabic.

The Levantine situation may, therefore, reflect a continuation of ancient Nabataean-type wawation, which survived marginally while the rest of the nominal system shifted, either through contact or through internal development, to favor the non-wawated paradigm. The early 6th century CE Arabic inscription from Jebel Usays<sup>5</sup> already demonstrates that the local Levantine dialects of Arabic had dispensed with wawation on personal names and nouns; thus, it is already possible at this point that the feature was restricted to kinship terms. It is not surprising that kinship terms would preserve older layers of morphology, and so this solution, if correct, would provide a unified analysis of wawation across Arabic.

To conclude, the linguistic stratum of wawation in the Levantine and northern Mesopotamian dialects, the ancient dialects of the southern Levant, and the modern Tihāmī and ʿAsīrī dialects would appear to share a non-Classical Arabic common ancestor with this distinct declensional profile.

#### **3. 1CS Genitive Clitic Pronoun**

The next feature I would like to consider is the 1CS genitive clitic pronoun. In all forms of Arabic, the shape of this pronoun is dependent upon the termination of the noun to which it attaches, as in other Semitic languages, but its distribution can vary from dialect to dialect. The pronoun has two allomorphs: *-ī* and *-ya*.

| Language | \**ya* | \**ī* |
|---|---|---|
| Classical Arabic | conditioned: following long vowels and diphthongs (*ʿalay-ya*) | conditioned: following short vowels or consonants (*kitāb-ī*) |
| Gəʿəz | unconditioned (*hagaré-ya*) | |
| Ugaritic | conditioned: *y* = /ya/, gen + acc singular, and other nouns; on prepositions | conditioned: ø = /ī/ on nominative singular + fem. pl. nouns |
| Phoenician | conditioned: *y* = \*/ya/, genitive nouns | conditioned: ø = /ī/, nominative + accusative |

Some contemporary Arabic dialects, most notably those spoken in North Africa, employ the *\*ya* allomorph following certain prepositions: Maghrebian *liya* "to, for me"; *biya* "in/by me", in contrast to normative Classical Arabic *lī* and *bī*, respectively. This distribution may in fact not be innovative. Various Quranic reading traditions produce such forms, but perhaps more importantly, the *rasm* itself demonstrates that this allomorph was in existence and had a much wider distribution.

**Quran**

69:19

فَأَمَّا مَنْ أُوتِيَ كِتَابَهُ بِيَمِينِهِ فَيَقُولُ هَاؤُمُ اقْرَءُوا كِتَابِيَهْ

*fa-ʾammā man ʾūtiya kitāba-hū bi-yamīni-hī fa-yaqūlu hāʾumu qraʾū kitāb-iyah* "and whosoever has received his record in his right hand will exclaim: Behold! Read aloud **my record**"

69:20

إِنِّي ظَنَنْتُ أَنِّي مُلَاقٍ حِسَابِيَهْ

*ʾinnī ẓanantu ʾannī mulāqin ḥisāb-iyah* "I had thought that I would surely face **my doom**"

69:28

مَا أَغْنَىٰ عَنِّي مَالِيَهْ

*mā ʾaġnā ʿannī māl-iyah*

"**My wealth** has not availed me"

In Sūrat al-Ḥāqqah, the termination *iyah*, where the final *h* should be understood as *hāʾu s-sakt*, i.e., a pausal *h* following a short vowel, is used on nouns that are syntactically nominative (*māliyah*) and accusative (*kitābiyah* and *ḥisābiyah*). The employment of the *ya* allomorph in these contexts is certainly motivated by rhyme, but there are other places in the Quran that demonstrate that its conditioning environment was slightly different from normative Classical Arabic. The vocative expression in Quran 12:84, يا أسفى, is read by Ḥafṣ as *yā ʾasafā* and by al-Kisāʾī as *yā ʾasafē*, translated as "woe to me" (lit. O my woe). Q 5:31 attests a similar construction, يا ويلتى, Ḥafṣ *yā waylatā*, al-Kisāʾī *yā waylatē*. The *alif maqṣūrah*, read by Ḥafṣ as *ā* and al-Kisāʾī as *ē*, reflects the outcome of an original triphthong, \**yā ʾasafa-ya* > *yā ʾasafē* (Old Ḥigāzī; al-Kisāʾī) and *yā ʾasafā* (Ḥafṣ) (Al-Jallad and van Putten 2017, pp. 113–14). Thus, these expressions preserve a situation where Arabic deployed the *ya* suffix following a short /a/, the accusative. Finally, in agreement with the modern North African varieties, the first person clitic following the preposition *li-* is sometimes realized as *ya*, depending on the reading tradition. Ḥafṣ reads لي as *liya*, for example, in Q 36:22.

The pre-Islamic Arabic inscriptions also attest a different distribution of the -*ya* allomorph. The Safaitic inscription BES15 799 attests a construction that is identical to the Quranic use of the -*ya* allomorph in the vocative.<sup>6</sup>

#### **BES15 799**

*wgd sfr bny f ṯql ʿl-bny w ql ḫbly*

"he found the inscription of Bonayy and was weighed down (by grief) on account of Bonayy and said: woe to me (*ḫabla-ya* lit. O my woe)"

The use of the *-ya* allomorph following the short high vowel /i/ is also attested in the pre-Islamic corpus. A Thamudic D inscription from the northern Ḥigāz attests this allomorph following the preposition *bi*.<sup>7</sup>

#### **UdhThamD 1 = JSTham 213**

*rbt śq by ʿ{l} kn ʾmt śkrn*

'There is much longing **in me** (*biya*) for Kn the maidservant of śkrn.'

Finally, the Dumaitic inscription WDum 3 = WTI 23 attests the *-ya* allomorph on a noun which is syntactically in the genitive case. Its presence implies that the genitive ending was still productive in this stage of the language.<sup>8</sup>

#### **WDum 3; WTI 23**

*h rḍw w nhy w ʿtrsm sʿd-n ʿl-wdd-y*

'O Ruḍaw and Nuhay and ʿAttarsamē, help me in the matter of my wish (*widādiya*)'

The combination of these facts indicates that the Proto-Arabic distribution of the *ī* and *ya* allomorphs of the 1CS genitive pronoun was different from normative Classical Arabic. Rather, its appearance following short /a/ (the accusative in vocatives) and short /i/ (following prepositions like *li* and *bi*, and the genitive in Dumaitic) indicates a distribution similar to Ugaritic. Thus, we can reconstruct the Proto-Arabic situation as such:

**Nouns**

- Nom: \**gamal-ī*
- Gen: \**gamali-ya* (Attested: Dumaitic; relics: QCT)
- Acc: \**gamala-ya* (Relics: vocative in QCT and Safaitic)

**i-vowel prepositions:** \**li-ya*, \**bi-ya*

**Long vowels + diphthongs:** \**ʿalay-ya*, \**yadā-ya*
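The conditioning in this reconstruction can be restated as a small lookup. The Python sketch below is only an illustration of the paradigm as listed: the function name, the environment labels, and the hyphenated ASCII-plus-diacritics transcription are assumptions made for the example, not part of the reconstruction itself.

```python
# Illustrative only: names and environment labels are hypothetical.
# The forms follow the reconstructed Proto-Arabic paradigm listed above.
NOUN_ENDINGS = {
    "nom": "-ī",    # *gamal-ī
    "gen": "i-ya",  # *gamali-ya (attested in Dumaitic; relics in the QCT)
    "acc": "a-ya",  # *gamala-ya (vocative relics in the QCT and Safaitic)
}


def proto_arabic_1cs(host, environment, case=None):
    """Attach the reconstructed 1CS genitive clitic to a host form."""
    if environment == "noun":
        if case not in NOUN_ENDINGS:
            raise ValueError("nouns need case 'nom', 'gen', or 'acc'")
        return host + NOUN_ENDINGS[case]
    if environment in ("i_preposition", "long_vowel_or_diphthong"):
        # The host already ends in i, a long vowel, or a diphthong.
        return host + "-ya"
    raise ValueError("unknown environment: " + environment)


if __name__ == "__main__":
    examples = [
        ("gamal", "noun", "nom"),
        ("gamal", "noun", "gen"),
        ("gamal", "noun", "acc"),
        ("li", "i_preposition", None),
        ("bi", "i_preposition", None),
        ("ʿalay", "long_vowel_or_diphthong", None),
        ("yadā", "long_vowel_or_diphthong", None),
    ]
    for host, environment, case in examples:
        print("*" + proto_arabic_1cs(host, environment, case))
```

Under these assumptions, the only environment that selects *-ī* is the nominative noun; every other host takes *-ya*, which is the asymmetry that normative Classical Arabic later levelled after short vowels.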

In this light, modern vernaculars that exhibit forms such as *biya* and *liya* continue the ancient situation, while Classical Arabic is innovative in its generalizing of the -*ī* ending to these prepositions. As one reviewer of this paper pointed out to me, the quality of the vowel of the preposition in the Maghrebian varieties suggests that its immediate ancestor was long, *liya* < \**līya*. Maghrebian Arabic generally loses etymologically short vowels, except in unstressed function words, where they are reanalyzed as long, e.g., the third masculine plural pronoun *hūma* < *hum*. Thus, an original \**liya* would have plausibly yielded *līya*; the same applies to the form *biya*.

The vocative form may also be attested in some modern dialects. In some Levantine dialects, the expression *yābāye* is used in situations of distress. It translates literally as "O my father." If the expression goes back to \**yā ʾaba-yah*, with *hāʾu s-sakt*, then it would parallel similar constructions in the Quran and Safaitic.

*Hāʾu s-sakt* must be reconstructed for the ancestor of the forms *liya* and *biya* as well. The presence of a final *a* in these cases is anomalous, as final short vowels, including *a*, have generally been lost in the modern vernaculars (Figure 3).


**Figure 3.** Loss of final *a* in Modern Egyptian.

Thus, the survival of the vowel suggests the presence of a final *h*, protecting it from apocope. In other words, the antecedent of dialectal *biya* was not \**biya* but rather \**biyah*, as attested in Sūrat al-Ḥāqqah.

To conclude, both the distribution and form of the 1CS genitive clitic pronoun in the modern dialects speak against a Classical Arabic origin and should rather be connected with phenomena attested marginally in the QCT and in the ancient inscriptions.

#### **4. Relative Pronoun**

The relative pronoun *ʾallaḏī* is restricted to southwest Arabia today (Behnstedt 2016, p. 74), but in former times it was much more widely distributed (Holes 2018a, p. 13). It is the primary form attested in Middle Arabic texts, even those that are quite close to the vernacular. It is attested in the Damascus Psalm Fragment as ελλεδι (8th–early 9th c.; Al-Jallad 2020b, p. 26). If this form was common in medieval vernaculars, it has today given way to the virtually pan-Arabic relative pronoun \**ʾalli* (Stokes 2018). Yet *allaḏī* seems to have spread at the expense of an earlier relative pronoun *ḏV:*. To the Arabic Grammarians, *ḏV:* was characteristic of the dialects of southwest Arabia, where it can still be heard today, and the Najdi dialect of Ṭayyiʾ (Rabin 1951, chps. 3 and 14). In the modern dialects, *ḏ*-base relatives are common in Southwest Arabia (Behnstedt 2016, p. 74) and in the Maghreb (Aguadé 2018, p. 54). The genitive particles *ḏīl* and *ḏēl* (lit. "that which is for") in the Qəltu dialects and marginally in the Levant also suggest that at one point the relative pronoun of those dialects was a simple *ḏ*-base form (Procházka 2018, p. 280; Lentin 2018, p. 195).

The relative *ḏV:* is attested across the pre-Islamic Arabic *Sprachraum* (Figure 4); indeed, the form *ʾallaḏī* has not yet appeared in the pre-Islamic epigraphic record, although its feminine counterpart *ʾallatī* has been attested once in the Ḥigāz.


**Figure 4.** Distribution of relative pronouns in the epigraphic record; data from Al-Jallad (2018).

In at least Safaitic and Hismaic it seems to inflect for case, gender, and number, with the plural form appearing as *ḏw* (/ḏawū/ or /ḏawī/). Even as far south as Qaryat al-Faw, in the linguistically mixed inscription from the site, the Rbbl bin Hfʿm grave inscription, the plural form is attested as *ḏw* (Beeston 1979; Al-Jallad 2014). In Safaitic the relative may rarely agree in definiteness with its antecedent, producing *hḏ* /haḏḏī/.

The presence of the *ḏ*-base relative pronoun in all other branches of Semitic permits its secure reconstruction to Proto-Arabic, although there is not enough information to determine the details of its inflectional paradigm (Huehnergard 2017, pp. 16–17). This in turn indicates that the \**ʾallaḏī* and later \**ʾalli* forms are innovative, and spread at a later period, similar to *šāf* and *bVdd* discussed in the introduction.

Since *ḏV:* is an archaism, it cannot be used to argue for a shared genealogical relationship between the dialects that preserve traces of it. It does, however, demonstrate that these dialects do not descend linearly from Classical Arabic, which had replaced this form with the *allaḏī*-type relative. Moreover, its presence throughout pre-Islamic Arabic prevents us from assuming that the *ḏ*-base relative pronoun in the modern vernaculars is a result of "South Arabian" influence, as has been previously suggested (Corriente 2007). The relative was not bound to a single geographic area in pre-Islamic times, but was in use from Yemen to Syria. Rather, it was the *allaḏī*-type relative that appears to have had a specific geographic distribution, restricted to the Ḥigāz. Today's dialect geography reflects a reversal of the pre-Islamic situation. The *allaḏī*-type relative, including \**alli*, has spread at the expense of the older *ḏ*-type, which is today restricted to the periphery of the Arabic *Sprachraum*.

#### **5. Concluding Remarks**

The features discussed here are but a small sample of possible Old Arabic relics strewn throughout modern Arabic vernaculars. They nevertheless motivate one to think in terms of a three-dimensional dialect continuum, extending not only geographically but also chronologically. Interdialectal contact, substrate contributions from the pre-Arabic languages of all regions to which Arabic spread, and the heavy superstrate influence of Classical Arabic prevent us from regarding any dialect as a monogenetic descendant of a pre-Islamic variety. Yet there can be no doubt that pre-Islamic phonological and morphological features absent in Classical Arabic contributed to the formation of the modern vernaculars.

**Funding:** This research received no external funding.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Appendix A**

**Figure A1.** Safaitic Inscription BES15 799 (courtesy OCIANA).

BS 821: *l mġyr bn msk bn ʿmd bn mlk w wgd sfr bny f ṯql ʿl bny w ql ḫbly w ḏkr rgl f ʾdn {ʿ}l rgl*

"By Moġayyer son of Māsek son of ʿAmīd son of Mālek and he found the writing of Bonayy and was weighed down by grief for Bonayy and said 'O my woe' and he remembered Rāgel and was debased (by grief) for Rāgel"

Commentary:

This text was discovered in the Jordanian Ḥarrah at 32.43341; 37.270460, during the 2015 campaign of the Badia Epigraphic Survey project. The author produced three other Safaitic inscriptions, KRS 38, 1885, and 1886, in the same general region.

*wgd sfr*: "he found the writing", a common inscriptional genre produced upon the finding of the inscriptions of distant or deceased loved ones.

*ṯql*: "he was weighed down", cf. Classical Arabic *ṯaqula*. The verb is only attested in grieving contexts and so should be construed as a metaphor for worry and grief.

*ql ḫbly*: "he said: woe to me!" The meaning of this line was discussed in section three of this paper. A similar expression is attested in KRS 941: *w ql ḫbl-h trḥ* "sorrow afflicted him".

*w ḏkr rgl w ʾdn*: "he remembered Rgl (likely vocalized as Rāgel) and was debased". *ʾdn*, the causative of *danna* "to make lowly", should be construed as a passive here with an unexpressed agent, namely, grief.

#### **Notes**

<sup>1</sup> For a summary and linguistic classification of these texts, see Al-Jallad (2018) and Macdonald (2004).



#### **References**


Behnstedt, Peter. 2016. *Dialect Atlas of North Yemen and Adjacent Areas*. Leiden: Brill.

Blau, Joshua. 2006. Problems of Noun Inflection in Arabic: Reflections on the Diptote Declension. In *Biblical Hebrew in Its Northwest Semitic Setting: Typological and Historical Perspectives*. Edited by Steven E. Fassberg and Avi M. Hurvitz. Jerusalem: Magnes Press, Winona Lake: Eisenbrauns, pp. 27–32.

Borg, Alexander. 2004. *A Comparative Glossary of Cypriot Maronite Arabic (Arabic-English): With an Introductory Essay*. Leiden: Brill.

Corriente, Frederico. 1976. From Old Arabic to Classical Arabic through the Pre-Islamic Koiné: Some Notes on the Native Grammarians' Sources, Attitudes and Goals. *Journal of Semitic Studies* 21: 62–98.

Corriente, Frederico. 2007. On the prehistory of the Arabic language. *Aula Orientalis* 25: 141–53.

Fiema, Zbigniew T., Ahmad Al-Jallad, Michael C. A. Macdonald, and Laïla Nehmé. 2015. Provincia Arabia: Nabataea, the Emergence of Arabic as a Written Language, and Graeco-Arabica. In *Arabs and Empires before Islam*. Edited by Greg Fisher. Oxford: Oxford University Press, pp. 373–433.

Greenman, Joseph. 1979. A Sketch of the Arabic Dialect of the Central Yamani Tihamah. *Zeitschrift für Arabische Linguistik* 3: 289–304.

Grigore, George. 2007. *L'arabe parlé à Mardin: Monographie d'un parler arabe "périphérique"*. Bucharest: Editura Universității din București.

Holes, Clive. 2018a. Introduction. In *Arabic Historical Dialectology*. Edited by Clive Holes. Oxford: Oxford University Press, pp. 1–28.

Holes, Clive. 2018b. The Arabic Dialects of the Gulf: Aspects of their historical and sociolinguistic development. In *Arabic Historical Dialectology*. Edited by Clive Holes. Oxford: Oxford University Press, pp. 112–48.

Huehnergard, John. 2017. Arabic in its Semitic Context. In *Arabic in Context*. Edited by Ahmad Al-Jallad. Leiden: Brill, pp. 3–34.



## *Article* **The Old and the New: Considerations in Arabic Historical Dialectology**

**Alexander Magidow**

Modern and Classical Languages and Literatures, University of Rhode Island, Kingston, RI 02881, USA; amagidow@uri.edu

**Abstract:** Arabic historical dialectology has long been based on a historical methodology, one which seeks to link historical population movements with modern linguistic behavior. This article argues that a nexus of interrelated issues, centered around a general theme of "oldness," has impaired this work, and proposes basic principles to avoid the misinterpretation of linguistic data. This article argues that there is a strong tendency to essentialize the idea of linguistic conservatism and attribute it to the groups that have archaic features. Against this view, it proposes that linguistic conservatism should be seen as a failure to participate in otherwise widespread innovations. It critiques the assumption that the modern dialect distribution is directly derived from the earliest settlements established during the Islamic conquests in the seventh century, arguing instead that long-term linguistic durability is unlikely. The article further challenges the assumption that highly conservative dialects such as those of Yemen are ancestral to modern dialects in a meaningful way, arguing instead that either more proximate ancestors or wave-like diffusion had a greater impact on the development of modern dialects. Finally, the paper suggests that a heuristic approach based on typical processes of language diffusion and human migration offers a more productive approach to understanding the history of Arabic dialects than a model based on historical events; many of the existing linguistic classifications may be directly derived from this heuristic.

**Keywords:** Arabic dialects; dialectology; historical dialectology; nomadism; methodology; geography; dialect geography

#### **1. Introduction**

In his introduction to the recent volume *Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches*, Clive Holes, an expert on Bahraini and Gulf Arabic dialects, relates the following anecdote to illustrate the similarity of so-called "Bedouin" type dialects:

[W]hen, in the mid-1970s, my employer transferred me from Kuwait to Algeria, a distance of several thousand miles, I had no difficulty, if I spoke in Gulf Arabic, in making myself understood to (and in understanding) ordinary Algerians in southern oasis towns such as Ourgla and Touggourt, even though most of them had never left Algeria in their lives: we were all speaking 'bedouin' dialects. But the Arabic of the city of Algiers, only a few hundred miles to the north, and where I was based, is of North African 'sedentary' type and was so incomprehensible to me (as was my Gulf Arabic to the Algérois) that throughout my two-year residence there I found it easier to speak French. (Holes 2018a, p. 22)

This example comes in a section on Bedouin dialects, and is intended to illustrate how, despite issues with the use of "Bedouin" as a classification, it still has value as a category of analysis.<sup>1</sup> However, the exact mechanism by which these two far-flung dialects remain mutually intelligible (but unintelligible compared to the nearby coastal sedentary dialects) remains uninterrogated. How exactly does their status as "Bedouin" dialects render them similar? The explanation becomes simply that they are Bedouin dialects, and by existing in the same classification, they are expected to be similar, without invoking either history or linguistics to delve deeper into that similarity. If a linguist from outside the field were presented with this example, they would almost certainly explain it simply as a matter of time-depth: clearly the Gulf dialects and the southern Algerian dialects are simply the result of a relatively recent divergence, rather than due to a vaguely defined typological similarity.<sup>2</sup>

**Citation:** Magidow, Alexander. 2021. The Old and the New: Considerations in Arabic Historical Dialectology. *Languages* 6: 163. https://doi.org/10.3390/languages6040163

Academic Editors: Simone Bettega and Roberta Morano

Received: 8 July 2021 Accepted: 28 September 2021 Published: 9 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

This article will argue that it is precisely due to an accretion of traditional approaches to Arabic linguistics and linguistic history that the situation Holes describes is seen as anomalous or explainable only through categorization. Instead, if we reassess some of those existing views, this situation falls out naturally from basic linguistic and historical processes, without the need to invoke specific social categories like "Bedouin" as explanatory. The issue here is not that the category of "Bedouin dialect" itself is invalid, but rather that both the category and the linguistic evidence are not sufficiently interrogated. This article aims to suggest ways in which our traditional approaches to linking Arabic dialectology and the social history of Arabic-speaking peoples can be profitably reconsidered, investigating a nexus of interrelated issues that center around a general theme of "oldness": how dialectologists interpret conservative linguistic features, how they conceive of earlier versus later layers of movement and population, and what conservative dialects mean for the genealogy of modern dialects.

This paper is, at its heart, a historiographical exploration, looking at the narratives that surround the history of Arabic as much as the linguistic data itself, and how these narratives shape our conception of that history. Since it is difficult to survey the entire field in a meaningful or systematic manner in a paper of this length, much of the attention is focused on the recent survey that the quote above comes from, Holes (2018a). This seminal work is a well-researched, elegantly conceived volume which makes the historical dialectological work performed until now both easily accessible and easily comparable, and it is doubtful this paper could have been easily written prior to the publication of that volume. Indeed, the high quality of many of the essays therein makes the larger critique in this paper difficult at times, as many of the authors have indeed begun to move beyond the assumptions critiqued here. However, as argued here, those assumptions continue to influence the research on a less conscious level.

This paper is not intended to be polemical, nor is it intended as a broader criticism of the work in the field of Arabic dialectology and linguistics. The incredible work performed by scholars for the past several centuries has been and continues to be of immense value, and none of the critiques laid out here could even be articulated without that work. The goal, rather, is to offer constructive suggestions to improve the depth and accuracy of the work in the field of Arabic historical dialectology. When a scholar is quoted in the process of identifying a common theme in the literature, these quotes are intended to represent a larger narrative within the field, and certainly not to criticize the author quoted or their work more generally. Indeed, much of the criticism is focused on the author's own earlier work.

The paper also seeks to suggest concrete, actionable ways to avoid common issues in historical dialectological work. Section 5 presents a novel heuristic approach to considering how linguistic and population movements are intertwined in the South-West Asia and North Africa region based on general linguistic and social principles. Similarly, the conclusion presents several specific recommendations for future historical dialectology.

Prior to beginning, it is important to step back and consider the goals of what we seek to determine from Arabic historical dialectology research. The closest statement that Holes (2018a) makes on this topic is in a footnote: "Our main purpose in writing this book is to show how a more historically and socially grounded linguistic approach, despite the gaps in the record, can help trace the long-term dynamics and some detail of what happened [in the development of Arabic dialects today] (note 7)." In a sense, what historical Arabic dialectology seeks to do is to take the various snapshots we have of the Arabic dialects (their modern distribution and the glimpses of the dialects we find in historical records) and interpolate them to develop a narrative of how the dialects developed. We seek to be able to somehow "rewind" the historical development of the language to see how it came to be. It is, of course, the "gaps in the historical record" that are the primary difficulty in this endeavor, but this paper will also argue that it is *how* we view the available historical and linguistic record that can, at times, hamper our progress toward that larger goal of understanding the history of Arabic dialects.

#### **2. Conservative Features and Conservative Dialects**

One key issue in Arabic dialectology is how we interpret the data that is available to us. As a preliminary, it is important here to be clear in differentiating several levels of linguistic analysis that operate at different scales.<sup>3</sup> The lowest level of analysis is a linguistic feature, a particular way of using language, such as the use of a certain reflex of a proto-phoneme, a certain word used to mean "to go," an intonation pattern in declarative sentences, etc.<sup>4</sup> At a higher scale is the location of that linguistic feature in space, i.e., its dialect geography, which also implies a certain point in time as well, analogous to what we find on the pages of a dialect atlas. Finally, for the purposes here, we have a dialect (or more awkwardly, "feature bundle"), which is the total collection of linguistic features spoken at a bounded area in time and space.<sup>5</sup>

It is important to differentiate these levels of analysis as there is a significant difference between a given feature being old and long attested, and its presence in a particular space and time being long-term. This too is different from presence alongside a cluster of other features being long-term in a particular area or among a particular speech community. The first of these can be quite easy to prove, even in Arabic where diglossia muddies the waters considerably: if we can find an early attestation, we can prove that a feature is quite old. Of course, the converse does not hold: if we cannot prove the antiquity of a feature, it does not necessarily mean it is new or old. More difficult is to prove that a feature has been in the same location over a long period of time. Certainly Occam's razor suggests that if we find a feature in, e.g., early Levantine Middle Arabic documents, and also today, it must have been resident in that place the entire time. However, it is easy to imagine a scenario in which two waves of movement and replacement occurred, such that the feature ceased to exist in the area, and then was replaced by a dialect that again had the feature. Finally, it is most difficult to establish the long-term durability of a dialect or cluster of features. One challenge in establishing dialect durability is deciding at which threshold a bundle of features counts as fundamentally altered, such that continuity or change can be declared: a kind of "Ship of Theseus" problem.<sup>6</sup>

A significant issue in the dialectology literature is in how we interpret dialects which have a preponderance of archaic features, as opposed to dialects with many innovative features. There is a strong tendency to associate conservative features with a kind of "originalism," a primordial state that is often taken to imply long-term residency in an area, or some kind of genetic priority to more developed dialects.<sup>7</sup> This is a form of essentialism in which the linguistic conservatism becomes linked to a larger conservatism that is seen as an inseparable characteristic of the dialects which have those linguistic features.

We see many examples of this conflation in the literature. Behnstedt and Woidich (2018, p. 81) list a variety of migrations into the Fayyum, up to and including the eleventh century, but consider the Fayyum dialect to represent "the earliest linguistic stratum" in Egypt based on conservative linguistic features. For a dialect area only a short distance from Cairo (certainly half the distance from Cairo to Alexandria), and certainly in a close relationship of trade with that city, what is remarkable is precisely that the Fayyum somehow resisted those assimilatory pressures to which Alexandria was subjected, as detailed in Section 3.1. Similarly they argue that a conservative syllable structure reflects earlier migrations to Upper Egypt, while less conservative syllable structures represent later or continuing migration, independent of historical data (p. 84). Procházka (2018) formally distinguishes between inherited and innovative traits in his discussion of the Northern Fertile Crescent dialects, a welcome division given how often these are conflated as distinguishing features of dialects. However, he considers the inherited traits to be "'archaic' or 'pre-diasporic', i.e., going back to dialects spoken in Arabia before Islam (p. 262)," when by definition these traits should be found in any dialect that has not innovated a new form, regardless of when it migrated into or out of an area. Similar arguments regarding linguistic conservatism as evidence of longer settlement or a vague sense of "oldness" are found throughout the volume (see pp. 1, 57, 71, 81, 136, 162–63, 264, 298, 304).

Indeed, this idea of 'old features' as 'conservative' is fundamental to the differentiation between sedentary and Bedouin dialects, and the related tendency to consider Bedouin dialects as themselves 'conservative' by extension. Quoting Rosenhouse (2006, p. 259), Holes (2018a, p. 20) notes that Bedouin dialects are seen as "more conservative" since they "retain many 'Classical' features lost elsewhere," though even without considering Classical Arabic, many characteristically Bedouin features such as the retention of interdentals are certainly retentive with respect to most nearby sedentary dialects. Lists of purportedly Bedouin features are rarely more than lists of retentions, rather than innovations, with the only innovation that commonly can be said to unite all Bedouin dialects being the use of a voiced reflex of the (Q) variable (Palva 2006).

Though Holes and many other modern authors have developed more detailed understandings of the distinctions between Bedouin and sedentary dialects, they are still viewed as fundamentally distinct from one another and form a key category in the linguistic analyses in the field. In the Holes volume, under the larger category of "major areal and typological" distinctions, there are 35 distinct entries in the index under "Bedouin vs. sedentary," totaling over 60 pages, while related patterns such as the pre-Hilali vs. Hilali and *qultu* vs. *gilit* distinctions have a further 18 and 17 entries each. The only other categories listed under this heading are "Maghrebi vs. Mashreqi", "peripheral vs. heartland" (a total of 50 entries) and "urban vs. rural" which often tends to functionally be a "Bedouin" vs. "sedentary" distinction, especially in the chapter on the Maghreb. The Bedouin vs. sedentary distinction is by far the most ubiquitous distinction in this volume, and one predicated primarily on the apparent conservatism of Bedouin dialects (but more accurately, the features of those dialects).

Indeed, it is notable that Bedouin dialects do not appear to be more or less innovative in general than sedentary dialects. Rather, they have participated, by and large, in different innovations than sedentary dialects. Magidow (forthcoming), in a sample of 52 dialects across the Arab world, classified dialects as Bedouin if they had a voiced realization of (Q), and as sedentary if they did not. Out of a pool of 59 total possible innovations, Bedouin dialects showed an average of 13.1 innovations, versus sedentary dialects with an average of 13.9 innovations. As expected, there was no statistically significant difference between these groups: the two groups are effectively equally innovative.
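The comparison described above can be sketched in a few lines of Python. This is only an illustration of the general procedure (split a sample by the realization of (Q), average the innovation counts, and test the difference), not the method or data of the cited study: the dialect labels and counts below are invented placeholders, and the permutation test is an assumed choice of significance test.

```python
import random
from statistics import mean

TOTAL_INNOVATIONS = 59  # size of the innovation pool reported in the text

# Hypothetical records: (label, has_voiced_q, innovation_count).
# These values are invented for illustration and are NOT the study's data.
SAMPLE = [
    ("dialect_a", True, 12),
    ("dialect_b", True, 15),
    ("dialect_c", True, 11),
    ("dialect_d", True, 14),
    ("dialect_e", False, 13),
    ("dialect_f", False, 16),
    ("dialect_g", False, 12),
    ("dialect_h", False, 15),
]


def split_by_q(sample):
    """Group innovation counts by whether (Q) has a voiced realization."""
    bedouin = [count for _, voiced, count in sample if voiced]
    sedentary = [count for _, voiced, count in sample if not voiced]
    return bedouin, sedentary


def permutation_p_value(group_a, group_b, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(rounds):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        a = pooled[:len(group_a)]
        b = pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return hits / rounds


if __name__ == "__main__":
    bedouin, sedentary = split_by_q(SAMPLE)
    print("Bedouin mean:", mean(bedouin))
    print("Sedentary mean:", mean(sedentary))
    print("p-value:", permutation_p_value(bedouin, sedentary))
```

A large p-value in such a test would be consistent with the claim that the two groups are about equally innovative overall, though it says nothing about which innovations each group took part in.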

This focus on conservatism tends to miss a key point, which is that conservatism vis-à-vis dominant linguistic features in the area is simply the result of a group failing to participate in an innovation, not necessarily a deeper statement about the history of that dialect. The fundamental observation of historical linguistics is that only successful sharing of a feature is indicative of shared history or participation in a common speech community (Hetzron 1976; Magidow 2017). Sharing of linguistic features implies connection between the dialects sharing the features, while linguistic conservatism implies a lack of sufficient connection. There are only a few ways that connection can occur, and in that sense all happy linguistic families are alike in that they share many innovations, reflecting a shared past of contact.<sup>8</sup>

However, unhappy linguistic families, those without a connection, are often uniquely different. A failure to participate in an innovation can be caused by a wide range of factors. Dialects may simply be too far apart, unable to be exposed to a particular feature, or the speakers of two nearby dialects may, by virtue of their lifestyles, never come into contact. Social rather than geographical barriers may play a role: a group of speakers can resist aligning themselves linguistically with their neighbors, even if they live one neighborhood over, which is precisely what happens with communal and sectarian dialects (Blanc 1953, 1964; Walters 2006). There may indeed be some kind of influence from the social structure of the speakers of a dialect: the work by Lesley and James Milroy has long shown that dense social networks can inhibit the diffusion of innovations in comparison to looser social networks (Milroy 2008).<sup>9</sup> The role of network density needs to be investigated in greater depth for Arabic, but in most cases it is likely the lack of contact, that is, a lack of frequent linguistic interactions between two populations, that explains most of the disparity between dialects.

Indeed, when there is a connection and interaction between groups over sufficiently long periods of time and little reason to resist change, we would expect to find that linguistic features would diffuse across the entire population. From this perspective, what is remarkable about Bedouin dialects is their lack of participation in innovations. If we discard the essentializing notion that they are in some fundamental, unchangeable way conservative, the most logical explanation for the deviance of Bedouin dialects from sedentary dialects is not that Bedouin dialects are somehow "old," but rather that they are relatively new arrivals to an area. This is where the distinction is key between the age of a feature (which may indeed be old, archaic, or non-innovative relative to other features found in a language) and the position of that feature in space. The individual features of a Bedouin dialect may be archaic, but their deviance from the features in the surrounding dialect geography is almost certainly indicative of a relatively recent movement into that area.

Another issue arises here, which is that even as a nomadic dialect might contain a variety of conservative features, the "feature-bundle" of that dialect may or may not be continuous, any more than in a sedentary dialect. Tribes are imagined in the dialectology literature as relatively unchanging familial groups, but the reality is that tribal groupings are political rather than genetic entities (Hoyland 2009, p. 390). They can divide and recombine, even as traditional tribal names might be retained (Magidow 2013, pp. 119–22). Today's Ma'qil tribe is not necessarily yesterday's Ma'qil (or Ibn Khaldoun's), either in terms of the genetic makeup of its members or its linguistic behavior. Here again, the distinction between conservative features and conservative dialects is key. The presence of any number of conservative features in a dialect should not necessarily be understood to mean that a specific combination of features has been used together for a long period of time. Nor does it imply that the group which uses those conservative features has had long-term cohesion and durability. It is possible that this was the case, that there has been continuity, but it should not be assumed.

Moreover, Bedouin dialects are imagined as being traditionally spoken by nomadic groups subsisting in resource-poor areas with quite low population density. From the perspective of a dialect map, it takes relatively little human movement for a given space in a low-population area to change its linguistic behavior. Even a dialect map drawn at different seasons might show significant changes as groups move to summer and winter camps within these marginal areas. Contrast this with the dialect maps of high-population sedentary areas, where a massive catastrophe would be required to cause a migration sufficient to change the overall linguistic landscape. In these areas, one would instead expect the linguistic features to diffuse across the landscape, while the speakers themselves remained stationary.

Magidow (2013, pp. 133–34) refers to this contrast between linguistic conservatism and recent migration as the "*Bedouin paradox*":

Nomadic speakers generally do not always participate in the spread of innovations among settled groups, and therefore they appear to retain archaic linguistic features in comparison with their settled neighbors. However, their extreme mobility and the ease of replacing indigenous nomadic groups means that these 'archaic' speakers may be newcomers to an area in comparison with settled groups.

This idea helps explain Holes' observation about the Bedouin dialects in Algeria. The dialects that Holes reports being able to understand so well are not magically "Bedouin" in nature. Rather, they likely have a much shallower historical branching from the dialects that he was already familiar with, and had moved into southern Algeria relatively recently in comparison with the sedentary dialects of the coasts. The conservatism of these dialects (which nonetheless have acquired "Maghrebi" features from nearby settled areas) reflects their relatively recent arrival on the dialectological scene.

#### **3. Early Layers and Later Layers**

Another key idea linked to the idea of "oldness" as deterministic in the history of Arabic dialects is the "big-bang" model of the expansion of Arabic. This model holds that it is the initial expansion of Arabic in the early Islamic period that is in essence responsible for much, if not all, of the modern geographical distribution of Arabic dialect features.<sup>10</sup> This model has a genetic component: it is these old dialects, first distributed across what is now the Arabic-speaking world, that are the direct ancestors of the modern Arabic dialects, with changes within those dialects due to contact, urbanization or similar processes.

This concept is foundational in Arabic dialectology. The strongest modern proponent of the idea, Jonathan Owens, explicitly designed his monograph, *A Linguistic History of Arabic*, around the goal of reconstructing the Arabic of the period from 630 to 790 (2006, pp. 2–5), and continues to use a similar methodology in more recent papers (Owens 2018). My own earlier work, Magidow (2013), followed this basic assumption quite closely as well, and it is a common underlying assumption throughout the Holes volume, where the introduction focuses on what "language ... the conquerors spoke" (Holes 2018a, p. 7). The *Encyclopedia of Arabic Language and Linguistics* article on "Dialects: Genesis" quite explicitly states that "by the 10th century, dialect areas were already shaped" in essentially their present distribution (Abboud-Haggar 2006, p. 620). Jastrow (2002) distinguishes Zone I dialects, those in the Arabian Peninsula, from Zone II dialects, those "colonial" dialects that are a result of the early Islamic expansions. The idea was also key in earlier work. Ferguson's famous idea of an Arabic *koine*, the ancestor of modern sedentary dialects, assumes that "its spread coincided roughly with the spread of urban Arabo-Islamic culture" (Ferguson 1959, p. 618), and the same is essentially true of Versteegh's pidginization hypothesis (Versteegh 1984, 2004).<sup>11</sup> Even earlier approaches which assume linear descent of the Arabic dialects directly from Classical Arabic are, at their heart, assuming a diffusion of relatively similar speakers at the time of the conquests, with later developments occurring in situ. Many of these ideas go back to the very early grammarian traditions that spoke of Bedouin informants and of dialects becoming corrupted by sedentarization (Blau 1977; Fück 1950; Garbell 1958; Versteegh 2014, p. 138).

The big-bang phenomenon also has a related phenomenon in the study of North Africa, what could be called the "little bang." The first big bang is shared with the rest of the Arab world, as Arab armies led the conquest of North Africa and Andalusia in the 7th and 8th centuries. This is believed to have laid down an initial layer of Arabic, known as the "pre-Hilalian" variety of Arabic (Marçais 1938). Following this era, another major linguistic expansion occurred in the movement of tribes from the lineage of the Banū Hilāl, supposedly from the Arabian Peninsula (by way of a brief stopover in southern Egypt), beginning in the early eleventh century and ending by the fourteenth in the typical accounts. This group is said to be responsible for the "Hilalian" dialects of North Africa, a group of dialects primarily spoken by Bedouin, rural or recently urbanized populations.

If true, the big-bang idea would be extremely convenient for the historical dialectology of Arabic. Researchers would be able to ignore the complex histories that follow the time periods in which these "bangs" occurred, and instead focus on the vast, early historical tradition which reports many of the early population movements in and out of the Arabian Peninsula. This would allow us to reduce the enormity of the task of Arabic historical dialectology, and to focus on linking those historical reports to the modern distribution of dialects (Aguadé 2018; Behnstedt and Woidich 2018; Magidow 2013; Procházka 2018). Unfortunately, the big-bang model appears untenable for three primary reasons. The first is that it is not clear how durable dialect geography is over time. Second, given the lack of durability of features-in-space over time, it is important to pay attention to the significant evidence that major population movements occurred well after the Islamic conquests. Finally, the model (in either the big-bang or little-bang versions) simply does not make effective linguistic predictions, with the actual linguistic features of modern dialects contradicting the predictions that these models would make.

#### *3.1. Durability of Linguistic Material over Time*

In general, there is an unarticulated assumption that linguistic features, once in place, will generally persist over time. On a basic level, this is often true, but the Arabic-speaking world is a crossroads of civilizations, with both long-term, continuously inhabited cities and vast areas of quite low population density. Indeed, the disappearance of the many languages other than Arabic following the Arab conquests gives the lie to this theory, for clearly this linguistic inertia can be interrupted and once-dominant languages driven extinct, like Coptic, or into a very marginal status, as with Aramaic.

There is plentiful evidence from sociolinguistics that language change can proceed extremely rapidly. Miller (2005) found that within a generation of arrival, many Upper Egyptian migrants to Cairo had assimilated to a wide variety of different Cairene linguistic features. The koineization of the Amman dialect appears to have happened within three generations, and has significantly changed the linguistic repertoire of the newly created city (Al-Wer 2003, 2007). The migration of 'Arab dialects to Bahrain, though hailing perhaps from the 18th century, accelerated after the 1930s and so, in spite of sectarian differences, endogamous marriage within groups and other barriers, by 1995 there was already a developing areal koine (Holes 1995), and by the late 2000s, even in rural areas the old village Baharna dialects "have now all but disappeared" (Holes 2015, p. 475). One notes also that for several key variables, including the shift of (Q) /q/ > /ʔ/, Behnstedt (1997, map 9) differentiates between the oldest and youngest generations, showing change within three generations. These kinds of changes, well attested in the sociolinguistics literature more generally, typically occur on timescales of 3–4 generations, equivalent to approximately one century (Trudgill 1986). To expect any significant linguistic durability of features-in-space, or even of dialect bundles, across longer timespans seems wildly optimistic.

Some accounts of the "big-bang" approach have attempted to formalize the idea of linguistic inertia. For example, Owens (2018, p. 209) suggests that in the framework of Dixon (1997), the Islamic conquests represent a "punctuated phase" in a larger linguistic equilibrium. Even leaving aside the many criticisms of Dixon's model (Bowern 2006), and the fact that it is clearly meant for longer time periods than those treated here, it is unclear why this should be the only punctuated phase that is meaningful in the history of Arabic, or how long the phase lasted exactly. Arabicization took centuries in most places, and is still incomplete in many others, such as North Africa. Going back to Dixon (1997, esp. Chapter 6), virtually every form of punctuation he discusses, whether natural causes (e.g., plague), material innovations (especially of weapons), or the "development of aggressive tendencies," has happened repeatedly since the early Islamic conquests.

Magidow (2013) adopted a different approach, attempting to formalize this model of persistence using a concept from geography, adopted by Labov for sociolinguistics, the "principle of first effective settlement." This principle states:

Whenever an empty territory undergoes settlement or an earlier population is dislodged by invaders, the specific characteristics of the first group able to effect a viable, self-perpetuating society are of crucial significance for the later social and cultural geography of the area, no matter how tiny the initial band of settlers may have been. (Zelinsky 1992, p. 13)

To which Labov (2001, p. 504) adds:

In any one generation, if the numbers of immigrants rise to an order of magnitude greater than the extant population, the doctrine may be overthrown, with quantitative changes in the general speech pattern.

Though this principle does indeed seem to match with the role of population density in acting as a barrier to linguistic change (Magidow 2013, pp. 99ff; Ostler 2005), it is frustratingly vague, and again we simply do not have sufficient access to the complete history of the places in question. Magidow (2013), drawing heavily on Conrad (1981), makes much of the plagues occurring immediately around the time of the Islamic conquests. However, there were clearly many subsequent plagues, including the Black Death that devastated the entire western hemisphere in the 14th century (Dols 1974). Between plague, conquest, migrations, deurbanization and urbanization due to changes in trade, climate, and other factors, Labov's "order of magnitude" criterion must have been regularly fulfilled in the millennium after the Islamic conquests.<sup>12</sup>

#### *3.2. Later Population Movements*

Indeed, we find that when we do look at the history of the Arab world, there are often many examples of later movements and changes that clearly post-date the early Islamic conquests, and which have significant implications for the linguistic history of a region. Even if we restrict ourselves to the chapters in Holes (2018a), we find numerous examples where the current distribution of linguistic features in space is clearly a result of post-conquest population movements.

In Egypt, Behnstedt and Woidich (2018) find many examples where the dialectological situation owes its distribution to much later phenomena. Against Owens (2003), they argue that "the constant return of Maghrebi tribes to Egypt" reinforced the use of the *niktub-niktubu* verb paradigm, and that for certain regions of the Delta these forms are "at least partly due to later Maghrebization from the fourteenth century onwards" (p. 76). For the city of Alexandria, older linguistic layers have apparently been erased. Alexandria has a long and storied history, and its dialect previously had many Maghrebi features. By the time the French arrived it had only 7000 inhabitants, growing again only in the nineteenth century under Muhammad Ali Pasha. By the end of the 19th century, it continued to have non-Cairene dialect features, many of which were also Maghrebi, such as /dž/ for the (J) variable versus Cairene /g/, and common use of the *niktub*-*niktubu* verbal paradigm. By the 1970s it was "a 'one foot in the grave dialect'" (p. 79), effectively replaced by the Cairene dialect in middle- and upper-class speech among younger people, with older people preserving only some of the original features. These are changes which largely took place only in the last three centuries, such that the pre-modern dialect has little relationship to the current one, and that original dialect may be difficult if not impossible to reconstruct.

Indeed, while we generally would expect that large cities would be the most stable across time given high population density and durability of location, the situation of Alexandria is surprisingly common. Many cities witnessed periods of intense depopulation and repopulation, particularly in the 20th century when urban areas underwent spectacular growth (Miller 2007). Many modern cities in the Arab world are virtually ex nihilo creations, such as Casablanca (25,000 in 1900 to millions today), Amman (effectively founded in 1923) and Nouakchott (founded 1957). However, even older cities had surprisingly low populations until recent times. Table 1.1 in Miller (2007) shows the vast growth in many of the major cities of the Arab world, most of which have grown at least 4-fold in the 20th century. Given that rates of urbanization have also grown in that time, from 14.5% in 1900 to 59.7% in 2005, this almost certainly reflects a massive movement of rural inhabitants into urban spaces.

All of this occurred only in the last century. Going further back, the internal histories of many of these cities are replete with cycles of growth and decline, and so it is quite difficult to be sure that an urban space is going to continue earlier linguistic behavior. Baghdad, once one of the largest cities in the world, is said to have had as few as 15,000 inhabitants in the 1650s (Palva 2009, p. 31). Even rural populations may have had recent depopulating and repopulating events: Behnstedt and Woidich (2018) suggest that Upper Egypt was depopulated multiple times, and that in some Upper Egyptian dialects "one has to suppose that the immigration from the Ḥijāz lasted right up until the present era, and that some of the village dialects evolved only in recent times" (p. 84).

In the Levant, an area for which we have some of the oldest clear examples of Arabic language in the form of late Nabatean and Safaitic, many modern dialects and their features seem to be more recent, certainly more recent than the Islamic conquests. Lentin (2018, p. 175) quotes Ayalon as saying that near the end of the 15th century, the area between Latakia and modern Biredjik was Turkish, rather than Arabic-speaking (these form a line that passes northeast from Latakia, to approximately 50 km north of modern Manbij). While certainly not entirely surprising, given that much of this area is Turkish-speaking today, it pushes the development (or perhaps, "deployment") of the Cilician dialects later than might be imagined.

In the Fertile Crescent and the Syrian-Iraqi desert, Procházka (2018) begins the history of the region in the pre-Islamic era, with the desert hinterlands said to already have contained Arabic-speaking tribes. His primary claim is that the late tenth century is the *terminus ante quem* for the features he describes as characteristic of relatively more sedentary dialects in this region, but he ascribes the Bedouin features in others only to the era following the Mongol conquests in the 13th century. The Shāwi Bedouin dialects he considers an early stratum, but one he links only to the 11th century, while he also notes frequent movement even into the 20th century. The camel-breeding Shammar and 'Anaza are said to have come only in the 19th century, the time he gives for similar migrations to the Cilician Plain, building on Lentin's account above about the Arabicization of north-west Syria. It is also notable that folk accounts put the major migrations into Tillo, near Siirt, Turkey by Arabic speakers to ca. 1300 and 1600 in two waves (Procházka 2018, n. 5). While Procházka does note features which are attested early, and still present in the region, such as the shift of /\*r/ > [ɣ], attested in Al-Jahiz (d. 869 CE), this is only a report of a feature, with some marginal spatial information (p. 270). No data exists to determine whether the current linguistic situation reflects continuous inhabitation by the same dialect group.

In the Gulf, Holes (2018b) states that only in the past century, from the 1930s to the 1970s, "the Gulf dialects *as a whole* (with the partial exception of Oman) underwent a number of 'reductional' changes in their morphology" (p. 134, emphasis original). These changes include a loss of gender distinction in plural verbs, loss of the internal passive, loss of the dialectal tanwin, and the innovation or increased use of analytical genitive markers. All of these features are typical of the features used in dialectology for classification and historical reconstruction. If they can spread in a few decades, then we must be cautious about reconstructing even further back. Holes' "tribal Arab" dialects are said to be primarily due to 18th century migrations, with an earlier stratum of unknown chronology (pp. 134–35). Though his evidence for the antiquity of his B strand of dialects as an early layer is quite compelling, he rests some of his argument on the isolated nature of Oman, noting that significant changes have occurred since the accession of Sultan Qaboos in the 1970s, but that many speakers he worked with in Oman showed extremely limited mobility. However, one wonders whether earlier periods in Oman's history, such as the Ya'rubid dynasty (1624–1742) and Omani Empire periods (1710–1783), when Oman was a major regional power, might not have had a similar impact on the language to the present growth of the Omani state.<sup>13</sup>

While all of the authors in this volume, and in Arabic dialectology more generally, are clearly aware of these later changes, it is still difficult for them to pull entirely away from the "big-bang" model. Behnstedt and Woidich (2018) search hard for the "first layer" in Egypt, drawing on early accounts of the tribal affiliations of the migrants, even as they acknowledge later strands of migration. Holes (2018b, p. 133) attributes the -*inn*- infix's distribution in the Arabian Peninsula to the era of the early Islamic conquests, though he notes that its movement into Egypt, Sudan and the West Sudanic area was probably later. Owens (2018), as noted previously, simply assumes a big-bang model, which leads quite directly to suspect historical reconstruction. He argues that the *b-* prefix in modern dialects comes from a single source simply as a result of his historical model: "the spread of b- described did occur in some regions very early, and indeed has existed in the forms which will be reconstructed since at least the earliest Islamic period, if not in the pre-Islamic era" (p. 212). This leads him to treat the Yemeni *b-* prefixes, which are quite transparently derived from \**bayna(ma)* (Behnstedt 2016, p. 213), as equivalent to other *b-* prefixes which are likely from other sources (Owens himself reconstructs them as being from *yabġa > yabā*). Procházka also tends to use the construct of "pre-diasporic Arabic" (p. 267) and focuses on the early tribal conquests and settlement, even as he acknowledges the later population movements.

The other side of the big bang equation also seems lacking at times—there simply is not always a compelling record of Arabicization in the earliest periods. The Arabicization of Egypt probably did not begin in earnest until the Fatimid era, with complaints about the shift from Coptic to Arabic reported into the eleventh century CE (Magidow 2013, pp. 220–22; Papaconstantinou 2012). Aramaic took time to be supplanted even in the Levant and Iraq, with neo-Aramaic dialects surviving to this day in Syria and Iraq.

The area where early Arabicization is most unlikely is North Africa, where vast portions of the area remain either un-Arabicized or show significant bilingualism between Berber languages and Arabic, even after 13 centuries of Arabic presence in the region. The chronology outlined in Aguadé (2018) does not seem likely to have produced significant Arabic penetration prior to the tenth century (the "pre-Hilalian" period). He repeats reports that Arabs formed only a part of the population in many Tunisian towns into the twelfth century CE, while Qayrawan in his telling only developed into a major regional center by the ninth century. Even if it was, as claimed with very little evidence, "the origin of the spread of all pre-Hilali Maghrebi dialects," the process that would have resulted in significant Arabicization would have taken centuries. Even in the traditional French chronologies, Tunisia itself, a mostly flat and accessible area immediately surrounding Qayrawan, is only completely Arabicized by the 15th century, suggesting that the spread of Arabic must have been quite slow (Aguadé 2018, p. 42). Fes is said to have been "surrounded by Arab tribes" in the twelfth century, while the supposedly Arab settlements of Baṣra and Nakūr disappeared by the 11th century, though it seems likely Berber was still the dominant language. There is little in this history to suggest a strong early Arabicizing trend in North Africa that would allow us to clearly attribute the supposedly "Pre-Hilalian" layer to the earliest era of settlement. We do have evidence that a dialect similar to modern Moroccan Arabic existed by the twelfth century, but it still shows some differences, such as the use of *mta:ʕ* 'of' instead of *dyāl*, now more common in Morocco (Vicente 2012).

Indeed, there certainly seems to have existed an even earlier layer of dialects than the Pre-Hilalian ones. For example, in Awjila Berber in Libya, the words for 'Friday' and 'heaven' contain a reflex of the (J) variable that must be originally a palatal-velar [ɟ] or [dž], which goes against the "pre-Hilalian" [ž] and Egyptian (and likely Proto-Arabic) [g] (van Putten and Benkato 2017). Borrowed words with the feminine ending have *-at*, not the current *-a* used outside of construct state, and this is the case in Berber dialects across North Africa (Kossmann 2013, pp. 209–14). All of this is suggestive of a very minor early level of Arabicization that was almost certainly swept away by later layers, many of which probably came into the area well after the early Islamic conquest period.

The big bang view would perhaps presume that these later migrations were simply additive, building on an already-established dialect geography. However, the poor level of initial penetration of Arabic into the conquered areas, the high likelihood of extreme depopulating events occurring between those initial settlements and the present, and the relatively rapid pace of linguistic change all suggest that the big bang model of linguistic history is far too simplistic, and that the convenience of the model is not something that we as a field are lucky enough to enjoy.

#### *3.3. Effectiveness of the Big Bang Approach as a Linguistic Model*

The big bang approach is also problematic simply because it does not appear to be a strongly predictive linguistic model. As Behnstedt and Woidich (2013, sec. 15.5.2.2) succinctly note, "Neither of these two approaches [Jastrow 2002's zones, Owens 2006 'prediasporic Arabic'] is convincing for linguistic subgrouping, because they cannot be related to linguistic variables which would justify them." This is in part because of the difficulty, noted in Section 2, of linking not just individual features, but features clustered with one another within a dialect across nearly two millennia of history. Attempts to do so starting in the mid-twentieth century were largely unsuccessful, failing to identify clear features of the earliest layer of conquest (Blau 1977; Cohen 1962; Ferguson 1959; Miller 1986; Versteegh 1984), while Magidow (2013) and Owens (2006) should both be re-evaluated in light of the arguments presented here.

There are certainly features shared by all modern dialects that contrast with earlier layers of Arabic: the development of vowels in hollow verbs, for example, suffuses all modern Arabic dialects, even though there was a clear historical memory of the earlier situation, where the glides in hollow verbs were retained, also attested in the pre-Islamic Safaitic inscriptions (van Putten 2017a). Similarly, virtually all modern dialects now have \*-aya > /a:/ for the *alif maqṣūra*, though the /e:/ reflex is still attested in Classical and Quranic Arabic (van Putten 2017a). However, the fact that these features date to around the time of the Islamic conquests does not tell us that these features directly hail from that era, since any later migrations almost certainly would also have brought these features to those areas.

The little bang theory makes more specific predictions than the big bang theory: there should be a clear division between the dialects that reached North Africa during the "pre-Hilalian" era and those that came later. Of course, even if this prediction were proven true, it would not prove the specific historical claims of the model; it would simply show that there have been multiple waves of migration, a situation that exists in basically all Arabic-speaking regions. However, the linguistic evidence still does not support a simple bifurcation of the dialects in this region.

Aguadé's (2018) article in the Holes volume has done a monumental job of listing all the claimed isoglosses between pre-Hilalian and Hilalian dialects, which previously were largely scattered across dozens of publications. This allows us, however, to note how contradictory many of these isoglosses are, often crossing the supposed boundary between the two layers. For example, the phonemes /b/ and /m/ replace each other in both types of dialects; interdentals are generally merged with dentals in pre-Hilalian dialects, but many exceptions exist, even in dialects like Cherchell that are traditionally considered key examples of pre-Hilalian dialects (p. 44). Short vowels are lost in open syllables in both types of dialects in Morocco (p. 47), though this is an urban feature further east. Monophthongization of historical diphthongs similarly crosses the boundary (p. 48). Unconditioned *imāla* cuts across all Maghrebi dialects (p. 49). The classic *niktub*-*niktubu* isogloss characterizes all North African dialects, while the loss (or retention) of gender distinctions has occurred in both Bedouin and sedentary dialects (p. 55). While there are indeed some remaining isoglosses which do distinguish these groups, for example the realization of (Q) and greater use of analytic genitives in sedentary dialects, it remains quite possible that these are later than the alleged migrations (see also below in this section).

Research outside of the traditionally highlighted isoglosses again supports a subgrouping of North African dialects which cuts across the divide. Magidow (forthcoming) analyzes the personal pronouns, demonstratives, and interrogatives in over 80 Arabic dialects to find isoglosses which can be used to classify the entire Arabic-speaking region. In contrast to many classification schemes, where the historical model often drives the selection of isoglosses, the isoglosses in this study were derived directly from the linguistic data without reference to a historical model. Isoglosses in this study are only those in which a clear innovation has occurred, so retentive features (e.g., retention of the interdentals) that are so often mentioned in the dialectology literature are excluded, except for comparison with traditional classifications.

One striking result from this analysis is that North African dialects, regardless of where they fall on the Hilalian divide, tend to cluster quite strongly. The study identified 9 isoglosses that were much more strongly represented in North African dialects than in the mainstream of Arabic dialects. These are listed in Table 1, which shows the isoglosses, their prevalence among the dialect sample as a whole, their prevalence in North African dialects, and their prevalence in dialects which have voiceless and voiced realizations of the variable (Q), used to distinguish between putatively pre-Hilalian and Hilalian dialects.<sup>14</sup>


**Table 1.** Innovations found primarily in North African dialects, compared with other dialects and divided according to "Bedouin" or "sedentary" features.

<sup>1</sup> Note that this sample group here includes Maltese as well as Andalusian Arabic as they are often included with North African dialects in the literature.

It is clear from Table 1 not only that North African dialects are strongly linked by the isoglosses, but also that these innovations cut across the Hilalian divide. Many of the features are common in both sets of dialects, even as they are rare elsewhere. There is a great deal more homogeneity, and a larger number of isoglosses which distinguish this group, in contrast to the few other dialect groupings found in Magidow (forthcoming). The "Peninsular Bedouin" group, found in the western Arabian peninsula, is only distinguished by 3–5 isoglosses, while the "Sedentary Levantine" group has 8 isoglosses.

This is not to say that North African dialects do not show some differentiation along the lines of the supposed Hilalian divide, but this tends to be the exception in the sample and data here rather than the rule. The *ʔinti:na* innovation appears to characterize only dialects which fit the pre-Hilalian mold: none have voiced Q, and all have merged interdentals. These 5 dialects (Anjra, Morocco; Larache, Morocco; Fez, Morocco; Tlemcen, Algeria; Djidjelli, Algeria)<sup>15</sup> share almost all of the major North African isoglosses, though some lack -*ayya* suffixes, *ʔašku:n* or *waqta:š*. They also share another isogloss, a 'when' form based on \**fi: ʔay waqt*.

The 'where' interrogative *fayn* is never found in dialects with voiced Q, and most dialects (75%) with that form have merged interdentals. From the other side, only voiced Q dialects have *-a(h)* in the 3ms suffix pronouns. Interrogatives for 'when' derived from *ʔayy mata:* are primarily found in the voiced Q group, but this is also found in two putatively pre-Hilalian dialects, namely Marrakesh and the Jewish dialect of Tripoli, Libya.

Figure 1 is a heatmap of the number of these features, showing all dialects with two or more features from Table 1. The size of the circles corresponds to the number of isoglosses from the table found in each dialect. Overlaid on this are symbols for whether the dialects have merged the interdentals with the dental consonants (taken typically as an indication of 'sedentary' dialects) or if they have a voiced realization of (Q). What is evident from this figure is that these features are found in both sedentary and Bedouin dialects. The number of features appears to form an east–west cline, rather than a division between Bedouin and sedentary dialects. As one moves west, one tends to find more of these features, while further east there are fewer, to the point where Benghazi's dialect shows none of these isoglosses at all. There are of course some dialects outside of North Africa which have some of these isoglosses as well, though not in the areas of Arabia often held to be the source of the North African dialects (Magidow 2013, p. 236 shows the presumed homeland of the Banū Hilāl and Sulaym as western Arabia).

**Figure 1.** Map of dialects which show "North African" features. Dialects having at least 2 such features are marked with a yellow circle sized in proportion to the number of innovations found in each dialect. Overlaid on this are markers of either merged interdentals or voiced (Q) reflexes.

Evidence of an east–west cline, rather than a sedentary-Bedouin distinction, is found elsewhere in Aguadé (2018). The affrication of /\*t/ > /t*s*/ is only found in Morocco and Algeria, not further east (p. 44). The indefinite articles *wa:hid* and *ši:* are used only in Morocco and Western Algeria, while *fard* is common further east in Tunisia and Libya (p. 50). The supposedly ancient genitive particles derived from *\*mata:ʕ* are most widespread from North-Eastern Morocco into Libya, while the genitive particles similar to *diya:l* are concentrated further west.

The Hilalian "little bang" narrative does not provide a strong model here for understanding the data. Instead, the data show a remarkable unity among the North African dialects on both sides of the Hilalian divide, while the many contradictions in the traditional isoglosses found for North African dialects strongly undermine the narrative. If the Hilalian narrative were correct, we would expect a clear bifurcation on an essentially north–south axis, between sedentary pre-Hilalian dialects on the coasts and rural/Bedouin dialects further south. The fact that many of the features work on an east–west cline strongly suggests that North Africa is little different from any other region of the Arabic-speaking world, with a gradual process of migration into the region. Dialects coming from further east, which presumably lacked many of the key North African features (but not all, given the eastern dialects shown with yellow in Figure 1), would acquire those features. Distance to the west should also act as a reasonable proxy for time spent in the region, as it takes time to migrate and settle, which would explain the cline going from fewer features to more the further west the dialects are found.<sup>16</sup>

This section provides another piece of the explanation for Holes' observation about the similarity of Gulf and Algerian Bedouin dialects. The existing big-bang and little-bang models cloud our view of these dialects, forcing us to assume that they are separated by nearly a thousand years of history, with the Algerian Bedouin dialects being the direct descendants of the Hilalian dialects of the 11th to 14th centuries. Instead, it seems quite likely that the movement of those dialects into that area was much later, which would explain the easy mutual intelligibility he experienced. Moving away from that historical model allows us to instead apply Occam's razor and propose that many migrations proceeded into (and out of) North Africa even after the supposed Hilalian period.

#### **4. Conservative Dialects and Early Origins**

There is another common idea based on apparent linguistic conservatism that combines both the "conservativeness as archaic" and the "big-bang" model. The idea is that dialects which are more conservative are seen as likely candidates for genealogical ancestors of less conservative dialects. This was, of course, the basic premise behind the long-held view that Classical Arabic is the direct ancestor of the modern dialects, which has largely fallen out of fashion as evidence has begun to more clearly show that Old Arabic deviated in significant ways from Classical Arabic (Al-Jallad 2017).

However, there still tends to be a belief that the modern Yemeni dialects reflect, due to their archaism, a kind of ancestor to most modern dialects. This cannot be entirely separated from the big-bang hypothesis, nor can it be separated from the idea that conservative dialects have genetic priority. The Yemeni dialects are highly archaic in a number of ways, from phonology to vocabulary, and many of the features in Yemeni dialects (as a whole perhaps more than individually) appear quite similar to canonical Classical Arabic. For example, the interrogatives are primarily of the *mā* variety, rather than the forms derived from *\*ayy šay* which are common in both later Classical/Middle Arabic and most dialects.

This apparent linguistic conservatism is coupled with a strong historical tradition that holds that Yemen, and Yemeni tribes, were the origins of many of the Islamic armies. In combination with the big-bang approach, this means that these Yemeni dialects are often held to be the immediate proximate ancestors of modern Arabic dialects. For example, Map 3.2 in Behnstedt and Woidich (2018) depicts "spring pastures of Yemeni tribes in immediate post-conquest Egypt," while references to that Yemeni origin and influence abound in their article. Elsewhere we see, for example, Yemeni origins posited for the Ma'qil tribes that are seen as ancestors of the speakers of the Hassaniya dialect (Taine-Cheikh 2006, p. 301), a narrative repeated by Watson (2018, n. 9). Outside of this volume, one can witness the attempt to link Yemeni to Andalusi Arabic (Corriente 2014), though that argument has been criticized (van Putten 2017b).

Beyond the historical reasons for seeing this link, there is also a deeper misinterpretation, again about the significance of "old" features. It rests on the notion that archaic features, or dialects with many retentions versus innovations, are older, and that older dialects must necessarily be ancestors to newer dialects. However, the reality is that the Yemeni dialects (and conservative dialects more generally) are striking precisely in how much they *differ* from other dialects, however conservative they might be. Though this means they could be an ultimate ancestor (or related to an ultimate ancestor) of other modern dialects, there almost certainly exist more proximate ancestors.

A wide-scale comparison of existing Arabic dialects largely confirms this. Magidow (in preparation) identified 55 isoglosses that represent innovations with regard to Proto-Arabic, and then identified isoglosses present in at least 60% of all sampled dialects. These are shown in Table 2. The geographic distribution of these groups is shown in Figure 2.
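
The thresholding step described above can be illustrated with a short Python sketch. It is not the procedure used in the study: the dialect names, isogloss labels, and toy data are all hypothetical, and only the 60% cutoff is taken from the text.

```python
from typing import Dict, Set


def core_isoglosses(
    features: Dict[str, Set[str]], threshold: float = 0.60
) -> Dict[str, float]:
    """Return each isogloss attested in at least `threshold` of the dialects,
    mapped to its prevalence (share of dialects that show it)."""
    n = len(features)
    counts: Dict[str, int] = {}
    for isoglosses in features.values():
        for iso in isoglosses:
            counts[iso] = counts.get(iso, 0) + 1
    return {iso: c / n for iso, c in counts.items() if c / n >= threshold}


# Hypothetical toy sample: dialect -> set of innovations it shows.
sample = {
    "dialect_A": {"iso_1", "iso_2"},
    "dialect_B": {"iso_1", "iso_2", "iso_3"},
    "dialect_C": {"iso_1"},
    "dialect_D": {"iso_1", "iso_2"},
    "dialect_E": {"iso_1", "iso_2", "iso_3"},
}

core = core_isoglosses(sample)

# Dialects that lack at least one of the core innovations.
outliers = [d for d, isos in sample.items() if not set(core) <= isos]
```

In this toy sample, the dialects collected in `outliers` play the role that the conservative dialects play in the discussion: they are defined by what they lack, not by any innovation they share.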


**Table 2.** Core innovations found in the majority of modern Arabic dialects (*n* = 80).

What is notable about the distribution of these features is that, as would be expected from the literature, the Yemeni dialects are the only dialects to fall outside of the 90% core. That is to say, the vast majority of Arabic dialects have participated in those two major innovations, while it is primarily the Northern Tihama dialects (Behnstedt 2016, p. 191) that have failed to partake in these innovations. Though not necessarily in the same group of dialects per Behnstedt's classification, there are further archaisms found only in Yemen that are absent from most modern dialects. Virtually every modern Arabic dialect has, through analogy with the 1cp suffix *-na:* (both pronominal and verbal), innovated a form of the 1cp independent pronoun from \**niḥnu* to *niḥna* or similar. The primary exceptions are in Yemen (points 153 and 154, just outside of Ta'izz, not far from the point 145 included in Figure 2), while Reinhardt (1894, pp. 21–22) transcribes *honu:* and *naḥnu* for some Omani dialects.

**Figure 2.** Map of "core" dialects, with dialects not sharing the 90% core features shown as squares.

The implication of this data, when freed from the "archaic as ancestor" narrative, is that it is quite unlikely that these Yemeni dialects are the most recent node on a genealogical tree from which all other dialects developed. It is much more likely that most modern dialects derive from a dialect which innovated these core features, while the Yemeni dialects simply did not participate in those changes. The conservatism of these Yemeni dialects and their linguistic features provide a convenient window into the linguistic past of Arabic; however, they do not imply a linguistic ancestry in a historical sense.

Given how widely distributed across space (and time, given their presence in Andalusian Arabic) the dialects with 90% core features are, it would seem reasonable to assume that a single ancestor developed those features before becoming more widely distributed. This dialect, of course, would be unlikely to have been in Yemen, given that its innovations have not fully suffused the area, in contrast to the rest of the Arabic-speaking world. In the "big-bang" viewpoint, this would be the "pre-diasporic" Arabic, the variety that was spoken by the conquering Arab armies in the early Islamic conquests.

However, as detailed previously, we simply cannot be entirely sure how ancient or new any dialectal features are with respect to their distribution in space. There has clearly been a huge amount of dialect movement and change, and many waves of diffusion which have brought specific linguistic features across the Arabic-speaking world. The example of the verb *ša:f* 'he saw' is illustrative. The verb *ša:f* has an incredibly wide distribution across the Arab world, found in virtually every Arabic-speaking region today. While this diffusion is often believed to be quite early (Ferguson 1959), many dialects still have other verbs or biforms which appear to be diminishing only in recent times (Behnstedt and Woidich 2011, pp. 330–37). It is hard to determine the earliest date at which this verb came into use: we find an example of the causative *ywšwfwk* 'they will show you' in a papyrus datable as early as ca. 1000–1100 CE.<sup>17</sup> Though attested relatively early, the modern widespread distribution of this form often appears to be quite recent. Cowan (1966) argues that the diffusion of *ša:f* must have taken place between the twelfth and sixteenth centuries, based on the places where it is not found (Malta, 15th century Andalusia, and Cyprus). However, in much of North Africa this lexeme's uptake is clearly quite recent. It is generally not found in Judeo-Arabic varieties in North Africa, while Aguadé (2018, p. 57) notes that in Djidjelli as of the 1950s, *ša:f* was a recent loanword, while in Anjra in the 2000s there was a generational divide between users of *ra* and *ša:f*. Blau (1977, p. 200) notes a similar situation in the Tunisian dialect of Marazig, where *ša:f* was only widespread among men. Behnstedt and Woidich (2011, p. 333) note many places, in both eastern and western dialects, where there are biforms, often with *ša:f* clearly encroaching on older local forms (mostly *\*ra:* reflexes, but see their maps for many other previously common verbs with this meaning).

The implication here is that the diffusion of the "core" isoglosses could be either genetic or areal. That is, the diffusion of these features could be an early phenomenon, in a common ancestor of modern dialects, or, like *ša:f*, they could owe their current distribution to later waves of diffusion. Particularly in the latter case, it would not be surprising that this diffusion would not have made it to Yemen, a peripheral region with difficult terrain, often not included in a meaningful way in the major Arabic-speaking empires. However, in either perspective, while Yemen may have preserved earlier states of the language, Yemeni dialects are at best a great-grandparent or great-aunt to the bulk of modern dialects, but hardly a direct parent.

#### **5. From History to Heuristic: An Alternative Approach**

There are several dangers when it comes to using a historical approach for analyzing the macro-history of Arabic dialects. The first is simply that any attempts to reconstruct the history of the Arabic-speaking world are wildly ambitious. The Arabic-speaking world is vast, both in its present reach and historical extent. With Arabic spoken from Mauritania to Afghanistan to Zanzibar, and Arabic inscriptions already attested early in the first millennium, centuries before Islam, any attempt to even scratch the surface of this history will necessarily be extremely superficial. Most historians would be hesitant to even attempt a complete history of the population movements in a region, let alone the entire Arabic-speaking world, especially in the current environment where *longue durée* and macrohistory have become somewhat rare, or are treated at a more popular than academic level. By way of example, Hugh Kennedy's *The Great Arab Conquests* runs to nearly 500 pages, but covers less than two centuries. Expecting a dialectologist not trained in history to be able to make an original contribution and synthesis of the historical literature is demanding a great deal.

The second is that established histories can become quite difficult to rethink and question once they have become integrated into the dialectological literature. We have seen this already in the big-bang approaches to Arabic, but the little-bang approach of North African history bears additional scrutiny. The historical model established by Marçais (1938) has dominated the dialectology of North Africa, and researchers remain strongly committed to it. Benkato (2019) has documented this extensively (pp. 16–18 especially), but one is still amazed to find statements in Aguadé (2018) such as "it is this role as a junction that, according to William Marcais, made Qayrawan the origin of the spread of all pre-Hilalian Maghrebi dialects, and *there is no reason not to accept this assumption* [emphasis added] (p. 39)" or "the question of whether this description [of Banu Hilal causing destruction] (which was widely embraced by the French colonial ideology in the nineteenth and twentieth centuries) actually reflects historical facts *need not be discussed here* [emphasis added] (p. 42)." The dominant narrative is unquestioned, even as it makes it difficult to account for the many contradictions in the data covered in that chapter, as discussed previously.

Finally, it is not clear that linguists are always able to do high-quality historical work, again because we are largely not trained to do so. A common concept in Arabic dialectology is that of *sprachinseln*, dialect areas which, by virtue of being cut off from the mainstream of the Arabic language, can allow us to date subsequent changes. However, our historical models for these sprachinseln are highly simplistic. One of the most commonly discussed situations is that of Malta, for which Holes (2018a, p. 18) gives a typical interpretation: "a good example [of a *sprachinsel*] is Malta, where a variant of Siculo-Arabic was spoken until the end of the eleventh century, after which all contact with Arabic-speaking communities ceased." This is echoed by Aguadé (2018, p. 34) where he states that "Maltese is an important source as *terminus comparationis* since the island of Malta was conquered by the Normans [ ... ] in the year 1090," the value of which is that "Maltese represents an archaic pre-Hilali dialect which evolved uncontaminated by later Hilali interferences."

However, even a slightly deeper look into the history of Malta suggests that reports of this "cutting off" are greatly exaggerated. The conquest of Malta did indeed bring it under the rule of the Kingdom of Sicily in 1090, but that Kingdom also included Northern Tunisia, and further conquest included Tripoli from 1146 to 1158, and coastal Tunisia from 1134 to 1148. The Normans did not appear to meaningfully occupy Malta at that time. Rather, Malta was a tributary until 1127 when it was conquered "to use as a transit point for trade," clearly including trade with Tunisia (Joffé 1990, p. 68). Luttrell (1975, pp. 31–32) reports a Pisan captain, apparently to combat piracy, seizing a Tunisian ship at Malta with the goods aboard it, and throwing the crew into the sea. The island of Pantelleria, 200 km NW of Malta, was even included in an arrangement by which half its tribute went to Tunis (Luttrell 1975). This trade with North Africa clearly continued: under the Aragonese crown starting from 1283, "Malta's real usefulness was not as a market or a source of raw materials but as an entrepot and safe harbor on the routes to Beirut and Alexandria, and above all to Tunis and other African ports" (Luttrell 1965, p. 6), where slaves, wood, and cotton all appear to have been traded through Malta. Trade contacts almost certainly would have also resulted in continuing linguistic contact.

Cyprus is similarly held to have been cut off, with Borg (2006, pp. 536–37) stating "the sociocultural parallels between Cypriot Arabic and Maltese is particularly close ... since in both cases, we are dealing with an Arabic vernacular surviving in complete isolation," referring presumably to the Crusader conquest of the island beginning in the twelfth century.<sup>18</sup> This vision of a "cut-off" in Cypriot Arabic leads Tsiapera (1969, p. 11) to speak of a dialect "isolated for some six centuries from other Arabic speakers." However, here again, there is no clear cut-off, with both migration and trade clearly continuing through both the crusader and Ottoman eras. Movements of Christian refugees from the Levant continued at least through the 13th century (Borg 2004, p. 8), with some authors suggesting migrations even into the Ottoman era in the 16th century (Hourani 1998). As in Malta, trade with the Levantine coast must have continued in both the crusader and Ottoman eras (Borg 2004, p. 10).

The problem here is that linguists, not trained as historians, are making two basic mistakes in their historical research. The first is simply not diving deep enough in their research on these dialects—there is notably almost never a reference cited for the "cutting off" of Malta (none is cited in Holes or Aguadé's assertions above), and little further research appears to have been carried out in most such accounts. This is somewhat reasonable, as these authors are dialectologists, not historians, and cannot be expected to be deeply immersed in the historical literature. The second is the overly facile equation of a change in political rule or religious affiliation with a wholesale change in a group's personal and trade relationships. This again seems to be an issue of not having been trained in the surprising mobility and importance of trade that was common in the pre-modern world. Thus, while dialectologists are not 'to blame' per se for perpetuating historically inaccurate models, these models can significantly impact how we interpret the data from our dialectological research.

One solution to this problem is to pursue more focused, microhistorical work on particular regions. Palva (2009) is a paradigmatic example of this, weaving highly detailed dialectology work with a more sophisticated view of history than we sometimes encounter. He rightly questions the tradition that 1258 marks the decline of Baghdad (see for an example of that idea Fischer (2006), which uses 1258 as the dividing line for "post-Classical Arabic"), and instead notes that the decline of the city has been documented as beginning much earlier. He also correctly identifies that the city may have had cycles of depopulation and repopulation. This kind of highly focused work (in this case, on a single city) allows for a much deeper level of research into the history of the area. Another suggestion would be for linguists to reach across departmental and disciplinary lines and work with historians specializing in the area to bring a new perspective to their work, and certainly linguists and dialectologists have their own contributions to make to historical research.

The microhistorical approach, however, does not work as well when dealing with large areas or finding general trends. Here, the danger of missing large chunks of history will always rear its head, and it can be challenging to dismantle the existing historical narratives dominant in the field. For this reason, I suggest instead a general heuristic that draws on general principles, rather than specific historical narratives, to suggest the kinds of linguistic movements that we should expect to see over time and which can be applied as a basic test of whether a historical narrative is plausible.

This heuristic is based on the observation that the movement of linguistic features across the landscape divides into primarily two types. The first is the diffusion of features without a major change in the distribution of populations within a space—that is, the speakers remain in situ, while the linguistic feature itself moves across the landscape. I refer to this simply as "diffusion". The second is the movement of populations, such that a group moves into an area, bringing the linguistic features of that group with it. I refer to this as "migration." The speakers are not changing their linguistic behavior, but the linguistic geography of the area changes by virtue of their movement.

Diffusion is by far the most common way a linguistic form moves across the landscape in areas with high population density. This diffusion is generally not simply geographical, diffusing outward to the nearest geographical point, but instead is often hierarchical, with features moving between areas with similar population densities even if they are physically distant, and only later diffusing to areas that are geographically adjacent but with lower population density (Britain 2008). Major movements by sedentary groups are rare, the result of major economic or political changes (e.g., industrialization, urbanization, warfare) and as discussed earlier require quite large changes in the total population for a group to supplant the linguistic behavior of a high population group via migration (Miller 2004; Palva 2009). Less populated areas certainly experience diffusion as well, though being at the "bottom" of hierarchical diffusion they tend to diffuse features from the "bottom up," that is features tend to diffuse between areas of similar population density (Wikle and Bailey 1996).

In contrast, for areas less conducive to intensive settlement, migration is much more common as a driver of change in dialect geography. The lower carrying capacity of the land means that population densities are lower to begin with, often requiring nomadism to maximize resources, ensuring frequent movement. Therefore, on a given point of land in such areas, the inhabitants are both more likely to move of their own accord, and if another group moves into that area, are more likely to be overwhelmed by the power of numbers. The basic strategy of nomadism means that it is easy for a group speaking a particular linguistic variety to move out of a given territory or into new territory, perhaps due to a single or several seasons of bad weather. Nomadic groups are well prepared for mobility, physically (in terms of their possessions) and culturally (in terms of having the necessary knowledge to survive). Sedentary populations are less likely to move. Even when they do, they have larger populations and thus are more likely to leave behind enough members of the group to maintain the presence of their linguistic variety in their original location. Depopulating a city requires a catastrophe, while for a nomadic group, leaving a given area is a routine seasonal event. For a nomadic group to seize the territory of another nomadic group requires a relatively small migration of people and could radically change the linguistic behavior of an area from a dialectologist's perspective. On the other hand, nomadic groups cannot generally move into densely populated sedentary areas and impose their linguistic behavior on that area. The nomadic group would need to first win a military campaign against a numerically superior group, and then to maintain and impose their language on a still numerically superior group.<sup>19</sup>

Therefore, in this model, densely populated areas are treated as a barrier to migration as a means of language change. For a migrating group to have a linguistic impact, absent a depopulating catastrophe, it must migrate within marginal lands that support similarly small groups. Of course, there is a limit to the kind of terrain that can support meaningful numbers of people. Life in the Sahara and Empty Quarter requires such specialized skills for survival that it might be hard to supplant the small number of speakers in those regions easily. This means that there are narrow corridors between the areas where population density is too high for a migrating group to have an impact linguistically (and where there would be potential resource competition with settled people who were actively using more fertile areas), and the areas that are simply too difficult for a newly arrived group to inhabit. This model is especially important for the MENA region since those corridors are a more common component of the landscape, as vast areas of the Arab world are poorly suited for intensive settlement. The share of agricultural land in 1961 in the Arab world was approximately 25%, compared to 56% in the Euro area.<sup>20</sup> Nomadic groups have played a much more significant role in the development of Arabic than in the modern development of European languages, for example, and so our heuristics for the development of Arabic must take this into account.

This model therefore suggests that there are two main corridors of movement of linguistic features over space and time in the Arabic-speaking world. For sedentary populations, movement of linguistic material will likely happen between densely populated areas by way of diffusion of linguistic features without permanent population movement. For nomadic populations, movement would occur through marginally inhabited spaces, and it would be the physical movement of populations that results in a particular distribution of linguistic features in space. Densely populated areas would act as a barrier to the diffusion of linguistic features associated with nomadic speakers of the language. These corridors allow us to anticipate particular kinds of linguistic change over space and time without a need for reference to historical events.

Population density is fundamental to this model, but since we have no access to historical population data, we need a proxy variable. Rainfall provides a good proxy, since agricultural production is roughly proportional to total rainfall, while nomadic pastoralism in the Middle East exploits low-rainfall areas through seasonal migration and use of hardy livestock such as camels. To establish that this proxy variable works well, Magidow (forthcoming) correlated data about the realization of the Q variable in a sample of 88 geo-located dialects against average (modern) rainfall totals. Comparing whether dialects have the voiced realization of (Q) to the average amount of rainfall produces a correlation of 0.50 (*p* < 0.001), explaining nearly 25% of the variation between dialects, a reasonable result for a very rough and simplistic comparison.<sup>21</sup>
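The link between the reported correlation and the share of variation explained can be made concrete. The sketch below is illustrative only: it assumes the comparison is a Pearson (point-biserial) correlation between a binary voiced-(Q) indicator and rainfall, which the text does not state explicitly, and the data points are invented toy values rather than the 88-dialect sample. The function names are likewise hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def variance_explained(r):
    """Share of variation accounted for by a linear relationship (r squared)."""
    return r * r


# Invented toy data (NOT from the study): annual rainfall in mm, and whether
# the dialect at that location has a voiced reflex of (Q) (1 = yes, 0 = no).
rainfall_mm = [40, 80, 120, 150, 300, 450, 600, 800]
voiced_q = [1, 1, 1, 0, 1, 0, 0, 0]

r = pearson_r(rainfall_mm, voiced_q)
print(f"toy r = {r:.2f}, toy r^2 = {variance_explained(r):.2f}")

# The figure reported in the text: r = 0.50 gives r^2 = 0.25, i.e. about 25%.
print(f"reported r^2 = {variance_explained(0.50):.2f}")
```

The sign of the coefficient depends only on how the binary variable is coded (voiced as 1 or as 0); the share of variation explained is the same either way.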

Another variable to consider is elevation. Different types of nomadism require different subsistence strategies. One major division is between lowland subsistence (based on movement across large distances) and elevation subsistence (based on movement up and down in elevation) (Barfield 1993; Barth 1961; Donner 1989). It is notable that in many areas where the Arabs conquered, the 1000 m elevation line represents the limits of Arabicization, from Andalusia to North Africa to the Iraq-Iran border, though Yemen constitutes something of an exception (Donner 1989; Kennedy 2007, pp. 435, 438; Magidow 2013). Figure 3 therefore combines both rainfall and elevation in an attempt to illustrate the migration corridors for linguistic features. The black areas represent either those areas with too much rain (or river systems likely to support agriculture) or which are at too high an elevation (1000+ m).<sup>22</sup> The light grey areas on the map have extremely low rainfall of 50 mm/year or less, and are very difficult to inhabit even for nomadic settlement. The white areas are therefore the "Bedouin corridors." These are the areas where we would expect movement of Bedouin groups, while we expect less movement, and slower movement, elsewhere.<sup>23</sup>
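The three map categories just described amount to a simple decision rule per location, which can be sketched as follows. Only the 1000 m elevation limit and the 50 mm/year floor come from the text; the upper rainfall cut-off (`MAX_RAINFALL_MM`), the function name, and the order in which the tests are applied are assumptions made purely for illustration.

```python
# Thresholds stated in the text.
MAX_ELEVATION_M = 1000   # at or above this, the area counts as a barrier
MIN_RAINFALL_MM = 50     # at or below this, the area is too dry even for nomads

# HYPOTHETICAL value: the text says only "too much rain" without a figure.
MAX_RAINFALL_MM = 250


def classify_area(rainfall_mm, elevation_m, has_river=False,
                  max_rainfall_mm=MAX_RAINFALL_MM):
    """Return the map category for one location.

    "barrier"  -> black on the map (wet, riverine, or high ground)
    "too_dry"  -> light grey (extremely low rainfall)
    "corridor" -> white (the Bedouin corridor)

    Assumption: barrier conditions are checked first, so a very dry
    location above 1000 m is reported as a barrier.
    """
    if elevation_m >= MAX_ELEVATION_M or has_river or rainfall_mm > max_rainfall_mm:
        return "barrier"
    if rainfall_mm <= MIN_RAINFALL_MM:
        return "too_dry"
    return "corridor"


# Invented example locations (rainfall in mm/year, elevation in m).
examples = {
    "dry steppe": (150, 300, False),
    "deep desert": (30, 200, False),
    "highland plateau": (150, 1500, False),
    "river valley": (100, 100, True),
}
for name, (rain, elev, river) in examples.items():
    print(f"{name}: {classify_area(rain, elev, river)}")
```

Applied cell by cell to gridded rainfall and elevation data, a rule of this kind would produce a three-colour map of the sort shown in Figure 3, although the actual procedure used for the figure may differ.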

Overlaid on the map in Figure 3 are realizations of the proxy variable most strongly associated with Bedouin dialects, use of a voiced reflex of the Q variable. As expected, voiced Q reflexes are primarily found within the Bedouin corridors. The most common exception is in Yemen, which is a geographical and linguistic area noted for being dialectologically unusual. This suggests that in general, the notion of the Bedouin corridor is a good proxy for how nomads help diffuse linguistic features, without a need to reference a particular model of the history of Arabic.

**Figure 3.** Map of Bedouin corridors (white) contrasted with barriers to Bedouin movement due to rainfall or elevation (black), or extremely low rainfall (grey), overlaid with proxy variables for sedentary vs. Bedouin dialects.

This model also makes meaningful predictions. Among other things, it removes the need to treat the Bedouin-sedentary difference as an essentialized difference with little explanation. Instead, this distinction falls out naturally from the basic idea that there are two different manners in which linguistic features move. Should the speakers of a nomadic dialect settle in a populated area, that dialect would stop being subject to migration as a driver of linguistic change, and become more influenced by diffusion from more densely populated areas. The similarities between sedentary dialects are therefore due primarily to linguistic diffusion between them, rather than to a single process of koineization (or perhaps even to a shared early ancestor). The similarities between Bedouin dialects are similarly a result of their contact with one another, and likely due to rapid movements which erase diversity rather quickly and present an illusion of homogeneity across broad spaces.

This model also informs our historical understanding. In North Africa, we expect to find significant nomadic movement along the coast as far as southern Tunisia. After that, we would expect most of the areas of the coast to be relatively more difficult for nomadic groups to penetrate, and that this would be an area of sedentary in situ linguistic diffusion.<sup>24</sup> This replicates relatively well the "pre-Hilalian" vs. "Hilalian" distinction, but without a need to rely on the flawed historical model. We expect the Sinai to be a major land-bridge, with frequent population movements, and that the dialects in that area probably do not represent extremely long-term settlement.<sup>25</sup> We expect the movement along the western coast of the Arabian Peninsula to be relatively easy, while the Hijaz mountains would likely divide the dialects to the west and east of that mountain range. We expect to find the desert between Syria and Iraq to act as an extension of the Arabian Peninsula in terms of dialect, but the crescent of sedentary areas from the Syrian coast through Anatolia and into Mesopotamia to form a barrier for further nomadic movement, and for linguistic features to diffuse through that sedentary corridor (Behnstedt 1990; Ingham 1982; Jastrow 1978, p. 78).

Another key prediction of the model is that the linguistic behavior of sparsely populated areas is unlikely to have significant time-depth: the features found in such a space are unlikely to persist there over long periods. The chance that the linguistic behavior in the Bedouin corridors has remained the same over millennia is quite low. Indeed, as discussed in Section 3, even the sedentary areas may show less time-depth for their linguistic features than is typically believed. This has serious implications for how dialectologists analyze the dialects of the Arabian Peninsula. In keeping with the 'conservative features imply antiquity in place' fallacy, we often find linguists assuming that Najdi Arabic represents an unbroken linguistic tradition in the region. Owens (2018, p. 211) claims that "Gulf and Najdi Arabic are spoken in the Arabian Peninsula and have been spoken in these areas since pre-Islamic times." Al-Jallad (2009) similarly states that "the Bedouin dialects of the southern Najd of course never left the Arabian Peninsula."<sup>26</sup> This seems highly unlikely. This is a region of very low population density, with well-documented migration out of the Arabian Peninsula into other areas. The vacuum caused by these migrations would almost certainly have changed the dialect landscape of the Peninsula itself. Ingham (1982, map 5) illustrates a very significant reshaping of the linguistic isoglosses in the Peninsula since the 17th century. It is simple to imagine that the thousand years between the Islamic conquests and the start of his map witnessed many similar disruptions to that dialectal map. Indeed, the main argument of Holes (2018b) is that the dialects of the Gulf reflect at least three major layers, one of which may date only as recently as the late eighteenth century. One wonders too about the Baharna layer that he uses as a key piece of evidence. It is in relatively rapid decline, and if traces of that dialect can virtually disappear within two or three centuries, how many layers of Arabic dialects may have been lost in the past millennia? This is not to argue strictly that there cannot be continuity here, but it is highly unlikely, and we would need the kind of microhistory mentioned previously to prove continuity or a lack thereof.

This idea that the dialects of the Arabian Peninsula are somehow original to the area based on their archaism is not entirely dissimilar from the argument about Yemeni dialects discussed previously. However, the heuristic here would predict very different histories for the two regions. The highlands of Yemen are relatively inaccessible, and have historically had higher populations than central Arabia, receiving greater rainfall including some from the monsoon. Our model would expect that linguistic change in Yemen would be relatively more difficult than in the Najd. This is not to say that we expect Yemeni dialects to be highly durable; it is notable, for example, that the least "non-core" Yemeni dialects discussed above are located on the Tihama, an area that should be a Bedouin corridor and should not have long-term durability. The geography of the region means that we should consider the possibility that these archaic dialects originated somewhere else, and only arrived relatively recently in Yemen. The model proposed here allows us to still derive observed differences between dialects while being able to explain them in a parsimonious manner based on general principles and using a more accurate analysis of conservative and innovative features.

#### **6. Conclusions**

This article began with an observation about dialect similarity: that the dialects of the Arabian Gulf and those of southern Algeria, two areas thousands of kilometers apart, were virtually mutually intelligible, while the dialects only a few hundred kilometers to the north were not. This paper has argued that this fact is surprising primarily because of the long-standing narratives in the field of Arabic dialectology, and if one steps away from those narratives, it becomes evident that this situation is both explicable and expected. This article argued that there is a strong tendency to essentialize the idea of linguistic conservatism and apply it more broadly to the linguistic groups which have conservative features. In a similar vein, the field of Arabic linguistics tends to assume the primacy of the earliest layer of settlement during the Islamic conquest, and that modern dialect geography is directly derived from these early settlements. In contrast, this article argued that it is highly unlikely that the earliest settlements successfully Arabicized the areas settled, and even if they had, it is unlikely that they were able to resist 1400 years of historical changes. The article similarly argued that the long-held idea that Yemeni dialects are ultimate ancestors of most modern Arabic dialects ignores the evidence for the existence of more proximate ancestors from which the dialects descend. Finally, the paper argued that given the limitations of historical models, it may be helpful to apply a heuristic approach based on sociolinguistics and geography to derive the existing observations found in Arabic dialectology.

This paper is meant to provide some suggestions to strengthen the work being carried out in this field and the following attempts to summarize these into concrete suggestions:


Finally, there are some larger implications for the field in this paper which will need to be explored elsewhere. If the Arabic dialects are, by and large, new to the areas they currently inhabit, and the linguistic features characteristic of each dialect are susceptible to adoption by different dialects as people, or linguistic features, move across the landscape, can Arabic dialects actually be analyzed from a traditional genealogical perspective? Is it meaningful to talk about a long-term persistence of a cluster of dialect features within the same social group? Or are the Arabic dialects reflective of *Wellentheorie* at its most fundamental: a set of human movements bringing features across a vast space, sharing them at times, and then bringing them elsewhere, or disappearing entirely? If a wave on a shoreline leaves behind a seashell, then the following wave moves that shell before depositing a new one, does it make sense to speak of any of those waves as the "origin" of the shore? Will our field be able to reconstruct the dialects that moved into North Africa, or Central Asia, or Andalusia, or the earliest dialects immediately following the Islamic conquests, or do we simply lack the fidelity and density of information needed to do so?

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


#### **References**


Behnstedt, Peter. 1997. *Sprachatlas von Syrien*. Wiesbaden: Harrassowitz.


Blanc, Haim. 1964. *Communal Dialects in Baghdad*. Cambridge: Harvard University Press.


Guerrero, Jairo. 2016. Notes on the Arabic Dialect of Larache (North-Western Morocco). In *Lisan Al-Arab: Studies in Contemporary Arabic Dialects. Proceedings of the 10th International Conference of AIDA. Qatar University 2013*. Edited by Muntasir F. Al-Hamad, Rizwan Ahmed and Hafid I. Aloui. Vienna: LIT Verlag, pp. 195–208.


Holes, Clive, ed. 2018a. *Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches*. Oxford: Oxford University Press.

Holes, Clive. 2018b. The Arabic Dialects of the Gulf: Aspects of Their Historical and Sociolinguistic Development. In *Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches*. Edited by Clive Holes. New York: Oxford University Press, pp. 112–47.

Hourani, Guita G. 1998. A Reading in the History of the Maronites of Cyprus From the Eighth Century to the Beginning of British Rule. *Journal of Maronite Studies* 2: 1–16.

Hoyland, Robert G. 2009. Arab Kings, Arab Tribes and the Beginnings of Arab Historical Memory in Late Roman Epigraphy. In *From Hellenism to Islam: Cultural and Linguistic Change in the Roman Near East*. Edited by Hannah M. Cotton, Robert G. Hoyland, Jonathan J. Price and David J. Wasserstein. Cambridge: Cambridge University Press, pp. 374–400.

Ingham, Bruce. 1982. *North East Arabian Dialects*. Boston: Kegan Paul International.

Jastrow, Otto. 1978. *Die Mesopotamisch-Arabischen Qəltu-Dialekte*. Wiesbaden: Kommissionsverlag Franz Steiner GMBH.

Jastrow, Otto. 2002. Arabic Dialectology: The State of the Art. *Israel Oriental Studies* 20: 347–64.

Joffé, E. G. H. 1990. Relations between Libya, Tunisia and Malta up to the British Occupation of Malta. *Libyan Studies* 21: 65–73.

Kennedy, Hugh. 2007. *The Great Arab Conquests: How the Spread of Islam Changed the World We Live in*. London: Weidenfeld and Nicolson.

Kossmann, Maarten G. 2013. *The Arabic Influence on Northern Berber*. Leiden: Brill.

Labov, William. 2001. *Principles of Linguistic Change, Volume 2: Social Factors*. Malden: Blackwell.

Lane, Edward. 1968. *An Arabic-English Lexicon*. Beirut: Librairie Du Liban Publishers.

Lentin, Jérôme. 2018. The Levant. In *Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches*. Edited by Clive Holes. New York: Oxford University Press, pp. 170–205.

Luttrell, Anthony T. 1975. Approaches to Medieval Malta. In *Approaches to Medieval Malta: Studies on Malta before the Knights*. Edited by Anthony T. Luttrell. London: British School at Rome, pp. 1–70.

Luttrell, Anthony T. 1965. Malta and the Aragonese Crown: 1282–1530. *Journal of the Faculty of Arts, University of Malta* 3: 1–9.

Magidow, Alexander. 2013. Towards a Sociohistorical Reconstruction of Pre-Islamic Arabic Dialect Diversity. Ph.D. dissertation, The University of Texas at Austin, Austin, TX, USA.

Magidow, Alexander. 2016. Diachronic Dialect Classification with Demonstratives. *Al-Arabiyya* 49: 91–115.


Marçais, Philippe. 1956. *Le Parler Arabe de Djidjelli, Nord Constantinois, Algérie*. Paris: Librairie d'Amérique et d'Orient.

Marçais, William. 1902. *Le Dialecte Arabe Parlé à Tlemcen: Grammaire, Textes et Glossaire*. Paris: Publications de L'École des Lettres d'Alger.

Marçais, William. 1938. Comment L'Afrique du Nord a Été Arabisée. *Annales de l'Institut d'Études Orientales d'Alger* 4: 171–92.

Miller, Ann M. 1986. The Origin of the Modern Arabic Sedentary Dialects: An Evaluation of Several Theories. *Al-Arabiyya* 19: 47–74.

Miller, Catherine. 2004. Variation and Change in Arabic Urban Vernaculars. In *Approaches to Arabic Dialects: Collection of Articles Presented to Manfred Woidich on the Occasion of His Sixtieth Birthday*. Edited by Martine Haak, Rudolf de Jong and Kees Versteegh. Leiden: Brill, pp. 117–205.

Miller, Catherine. 2005. Between Accommodation and Resistance: Upper Egyptian Migrants in Cairo. *Linguistics* 43: 903–56.


Ostler, Nicholas. 2005. *Empires of the Word: A Language History of the World*. London: HarperCollins.

Owens, Jonathan. 2003. Arabic Dialect History and Historical Linguistic Mythology. *Journal of the American Oriental Society* 123: 715–40.

Owens, Jonathan. 2006. *A Linguistic History of Arabic*. Oxford: Oxford University Press.


Reinhardt, Carl. 1894. *Ein Arabischer Dialekt Gesprochen in ʿOmān Und Zanzibar*. Stuttgart and Berlin: W. Spemann.

Rosenhouse, Judith. 2006. Bedouin Dialects. In *Encyclopedia of Arabic Language and Linguistics*. Leiden: Brill, vol. 1, pp. 259–69.

Stanford, James N., Thomas A. Leddy-Cecere, and Kenneth P. Baclawski Jr. 2012. Farewell to the Founders: Major Dialect Changes Along the East-West New England Border. *American Speech* 87: 126–69.

Stewart, Frank H. 2017. Bedouin Dialects of the Sinai: A Review Article. *Jerusalem Studies in Arabic and Islam* 44: 169–220.

Taine-Cheikh, Catherine. 2006. Ḥassāniyya Arabic. In *Encyclopedia of Arabic Language and Linguistics*, online ed. Edited by Kees Versteegh. Leiden: Brill, vol. 4, pp. 687–99.

Trudgill, Peter. 1986. *Dialects in Contact*. Oxford: Basil Blackwell.

Tsiapera, M. 1969. *A Descriptive Analysis of Cypriot Maronite Arabic*. Den Haag: Mouton.

van Putten, Marijn, and Adam Benkato. 2017. The Arabic Strata in Awjila Berber. In *Arabic in Context: Celebrating 400 Years of Arabic at Leiden University*. Edited by Ahmad Al-Jallad. Leiden: Brill, pp. 476–502.

van Putten, Marijn. 2017a. The Development of the Triphthongs in Quranic and Classical Arabic. *Arabian Epigraphic Notes* 3: 47–74.

van Putten, Marijn. 2017b. The Illusory Yemenite Connection of Andalusi Arabic. *Zeitschrift Für Arabische Linguistik* 66: 5.

Versteegh, Kees. 1984. *Pidginization and Creolization: The Case of Arabic*. Amsterdam: Benjamins.

Versteegh, Kees. 2004. Pidginization and Creolization Revisited: The Case of Arabic. In *Approaches to Arabic Dialects: A Collection of Articles Presented to Manfred Woidich on the Occasion of His Sixtieth Birthday*. Edited by Martine Haak, Rudolf Erik De Jong and Kees Versteegh. Leiden: Brill, pp. 343–57.

Versteegh, Kees. 2014. *The Arabic Language*, 2nd ed. Edinburgh: Edinburgh University Press.

Vicente, Ángeles. 2012. Sur la piste de l'arabe marocain dans quelques sources écrites anciennes (du XIIe au XVIe siècle). In *De Los Manuscritos Medievales a Internet: La Presencia del Árabe Vernáculo en las Fuentes Escritas*. Edited by Mohamed Meouak, Pablo Sánchez and Ángeles Vicente. Zaragoza: Universidad de Zaragoza, Área de Estudios Árabes e Islámicos, pp. 103–20.

Vicente, Ángeles. 2000. *El Dialecto Árabe de Anjra (Norte de Marruecos)*. Zaragoza: Universidad de Zaragoza.

Walters, Keith. 2006. Communal Dialects. In *Encyclopedia of Arabic Language and Linguistics*. Leiden: Brill, vol. 1, pp. 442–48.

Watson, Janet. 2018. South Arabian and Arabic Dialects. In *Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches*. Edited by Clive Holes. New York: Oxford University Press, pp. 316–34.

Wikle, Thomas, and Guy Bailey. 1996. The Spatial Diffusion of Linguistic Features in Oklahoma. *Proceedings of the Oklahoma Academy of Science* 77: 1–15.

Zelinsky, Wilbur. 1992. *The Cultural Geography of the United States: A Revised Edition*. Englewood Cliffs: Prentice Hall.

## *Article* **A Historical Reconstruction of Some Pronominal Suffixes in Modern Dialectal Arabic**

**Phillip W. Stokes**

Department of Modern Foreign Languages and Literatures, The University of Tennessee, Knoxville, TN 37996, USA; pstokes2@utk.edu

**Abstract:** The morphology of the pronominal suffixes in dialectal Arabic is of particular interest for scholars of the history of Arabic for two main reasons. First, multiple dialects attest suffixes that, from a comparative perspective, apparently retain final short vowels. The second and more complicated issue concerns the vowels which precede the suffixes in the dialects, which are thought to have been either case-inflecting or epenthetic. In this paper, I take up Jean Cantineau's "embarrassing question" of how to account for the development of the vowels of the pronominal suffixes. Based on data from dialectal *tanwīn* in modern dialects, and attestations from pre-modern texts as well, I will argue that the pre-suffix vowels did originate in case-inflecting vowels, but that no historical model heretofore proposed can satisfactorily account for how the various dialectal forms might have arisen. I identify two major historical developments and propose models for each. First, I suggest that dialects in which the pre-suffixal vowels harmonized with the suffix vowels developed via a process of harmonization across morpheme boundaries before the loss of final short vowels. For dialects in which one vowel is generalized, I argue that a post-stress neutralization took place, which led to a single vowel both before suffixes and before *tanwīn*. Finally, I rely on evidence from the behavior of the suffixes to argue that the final vowel of the 3fs suffix was originally long, but that those of the 3ms, 2ms, and 2fs were most likely short.

**Keywords:** Arabic dialects; historical dialectology; historical linguistics

#### **1. Introduction**

The morphology of the pronominal suffixes in dialectal Arabic is of particular interest for scholars of the history of Arabic for two main reasons. First, multiple dialects attest suffixes that, from a comparative perspective, apparently retain final short vowels: dialectal *daras* "he studied", < \**darasa* but *ʔabū-ki* < \**ʔabVV-ki*, "your (fsg) father." This is despite the fact that most assume a complete loss of final short vowels in the pre-modern ancestors of all Arabic varieties. The second, and more complicated, issue concerns the vowels which precede the suffixes in the dialects. In Classical Arabic (henceforth ClAr), this position was occupied by a short case vowel: *bāb+u*/*i*/*a*+*ka* "your door (nom/gen/acc)." In the modern dialects, which lack morphosyntactic case marking, these vowels vary: *bāb-Vk* "idem." Several attempts have been made to account for the development of these vowels from a functional case system (Diem 1991; Jastrow 1991; Blau 2006), though none has yet proved entirely satisfactory. The thorniest issue from a historical perspective is how to account for the functional shift of the pre-suffixal vowels, from morpho-syntactic inflection to their current forms, while also assuming the loss of final short vowels, which should have eliminated several of the vowels of the pronominal suffixes. Alternatively, several scholars have recently argued that the ancestors of the dialects came from varieties of Arabic that lacked case completely (Retsö 1994; Owens 2006, 2018). In this alternative scenario, the pre-suffixal vowels were epenthetic vowels inserted to resolve increasingly un-tolerated consonant clusters.

In this paper I argue in favor of the case vowel origin and propose reconstructions that address both issues: final vowel length and the nature and development of the pre-suffixal vowel. Following a discussion of the previous proposals and their deficiencies, I discuss both of these questions, with special focus on the problems posed by the third and second person singular forms. I will argue that data from remnants of case vowels before *tanwīn* in the dialects (dialectal *tanwīn*), as well as historical data in which case was retained longer before suffixes than elsewhere, can help understand the development of these vowels. The implications of these observations for the question of the historical length of the vowels of the suffixes will be discussed at length in Section 3. The ultimate goal of these reconstructions is to gain deeper historical insight into the most common surface forms attested across the spectrum of Arabic dialects, as well as to connect them together in the broader picture of Arabic linguistic history.

**Citation:** Stokes, Phillip W. 2021. A Historical Reconstruction of Some Pronominal Suffixes in Modern Dialectal Arabic. *Languages* 6: 147. https://doi.org/10.3390/languages6030147

Academic Editors: Simone Bettega and Roberta Morano

Received: 24 June 2021 Accepted: 23 August 2021 Published: 31 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **2. Pre-Suffix Vowel: Case Vowel or Epenthesis?**

The pronominal suffix paradigms for the modern Arabic dialects do not inflect for case but nevertheless attest vowels before the pronominal suffixes. The following data (Table 1) serve as examples of the regular realizations of these suffixes, regardless of syntax (Fischer and Jastrow 1980, p. 81; Jordanian from Al-Wer (2007, p. 510)):


**Table 1.** Levantine and N. African Singular Pronominal Suffixes.

The origin and nature of the vowels preceding the pronominal suffixes in the modern dialects have become rather controversial. They have traditionally been interpreted as frozen reflexes of case vowels,<sup>1</sup> which eventually harmonized with the short vowels of the suffixes, e.g., nominative *u* was frozen based on harmony with 3ms \*hu, etc. (Birkeland 1952, pp. 12, 19; Cantineau 1936/1937, p. 180, vol. 2; Fischer and Jastrow 1980, p. 42; Diem 1991). To account for the vowel harmony, which would ostensibly be possible only after the breakdown of case, and thus presumably the loss of final short vowels, Blau (2006, pp. 87–88) appeals to the intentionality of speakers, who decided to avoid a homophony between masculine and feminine forms, which after the breakdown of case and loss of final short vowels should have both become \*\**bāb-k* "your (msg and fsg) door." Instead, they inserted an *a* vowel before the suffix for the masculine and an *i* to indicate the feminine. Blau cites a phenomenon in ClAr called *naql*, "transfer", wherein speakers would insert a vowel homophonous with the suffixed case vowel on nouns of the pattern CVCC: *al-bakru* (nom sing) > *al-bakuru*, etc. (ibid., p. 88). However, Blau does not address the biggest issue with his proposal. Given that he believes that a *naql* would have occurred only once a final cluster had been created, it is unclear how speakers would have retrieved the etymological *a* vowel to mark masculine or the *i* vowel for the feminine if they had been lost word-finally.

Diem (1991) offers a different explanation. He recognizes that if loss of case was caused, or at least accompanied by loss of final short vowels, then the vowels of at least some of the pronominal suffixes, like 3ms \*-*hu* and 2ms \*-*ka*, should have also been lost. If such was the case, however, then of course the final vowel, with which the case vowel would harmonize, would have been eliminated. Diem, still believing the initial vowels to be remnants of case vowels, argues that the breakdown of the case system was caused by syntactic factors, especially the redundant nature of the Semitic case system, and not the result of a phonetic loss of short final vowels. In Diem's scenario, case breakdown resulted initially in a state where the short case vowels were in free variation (ibid., pp. 301–3). Eventually, before pronominal suffixes they were harmonized with the suffix vowel and frozen; in word-final position (including, for Diem, before *tanwīn*), one case was levelled.<sup>2</sup>

This is the state of affairs in what Diem calls "nomadic dialects" (i.e., Bedouin dialects) (ibid., pp. 303–4). Most non-nomadic dialects eventually witnessed the total loss of these remnants due to their lack of any meaningful syntactic function.

This is an attractive explanation<sup>3</sup> because it solves the main objection to the traditional assumptions (see above regarding Blau's 2006 proposal), namely that case was lost due to loss of final short vowels. If case marking was reanalyzed as marked solely by word order etc., then these variants might have eventually been in free variation.<sup>4</sup> It should be noted here that Diem recognizes the loss of final short vowels at some point, but argues that the breakdown of case was not related to this loss (or, probably, multiple losses), and presumably preceded it.

Alternatively, Owens (2006, chp. 8) has argued that these initial vowels originated as epenthetic vowels, completely unrelated to case vowels, the quality of which varied depending on the following consonant/vowel: 3ms \**hu* > *u-hu* (addition of epenthetic vowel) > *uh* (loss of final short vowel).<sup>5</sup> Owens's arguments against reconstructing case in the ancestor of all Arabic varieties are unconvincing. There is thus no reason a priori to suggest these vowels cannot be remnants of case vowels. Still, the complete loss of case vowels could have led to their loss before pronominal suffixes as well, presumably in this case, per Blau (2006), via the levelling of pausal forms to non-pausal position, which in turn could have led speakers to insert epenthetic vowels to resolve these newly created consonant clusters. In other words, since Ø-marked forms were levelled to other contexts where they would be expected to be preserved if the loss of case were the result of a regular phonetic change (\*v > Ø/C\_#), such as the construct, then it is also possible that these Ø-marked forms were levelled to the position before pronominal suffixes.
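The two-step derivation attributed to Owens above (epenthesis first, then loss of the final short vowel) can be traced mechanically. The following is a toy sketch, not a claim about any scholar's formalism: the function names are hypothetical, the rule that the epenthetic vowel simply copies the suffix vowel is a simplifying assumption, and the 2ms and 2fs rows are assumed parallels to the 3ms example given in the text.

```python
import unicodedata

SHORT_VOWELS = set("aiu")
# Long vowels are normalized so each is a single character (NFC form).
VOWELS = SHORT_VOWELS | set(unicodedata.normalize("NFC", "āīū"))


def insert_epenthetic_vowel(stem, suffix):
    """Step 1: insert a vowel before the suffix, e.g. bāb + hu -> bābuhu.

    Simplifying assumption: the inserted vowel copies the suffix's own vowel.
    """
    stem = unicodedata.normalize("NFC", stem)
    suffix = unicodedata.normalize("NFC", suffix)
    copied = next(ch for ch in suffix if ch in SHORT_VOWELS)
    return stem + copied + suffix


def drop_final_short_vowel(form):
    """Step 2: the regular change *v > Ø / C_# (final short vowel after a consonant)."""
    form = unicodedata.normalize("NFC", form)
    if len(form) >= 2 and form[-1] in SHORT_VOWELS and form[-2] not in VOWELS:
        return form[:-1]
    return form


def derive(stem, suffix):
    """Apply the two steps in the order given in the text."""
    return drop_final_short_vowel(insert_epenthetic_vowel(stem, suffix))


for suffix in ("hu", "ka", "ki"):
    print(f"bāb + {suffix} -> {derive('bāb', suffix)}")
```

The ordering matters: if the final short vowel were dropped before any vowel was inserted, the suffix vowel would no longer be available to determine the quality of the pre-suffixal vowel, which is the same retrieval problem raised above for Blau's proposal.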

Deciding between these options is difficult due to the uneven knowledge of the contemporary Arabic dialects, and the virtual absence of historical data for the ancestors to the modern dialects. Nevertheless, a careful examination of the implications of these proposals yields meaningful insights. If we imagine that the vowels in question originated as case vowels, then we must account for speakers presumably sacrificing the morphosyntactic role of case vowels before pronominal suffixes in favor of a non-phonemic epenthetic function. Scholars have thus mostly assumed that this happened following the breakdown of case (cf. Diem 1991). Most likely, then, the loss of case marking on the un-suffixed nouns motivated the levelling of one form, as in the external masculine plural oblique form *-īn*. The presence of very similar suffix forms in Aramaic, and probably in Hebrew as well, provides attractive parallels (Table 2; Aramaic and Hebrew data taken from Hasselbach 2014, pp. 204–5):

**Table 2.** Reconstructed Aramaic (Syriac) and Hebrew Pronominal Suffixes.


<sup>1</sup> As indicated by the mater lectionis <Y> of the Aramaic form, the *i* vowel here was apparently long. Benjamin Suchard has suggested to me (p.c.) that this long *ī* was the result of \**ki* > *kī* based on contamination with the imperfect ending.

<sup>2</sup> The Tiberian forms are complex. In non-pausal position, the Tiberian form is *malkə-kā*, but pausally it is *malkékā*. The latter form suggests < \**malki-kah* (Suchard, p.c.), or possibly analogy with the feminine singular form (Suchard 2020, pp. 194–96).

There are several reasons to favor positing etymological case vowels as the origin of the pre-suffixal vowels instead of originally vowelless consonant clusters that were only later broken up via epenthesis. First and foremost is a methodological argument. We can confidently reconstruct case for both Proto-Semitic and Proto-Arabic based on comparative and internal data. A case system in which singular (and broken plural) nominal forms are inflected for three cases (nominative, genitive, and accusative), while duals and sound masculine plurals are inflected for two (nominative and oblique) is attested in Akkadian, Ugaritic, and Canaanite, as well as ClAr (Huehnergard 2019, pp. 60–61). Classical Ethiopic (Gəʕəz) also attests a remnant of the same system, with an accusative/non-accusative system in which accusative is marked with short -*a*, and non-accusative is unmarked, likely as the result of the merger and loss of final \**u* and \**i* (> *ə* > *ø*/C\_#) (Butts 2019, pp. 128–29).

Within Arabic, it is not only ClAr that attests this nominal case marking; it is attested, to varying degrees, in pre-Islamic Arabic corpora in the Safaitic (Ancient North Arabian), Nabataean, and Greek scripts as well (Al-Jallad 2018). As argued in Van Putten and Stokes (2018), the Quranic consonantal text (*rasm*) attests a partial case system, distinct from that attested in ClAr. Finally, from a methodological perspective, we should greatly prefer an explanation for the lack of a feature that is most parsimonious and consistent with standard historical linguistic methodology. In attempting to answer these questions we are faced with the scenario either that the Semitic languages in general, and Arabic specifically, should be sub-grouped historically based on the presence or absence of a single feature, or that nominal case marking, which is known to have been lost in a number of languages, was retained completely in some, only partially in others, and lost completely in the rest. The latter is more parsimonious than the former.

Further, Owens's argument that these vowels originated as epenthetic vowels is not as strong as his claims make it seem. Owens himself reconstructs these pronominal suffixes with final short vowels, which were only lost after vowel harmony with the newly inserted epenthetic vowels (Owens 2006, p. 248 ff.). He further argues that when a pronominal suffix is added to a word like *qalb* "heart", the combination would result in a chain of three consonants, CCC, which is often (but not always; see Watson (2007, pp. 340–41)) disallowed in Arabic dialects (ibid., pp. 107–11). This resulted in the insertion of an epenthetic vowel between the stem and the suffix: *qalb* + *hā* > *qalbhā* > *qalb-a-hā* "her heart." The same process was responsible for creating 2ms *ak*: *qalb* + *ka* > *qalbka* > *qalb-a-ka* "your (ms) heart", and 2fs *qalb* + *ki* > *qalbki* > *qalb-i-ki*, and so on. After this stage of epenthetic vowel insertion, final short vowels of these suffixes were lost: *qalb-a-ka* > *qalb-ak*.

While on the face of it this explanation might seem to account for the dialect data without resorting to the notion of case vowels, it is undermined by some further considerations. First, for Owens's explanation of these forms to obtain, we would need to assume that all dialects resolved these CCC clusters in the same way, namely by inserting the epenthetic vowel after the second of the three consonants, i.e., CCCv(:) > CCvCv(:). In fact, however, as Owens implicitly acknowledges (ibid., pp. 108–9), many dialects resolve these clusters by inserting the epenthetic vowel after the initial consonant, so CCCv(:) > CvCCv(:) (on which, see Watson (2007, pp. 340–48)). In such cases, we would expect two different allomorphs of the suffixes to exist, distributed according to the phonotactic rules of the dialect, but this is not the case. That is, if CVCC nouns were the origin of the epenthetic vowels, as Owens maintains, then we would expect a distribution which matches the two patterns of epenthesis insertion: *qalb-a-k* for CCC > CCvC, and \*\**qalabk* for CCC > CvCC. Instead, we only find forms that would need to have originated in CCvC-inserting dialects.
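The two cluster-resolution strategies can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: forms are given in plain ASCII, only the short vowels *a*, *i*, *u* count as vowels, and the function names (`join_with_epenthesis`, `drop_final_short_vowel`) are hypothetical labels introduced here, not part of any published model. It shows what each strategy would be expected to yield for *qalb* + *ka* once final short vowels are lost.

```python
SHORT_VOWELS = set("aiu")


def join_with_epenthesis(stem: str, suffix: str, strategy: str, vowel: str = "a") -> str:
    """Attach a suffix to a stem, breaking up a CCC cluster at the boundary.

    strategy "CCvC": epenthetic vowel after the second consonant (CCCv > CCvCv)
    strategy "CvCC": epenthetic vowel after the first consonant  (CCCv > CvCCv)
    """
    creates_ccc = (
        len(stem) >= 2
        and bool(suffix)
        and stem[-1] not in SHORT_VOWELS
        and stem[-2] not in SHORT_VOWELS
        and suffix[0] not in SHORT_VOWELS
    )
    if not creates_ccc:
        # No three-consonant cluster arises, so nothing is inserted.
        return stem + suffix
    if strategy == "CCvC":
        return stem + vowel + suffix
    if strategy == "CvCC":
        return stem[:-1] + vowel + stem[-1] + suffix
    raise ValueError(f"unknown strategy: {strategy!r}")


def drop_final_short_vowel(word: str) -> str:
    """Later loss of a word-final short vowel."""
    return word[:-1] if word and word[-1] in SHORT_VOWELS else word


for strategy in ("CCvC", "CvCC"):
    form = drop_final_short_vowel(join_with_epenthesis("qalb", "ka", strategy))
    print(strategy, form)
```

Under these assumptions the first strategy gives *qalbak* and the second gives the unattested *qalabk*, which is the asymmetry the argument above relies on. A stem ending in a single consonant triggers no insertion at all in this sketch.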

Another question that Owens does not meaningfully answer is why epenthesis would have been necessary in most nominal patterns. Owens's examples, cited above, are typically CVCC (including CVW/YC), i.e., *qalb-ak* "your (msg) heart" and *bēt-ha* "her house", etc. Elsewhere epenthesis is not typical. For example, most dialects do not resolve CCV clusters: Jordanian (a CVCC dialect) *katabti* "you (fsg) wrote", and not \*\**katabiti* "idem" (Al-Wer 2007); Cairene (a CCVC dialect) *ġasalti* "you (fsg) washed", and not \*\**ġasaliti* (Woidich 2006, p. 330).<sup>6</sup> As Owens admits, the pronominal suffixes, with the exception of 3fs (\**hā*) and 1cp (\**nā*), are to be reconstructed with final short vowels (reviewed above) (Owens 2006, pp. 239–58). Based on the dialectal data on which Owens relies, epenthesis at this initial stage, where the final short vowels of the suffixes were still present, would only be required for nouns of the pattern CVCC. It seems rather implausible that allomorphs necessary only for nouns of one (admittedly frequent) noun pattern were leveled everywhere. However, it is not impossible. If there are any dialects where Owens's reconstruction should be expected to accurately match the distribution, it is thus CCVC-patterning dialects, like Cairene (Watson 2007, p. 341).

The decisive blow to Owens's reconstruction, however, comes upon further examination of the Cairene distribution. The Egyptian realization of \**qalb-a-hā* "her heart" is *ʔalb-a-ha*, \**qalb-a-ka* "your (msg) heart" is *ʔalb-ak*, and \**qalb-i-ki* "your (fsg) heart" is *ʔalb-ik*, each as Owens's reconstruction would predict. However, if the forms with epenthesis were generalized, as could be suggested by the 2ms and 2fs forms, we would expect 3fsg -*aha* to occur when suffixed to any noun, regardless of morphological pattern. Instead, we find that on nouns ending in a single consonant, the form is -*ha*, i.e., *baʔarit-ha* "her cow", and not \*\**baʔaritaha*. The same noun, however, when suffixed with a 2ms or 2fs suffix, would have the supposedly epenthetic allomorph, i.e., *baʔaritak* (msg.)/*baʔaratik* (fsg.). In other words, in a dialect that fits Owens's pattern for the creation of epenthetic allomorphs, we would have to suppose that the epenthetic allomorphs of the 2ms and 2fs suffixes were generalized everywhere, but the 3fs suffix retained its original distribution (i.e., occurring only after CC clusters). Thus, Owens's theory does not successfully explain the particulars of the distribution of any dialect type.

The question remains how to model this development from etymological case vowel to the various surface forms. As noted above, both Blau and Diem believe, as I do, that nouns originally ended in a short vowel that marked nominal case on singular, feminine plural, and most broken plural patterns. The question of their development is tied up with that of the simplification and eventual loss of Proto-Arabic morphosyntactic case inflection. For Blau, the breakdown and loss of case is related to the loss of final short vowels, and the subsequent analogical extension of these caseless forms to non-final position. For Diem, on the other hand, case breakdown was only partially, if at all, related to short final vowel loss; rather, for him it was primarily due to the low functional yield of case in Semitic and Arabic.

While both Blau and Diem are undoubtedly correct that both analogy and sound change played roles, neither provided explanations which, in the end, were able to provide satisfactory reconstructions of the process. Blau is ultimately unable to compellingly answer how short final case vowels were lost, thus leading to case loss, without the simultaneous loss of short pronominal suffix vowels. Diem, on the other hand, does not manage to explain what led to the free variation of case vowels, and ultimately the loss of case. As Blau argued successfully elsewhere (Blau 1972), the low functional yield of case in Semitic is not, in and of itself, an explanation for its loss.

Another problem that plagues both accounts detailed above is that neither fully integrates the only other feature which provides direct evidence of case vowels that remain in the dialects, namely dialectal *tanwīn*. Both Blau (2002, pp. 44–46; 2006) and Diem (1991, p. 303 ff.) assume that the attested surface forms of dialectal *tanwīn* reflect unaltered the case vowel which was ultimately generalized. That is, in their reconstructions, any examples of dialectal *tanwīn* -*in* are automatically assumed to reflect a frozen genitive \*-*in*, any dialectal -*an* reflects accusative \*-*an*, etc. As argued in Stokes (2020), however, there are reasons based on the dialectal data to doubt this straightforward identification. The clearest example is found in the southern Saudi dialects of Bal-Qarn and Banī ʕAbādil, where an interesting pausal/non-pausal distribution of dialectal *tanwīn* is attested: non-pausal *bēt-in* "a house"/pausal *bēt-u* "idem." The most likely interpretation of this is a vowel merger to a high vowel in final position, wherein before *tanwīn* the surface vowel is short front -*i*, whereas when word-final the vowel is back high -*u* (itself apparently lengthened rather than lost; see Stokes (2020, pp. 655–56) for details).

Other evidence suggests a similar merger occurred in most, if not all, ancestors of the modern dialects. For example, in both Bahraini and rural (*fellāḥī*) Iraqi dialects, there is an adverbial *tanwīn* -*an* that contrasts with the standard dialectal *tanwīn* form -*in*; crucially, these occur in contexts which could not have been ClAr borrowings (Stokes 2020, p. 650). Rather, the reflexes of dialectal *tanwīn* attest a stage in which phonemic distinction was lost between short case vowels before *tanwīn*: \**u*,\**i*,\**a* > V/C\_#(n) (Stokes 2020, pp. 655–60). It is likely that such a merger is responsible for the surface forms of dialectal *tanwīn* in those dialects that retain it.

Finally, the traditional accounts offer only one set of reconstructions for dialects that attest a great deal of variation. In what follows, I will bring together data from dialects across the spectrum in order to provide historical explanations for each of the major patterns attested therein. Where possible and relevant, I incorporate data from dialectal *tanwīn* to inform the reconstruction. I will attempt to show that while the patterns attested in the dialects can effectively be derived from the same Proto-Arabic case-bearing situation, they require different scenarios, each of which differs from those proposed heretofore, to explain the surface manifestations. It is to the data, then, that we now turn.

#### **3. Historical Development of Pronominal Suffixes**

Obviously, there is not sufficient space here to treat every attested form. Further, the reconstructions offered here are not final; rather, it is hoped that they will constitute just one more step in the process of understanding the history of the dialects and their interrelationships. There are undoubtedly dialect-specific factors necessary to understand the particular histories of these suffixes.

Table 3 (below) illustrates most of the major strands of morphological variation in the dialectal suffix forms and will constitute the bulk of the data upon which the subsequent discussions will focus. The following should be considered a representative sampling and is not exhaustive (Najdi from Ingham (1982, pp. 96–98; 2008, p. 328); Ṣanʕāni from Rossi (1939, p. 4), Watson (2009, p. 110), and Isaksson (1991, p. 127); Mardin Arabic from Grigore (2007, p. 228); Levantine from Brustad and Zuniga (2019, p. 411)).


**Table 3.** Pronominal Suffixes from Sample Arabic Dialects.

#### *3.1. Third Person Masculine and Feminine Forms*

Numerous and geographically widespread modern dialects attest 3ms C-*u*/C-*o*, and -V:. Regarding forms with -*o(h)*, Cantineau (1939) argued that they are the result of *dār* + frozen accusative *a* + *hu*, with subsequent elision of the intervocalic laryngeal:

*dār-a-hu > dāra-u > dār-o.*

Jastrow (1991) assumes that *dār-u(h)*/*dār-o*, as well as *dār-a(h)* and *dār-ih* "idem", were the result of the "freezing" of different case vowels in different dialects. In his scenario, *dār-u(h)*/*dār-o* attested the retention of nominative \*-*u*, whereas *dār-a(h)* attested a frozen accusative \*-*a* and *dār-i(h)* a frozen genitive \*-*i*. However, like Blau and Diem, Jastrow fails to provide a scenario which accounts for the development of each. We are not told why the nominative case would be frozen only when followed by the 3ms suffix.

I agree that the 3ms suffix form -*u(h)* ultimately derives from \**u-hu*. As we saw above, explaining these forms involves a model that can explain the freezing of the case vowel before PN suffixes before the loss of word-final short vowels. I would suggest the following steps. From the reconstructed Proto-Arabic distribution, vowel harmony across morpheme boundaries developed, wherein V1-CV2 > V2-CV2.<sup>7</sup>


Subsequently, final short vowels were lost.

3. Loss of final short vowels: \**bayt-V*<sub>case</sub>*n*; \**al-bayt*; \**bayt-uh*

And finally, the neutralization of phonemic contrast before *tanwīn* occurred:

4. Neutralization of contrast before *tanwīn*: *bayt-Vn*; *al-bayt*; *bayt-uh*

Steps 3 and 4 are not necessarily linear; they might have been simultaneous, or 4 might have preceded 3. Unlike Jastrow et al., however, I also suggest that 3ms -*o(h)* forms originate in \**u-hu* as well, and not \**a-hu*. The lowering of \**u* to *o* is widespread,<sup>8</sup> and is likely behind the realization of the 3ms -*o(h)* forms as well.
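The ordering proposed here, harmony across the morpheme boundary first and loss of final short vowels second, can be sketched as two ordered rewrite steps. The Python fragment below is a minimal illustration and not a full model: it assumes ASCII forms, treats only light CV suffixes (such as *hu*, *ka*, *ki*), and the function names `harmonize`, `lose_final_short_vowel`, and `derive` are hypothetical labels introduced for the example.

```python
SHORT_VOWELS = "aiu"


def harmonize(case_vowel: str, suffix: str) -> str:
    """V1-CV2 > V2-CV2: the case vowel takes on the quality of the suffix vowel.

    Only light CV suffixes are handled; anything else leaves the vowel unchanged.
    """
    if len(suffix) == 2 and suffix[1] in SHORT_VOWELS:
        return suffix[1]
    return case_vowel


def lose_final_short_vowel(word: str) -> str:
    """Later loss of a word-final short vowel."""
    return word[:-1] if word and word[-1] in SHORT_VOWELS else word


def derive(stem: str, case_vowel: str, suffix: str) -> str:
    """Apply harmony first, then final short vowel loss."""
    vowel = harmonize(case_vowel, suffix)
    return lose_final_short_vowel(stem + vowel + suffix)


# Whatever the original case vowel, the outcome before 3ms *hu is the same.
for case_vowel in "uia":
    print(case_vowel, derive("bayt", case_vowel, "hu"))
```

On these assumptions all three case vowels converge on *baytuh*, and the same two steps give *baytak* and *baytik* for the 2ms and 2fs suffixes. Because harmony applies while the suffix vowel is still present, the sketch does not run into the ordering problem noted for accounts that begin with final vowel loss.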

Similar types of harmonization are attested in Arabic. For example, Sibawayh (1988, p. 173) mentions a type of assimilation that takes place in pause in which CVCC nouns insert a vowel to break up the final cluster where the second vowel harmonizes with the first:

*ʔiblun* "a camel" > *ʔibil* "idem",

*ḥulmun* "a dream" > *ḥulum* "idem."

Blau's discussion of *naql* "transfer" represents a similar process of harmony, wherein a noun in pause inserts an epenthetic vowel that is the quality of the vowel which, when in non-pausal position, would mark its syntactic case:

*bakrun* > *bakur*

*bakrin* > *bakir*

*bakran* > *bakar*

This process<sup>9</sup> is of the same kind as that proposed here (CV1CV2 > CV2CV2). It is important, however, to emphasize the difference between the present proposal and Blau's and Diem's. Blau's appeal to *naql* to explain the pronominal suffix is predicated on the loss of final short vowels as the driver of the breakdown of the case system, which would eliminate the source of the transfer that he proposes, namely the final vowels of the pronominal suffixes. In the present proposal, a cross-morpheme harmony rule develops before the loss of final short vowels. Case on word-final nouns, as well as before *tanwīn*, likely continued for some time. A round (or several rounds) of word-final short vowel loss subsequently contributed to the loss of case, although it is likely that this process was gradual. The present proposal also differs from Diem's insofar as it proposes phonological rules to account for each stage, whereas Diem posits a period of randomness in the realization of etymological case vowels that is ad hoc.
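The pausal *naql* pattern cited above (*bakrun* > *bakur*, etc.) is simple enough to state as a single string operation. The sketch below is illustrative only: it assumes an ASCII form consisting of a CVCC stem followed by a short case vowel and *n*, and the function name `pausal_naql` is a hypothetical label.

```python
SHORT_VOWELS = "aiu"


def pausal_naql(word: str) -> str:
    """Move the case vowel into the final cluster of a CVCC stem in pause.

    Expects stem + case vowel + "n", e.g. "bakrun".
    """
    if len(word) < 6 or not word.endswith("n") or word[-2] not in SHORT_VOWELS:
        raise ValueError(f"expected CVCC stem + case vowel + n, got {word!r}")
    stem, case_vowel = word[:-2], word[-2]
    # The ending is dropped and its vowel resurfaces between the last two consonants.
    return stem[:-1] + case_vowel + stem[-1]


for form in ("bakrun", "bakrin", "bakran"):
    print(form, ">", pausal_naql(form))
```

The point of the illustration is that the quality of the inserted vowel is fully determined by a vowel elsewhere in the word, which is the property the harmony rule proposed here shares with *naql*.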

Returning to the discussion of *dār-u(h)* and *dār-o(h)*, the -*u* and -*o* in *dār-u* and *dār-o* need not therefore be frozen nominatives as such; rather, they should be considered etymological case vowels that have harmonized with the vowel of the suffix. The point is, in my view, significant and thus bears repeating. While the argument here is that the vowels that precede the pronominal suffixes were etymologically case vowels, their development in harmonizing dialects (i.e., where the vowel matches the quality of the PN suffix) represents a stage at which the vowel harmonized regularly with the vowel quality of the following morpheme. It is therefore rather meaningless to speak of "frozen nominative" when referring to the *u*. Not only does this argument account for how the vowels began to harmonize before the final short vowels of the suffixes were lost, but it also avoids the linguistically ad hoc free variation proposed by Diem.

In addition to forms with -*uh* or -*o(h)*, a minority of modern dialects attest a form -*hu*, which quite possibly derives from \*-*hū*. In the Arabic of Mardin, for example, the post-vocalic 3ms suffix is -*hu* rather than a lengthened vowel -V:, e.g., *abūhu* "his father" (Grigore 2007, p. 228; Talay 2011, p. 913). In nearby Sīrt, as in most Qəltu dialects, short high vowels have completely merged to /ə/ in every phonetic environment (Talay 2011, p. 913). Thus Sīrt *məfləs* (<\**muflis*) "broke", and pronominal suffixes *kən*/*ən* (<\**kun*/\**kin* and \**hun*/\**hin*) "your (cpl)/their (cpl)" (Grigore and Bițună 2012, p. 551). We would thus expect the 3ms suffix, derived from \**u-hu* or \**uh*, to be realized -*əh* as well. In some SW Yemeni dialects, -*hu* is ubiquitous whether the noun ends in a consonant or a vowel.

Comparative Semitic data suggest reconstructing Proto-Semitic 3ms suffix with a short vowel, \*-*su*. Additionally, both Hebrew and Aramaic forms suggest a Proto-Northwest Semitic \*-*hu* (Suchard 2020, p. 43). The Arabic dialectal forms ending in -*u(h)* and -*o(h)*, if the present reconstruction is correct, also suggest a short vowel \*-*hu*. If the Arabic dialectal -*hu* forms do indeed reflect etymological \*-*hū*, it could suggest that two by-forms existed at the Proto-Arabic node: \*-*hu* and \*-*hū*. There is evidence of long and short vowel variants of pronominal suffixes throughout Semitic. In ClAr, for example, pronominal suffixes following short case vowels are realized long, while those after long vowels are realized short: *bi-hī* "with it", but *fī-hi* "in it" (Fischer 2002, p. 126). The opposite of this system is attested elsewhere, in, e.g., Aramaic and Gəʕəz, where suffixes with short vowels occur after short vowels, and suffixes with long vowels typically occur after long vowels, cf. Biblical Aramaic (Ezra 5:11) *ʕavdōhī* "his servants." This distribution led Cantineau (1936/1937, vol. 2) to propose a sort of quantitative harmony, where the length of the suffix vowel harmonizes with the vowel preceding the suffix. A further explanation, offered in Hasselbach (2004), is that the suffixes with short forms are original, with the long forms the result of contamination with the independent form. It is quite possible that various analogies and developments have led to the distribution attested in the various languages (Suchard 2020, pp. 198–214). Whatever the case may be, the presence of long and short forms of the suffixes, originally tied to preceding syllable structure, is very possibly behind the different contemporary realizations of these suffixes across the dialects. This is undoubtedly the explanation for the common distribution of the 2fs suffix, cf. Damascene C-*ik*, but V-*ki*, where the vowel is always historically long (Lentin 2006, p. 548). The *u* forms in Qəltu dialects such as Mardin could represent a levelling of the long form to all contexts, with a subsequent loss of the laryngeal *h* after consonants, thus C-*u* but V(:)-*hu*.<sup>10</sup>

Another possibility is that -*hu* forms in some dialects reflect analogical protection of 3ms suffixes, based on analogy with the independent *hū*-based forms. Paradigm pressure can block an otherwise regular phonetic shift from occurring. For example, the shift in West Semitic from Proto-Semitic \**s*<sup>1</sup> > *h* when word-initial was blocked in some roots where other forms of the paradigm would have remained \**s*<sup>1</sup>. The verb *samiʕa*, "he heard", should have become everywhere \*\**hamiʕa* but, likely due to the imperfect forms, in which the sibilant would have been word-internal and thus not shifted (*yismaʕ* not \*\**yihmaʕ*), the initial *s*<sup>1</sup> was retained (Al-Jallad 2015).

Finally, in some instances an etymological \*-*hu* was possibly lengthened by analogical pressure from the long vowel in the 3fs suffix, \*-*hā*. Elsewhere in the pronominal suffix paradigm, feminine forms form the basis for analogical adjustment to the masculine forms. In some cases, for example, the originally feminine -*i* vowel is levelled to masculine forms as well: e.g., Ǧabal Fayfāʔ (Yemen) \*-*hum*/\*-*hin* > -*him*/-*hin*. Elsewhere, the 3fp are extended via clipping of the feminine plural imperfective suffix \*-*nah*: e.g., Bani Abādil (Yemen) 3mp -*him* but 3fp -*hinna* (Behnstedt 1987, p. 67). Masculine forms with a doubled nasal and final *a* were created in many dialects based on the feminine forms: e.g., Aṣ-Ṣalt (Jordan) -*hummu*/-*hinne* (Palva 1994, p. 461; apud Procházka 2014, p. 142). It is very possible then that the long feminine vowel led to the lengthening of the masculine form in some cases as well.

Other than -*u(h)*/-*o(h)*, the 3ms suffix also frequently takes the form -*i(h)*, -*eh*, or -*ah*. In a number of Najdi dialects, the vowel preceding all suffixes, singular and plural, is the high vowel *i*, with the exception of the 3fs suffix, which is *a* (Ingham 2008, p. 328). For, e.g., the southern Hijaz and Tihama, Procházka (1988, p. 192) reports -*ah*. Elsewhere, a fronted -*eh* is reported, for example in the dialect of Yašīʕ in Yemen (Isaksson 1991, p. 126).

Dialects that attest a 3ms suffix realized as -*a(h)*, including, e.g., some Najdi dialects (Ingham 1982, p. 98) and the southern Ḥijāz (Procházka 1988, p. 126), as well as some Sudanese dialects (Owens 1984), have traditionally been interpreted as reflecting the generalization of the accusative case followed by the loss of final short *u*: \**a-hu* > *ah* (Cantineau 1939). Owens (2006, pp. 254–55) suggested instead that there were originally two 3ms suffix variants, one with a high vowel (usually *u*, *i* in Najdi), and one with a low vowel, the quality of which was specifically conditioned by the presence of emphatic consonants. As an example, he cites Eastern Libyan Arabic, where the distribution is supposedly attested. Owens then suggests that some dialects levelled the emphatic variant -*ah*, while others levelled the non-emphatic -*ih*. As mentioned, Jastrow (1991, p. 170 et passim) argues that -*ah* and -*ih* reflect a frozen accusative and genitive vowel respectively.

There are reasons to doubt each of these explanations, however. To begin with, it is not clear whether dialects with -*eh* reflect either \*-*i* or \*-*a*. Relatedly, it is not always clear that contemporary *a* or *i* should be connected with historical \**a* or \**i*. This is especially true of, e.g., the Najdi dialects that constitute the classic examples of dialects with 3ms suffix -*ih* and -*ah*. In most Najdi dialects, historical high vowels \**i* and \**u* merged, with phonetic context determining whether the surface form was *i* or *u* (Ingham 2008, p. 327). It is also important to consider issues of transliteration and phonetic reality when discussing these forms. For example, as illustrated in Table 3 above, Ṣanʕāni Arabic is sometimes transliterated with -*a*, as with Rossi, and other times with -*i*, as with Watson. It is likely that presumed etymological correspondence with historical vowels influences which is used.

I believe the most obvious explanation is that the forms -*ah*, -*ih*, and -*əh* are ultimately the result of a process wherein the distinction between *i* and *a* before *h* word-finally is neutralized. In non-rounded contexts, though transliterated as *ah*, the feminine ending is actually realized as basically a schwa; e.g., Najdi \*-*ah*, realized in neutral contexts as *ə*, i.e., *xirzəh* "bead" (Ingham 2008, p. 327). Ingham elsewhere (Ingham 1986, p. 283) notes that the 3ms suffix transliterated -*ih*, "his", is often realized similarly to the *tāʔ marbūṭa*, namely as -*əh*. Such an explanation would also make sense of the variation in transliteration found in a number of dialects, such as Ṣanʕāni, where Rossi (1939, p. 4) gives -*ah*, but Watson (2009, p. 110) gives -*ih*. There is also evidence for such a situation in the Negev (Palestine) dialect of the ʕAzazmih, where the reflex of the 3ms suffix is -*ih* when suffixed to most nouns, but -*ah* when suffixed to verbs whose stem vowel is low: *rās-ih* "his head", but *ǧāb-ah* "he brought it" (Shawarba 2012, p. 193). This is paralleled in some cases by the realization of the feminine nominal marker \*-*ah*, which is -*it* in construct on most nouns but -*at* on verbs whose stem vowel is low: *nāg-it* "she-camel of", but *nām-at* "she slept." The same is true for Eastern Libyan itself, where Owens notes that Mitchell's descriptions (Mitchell 1952, 1960, apud Owens 1984, p. 93) give 3ms suffix -*ih*, whereas Owens transliterates it -*ah*. Owens's examples from Eastern Libyan Arabic are probably similar to other dialects, where the exact quality depends on the phonetic environment. Other dialects have been recorded with a form transliterated as -*eh* (e.g., Yašīʕ; Isaksson 1991, p. 126).

If correct, the traditional historical explanation for these forms, namely that they represent frozen genitive or accusative cases, becomes much less convincing. These variants cannot be explained through the same set of developments as the harmonizing group. Note that the variation attested in the vowel before *tanwīn* (Stokes 2020, pp. 654–58) mirrors that of the 3ms suffix. I suggest that the same neutralization attested before *tanwīn* occurred before PN suffixes in some dialects as well:

*\*bayt-V<sub>case</sub>-hu/\*bayt-V<sub>case</sub>n > \*bayt-V-hu/\*bayt-Vn*

Following the loss of final short vowels, the 3ms suffix would follow this nonphonemic vowel:

*bēt-ih/bēt-in.*

As we will see, many dialects with 3ms -*ih* or -*ah* attest 2ms -*ik* and 2fs -*ić*. Thus, it is likely that vowels in this position merged, with a high vowel surface manifestation in every instance.<sup>11</sup> It is unclear what might have caused such a neutralization in this position. One possibility is that the change began in the construct, where the case vowel on the initial noun of the construct chain would be in a *waṣl* position, and thus potentially prone to reduction:

*\*baytu r-raǧuli* "the man's house" > *\*bayt Vr-raǧuli*

Another possibility is that post-stress but non-word final vowels were reduced and merged:

*\*bayt-u-hu/\*bayt-un > \*bayt-V-hu/\*bayt-Vn* but *al-bayt-u* and *baytu r-raǧuli*

Subsequently, the non-phonemic vowel manifested as a high front vowel in most contexts, and final short vowels were lost. The construct perhaps continued inflecting for case, but the otherwise-ubiquitous absolute form eventually replaced it.

This can be summarized by the following steps:


In a few dialects, such as Al-Mahābšeh, however, 3ms is -*eh* but 2ms is -*ak*, which argues against a generalization to all contexts. In these cases, there might have been a generalization of -*a* before pronominal suffixes, with subsequent raising of -*ah*, as with the *tāʔ marbūṭa* -*ah*, to -*eh* or -*ih*.

The 3fs suffix is in most dialects realized as -*a* or -*(V)ha*. The latter suggests an originally long final vowel \*-*hā*, but the former is ambiguous. Ahmad Al-Jallad (p.c.) has pointed out that there is some pre-Islamic Arabic epigraphic evidence for a short \*-*ha* by-form in line one of the Namārah inscription, which reads *mlk ʔl-ʕrb kl-h* /malk ʔal-ʕarab koll-ah/ "king of all ʕArab" (M.C.E. Macdonald's translation, from Fiema et al. (2016, pp. 405–6)). For either /kull-ha/ or /kull-hā/ we would expect \*\**kl-hʔ* in the Nabataean orthography. Cantineau (1936/1937, pp. 78, 182–83) argued, based on data from the Syrian Šammarī and ʕAnaze Bedouin dialects, in which the 3fs suffix was realized *C-ah* but postvocalic *V-ha*, that the vowel was originally anceps (that is, both long and short), and that its length matched the length of the preceding case vowel. While this fits the data on which Cantineau was focusing, it does not fit many other dialects, which show only one form. So, while I agree with Cantineau that there were possibly two by-forms, one long and one short, of the 3fs suffix, based on realizations of the suffix in other Arabic dialects (see below), as well as ClAr data, it seems unlikely that we can posit a length harmonization for the ancestors of all dialects, and almost certainly not for Proto-Arabic.<sup>12</sup>

If we posit short and long by-forms for Proto-Arabic, then examples of 3fs suffix realized as -*ha* seem most likely explained as retentions of \*-*hā*. More complicated are forms such as -*ah*. If, as is maintained here, final short \*-*a* vowels were eventually lost in the ancestors of the dialects, then -*ah* forms are most likely explained as reflecting the same harmony across morpheme boundaries described above: \**V*<sub>case</sub>*-ha > a-ha > -ah*, with harmonization before loss of final short *a*. Alternatively, -*ah* forms could reflect an analogical restructuring of the 3fs suffix based on the 3ms forms: -*uh*/-*aha* > -*uh*/-*ah*. In that case, even these forms could ultimately have descended from an originally long final form \*-*hā*. From such a situation, whether due to loss of a final short \**a*, or analogical restructuring, we can explain the distribution in a minority of dialects in which 3fs is -*eh*/-*ih*, as in the dialect of Ǧiblah, Yemen (Table 4; data from Isaksson 1991, p. 130):

**Table 4.** Ǧiblah Forms.


The short *i* vowel of the 3fs suffix in this dialect could reflect one of two paths. One possibility is that it reflects a fronting of short *a* before word-final -*h* (as with *tāʔ marbūṭa* and 3ms suffixes noted above). Such a distribution is also attested in, e.g., Al-Ḥugarīyeh: 3ms C-*oh*/V-*uh* and 3fs C-*eh*/V-*ih* (Isaksson 1991, p. 132). Alternatively, the 3fs suffix in Ǧiblah could be analogical with the 2fs suffix.

Forms like Levantine -*a*, which lack a final laryngeal, could possibly go back to either \*-*ha* or \*-*hā*. If they reflect an originally short vowel, then their development would mirror the -*ah* forms discussed above, with elision of the final laryngeal. More likely, in my view, is that they descend from long \*-*hā* and reflect the loss of the laryngeal -*h* following a vowelless consonant: *bayt-V-hā > bayt-hā > bayt-ā > bayt-a* "her house". Several pieces of evidence support such a reconstruction. First, the 3fs suffix regularly patterns in terms of pre-suffixal vowel quality and insertion with suffixes that historically were heavy, namely the plural forms, in multiple dialects. That is, these suffixes are typically preceded by a vowel when the phonotactic patterns of the particular dialect dictate the insertion of an epenthetic vowel.

In Meccan Arabic, for example, epenthetic vowels are regularly *a*, which in the paradigm above is inserted before 3fs, 1cp, and 3rd and 2nd plural forms. In the dialect of Al-Maḥall (among others), the anaptyctic vowel harmonizes with the following vowel (see Table 5). The simplest explanation of this distribution is that the vowels before these suffixes originated as anaptyctic insertions, not case vowels, and are thus distinct from the vowels preceding the 3ms, 2ms/2fs, and 1cs suffixes.


**Table 5.** Patterns of \*-CVV and \*CVC suffixes (Arabian data from Isaksson (1991); Damascene from Brustad and Zuniga (2019, p. 411)).

How can this apparent contradiction be resolved? First, the 3fs and 1cp suffixes, unlike 3ms, 2ms, and 2fs, are reconstructible with long \**ā* vowels, 3fs \**hā* and 1cp \**nā*, which pattern with the heavy CVC syllables of the 3mp/3fp \*-*hum*/\*-*hin* and 2mp/2fp \*-*kum*/\*-*kin* suffixes. I argue that as case inflection broke down, the original short (formerly case-inflecting) vowel was syncopated before long (CVV) and heavy (CVC) syllables<sup>13</sup>: *\*bayt-V-hā/hum/kum > \*bayt-hā/hum/kum.*

At some point later in their development, a constraint against CCVV and CCVC syllables arose in many dialects, resulting in the insertion of an anaptyctic vowel in the same slot. Elsewhere, e.g., in the ancestors of Levantine-like dialects, no such constraint developed, and the laryngeal of the 3rd person forms was either retained (as in Dafār; see Table 3 above) or lost (as in Damascene).

The current proposal can thus be summed up by distinguishing between suffixes that consisted historically of light syllables (i.e., CV) and those that consisted of heavy ones (either CVV or CVC). Before suffixes consisting of light syllables, the final short vowel, historically a case vowel, in some cases (e.g., Levantine) harmonized with the vowel of the suffixes, and in other cases (e.g., Najdi) lost contrast and manifested as a generalized high vowel.<sup>14</sup> Before suffixes consisting of heavy syllables, there was an initial syncope of the short vowel.<sup>15</sup> In many dialects, a subsequent phonetic rule disallowing strings of heavy syllables resulted in the insertion of an anaptyctic vowel, while in others no such insertion occurred. In dialects such as Damascene, the laryngeal was elided, but elsewhere no elision occurred (see Table 6 for reconstructions).
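The light/heavy distinction summarized above can be restated as a short Python sketch. It is an illustration only and not part of the original analysis: the function names (`skeleton`, `is_heavy`, `derive`) and the `anaptyctic` parameter are hypothetical, the stem *bayt* and the suffixes are those cited in the text, and the placeholder `V` simply marks the etymological case vowel, whose later harmonization or neutralization is not modelled.

```python
from typing import Optional

# Illustrative sketch only; names and simplifications are assumptions.
LONG_VOWELS = {"ā", "ī", "ū"}
SHORT_VOWELS = {"a", "i", "u"}


def skeleton(suffix: str) -> str:
    """Map a suffix to its CV skeleton, e.g. 'hā' -> 'CVV', 'hum' -> 'CVC'."""
    parts = []
    for seg in suffix:
        if seg in LONG_VOWELS:
            parts.append("VV")
        elif seg in SHORT_VOWELS:
            parts.append("V")
        else:
            parts.append("C")
    return "".join(parts)


def is_heavy(suffix: str) -> bool:
    """Heavy suffixes are CVV or CVC; light suffixes are CV."""
    return skeleton(suffix) in {"CVV", "CVC"}


def derive(stem: str, suffix: str, anaptyctic: Optional[str] = None) -> str:
    """Return a schematic form for stem + historical suffix.

    Light suffix: the case vowel (written 'V') is kept at this stage.
    Heavy suffix: the case vowel is syncopated; if the dialect later
    disallows the resulting cluster, an anaptyctic vowel is inserted.
    """
    if not is_heavy(suffix):
        return f"{stem}-V-{suffix}"
    if anaptyctic is None:
        return f"{stem}-{suffix}"
    return f"{stem}-{anaptyctic}-{suffix}"


if __name__ == "__main__":
    for suffix in ("hu", "ka", "ki", "hā", "nā", "hum", "kum"):
        print(suffix, skeleton(suffix), derive("bayt", suffix, anaptyctic="a"))
```

Passing `anaptyctic=None` corresponds to dialects in which no insertion occurred; elision of the laryngeal (as in Damascene) is a further step that the sketch leaves out.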

**Table 6.** Reconstruction of Meccan and Damascene.


#### *3.2. Second Person Masculine and Feminine Forms*

The 2ms suffix is in many dialects realized -*ak* after consonants and -*k* after vowels. These are derivable from \**Vcase*-*ka* > \**a-ka* (via the vowel harmony rule mentioned above) > *ak* (after loss of final short *a*), with a post-vocalic variant -*k*. A significant minority of dialects attest another vowel before the final velar stop, usually high front -*ik*, but in a few instances high back -*ok*/-*uk*. In the case of 2ms -*ik*, the feminine form is almost always an affricated -*itš* or -*ić*. I have already argued that 3ms -*ih* in many cases reflects a neutralization of vowel contrast in this position. The dialects with 3ms -*ih* in which the 2ms and 2fs suffixes are -*ik* and -*ić* suggest a general loss of contrast, with a high front vowel as the surface manifestation.

The 2ms suffix form realized as -*ok* or -*uk* has long baffled commentators (cf. Diem 1973, p. 42; Isaksson 1991, p. 127). In Dafār, for example, the paradigm is as follows (Table 7; data from Isaksson 1991, p. 127):



How can the 2ms -*uk* be explained? One possibility is that, initially, the pre-suffix vowel was universally high: *\*bayt-Vcase-*PN *> \*bayt-Vhigh-*PN. Subsequently, harmonization of the 2fs to the similarly high suffix vowel occurred: \**Vhigh-ki* > \**i-ki* > \**i-ši* > -*iš*. Before the 2ms, however, the back high *u* was generalized, with no subsequent harmonization. It is curious that the 3ms and 2fs suffixes, both of which were followed etymologically by high vowels, apparently triggered a front high manifestation of the pre-suffix vowel, whereas the 2ms, with a low vowel, did not. Perhaps there was a pattern that led to this:

V-CV<sub>high</sub> > V<sub>front</sub>-CV<sub>high</sub>,

but

V-CV<sub>low</sub> > V<sub>back</sub>-CV<sub>low</sub>
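The tentative pattern above can be spelled out as a small Python sketch. It is only an illustration of the conditioning suggested in the text, not a claim about how the change actually proceeded: the function names are hypothetical, the inputs are the historically short suffixes \*-*hu*, \*-*ka*, and \*-*ki*, and the later palatalization of the 2fs (\**i-ki* > -*iš*) is deliberately not modelled.

```python
# Illustrative sketch only; function names are hypothetical.
HIGH_VOWELS = {"i", "u"}


def presuffix_vowel(proto_suffix: str) -> str:
    """Pick the pre-suffix high vowel from the height of the suffix vowel.

    A high suffix vowel (as in *-hu, *-ki) yields front 'i';
    a low suffix vowel (as in *-ka) yields back 'u'.
    """
    suffix_vowel = proto_suffix[-1]
    return "i" if suffix_vowel in HIGH_VOWELS else "u"


def surface(proto_suffix: str) -> str:
    """Pre-suffix vowel plus suffix consonant, after loss of the final short vowel."""
    return presuffix_vowel(proto_suffix) + proto_suffix[:-1]


if __name__ == "__main__":
    for label, suffix in (("3ms", "hu"), ("2ms", "ka"), ("2fs", "ki")):
        print(label, f"*-{suffix}", "->", f"-{surface(suffix)}")
```

The sketch returns -*ih*, -*uk*, and -*ik*; the attested 2fs -*iš* would require the additional affrication step described in the text.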

The feminine by-forms -*ik*/-*ki* require a bit of comment. Comparative evidence strongly favors a reconstruction of final short *i*, which we would not expect to remain in these dialects. Diem (1991, p. 301) reconstructs 2fs \*-*kī*. While this is a possible explanation of forms after long vowels with -*ki*, it cannot explain the nearly ubiquitous existence of -*ik* forms. Several scholars have suggested rather that -*ki* forms are due to contamination from the nominative pronoun *inti* and the imperfect verbal suffix, e.g., Jordanian *tuktubi* "you (fsg) write" (<\**tiktubī*) (attested also apparently in Gəʿəz; Hasselbach 2004, p. 10, n. 28; Al-Jallad 2014, p. 319). While it is possible that 2fs patterned originally with 3ms and 3fs in having short and long by-forms, with some length assimilation to explain why -*ki* remains only after long vowels, I believe the lack of 2ms long forms argues in favor of analogy with the independent and imperfective forms.

If indeed the *i* of the 2fs PN suffix -*ki* is due to analogy with the imperfect, we must still explain its predominant distribution: suffixed to nouns that end in a long vowel but absent following nouns that end in a consonant. In, e.g., Damascene Arabic, the 2fs suffix is -*ik* after a C but -*ki* after a vowel: *kitāb-ik* "your (fs) book", but *abu-ki* "your (fs) father." In other words, if the final vowel of -*ki* was restored from the imperfect, why was it not generalized to all contexts? Its distribution is precisely the same as that of the 3fs suffix, which is -*a* after a C but -*ha* after a vowel: *kitāb-a* "her book", but *abū-ha* "her father." There is, in other words, a gendered difference in the paradigm. I suggest that, in many dialects, this distinction led to the creation (or retention) of a 2fs suffix with final -*i*, which was available still on the independent and prefix verbal forms:

3ms -VV/2ms -VVK:: 3fs -*ha*: 2fs X > *ki*

The symmetry also holds for other dialects. For example, in Najdi, the 2fs suffix patterns completely with the 3ms and 2ms:

3ms C-*eh*, V-*h*/2ms C-*ik*, V-*k*/2fs C-*ić*, V-*ć*

Alternatively, in a small minority of dialects, the 2fs suffix patterns completely with the 3fs, against 3ms and 2ms, e.g., the Saudi dialect of Al-Maḥābšeh (Table 8).


**Table 8.** The 3rd and 2nd singular pronominal suffixes.

We can therefore explain the 2ms and 2fs suffix forms via developments from originally short final vowels. As we have seen with the 3ms and 3fs (and the plural forms, obliquely), the vowels preceding the suffixes were originally case-inflecting short vowels, which lost phonemic value and subsequently harmonized with the retained suffix vowels.

#### **4. Suffix Vowel Length**

The length of the final vowels in Proto-Semitic is notoriously difficult to determine because the reflexes of the forms differ across the attested Semitic languages.<sup>16</sup> Various scholars have attempted different solutions to this puzzle. Brockelmann (1908/1913, p. 74 ff.) suggested what has been the majority opinion of the past century, namely that the vowels were anceps, meaning that they could be long or short. Brockelmann's own intuition was that they were originally long, but that some were shortened because they were unstressed (ibid.).<sup>17</sup> Blau (1981, p. 63 ff.) held the opposite scenario, namely that the suffix vowels were originally short, but that paradigm pressure, namely the preservation of gender distinctions on suffixes, led speakers to retain them in languages like Hebrew, where short vowels were otherwise lost.

Despite the difficulties in reconstructing their length in Proto-Semitic, I follow Hasselbach (2004) and Al-Jallad (2014) in analyzing most of the forms as originally short in Proto-Arabic. The reasons, discussed above, concern the consistency with which the dialects agree with ClAr. For example, virtually all modern dialects lack the final vowel on 2ms \*-*ka*, which seems best interpreted as the loss of final short \**a*. It is necessary, and indeed methodologically preferable, to allow for multiple rounds of final-vowel reduction and loss when modeling the development of the dialects from their ancestors. We should not constrain ourselves to solving the loss of, say, the word-final \*-*a* of the 2ms suffix in the same stage as the loss of final \*-*a* on the 3ms prefix verb; rather, we should probably factor in multiple rounds of reduction and loss, depending on dialect. The main difference between my reconstructions and those offered by Hasselbach concerns the 3fs suffix, which I reconstruct long for Proto-Arabic, based on the form in ClAr, as well as its almost total retention in the modern dialects. Exceptions to these forms are, I believe, explicable by appeal to analogies (on which, see above). However, the Syrian Bedouin data gathered and analyzed by Cantineau suggest the possibility that, in the ancestors of some dialects, and perhaps even Proto-Arabic, a short by-form for the 3fs suffix, and a long by-form for the 3ms suffix, based on length assimilation and/or polarization, were created. Table 9 presents proposed reconstructions of Proto-Arabic PN suffixes.


**Table 9.** Reconstructed Proto-Arabic Pronominal Suffixes.

#### **5. Conclusions**

In this paper I have addressed two long-standing issues in the historical study of the modern Arabic dialects. First, I argued in favor of the traditional interpretation of the vowels before the pronominal suffixes of the 3ms, 2ms, and 2fs as originating in etymological case vowels. I provided two different models to account for the modern realizations in the two major attested paradigms. Specifically, I posited two sets of rules. To account for dialects in which pre-suffixal vowels harmonized with the vowels of the pronominal suffixes, I proposed a vowel harmony rule which operated across morpheme boundaries. In order to explain forms in which a single vowel occurs prior to each suffix, I proposed a development by which post-stress vowels were neutralized, leading to a generic vowel (usually front high) surface manifestation. The remaining variants were explained within these models. I contrasted these scenarios with those offered by Blau and Diem, detailing how the arguments here account for variables unaccounted for in these previous proposals.

The second topic examined here is that of the historical length of several pronominal suffixes; specifically, the length of the 3ms, 3fs, 2ms, and 2fs suffixes was examined. I concluded that most 3ms forms can be derived from a historical short \*-*hu*, although a long by-form possibly existed at the Proto-Arabic node. I further argued that the 2ms and 2fs suffixes are derivable from historically short forms, \*-*ka* and \*-*ki*, and that exceptions can be explained via analogies that are readily identifiable. Finally, I argued that the 3fs suffix was in the vast majority of cases derivable from a historically long \*-*ha¯*. Furthermore, I argued that the vowels before the 3fs suffixes patterned with those of the 3rd and 2nd plural forms over against the other singular forms, which I argued is due to the fact that the etymological case vowels were syncopated at some point before suffixes of the syllable shape CVV and CVC, with the subsequent insertion of an epenthetic vowel. These vowels behave differently from those of the singular forms because they represent a separate stage of development in the dialects.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


#### **References**

Al-Farrāʾ, ʾAbū ʿAlī. 2014. *Kitāb Fīh Luġāt Al-Qurʾān*. Edited by Ǧābir b. ʿAbd Allāh al-Sarīʿ. Unpublished.


### *Article* **Towards a Dialect History of the Baggara Belt**

**Stefano Manfredi <sup>1,\*</sup> and Caroline Roset <sup>2</sup>**


**Abstract:** The Baggara Belt constitutes the southernmost periphery of the Arabic-speaking world. It stretches over 2500 km from Nigeria to Sudan and it is largely inhabited by Arab semi-nomadic cattle herders. Despite its common sociohistorical background, the ethnography of Baggara nomads is complex, being the result of a long series of longitudinal migrations and contacts with different ethnolinguistic groups. Thanks to a number of comparative works, there is broad agreement on the inclusion of Baggara dialects within West Sudanic Arabic. However, little or nothing is known of the internal classification of Baggara Arabic. This paper seeks to provide a comparative overview of Baggara Arabic and to explain dialect convergences and divergences within the Baggara Belt in light of both internally and externally motivated changes. By providing a qualitative analysis of selected phonological, morphosyntactic, and lexical features, this study demonstrates that there is no overlapping between the ethnic and dialect borders of the Baggara Belt. Furthermore, it is argued that contact phenomena affecting Baggara Arabic cannot be reduced to a single substrate language, as these are rather induced by areal diffusion and language attrition. These elements support the hypothesis of a gradual process of Baggarization rather than a sudden ethnolinguistic hybridization between Arab and Fulani agropastoralist groups. Over and above, the paper aims at contributing to the debate on the internal classification of Sudanic Arabic by refining the isoglosses commonly adopted for the identification of a West Sudanic dialect subtype.

**Keywords:** Arabic; Sudanic Arabic; Baggara; comparative dialectology

#### **1. Introduction**

In his early account of the Shuwa Dialect of Bornu, Nigeria and of the Region of Lake Chad, the British colonial governor G. J. Lethem (1920, p. xi) stated that Nigerian (i.e., Shuwa) Arabic "is part of the Arabic dialects of the Sudan, of which Shuwa is the westernmost". By using these words, Lethem was most likely the first to recognize 'the dialects of the Sudan' as a homogenous dialect group, distinct from other Arabic varieties (i.e., Maghrebi, Levantine, Gulf, etc.). Later on, Blanc (1971) adopted the label 'Sudanic Arabic' to refer to the dialect continuum running across the vast region delimited by Lake Chad (Nigeria) in the west, by the Red Sea coast (Sudan) in the east, by Lake Nasser (Egypt) in the north, and by the Nuba Mountains (Sudan) in the south. According to Blanc (1971, p. 503), Sudanic Arabic "does not fit too neatly into either the East-West or the nomadic-sedentary dichotomy, though on the whole it is more Eastern than Western and more nomadic than sedentary". In spite of the vagueness of the dialect portrait drawn by Blanc, both historical and linguistic data unmistakably indicate that Sudanic Arabic mainly (but not exclusively) emerged following the penetration of Arabic-speaking nomadic groups from Upper Egypt into Sudan in the first half of the 14th century (cf. 2; see also Thomas A. Leddy-Cecere, this special issue). During the last decades, several studies (Blanc 1971; Kaye 1976; Owens 1985, 1993b; Roth-Laly 1994a; Manfredi 2012) have identified a number of isoglosses for pinpointing the Sudanic area within the wider Arabic-speaking world. These Pan-Sudanic features include:

**Citation:** Manfredi, Stefano, and Caroline Roset. 2021. Towards a Dialect History of the Baggara Belt. *Languages* 6: 146. https://doi.org/10.3390/languages6030146

Academic Editors: Simone Bettega and Roberta Morano

Received: 26 June 2021 Accepted: 23 August 2021 Published: 30 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


Concerning the internal classification of Sudanic dialects, Kaye (1976) examines selected phonological and morphological features of Chadian and Sudanese Arabic in light of Ferguson's monogenetic theory of dialect emergence and stresses the relative homogeneity of these varieties. Owens (1993a), in contrast, proposes a patchwork approach that does not reveal any eco-linguistic difference between 'Bedouin' and 'Sedentary' dialects, while showing clear affinities between Sudanic and Upper Egypt Arabic. Roth-Laly (1994a, 1994b), on her part, stresses the generalization of traditional 'Bedouin' features in Sudanic Arabic and identifies new local isoglosses for opposing 'Bedouin' to 'Sedentary' dialects across the Sudanic area. Regardless of their different approaches, the previously mentioned scholars agree in identifying two main dialect sub-types within Sudanic Arabic: West Sudanic Arabic (henceforth 'WSA') and East Sudanic Arabic (henceforth 'ESA'). WSA encompasses the dialects spoken in Nigeria (Lethem 1920; Owens 1993a, 1993b), Cameroon (Owens 1993a), and Chad (Carbou 1913; Roth-Laly 1972, 1979; Hagège 1973; Decobert 1985; Zeltner and Tourneux 1986; Abu-Absi 1995; Jullien de Pommerol 1999a, 1999b) as well as in the western Sudanese provinces of Darfur and Kordofan (Manfredi 2010, 2012, 2013; Roset 2018). ESA covers the remaining parts of the Sudanic dialect area (i.e., the central and eastern part of Sudan) and it includes the koine of the capital Khartoum (i.e., Khartoum or Sudanese Arabic, Bergman 2002; Dickins 2011) and the rural dialects spoken in the Gezira and Butana regions (Reichmuth 1983). Even if this geographical split between WSA and ESA is supported by strong linguistic evidence, it hardly reflects the high degree of diatopic and eco-linguistic variation affecting Sudanic Arabic. In this regard, it is worth remarking that Hillelson (1925, p. xv) distinguishes at least four distinct dialect subtypes in Sudan, counting:

*"The speech of the Northern Sudan, including Berber Province and the Arabic-speaking parts of Dongola; the speech of the Central Sudan, including Omdurman, the Gezira, and the country to the east of the Blue Nile; the idiom of the Western Sudan, embracing the White Nile, Kordofan and Darfur; and the dialect of the Baggara tribes. It should further be noted that the speech of nomad Arabs everywhere differs from that of the settled population."*

Hillelson's introduction to his Sudanese Arabic dictionary therefore warns about the specificity of the "speech of nomad Arabs" in Sudan, of which the semi-nomadic Baggara are part. In light of the above, this paper seeks to answer the question of to what extent Baggara Arabic constitutes a homogenous dialect sub-type within West Sudanic Arabic. To this end, we provide a comparative overview of five different Baggara dialects in order to assess their degree of structural proximity and to explore the sociohistorical factors underlying the diffusion of linguistic innovations across the Baggara Belt.

A preliminary version of this paper was presented at the 47th North Atlantic Conference on Afroasiatic Linguistics, INALCO, Paris, 24–25 June 2019. The paper is organized as follows. In Section 2, we offer a sociohistorical and linguistic introduction to the Baggara Belt. Section 3 briefly presents the data and the sources used for our comparative analysis. In Section 4, we explore the diatopic variation affecting a number of phonological, morphosyntactic, and lexical features in Baggara Arabic, while trying to reconstruct both internally and externally motivated diachronic changes. Finally, Section 5 attempts to reconstruct the dialect history of the Baggara Belt and provides some new hints on the internal classification of West Sudanic Arabic.

#### **2. The Baggara Belt: Sociohistorical and Linguistic Background**

MacMichael (1922, p. 271) argues that "Baggara means no more than cattlemen". Accordingly, the term Baggara (from the agentive noun PL *baggāra* 'cattlemen', SG.M/F *baggāri/baggāriye*) has neither ethnic nor genealogical pertinence, as it rather stresses the specificity of an agro-pastoral system of production based on cattle herding and sorghum cropping (Cunnison 1966, p. 10; Teitelbaum 1984; Braukämper 1993, p. 14; Manfredi 2010, p. 10). There is broad agreement that the center of origin of the Baggara tribes is to be found in present-day Chad. Nevertheless, two contrasting hypotheses have been put forward to trace the route by which Arab nomads reached Chad. On the one hand, Carbou (1913, p. 4) and Henderson (1939, p. 52) allege that Arab nomadic groups entered Chad via the Fezzan area in Libya. In this perspective, the Baggara should be seen as an offshoot of the Arab groups that pushed southwards from Maghreb to central Africa following the Hilalian invasion in the 11th century. Even if this hypothesis is corroborated by Baggara oral traditions referring to Abu Zayd al-Hilali (Manfredi 2010, p. 12), there are neither historical nor linguistic arguments supporting the suggestion of a Maghrebi origin of Baggara groups. On the other hand, MacMichael (1922, p. 275) affirms that:

*"On the migration of these Arabs from the east there cannot be the least doubt. They advanced gradually through the Negroland. [...] Their dialect is quite different from the Maghrebi, while in many respects it still preserves the purity and the eloquence of the language of Hijaz."*

Accordingly, the Baggara would have split apart from the Juhayna groups that penetrated Sudan from Upper Egypt (Cunnison 1971). This latter hypothesis is also supported by Braukämper (1993, p. 19), who links the beginning of the westward migration of the Baggara ancestors with the famines that affected the Nile valley during the second half of the 15th century. Further to this, there is unmistakable linguistic evidence pointing to an influx of Upper Egypt Arabic into Baggara dialects (Owens 1993b, cf. 4.1).

Against this backdrop, the question of when and how Arab nomads abandoned camels in favor of cows remains quite controversial. Braukämper (1993, pp. 17–20) suggests that the Baggarization process started after the overthrow of the Tunjur dynasty in Wadai (eastern Chad) in 1635. This event, which is referred to as *Šaggat al-Nāga* 'the division of the she-camel' in Chadian and Sudanese oral traditions (MacMichael 1912, p. 151), would have led to a southward movement of the Arab groups that supported the Tunjur dynasty. Following this population displacement, Arab nomads came into contact with Fulani cattle herders settled in the low rainfall savannas and eventually switched from camel to cattle breeding, while maintaining Arabic as their ancestral language. At variance with this hypothesis, Owens (1993b, p. 166; 2003, p. 723), making use of both historical and linguistic data, claims that the Baggarization process took place as early as the 15th century in the area of Kanem-Bornu and Baguirmi (western Chad), since Arabs and Fulani had already reached this region by that time.

In the present paper, we stress instead that Baggarization should be seen as a progressive process of socio-economic integration rather than a sudden ethnolinguistic hybridization induced by the adaptation to new ecological conditions. We therefore argue that the need for economic differentiation of both sedentary and nomadic groups is the main factor behind the emergence and the diffusion of the Baggara semi-nomadic production system across eastern Sahel. In this regard, Haaland (1969) convincingly shows that the Baggara tribes of Darfur (i.e., Ta'isha, Rizeygāt, Banū Ḥalba, cf. Figure 1) are characterized by the incorporation of local sedentary groups, as cattle-owning Fur farmers frequently establish themselves as Baggara nomads. In a similar manner, Manfredi (2010) observes that the Ḥawāzma, the largest Baggara tribe of Sudan (cf. Figure 1), are mixed with both sedentary non-Arab groups of Kordofan (mainly Nuba) and with camel-herders coming from eastern Sudan (the so-called *Abbāla*). The same is true for the Banū Sulēym tribe of the White Nile (cf. Figure 1), who integrated eastern Arab groups (mainly Aḥāmda) as well as the sedentary Shilluk of South Sudan. The fact that Baggarization is a gradual and ongoing process of socio-economic integration is also revealed by recent genetic studies (Čížková et al. 2017; Priehodová et al. 2020; Nováčková et al. 2020), which indicate that, despite a remarkable degree of ethnic admixture between agro-pastoralist groups of the Sahel, biological contacts between Fulani and Arab nomads must have been rather infrequent. These circumstances support the idea that the Baggarization process took place at different times, across a wide geographical front, and involved different Arab and non-Arab groups. As we will see, this underlying ethnolinguistic heterogeneity is the main reason for the absence of interference from a single substrate language (i.e., Fulani) in Baggara Arabic (cf. 4.1, 5).

**Figure 1.** The Baggara Belt and its main tribes.

At the present time, Baggara Arabs are involved in different dynamics of language contact, mainly depending on their degree of sedentarization and their relative demographic weight. On the one hand, the Shuwa of north-eastern Nigeria represent a largely sedentarized linguistic minority. Accordingly, speakers of Nigerian Arabic present a high degree of bilingual proficiency in Kanuri and/or Hausa, while maintaining transmission of their ancestral language to younger generations (Owens 2020, p. 177). On the other hand, Baggara Arabs of Sudan represent an ethnolinguistic majority and they still hold on to their semi-nomadic production system. Accordingly, they have hardly developed any bilingual competence in the different languages of sedentary communities of Darfur and Kordofan. Nevertheless, due to the dominant position of Arabic in Sudan, western Baggara groups are affected to different degrees by dialect mixing and leveling towards Sudanese Arabic (Manfredi 2013, cf. 5).

Finally, it is worth remarking that Baggara Arabic historically represented the target language of non-Arab sedentary bilingual communities of Chad and western Sudan. Most sedentary communities dispersed across the West Sudanic dialect area speak Arabic as a vehicular language (see Roth-Laly 1979 for the variety of Abbeche, eastern Chad). In such a context, an increasing number of town dwellers in western Sudan (Darfur and Kordofan) are shifting from their ancestral languages to Arabic (Manfredi 2012; Roset 2018). It is thus not surprising that Baggara dialects and the Arabic varieties spoken by non-Arab sedentary groups display a high degree of mutual intelligibility. However, we will also see that, due to the stronger influence from local languages, the Arabic varieties of sedentary communities display a number of divergent morpho-phonological features (e.g., depharyngealization, lack of implosive consonants, weakening of F.PL as morphological category, cf. 4.1, 5) that allow us to draw a distinction between Baggara and 'Sedentary' West Sudanic Arabic.

#### **3. Sample and Sources**

The data used for our comparative overview of Baggara Arabic come from different sources. First, we refer to a heterogeneous literature that provides linguistic information on different Baggara dialects of Chad and Sudan. These bibliographical sources have been complemented by new first-hand data gathered during fieldwork in the White Nile region (Sudan) in 2018. The data and the sources can be summarized as follows:


For the aims of this study, we will also make reference to Shukriyya Arabic (ShA) (Reichmuth 1983) which provides room for comparison between Baggara Arabic and an Eastern Sudanic dialect. Moreover, we will largely disregard both ESA and WSA Sudanic sedentary dialects in the quest to provide evidence for dialect convergence or divergence within the Baggara Belt. Nonetheless, the comparison between Baggara Arabic and the sedentary dialects of Chad and Sudan offers a number of interesting hints concerning the internal classification of WSA as a whole (cf. 5). The geographical distribution of the dialects included in our sample is shown in Figure 2.

**Figure 2.** The geographical distribution of the Arabic dialects of the sample.

#### **4. Assessing Diatopic Variation across the Baggara Belt**

In this section, we provide a qualitative overview of selected dialect features of Baggara Arabic. The analysis is primarily intended to show the geographical distribution of these features and to explain the dynamics of dialect convergences and divergences across the Baggara Belt. For this purpose, we will analyze phonological (4.1), morphosyntactic (4.2), and lexical (4.3) isoglosses from the perspective of internally motivated, externally motivated, and multi-causal changes.

#### *4.1. Phonological Features*

If we omit a few phonological features attested all across the Baggara Belt (e.g., \**-a > -e* in pre-pausal position, as in *\*kabīr-a > kabīr-e* "big (F)"; presence of backness vowel harmony, as in \**simsim* > \**sumsim* > *sumsum* "sesame"), Baggara Arabic is affected by a high degree of phonological variation. This is actually not surprising, as phonological features typically have a low stability gradient and they are therefore more likely to undergo both internally and externally induced changes. If we take a look at the domain of pharyngealized (i.e., emphatic) consonants, Baggara Arabic presents a number of phonological splits producing a rich set of non-etymological pharyngealized consonants.


However, the etymological pharyngeal consonants \**ḥ* and \**ʕ* are diversely affected by depharyngealization. Owens (1993b, 2020) claims that the phonological developments \**ḥ > h* and \**ʕ > ʔ, Ø* represent a defining feature of WSA and he further argues that "this change could have been due to substratal influence, originally non-native speakers having difficulty in mastering ḥ/ʕ." (1993b, p. 163). This hypothesis is indeed plausible for sedentary varieties of WSA spoken by non-Arab groups, which are characterized by the complete loss of pharyngeal consonants (Jullien de Pommerol 1999b, p. 11; Roth-Laly 1972, p. 68; Manfredi 2013, p. 24; Roset 2018, p. 18). However, the situation is quite different when it comes to the distribution of \**ḥ* and \**ʕ* across the Baggara Belt.

2.
	- NA *\*ḥilim > hilim* 'he dreamt' (dream.3SG.M), *\*gaʕad > gaʔad* 'he sat down' (sit.3SG.M)
	- BbA *\*ḥille > hille* 'village', *\*naʕla > naʔala* 'sandal'
	- BaA *ʕud* 'stick', *ḥille* 'village'
	- KA *ḥilim* 'he dreamt' (dream.3SG.M), *gaʕad* 'he sat down' (sit.3SG.M)
	- WA *ḥalla* 'he released' (release.3SG.M), *gaʕad* 'he sat down' (sit.3SG.M)

Example (2) shows that, with the exception of NA and BbA, Baggara dialects retain pharyngeal consonants. Furthermore, KA gives evidence of pharyngealization of the etymological glottal stop in intervocalic position (e.g., *\*raʔā > riʕa* see.3SG.M; Manfredi 2010, p. 232). Further to this, at the beginning of the 20th century, Carbou (1913) and Lethem (1920) reported the presence of pharyngeal consonants in western Chad and Nigeria, respectively. This state of affairs inevitably weakens the hypothesis that depharyngealization in western Baggara dialects is a product of substrate interference due to second language acquisition. Conversely, if we consider that Nigerian Arabs have developed a high bilingual proficiency in Kanuri and/or Hausa following their progressive sedentarization (cf. 3), a more plausible hypothesis is that depharyngealization is a relatively recent phenomenon triggered by language attrition. In this perspective, speakers of NA gradually lost their ability to produce the etymological sounds \**ḥ* and \**ʕ* and they replaced them with their laryngeal and glottal counterparts (cf. Lucas and Manfredi 2020, p. 6). All things considered, depharyngealization is not a defining feature of WSA, but it is rather an important phonological feature distinguishing Nigerian and Baguirmi Arabic from other Baggara varieties.

The innovative nature of the westernmost Baggara phonologies (i.e., NA and BgA, cf. Figure 3) is confirmed by other features differentiating them from eastern Baggara varieties. This is the case of the insertion of an epenthetic vowel after *x, h,* and *q*, whose occurrence is also limited to Nigeria and western Chad.

3.
	- NA, BbA *\*aḥmar > ahamar* "red", *axḍar > axadar* "green"
	- BaA, KA, WA, ShA *aḥmar* "red", *axḍar* "green"

Owens (1993b, pp. 96–97, 161) and Owens and Jidda (2006, p. 710) consider guttural epenthesis a generalized feature of WSA. This is because this syllable change is also attested in most sedentary dialects of Chad and western Sudan (Jullien de Pommerol 1999b, pp. 28–29; Roth-Laly 1979, pp. 107–8; Roset 2018, p. 29). In this general context, eastern Baggara dialects (BaA, KA, and WA) as well as ShA are characterized by a higher degree of stability of syllable structures, as they do not display guttural epenthesis.
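The two western changes discussed so far, depharyngealization and guttural epenthesis, can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions rather than a description of the actual phonology: the function names are hypothetical, the set of triggering consonants is taken at face value from the text, the epenthetic vowel is fixed as *a* (as in the cited forms), and emphatic consonants are left untouched.

```python
# Illustrative sketch only; names and segment inventories are assumptions.
VOWELS = set("aeiouāēīōū")
GUTTURALS = {"x", "h", "q"}           # triggers as listed in the text
DEPHARYNGEALIZE = {"ḥ": "h", "ʕ": "ʔ"}  # *ḥ > h, *ʕ > ʔ


def depharyngealize(word: str) -> str:
    """Replace pharyngeals with their laryngeal and glottal counterparts."""
    return "".join(DEPHARYNGEALIZE.get(seg, seg) for seg in word)


def guttural_epenthesis(word: str, vowel: str = "a") -> str:
    """Insert a vowel after a guttural that is directly followed by a consonant."""
    out = []
    for i, seg in enumerate(word):
        out.append(seg)
        followed_by_consonant = i + 1 < len(word) and word[i + 1] not in VOWELS
        if seg in GUTTURALS and followed_by_consonant:
            out.append(vowel)
    return "".join(out)


def western_reflex(word: str) -> str:
    """Apply depharyngealization first, then guttural epenthesis."""
    return guttural_epenthesis(depharyngealize(word))


if __name__ == "__main__":
    for etymon in ("aḥmar", "ḥilim", "gaʕad"):
        print(etymon, "->", western_reflex(etymon))
```

Applied in this order, the sketch derives *ahamar* from *aḥmar* and *gaʔad* from *gaʕad*, while *ḥilim* surfaces as *hilim* without epenthesis because the guttural is followed by a vowel.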

An important segmental feature subject to diatopic variation across the Baggara Belt is the reflex of the etymological dental emphatic \**t..* It has been argued that the common reflex of \**t.* in WSA is an implosive emphatic *ѐщ* (Owens and Jidda 2006, p. 709). Nonetheless, sedentary dialects of Chad and western Sudan have *t* as the most common reflex of the etymological voiceless dental emphatic (Jullien de Pommerol 1999b, pp. 28–29; Roth-Laly 1972, p. 69; Manfredi 2013, p. 24; Roset 2018, p. 41). Looking at Baggara dialects, the implosive emphatic *ѐщ* presents different phonological statuses.

**Figure 3.** Depharyngealization.

	- BgA *t.awwal* 'he was late' (be\_late.3SG.M)
	- KA *t.aršan¯* [*ѐщ* arša:n], *t.¯ın* 'mud'
	- WA *t.awwa* 'he lifted' (lift.3SG.M)

If *ѐщ* is a full-fledged phoneme in NA, in KA the implosive emphatic [*ѐщ*] only occurs as an allophone of *t.* before open vowels (Manfredi 2010, p. 44). The other three Baggara varieties included in our sample align to ESA dialects in presenting an etymological \**t.* (see Reichmuth 1983, p. 44 for ShA). Concerning the origin of the implosive realization of *\*t.,* Owens (2020, p. 179) argues that it represents a possible candidate for substrate interference in WSA, as Fulani also has a dental implosive consonant. This hypothesis, however, neglects the fact that Upper Egypt dialects also present a glottalized realization for the etymological \**t..* Khalafallah (1969, p. 29), for example, states that *t.* and its glottalized reflex are in partial complementary distribution in Sa'idi Arabic. Behnstedt and Woidich (1985), on their part, claim that a glottalized realization of *t.* is attested from Asyut to Aswan. More recently, Schroepfer (2016, p. 152) shows that, similar to what is observed in KA, in Aswan Arabic, *t.* is in variation with [*ѐщ*] in pre-vocalic position. Accordingly, it seems plausible to think that the origin of *ѐщ* is an inherited feature from Upper Nile dialects rather than a phonological innovation due to substrate interference from Fulani. In this perspective, the phonological status of *ѐщ* in NA would have been strengthened only at a later stage due to broader areal diffusion, as most local languages of the Chari-Baguirmi region present implosive consonants (Maddieson 2013). Conversely, the absence of the reflex *ѐщ* in the other Baggara varieties could be explained in light of dialect leveling towards regional standards lacking glottalized realizations (i.e., Sudanese and Chadian Arabic, *t.*). 
All things considered, the complex geographic distribution of the implosive emphatic *ѐщ* across the Baggara Belt should be interpreted as the result of a multi-causal change involving language inheritance from Upper Nile dialects, areal diffusion from local languages in the Lake Chad region, and dialect leveling.

The phonological reflexes of the etymological voiced velar fricative *\*g˙* are also variably affected by (im)plosivization.

5. NA *\*g˙adi > q ¯ adi ¯* 'there', *\*šugul > suqul ˙* 'thing' BgA, BaA \**gayyar > ˙ ғ љyyar* 'he changed' (change.3.SG.M), \**šugul > šu ˙ ғ ul* 'thing' KA \**ganam > qanam ˙* [ä*љ*n*љ*m] 'goats', \**šugul > šoqol ˙* 'thing' WA *\*ganam > ˙ ganam ˙* 'goats', *\*šugul > šu ˙ gul ˙* 'thing'

Example 5 shows that Chadian Baggara dialects (i.e., BbA, BaA) have a uvular implosive *ғ* as basic reflex of the voiced velar fricative *\*g˙* (Decobert 1985, pp. 45–47; Zeltner and Tourneux 1986, pp. 16, 23), whereas NA (Owens 1993a, p. 20) and KA (Manfredi 2010, p. 231) present a voiceless uvular plosive *q,* which is typical of Levantine Bedouin dialects (Rosenhouse 2006, p. 261). Similar to what is observed with *t.* [*ѐщ*] above, in KA, the uvular implosive [*ғ* ] can occur as an allophonic realization before open vowels. Lastly, in line with ESA dialects (i.e., ShA, Reichmuth 1983, p. 46), the Baggara variety of the White Nile does not present any innovative development of the voiced velar fricative *\*g˙*. In this context, it should be also remembered that WSA sedentary dialects stand apart from Baggara dialects in that they typically present a voiceless reflex *x* for the etymological *\*g˙* (Roth-Laly 1994a, p. 77; Roset 2018, p. 36).

According to Owens (1993b, p. 165) the occurrence of a uvular implosive *ғ* in the Chari-Baguirmi region provides strong evidence for a Fulani substratal input in WSA, as Fulani (Niger-Congo) is among the few languages in the area with implosive *ғ* . In fact, if we look at the geographic distribution of *ғ* across the Baggara Belt (cf. Figure 4), it is plausible that *ғ* represents an innovation emerging in Chad from a former voiced velar plosive *q*, which is still attested at the fringes of the Baggara dialect continuum (i.e., Nigeria in the west, Kordofan in the east). Furthermore, it is also true that *ғ* is rarer than *ѐ* in the local languages spoken across the Baggara Belt. Despite this, there is no particular reason to postulate a Fulani substrate in Baggara Arabic, as the uvular implosive *ғ* is also found in Afro-Asiatic (e.g., the Chadic languages Tera and Bole) and Nilo-Saharan (e.g., Central Sudanic-Sara-Bongo) languages spoken in the wider Lake Chad region (Maddieson 2013). In view of the above, *ғ* can be better analyzed as a phonological innovation that emerged in Chad due to areal diffusion and whose geographical dispersion across the Baggara Belt is affected by both the persistence of conservative phonological features (i.e., *q* in NA and KA) and by the influence of ESA varieties (i.e., *g˙* in WA).

**Figure 4.** Implosive consonants.

Another segmental feature that draws attention in our phonological comparison of Baggara dialects is the occurrence of the voiceless postalveolar affricate *cˇ* [tʃ]. Owens (1993b, p. 161) and Roth-Laly (1994b, p. 77) consider *cˇ* a Pan-Sudanic feature (cf. 1). In fact, if we exclude ESA Bedouin dialects (i.e., ShA, Reichmuth 1983, p. 43), *cˇ* seems to be attested all across the Sudanic dialect area. Despite this, the phonological status of *cˇ* varies a great deal across the Baggara Belt. In most cases, *cˇ* is found either in ideophones (cf. 1) or in loanwords from different local languages.

	- BgA *cat ˇ* IDPH, *cilal ˇ* 'milvus'
	- BaB *cut ˇ* IDPH, *kolˇci* 'groundnuts'
	- KA *call ˇ* IDPH, *cor ˇ oro ¯* 'topping for sorghum'
	- WA *call ˇ* IDPH

These non-etymological occurrences of *cˇ* may suggest a marginal phonemic status of this phoneme in Baggara dialects. Nevertheless, the origin of the phoneme *cˇ* can also be traced back to internal phonological changes.


As we can see in the previous examples, in Baggara Arabic *cˇ* may represent either a minority reflex of the etymological *š* or the output of phonological assimilation between the voiced postalveolar affricate *j* and a following laryngeal *h.* Still, none of these internal changes is attested in WA (e.g., *šakka* 'he pierced'; *wašš* 'face'). The weakening of the phonological status of *cˇ* in the White Nile region suggests a western (i.e., Chadian) origin for this phoneme.

Lastly, an interesting case of suprasegmental change that variably affects Baggara Arabic is represented by the regressive assimilation *nt > tt* in 2nd person independent personal pronouns (cf. Figure 5).

**Figure 5.** \**nt > tt* in 2nd person independent pronouns.

As we can see in Table 1, western Baggara dialects (i.e., NA, BbA and BaA) are characterized by conservative pronominal forms retaining the nasal-alveolar cluster *\*nt*, whereas eastern Baggara dialects (i.e., KA and WA) give evidence of the regressive assimilation \**nt > tt*. Given that Baggara Arabic as a whole is characterized by a remarkable stability of bound pronouns,<sup>1</sup> regressive assimilation in independent pronouns suggests that free morphemes are more likely to undergo phonological change than bound morphemes. In comparative terms, the west–east split in the domain of 2nd person independent personal pronouns proves the integration of ESA Bedouin features in eastern Baggara dialects (cf. 4.2, 4.3, 5), as the assimilation *nt > tt* is also attested in ShA (Reichmuth 1983, p. 102).


#### *4.2. Morphosyntactic Features*

Baggara Arabic is bound by a few innovative morphosyntactic changes that distinguish it from other WSA and ESA dialects. For example, all the Baggara dialects included in our sample elide 1st singular and 2nd singular masculine pronominal affixes in the suffixed conjugation of consonant-final verbs lacking nominal/pronominal objects, as shown by Table 2.

**Table 2.** Elision of pronominal subjects in the suffixed conjugation.


9. KA, elision of 1st SG person in absence of nominal objects *wis.íl* arrive-1SG 'I arrived.'

*wis.il-ta kudugli ¯* arrive-1SG Kadugli 'I arrived in Kadugli.'

In these conditions, stress is grammatically distinctive as it distinguishes between 1st singular/2nd singular masculine and 3rd singular masculine pronominal subjects (Zeltner and Tourneux 1986, p. 72; Owens 1993a, p. 111; Manfredi 2010, p. 240). By contrast, sedentary dialects of Chad and Sudan present the suffix *-ta* for both 1st singular and 2nd singular masculine persons (Roth-Laly 1979, p. 2; Owens 1993b, p. 131; Dickins 2006, p. 563; Manfredi 2013, p. 15; Roset 2018, p. 177), whereas the form *-t* seems to be limited to ESA Bedouin dialects (i.e., ShA, Reichmuth 1983, p. 281).

In contrast to the above, verbal inflection may also be affected by an important degree of diatopic variation across the Baggara Belt. This is the case of 1st singular/1st plural person marking in the prefixed conjugation.

As is well known, Arabic dialects can be broadly classified into three morphological types depending on the 1st singular/1st plural pronominal affixes of the prefixed conjugation (i.e., type-1 *(b-)a-...* 1SG vs. *n-* ... 1PL; type-2 *a-...* 1SG vs. *n-* ... *-u* 1PL; type-3 *n-...* 1SG vs. *n-* ... *-u* 1PL). If type-1 is mainly found in eastern (i.e., Levantine) Arabic dialects, type-3 is generally supposed to be a western (i.e., Maghrebi) feature spreading up to eastern Egypt. Type-2, on its part, seems to be limited to a few buffering zones in the Nile Delta and in Upper Egypt (Behnstedt 1998).
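The three-way typology above can be read as a small decision rule over the 1st singular and 1st plural affixes. The following is a minimal sketch, assuming the affixes are passed as plain strings; the function name `classify_prefix_paradigm` and its return labels are hypothetical and introduced only for illustration, not taken from the literature cited here.

```python
from typing import Optional


def classify_prefix_paradigm(
    sg_prefix: str, pl_prefix: str, pl_suffix: str = ""
) -> Optional[str]:
    """Classify a 1SG/1PL prefixed-conjugation paradigm (illustrative only).

    type-1: (b-)a- 1SG vs. n-      1PL
    type-2: a-     1SG vs. n- ... -u 1PL
    type-3: n-     1SG vs. n- ... -u 1PL
    Returns None when the affixes match none of the three types.
    """
    # Drop the hyphens used in the conventional notation (e.g. "n-", "-u").
    sg = sg_prefix.strip("-")
    pl = pl_prefix.strip("-")
    suffix = pl_suffix.strip("-")

    if pl != "n":
        return None
    # Type-1 allows an optional preverbal b- before the 1SG prefix a-.
    if sg in ("a", "ba") and suffix == "":
        return "type-1"
    if sg == "a" and suffix == "u":
        return "type-2"
    if sg == "n" and suffix == "u":
        return "type-3"
    return None


print(classify_prefix_paradigm("a-", "n-"))        # schematic type-1 paradigm
print(classify_prefix_paradigm("a-", "n-", "-u"))  # schematic type-2 paradigm
print(classify_prefix_paradigm("n-", "n-", "-u"))  # schematic type-3 paradigm
```

The rule treats the plural suffix *-u* as the feature separating type-1 from type-2, and the 1st singular prefix as the feature separating type-2 from type-3, which is how the three types are defined in the paragraph above.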

Despite important differences in their historical reconstruction, Owens and Jidda (2006) and Behnstedt (2016) agree on the fact that the attestation of type-3 in Chad is proof of the migration of speakers out of Upper Egypt into the Sudanic region. Nevertheless, if we look at the paradigms in Table 3, it clearly appears that the diffusion of type-3 in Baggara Arabic is affected by both internal developments and dialect contact. On one side, Chadian Baggara dialects (i.e., BgA, BaA) present type-3 forms *n-...* 1SG vs. *n-* ... *-u* 1PL.

**Table 3.** 1SG/1PL marking in prefixed conjugation.


On the other side, WA shares with ShA a more conservative type-1 paradigm, *a-...* 1SG vs. *n-* ... 1PL. Nigerian Arabic, for its part, presents an innovative type-1-derived paradigm in which the preverbal marker *\*b-* has been integrated into the 1st singular affix *\*a-*, i.e., *ba-* 1SG vs. *n-* ... 1PL. In light of the above, it seems plausible to think that both type-3 and type-1 dialects played a role in the emergence of Baggara Arabic, the former type still covering the core of the Baggara Belt (i.e., Chad), with the latter now being limited to the geographical fringes of the dialect continuum (i.e., Nigeria, White Nile). In this context, KA falls again into a contact zone characterized by a mixed type-2 paradigm, *a-...* 1SG vs. *n-* ... *-u* 1PL. It is also worth remembering that most WSA sedentary dialects present a type-3 prefixed conjugation (Roth-Laly 1979, p. 3; Jullien de Pommerol 1999b, p. 131; Roset 2018, p. 178) and that for this reason, the forms *n-...* 1SG vs. *n-* ... *-u* 1PL can well be considered as a WSA feature.

However, non-native varieties of Arabic in western Sudan tend to neutralize number distinction of 1st persons. Accordingly, they generalize the use of the prefix *n-* to both 1st singular and plural persons (Manfredi 2013, p. 42). This instance of paradigm simplification by analogy (i.e., *n-...* 1SG vs. *n-* ... *-u* 1PL > *n-* 1) proves that Baggara dialects represent the main target varieties of non-native speakers of Arabic in Chad and western Sudan. Figure 6 summarizes the distribution of the pronominal prefixes of the prefixed conjugation across the Baggara Belt.

**Figure 6.** 1st person marking in prefixed conjugation.

Morphosyntactic variation across the Baggara Belt can also be induced by the emergence of isolated features due to internal change. This is the case of the interrogative pronominals "who" and "which one".

10. NA, BgA *miné* 'who'/*atú* 'which (one)'; WA *min = ú* 'who' (who = 3SG.M)/*yat = ú* 'which (one)' (which = 3SG.M); KA *at = ú* 'who, which (one)' (which = 3SG.M)

Generally speaking, like most Arabic dialects, Baggara Arabic marks a distinction between the non-selective pronoun 'who' and the selective pronoun 'which one'. In WA, these interrogative pronominals are inflected for number and gender by means of accented clitic pronouns, whereas western Baggara dialects (i.e., NA, BbA) present two invariable pronominal forms. In this context, KA is the only Baggara dialect to express both non-selective and selective meanings by means of a single morphological form inflected for number and gender (i.e., *at=*, Manfredi 2010, p. 218). Given that Niger-Congo and Nilo-Saharan languages in contact with Baggara Arabic in the Nuba Mountains region (Southern Kordofan) formally distinguish 'who' and 'which one', this isolated feature of KA can only be imputed to an internal change not shared by other Baggara dialects.

Finally, diatopic variation in morphosyntactic structures can also be a product of the diverse impact of areal diffusion on Baggara Arabic. This kind of contact-induced change can be exemplified by two competing comparative constructions across the Baggara Belt.

```
11. NA, BgA, KA, exceed comparative with fat¯ 'pass, surpass'
h. ajm = í b = u-fut = ak ¯
size = 1SG IND = 3SG.M-surpass = 2SG.M
'I'm bigger than you.'

WA, ShA, elative form with locative marking
ana akbar min = ak
1SG big from = 2SG.M
'I'm bigger than you.'
```

Example 11 shows that, in line with most Sub-Saharan languages (Stassen 2013), Baggara Arabic presents exceed comparative constructions in which the standard is constructed as the object (i.e., *=ak* 2SG.M) of the transitive verb *fat¯* 'surpass'. Still, this instance of grammatical calquing induced by areal diffusion does not reach WA which, like ShA and other ESA dialects, presents a more common locational comparative construction with the standard introduced by the preposition *min* 'from'. The absence of exceed comparative constructions in the eastern fringes of the Baggara Belt is plausibly another outcome of dialect leveling towards ESA and it is another argument in favor of a west–east migration of Baggara groups.

#### *4.3. Lexical Features*

The Baggara dialects included in our sample share a number of interesting lexical innovations. These include the lexemes *h. /harray¯* 'sun' (Behnstedt and Woidich 2011, p. 402) and *elmi* 'water' (< *\*al = mi < \*al = māʔ*, Behnstedt and Woidich 2011, p. 420), which differ from the more common *šemis/šemiš* 'sun' and *mōya* 'water'. Furthermore, Baggara Arabic gives evidence of a few conservative lexical features, as in the case of the root *\*raʔā* for the verb 'see' (Behnstedt and Woidich 2014, p. 330; cf. 4.1). Despite these surface affinities, Baggara Arabic displays a high degree of lexical variation. First of all, the westernmost Baggara dialects (i.e., NA, BgA) stand out from the other varieties included in our sample due to a number of loanwords from Kanuri and other languages spoken in the Lake Chad region (Behnstedt and Woidich 2011, p. 23; 2014, p. 730).


Most commonly, lexical variation across the Baggara Belt results from dialect contact. The possessive particles (POSS) in Table 4 are a case in point.

**Table 4.** Possessive particles.


As we can see, western Baggara dialects (i.e., NA and BgA) are bound by the forms *hana* POSS.SG.M and *hine(n)* POSS.PL*,* as opposed to *hul¯* POSS.SG.M and *hilel¯* POSS.PL in WA. Conversely, the form *hil/h¯ıl* POSS.SG.F is attested all across the Baggara Belt, whereas the singular feminine form *hint* is limited to NA and KA. In this overall situation, KA clearly falls into a buffer zone in which western and eastern lexical forms are still in competition (Manfredi 2012). On the one hand, *han(a)/hint(a)/hinen¯* forms originate in Upper Egypt (Owens 1993b, p. 111) and they represent a WSA feature within Sudanic Arabic as they are also attested in the sedentary dialects of Chad. On the other hand, *hul/h ¯ ¯ıl/hilel¯* possessive particles are common to ESA dialects spoken by groups that penetrated Sudan directly from the Arabian Peninsula (i.e., ShA, Reichmuth 1983, pp. 111–12) and they are not attested in the urban dialects of eastern Sudan. The complex geographical distribution of *han(a)/hint(a)/hin ¯ en¯* and *hul/h ¯ ¯ıl/hilel¯* across the Baggara Belt (cf. Figure 7) seems to indicate a longstanding coexistence of these possessive forms and may corroborate the idea that speakers of ESA dialects have also been historically involved in the Baggarization process (cf. 2, 5).

**Figure 7.** Possessive particles.

The intensifier 'very' provides another example of lexical variation due to dialect contact and leveling across the Baggara Belt.

14. NA, BgA *bilhen¯* 'very'; BaA, KA *bilh.en¯* 'very'; WA *šed¯ıd* 'very'

In fact, except WA, all the Baggara dialects included in our sample display the form *bilhen/bil ¯ h.en, ¯* which finds its origin in the prepositional phrase \**balh. ayl* 'very' (<\**bi-l-h. ayl* 'by strength') attested in a number of Middle Eastern Bedouin dialects (Rosenhouse 2006, p. 267). WA, on its part, aligns with ESA in using the adjective \**šad¯ıd* 'strong' as intensifier. This lexical isogloss confirms that the Baggara dialect of the White Nile is the most affected by contact with eastern Sudanic dialects.

#### **5. Conclusions**

Based on the previous comparative overview of Baggara Arabic, we can now attempt to reconstruct the main dynamics of dialect convergence and divergence across the Baggara Belt. First of all, despite their common ethnolinguistic and sociohistorical background, Baggara dialects display a high degree of diatopic variation. Indeed, if we exclude common Pan-Sudanic features (cf. 1), there are only a few isoglosses that are shared by all five varieties included in our sample. These comprise the vowel change \**-a > -e* in pre-pausal position, the presence of backness vowel harmony (cf. 4.1), the forms of bound personal pronouns (cf. note 2), the elision of pronominal subjects in the suffixed conjugation (Table 2, ex. 9), and several lexical isoglosses (cf. 4.3).

Secondly, the lack of a number of WSA innovations (e.g., etymological *cˇ*, ex. 7–8; *bilh.en¯* 'very', ex. 14) in the White Nile region supports the hypothesis of a west (i.e., Chad) > east (i.e., Sudan) migration of Baggara groups. In fact, if KA still gives evidence of competing WSA and ESA features (e.g., type-2 prefixed paradigm, Table 3), WA is clearly more affected by contact with ESA dialects, and therefore it is more similar to ShA (e.g., type-1 prefixed paradigm, Table 3; *hul/hil ¯ el¯* SG.M/PL possessive particles, Table 4). This suggests that there is no overlap between the ethnic and the dialect borders of the Baggara Belt, as WA lost most of its WSA features while integrating several ESA innovations. In this context, the attestation of both WSA and ESA features across the Baggara Belt (e.g., type-1/type-3 paradigms, Table 3; *h¯ıl/hint* F.SG possessive particles, Table 4) points to a longstanding coexistence of these dialect sub-types and provides evidence that the Baggarization process did not exclusively involve speakers of WSA varieties.
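The west–east patterning described here can be made concrete by tabulating a few of the isoglosses from Section 4 and counting how often two varieties agree. The following is a minimal sketch: the dictionary `FEATURES` only records values stated explicitly in the text (a variety is left out of a feature when the text does not mention it, or when its label is ambiguous), and the function `agreement` with its simple unweighted count is an illustrative assumption, not the authors' method.

```python
from typing import Dict, Tuple, Union

Value = Union[bool, str]

# Values as stated in Section 4; missing entries mean "not stated in the text".
FEATURES: Dict[str, Dict[str, Value]] = {
    # 1SG/1PL marking in the prefixed conjugation (NA is type-1-derived).
    "1sg_1pl_paradigm": {
        "NA": "type-1", "BgA": "type-3", "BaA": "type-3",
        "KA": "type-2", "WA": "type-1",
    },
    # Intensifier of the bilhen type (ex. 14).
    "intensifier_bilhen": {
        "NA": True, "BgA": True, "BaA": True, "KA": True, "WA": False,
    },
    # Exceed comparative construction (ex. 11); BaA is not mentioned.
    "exceed_comparative": {"NA": True, "BgA": True, "KA": True, "WA": False},
    # Assimilation nt > tt in 2nd person independent pronouns; the third
    # western variety is labelled "BbA" in the text and is left out here.
    "nt_tt_assimilation": {"NA": False, "BaA": False, "KA": True, "WA": True},
}


def agreement(a: str, b: str) -> Tuple[int, int]:
    """Return (shared values, features compared) for two varieties."""
    agree = compared = 0
    for values in FEATURES.values():
        if a in values and b in values:
            compared += 1
            if values[a] == values[b]:
                agree += 1
    return agree, compared


for pair in [("NA", "BgA"), ("BgA", "BaA"), ("KA", "WA"), ("NA", "WA")]:
    print(pair, agreement(*pair))
```

With so few features the counts are only suggestive, but they show the kind of comparison the argument relies on: a given pair of varieties can agree on one isogloss and diverge on another, so no single feature defines the dialect area.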

In terms of contact-induced change, we have shown that there is no linguistic evidence for a Fulani substrate in Baggara Arabic. In fact, the heterogeneity of the languages spoken across the Baggara Belt limits the possibility of a substrate interference via language shift. In such conditions, contact-induced innovations are mainly a product of areal diffusion on a west–east axis, from the Lake Chad region in direction of the White Nile. This is the case of both the implosive consonant *ғ* (ex. 5) and exceed comparative constructions (ex. 11) whose grammatical productivity tends to fade eastwards. As far as NA and BgA are concerned, contact-induced changes also occurred as a consequence of language attrition. This is the case of depharyngealization (ex. 2) which should be seen as a relatively recent innovation induced by the high degree of Kanuri/Arabic bilingual proficiency of local sedentarized Baggara Arabs. The prominence of the adstrate over a supposed Fulani substrate is also testified by a number of loanwords occurring in the basic vocabulary of the westernmost Baggara dialects (ex. 12). All things considered, NA and BgA undoubtedly represent the most innovative Baggara varieties of our sample and they cannot therefore be adopted as a dialect prototype for Baggara Arabic. Furthermore, the previous linguistic arguments potentially support the hypothesis of Baggarization as a gradual process of socioeconomic integration rather than a sudden ethnolinguistic hybridization between Arab and Fulani agro-pastoralist groups.

As a final remark, it is without doubt that Baggara Arabic as a whole represents a WSA dialect sub-type. Nonetheless, we have also shown a number of isoglosses opposing most Baggara dialects to sedentary dialects of Chad and western Sudan. These include the presence of pharyngealized and pharyngeal consonants (ex. 1–2), the presence of implosive consonants (especially *ғ*, ex. 5), and the elision of pronominal subjects in the suffixed conjugation (Table 2). This suggests that, despite the generalization of traditional Bedouin features across the Sudanic area (cf. 1), WSA is actually affected by an important eco-linguistic variation. Further to this, the structural divergences between Baggara Arabic and the sedentary dialects of Chad and western Sudan reduce the geographical extent of a number of isoglosses that were formerly thought to represent pan-WSA features (i.e., depharyngealization, implosivization). However, this is only partially true for NA and BgA, which, being predominantly spoken by sedentarized Baggara Arabs, are phonologically closer to sedentary WSA dialects (ex. 3). All things considered, WSA features vary significantly according to both diatopic and eco-linguistic factors. Although there is no sharp boundary between Bedouin and Sedentary dialects in the Sudanic area, eco-linguistic factors still matter and should therefore be taken into account in further research on the area.

**Author Contributions:** Both authors contributed to several aspects of the study, specifically: conceptualization, S.M. and C.R.; methodology, investigation, resources, and data curation, S.M.; writing, original draft preparation, S.M. and C.R.; writing, review and editing, S.M. and C.R.; supervision, S.M. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Data sharing not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **Note**

<sup>1</sup> Among these, we can recall the forms *= a* 3SG.M, *= ki* 2SG.F (after both consonant- and vowel-final items), and *= ku* 3PL.M, which are also variably found in the sedentary dialects of Chad and western Sudan.

#### **References**

Abu-Absi, Samir. 1995. *Chadian Arabic*. München and Newcastle: Lincom Europa.

Behnstedt, Peter, and Manfred Woidich. 1985. *Die Ägyptisch-Arabischen Dialekte*. Beihefte Zum Tübinger Atlas Des Vorderen Orients. Reihe B, Geisteswissenschaften. Wiesbaden: L. Reichert.

Behnstedt, Peter, and Manfred Woidich. 2011. *Wortatlas der arabischen Dialekte, Vol. 1*. Leiden: Brill.

Behnstedt, Peter, and Manfred Woidich. 2014. *Wortatlas der arabischen Dialekte, Vol. 3*. Leiden: Brill.

Behnstedt, Peter. 1998. La frontière orientale des parlers maghrébins en Egypte. In *Peuplement et Arabisation au Maghreb Occidental. Dialectologie et Histoire*. Edited by Jordi Aguadé, Patrice Cressier and Angeles Vicente. Madrid-Zaragoza: Casa de Velázquez, pp. 85–96.

Behnstedt, Peter. 2016. The *niktib-niktibu* issue revisited. *Wiener Zeitschrift für die Kunde des Morgenlandes* 106: 21–36.

Bergman, Elisabeth M. 2002. *Spoken Sudanese Arabic: Grammar, Dialogues, and Glossary*. Washington: Dunwoody Press.


Reichmuth, Stefan. 1983. *Der arabische Dialekt der Šukriyya im Ostsudan*. Zürich: Georg Olms.


Roth-Laly, Arlette. 1979. *Esquisse Grammaticale du Parler Arabe d'Abbéché (Tchad)*. Paris: Paul Geuthner.


Zeltner, Jean-Claude, and Henry Tourneux. 1986. *L'arabe dans le bassin du Tchad*. Paris: Karthala.

## *Article* **Mahdia Dialect: An Urban Vernacular in the Tunisian Sahel Context**

**Cristina La Rosa**

Department of Humanities (DISUM), University of Catania, 95124 Catania, Italy; cristinalarosa@unict.it

**Abstract:** This paper aims to present some preliminary results of the linguistic analysis of the dialect of the Wilaya of Mahdia, on which few studies exist, focused mainly on phonology. My analysis, here extended to the morpho-syntactic level, is based on a corpus of interviews taken from some social media pages. The sample will be composed of respondents of different geographical origin (from Mahdia and some nearby towns), gender, age and social background. A deeper knowledge of the Arabic of the Mahdia region, which is a bundle of urban, Bedouin and "villageois" varieties, would help shed new light on the features of the Sah. l¯ı dialects and would add a small piece to the complex mosaic of Tunisian and Maghrebi dialects, whose traditional categories of classification should be reconsidered.

**Keywords:** Mahdia Arabic; Maghribi Arabic; Tunisia; Sahel; urban dialects; Bedouin dialects; *villageois* dialects; Arabic dialectology; Sociolinguistics

#### **1. Introduction**

In recent years, the need to examine and describe more precisely the varieties of Arabic used in the Tunisian Sahel has been central to the scientific dialectological debate. In fact, there are few systematic studies available on Sahel varieties and the data offered need to be partially reinterpreted. In 1950, William Marçais, with regard to "les parlers villageois" of Tunisia, claimed: "on a surtout en vue ici ceux des bourgs et des petites villes du Sahel [ ... ] n'ont fait encore l'objet d'aucune enquête" (Marçais 1950, pp. 210–11).<sup>1</sup> The scholar classified the non-coastal dialects of Sahel as "villageois" and introduced a third category of dialects sharing both sedentary and Bedouin features (see also Marçais and Guiga 1925, p. XXV).

The first systematic study on a Sahel variety is *Textes arabes de Takroûna* by William Marçais. According to him (Marçais and Guiga 1925, p. XIX), "Le parler arabe de Takroûna concorde dans l'ensemble avec ceux des centres agricoles, bourgs et villages, qui parsèment la region côtière de la Tunisie centrale, communément appelée Sâh. el [ ... ] depuis le moyen âge. Séparés les uns les autres par des différences de detail, ces parlers relèvent, quant à la phonétique et à la grammaire, d'un même type general dont le takroûni n'est qu'une variété particulière".<sup>2</sup> Marçais and Guiga offered a linguistic study and also a monumental glossary of the variety considered (Marçais and Guiga 1958).

In 1980, Talmoudi classified the Tunisian dialects into four groups. The varieties of Sahel are urban and semi-urban, the latter "spoken in small villages as Ksibet Susa and Khnis display features of both nomadic and sedentary dialects". The Northern dialects are also "divided into two types: urban and rural. The villagers on the left side of Oued Medjerda speak so-called Zba:li dialects (mountain dialects) which have features in common with North East Algerian vernaculars". The Central Western dialects are rural and nomadic. The rural ones "resemble in several respects the East Arabic dialects". The Southern dialects are divided into three groups: "urban dialects in Sfax, rural in the oasis and nomadic. The latter is spoken by semi-pastoral people in Sahara" (Talmoudi 1980, pp. 10–11). "The genuine dialect of Susa", according to the scholar, is spoken in the Medina by the older generation, but the author also took into account the innovations of the younger generation for the composition of his study on the Arabic of Sousse (Talmoudi 1980, p. 13).

**Citation:** La Rosa, Cristina. 2021. Mahdia Dialect: An Urban Vernacular in the Tunisian Sahel Context. *Languages* 6: 145. https://doi.org/10.3390/languages6030145

Academic Editors: Simone Bettega and Roberta Morano

Received: 2 July 2021 Accepted: 23 August 2021 Published: 27 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Lajmi, in 2009, Zammit, in 2014, and Sellami, in 2019, investigated some features of Sfaxi Arabic, providing new elements for the knowledge of the dialects of Sahel.

Mion (2015, 2018)<sup>3</sup> offered some reflections on the origin of this "third category" of dialects, that is the "villageois", and on the phenomena that characterise Tunisian "villageois" dialects, whose 'mixed' features are the product of a long process of interdialectal contact between a sedentary and a Bedouin variety of Arabic. The latter was introduced by the Banū Hilāl, who invaded the Maghreb around the 11th century.

Mion and Luca D'Anna, during the "Prima giornata di dialettologia maghrebina" (Cagliari, 16 May 2019), launched the research project "The Tunisian Sahel: Dialectological, Historical and Sociolinguistic Perspectives", which aims at shedding new light on the features of the Arabic varieties of Tunisian Sahel, which, in fact, should be better described regardless of the existing rigid classification criteria (Bedouin/urban/rural), which do not highlight the richness characterising the varieties spoken in the region (Mion 2015). A century after Marçais' studies, however, no systematic research on the Bedouin and urban varieties of the Sahel region has yet been conducted.

My paper intends to be a modest contribution to the knowledge of the varieties of the area, starting from the analysis of the dialect(s) of Wilāyat al-Mahdiyya (henceforth Mahdia).

Marçais and later scholars have included Mahdia Arabic among the varieties used in the coastal towns of Sahel, such as Monastir, Sousse and Sfax. Saada (1984, p. 17) included the urban dialect of Mahdia among "les parlers arabes des capitales", but she also stated that she did not have elements to classify the varieties used in the neighbouring villages.

At present, there are few studies entirely dedicated to the Arabic of Mahdia, and the dialects of the surrounding towns and villages have never been described. Attia (1969) offered a phonological analysis of the variety used by the fishing community in the 1960s. What emerges from his paper is a quick description of the phonological inventory of the Mahdia dialect, accompanied by few examples. In his paper, the scholar highlighted some well-known features of Mahdia Arabic: the passage of the interdentals /ṯ/ and /ḏ/ to the plosives /t/ and /d/, the reduction of the diphthongs /ay/ and /aw/ to /ē/ and /ō/ in the middle of a word and the voiced articulation of /q/ in some terms such as *garʕa* "irrigated land" and *gamra* "moon" (Attia 1969, p. 125). He also quickly focused on combinatory phonetics, vowels and vocalic phenomena such as final *imāla*, syllable patterns and prosody. Based on these phenomena, Attia defined Mahdia Arabic as an urban variety.<sup>4</sup> Yoda's study (2008) added some important information to the knowledge of the vowel system of the Mahdia dialect. In fact, the scholar focused on the special status of Mahdia Arabic having /ē/ and /ō/ phonemes, unlike the other sedentary dialects, and attributed the presence of these phonemes to the influence of the nearby village dialects. Besides, in 2019, Yoda published some texts in Mahdia Arabic, accompanied by a simple grammatical sketch concerning mainly phonology. In his paper, he writes that Mahdia Arabic: "is an eastern Maghribi sedentary dialect showing some features of the village dialects of the region, the most conspicuous being the word-final *imāla* ē. Among the sedentary dialects of Tunisia, Mahdāwī dialect [ ... ] is characterized by a five-long-vowel system [ ... ] and the correspondence of the interdentals of Old Arabic [ ... ] to plosives, as attested in most of the Jewish dialects of Tunisia. In this respect the dialect in question is worthy of more detailed descriptions" (Yoda 2019, p. 55).

It is clear, however, that the few existing studies have focused mainly on the vowel and consonant systems. The linguistic analysis of the Mahdia variety should therefore be extended to all linguistic levels. It would also be fruitful to extend the research to the varieties used in the territory near the town, which remain to be investigated (except for D'Anna's study of Chebba; see D'Anna 2020). Mahdia Arabic is worthy of study because it is surrounded by village varieties and, since no language lives isolated from the adjacent varieties, it may show peculiarities owing to contact phenomena. That is, as Yoda already showed, it is an urban variety containing some village features.

In addition, a deeper knowledge of the Arabic of the Mahdia region, which is a bundle of urban, Bedouin and "villageois" varieties, would help shed new light on the features of the Sahel dialects and would add a small piece to the complex mosaic of Tunisian and Maghrebi dialects. A deeper knowledge of the varieties spoken in this area would also help us understand the historical and sociolinguistic dynamics through which these dialects originated, since we have scant data on the history of Tunisian dialect(s). More generally, it would help provide a more detailed and specific classification of the dialects of the area (Taine-Cheikh 2017; Guerrero 2018; Benkato 2019), which share many common features but present many differences that deserve to be highlighted.<sup>5</sup>

#### Mahdia and the Surrounding Towns

Mahdia is a town whose importance and splendour go back centuries. The geographer al-Idrīsī (d. 1175–1176) describes Mahdia as a beautiful town, two days from Sfax and Kairouan. Not long before his arrival, Mahdia had a harbour visited by merchant ships from everywhere: Maghrib, Mashriq, al-Andalus and Christian countries. al-Idrīsī informs us that, in those times, Mahdia was already famous throughout the world for its goods and renowned clothes, which were exported to all other countries, but he adds that, since the Norman conquest, its trade had been greatly reduced. Moreover, the geographer describes the wall surrounding the town as a wonder worthy of mention (Bresc and Nef 1999, pp. 183–84). In al-Idrīsī's times, Mahdia was composed of two towns: Mahdia, the seat of power and the residence of the sovereign, and al-Zawīla. The latter was beautiful and densely populated with merchants. The geographer writes of Mahdia with a sort of nostalgia because the invasion of the "Arabs", that is the Banū Hilāl,<sup>6</sup> and the later Norman conquest destroyed many important and emblematic aspects of the town. Mahdia, however, remained the capital of Ifrīqiya (Bresc and Nef 1999, pp. 183–86).

Nowadays, the vestiges of a flourishing medieval city can be traced in its old *madīna*. Mahdia was built on a peninsula (see Figure 1), situated on the eastern coast in the centre of the Republic of Tunisia. It is 200 km from the capital, Tunis, and has a mild climate, which is usually affected by the Mediterranean air currents. Its economy is based on agriculture, especially oil production, on fishing and on craft industries specialised in producing silk, leather clothing and mosaics. Thanks to its position on a 75-km-long coast, close to Sousse and El Jem, tourism also plays an important role in its economy.<sup>7</sup> Mahdia is a town of pre-Hilalian foundation that later underwent the Hilalian invasion. Founded in 909 A.D. by the Fatimid Caliph ʿUbayd Allāh al-Mahdī as the new capital of the realm, replacing Kairouan, the Aġlabid capital, Mahdia was the first urban settlement on the peninsula. The history of the town is quite complex. The ancient nucleus was the already mentioned quarter of al-Zawīla, which was also the commercial core. During the Hilalian invasion, Mahdia regained its role as capital, after a short time in which the capital was al-Manṣūriyya. In 1087, Mahdia was conquered by the Pisans and Genoese, then by the Normans in 1123 and by the Hammādids in 1134; in 1140, the Normans of Sicily imposed harsh conditions on the town. Then, Roger II brought about the end of the Zīrid dynasty. Until the French protectorate of 1884, the town was attacked and conquered by numerous dynasties and conquerors and was even destroyed and plundered (Talbi 1986, pp. 1246–47). The suburb of al-Zawīla was completely destroyed during the Hilalian conquest and rebuilt in 1200. As a consequence of centuries of riots, pillages and plagues, the composition of the population changed in the 16th and 17th centuries, above all because of two elements: the arrival of the Muslim refugees from al-Andalus and the introduction of the Turkish garrisons.
According to Talbi, in 1986, 60% of the population was composed of descendants of the Kouloughlis (Bearman et al. 1986, p. 366),<sup>8</sup> which has affected onomastics and customs (Talbi 1986, p. 1247).

**Figure 1.** Contemporary map of Mahdia.<sup>9</sup>

Goitein (2010, p. 311) claimed that Mahdia and some coastal towns of Tunisia resisted the perpetual chaos into which the Hilalian invasion threw the country,<sup>10</sup> but their hinterland was lost and was exposed to the subsequent attacks from the Normans and Italians. In the Geniza documents, the Christian and the Almohad conquests are well documented as a catastrophe and a cause of the economic decline of the region. Despite this, the economic exchanges between Tunisia, Italy, Spain and Syria continued, but some merchants and their families were obliged to move to Sicily and Egypt. Moreover, in the 11th century, the repeated pillaging by the Banū Hilāl obliged the inhabitants of the villages to seek safety within the walls of the surrounding towns, while some nomadic populations returned to the steppes (Decret 2003). These population movements, only in part caused by the Hilalian invasion, must have had an important influence on Tunisian Arabic, on which we currently have scant or no data. Mion (2015, pp. 275–76) distinguishes two phases in the Arabization of the Sahel and underlines some relevant historical events which had linguistic consequences. In the pre-Hilali period, the region was probably inhabited by sedentary Arabized people, and during this phase, the main urban features of Sahel Arabic developed. The Hilalian invasions of the 11th century troubled the region so deeply that many urban centres and villages were threatened and disappeared, so that Ibn Ḫaldūn (d. 1406), in the 14th century, writes that only weak traces of sedentary culture could be found in some families from Kairouan or Mahdia. A relevant piece of information, linked to population movement in the Mahdia region and its consequences for the Arabic of the region, is that "en 945–46 Ismāʿīl al-Manṣūr [...] quitta la ville de Mahdia pour établir sa résidence à Sabra, en provoquant la ruine de l'ancien siège de l'empire fatimide et la perte des habitants et ses faubourgs, ce qui nous incite aujourd'hui à voir en cela un évenement qui laisse le champ libre, plus tard, à un repeuplement de la part de gens beaucoup moins urbanisés" (Mion 2015, p. 275).

If we continue our imaginary journey, guided by al-Idrīsī, we find Monastir (al-Munastīr) 30 miles by sea from Mahdia. The geographer reports only that the town has some castles in which fruit is produced and then exported to Mahdia, and that the inhabitants of the latter bury their dead in the cemetery of Monastir (Bresc and Nef 1999, pp. 184–85). After the Arab conquest of the mid 8th century, Monastir became renowned for religious reasons because of its *ribāṭ* and its cemetery, in which important personalities, such as the last members of the Zirid dynasty, were buried.

As recounted by the Imam al-Māzarī (d. 1141), Monastir and Mahdia appear to have been closely connected and seem to have been spared by the Hilalian invasions. Monastir was described as a prosperous town whose religious importance was proved by the pilgrimage of numerous people from the nearby regions, similar to what happened in the holy city of Kairouan (Soucek 1993, pp. 227–29).

Monastir is a town located at the southern end of the Gulf of Hammamet, about 160 km south of Tunis, and today is part of the Wilāyat al-Munastīr (see Figure 2). Its main commercial activities are tourism, the textile industry for wool processing, soap and olive oil production, fishing and the production of salt, an ancient activity already described by al-Bakrī (d. 1094), since the town was built near a salt pan. Monastir is also a university town (for further details on the history of Monastir, see Soucek 1993, pp. 227–29).<sup>11</sup>

**Figure 2.** Map of Monastir.

Msaken (Arabic: M'sākin) is a small town of the Tunisian Sahel located about a dozen kilometres south of Sousse. Administratively dependent on the governorate of Sousse, it sees its population increase in the summer due to the return of expatriates, most of whom work in France.<sup>12</sup> For this reason, its inhabitants call Msaken "la Petite Paris". Its economy is based on olive oil production. There are, however, a number of handicraft and industrial enterprises in the surrounding area, set up by former emigrants who returned to Msaken in the 1980s (Ma Mung 1984).

Msaken was founded during the Hafsid dynasty in the 14th century (Bouhlel 2009, p. 125). A brief description of Msaken as a holy place in Tunisia, together with Kairouan, is contained in *Tunisie et tunisiens* by François Bournand (Bournand 1893, pp. 311–13), who actually quotes the information offered in *Promenades d'une Française dans la régence de Tunis* by Voisins d'Ambre (1884, pp. 171–74). Bournand states that Msaken is a small town of about 9000 inhabitants, 9 km from Sousse, built in a slightly mountainous area planted with olive trees, and famous as a religious place because of the *madrasa* of Sīdī ʿAlī b. Ḫalīfa, renowned in North Africa for the high number of students and the high level of teaching of its university. The author compares Msaken to Seville, Padua, Oxford or Cologne. The town occupied a large area with houses surrounded by greenery, while the centre of the town developed along the main road and had several schools and mosques. According to Mme de Voisins, entry to the holy city was forbidden to Christians and especially Europeans. For this reason, the author claims to have seen Msaken only from a nearby hill and to have found it fascinating. She then gives a rather exotic and Eurocentric description of the town (Voisins d'Ambre 1884, pp. 171–74; Bournand 1893, pp. 311–14).

Ma Mung (1984, p. 163) stated that in 1984 Msaken had 41,219 inhabitants and that it was among the largest 15 or 16 towns of Tunisia, together with Mahdia. In his study, he focused on the importance of the migratory movement that affected a large part of the population from the 1960s onwards, which had important effects on the economy of the country. Today, Msaken is the second town of the Sahel in terms of size; its inhabitants constantly move back and forth to the rural and village hinterland as well as to Sousse. The growing integration of this town into the economic space of the regional metropolis, Sousse, means that it has become a basin of intense activity (industrial, construction and services) and, above all, of employment for the populations of the surrounding areas. The main categories of commuters are those employed in industry and government, especially teachers (Boubakri and Lamine 1992).<sup>13</sup> According to Bouhlel (2009, p. 126), Msaken is close to several small towns that show different linguistic features from each other as well as some common features. The scholar pointed out that Msaken Arabic has some peculiarities that distinguish it from the other Tunisian varieties. Moreover, this variety is in continuous evolution because of its inhabitants' migration movements, and therefore the "original" Msaken variety would be spoken only by elderly people, children, emigrants and housewives.

#### **2. Materials and Methods**

This paper offers some preliminary results of the linguistic analysis of the dialect(s) of the Wilāya of Mahdia, based on a corpus of interviews taken from some social media pages, which will be presented below. The sample is composed of speakers of different geographical origin, gender, age and social background. The videos analysed were collected between 2019 and 2021, and the speakers use varieties of Tunisian Arabic that are more or less dialectal, in the sense that they may be more or less influenced by Modern Standard Arabic and may have mixed features, depending on the speakers' levels of education and their role (e.g., politicians and teachers tend to use a higher register).<sup>14</sup> Moreover, their speech can be more or less influenced by the medium used, i.e., radio or video. The role of the Arabic of the capital, Tunis, and of the main city of the Sahel, Sousse, will be highlighted too.<sup>15</sup>

Starting from the previous studies on Arabic dialectology, and particularly of the Maghribi area and on Tunisian Arabic, I will observe and analyse some of the urban, Bedouin and village isoglosses indicated by Marçais in his studies (Marçais 1950, pp. 207–14) and I will then add some selected morpho-syntactic elements attested in the Arabic of Mahdia.

Even if the focus of this study is the urban vernacular of Mahdia, some linguistic elements of some nearby towns will be underlined in order to follow the suggestions of earlier scholars. For this reason, some video-recordings of speakers from some small villages near Mahdia, such as Tlelsa and Teboulba, have been analysed as well as audios of respondents from Monastir and Msaken, since, according to Marçais, the two towns have village features (Marçais 1950, p. 207). Similarly, Saada's statement "On a recueilli en outre des informations concernant le parler de Bqalta<sup>16</sup> qui ne suffisent pas à le classer (parler de Musulmans)" attracted my attention.<sup>17</sup>

Therefore, several dialogues and interviews from numerous online radio and TV programs and social media information pages have been analysed. Because the real background and origins of speakers in videos available online are difficult to determine, I have mainly selected materials in which the origin of the speakers is specified by the speaker or the interviewer.

#### Sources for Mahdia and Monastir Arabic

For linguistic data on Mahdia and Monastir, I consulted the following online radio stations, which broadcast on their respective Facebook pages:

∗ *Menara Fm* (http://www.menarafm.net/ accessed on 19 August 2021) is a radio station based in Mahdia since 2019 whose editorial line is based on independence and freedom of expression. It offers many interviews with artists and craftsmen from Tunisia and other countries of the Arab world, as well as reports, news and live broadcasts of debates on topical issues chosen by the presenters. It also broadcasts some regular programs, such as *Nahārek zīn*, i.e., "Have a nice day", in which presenters give the weather forecast and the main news of the day every morning.


#### Sources for Msaken Arabic

News on Msaken is provided by the online TV channel *Msaken Tv* and by *Radio RM FM*, a radio station based in the town that broadcasts entertainment programs and news. Both broadcast on their Facebook pages.

#### Sources for Bekalta Arabic

As regards Bekalta, I analysed some videos from the Facebook pages *100% ba9louti* and *Bekalta Today*, which provide daily news, reports and interviews about the town.

Except for the programs involving children and young people mentioned earlier, the speakers involved in the radio programs analysed are men and women of all ages.

When analysing an oral corpus from media or social media, the researcher should be aware of the many challenges of such work. As Van-Mol (2010, pp. 67–68) states, "people of all layers of society appear in these media to participate, by means of oral expression, which reflects a wide variety of language capabilities and layers". Van-Mol also distinguishes two main classes of speakers. The first is made up of professionals, as in the case of the radio presenters in my corpus, or intellectuals who use a top-down communication strategy, adopting a higher register with some dialectal features in order to be understood by less educated people as well. The other is composed of less educated people and non-professionals, i.e., the majority of the speakers in my corpus. However, even if, as Van-Mol states, oral media Arabic gives us the opportunity to observe how people with different linguistic competences communicate with each other, many elements that can influence the variety and register used by speakers should be taken into account. First of all, the kind of medium and program in which they are involved influences their linguistic choices as regards the register and the spontaneity of their speech. Furthermore, the different linguistic competences of younger and older people and of women and men play a role in their linguistic production. In addition, the topic dealt with and the kind of oral text produced, such as dialogues, interviews, monologues and multilogues, are important. Finally, the audience listening to the speakers has to be taken into account because it influences the speakers' linguistic choices.<sup>18</sup>

In spite of these challenges, conducting a linguistic analysis through social media resources has some advantages. Firstly, it offers the opportunity to observe many examples of informal communication among several individuals: thanks to social media pages, it is possible to reach many speakers in different locations, such as small towns and villages. Secondly, it allows research to continue despite the health crisis, by studying a linguistic variety remotely and formulating hypotheses that will then be verified during fieldwork.

Even if using social media resources for linguistic analysis has some advantages, it cannot replace fieldwork. For this reason, the first results presented below will be compared and integrated with those obtained from a period of field research in which I will record some interviews. Fieldwork was the original aim of my research, but due to COVID-19, reaching Tunisia is impossible at the moment. The data provided by the field interviews will supply further elements useful for verifying the hypotheses formulated and will provide a more precise description of the varieties used in the Wilāya. During the fieldwork, any phenomena linked to diatopic and diastratic variation will be highlighted in order to identify possible local variants belonging to particular social groups, such as older and younger people, men and women. Moreover, any possible diachronic element that might have contributed to the formation of Mahdia Arabic will be identified and described. This will be done by choosing a sample of speakers that is as varied as possible and by consulting some Arab historical and geographical works.

Some selected phonological, morphological and syntactic features will be presented below.

#### **3. Results**

#### *3.1. Phonological Remarks*

#### 3.1.1. Interdentals

It is well known (among others, Marçais 1950, p. 201; Attia 1969, p. 22; Yoda 2008, 2019) that in Mahdia Arabic the interdentals /ḏ/, /ṯ/ and /ḏ̣/ are replaced by the corresponding dental consonants /d/, /t/ and /ḍ/. The data collected from the online social media pages analysed are less clear-cut and show a certain variation.<sup>19</sup>

#### /ṯ/

There were several cases of dental realisation of /ṯ/ as /t/ in initial and medial positions: *istitna'iyy ¯* , *istitna'iyya ¯* "exceptional", *aktar* "more", *al-ten¯ ¯ı* <sup>20</sup> "the second", *ymatt*@*l* "represents/plays", *at¯ ar¯* "antiquities, ruins" and the plural *at¯ ar¯ ¯ın*, *el-gimˇ* ޏ*a l-ten¯ ¯ı wella l-tleta ¯* "the second or the third week". The interdental phoneme is also attested, as in *tawra* "revolution", *itn¯ın* "two (m.)", *tmaniya ¯* "eight", *al-teniya ¯* "the second", *tleta ¯* or *tle¯ta* "three (f.)", *aktar* "more", *mitel¯* "example", *tnaš¯* "twelve" and *kari ¯ tiyya* "catastrophic". In some cases, both the dental and the interdental pronunciation occur in the same informant: *mitel¯* "example", *tnaš¯* "twelve", but *ymattel* "represents/plays" and *tlatamiya ¯* "three hundred", *tmaniya ¯* and *tmaniya ¯* "eight (f.)". D'Anna (2020, p. 88) states that in the word *tla¯ta,* only the second interdental is preserved, but in my corpus, there is a certain oscillation in the realisation of /ṯ/ in this word.<sup>21</sup>

The general tendency seems to be that initial /ṯ/ is more likely to be preserved. In my opinion, the preservation of the interdental /ṯ/ in the Arabic of Mahdia may be due to the influence of Tunis Arabic, which has it. The Arabic of the capital exerts a centripetal force on the other "peripheral" varieties of Tunisia because of its increasing prestige, due to the growth of travel and education in recent years (Gibson 2002, pp. 29–30). Media and social media, on which several programs are broadcast in *dāriǧa*, play a relevant role in this process too. In general, as Gibson showed, Arabic dialects do not follow the trend affecting many other languages of the world: they do not level towards Modern Standard Arabic, but tend to move closer to the standard dialect variety of the region, such as Tunis Arabic.<sup>22</sup>
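The coexistence of dental and interdental realisations within a single informant can be made explicit with a simple per-speaker tally. The following Python sketch is purely illustrative: the speaker labels, the English glosses used as item keys, the function names and the records themselves are hypothetical placeholders introduced here, not data or tools from the corpus. The records for S1 only loosely mirror the pattern reported above for one informant.

```python
from collections import Counter, defaultdict

# Hypothetical annotated tokens: (speaker_id, gloss, realisation).
# Placeholder records for illustration only, not corpus data.
TOKENS = [
    ("S1", "example", "interdental"),
    ("S1", "twelve", "interdental"),
    ("S1", "represents/plays", "dental"),
    ("S1", "three hundred", "dental"),
    ("S2", "more", "dental"),
    ("S2", "more", "interdental"),
    ("S3", "revolution", "interdental"),
]


def tally_by_speaker(tokens):
    """Count how often each realisation occurs for each speaker."""
    counts = defaultdict(Counter)
    for speaker, _gloss, realisation in tokens:
        counts[speaker][realisation] += 1
    return counts


def variable_speakers(counts):
    """Return the speakers who show more than one realisation."""
    return sorted(s for s, c in counts.items() if len(c) > 1)


if __name__ == "__main__":
    counts = tally_by_speaker(TOKENS)
    for speaker in sorted(counts):
        print(speaker, dict(counts[speaker]))
    print("Speakers with both realisations:", variable_speakers(counts))
```

A table of this kind would only be a first step: it says nothing about the position of the consonant in the word or about register, both of which are discussed above as possible conditioning factors.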

#### /ḏ/

The phenomenon is less evident in the voiced interdental consonant, which, however, sometimes oscillates with /d/: *asatida ¯* "professors", *madabiyya ¯* "I want/I would like to", *had¯ aka ¯* "that (m.)", *hada ¯* (Yoda 2019, p. 59), which alternates with *ha¯da* "this", *hed¯ ¯ı* "this (m.)", *hediyya ¯* "this (f.)" (attested in Tlelsa; see also Yoda 2019, p. 65; Attia 1969, p. 123). The same phenomenon extends to all the demonstratives containing /ḏ/. In the same speaker, we find the two realisations of the consonant in the same utterance: *kade w-kad ¯ e¯* "so and so", *hadeya* but also *hadiya ¯* "this (f.)", *ha¯da* "this (m.)", *naٰdaw¯* "we take", but *madabiya ¯* "I want/I would like to". This feature is attested across a wide range of speakers of different ages and sexes and has also been recorded by Yoda, for instance, *ٰde¯* "he took" (Yoda 2019, p. 63).

In Sousse, the verb *hda¯* loses its final /d/ in the imperfect conjugation (Talmoudi 1980, p. 93), and some examples are also available in Mahdia Arabic: *ya¯ٰu* "he takes". The phenomenon is also attested by Yoda (2019, p. 63) and in Bekalta too, where we find, for instance, *na¯ٰu ah. na* "we take it". The weakening of /d/ is an ancient phenomenon, attested in Sicilian Arabic where it sometimes even passed to /l/ and the consonant has not been pronounced in Maltese Arabic since the 14th century (La Rosa 2019, pp. 116–17; Avram 2012, p. 102).

In Monastir and in Bekalta, interdentals are generally maintained, but in the latter, a certain oscillation, depending on free variants and the speakers' levels of education, is attested in the following cases: *aktar min talat¯ ¯ına* "more than thirty", *na¯ٰdu* "we take", *ya¯ٰduh* "they take it", *na¯ٰduh* "we take it", but in the same speaker we find *ha¯d¯ı* "this (m.)" and *ha¯daya ¯* "this (f.)", *hadiya ¯* "this (f.)", *had¯ aka ¯* "that (m.)", which alternates with *ha¯daka ¯* , and *ٰd¯ına* "we took".

#### 3.1.2. /q/

In Mahdia, Monastir, Msaken and Bekalta, /q/ is generally pronounced as voiceless, but it may be realised as /g/, as in the following examples: *gālū lī* "they told me", *gutlak* "I told you", *silʿa mangūla* "transported goods", *bagra* "cow", *mungēla* "watch", *ynaggəz* "he jumps", *gāʿdīn* "[the ones] staying". The phenomenon was also noticed by Attia (1969, pp. 125–26), who offered a few examples that have already been mentioned above. As regards Msaken, Bouhlel observed that the inhabitants of the town generally pronounce the uvular stop /q/ as voiceless because they consider themselves "citizens", as in the following words: *iqallaʿ* "eradicate", *yuqʿud* "he sits down" and *qalb* "heart". According to Bouhlel, the consonant is pronounced as /g/ in some words, such as *mungēla* "watch" or *bilgdə* "well", and in his opinion this would be the "real" pronunciation of the consonant. Bouhlel also shows a certain variation in the pronunciation of some terms, such as *zqūqu* and *zgūgu* "pine nuts", *baqrī* and *bagrī* "veal" (Bouhlel 2009, p. 127). In fact, no Tunisian dialect uses /q/ or /g/ exclusively. Compared with the spread of /g/, the realisation /q/ is limited to the cities of Bizerte, Tunis, Sousse, Monastir, Mahdia, Sfax and Kairouan. Mahdia is, therefore, "entouré de g" (Mion 2015, p. 271; see the maps in Skik 2000).<sup>23</sup> Moreover, Yoda (2008, p. 484) stated that the appearance of /g/ instead of /q/ in some words in Mahdia Arabic, as well as in other Tunisian dialects, is very common.<sup>24</sup> In addition, already in 1984, Saada (1984, pp. 27–28), discussing the Arabic of Tozeur, stated that the existence of a /g/ sound is not a valid criterion for establishing the "non citadinité" of a dialect in Tunisia. She also added that throughout rural Tunisia, the phoneme /g/ was used for /q/, but that it was also present in some coastal zones and towns, in both Judeo-Arabic and Muslim varieties, even if the Judeo-Arabic varieties of Tunisia are usually an exception and all have /q/, except Tataouine.<sup>25</sup>

#### Free Variants

#### /k/ as /q/

Among some elderly inhabitants of Mahdia, including some fishermen native to the quarter of Burž al-rās, /k/ is articulated as /q/ in a few words or expressions, such as *kif kif* "the same", pronounced *qif qif*, and *flūqa* "felucca, rowing boat", generally attested as *flūka* in Tunisian Arabic.<sup>26</sup> The term *flūka* is pronounced with /k/ three times in the same video by two other speakers, namely a Mahdāwī poet reciting a poem dedicated to Mahdia and the presenter. Both used a higher register: the poet because of the nature of the text that he was reciting and the presenter because of his role in the documentary. Their roles may therefore have influenced their linguistic choices. Regarding previous studies on Mahdia Arabic, Attia (1969, p. 125) did not highlight this feature, but identified an emphatic variant of /k/, indicated as /ḳ/, such as in *ḳānī* "I am".

#### 3.1.3. /ṛ/

Emphatic /ṛ/ is attested in some elderly fishermen native to Mahdia, as in the words *buṛž*, in the toponym Burž al-rās, *bḥaṛ* "sea", *ṛaḥma* "mercy", *aktaṛ* "more", *baṛṛa* "out" and *dāṛ* "house". The same informant may also pronounce some words in which /r/ is not emphatic, such as *ǧārek* "your neighbour". The phenomenon has also been attested in some other middle-aged speakers, women and men, but not in all the speakers of the videos analysed. The realisations of /r/ as /r/ and /ṛ/ may also oscillate in the same informant, a forty-year-old female nurse from Mahdia, as in *aṛbaʿ* and *arbaʿ* "four". The emphatic /ṛ/ is also attested in Mahdia Arabic by Yoda (2008, 2019). The presence of /ṛ/ does not seem to allow fronting or raising of /a/ and /ā/. In the speakers in the videos analysed who do not pronounce /ṛ/, /a/ is likewise maintained and is neither raised nor fronted, as in *arbʿa* "four", *dinār* "dinar" and *barra* "out". However, further investigation is required to verify the presence and the distribution of the phenomenon, since it has so far been attested in only a few speakers.<sup>27</sup>

Regarding Msaken Arabic, Bouhlel (2009, p. 127) stated that, unlike in some other regions of Tunisia, the *rā'* is generally not "amplifié", and is even pronounced *re*, as in the words *ureq* "leaves" and *krēheb* "cars". Bouhlel added that there are some exceptions, such as *mrā* "woman" and *ḥrābiš* "pills", but he does not explain under which conditions the phenomenon occurs. In my corpus, some cases of emphatic /ṛ/ in Msaken Arabic have been attested, as well as the raising of /ā/ after /r/, as in *krēheb* "cars", as indicated by Bouhlel.

#### 3.1.4. De-Emphasisation

Some cases of loss of emphasis have been attested in Mahdia and Bekalta in speakers of different sexes and ages, as in the words *atfāl* "children", *tbīb* "doctor", *musāba* "infected" and *durūfa* "conditions". In the last of these, we notice the possible shift /ḏ̣/ > /ḍ/ > /d/ because of the loss of the interdental articulation. Other examples are *tufultī* "my childhood", *bi-sifa* "in the capacity of", *abyad* "white", *tul tul* "directly, straight" and *muwātin* "citizen". The phenomenon involves all the emphatics /ḏ̣/, /ḍ/, /ṭ/ and /ṣ/ and does not seem to affect vowel quality, which is maintained. In the corpus of Bekalta, we also find some examples of de-emphasisation and voicing, such as *mazdar* "source" and *mazraḥī* "theatrical", in which there is a shift /ṣ/ > /z/ with a possible intermediate step /ṣ/ > /s/. De-emphasisation is also attested in Kairouan, where we find, among other words, *matālib* "requests".<sup>28</sup> In the speech of a woman from Monastir, we find de-emphasisation of /ṭ/, realised as a lightly affricated /ts/, in the exclamation *yā latīf*! "oh my God!". A light affrication of /t/, already highlighted by Cantineau (1960, p. 37) as a feature attested in Algerian and Moroccan Arabic, generally also characterises Tunisian Arabic, but here it involves emphatic consonants too.<sup>29</sup> Saada (1984, p. 24) identified a /ts/ sound in the speakers of some tribes of Tozeur, and Dallaji (2017, pp. 153–57) reported that in Nabeul there is a tendency to affricate /t/ that people describe with the verb *taštaš* and the terms *taštīš* and *taštaša*. Furthermore, Maamouri (1967) described the variety of Nabeul as being characterised by a strong affrication of /t/.<sup>30</sup> The loss of emphasis is also typical of Tunisian Judaeo-Arabic (Taïeb and Sayah 2003). Cohen (1975, vol. 2, p. 14 and n. 7) states that some phenomena in Tunis Judeo-Arabic lead us to think that there must have been a period in which the articulation of emphatic consonants was stronger and "sans doute plus forte qu'à Tunis musulman, ce qui expliquerait le fait que les musulmans qui veulent imiter le parler des juifs, exagèrent l'emphase en même temps que les modulations expressives de la phrase".<sup>31</sup> Cohen's statement also shows that in Tunisia the weakening of the articulatory strength in pronouncing the emphatics has long been attested. However, the pronunciation of emphatic consonants with more or less articulatory strength by different confessional groups, i.e., Jews/Muslims, is a long-standing phenomenon, present in different Arab contexts for reasons of community or identity.<sup>32</sup> Saada (1984, pp. 83–84) noticed the phenomenon in the Arabic of Tozeur and indicated a series of conditions for its occurrence. Further studies will be necessary to identify its causes in Mahdia Arabic and to determine whether any regularity can be found.

#### 3.1.5. Dropping of Final /n/

In Monastir and Mahdia, the weakening or dropping of the final /n/ of numerals in pausal position, that is, when not followed by the *ism al-maʿdūd*, should be noted: *ʿišrīn milyū* "twenty million", *sabʿat w-ʿišrī* "twenty-seven", *tmānī* "eighty", *ḫamsī* "fifty". The dropping of final /n/ is an old feature, attested in Andalusi Arabic only in dual nouns (see Ferrando 1995, pp. 50–51; Corriente et al. 2015, pp. 110, 125–26). It should also be noted that this feature does not emerge in Yoda's transcriptions (see, for instance, Yoda 2019, p. 63). In Saada's study on the Arabic of Tozeur, some cases of consonant dropping involving final /n/ are attested, such as *tnē* "two", *mnē* "from where", *lē* "until when", *gaʿdī* "standing" and *takū* "you will be". The scholar calls this phenomenon, which in Tozeur is more widespread than in Mahdia and involves some other consonants, *tarḫīm*, i.e., "softening" or the process of shortening a word (for a more complete definition, see Carter 2007, p. 17). She states that only one informant uses it and therefore judges this feature to be in decline (Saada 1984, pp. 39–42).

#### 3.1.6. Vowels

According to Attia (1969, pp. 126–27), Mahdia Arabic has five short vowels, /a/, /e/, /i/, /o/, /u/, and five long vowels, /ā/, /ē/, /ī/, /ō/ and /ū/.<sup>33</sup> Yoda (2008, 2019) stated that Mahdia Arabic has three short vowels, /a/, /i/, /u/, and five long vowels, and assumed that the Mahdia dialect, as an urban variety, acquired /ē/ and /ō/ as phonemes because of the influence of the surrounding village dialects. He also added that "Because of the existence of /ē/ and /ō/, Mahdāwī dialect is unique among the Tunisian sedentary dialects" (Yoda 2008, p. 489).<sup>34</sup> Therefore, the Mahdia Arabic vowel system would consist of five long vowels and five short ones for Attia, but of five long vowels and only three short ones for Yoda.<sup>35</sup> Starting from this vowel scheme, Mahdia Arabic long vowels can undergo some qualitative changes; i.e., some cases of opening of the close vowels /i/ and /u/ are present in the varieties of the region, such as in Bekalta, where we find both *zīt* "oil" and the variant *zēt*, or *hōnī* "here" in Monastir (Attia 1969, p. 129), and a certain variation is also attested in Msaken, such as in the cases of *rōḥ* "soul" and *fūl* "broad beans". Bouhlel (2009, p. 128) also provides some minimal pairs: *qūm* "stand up"/*qōm* "people" (pejorative), *kūn* "be"/*kōn* "world".

In this section, I shall focus exclusively on the treatment of /ī/ when it is followed by the third person singular suffix pronouns. What emerges is that, in general, /ī/ + hā and /ū/ + hā are maintained, such as in *mā fī-hāš* "there is not" (as well as in Takrouna Arabic, see (Marçais and Guiga 1925, p. XII, n. 2), and in Sousse, (Talmoudi 1980, p. 152)). However, some cases of *fē-hā* "in it" have been attested (Yoda 2008, p. 489; 2019, pp. 61, 65, 66). In the videos analysed in my corpus, I found only rare cases of *fē-h* "in it (m.)" and of *fē-ha* "in it (f.)" in a speaker from Knaies, a small village 11 km from Msaken, and in a small group of people allegedly from Mahdia. The majority of the people speaking in the videos analysed, coming from Mahdia or Monastir, uttered *fī-h* or *fī-ha*. The reduced presence of the phenomenon may be due, once again, to the linguistic levelling towards the Arabic of Tunis or of the other coastal towns of Sahel such as Sousse (see (Mion 2015, p. 272)) or to the conditioning of the medium used, which may push the speaker to use a higher register.<sup>36</sup>

Therefore, the vowel system of Mahdia seems similar to that of Monastir, which, according to Gibson (1998, p. 276), has *villageois* features since it has five long vowels as in Figure 3.

**Figure 3.** Jemmel and Monastir Vowel system, taken from (Gibson 1998, p. 276).

According to Marçais, Takrouna Arabic has "une particularité de détail propre aux dialectes sâḥli: l'ouverture en ē et en ō de ī et de ū" that we find in Mahdia too (Marçais and Guiga 1925, pp. XXI–XXII).

#### 3.1.7. Diphthongs

According to Yoda (2008) and Mion (2015, p. 271), the vowel system of Sahel Arabic is characterised by five long vowels, ā, ē, ī, ō, ū, with ē and ō resulting from the monophthongisation of the diphthongs /ay/ and /aw/. According to the traditional categories of classification, which contrast some allegedly Hilali and pre-Hilali features, in the Hilali system, /ay/ and /aw/ would be reduced to ē and ō, while the passage of /ay/ > /ī/ and /aw/ > /ū/ would be typical of the pre-Hilali dialects. Actually, the situation is more complex than this, because, as Mion (2015, p. 271) showed, the reduction of the diphthongs to ē and ō "is shared by both /q/ and /g/ varieties". As already discussed above, the pronunciation of /q/ has been the main criterion for establishing whether a dialect had Bedouin or sedentary features, but Mion and some other scholars have raised doubts about the validity of this principle.<sup>37</sup> Based on what emerges from my corpus, the realisation of etymological diphthongs is usually sedentary in Mahdia, in Monastir and in Bekalta, such as in the following cases: *al-yūm* "today" (Tlelsa), *ḫīr* "better", *ḍīf* "guest". Even if Yoda (2019, p. 59, n. 9) states that the diphthongs in Mahdia are usually reduced to /ē/ and /ō/, the cases identified in my corpus are similar to those of Sousse, where the diphthongs /ay/ and /aw/ are reduced to /ī/ and /ū/ (Talmoudi 1980, p. 55). Other existing studies on Sahel varieties point in the same direction; for instance, in Marçais' words regarding the Arabic of Takrouna, "Comme les dialectes congénères du Sâḥel, le takroûni y suit sur quelques points des voies propres qui, l'éloignant des parlers citadins de la Régence, le rapprochent de certains dialectes bédouins du Maghréb oriental [...]: les diphtongues anciennes ai et au accentuées et non en finale absolue s'y réduisent généralement à ē et ō" (Marçais and Guiga 1925, pp. XXI–XXII).<sup>38</sup> As D'Anna (2020, pp. 88–89) showed, in Chebba, a small village 35 km south of Mahdia, etymological diphthongs are reduced to /ē/ and /ō/ and are occasionally partially diphthongised. According to Bouhlel (2009, p. 128), in Msaken Arabic, the diphthong /aw/ can be reduced to /o/ in many terms, e.g., *lōn* "color", *mōz* "bananas", *dōra* "tour". The situation seems to be quite complex since, once more, it is clear that Mahdia Arabic is an urban dialect surrounded by mixed or contact varieties, which can influence it in different ways and at different levels.<sup>39</sup>
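The traditional contrast described in this section can be summarised as a small lookup, given here only as an illustrative sketch. The labels `hilali` and `pre-hilali` and the function name `matching_labels` are assumptions introduced for this example; they simply encode the textbook opposition, which, as noted above, the data only partly support.

```python
from typing import Dict, List

# Traditional (and contested) mapping of etymological diphthongs to reflexes.
# The label names are hypothetical identifiers chosen for this sketch.
DIPHTHONG_REFLEXES: Dict[str, Dict[str, str]] = {
    "hilali": {"ay": "ē", "aw": "ō"},
    "pre-hilali": {"ay": "ī", "aw": "ū"},
}


def matching_labels(observed: Dict[str, str]) -> List[str]:
    """Return the traditional labels whose reflexes agree with every observed one."""
    return [
        label
        for label, reflexes in DIPHTHONG_REFLEXES.items()
        if all(reflexes.get(diphthong) == reflex for diphthong, reflex in observed.items())
    ]


# Reflexes of the kind found in the corpus forms al-yūm and ḫīr.
print(matching_labels({"ay": "ī", "aw": "ū"}))
# A mixed system fits neither traditional label.
print(matching_labels({"ay": "ē", "aw": "ū"}))
```

A mixed set of reflexes returns an empty list, which is one way of making concrete why a binary opposition is too coarse for contact varieties.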

#### 3.1.8. Imāla

Several cases of final *imāla*, etymological or not, have been found in all positions in Mahdia and the nearby towns: *hnē* "here", *ʿalē* "on/upon", *mšē* "he went", *ḫdē* "he took" ((Yoda 2019, pp. 59, 63); see also (Attia 1969, p. 130)). At the moment, the pronoun *ēni* "I" is the only case of *imāla šadīda* found in Mahdia and Monastir. Bouhlel (2009, p. 127) added some other examples attested in Msaken Arabic, such as *mī* "water" and *smī* "sky".

*Imāla* is a phenomenon shared by many Tunisian vernaculars. Gibson (1998, p. 279), for instance, analysed the pronunciation of the verbal forms *mšā* "he went" and *mšāt* "she went" and stated that in Sousse the average vowel of the feminine form is /ɛ/ and those of the masculine form are /ɛ/ and /e/. He added that among his Sousse informants, nobody uttered a long /e/, while in Monastir, there was only one speaker with /ē/ as the tense vowel. Marçais affirmed that in Takrouna Arabic "ā accentué en finale absolue y passe à peu près constamment à ē" (Marçais and Guiga 1925, pp. XXI–XXII). The feature of *imāla* is also shared by Tunisian Judeo-Arabic, where /ā/ is affected by *imāla* in all positions (Cohen 1975, vol. 2, p. 56).<sup>40</sup>

#### 3.1.9. Raising of Final –a

Mahdia Arabic is also characterised by the raising of the final feminine singular ending –a, such as in the cases of *maḥdūde* "limited", *ḫamse* "five", *insaniyye* "human", *bnayye* "child (f.)". Syntactic context also plays a role, since we also find *ḫamsa ṣġār* "five children", with the numeral in *iḍāfa*. The presence of emphatic and pharyngeal consonants usually blocks this raising.

#### *3.2. Nominal Morpho-Syntax*

#### 3.2.1. Personal Pronouns

The existing studies on the personal pronouns of Tunisian Arabic show a complex situation, which can be summarised in Figure 4:


**Figure 4.** Distribution of the first p. sing., second p. sing., first p. plu. of the personal pronouns in Sahel Arabic (Mion 2015, p. 274).

Mion, in his paper *Réfléxions sur la catégorie des «parlers villageois» en arabe tunisien*, offered a brief sketch of the first person singular and plural pronouns "I" and "we" and of the second person singular "you", and focused on gender opposition in the pronominal system. While the village and Bedouin dialects share the presence of gender opposition in the second person singular, according to Marçais' classification, the urban varieties do not (Marçais 1950, p. 208). In this respect, in Mahdia Arabic, as well as in Monastir, Msaken and Bekalta, there is no gender distinction in pronouns and verbs. The situation can therefore be represented as in Table 1.


**Table 1.** Personal pronouns in Mahdia, Monastir, Msaken and Bekalta.

As for the first-person plural "we", *aḥna* is the main pronoun used, and *naḥna* seldom alternates with it. Further studies are needed to check whether this is due to the origin of the speakers interviewed in the videos analysed. In Yoda (2019, p. 65), *ḥna* is also attested when preceded by a vowel, such as in the expression *tawwa-ḥna* "now we".

The first person singular "I" generally shows a final *imāla šadīda*.<sup>42</sup> Moreover, Takrouna Arabic shows the pronoun *ǟni* "I", which according to Marçais is original to the variety, and also presents the variants *nǟya* and *nǟy*, which were imported later (Marçais and Guiga 1925, pp. XXII–XXIII).<sup>43</sup> As regards Msaken, Bouhlel (2009, pp. 130–31) indicates *āni* with the meaning of "I" and "me" and the second person singular pronoun *inti* "you".

#### 3.2.2. Relative Pronouns

In my corpus, in all the towns and villages considered in this study, the only relative pronoun used is *ellī*, with the variants *illī* ((Yoda 2019, p. 61) has *illi*) and *lī*, the latter used when the preceding word ends in a vowel. The relative pronoun *ella*, defined as "villageois" by Marçais, has not been attested so far ((Marçais 1950, p. 211); for Chebba, see (D'Anna 2020, p. 90)). Other varieties of Tunisian Sahel, such as the dialect of Sousse, also have the variants *ellī* and *llī*, the latter used when the relative is preceded by a word ending in a vowel (Talmoudi 1980, pp. 146–47).<sup>44</sup>

#### 3.2.3. Numerals

In numbers 3 to 10, the *ism al-maʿdūd* is plural: *ʿašra yyām* (Yoda 2019, p. 66), *xams iyyām* "five days" and *tlāta yyām* "three days" (Yoda 2019, pp. 60, 63), *tlāta ʿaskar* "three soldiers" and *arbʿa ʿaskar* "four soldiers" ((Yoda 2019, p. 59); for Tunis Arabic, see (Bițuna 2011)).<sup>45</sup> As already noticed by Bițuna for Tunis Arabic, the noun *dīnār* "dinars" is always used as a singular in Mahdia Arabic too (see (Bițuna 2011) and (Yoda 2019, p. 63)).<sup>46</sup>

In Mahdia Arabic, numerals from 11 to 19 have the same form that they have in Tunisian Arabic in general, that is *ḥdāš* "eleven", *tnāš* "twelve", *tluṭṭāš* "thirteen", *arbaʿaṭāš* "fourteen", *ḫumṣṭāš* "fifteen", *səṭṭāš* "sixteen", *sabʿaṭāš* "seventeen", *tmūnṭāš* "eighteen" and *tsaʿaṭāš* "nineteen" (some numerals are present in (Yoda 2019): *passim*). One of the most evident characteristics is the disappearance of the pharyngeal /ʿ/ or its assimilation to the /t/ (Bițuna 2011, p. 32). These numbers from 11 to 19 are usually followed by a singular, such as in *tnāš ʿām* "twelve years" and *ḫamsṭāš ən ʿām* "fifteen years". It is worth noting the form *tluṭṭāš ən el-snā* "thirteen years", followed by a definite singular noun. Some other examples, taken from Yoda's texts, are *xumṣṭāš-in yōm* and *xmuṣṭāš-in yōm* "fifteen days" (Yoda 2019, pp. 65, 66), attested in the same speaker. It is well known that in Maghribi dialects, numerals 11 to 19 show an -n form when in direct annexation with the *maʿdūd* ((Marçais 1977, p. 178); for Tunis Arabic, see (Bițuna 2011, p. 32)). In fact, in Sousse, numerals from 11 to 19 also have a final –n when in *iḍāfa* (see (Talmoudi 1980, p. 169)). Concord from 20 onwards, in Mahdia Arabic, reflects the rules of *fuṣḥā* too, such as in the cases of *ʿišrīn yōm* "twenty days" ((Yoda 2019, p. 65); for Tunis Arabic, (Bițuna 2011, p. 34)) and *mya ʿaskar* "one hundred soldiers" ((Yoda 2019, p. 59); for Maghribi Arabic, see (Marçais 1977, pp. 173–80); for Tunis Judeo-Arabic, (Cohen 1975, p. 232); for Takrouna Arabic, see also (Marçais and Guiga 1925): *passim*).<sup>47</sup>

The tendency to weaken or drop the final /n/ in tens, in the towns considered in this analysis, has already been dealt with above; some other cases involving tens are *ḫamsī* "fifty" and *arbaʿī* "forty".
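The concord pattern described in this section can be restated as a short decision rule. The following is a minimal sketch, assuming the three ranges given above; the function name `counted_noun_pattern` and the wording of the returned labels are hypothetical, and lexical exceptions such as *dīnār*, which stays singular, are deliberately not modelled.

```python
def counted_noun_pattern(n: int) -> str:
    """Summarise the form of the counted noun after the numeral n.

    Only the ranges discussed in the text are covered; numerals
    below 3 are outside the description and raise an error.
    """
    if n < 3:
        raise ValueError("numerals below 3 are not covered by this description")
    if n <= 10:
        # 3 to 10: the counted noun is plural.
        return "plural noun"
    if n <= 19:
        # 11 to 19: singular noun, numeral with an -n form in annexation.
        return "singular noun, numeral with -n linker"
    # 20 and above: singular noun, as in the standard language.
    return "singular noun"


for n in (5, 15, 20, 100):
    print(n, counted_noun_pattern(n))
```

The rule is intentionally coarse: it captures the general tendency reported here, not the variation between speakers noted in the examples.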

#### *3.3. Verbal Morpho-Syntax*

#### 3.3.1. An Urban Conjugation

Verbs in the Arabic variety of Mahdia and the surrounding towns, that is Monastir, Msaken and Bekalta, are generally conjugated according to the pattern *yimši*/*yimšīw* and *yaqra*/*yaqrāw*, traditionally defined as pre-Hilali (Marçais and Guiga 1925, p. 49 and passim; Marçais 1950, p. 209),<sup>48</sup> as in other urban varieties of Sahel, such as Sousse and Sfax (Sellami 2019, pp. 86–87; Herin and Zammit 2017, p. 143). Some examples are *yemšīw* "they go", *naqrāw* "we read", *nalqāw* "we find", *netmannēw* "we hope", *nistennēw* "we wait for" (see also Yoda 2019, pp. 59, 66).<sup>49</sup>

Even if data for Mahdia Arabic seem to be clear, as Mion (2018, pp. 117–18) pointed out, for Sahel, the issue cannot be reduced to the simple opposition pre-Hilali/Hilali conjugation, since a large part of Tunisian varieties have mixed features, with a plural perfect conjugation in –āw and an imperfect in –ū, i.e., *mšāw*/*yamšu* (see also (Mion 2015, pp. 272–73)).<sup>50</sup>

#### 3.3.2. Feminine Third Person Singular of Weak-Final Verbs

In Mahdia, Monastir, Bekalta and Msaken, the feminine third person singular of the perfect tense has a long vowel and the ending –*ā-t*, also showing *imāla*, such as in the following cases: *mšēt* "she went"; *taʿaddēt* "she passed". The long vowel, preserved as in the so-called pre-Hilali dialects, has also been attested by Yoda (2019, p. 65) in *xdāt* "she took" and *ʿṭāt* "she gave". In addition, the surrounding urban varieties have similar features (see Talmoudi 1980, pp. 86–88).<sup>51</sup>

#### 3.3.3. Perfect Tense Pattern

Like the other urban varieties of Sahel, such as Msaken (Bouhlel 2009, p. 131), Sousse (Talmoudi 1980, pp. 77–78) and Sfax (Lajmi 2009, p. 138; see also (Mion 2015, p. 273)), Mahdia Arabic does not show gender opposition in the second person singular *ənti* of the perfect tense.

Gender opposition in the second person singular "you", in fact, is maintained only in conservative<sup>52</sup> Bedouin or village dialects, in which the feminine has the ending –ti (Marçais 1977, p. 36).

#### 3.3.4. Use of the Verb rā

In the Arabic of Mahdia, the verb "to see" is expressed through *šāf* "to see, to look at, to watch", common to a wide variety of Arabic dialects,<sup>53</sup> and *rā* "to see".

Additionally, in Sfaxi Arabic, the two verbs alternate, with a certain predominance of *rā* (Lajmi 2009, p. 140; Zammit 2014, p. 34; Sellami 2019), and, in Sousse, the verb *rā* is used, but with a peculiarity, namely the first person singular of the perfect tense *rūt* (Talmoudi 1980) and not *rīt*, as in Kairouan and Mahdia. Among the examples, we find *rītuš*? "have you not seen it?" (Yoda 2019, p. 66). In Mahdia Arabic, *šāf* is widely used, as in the cases of *šūf*, *šāf* (Yoda 2019, pp. 60, 63, 66). In Tunis Judeo-Arabic, the verb *rā* is used and it has maintained the diphthong /ay/ (Cohen 1975, p. 106). From a historical perspective, the verb *rā* replaced *šāf* in Sicilian Arabic, in Sicilian Judeo-Arabic and in Maltese. Moreover, it was very productive in Andalusi Arabic too (La Rosa 2019, p. 259; Burgaretta 2016, p. 89; Corriente et al. 2017, pp. 508–9; for *šāf*, p. 742).

#### 3.3.5. Passive

The passive form of verbs is usually expressed through an initial t- pattern, used in Tunisia and in other Maghribi varieties (see, e.g., (Cohen 1975, pp. 123–25)), "né, sans doute, sous l'influence analogique des réfléchis-passifs à t- initial des thèmes V et VI, procédant respectivement des thèmes II et V, ce thème procède de verbes du thème fondamental généralement en usage. Il en constitue le réfléchi-passif" (Marçais 1977, p. 66).<sup>54</sup>

The following examples have been attested in Mahdia, Bekalta, Monastir and Msaken: *təbnat* "it was built"; *yitnaḥḥa* "it is removed", *yitfahem* "it is understood", *tətḥet fuq al-rās* "it is placed on the head", *tuṭfī* "it switches off", *mā tuṭfiš* "it does not switch off" (for the phenomenon in Sfaxi Arabic, see (Sellami 2019)).<sup>55</sup>

#### **4. Discussion and Conclusions**

Marçais (1950, p. 214) stated that the Tunisian Sahel is characterised by sedentary varieties, which break the continuum of Sulaymi dialects, and divided Bedouin Tunisian dialects into two main groups: Hilali and Sulaymi. According to this kind of classification, Mahdia Arabic would have some characteristics attributable to the Banū Hilāl, such as the *imāla* of internal vowels, the masculine singular third person pronoun suffix –u, opposed to the suffix –a(h) of the Sulaymi group, and the passive in t-. Some Sulaymi features would be the weaker articulation of emphasis and the final *imāla* (see (Marçais 1950, p. 217; Ritt-Benmimoun 2014, p. 354; Taine-Cheikh 2017, pp. 20–21)). However, Ritt-Benmimoun's studies on the Bedouin Arabic of South Tunisia have shown that the Hilali/Sulaymi categorisation is not always definite and obvious, and led us to wonder whether "there is a real zone of transition between areas where the S and the H dialects are spoken or if these areas are separated by a more or less well-defined boundary, perhaps corresponding exactly to the settlement area of the different tribes" (Ritt-Benmimoun 2014, p. 358).

In 2017, Taine-Cheikh pointed out that Marçais' criteria of dialect classification had to be revisited and stated:

Reste, me semble-t-il, une question, celle de la valeur de la distinction entre parlers hilaliens vs sulaymites vs maʿqiliens. S'il existe bien de groupes de parlers plus ou moins différenciés et s'il est nécessaire de leur attribuer un nom, je ne suis pas sûre que ces trois désignations d'origine socio-historique soient d'une réelle précision et donc, d'une véritable secours (Taine-Cheikh 2017, p. 38).<sup>56</sup>

Guerrero (2018) revisited the idea of the villageois category in which the rural Tunisian dialects are generally included and, based on 20 phonological, morpho-syntactic and lexical features, demonstrated that they show important differences from the Algerian and Moroccan varieties, which, instead, constitute a group with consistent features. Benkato (2019, pp. 11–12) pointed out that the classifications of Bedouin dialects, and particularly Sulaymi dialects, derived from W. Marçais' statements, which arose in turn from de Slane's and G. Marçais' assumptions, were "taken as fact", even though the scholar did not offer any genealogy of these tribes. Benkato added that "these categories were hardly based on sound linguistic argumentation and instead more on the personal experience and reputation for their creators" (Benkato 2019, p. 14).

For all these reasons, Marçais' traditional criteria of classification cannot be applied *tout court* to the analysis of Mahdia Arabic. That is, it is neither possible nor methodologically appropriate to attribute a precise and certain origin to all of its features. In spite of this, the first data presented in this contribution confirm that Mahdia Arabic is a sedentary variety showing some "contact" features.

According to the traditional classification, typical sedentary traits are the voiceless articulation of /q/, the relative pronoun *ellī* and the conjugation pattern *mšēw*/*yemšīw*. As regards the latter, it is useful to stress what Mion (2018, p. 118) claimed:

Alors que -īw du Maroc à la Tunisie doit être vue comme sédentaire et citadin, un système transversal avec un parfait –/āw/ et imparfait –/ū/ est en réalité bien plus fréquent que ce que la simple opposition méthodologique pré-hilalien/hilalien laisserait entendre. Si ce système doit être conçu comme transitoire au sein d'un *continuum* dont les deux pôles seraient justement les typologies pré-hilalien/hilalien, alors une bonne partie de la Tunisie (à l'exception de ses métropoles et des variétés Marāzīg) devrait être considérée paradoxalement comme une zone de transition.<sup>57</sup>

Some "contact" features attested in Mahdia Arabic are the final etymological *imāla* and the opening of /ī/ towards /ē/ in the particle *fī* followed by the personal suffix pronoun –hā.

However, the situation does not always seem to be clear and definite, since sedentary dialects also share many elements with Bedouin Hilali dialects, such as the following, used in the Mahdia region: the pronoun for the third person masculine singular –u, and the realisation of final /ī/ and /ū/ as /e/ and /o/ when followed by the pronoun –hā, which "is not considered as an isogloss marking either Bedouin or sedentary varieties. However, it is attested in dialects not generally considered as 'rural'" (D'Anna 2020, p. 93). According to Ritt-Benmimoun:

The affinity of the H [Hilali] dialects with Tunisia's sedentary dialects can be found in Philippe Marçais' statement that in the regions with a prevalence of H dialects sedentary dialects had originally been spoken which were subsequently overlaid by Bedouin dialects (Ritt-Benmimoun 2014, p. 355).

As regards the reduction of diphthongs to /ē/ and /ō/, which is generally considered to be a Bedouin feature, it is attested by Yoda (2019) in Mahdia Arabic, but is scarcely attested in my corpus, where, instead, a strong presence of /ī/ and /ū/ is noticed. Further investigation is therefore needed to shed light on the actual attestation and distribution of this feature.

Besides, some of the main sedentary features of Mahdia Arabic are also shared by the so-called *villageois* varieties, such as the voiceless realisation of /q/,<sup>58</sup> the final etymological *imāla šadīda*, the lack of gender opposition in verbs and pronouns and the long vowel –ā in the ending –ā-t of the feminine third person singular of final-weak verbs (Marçais 1950, pp. 207–12).

From a diachronic point of view, some elements pointed out in previous studies offer a sketch of a slightly different variety of Mahdia Arabic. For instance, in 1969, Attia observed that /d/ "remplace l'interdentale /ḏ/ qui a disparu du parler de Mahdia" (Attia 1969, p. 123), and in 2008, Yoda identified a series of phenomena which are not so obvious in my corpus, such as the presence of *fē-ha* or the reduction of diphthongs to /ē/ and /ō/. These and other traits suggest a possible explanation, namely that the Arabic of Mahdia is undergoing a change, a sort of linguistic levelling or standardisation towards the Arabic of the capital Tunis and, probably, also under the influence of Sousse, which is, together with Sfax, one of the main cities of Sahel and is only 60 km away. As already shown above, Gibson's research goes in this direction,<sup>59</sup> as do Mion's observations about a series of phenomena that, in a certain way, contribute to creating this situation of standardisation not only in Sahel Arabic, but more generally in Tunisian Arabic. Among these factors, I find particularly interesting the role of the media, and above all television, which Mion considers a facilitator for the diffusion of Modern Standard Arabic to all the social strata. Television is therefore responsible for the reintroduction of some phonemes in people's speech, such as the *hamza* and the voiceless articulation of /q/,<sup>60</sup> and also for the new diffusion of the –āw/–īw perfect tense endings. Moreover, it helped the spread of the prestigious dialectal Arabic of the capitals, such as the Tunisian of Tunis (Mion 2018, p. 120).<sup>61</sup> The fact that the elderly fishermen native to Burž al-rās, interviewed in the already-mentioned documentary on this quarter, all pronounce the interdental /ḏ/ as /d/ confirms Attia's observations and leads us to consider this feature as a confirmation of the linguistic levelling in progress, above all among the youngest and more educated inhabitants of Mahdia. According to Sayahi (2019, p. 237), the fact that in recent years the institutions have often chosen to deliver their official speeches in Tunisian Arabic has fostered the use of the dialect in the public space for public communications and not only for private or semi-private occasions (on this subject, see also (La Rosa 2018)).

Moreover, I believe that the spread of smartphone use has been crucial, above all after the 2011 Arab Spring(s). In fact, thanks to it, the internet has become more accessible to a large part of the population. Easy and cheap access to the internet has allowed the acquisition and the circulation of information from far-off places within the country and abroad. Thanks to the use of social media, such as Facebook and WhatsApp, or the use of YouTube, the spread of several oral and written artistic forms and cultures has become increasingly easy.<sup>62</sup> What is most important here is that the linguistic contact between people from different regions of Tunisia, speaking different varieties of Arabic, has become easier too. I think that this has helped the linguistic levelling process in progress by bringing people together, in spite of the geographical distance.

Furthermore, we should also take into consideration that migration movements inside Tunisia and abroad (see (D'Anna 2017) regarding the Sicilian community of Mazara, mainly consisting of Tunisians from Mahdia and Chebba), for study and work reasons, may have had particular relevance in the levelling of some linguistic features of the varieties of Sahel. In fact, "migration from the countryside to cities has constituted the most significant demographic change in the last two centuries. Linguistic contact in cities is more intense and involves faster changes than in rural areas. [...] In recent decades, a significant increase in the frequency and variety of types of interdialectal and intra-dialectal contact occurred causing faster and greater degrees of levelling" (Vicente 2019, p. 106). Linguistic levelling is also evident in the town that is probably most affected by migration movements among those mentioned in this study, i.e., Msaken. As Bouhlel (2009, p. 133) already pointed out, in fact, Msaken Arabic has undergone a deep change during the last 50 to 70 years, affecting mainly, but not exclusively, the lexicon.

In addition to this, as shown above, the medium used for the interviews in my corpus has a crucial role since, to a certain extent and in many cases, it conditions the speaker who knows that he/she is being filmed or recorded and, therefore, tends to use linguistic structures and/or a lexicon belonging to a higher register, that is the standard variety, be it Modern Standard Arabic or Standard Dialect.<sup>63</sup>

**Funding:** This paper contains the first results of a research project, titled "*Lahǧāt Wilāyat al-Mahdiyya*: proposte di descrizione e classificazione", which is part of the line of research "Starting Grant", of the "Piano di incentivi per la ricerca di Ateneo 2020/2022 (Pia.ce.ri.)", funded by the University of Catania.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


seem to be distinctive. Elongated, it tends to merge with the nearby aperture degree /i/ or /a/". English translation is mine] and about /o/: "C'est une voyelle semi-ouverte, arrière, brève. Elle représente entre /a/ et /u/ un degré d'aperture absent en Arabe classique. Son absence ne semble pas distinctive. Allongée, elle tend à se confondre avec le degré d'aperture voisin /a/ ou /u/" ["It is a semi-open, back, short vowel. It represents a degree of aperture between /a/ and /u/ which is absent in classical Arabic. Its absence does not seem distinctive. Elongated, it tends to merge with the nearby aperture degree /a/ or /u/". English translation is mine]. For Attia, these two phonemes in Mahdia Arabic could only be short (see Attia 1969, pp. 128–29).


#### **References**

Attia, Abdelmajid. 1969. Le parler de Mahdia. In *Travaux de Phonologie. Parlers de: Djemmal, Gabes, Mahdia (Tunisie), Treviso (Italie)*. Edited by Taieb Baccouche, Hichem Skik, Abdelmajid Attia and Mohamed El Habib Ounali. Tunis: Centre d'Études Économiques et Sociales, pp. 116–38.

Avram, Andrei. 2012. Some phonological changes in Maltese reflected in onomastics. *Bucharest Working Papers in Linguistics* 14: 99–119.

Bearman, Peri, Thierry Bianquis, Clifford Edmund Bosworth, Emery van Donzel, and Wolfhart P. Heinrichs, eds. 1986. Ḳul-Oghlu. In *Encyclopaedia of Islam*, 2nd ed. Consulted online on 9 June 2021.


Zack, Liesbeth, and Arie Schippers. 2012. *Middle Arabic and Mixed Arabic. Diachrony and Synchrony*. Leiden and Boston: Brill.

Zammit, Martin. 2014. The Sfaxi Tunisian Element in Maltese. In *Perspectives on Maltese Linguistics*. Edited by Albert Borg, Sandro Caruana and Alexandra Vella. Berlin: De Gruyter, pp. 23–44.

## *Article* **Contrastive Feature Typologies of Arabic Consonant Reflexes**

**Islam Youssef**

Department of Languages and Literature Studies, University of South-Eastern Norway, 3833 Bø i Telemark, Norway; islam.youssef@usn.no

**Abstract:** Attempts to classify spoken Arabic dialects based on distinct reflexes of consonant phonemes are known to employ a mixture of parameters, which often conflate linguistic and nonlinguistic facts. This article advances an alternative, theory-informed perspective of segmental typology, one that takes phonological properties as the object of investigation. Under this approach, various classificatory systems are legitimate; and I utilize a typological scheme within the framework of feature geometry. A minimalist model designed to account for segment-internal representations produces neat typologies of the Arabic consonants that vary across dialects, namely *qāf*, *ǧīm*, *kāf*, *ḍād*, the interdentals, the rhotic, and the pharyngeals. Cognates for each of these are analyzed in a typology based on a few monovalent contrastive features. A key benefit of the proposed typologies is that the featural compositions of the various cognates give grounds for their behavior, in terms of contrasts and phonological activity, and potentially in diachronic processes as well. At a more general level, property-based typology is a promising line of research that helps us understand and categorize purely linguistic facts across languages or language varieties.

**Keywords:** phonological typology; feature geometry; contrastivity; Arabic dialects; consonant reflexes

**Citation:** Youssef, Islam. 2021. Contrastive Feature Typologies of Arabic Consonant Reflexes. *Languages* 6: 141. https://doi.org/10.3390/languages6030141

Academic Editors: Simone Bettega and Roberta Morano

Received: 8 June 2021 Accepted: 18 August 2021 Published: 23 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Modern Arabic vernaculars have relatively large, but varying, consonant inventories. Because of that, they have been typologized according to differences in the reflexes of their consonant phonemes—differences which suggest common origins or long-term contact (Watson 2011a, p. 862). The resulting dialect categories often coincide with various divisions: geographical (eastern–western), lifestyle (sedentary–Bedouin), ethno-religious, social (based on status, age, gender), as well as stylistic and historical. However, using such mixed classificatory devices has always been problematic. Not only do the various factors cross-classify the dialects, but, with persistent exceptions, they exhibit internal inconsistency as well (see Palva 2006 for a discussion of some of these challenges). Moreover, the outcome is largely descriptive. Works that have explored Arabic consonant variation from this perspective include Cantineau (1960), Fischer and Jastrow (1980), Holes (1995), and Kaye and Rosenhouse (1997).

While classifying languages or dialects according to the type of sounds they contain is a recognized approach to phonological typology, it has been criticized for proposing oversimplified groupings with no explanatory value for synchronic or diachronic facts (Kiparsky 2008; Dresher et al. 2018). A more theory-oriented, 'property-driven' outlook to typology, advanced by Hyman (2007, 2018), has the individual phonological traits, not language varieties as such, as the primary objects of comparison. In this approach, typology and theory must go hand in hand, and since modern phonological theory is multifaceted and pluralistic in nature, we must admit that any meaningful typology builds on a specific theoretical framework (Kiparsky 2018). This, in turn, means that there will always be several viable options to formulate a typology; hence, there is no such thing as a one-size-fits-all classification system. Within Arabic, theoretically motivated typologies of syllabification phenomena (e.g., Broselow 1992; Kiparsky 2003; Farwaneh 2009) and of stress placement (e.g., Kiparsky 2000; Watson 2011b) have been more successful than segmental studies.

This article embraces the latter line of research, by which the typologies of Arabic consonants are couched within a theory of representation. So, rather than considering only the existing phoneme reflexes in one inventory as opposed to another, I explore phoneme classes in terms of which of their constituent features are active in the phonology. Representational typologies will be formulated in a minimalist and highly abstract model of feature geometry, which optimizes the use of a minimum number of contrast-relevant features. This model, I argue, affords one possible concrete scheme to correlate the consonant reflexes without resorting to the problematic, long-established categories. It also explains and predicts phonological behavior in a systematic and unambiguous way. I will demonstrate that the typological and traditional classifications can coexist, but only to relate the structural generalizations to what we already know. Apart from that, the two are methodologically incompatible.

The remainder of the paper is organized as follows. Section 2 introduces the theoretical model employed in the analysis. Section 3 develops the typologies of the varying Arabic consonantal phonemes, sketching the geographical distribution of each reflex and justifying its component features in accordance with phonological facts. Sections 3.1–3.7 treat the consonant prototypes *qāf*, *ǧīm*, *kāf*, *ḍād*–*ḏ̣āʾ*, interdentals, rhotic, and pharyngeals (in that order). Section 4 discusses various implications of this type of analysis, both for the study of Arabic dialects and for phonological typology in general.

#### **2. A Model for Feature-Based Typology**

Typology in general is the classificatory study of languages according to their structural features; and by convention, phonological typology will group them according to the number and type of the phonemes they contain. This traditional view is challenged by Hyman (2007, 2018) who claims that typology is not about classifying languages but rather about characterizing linguistic properties across the linguistic spectrum. When this becomes the primary object of comparison, we move into what he calls property-driven typology. Under this view, the phonologist studying typology should not be interested in how phonological properties are distributed according to extra-linguistic factors. How to analyze the system of variation has been more of a priority for phonology than the 'where' question of traditional dialectology (Hyman 2018, pp. 14–15).

By focusing on the 'how', I will adopt a line of research that places dialect typologies within theories of phonological representation. Of course, features are the atoms of such representation. They are typically regarded as segment properties and as cross-classifying dimensions that characterize natural or phonologically active classes of segments. Moreover, there is solid evidence that features are arranged in some hierarchical structure, typically under higher-order categories known as 'class nodes', such as Place, Manner, and Laryngeal. This understanding of features constitutes the premise for most models of feature geometry (e.g., Clements 1985; Sagey 1986; McCarthy 1988, inter alia). The property-driven analysis of typology in this paper is feature geometric in nature.

Any feature-based theory of typology is potentially undermined by the different assumptions about the nature of the hierarchy or the very set of phonological features upon which it is based (Gordon 2016, p. 71). This should not be a problem, however, if we acknowledge that "there are no theory-neutral grammars, and consequently no theory-neutral typology" (Kiparsky 2018, p. 54). There is no contradiction, Kiparsky argues, that typological generalizations are the product of linguistic theory while they themselves are theory-dependent. The criterion to generate an informed theory-specific typology is thus to ensure that categories are founded on "independently justified linguistically significant representations" (Kiparsky 2018, p. 55). This is a fundamental principle of the framework I am going to employ here.

My analysis of Arabic consonant reflexes is couched in the Parallel Structures Model of feature geometry (PSM; Morén 2003, 2006, inter alia). The PSM is a minimalist framework in which consonants and vowels have parallel structures and identical, broadly defined features for place and manner articulations. It integrates insights from various other proposals, in particular Unified Place Theory (Clements 1991; Clements and Hume 1996), Element Theory (Harris and Lindsey 1995), and Dependency Phonology (Anderson and Ewen 1987). Features in the PSM are monovalent and exclusively distinctive, i.e., present only if they are necessary to maintain phoneme contrasts and/or are active in the phonology (cf. Clements 2001). In this sense, a PSM analysis is also congruent with the Contrastivist Hypothesis (Hall 2007; Dresher 2009).

How a terminal feature is interpreted in the PSM hinges on its association to a superordinate class node in the hierarchy. As diagrammed in Figure 1, each place or manner feature can be represented under two separate nodes/tiers, with the V-node being dependent on the C-node. This symmetry aims to establish a unified machinery that captures consonant– vowel interactions as well as acoustic/articulatory parallelisms in natural language. To explain their asymmetries, consonants can have both C- and V-features, while vowels can only have the latter. Another architectural mechanism of the model is building segmentally complex structures from simpler ones, which, together with the dependency principle, allows for a high degree of economy in the feature system.

**Figure 1.** Basic PSM geometry. (**a**) Place tier; (**b**) Manner tier.

The Place and Manner tiers in the PSM deserve some attention. Under the Place tier (a), we use the articulator-based features [labial], [coronal], and [dorsal] under the C-place node and its daughter V-node (cf. Clements 1991). Simple consonants have one place feature; complex consonants have multiple features on the same place node; and consonants with secondary articulation have features on both C-place and V-place nodes (Morén 2003, pp. 199, 233). Similarly, under the Manner tier (b), we make use of the loosely defined features [open] and [closed], which can be attached to a C-manner or a V-manner node in arrangements that reflect the relative sonority of segments (Morén 2003, pp. 222–23). As for a Laryngeal tier, it should suffice here to use the feature [voice] to differentiate voiced from voiceless obstruents (see Morén 2003, p. 230).<sup>1</sup>

Specifying the above features to a particular segment depends on finding positive evidence in the relevant variety. When varieties are closely related, phonological activity will show major parallels. This, in addition to the universal phonetic properties of speech sounds, means that a given segment will have the same composition across varieties of the same language, unless there is proof to the contrary. Because of this, a contrast-based model like the PSM is a valid tool in drawing typologies, as I will demonstrate in the next section.
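The geometry described in this section can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class name `Segment`, the function `place_profile`, and the label format are hypothetical and not part of the PSM literature, while the two example specifications (/g/ and an emphatic voiced stop) anticipate the analysis in Section 3.

```python
from dataclasses import dataclass

# Feature names follow the text; all identifiers below are illustrative assumptions.
PLACE_FEATURES = frozenset({"labial", "coronal", "dorsal"})
MANNER_FEATURES = frozenset({"open", "closed"})


@dataclass(frozen=True)
class Segment:
    """A segment as bundles of monovalent features under C- and V-nodes."""
    symbol: str
    c_place: frozenset = frozenset()
    v_place: frozenset = frozenset()
    c_manner: frozenset = frozenset()
    v_manner: frozenset = frozenset()
    voice: bool = False

    def __post_init__(self):
        # Monovalent features are either present or absent; there are no minus values.
        if not (self.c_place | self.v_place) <= PLACE_FEATURES:
            raise ValueError(f"unknown place feature in {self.symbol}")
        if not (self.c_manner | self.v_manner) <= MANNER_FEATURES:
            raise ValueError(f"unknown manner feature in {self.symbol}")

    def features(self):
        """Return the specification as a set of 'node[feature]' labels."""
        labels = set()
        nodes = (("C-place", self.c_place), ("V-place", self.v_place),
                 ("C-manner", self.c_manner), ("V-manner", self.v_manner))
        for node, feats in nodes:
            labels |= {f"{node}[{feat}]" for feat in feats}
        if self.voice:
            labels.add("[voice]")
        return labels


def place_profile(seg):
    """Classify a segment by how its place features are distributed over nodes."""
    if seg.c_place and seg.v_place:
        return "secondary articulation"
    if len(seg.c_place) > 1 or len(seg.v_place) > 1:
        return "complex"
    if seg.c_place or seg.v_place:
        return "simple"
    return "placeless"


g = Segment("g", c_place=frozenset({"dorsal"}),
            c_manner=frozenset({"closed"}), voice=True)
d_emphatic = Segment("dˤ", c_place=frozenset({"coronal"}),
                     v_place=frozenset({"dorsal"}),
                     c_manner=frozenset({"closed"}), voice=True)

print(place_profile(g), sorted(g.features()))
print(place_profile(d_emphatic), sorted(d_emphatic.features()))
```

Plain sets are used because the features are monovalent: the absence of a feature is simply non-membership, so no binary values need to be stored.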

#### **3. The Typologies of Arabic Consonant Reflexes**

The present study provides feature-based typologies of \**q*, \**ǧ*, \**k*, \**ḍ*–\**ḏ̣*, \**θ*–\**ð*, \**r*, and \**ħ*–\**ʕ*, which display variation across Arabic dialects. Reflexes of these consonants can be differentiated representationally along the dimensions of place and manner of articulation, in addition to voicing. As mentioned above, I will draw feature geometric generalizations of these reflexes based on the contrast and phonological activity they exhibit. The facts and data denoting their behavior appear in various studies on individual Arabic dialects as well as in cross-dialectal surveys, as the references indicate.

Before embarking on the analysis, a few remarks are in order. First, when discussing segmental composition in the PSM, it is crucial to separate underlying from surface levels. Our concern here is the major reflexes that have a contrastive phonemic status, i.e., not predictable allophones nor marginal phonemes that exist only in free variation. Although these types of sounds will not be analyzed representationally, they will be mentioned and clearly labeled for what they are, so that confusions are avoided. It is still, however, a challenge to decide which variant, in case of multiple allophones, should be treated as the basic phoneme, and whether sounds confined to loanwords should be included in the phoneme inventory (cf. Gordon 2016, p. 43). There is also the issue of how to deal with several reflexes of one consonant cooccurring in the same dialect.

The answer to the above questions will vary depending on the available evidence in each case: the nature of the environment in which the variants occur, the existence of minimal pairs, the extent and stability of the borrowings, etc. We will see, for example, that /k/ and /dʒ/ are the basic phonemes in dialects that exhibit [k]–[tʃ] and [dʒ]–[j] alternations, simply because the [tʃ] and [j] allophones are restricted to front vowel contexts, while the other two occur elsewhere. We will also learn that many, but not all, of the dialects with /g/ and /ʔ/ reflexes of \**q* have retained a /q/ phoneme in both stable and more recent loans from Standard Arabic (SA), sometimes leading to minimal pairs or morphological doublets. And even aside from direct borrowing, the diglossic coexistence of dialects with SA often leads to the reintroduction of SA phonemes into their inventories.

In the coming subsections, I will examine reflexes for each of the consonant prototypes listed above, describing their geographical distribution, but more importantly their phonological behavior: the phonemes they contrast with and the processes they participate in. This behavior serves as the basis for assigning their PSM feature-geometric structures, which are the building blocks of the new representational typologies I propose in this work.

#### *3.1. The Qāf Typology*

Cantineau (1960, p. 68) states that "the pronunciation of *qāf* is of utmost importance" in the classification of Arabic dialects. Four major reflexes, /q ʔ k g/, are often named in the literature. A widely accepted generalization is that dialects with the voiceless cognates /q ʔ k/ are spoken by sedentary people, while those with the voiced /g/ are spoken by Bedouin or Bedouin-descended populations (Watson 2011a, p. 859). However, this principle is not without exceptions; for instance, both in North Africa and the Levant we encounter urban dialects with /g/, and in reality, every geographical region has a distinct pattern of variation (see Bahloul 2007). Let us briefly examine each of these four reflexes, in pursuit of a representational typology of \**q*. For a thorough overview of geographical distributions, see Cantineau (1960, pp. 68–71), Kaye and Rosenhouse (1997, pp. 270–73), Bahloul (2007), and Edzard (2009).

The voiceless uvular /q/ reflex is most notably attested in the sedentary dialects of Syria and the Maghreb, as well as *qəltu* Mesopotamian and parts of Oman and Yemen (Fischer and Jastrow 1980, p. 52). Some examples are [trawwaq] 'he had breakfast' (Latakia, Syria), [bqa] 'he stayed' (Morocco), and [qạsˤị:ʁ] 'short' (Mosul, Iraq). Based on phonological activity in these dialects, /q/ can be treated as a member of the natural class of primary dorsal segments. It often patterns with velar stops in triggering nasal place assimilation (NPA) toward a back nasal, as in /manqal/ > [maɴqal] 'brazier', and totally assimilates adjacent velar/uvular fricatives in *qəltu* dialects, as in /ʔaqʁaʕ/ > [ʔaqqaʕ] 'bald' (Youssef 2019, p. 26).

With no trace of phonological activity that discriminates velars and uvulars, I infer that there is a single natural class of C-place [dorsal] consonants. Of these, uvular /q/ is most suitable for a mannerless segment, i.e., with a bare place feature, since it patterns with both stops and fricatives. Phonetically, [dorsal] is a fitting feature since /q/'s posterior articulation is known to cause lowering or backing of all immediately adjacent vowels (cf. Al-Ani 1970, pp. 32–33).

By far the most common reflex of \**q* is the voiced velar stop /g/, which is characteristic of Bedouin dialects (Watson 2002, p. 17). This reflex covers the eastern/central Arabian Peninsula and southern Iraq, but also significant pockets in North Africa, Upper Egypt, Sudan, the Levant, and the southern Peninsula (Bahloul 2007). As implied above, /g/ belongs to the class of C-place [dorsal] consonants, as it triggers NPA, e.g., /ji-ngar/ > [jiŋgar] 'he pecks' (Muslim Baghdadi; Youssef 2013, p. 67); it also partakes in the labialization of /i/ to [u] in certain Iraqi and Levantine dialects (see e.g., Haddad 1984 and Youssef 2015). Further, if we assume that a C-manner [closed] feature indicates a stop constriction in the PSM, then /g/ is specified for this feature as well as [voice], so that it is distinguished from /q/ and /k/.

The second most widespread reflex is the glottal stop /ʔ/, mainly attested in urban centers of the Levant and Lower Egypt, and sporadically in some Maghrebi city dialects (Holes 1995; Bahloul 2007), but also in rural areas, especially in Lebanon (Fischer and Jastrow 1980, p. 52); examples: [ʔạ:dˤị] 'judge' (Cairo); [ʔil-ʔuds] 'Jerusalem' (Beirut); and [rifʔa:t] 'friends' (Damascus). Given that all other stop consonants show contrastive evidence for a place feature, /ʔ/, the Arabic epenthetic consonant, is left with a single C-manner [closed] feature. From an articulatory standpoint, /ʔ/ is simply a stop formed with complete closure between the vocal folds.

A voiceless velar stop cognate, /k/ (sometimes appearing as emphatic /ḳ/), is generally marked as ruralite; and Edzard (2009) notes that it surfaces in those dialects which have affricated the original *kāf* (see Section 3.3 below). It is typical of central Levantine villages, but is also found in areas of North Africa (Watson 2011a, p. 862). We find, for example, [kalb] 'heart'; [ka:l] 'he said'; and [karji] 'village' (rural Palestinian). Representationally, /k/ is the voiceless counterpart of /g/, and it participates in the same processes: NPA (producing a velar nasal) and labialization (Herzallah 1990). We infer, then, that it is specified for the features C-place [dorsal] and C-manner [closed].

There also exist a number of conditioned variants, which are not included in the analysis because they appear to be the more restricted subsidiary allophones of one of the main reflexes above. For instance, certain eastern Arabian nomadic dialects affricate their /g/ to [dʒ] and further to [dz], but only in front vowel contexts (Johnstone 1967). Alternations such as [ga:l] 'he said' vs. [tˤạri:dʒ] 'road' (southern Iraqi) and [tˤạri:dz] (Šammari, central Saudi) led Holes (1995, p. 60) to classify the /g/ group into three subtypes of Bedouin dialects.

Another marginal variant is the voiced uvular fricative [ɣ], which appears to be in free variation with [q] in parts of southern Iraq and the Arabian Gulf, e.g., [ɣada]~[qada] 'lunch' and [qịtˤạ:r]~[ɣạtˤạ:r] 'train' (Fischer and Jastrow 1980; see also Al-Nassir 1993, p. 40). Moreover, many dialects with one of the major reflexes /ʔ k g/ preserve /q/ in a number of borrowed words from SA, sometimes giving rise to semi-contrasts like [wọrˤgạ] 'tree leaf'–[worqa] 'sheet of paper' (Moroccan Bedouin; Cantineau 1960, p. 70).

Table 1 offers a restatement of the *qāf* typology in Arabic, with a rough geographical distribution of the four major cognates. Using this feature typology, we can simply refer to dialects with a \**q* reflex that has all or a subset of the features named. The specifications both reflect and explain each segment's synchronic phonological behavior. And although not the focal point here, historical shifts from one reflex to the other could also be motivated through feature loss or gain.


**Table 1.** Representational typology of the major \**q* phoneme reflexes.
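As a rough illustration of how such a typology can be queried, the sketch below encodes the four \**q* reflexes with the feature specifications given in the prose of this section (symbols in IPA) and retrieves natural classes by feature membership. The dictionary and function names are hypothetical, and the geographical information of the table is deliberately left out.

```python
# Feature sets as stated in the prose of Section 3.1; names are illustrative only.
QAF_REFLEXES = {
    "q": frozenset({"C-place[dorsal]"}),
    "g": frozenset({"C-place[dorsal]", "C-manner[closed]", "[voice]"}),
    "ʔ": frozenset({"C-manner[closed]"}),
    "k": frozenset({"C-place[dorsal]", "C-manner[closed]"}),
}


def natural_class(inventory, *required):
    """Return the segments whose specification contains every required feature."""
    needed = frozenset(required)
    return [seg for seg, feats in inventory.items() if needed <= feats]


# Segments expected to pattern together in nasal place assimilation.
dorsals = natural_class(QAF_REFLEXES, "C-place[dorsal]")
# Segments with a stop constriction.
stops = natural_class(QAF_REFLEXES, "C-manner[closed]")

print("dorsal class:", dorsals)
print("stop class:", stops)
```

Under these specifications, /q/ falls in the dorsal class but not in the stop class, which mirrors its treatment in the text as a mannerless segment.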

#### *3.2. The Ǧīm Typology*

Another famously varying consonant is *ǧīm*, with the three major reflex phonemes /dʒ ʒ g/. The first two are the most widespread pronunciations, and, broadly speaking, /dʒ/ is characteristic of Bedouin dialects, while /g/ and /ʒ/ are sedentary. Exceptionally, however, /ʒ/ is the predominant reflex in North Africa, irrespective of the sedentary–nomadic split (Cantineau 1960, p. 59). Below I discuss each of the \**ǧ* cognates separately. Detailed geographical typologies are provided in Cantineau (1960, pp. 58–60), Fischer and Jastrow (1980, p. 51), Holes (1995, pp. 61–62), and Zaborski (2007).

The voiced palatoalveolar affricate /dʒ/ is standard "in the majority of eastern Bedouin dialects, in rural dialects of the Levant and Mesopotamia, in the majority of dialects in central Yemen, and in some sedentary dialects in Algeria" (Watson 2011a, p. 863).<sup>2</sup> Phonologically, /dʒ/ is the voiced counterpart of /tʃ/ in dialects that have developed the latter phoneme through borrowings and historical affrication (Fischer and Jastrow 1980), with minimal pairs like (Baghdadi) [tʃanna] 'daughter-in-law'–[dʒanna] 'paradise' and [furatʃ] 'brushes'–[furadʒ] 'he dispelled'; hence, it has [voice].

Two different phonological processes provide evidence that /dʒ/ is coronal. One is that it typically participates in the assimilation of the definite article (L-ass), as one of the 'sun letters', e.g., /l-dʒiba:l/ > [dʒ-dʒiba:l] 'the mountains' (though not in SA). The other is that it tends to assimilate partially to a following coronal obstruent in onset clusters, producing a fricative [ʒ], with possible devoicing to [ʃ], e.g., /dʒtima:ʕ/ > [ʃtima:ʕ] 'meeting' (Iraqi; Youssef 2013, p. 69). Because /dʒ/ is a blocker of emphasis spread (ES) in many dialects, /dʒ/'s coronality is interpreted as a secondary feature, i.e., V-place [coronal], in conflict with the secondary emphatic feature (cf. Davis 1995). Lastly, since affricates behave phonologically as stops, we specify /dʒ/ for C-manner [closed] as well.

The second most frequent reflex is the voiced palatoalveolar fricative /ʒ/, attested in the urban dialects of the Levant (exceptions include Aleppo and most of Jordan, which have /dʒ/) and most urban and non-urban Maghrebi dialects (Zaborski 2007). It is the voiced equivalent of /ʃ/, as seen in the minimal pair [ʒa:j] 'coming'–[ʃa:j] 'tea'. It is always a trigger of L-ass, e.g., [ʒ-ʒami:l] 'the pretty' (Lebanese), hence coronal, but it also patterns with the ES blockers (see above), hence V-place [coronal].

As discussed above, /ʒ/ results from the assimilation of /dʒ/ to a coronal obstruent. Since /dʒ/ has both V-place [coronal] and [voice], the only way to distinguish it from /ʒ/ is constriction. Parallel to the stops, we may hypothesize that a C-manner [open] feature marks fricative constriction for /ʒ/ and all other consonants with a similar manner of articulation. Mustafawi (2017, p. 15) argues that "the best alternative for /dʒ/ while keeping most of its distinctive features would be /ʒ/".

A voiced velar stop /g/ is found in Cairo, in rural central and northeastern Delta, and in all urban centers of northern Egypt down to Bani Swef, but also in various Bedouin dialects of central Arabia and in some Yemenite and Omani dialects (cf. Watson 2002, p. 16; Zaborski 2007, p. 494). The /g/ reflex is often thought to be "the most salient feature of Egyptian speech across the Arab-speaking world" (Holes 1995, p. 61). Phonologically, /g/ has a stop constriction; it contrasts with voiceless /k/, e.g., [gu:ʕ] 'hunger'–[ku:ʕ] 'elbow'; and it triggers NPA, e.g., /finga:l/ > [fiŋga:l] 'coffee cup' (Cairene; Youssef 2013, p. 35). We may conclude, then, that /g/ has the following contrastive features: C-place [dorsal], C-manner [closed], and [voice].

As I have noted for *qāf*, there are a few conditioned and marginal variants of *ǧīm*, which, although excluded from the featural analysis, are worth mentioning here. Perhaps the best known is a palatal approximant [j], found mainly in Bedouin dialects of the Gulf and lower Iraq, which is partly in free variation with [dʒ] and partly lexically conditioned (Zaborski 2007). Despite the variability, e.g., [jarju:r]~[dʒardʒu:r] 'shark' or [ʕaji:n]~[ʕadʒi:n] 'dough' (Baḥraini), [j] is considered a marker of Gulf speech (Holes 1995, p. 62). Other notable variants include an alveolar stop [d] in some Upper Egyptian dialects in front of liquids and nasals (Behnstedt and Woidich 1985), two affricates: palatoalveolar [tʃ] in Palmyra and alveolar [ts] in the oasis of Suḫne (Syria), and a fricative [z] in some Jewish dialects of the Maghreb (cf. Fischer and Jastrow 1980).

Table 2 summarizes and restates the *ǧīm* typology in terms of five contrastive features, which are assigned based on synchronic phonological activity. We now realize that urban Egyptian dialects, which have a glottal stop reflex of \**q* and a /g/ reflex of \**ǧ*, have exploited the features C-place [dorsal] and C-manner [closed] to differentiate segments in their inventories. Historically, claims that the Proto-Semitic origin of \**ǧ* is indeed a velar plosive /g/ (see e.g., Roman 1981) can also be explained by a place of articulation shift from C-place [dorsal] to V-place [coronal] in /dʒ/, while keeping all other features intact.


**Table 2.** Representational typology of \**ǧ* phoneme reflexes.
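The relations among the three \**ǧ* reflexes can be made explicit by computing which features are lost and gained between any two of them. The sketch below uses the specifications argued for in the prose (symbols in IPA); `GIM_REFLEXES` and `feature_change` are hypothetical names, and the function is only an illustrative set difference, not a claim about how the shifts actually proceeded.

```python
# Specifications follow the prose of Section 3.2; identifiers are illustrative.
GIM_REFLEXES = {
    "dʒ": frozenset({"V-place[coronal]", "C-manner[closed]", "[voice]"}),
    "ʒ": frozenset({"V-place[coronal]", "C-manner[open]", "[voice]"}),
    "g": frozenset({"C-place[dorsal]", "C-manner[closed]", "[voice]"}),
}


def feature_change(source, target, inventory=GIM_REFLEXES):
    """Return (lost, gained) feature sets for a shift from source to target."""
    old, new = inventory[source], inventory[target]
    return old - new, new - old


# Hypothesized historical shift from a velar plosive to the affricate.
lost, gained = feature_change("g", "dʒ")
print("g > dʒ: lost", sorted(lost), "gained", sorted(gained))

# Affricate versus fricative: only the manner feature differs.
lost_m, gained_m = feature_change("dʒ", "ʒ")
print("dʒ > ʒ: lost", sorted(lost_m), "gained", sorted(gained_m))
```

In both cases exactly one feature is exchanged for another, which is the sense in which the text speaks of a shift that keeps all other features intact.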

#### *3.3. The Kāf Typology*

This consonant exhibits conditioned and unconditioned variation in modern Arabic dialects. The latter type, which concerns us here, affects \**k* regardless of neighboring sounds and is due to advancement of /k/'s place of articulation, which makes it prone to affrication and spirantization (Cantineau 1960, p. 66), resulting in /tʃ/. Conditioned alternations produce a [tʃ] or a [ts] allophone of /k/ in the vicinity of front vowels, and [k] elsewhere. As before, we concentrate on phonemic reflexes for our phonological analysis, namely /k/ and /tʃ/, but will also mention the allophonic pattern for the purpose of comparison. Elaborate surveys can be found in Cantineau (1960, pp. 66–67), Johnstone (1967), and Kaye and Rosenhouse (1997, pp. 273–74).

On the one hand, most Arabic varieties from east to west have preserved a velar stop /k/ as the only reflex available. In the east, this is generally viewed as an urban feature, while in Egypt and westwards, the lifestyle factor is insignificant as there is little to no variation observed (Palva 2006, p. 606). Among the dialects with a /g/ phoneme, either as a reflex of \**q* or \**ǧ*, /k/ is its voiceless cognate; thus, it has no voicing specification. The contrastive features for /k/ have already been discussed in Section 3.1: a velar point of articulation corresponds to C-place [dorsal], and a stop constriction corresponds to C-manner [closed].

In various ruralite dialects of the Levant, a voiceless palatoalveolar affricate cognate, /tʃ/, is attested, irrespective of the phonological environment (Watson 2011a, p. 873). More specifically, this is the case in central Palestine, a few Syrian villages, and two regions of Algeria, as well as among the Shiites of Baḥrain (Fischer and Jastrow 1980, pp. 51–52). Moreover, several Bedouin dialects seem to have regularized affricate /tʃ/ within roots; and although there often remain a few [k]–[tʃ] alternations, one can safely posit two phonemes, /k/ and /tʃ/, in contrast. In Muslim Baghdadi (Youssef 2014), for instance, we encounter minimal pairs like [tʃuwa] 'he scorched'–[kuwa] 'he ironed' and [ba:tʃir] 'tomorrow'–[ba:kir] 'virgin'. And in some rural Jordanian elderly speech (Cantineau 1960), extensions of [tʃ] to non-front vowel contexts occur as a result of analogy, e.g., [di:tʃ] 'rooster' > [dju:tʃ] 'pl'.

More widespread are the conditioned alternations where either [tʃ] or [ts] occurs in front vowel contexts in complementary distribution with [k], with no morphological repairs, and are thus regarded as allophones of the /k/ phoneme (cf. Holes 1995, p. 60). The [tʃ] variant is attested in the Bedouin north Arabian and related dialects of Jordan and Iraq (Fischer and Jastrow 1980), with alternating examples like [ritʃib] 'he mounted'–[jirkab] 'he mounts'.<sup>3</sup> The [ts] variant is predominant in central Najdi, among the ʿAnaiza and Šammar tribes (Cantineau 1960, p. 67), e.g., [tsaff] 'palm of the hand'–[kfu:f] 'pl.'.
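The complementary distribution just described can be stated as a toy allophonic rule. The sketch below is a deliberate simplification under several assumptions that are not claims of the source: the function name `realize_k`, the set of vowels counted as front, the treatment of "front vowel context" as immediate adjacency on either side, and the underlying forms fed to the rule are all hypothetical.

```python
def realize_k(segments, front_vowels=("i", "i:", "e", "e:")):
    """Map /k/ to [tʃ] when a front vowel is adjacent, and to [k] elsewhere.

    `segments` is a list of phoneme strings; adjacency on either side is an
    assumed reading of 'front vowel context'.
    """
    output = []
    for idx, seg in enumerate(segments):
        prev_seg = segments[idx - 1] if idx > 0 else None
        next_seg = segments[idx + 1] if idx + 1 < len(segments) else None
        if seg == "k" and (prev_seg in front_vowels or next_seg in front_vowels):
            output.append("tʃ")
        else:
            output.append(seg)
    return output


# Assumed underlying forms for the alternating pair cited in the text.
mounted = realize_k(["r", "i", "k", "i", "b"])
mounts = realize_k(["j", "i", "r", "k", "a", "b"])

print("".join(mounted))
print("".join(mounts))
```

Because the output is fully predictable from the context, a rule of this kind is compatible with treating [tʃ] as an allophone of /k/ rather than as a separate phoneme; dialects in which the vowel set or the conditioning differs would need different parameters.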

Let us now discuss the featural composition of the phonemic /tʃ/ reflex, drawing mainly on Youssef (2014). First, note that all /tʃ/-dialects have a /dʒ/ cognate of \**ǧ*, the two forming a phoneme pair that differs in terms of voicing (see above); so in other respects, they should have comparable phonological status. On the one hand, /tʃ/ is necessarily coronal because it triggers L-ass, as in [tʃ-tʃa:ku:tʃ] 'the hammer'. On the other, affricates are stops phonologically, so /tʃ/ is also assigned C-manner [closed].

The proposed feature composition may also reflect the historical development of affricate /tʃ/ in the relevant dialects. If we treat affrication as a shift from velar to coronal that was once motivated by adjacent high vowels /i i:/ or palatal /j/, and if these triggers are specified for V-place [coronal], being blockers of ES, then the output of the assimilation process, namely /tʃ/, must also have the latter feature, while C-manner [closed] remains unchanged (cf. Watson and Dickins 1999). A concise representational typology of *kāf* is given in Table 3.


**Table 3.** Representational typology of \**k* phoneme reflexes.

#### *3.4. The Interdental Typology*

Here we will be dealing only with the plain interdentals *ṯāʾ* and *ḏāl*; emphatic *ḏ̣āʾ* will be discussed in the next section. A general principle is that Old Arabic /θ ð/ are preserved in Bedouin-type dialects and merged with the corresponding alveolar stops /t d/, and less frequently with alveolar /s z/ or labiodental /f v/ fricatives, in sedentary speech (Cantineau 1960, p. 44). However, this dichotomy encounters numerous exceptions. For example, all dialects in Morocco seem to have shifted to stops (ibid.), while a few city dialects (e.g., Tunis, Mosul, Mardin) have retained the interdentals (Fischer and Jastrow 1980, p. 50). Below, I will individually examine the three pairs of reflexes; for a full overview, see Cantineau (1960, pp. 44–45), Fischer and Jastrow (1980, p. 50), and Mustafawi (2017, pp. 14–15).

What we may call 'the preservation dialects' constitute all "Bedouin dialects, dialects of Bedouin origin, the rural sedentary dialects of central Palestine/Jordan, Tunisia and Mesopotamia, and [ ... ] all but the western coastal city dialects of the Peninsula" (Watson 2011a, p. 863): an assortment of dialects, if one assumes traditional dichotomies. In all of these, both /θ/ and /ð/ participate in L-ass, e.g., [θ-θo:b] 'the shirt' and [ð-ðahab] 'the gold', hence C-place [coronal]. Considering that /θ ð/ are non-sibilants, with relatively weak turbulence, we may propose that they are devoid of manner features (and thus featurally distinct from the sibilants /s z/). Furthermore, the two consonants contrast in voicing, which means that /ð/ is marked for additional [voice].

The majority of urban dialects, as well as many neighboring rural areas, have the dental/alveolar stop cognates /t d/ (Fischer and Jastrow 1980). This vast isogloss covers all of Morocco, all sedentary dialects of Egypt, Hijazi Arabic, and the rest of the Levant (Mustafawi 2017, p. 14). Concerning their featural content, there is good indication that /t d/ are C-place [coronal]. In Cairene (Youssef 2013), for instance, they trigger L-ass, e.g., [ʔit-tiʔi:l] 'the heavy', [ʔid-de:l] 'the tail'; and they regressively assimilate to labial and velar stops across word boundaries, e.g., /baʕat kita:b/ > [baʕak kita:b] 'he sent a book', /nafad bi-gildu/ > [nafab bi-gildu] 'he saved his skin'. As stops, they are also specified for C-manner [closed], and /d/ has yet another [voice] feature.

In various northern Mesopotamian dialects, as well as in the Arabic of Afghanistan and Uzbekistan, the development is toward the alveolar sibilants /s z/ (Jastrow 1978), as in [sa:se] 'three', [ʔaxaz] 'he took' (Āzəx, Anatolian). These sibilants also tend to replace /θ ð/ in borrowings from SA in the urban dialects of Egypt and the Levant (Mustafawi 2017), e.g., [jisbit] 'he proves', [ʔiza:ʕa] 'broadcasting' (Aleppo). The /s z/ pair partakes in L-ass, voicing assimilation, and often sibilant assimilation. We can therefore specify them for C-place [coronal], being alveolars, and C-manner [open], being fricatives; with an extra [voice] feature for /z/.

Another known pair of cognates are the labiodental fricatives /f v/, attested in Siirt (southeastern Anatolia), e.g., in [fa:fe] 'three' and [vahab] 'gold' (Jastrow 1978, pp. 34–39), in some nomadic dialects of the Tell Atlas Mountains, and in Palmyra (Cantineau 1960, p. 45). In the Shiite dialect of Baḥrain, only an /f/ reflex of \**θ* is attested (Mustafawi 2017, p. 15). The /f v/ reflexes form a voiceless–voiced pair; and I further assign them C-place [labial], as they would be expected to trigger NPA, and C-manner [open], which characterizes fricatives.

A crucial point to notice is that in all but the preservation dialects, the change is that of merger with an already existing phoneme—a fact simply built into the feature typology in Table 4. We can also make sense of Cantineau's (1960, p. 44) observation that the sedentary dialects which pronounce \**q* as /q/ have retained the interdentals. It appears that such dialects have a preference for reflexes with no manner features. Historically, in addition, the cross-linguistically common sound changes /θ ð/ > /t d/ or /s z/ are effortlessly explained as insertion of manner features.


**Table 4.** Representational typology of \**θ*–\**ð* phoneme reflexes<sup>4</sup>.
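The idea that the shifts to stops and sibilants amount to inserting a manner feature, with the result coinciding with an already existing phoneme, can be checked mechanically. The sketch below covers only the coronal reflexes, using the specifications from the prose; the labiodental pair /f v/ is left out because it also involves a change of place. `CORONAL_SERIES`, `BY_FEATURES`, and `add_manner` are hypothetical names.

```python
# Coronal reflexes as specified in the prose of Section 3.4; names are illustrative.
CORONAL_SERIES = {
    "θ": frozenset({"C-place[coronal]"}),
    "ð": frozenset({"C-place[coronal]", "[voice]"}),
    "t": frozenset({"C-place[coronal]", "C-manner[closed]"}),
    "d": frozenset({"C-place[coronal]", "C-manner[closed]", "[voice]"}),
    "s": frozenset({"C-place[coronal]", "C-manner[open]"}),
    "z": frozenset({"C-place[coronal]", "C-manner[open]", "[voice]"}),
}

# Reverse lookup from a feature bundle to the segment that carries it.
BY_FEATURES = {feats: seg for seg, feats in CORONAL_SERIES.items()}


def add_manner(segment, manner):
    """Insert a C-manner feature; return the matching segment, or None."""
    result = CORONAL_SERIES[segment] | {f"C-manner[{manner}]"}
    return BY_FEATURES.get(result)


for source in ("θ", "ð"):
    for manner in ("closed", "open"):
        print(source, "+", manner, "->", add_manner(source, manner))
```

Every insertion lands on a bundle that is already in the table, which is one way of reading the observation that the change is always a merger with an existing phoneme.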

#### *3.5. The Ḍād–Ḏ̣āʾ Typology*

Next are the emphatic consonants denoted by the Arabic letters *ḍād* and *ḏ̣āʾ*, which in the modern dialects either appear as two distinct phonemes, respectively alveolar stop /dˤ/ and fricative /zˤ/, or merge into a single interdental fricative /ðˤ/. The former is characteristic of sedentary dialects and the latter of nomadic dialects. Historically, these two sets of dialects have restructured the asymmetrical Old Arabic system in different ways (Holes 1995), as we will see below. Both historical and synchronic surveys are provided in Holes (1995, pp. 57–59), Versteegh (2006), and recently Hamdan and Al-Hawamdeh (2020).

We start with dialects maintaining a contrast between /dˤ/ and /zˤ/. These coincide unmistakably with city dialects that have neutralized the interdental fricatives /θ ð/, merging them with the corresponding alveolar stops /t d/ (Holes 1995, p. 58). These dialects are said to have a dyadic (binary) system, with voiceless–voiced series for both plain and emphatic consonants, i.e., /t d/, /tˤ dˤ/, /s z/, /sˤ zˤ/ (Bellem 2014).<sup>5</sup> Representationally, /dˤ zˤ/ are emphatic consonants that trigger long-distance ES, e.g., in [tạ-xfịdˤ-ạ:t] 'discounts' or [ʕạzˤạmạ] 'greatness' (Cairene; Watson 2002, p. 273).

Emphatics are distinguished from their plain counterparts by an additional nonprimary back articulation (Davis 1995, p. 472). Youssef (2006, 2013) posits V-place [dorsal] to characterize this natural class. This way, [dorsal] alone, on separate tiers, is used to account for velar/uvular and emphatic consonants, which is clearly more economical than introducing an additional [pharyngeal] (McCarthy 1994), [guttural] (Watson 2002), or any other feature proposed specifically for Arabic or Semitic. It is worth mentioning that McCarthy (1994) has also suggested [dorsal] as a redundant feature for emphatics.

The emphatics generally have C-place [coronal] as their primary articulation; /dˤ zˤ/ do trigger L-ass, e.g., [ʔịdˤ-dˤạjʕa] 'the village', [ʔịzˤ-zˤạri:f] 'the pleasant' (Damascene). Further, /dˤ zˤ/ are specified for [voice], as they contrast with voiceless /tˤ sˤ/ in the dyadic system. In terms of manner of articulation, /dˤ/ is a stop, with C-manner [closed], and /zˤ/ is a fricative, with C-manner [open].

The other group of dialects, where /dˤ/ has fallen together with /ðˤ/, are mainly Bedouin or have a Bedouin origin, such as *gilit* Mesopotamian, Yemenite, and Peninsular, essentially dialects that have retained the plain interdentals (cf. Embarki 2008, p. 592).<sup>6</sup> This merger has engendered confusion in defining minimal pairs that used to contrast /dˤ/–/ðˤ/, e.g., [fạ:jịðˤ] 'overflowing/usury' and [ðˤụfạr] 'he plaited/overcame' (Baghdadi; Youssef 2013, p. 131). These dialects are said to have reduced the asymmetry of the system by developing triadic series, with two three-member sets of voiceless–voiced–emphatic cognates: alveolar plosives /t d tˤ/ and interdental fricatives /θ ð ðˤ/ (Holes 1995, p. 58; see also Bellem 2014). The featural makeup of /ðˤ/ should now be easy to deduce: C-place [coronal], as a trigger of L-ass, V-place [dorsal], as a trigger of ES, and [voice]. And just like the plain interdentals (cf. Table 4), it need not be specified for C-manner.

It is probable that *ḍād* was historically a voiced lateral/lateralized interdental fricative emphatic (cf. Corriente 1978). A remnant of this is apparently the pronunciation of \**ḍ* as emphatic lateral /lˤ/ in a few dialects of southern Arabia, such as the Saudi Tihāma (Al-Azraqi 2010) and the Yemeni dialect of Daṯīna (Landberg 1905–1913, cited in Versteegh 2006). I will not pursue an analysis of this marginal reflex here, although my presumption is that it is featurally identical to /ðˤ/. Table 5 recapitulates the featural composition of the three major phonemes discussed above: the contrastive /dˤ zˤ/ and their merged reflex /ðˤ/.
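As a minimal sketch of how these specifications relate to phonological behavior, the Python fragment below encodes the three emphatic reflexes exactly as described in this section and reads off the processes they are expected to trigger. The dictionary and function names are hypothetical, and the mapping from features to processes is a simplification of the argument in the text (C-place [coronal] for L-ass, V-place [dorsal] for ES).

```python
# Feature specifications as stated in the text for the emphatic reflexes.
EMPHATIC_REFLEXES = {
    "dˤ": {"C-place[coronal]", "V-place[dorsal]", "[voice]", "C-manner[closed]"},
    "zˤ": {"C-place[coronal]", "V-place[dorsal]", "[voice]", "C-manner[open]"},
    "ðˤ": {"C-place[coronal]", "V-place[dorsal]", "[voice]"},  # no C-manner
}


def triggered_processes(features):
    """List the processes a segment is expected to trigger (simplified)."""
    processes = []
    if "C-place[coronal]" in features:
        processes.append("L-ass")  # assimilation of the article's /l/
    if "V-place[dorsal]" in features:
        processes.append("ES")     # emphasis spread
    return processes


def manner(features):
    """Return the C-manner value of a segment, or None if unspecified."""
    for value in ("closed", "open"):
        if f"C-manner[{value}]" in features:
            return value
    return None


for segment, features in EMPHATIC_REFLEXES.items():
    print(segment, triggered_processes(features), manner(features))
```

All three reflexes pattern alike with respect to L-ass and ES; they differ only in manner, which is what separates the dyadic pair /dˤ zˤ/ from the merged /ðˤ/.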

**Table 5.** Representational typology of \**ḍ*–\**ḏ̣* phoneme reflexes.


#### *3.6. The Rhotic Typology*

Most Arabic dialects have a rhotic phoneme /r/, corresponding to the letter *rāʾ*, which is typically realized as a voiced alveolar tap or trill (Younes 1994; Watson 2002). However, two groups of dialects have introduced a phonemic split whereby a new emphatic /rˤ/ or uvular fricative /ʁ/ contrasts with a plain /r/ phoneme. A third group has only an emphatic /rˤ/ reflex, a fourth has a plain /R/ with a double place of articulation, and a fifth has just a plain /r/. The first four types are thoroughly examined in Youssef (2019, forthcoming); below I provide a synopsis.

Type-I dialects have established two distinct phonemes in contrast, a plain /r/ vs. an emphatic /rˤ/, and are therefore dubbed 'the split-*r* dialects'. They mainly comprise the Arabic dialects of Africa, which include the Maghrebi and Egyptian families, and a few peripheral dialects in sub-Saharan Africa (but also in Anatolia). Minimal pairs are abundant, e.g., [rˤạ:jịb] 'curdled'–[ra:jib] 'collapsed' (Moroccan), [ʔạrˤbạʕ] 'a Wednesday'–[ʔarbaʕ] 'he guzzled' (Egyptian), and [kạrˤạ] 'he was seen'–[kara] 'he rented' (Mardin). Additionally, [rˤ] and [r] exist partly in the same environments, suggesting that they have parallel distribution.<sup>7</sup>

The phonemes /r rˤ/ trigger L-ass, e.g., [ə̣rˤ-rˤạ:ʒə̣l] 'the man', [ər-razwar] 'the shaver' (Moroccan); hence, they are C-place [coronal]. They also trigger coronal sonorant assimilation (CSA), whereby /n l/ assimilate regressively to a following /r rˤ/ across word and morpheme boundaries, e.g., /min riglu/ > [mir riglu] 'from his leg' (Cairene). The inference is that /r rˤ/ are sonorants, for which we may assign a composite of C-manner [open] and V-manner [closed] (see Morén 2006, p. 1210), denoting that sonorants are continuants (open) and vowel-like (sonorous). Finally, emphatic /rˤ/ in this group is a trigger of ES, with the same bidirectional, long-range spreading of pharyngealization as other primary emphatics, e.g., [ʕạrˤạbịjj-ạ:t-ạk] 'your cars' (Cairene). We thus assign it a secondary V-place [dorsal] feature in addition.

Type-II dialects have a single, emphatic /rˤ/ phoneme, and incorporate the Levantine dialects spoken in Syria, Lebanon, Palestine, and Jordan. The phoneme has emphatic [rˤ] and plain [r] allophones in complementary distribution, and there is no sign of a phonemic split. Distributional evidence that the phoneme is /rˤ/ and not /r/ includes the fact that it does not trigger vowel raising (*imāla*), e.g., [dʒọ:rˤạ] 'hole' rather than \*[dʒo:ri] (rural Palestinian; see Younes 1994 for details). Furthermore, /rˤ/ patterns with other emphatics in inducing ES, although it partially differs in its more limited domain and its susceptibility to de-emphasis (Younes 1993; Davis 1995); and it participates in L-ass, as well. Therefore, it has C-place [coronal] and V-place [dorsal]. It is a sonorant, as it triggers CSA, e.g., /le:l rˤa:jig/ > [le:rˤ rˤạ:jig] 'a calm night' (Jordanian), so we add C-manner [open] and V-manner [closed].

Type-III dialects have one /R/ phoneme, which is underlyingly non-emphatic, yet arguably both coronal and dorsal. They belong to the Peninsular and Mesopotamian *gilit* groups. Here, the /R/ phoneme has fully predictable plain and emphatic realizations; the emphatic allophone causes backing of adjacent low vowels only (Al-Ani 1970, p. 33), which implies low-level coarticulation rather than ES. As expected, /R/ obligatorily triggers L-ass, e.g., [R-Ri:ħa] 'the smell' (Muslim Baghdadi; Erwin 2004); besides, a process of labialization in these dialects shows that /R/ behaves more like velar/uvular than emphatic triggers (cf. Youssef 2015). We infer that /R/ is specified for two primary places of articulation, C-place [coronal] plus [dorsal], but no secondary articulation. In addition, it patterns with the coronal sonorants in CSA; therefore, it also gets the usual manner features for sonorants.

The remarkable type-IV group exhibits two distinct phonemes, an alveolar sonorant /r/ and a uvular fricative /ʁ/, and comprises primarily the Mesopotamian *qəltu* dialects, spoken in various cities in Iraq. In those dialects, the uvular /ʁ/ reflex of \**r* coincides and merges with etymological *ġayn*, whereas /r/ is found principally in loanwords (Blanc 1964; Jastrow 1978). Distributional evidence for two phonemes includes minimal pairs, e.g., [rakkib] 'he let climb'–[ʁakkib] 'he assembled' (Mosul), [farraq] 'he distinguished'–[faʁʁaq] 'he separated' (Jewish Baghdadi).

Further phonological processes of assimilation, vocalization, and dissimilation take place to resolve some unusual contacts between uvular /ʁ/ (from \**r*) and the back consonants /q x ʁ/. If these processes are motivated by an OCP violation, we can propose that /ʁ/ is specified for C-place [dorsal]. Since /ʁ/ also behaves as a fricative and contrasts with voiceless /x/, we can assign additional C-manner [open] and [voice] features. As for the /r/ phoneme, it triggers both L-ass and CSA, so I propose C-place [coronal] together with the two (sonorant) manner features. It does not trigger emphasis spread, though.

We may also add a fifth group for dialects with just a plain /r/ reflex, which contains several Yemeni and Peninsular dialects, as well as peripheral dialects that have lost all emphatic versus plain contrasts in their inventories, e.g., Maltese, Cypriot, Uzbekistani, Juba, and Ki-Nubi. In Sanʿāni, for instance, the allophone [rˤ] is only found in proximity of an emphatic obstruent; elsewhere, it is realized as a plain [r], even in words such as [ra:s] 'head' and [ħarr] 'hot' (Watson 2002, p. 16). For this group, the /r/ is representationally similar to the /r/ of types I and IV above.

The rhotic typology provides an interesting case where it is hard to rely on surface forms at the expense of actual phonological behavior. This behavior is disclosed in the feature representations of the various reflexes, summarized in Table 6. Variability is due to the general elusive nature of rhotics (see Wiese 2001) and in Arabic, additionally due to the involvement of the notorious emphatic–plain distinction (Youssef, forthcoming). This latter point also relates to the so-called marginal emphatics, with a list consisting of [lˤ nˤ mˤ fˤ bˤ xˤ kˤ] (Davis 2009), but since these are often attested in restricted environments, next to other emphatics or to a low vowel, they are arguably not part of the phonemic inventory of most dialects (Youssef 2013, p. 102). If, however, they show contrastive behavior in a dialect, they can be analyzed as having a V-place [dorsal] feature.
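The five rhotic types can be summarized as a small data structure. The Python sketch below is an illustration only: the feature bundles follow the prose of this section, the /r/ of type V is assumed to be identical (rather than merely similar) to the /r/ of types I and IV, and the predicates in `behaviour` are a simplified reading of the diagnostics used above. All identifiers are hypothetical.

```python
# Sonorant manner composite, as proposed in the text.
SONORANT = {"C-manner[open]", "V-manner[closed]"}

PLAIN_R = {"C-place[coronal]"} | SONORANT
EMPHATIC_R = {"C-place[coronal]", "V-place[dorsal]"} | SONORANT

RHOTIC_TYPES = {
    "I":   {"r": PLAIN_R, "rˤ": EMPHATIC_R},
    "II":  {"rˤ": EMPHATIC_R},
    "III": {"R": {"C-place[coronal]", "C-place[dorsal]"} | SONORANT},
    "IV":  {"r": PLAIN_R,
            "ʁ": {"C-place[dorsal]", "C-manner[open]", "[voice]"}},
    "V":   {"r": PLAIN_R},  # assumed identical to /r/ of types I and IV
}


def behaviour(features):
    """Predict triggering behavior from the feature bundle (simplified)."""
    coronal = "C-place[coronal]" in features
    return {
        "L-ass": coronal,
        "CSA": coronal and SONORANT <= features,  # coronal sonorants only
        "ES": "V-place[dorsal]" in features,      # secondary dorsal place
    }


for dialect_type, phonemes in RHOTIC_TYPES.items():
    for phoneme, features in phonemes.items():
        print(dialect_type, phoneme, behaviour(features))
```

On this encoding, only the emphatic /rˤ/ of types I and II is predicted to trigger ES, while the type-III /R/ is dorsal on the C-place tier and therefore does not.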

**Table 6.** Representational typology of \**r* phoneme reflexes.


#### *3.7. The Pharyngeal Typology*

The voiceless and voiced pharyngeals /ħ ʕ/ have largely been preserved in the modern dialects; however, a weakening of one or both phonemes can be observed in a few outskirts of the Arabic sprachbund (Watson 2002, p. 18). According to Fischer and Jastrow (1980, p. 52), Chadian and Nigerian Arabic have reduced old /ħ ʕ/ to laryngeal /h ʔ/, whereas Tihāma (Yemen) and Šīgo-Sason (Anatolia) dialects have only turned /ʕ/ into /ʔ/.

For /ħ ʕ/, there is no phonological evidence to support a C-place [dorsal] specification (nor any other place feature). I propose the double C-manner features [closed] and [open], a specification that ties phonetically with the considerable variation in the production of pharyngeals, which have been described as having fricative, approximant, or stop gestures (see McCarthy 1994; Shosted et al. 2017).

In Section 3.1, we assigned a single C-manner [closed] feature to the glottal stop /ʔ/, which surfaced as a reflex of \**q* in certain other dialects. For the natural class of fricatives, I proposed C-manner [open]. Now, let us posit that /h/ is the (placeless) segment composed entirely of that feature, considering its tendency to delete in word-final position in modern dialects. Table 7 summarizes the feature representation of these consonants and illustrates that the sound changes /ʕ/ > /ʔ/ and /ħ/ > /h/ involve a simple feature deletion mechanism.
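The deletion mechanism can be made concrete with a short Python sketch. The manner specifications follow the text; the [voice] feature on /ʕ/ is an assumption made here to keep the voiced and voiceless pharyngeals distinct, since the text describes them only as a voiceless–voiced pair. The function name is hypothetical.

```python
# Manner specifications from the text; [voice] on /ʕ/ is an ASSUMPTION.
GUTTURALS = {
    "ħ": frozenset({"C-manner[closed]", "C-manner[open]"}),
    "ʕ": frozenset({"C-manner[closed]", "C-manner[open]", "[voice]"}),
    "ʔ": frozenset({"C-manner[closed]"}),
    "h": frozenset({"C-manner[open]"}),
}


def deleted_features(source, target):
    """Return the features lost in source > target.

    Returns None when the change is not a pure deletion, i.e., when the
    target is not a proper subset of the source.
    """
    src, tgt = GUTTURALS[source], GUTTURALS[target]
    if tgt < src:
        return src - tgt
    return None


print(sorted(deleted_features("ħ", "h")))  # features lost in /ħ/ > /h/
print(sorted(deleted_features("ʕ", "ʔ")))  # features lost in /ʕ/ > /ʔ/
```

Both weakenings come out as pure deletions, whereas the reverse direction (e.g., /h/ > /ħ/) would require insertion and returns `None`.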


**Table 7.** Representational typology of \**ħ*–\**ʕ* phoneme reflexes.

#### **4. Discussion and Conclusions**

An important characteristic of the property-driven approach is that it refrains from classifying languages, or for that matter dialects, into types. The latter methodology leads to three false implications, elaborated in Hyman (2018, pp. 10–12), which I will consider in relation to the typologies of Arabic consonant reflexes.

The first is that the resulting categories appear to be mutually exclusive. A good illustration of this is the customary classification of Arabic dialects based on reflexes of \**k* into /k/ vs. /tʃ/ type dialects. As we saw in Section 3.3, the pure /tʃ/ dialects are relatively few, and many more dialects in fact contrast /tʃ/ and /k/ phonemes. Additionally, with increasing pressure to normalize educated speech toward SA, this phonemic split is receding or even disappearing in favor of /k/. A division of this sort, therefore, appears simplistic.

A second argument is that the outcomes of such studies pretend to offer unique taxonomies, as if "something has been accomplished" (Hyman 2018, p. 11), while in fact multiple categorizations are often possible. Take the case of stop /t d/ vs. fricative /s z/ reflexes of the interdentals in Section 3.4. One typologist may classify, say, the sedentary dialects of Egypt in the /t d/ group (e.g., Fischer and Jastrow 1980) when considering well-established lexical items; another may classify them as /s z/-type dialects (e.g., Embarki 2008) given their rendering of recent SA borrowings into fricatives, never stops.

Another example is the Mesopotamian *qəltu* dialects in the \**r* typology, which are classified under a separate category as a result of their unique /ʁ/ reflex. However, synchrony alone dictates that they should be part of the 'plain r' group since they have a single rhotic phoneme /r/ in loanwords, and since the fricative /ʁ/ reflex of \**r* has totally merged with an existing phoneme, the etymological *ġayn*. Rarely are the facts so uncomplicated that we can place a dialect in one or the other category. What really matters in the current approach is that the two categories are structurally delineated so that the phonological behavior of those reflexes can be explained, regardless of which dialect falls under which type.

The final argument advanced by Hyman is that the typological labels are often imprecise and invariably run into exceptions. Let us take, for example, the labels proposed by Youssef (2019) for the \**r* typology. The so-called 'split-*r* dialects' represent a type that contrasts plain /r/ and emphatic /rˤ/, but the label may equally apply to the 'uvular-*r* dialects', which also happen to split the etymological *r* into two phonemes, /ʁ/ and /r/. The third type, labeled 'plain-*r* dialects', has a non-emphatic rhotic phoneme which is either doubly marked for C-place [coronal] and [dorsal], /R/, or just [coronal], namely /r/. That is why it is more accurate to divide them into two discrete groups, as I have done in Section 3.6. Another case of inaccurate labeling is the designation of a 'type' with an approximant reflex of \**ǧ*, i.e., /j/ (cf. Watson 2011a, p. 863), even though [j] is typically a conditioned variant of the /dʒ/ phoneme in such dialects (Section 3.2). We must admit, of course, that labels are useful descriptive tools that help us conceptualize the object of our research; the important thing is that they should not be perceived as explanatory typological devices instead. They are not themselves manifestations of actual phenomena.

So, if typology is not about finding types, what should its goal be? By discarding types and embracing the property-driven view, we were indeed able to make valuable predictions, both empirical (for Arabic) and theoretical. Let us review them one by one.

First, as Hyman (2007, p. 265) states, this approach makes no clear distinction between phonological typology and phonological theory; and in doing so, it affords a range of theoretically informed schemes to typologize and explain variation. The current study appealed to the formal apparatus of phonological representation to account for variation in Arabic consonant phonemes. Here, the raw material of the typological analysis is not the phoneme reflexes per se, but how these reflexes are differentiated by the feature hierarchy (cf. Dresher et al. 2018). This contrastive-feature typology then has explanatory power in that the featural makeup of the various reflexes will correlate with their distinct phonological patterning across varieties of Arabic.

Secondly, the feature typology was generated by a specific model of feature geometry, the PSM. By utilizing a handful of features, which are only operative when distinctive (contrastive and/or active), the PSM provides a minimalist device to account for phonological alternations across languages and dialects. I have illustrated that the PSM is not only sufficient to capture complex typological correlations, but also that the correlations are made transparent by the architectural properties of the model. Crucially, feature economy is maximized and phoneme distribution is accounted for.

A relevant case here is that (the mostly Bedouin) dialects with a /g/ reflex of \**q* (Section 3.1) are more likely to have a /dʒ/ reflex of \**ǧ* (Section 3.2). By activating C-place [dorsal] for /g/ and V-place [coronal] for /dʒ/, other features being equal, those dialects make maximal use of the few available distinctive features to express their phoneme inventories (cf. Clements 2003). At the same time, they escape creating a common reflex for the two historical consonants, which would result in a merger (mergers happen typically when the phoneme contrasts have a low functional load, which is not the case here).

Thirdly, although exclusively synchronic in essence, the PSM analysis also sheds light on processes of sound change and phonologization, by offering linguistic explanations for how such processes might have taken place. According to Kiparsky (2008), structural properties (including features), rather than systems of opposition, should form the basis for language change. Typological generalizations then simply follow from recurrent patterns of change. As such, historical changes can provide explanations for closely related dialects, but how is this achieved?

We know, for instance, that partial sound change can eventually lead to a phonemic split. This occurs for several of the consonants under scrutiny where multiple reflexes co-occur in a given group of dialects, e.g., /g/ and /q/ reflexes of \**q* (Section 3.1), /k/ and /tʃ/ reflexes of \**k* (Section 3.3), and /ʁ/ and /r/ reflexes of \**r* (Section 3.6). When there is a single systematic reflex, we have an indication that the change is complete. Additionally, since the reflexes are characterized by minimal feature distinctions, we can often register that phonological change involves the addition or deletion of a few features. Finally, conditioned phonetic variants (as I have pointed out for \**q*, \**ǧ*, and \**k*) can provide clues for the process of phonologization. For a principally diachronic perspective of variation in a range of Arabic consonants, readers may consult Embarki (2008, 2014).

In conclusion, contrastive-feature taxonomies provide an interesting insight into the relations that exist between varieties of the same language, both synchronically and diachronically (cf. Dresher et al. 2018). Having demonstrated that the PSM is well suited to capturing variation in the consonants of genealogically related Arabic dialects, we can also claim, following Kiparsky (2018), that typological generalizations are inevitably theory dependent. The variety of available theoretical solutions should open up new avenues for dialect categorization, independent of traditional classificatory systems that conflate multiple extra-linguistic factors.

**Funding:** This research received no external funding.

**Acknowledgments:** For their thoughtful comments and suggestions, I thank the anonymous reviewers and the audience of the "Semitic Dialectology: Crises and Change" conference, organized by Heidelberg University. All remaining errors are my own.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


#### **References**

Al-Ani, Salman H. 1970. *Arabic Phonology*. The Hague: Mouton De Gruyter.


Behnstedt, Peter, and Manfred Woidich. 1985. *Die ägyptischarabischen Dialekte. II. Dialektatlas von Ägypten*. Wiesbaden: L. Reichert.


Clements, George N. 2003. Feature economy in sound systems. *Phonology* 20: 287–333.

Clements, George N., and Elizabeth Hume. 1996. The internal organization of segments. In *The Handbook of Phonological Theory*. Edited by John A. Goldsmith. Cambridge: Blackwell, pp. 245–306.

Corriente, Frederico. 1978. Ḍ-L doublets in Classical Arabic as evidence of the process of de-lateralisation of *ḍād* and development of its standard reflex. *Journal of Semitic Studies* 23: 50–55.

Davis, Stuart. 1995. Emphasis spread and grounded phonology. *Linguistic Inquiry* 26: 465–98.

Davis, Stuart. 2009. Velarization. In *The Encyclopedia of the Arabic Language*. Edited by Kees Versteegh. Leiden: Brill, vol. 4, pp. 636–38.

Dresher, Elan B. 2009. *The Contrastive Hierarchy in Phonology*. Cambridge Studies in Linguistics 121. Cambridge: Cambridge University Press.

Dresher, Elan B., Christopher Harvey, and Will Oxford. 2018. Contrastive feature hierarchies as a new lens on typology. In *Phonological Typology*. Phonology and Phonetics 23. Edited by Larry M. Hyman and Frans Plank. Berlin: Mouton De Gruyter, pp. 273–311.

Edzard, Lutz. 2009. Qāf. In *The Encyclopedia of the Arabic Language*. Edited by Kees Versteegh. Leiden: Brill, vol. 4, pp. 1–3.

Embarki, Mohamed. 2008. Les dialectes arabes modernes: État et nouvelles perspectives pour la classification géo-sociologique. *Arabica* 55: 583–604.


Herzallah, Rukayyah. 1990. Aspects of Palestinian Arabic Phonology: A Non-Linear Approach. Ph.D. dissertation, Cornell University, Ithaca, NY, USA.


Hyman, Larry M. 2018. What is phonological typology? In *Phonological Typology*. Phonology and Phonetics 23. Edited by Larry M. Hyman and Frans Plank. Berlin: Mouton De Gruyter, pp. 1–20.

Jastrow, Otto. 1978. *Die Mesopotamisch-Arabischen Qəltu-Dialekte*. Band I: Phonologie und Morphologie. Wiesbaden: Steiner.

Johnstone, T. M. 1967. *Eastern Arabian Dialect Studies*. London Oriental Series 17. London: Oxford University Press.


Watson, Janet C. E. 1992. Kashkasha with reference to modern Yemeni dialects. *Zeitschrift für Arabische Linguistik* 24: 60–81.


## *Article* **Definiteness Systems and Dialect Classification**

**Mike Turner**

Department of World Languages & Cultures, The University of North Carolina Wilmington, Wilmington, NC 28403, USA; turnerml@uncw.edu

**Abstract:** In this article I explore how typological approaches can be used to construct novel classification schemes for Arabic dialects, taking the example of definiteness as a case study. Definiteness in Arabic has traditionally been envisioned as an essentially binary system, wherein definite substantives are marked with a reflex of the article *al*- and indefinite ones are not. Recent work has complicated this model, framing definiteness instead as a continuum along which speakers can locate referents using a broader range of morphological and syntactic strategies, including not only the article *al*-, but also reflexes of the demonstrative series and a diverse set of 'indefinite-specific' articles found throughout the spoken dialects. I argue that it is possible to describe these strategies with even more precision by modeling them within cross-linguistic frameworks for semantic typology, among them a model known as the 'Reference Hierarchy,' which I adopt here. This modeling process allows for classification of dialects not by the presence of shared forms, but rather by parallel typological configurations, even if the forms within them are disparate.

**Keywords:** definiteness; indefiniteness; specificity; referentiality; determination; article systems

#### **1. Introduction**

To date, most efforts at classifying Arabic dialects have been concerned with grouping dialects on the basis of shared forms. At times, these forms have been phonological, such as the reflexes of \**q* that inform the well-known sedentary–bedouin division; at others, they have been morphological, such as the 1SG imperfective prefix *n*- that differentiates western from eastern varieties (see Palva 2006 on these, among others). In this paper I put forth an alternate proposal: that it may be beneficial to look past forms themselves, and add to our toolset the use of semantic typology as a metric for grouping and subgrouping dialects. In doing so, the possibility arises that formally dissimilar features in two or more varieties may actually have more in common than previously thought, at least to the extent that the features in question exhibit the same types of polysemy. This approach is not exclusive of existing classification schemes. Instead, it may be seen as a way to further test and refine previous characterizations, or otherwise break a tie when a classification decision is questionable.

Although the typological approach itself can theoretically be applied to any number of interrelated feature sets, I opt to focus here on the interplay between nominal morphosyntax and a set of semantic notions that I refer to with the umbrella term 'definiteness'. The choice to use the term holistically follows that of other works, including Lyons (1999), similarly titled *Definiteness*, and presumes Chafe's (1976, p. 39) definition of the same as "whether I think you already know and can identify the particular referent I have in mind". Nonetheless, to be clear, in speaking of 'definiteness systems', my focus is on a particular range of definite-indefinite meanings, including relevant subcategories, that can accompany common nouns in response to Chafe's question (whether or not the answer is affirmative). Definiteness is a useful feature set with which to test a typological classification approach for various reasons, among them that (1) it can be modeled with a reasonable degree of precision, (2) Arabic dialects are known to differ in the ways they express it, and (3) sufficient material exists such as to be able to model discrete dialects and compare them, at least on a preliminary basis.

**Citation:** Turner, Mike. 2021. Definiteness Systems and Dialect Classification. *Languages* 6: 128. https://doi.org/10.3390/languages6030128

Academic Editors: Simone Bettega and Roberta Morano

Received: 13 June 2021 Accepted: 22 July 2021 Published: 28 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Most discussion of definiteness in the Arabic dialectological literature is, as is often true of other features, primarily concerned with formal representations. These discussions can be subdivided into two primary types, the first being the shape and assimilation patterns of the so-called "definite article" \**al-*,<sup>1</sup> and the second being the presence and shape of "indefinite articles" in dialects that exhibit them. In the case of the former, the article \**al-* typically receives little explicit semantic discussion, as it is usually presumed to indicate true definiteness. Indefinite articles have fared somewhat better, perhaps because they clearly depart from formal expectations imparted by the standard language, and differ within dialects themselves; Mion (2009) provides an excellent survey of these articles, and even provides a preliminary (form-focused) typology, though his paper stops short of placing them into a comparative semantic framework.

The organization of the present paper is as follows: I begin with a theoretical discussion of definiteness and models that can be used to envision it, especially as they apply to the Arabic case. Following that, and in keeping with the overall focus on meaning over form, I provide a tier-by-tier view of the primary semantic categories attested in the above models, providing evidence of variation in Arabic by drawing on material from the dialectological literature. The next section provides more complete models of a sample of discrete Arabic dialects, selected again to exhibit the extent of possible variation, and to allow for side-by-side comparison. Finally, I return to questions of dialect classification, including both how we can construct schemes from the present data and how these schemes might interact with classification proposals previously made.

Because linguistic examples are drawn from various sources, many of which exhibit different conventions, I have adapted them (with the exception of Nubi) into a single transcription system and provided my own interlinear glosses and free translations.<sup>2</sup> In addition, throughout this paper I follow Dryer (2014, p. e234) in adopting an intentionally broad and more semantically oriented definition of the term 'article,' which is used interchangeably with 'marker' to refer to any morphosyntactic structure that adds referential meaning to a noun. As such, its use here should not be understood as a syntactic judgment of any particular form.

#### **2. Modeling Definiteness**

As a starting principle, definiteness (in the holistic sense) is presumed here to be a semantic property of nouns in all human languages, stemming from shared cognitive perceptions of the world, entities within it, and other humans' knowledge of them. This semantic view is distinct from the grammatical expression of definiteness, which may be realized differently (or not at all) on a language-by-language basis. Dryer's (2005a, 2005b) respective overviews of definite and indefinite articles for the *World Atlas of Language Structures* (WALS) underscore this point, showing that common cross-linguistic definiteness systems include formal representation of (1) both definiteness and indefiniteness, (2) definiteness but not indefiniteness, (3) indefiniteness but not definiteness, and (4) neither definiteness nor indefiniteness. Despite the variability of possible arrangements, maps of the same data show that they are not distributed at random, but rather display areal characteristics, often bridging disparate language families that are geographically proximate, but then varying inside a single language family that is geographically distributed. As Arabic falls into the latter category, that it sees variability in the expression of definiteness is a reasonable initial assumption.

Although grammarians often speak of "definiteness and indefiniteness" in binary terms, scholars have nonetheless recognized that definiteness and its expression cannot adequately be envisioned on a bipartite basis. In the past half-century, various models have been offered as visualizations of the cognitive statuses that underlie nominal referentiality, a common component of which has been the subdivision of either the 'definite' or 'indefinite' categories (often both) into more precise subcategories. These models have also generally recognized the same ordering of categories, which form a sort of continuum along which formal representations might be distributed. Here I briefly review some of these models and select one for the present task, then move more explicitly into the Arabic case.

#### *2.1. The Wheel Model*

Givón (1978, p. 298) proposes a wheel-shaped model that distinguishes six possible nominal statuses, which he identifies as (a) 'referential definite', (b) 'referential indefinite', (c) 'referential nondefinite', (d) 'nonreferential object', (e) 'generic predicate', and (f) 'generic subject', with the first and last categories bordering each other. Figure 1 shows this model as he envisioned it for standard English. The choice of a wheel is motivated by Givón's observation that, while languages often use a single morphosyntactic strategy (possibly including zero-marking) for two or more statuses at once, their distribution across categories is nearly always contiguous. One notes, for example, that the English 'indefinite article' *a* (or *an*) can indicate multiple underlying semantic statuses. Givón's terms are somewhat clumsy (it is not immediately apparent how one would contrast 'indefinite' and 'nondefinite' without reviewing examples), but they do establish the basic principle of multiple semantic distinctions underlying a single form. He also rightly indicates that plural and singular forms do not have to follow the same patterning, and uniquely carves out space in his model for generic entities.<sup>3</sup>

**Figure 1.** Givón's Wheel Model, for English (redrawn).

#### *2.2. The Givenness Hierarchy*

Gundel et al. (1993) approach the same issue more broadly, framing definiteness as a subcomponent of a larger set of meanings, including those indicated by personal and demonstrative pronouns, that they refer to as 'givenness'. They propose a 'Givenness Hierarchy' (Table 1) consisting of six cognitive statuses, wherein the more discursively 'known' or 'given' a referent is, the further to the left of the hierarchy it will be. The three rightmost statuses in the Givenness Hierarchy might be seen as corresponding with the four statuses (a)–(d) of Givón's Wheel Model, showing a discrepancy in the choice of subdivision despite a general agreement that subdivisions should exist. One contribution of Gundel, Hedberg, and Zacharski is that they provide a formal representation of one of the 'indefinite' subcategories by giving informal English *this* as an indefinite article, a use that is further confirmed in Ionin (2006), who calls it a 'specific' marker. As it is useful to be able to provide semantically nuanced free translations, I make ample use of indefinite *this* in translations of Arabic examples in this paper.

**Table 1.** The Givenness Hierarchy (Gundel et al. 1993).


#### *2.3. The Reference Hierarchy*

Drawing together advantages of both the Wheel Model and the Givenness Hierarchy, a more recent proposal by Dryer (2014, p. e235), the 'Reference Hierarchy' (Table 2), combines the more limited scope and greater categorical distinctiveness of the former with the hierarchical implications of the latter. Dryer's model enjoys the unique advantage of having been constructed on the basis of a large corpus of real-world language data, which featured in his work for the WALS database (Dryer 2005a, 2005b); as such, it is likely to be sufficient for the description of most languages (including Arabic). Like Givón before him, Dryer emphasizes the tendency of articles to be both polysemous and contiguous across a particular range of meanings; meanwhile, like Gundel, Hedberg, & Zacharski, Dryer relies on the notion of a hierarchical relationship whereby nouns that are more 'known' or 'given' are located further to the left. His choice of five categories is more akin to the Wheel Model, though he leaves out generics and splits 'referential definites' into 'anaphoric definites' and 'nonanaphoric definites'. Also like the Wheel Model, the Reference Hierarchy proposes three non-generic indefinite statuses, i.e., one more than the Givenness Hierarchy indicates. Finally, although Dryer's particular terminologies are lengthy, he does provide a set of two- to three-letter abbreviations (in the heading of Table 2), which are particularly suitable for in-line reference and interlinear glosses.


#### *2.4. Applying the Reference Hierarchy*

Because it captures the advantages of models before it, was specifically proposed as a response to cross-linguistic data, and allows for abbreviated reference to particular semantic statuses, I opt to use the Reference Hierarchy as the working model for the current paper, and hereby adopt the terms AD, ND, PSI, PNI, and SNI for their respective meanings. These abbreviations are henceforth used liberally in both glosses and prose. It is nonetheless worth pointing out that broad terminological consensus has yet to emerge within this field of inquiry, so I summarize each status as follows, for clarity:


from others of its type. It has elsewhere been called 'existential', and in English is obligatorily marked with *a(n)*, but can also be marked with *some* (Israel 1999).

5. Semantically nonspecific indefinite (SNI), which corresponds with Givón's 'nonreferential object' and Gundel, Hedberg, & Zacharski's 'type identifiable', refers to the status of a noun that is fully unindividuated and is interchangeable with any other of its type. In English it is obligatorily and exclusively marked with *a(n)*.

Using the above definitions, it is possible to build a visual representation of a given language's definiteness system by representing the Reference Hierarchy as a series of blocks along which corresponding forms can be mapped. Figure 2 gives my interpretation of the system in spoken American English. The articles represented at top, *the* and *a(n)*, are obligatory; meanwhile, the forms at bottom represent auxiliary strategies. This layout is maintained for other iterations of the model in this paper. The visual model has the added benefit of easing comparison between multiple systems, which is our purpose here and is explored further in Section 4.


**Figure 2.** Forms represented along the Reference Hierarchy for American English.
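The block representation described above can also be sketched as a small data structure. The following Python sketch is illustrative only: the names `Status`, `is_contiguous`, and `render` are hypothetical, and the English mapping covers just the two obligatory articles, under the assumption that *the* spans AD and ND while *a(n)* spans the three indefinite statuses. It is not a reproduction of Figure 2.

```python
from enum import IntEnum


class Status(IntEnum):
    """The five statuses, ordered left to right along the hierarchy."""
    AD = 0   # anaphoric definite
    ND = 1   # nonanaphoric definite
    PSI = 2  # pragmatically specific indefinite
    PNI = 3  # pragmatically nonspecific indefinite
    SNI = 4  # semantically nonspecific indefinite


def is_contiguous(statuses):
    """True if the statuses form one unbroken run along the hierarchy."""
    positions = sorted(int(s) for s in set(statuses))
    if not positions:
        return False
    return positions[-1] - positions[0] + 1 == len(positions)


def render(system):
    """Draw one text row per form, with an 'x' under each status it covers."""
    header = "form".ljust(8) + " ".join(s.name.ljust(3) for s in Status)
    rows = [header]
    for form, covered in system.items():
        cells = " ".join(("x" if s in covered else ".").ljust(3) for s in Status)
        rows.append(form.ljust(8) + cells)
    return "\n".join(rows)


# Illustrative mapping for the two obligatory English articles
# (an assumption made for this sketch, not the content of Figure 2).
english = {
    "the": {Status.AD, Status.ND},
    "a(n)": {Status.PSI, Status.PNI, Status.SNI},
}

print(render(english))
print(all(is_contiguous(covered) for covered in english.values()))
```

The contiguity check mirrors the observation, shared by Givón and Dryer, that a single form tends to cover adjacent statuses rather than scattered ones; a form covering only AD and PSI, for instance, would fail it.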

#### *2.5. Definiteness in Arabic*

A handful of works to date have treated definiteness (or aspects of it) in Arabic specifically. Of these, Brustad (2000, pp. 18–43) is the most immediately relevant in both its focus on spoken Arabic and its comparative approach. She introduces the idea of a 'definiteness continuum' that includes not only meanings that are "wholly definite" or "wholly indefinite", but also meanings that exist within an intermediate range that she terms 'indefinite-specific'. Within the current framework, "wholly" definite and indefinite correspond with the statuses AD/ND and SNI, respectively; meanwhile, the indefinite-specific range that Brustad speaks of seems to cover both PSI and PNI. Looking at Moroccan, Egyptian, Syrian, and Kuwaiti dialects, Brustad identifies common patterns, among them the marking of true definites (AD/ND) with a reflex of \**al-*, as well as the zero-marking of non-referential (SNI) nouns. Taken alone as a binary opposition, this initial observation corresponds with the way definiteness in Arabic is often framed.

At the same time, Brustad also establishes the presence of structures that add more nuance than the binary model allows, many of which vary by dialect. Within the indefinite-specific range, she documents use of reflexes of \**wāḥid* 'one' for all four dialects, observing that it often marks a new topic that is subsequently adopted in the discourse. I qualify such referents as inherently PSI, in that new topics are necessarily known to the speaker, who can therefore expound upon them, but are presumed inaccessible to the listener. Nonetheless, as Brustad notes that \**wāḥid* is often restricted to humans (e.g., *wāḥid badwi* 'a certain bedouin', p. 20), I am more inclined to read it in such cases as an indefinite pronoun modified by an adjective (i.e., *someone (who is a) bedouin*) rather than a truly inclusive article that can modify any common noun. The exception is in Moroccan, which I discuss more specifically in Section 3.3.

For Moroccan and Syrian varieties, Brustad locates an article *ši*, which she glosses as 'some (kind of)' and contends speakers use "to indicate that they have a particular type of entity in mind". Brustad also raises the possibility of interpreting dialectical *tanwīn* as a sort of indefinite-specific marker, citing Ingham's (1994, pp. 47–50) comments on its semantic qualities in Najdi Arabic, and shows how both partitive structures and demonstrative adverbs can have the same semantic effect in Egyptian (Brustad 2000, pp. 30–31). Under the broad definition of 'article' used here (which, again, privileges semantic function over syntactic analysis), I consider such structures part of a given dialect's article system, and specifically include them in the models below.

Elsewhere, Brustad complexifies uses of the article \**al-*, typically seen to be a marker of true definiteness. Two principal qualifications arise from her data. The first of these is that while true definite (AD and ND) nouns are consistently represented with \**al-*, anaphoric definites are often further marked with an unstressed demonstrative adjective (*hād-*, *ha-*, etc.) as a means of increasing their discursive prominence (pp. 112–139). I see this common strategy as akin to other auxiliary strategies for marking particular referential meanings, and thus class these as a type of AD marker. The second qualification involves the presence of \**al-* in apparently indefinite contexts, which Brustad identifies as a common occurrence in Moroccan (e.g., *xəṣṣ-ni l-wəld* 'I need a son'; p. 36). I interpret this as evidence that the Moroccan reflex of \**al-* is distributed over a wider range of referential statuses in general (see Section 4.9).

There are few other holistic studies of definiteness in spoken Arabic. Turner (2018) is comparative and concerned exclusively with spoken Arabic, and employs the same descriptive model as the current paper to explore variability in spoken Arabic; the reader is encouraged to refer to it for additional data presented within the Reference Hierarchy framework. Fassi Fehri (2012, pp. 205–31) provides a more traditional syntactic view of determination in Arabic and Semitic at large, and includes some spoken Arabic data. Remaining studies that have relevance for the study of definiteness in Arabic can be divided into two types. The first are those that focus on single varieties, such as the focused and theoretically nuanced descriptions of definiteness in Moroccan Arabic, Palestinian Arabic, and Maltese by Caubet (1983), Belyayeva (1997), and Fabri (2001), respectively. The second type of relevant studies are those that examine a single form through a multifunctional semantic lens, and include in turn accounts of its articular functions; among these, Wilmsen's (2014) expansive account of *ši* across Arabic varieties and Leitner and Procházka's (Forthcoming) examination of *fard* in the dialects of Iraq and Khuzestan stand out.

#### **3. Points of Variation**

Following from Brustad's observations that structures not traditionally recognized as articles can, on a semantic and pragmatic level, be used to indicate particular referential meanings, a set of metrics for locating these in situ is useful. Even for forms that have been recognized as articles (whether definite or indefinite) in previous literature, a semantics-first view allows us to more specifically delineate the range of meanings that they cover. The aim of this section is, accordingly, to walk through each of the semantic statuses along the Reference Hierarchy, describe how each can be located by discursive context, and identify some points of variation in regard to how each is expressed formally across spoken Arabic varieties.

Because the goal of the section is simply to survey variation, it is more concerned with the fact that a strategy is attested at all than it is with that strategy's relative frequency of use. Nonetheless, as it is useful for comparative purposes (which follow in Section 4) to establish a baseline measure of how grammaticalized a given strategy is, I do also offer here initial readings of where each falls on a conceptual continuum that ranges between fully 'obligatory' and 'auxiliary'. Obligatory articles are easy to define: they are used by all speakers for all instances of the target meaning. Auxiliary articles can be understood as marked structures that are used by speakers for special emphasis. There is also an intermediate category of markers that are used so frequently with their corresponding meanings as not to be highly marked, but that are still not obligatory in all cases. I refer to these markers as 'conventionalized', a placeholder term used with an understanding that truly accurate frequency judgments will require more in-depth semantic study of individual varieties.

#### *3.1. Anaphoric Definites*

Anaphoric definites are easily located in extended discourse because they simply involve subsequent reference to an entity that has already been explicitly introduced. In the Hassaniya Arabic sentence given in (1), for example, the narrator introduces a certain *sba'* 'lion' as a new referent; when it re-occurs in the text, the referent *sba'* is now necessarily AD, and is accordingly marked with \**al-*:

(1) *ṛaṣṣaf 'lī-h sba'* ... *yṛaṣṣaf 'lī-h* s-sba' *w-ygūl-lu*

jump.PFV.3MSG upon-3MSG lion.PSI ... jump.IPFV.3MSG upon-3MSG AD-lion and-say.IPFV.3MSG-3MSG.DAT

'This lion jumped on him ... the lion is jumping on him and saying to him ...' (Heath 2003, p. 116)

That the article \**al-* is used here as the marker of anaphoric definiteness is not particularly surprising to anyone with knowledge of Arabic, formal or informal, and in most varieties it is indeed the sole obligatory marker of AD nouns. Nonetheless, the point of the example is to highlight contextual expectations. Importantly, when the same sort of discursive context is located elsewhere in the same variety, we find variation of the type noted by Brustad, namely in the auxiliary use of an unstressed demonstrative, as in (2):


While I am not aware of any survey beyond Brustad's, many Arabic varieties exhibit the demonstrative anaphoric reinforcement pattern in one way or another. Because demonstratives themselves vary widely in form but mirror each other semantically, it is not particularly useful to list all possible forms here (although see Magidow 2016 for a survey). On a typological level, it is also unsurprising that the demonstrative frequently plays this role, given that demonstratives are a frequent source of definite articles in world languages (De Mulder and Carlier 2011). Instead, what is more worthwhile to note in the Arabic case is the degree to which a variety has conventionalized the demonstrative as an AD marker, at which point it might be said to be an article of its own. At least some Levantine dialects appear to meet this description, as is evident in the use of *hal*- (etymologically *hā* + *il-*) in (3), from Baskinta (Lebanon):


Just how widespread this pattern is in the Levant warrants further study,<sup>4</sup> but for the present purpose it is enough to point out that a dialect that could be shown to *obligatorily* mark AD nouns with a certain structure, but not ND ones, would be typologically distinct from most other varieties at present, and worthy of recognition as such. This phenomenon is also attested in Nubi, an Arabic-based creole, wherein a postposed demonstrative reflex *'de* accompanies AD nouns. The major difference is that, in Nubi, the Arabic article \**al-* has been lost entirely:


All of these strategies, of course, are overt, and they all incorporate either \**al-* or a demonstrative (or a combination of both). The one major exception for AD nouns is the Central Asian cluster of dialects spoken in Uzbekistan (near Bokhara) and northern Afghanistan (near Balkh), which Ingham (2003) has suggested are branches of the same historical group (see also Seeger 2013). These varieties neither have a reflex of \**al-* nor have any obligatory compensatory strategy when nouns are AD, as in (5):


That said, even these varieties use demonstratives anaphorically, as in *duk zaġīr* 'the child [previously mentioned]' (Ingham 2003, p. 33), so they do have at least an auxiliary means of overtly marking AD statuses. In this sense, the Central Asian group shares a typological feature with the larger dialect landscape, even if it is missing a 'core' Arabic feature in its lack of \**al-*.

#### *3.2. Nonanaphoric Definites*

Nonanaphoric definites are uniquely identifiable to both the speaker and listener via world knowledge, and they can be distinguished from AD nouns in extended discourse in that they have not previously been introduced. Common nouns that are ND in most circumstances include 'the sun,' 'the world', 'the country', 'the king', and any other for which there is only likely to be one possible interpretation on the part of the listener, despite being new to the discourse; as such, they are relatively easy to locate. This semantic status shows the least variation from dialect to dialect, and is most often represented by \**al-* to the exclusion of all other strategies (including demonstrative reinforcement). A typical example is in (6), from the Jazira area of Sudan, where 'the mayor' is unique and identifiable as the mayor of the implied town in the narrative despite only being mentioned for the first time:


The primary exception to this pattern is, predictably, varieties that have lost the article \**al-*; in such cases, the ND noun is unmarked. The Afghanistan Arabic utterances in (7), for example, provide first mention of 'the queen' with no further modification. Similar unmarked patterns can be identified in Nubi, as in *'hari ta 'shems* 'the heat of the sun' (Wellens 2003, p. 67). It is worth noting that these varieties, like others, do not see auxiliary use of demonstratives for ND nouns, even though they allow them for AD nouns.


#### *3.3. Pragmatically Specific Indefinites*

Pragmatically specific indefinites can be identified in extended discourse as referents that are mentioned for the first time, and not accessible to the listener via world knowledge, but for which the speaker can thereafter be seen to provide specific information. Strategies for marking PSI nouns are the most varied and innovative, particularly if we are to adopt a wide view of what an article is, and many have been under-recognized to date. Most of the "indefinite articles" of the dialectological literature are, in fact, PSI articles, whether exclusively or in a polysemic distribution with the PNI status.

As is common in world languages (Heine 1997, pp. 66–83), a frequent source for PSI articles is a numeral, here \**wāḥid* 'one' or \**fard* 'one, an individual'. The former of these is best associated with Moroccan and western Algerian varieties, where a reflex of \**wāḥid* is typically obligatory for new, pragmatically salient referents of which the speaker has unique knowledge. Unique to this structure, however, is that \**wāḥid* accretes with \**al-*, yielding a sort of double-marked structure. Caubet (1983, p. 83) gives the Moroccan article as a fused *wāḥəd-əl*, which is a plausible reading in most cases, but I venture that the article *l*- itself might also be considered a PSI marker, especially as it can be syntactically detached from *wāḥəd* but still coincide with a clear PSI meaning, as in (8) from Anjra (Morocco):


The articular use of \**wāḥid* to mark PSI referents is also attested in eastern varieties of Hassaniya, as spoken in Mali, though here it is suffixed rather than prefixed, and is not obligatory. It has not been explicitly recognized as such, but is regularly apparent in contexts such as (9), recorded in Gao, where further specification of the noun *blad* 'place' makes it clear that the speaker has unique knowledge of it. A similar structure is documented in Nubi, e.g., *mas'kin 'wai* 'a certain poor man' (Wellens 2003, p. 64).


The article \**fard*, of similar semantic provenance, is widely recognized in the dialectological literature, where it is most often associated with Mesopotamian varieties. Blanc (1964, p. 118) locates this article in Baghdad, and describes phonological variants of it associated with particular sectarian groups, but gives limited semantic information, saying "its presence contrasts fairly clearly with that of the article /l/ or other determination marks, but the degree to which it contrasts with absence of any mark is yet to be determined." Recent work by Leitner and Procházka (Forthcoming) significantly expands on the functions of \**fard*, showing that it is a polyfunctional lexeme with multiple senses, one of which is to mark a noun that is "new for the hearer and important for the subsequent discourse." This quintessentially PSI sense for \**fard* is attested throughout Iraq and Khuzestan, as in (10), from Basra, where the speaker starts a story by introducing a particular *ṭālib* 'student':


Mion (2009) locates reflexes of \**fard* in other Arabic varieties, too, including those of Mardin and Tunis, but in most of these cases the reflex is less apparently referential and simply implies 'one, the same' (though potential for future reanalysis remains). Nonetheless, it is attested with a clear PSI meaning in Central Asian varieties, as in *fad mara* 'a [certain] woman' in (5), above.

These are the only structures regularly called 'articles' in the literature, to my knowledge, that meet the semantic parameters of PSI, but under the broad definition we can easily expand the field of extant PSI articles. The first sort of novel article is derived from the demonstrative adverb, but has the same pragmatic effect of indicating a referent that is identifiable to the speaker, but not the listener. Brustad offers this interpretation of *kida* in Cairene (*šuft ḥāga kida* 'I saw this thing ...'), a view that is supported by numerous examples in Woidich (2006, p. 236). The same function can also be located elsewhere in Egypt, as in (11), from Bani Swayf:


Furthermore, there is evidence for a parallel strategy in some Yemeni varieties, which use the demonstrative adverb *haka ¯ d ¯ aha ¯* (and similar; see Watson and 'Amri 1993, pp. 418–19) to the same effect; in (12), for example, the speaker introduces a *bug'ah* 'place' and immediately provides more information, a hallmark of a PSI noun:


Another structure that qualifies as a PSI marker on the basis of its semantic associations is the so-called 'dialectical *tanwīn*' (DT) of the dialectological literature. Even though the origins of this marker remain an object of debate, its functions are relatively similar across varieties. Stokes (2020, p. 637) summarizes DT as "the morpheme, typically realized as *in* or *an*, that is suffixed to a morphologically indefinite noun, primarily when followed by some type of adnominal adjective or clause". The fact that DT is restricted to indefinites is alone sufficient to establish that it has some relationship with the semantics of referentiality; in addition, that it typically precedes an adnominal element (which, on a pragmatic level, individuates the noun as distinct from others of its type) calls for a PSI or PNI reading of the resulting phrase. As such, it is not surprising that it can be located with nouns that clearly meet the parameters of a PSI referent, as in (13) from the Jezira (Sudan):


While DT can accordingly be read as a sort of PSI article, in most cases it is still syntactically conditioned, in that it depends on the presence of an adnominal attribute (regardless of the speaker's ability to uniquely identify the referent). There is nonetheless evidence that some varieties have moved toward fully semanticizing DT, as in Najdi, for which Ingham (1994, p. 50) gives examples such as *ligēt bēt-in* 'I've found a [certain] house'. It is also possible to locate varieties in which a reflex of DT (which only occurs in this sense) accretes with another PSI article such as \**wāḥid*, as can be seen in (14), from Tillo (Anatolia):


Finally, it is worth pointing out that in many varieties, underlying PSI referents are simply unmarked. Such nouns have the same underlying semantic properties, but are not overtly marked as such, either because a marker is unavailable or the speaker chooses not to use it. A typical example is in (15), from Baskinta (Lebanon).

(15) *'in-na* žār *ib-haḍ-ḍay'a b-iḥibb an-nawm* ...

have-1PL neighbor.PSI in-AD-village IND-love.IPFV.3MSG GEN-sleep

'We have this neighbor in the village who loves to sleep' (Abu-Haidar 1979, p. 141)

#### *3.4. Pragmatically Nonspecific Indefinites*

Pragmatically nonspecific indefinites are uniquely identifiable to neither the speaker nor the listener, but are conceived of by the speaker as being distinct from others of their type in the world at large. Though a speaker of a variety that overtly marks PNI nouns can signal them as such in any desired context, from an observer's perspective this semantic status is most easily located where the speaker speculates about the potential nature of a unique referent not yet located; as such, it is often the object of verbs such as 'find', 'obtain', and 'make'. The most easily identifiable PNI article is *ši*, which is conventionalized in Levantine (16) and Moroccan (17) and carries this sense exclusively when used as an article:<sup>5</sup>


The article \**fard*, described above as a conventionalized marker of PSI statuses, is also attested with a PNI meaning, making the form itself polysemous, as in (18), from Baghdad. The *bayt* 'house' in question here is semantically specific, but the speaker has not located it yet. Reflexes of \**fard* are used comparably in Central Asian varieties, as in *fad*- *ord ¯* 'some place' (Ingham 2003, p. 34).

(18) *də-ndawwir 'ala* fəd bayt *l-il-'iǧār āni w-zawiǧt-i*

ASP-search.1PL.IPFV for PNI house for GEN-rent 1SG and-wife-1SG.POSS

'I'm looking for some house or the other for me and my wife to rent' (McCarthy and Raffouli 1965, p. 17)

Exhibiting similar polysemy, if we are to read it as a type of article, is dialectical *tanwīn*, which can also indicate a PNI meaning. This is evident in (19), from the Jezira (Sudan), where the speaker has no particular *arnab* 'rabbit' in mind, but implies God might:


Beyond these articles, I am not aware of any other regularly occurring PNI markers, and most varieties simply leave PNI nouns unmarked, as in (20) from Sanaa. This is not to rule out that partitive-like structures, in particular, might sometimes bridge into this meaning; Sanaani itself does, for example, occasionally use a form *zārat* with plurals or as part of the SNI indefinite pronoun *zārat wāḥid* (Watson and 'Amri 2000, p. 114).


#### *3.5. Semantically Nonspecific Indefinites*

Semantically nonspecific indefinites are, by definition, interchangeable with any other entity of their type, and cannot be discursively prominent. As such, they are nearly always the object of a verb or preposition and not typically modified. Across Arabic varieties, SNI nouns are most commonly unmarked. The word *ḥbal* 'rope' in the Hassaniya example in (21) is typical:


As a general rule, articles that fulfill the PSI or PNI function are not used to indicate SNI entities, though pragmatic considerations may occasionally let PNI markers bridge into this meaning.<sup>6</sup> In the case of *tanwīn*, which is both semantically and syntactically conditioned, the fact that SNI nouns are unmodified means there is no syntactic impetus for it to appear with them, and I am not aware of any examples that show it being used alone with any sense other than the PSI one noted in Section 3.3.

The primary exception to the general tendency of Arabic varieties to leave SNI nouns unmarked is, perhaps unexpectedly, in varieties that instead mark them with \**al*-, at least in some circumstances. Moroccan is most notable for this, as in (22) and (23), where both the *tūr* 'bull' and *səlhām* 'cloak' are non-referential, being mentioned only once in passing:


That this pattern is attested and permissible is sufficient to call the view of *\*al*- as a *universal* definite article in Arabic into question.<sup>7</sup> That said, within Moroccan it is possible to find SNI nouns both with *\*al*- and with no marking at all. I have elsewhere argued that the marked pattern is more common with type-focused uses of SNI nouns and that the unmarked one is mostly reserved for delineating a specific quantity (Turner 2018, pp. 184–88). It is probably not prudent to call *\*al*- obligatory in this sense, but it is frequent.

#### **4. Systems in Comparison**

Taking the above data into account, it seems fair to say that there is a wide variety of strategies for expressing discrete definiteness values in Arabic dialects. This observation alone has implications for descriptive practice, as being aware of extant diversity within a linguistic group is always helpful in delineating which grammatical categories one should check for in fieldwork and comment on in publications. The greater promise of explicitly collecting such data, however, is that it opens the door for new comparative approaches. In this section, I provide provisional sketches of the overall arrangement of definiteness systems in a sample of ten Arabic varieties, in addition to the Nubi Arabic-based creole, allowing for side-by-side comparison, before moving into the final discussion of how we might use such characterizations for classification. The rough order of sketches here is from more simplex systems to more complex ones, as I estimate them to be.<sup>8</sup>

#### *4.1. Libyan*

Libyan Arabic dialects, including those spoken in the eastern Benghazi area (Elfitoury 1976; Owens 1984) and Tripoli further west (Grand'Henry 2000; Yoda 2005), show a very strict binary division between definite (AD and ND) nouns, marked with *(i)l*-, and indefinite (PSI, PNI, and SNI) nouns, which are invariably unmarked. A review of texts in Grand'Henry (2000) confirms this impression, and I am not able to locate any regular auxiliary strategies. Figure 3 gives the distribution of forms in Libyan.

**Figure 3.** Reference Hierarchy for Libyan.

#### *4.2. Egyptian*

Egyptian varieties show the same basic pattern of obligatorily marked definite (AD and ND) nouns, and Brustad (2000, p. 140) specifically notes "the absence of an anaphoric demonstrative article in Egyptian." Brustad's data are from Cairo, but texts from Behnstedt and Woidich (1988) show the same patterns elsewhere in Lower Egypt. Although Egyptian does not have any obligatory means for indicating indefinite meanings, its speakers do have the auxiliary marker *kida* for PSI referents (see Section 3.3). Figure 4 gives the distribution of forms in Egyptian, with the obligatory *il*- represented at top and the auxiliary *kida* at bottom.

$$
\overbrace{\text{AD}\quad\text{ND}}^{\text{il-}}\quad\underbrace{\text{PSI}}_{\text{-kida}}\quad\text{PNI}\quad\text{SNI}
$$

**Figure 4.** Reference Hierarchy for Egyptian.

#### *4.3. Kuwaiti*

Kuwaiti Arabic (Figure 5) also shows the formal distinction between true definites marked with *il*- and unmarked indefinites, but also allows for regular auxiliary marking of AD nouns with an unstressed anaphoric demonstrative *ha*- (Brustad 2000, pp. 120–21), which accretes with the definite article. Brustad does not identify any Kuwaiti structures that would express meanings in her 'indefinite-specific' range (i.e., PSI and PNI), and I am likewise unable to locate any in her texts.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ha+}}\quad\text{ND}}^{\text{il-}}\quad\text{PSI}\quad\text{PNI}\quad\text{SNI}
$$

**Figure 5.** Reference Hierarchy for Kuwaiti.
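The side-by-side comparison that these sketches are meant to support can be illustrated with a short Python sketch covering the three systems described so far. The forms and their grades follow the prose of Sections 4.1 to 4.3; the dictionary layout and the function names `differing_statuses` and `marked_statuses` are hypothetical choices made for this illustration, and the Libyan article, given above as *(i)l*-, is normalized to `il-` so that it compares as equal to its Egyptian and Kuwaiti counterparts.

```python
STATUSES = ["AD", "ND", "PSI", "PNI", "SNI"]

# Each system maps a status to {form: grade}; statuses left out are unmarked.
# This encoding is an assumption of the sketch, not part of the framework.
SYSTEMS = {
    "Libyan": {
        "AD": {"il-": "obligatory"},
        "ND": {"il-": "obligatory"},
    },
    "Egyptian": {
        "AD": {"il-": "obligatory"},
        "ND": {"il-": "obligatory"},
        "PSI": {"kida": "auxiliary"},
    },
    "Kuwaiti": {
        "AD": {"il-": "obligatory", "ha-": "auxiliary"},
        "ND": {"il-": "obligatory"},
    },
}


def differing_statuses(a, b):
    """Return the statuses at which two systems mark nouns differently."""
    return [s for s in STATUSES
            if SYSTEMS[a].get(s, {}) != SYSTEMS[b].get(s, {})]


def marked_statuses(name):
    """Return the statuses that receive any overt marking in a system."""
    return [s for s in STATUSES if SYSTEMS[name].get(s)]


pairs = [("Libyan", "Egyptian"), ("Libyan", "Kuwaiti"), ("Egyptian", "Kuwaiti")]
for a, b in pairs:
    print(a, "vs", b, "->", differing_statuses(a, b))
```

Counting differing statuses is only one possible measure; it treats a difference in an auxiliary marker the same as a difference in an obligatory one, which a fuller comparison would likely want to weight separately.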

#### *4.4. Hassaniya*

Hassaniya Arabic varieties are found across a wide expanse of western Africa; Cohen (1963) provides a description of the Hassaniya of southwestern Mauritania, Heath (2003) a collection of texts from further east in Mali, and Aguadé (1998) a brief overview of speech in southern Morocco. The last of these shows features more similar to Moroccan (below), so I do not consider it here. More western varieties (Figure 6), including those in Mauritania and Gao, show a relatively simplex distribution of forms that looks much like Kuwaiti, i.e., an obligatory definite marker *il*- and auxiliary marking of AD referents with a demonstrative *ḏāk* or *ḏīk* (inflected for gender). Malian varieties around Gao (Figure 7), however, exhibit additional complexity in that they have a relatively frequent PSI marker *wāḥīd* (see Section 3.3). Heath (2003, p. 8) asserts that "the grammar of Malian Hassaniya differs little from that of Mauritanian dialects," but the current framework does raise the question of whether grammatical marking of PSI nouns might be a useful metric for internal classification of Hassaniya.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ḏāk+}}\quad\text{ND}}^{\text{il-}}\quad\text{PSI}\quad\text{PNI}\quad\text{SNI}
$$

**Figure 6.** Reference Hierarchy for western Hassaniya varieties.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ḏīk+}}\quad\text{ND}}^{\text{il-}}\quad\underbrace{\text{PSI}}_{\text{-wāḥīd}}\quad\text{PNI}\quad\text{SNI}
$$

**Figure 7.** Reference Hierarchy for eastern Hassaniya varieties.

#### *4.5. Sanaani*

There is so much linguistic diversity in Yemen that I am hesitant to make broad pronouncements about "Yemeni," and thus base my judgements here only on Watson and 'Amri's (2000) texts from Sanaa. In them, Sanaani (Figure 8) can be seen to obligatorily mark AD and ND statuses together with *il*-, like other varieties above, and it also allows for auxiliary marking of AD with a preposed demonstrative *ḏayyik* (etc.).<sup>9</sup> In addition, Sanaani has an auxiliary strategy, described in Section 3.3, wherein PSI referents can be further differentiated with what is elsewhere a demonstrative adverb *hākaḏā(yā)*. This marker is similar in function to the Egyptian PSI marker *kida*.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ḏayyik+}}\ \text{ND}}^{\text{il-}}\ \underbrace{\text{PSI}}_{\text{-hākaḏā}}\ \text{PNI}\ \text{SNI}
$$

**Figure 8.** Reference Hierarchy for Sanaani.

#### *4.6. Levantine*

Levantine varieties again show the pattern of marking AD and ND nouns with *il*-, and allow for additional delineation of AD nouns with an unstressed demonstrative *ha*-, but differ from varieties above in that they have a conventionalized article *ši* that denotes PNI referents (see Section 3.4). As discussed in Section 3.1, varieties of the Levant also make particularly productive use of anaphoric *ha*-, some perhaps to the extent that the resulting fused marker *hal*- should be considered its own, exclusive marker of AD statuses. Figure 9 gives a more conservative interpretation of the distribution of forms in Levantine, and Figure 10 offers the secondary analysis.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ha+}}\ \text{ND}}^{\text{il-}}\ \text{PSI}\ \overbrace{\text{PNI}}^{\text{ši}}\ \text{SNI}
$$

**Figure 9.** Reference Hierarchy for Levantine varieties.

$$
\overbrace{\text{AD}}^{\text{hal-}}\ \overbrace{\text{ND}}^{\text{il-}}\ \text{PSI}\ \overbrace{\text{PNI}}^{\text{ši}}\ \text{SNI}
$$

**Figure 10.** Possible reading for some Levantine varieties.

#### *4.7. Iraqi*

Arabic varieties in Iraq (Figure 11) have been described as having an indefinite *\*fard* (Blanc 1964, p. 118), and Leitner and Procházka's (Forthcoming) focused semantic analysis supports the notion that this polyfunctional lexeme acts as a conventionalized PSI/PNI article in most Iraqi dialects (see Sections 3.3 and 3.4). Texts in Iraqi varieties also regularly show the use of demonstrative *ha*- as an auxiliary AD marker alongside the obligatory definite marker *il*-, as is common elsewhere.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ha+}}\ \text{ND}}^{\text{il-}}\ \overbrace{\text{PSI}\ \text{PNI}}^{\text{fard-}}\ \text{SNI}
$$

**Figure 11.** Reference Hierarchy for Iraqi varieties.

#### *4.8. Najdi*

The expression of definiteness in Najdi Arabic (Figure 12), as described in Ingham (1994), somewhat parallels the formal distribution given for Iraqi above. For AD and ND nouns, *il*- is the obligatory article, with auxiliary marking of AD nouns possible with *ha*-. Because Najdi is a dialect with so-called dialectal *tanwīn* (DT), PSI and PNI nouns that are adnominally modified with adjectives, relative clauses, or prepositional phrases obligatorily take the marker -*in*. There is also evidence, described in Section 3.3, that at least some Najdi speakers can use DT on a purely semantic basis, i.e., without the noun being followed by any sort of modifier.

$$
\overbrace{\underbrace{\text{AD}}_{\text{ha+}}\ \text{ND}}^{\text{il-}}\ \overbrace{\text{PSI}\ \text{PNI}}^{\text{-in}}\ \text{SNI}
$$

**Figure 12.** Reference Hierarchy for Najdi varieties.

#### *4.9. Moroccan*

Moroccan varieties (Figure 13) represent a relatively complex case, the main complications of which are that (1) the article *l*- is not restricted to definite (AD and ND) nouns and (2) both PSI and PNI meanings are uniquely distinguished with overt, highly conventionalized articles. While the reflex of *\*al*- in all the above varieties is so restricted and can thus truly be considered a definite article, in Moroccan it is conventionally extended to PSI referents (see Section 3.3) and is frequently used with SNI nouns as well (see Section 3.5). For PSI nouns, *l*- accretes with an article *wāḥəd*, which is similar in function to the optional article found in eastern Hassaniya (Section 4.4); meanwhile, for PNI nouns, an article *ši*, identical in form and meaning to that attested in the Levant (Section 4.6), is used. Moroccan also allows for auxiliary indication of AD nouns with the proximal and distal anaphoric demonstratives *hād*- and *dāk*-, the former of which is uninflected.

**Figure 13.** Reference Hierarchy for Moroccan varieties.

#### *4.10. Central Asian*

Central Asian varieties combine known strategies from elsewhere in Arabic with the unique feature of not having a reflex of *\*al-*; among others, this latter feature has probably played a role in these varieties being characterized as "metatypized" (Ratcliffe 2005), particularly given that other nearby languages also lack true definite articles. There is evidence that Central Asian Arabic varieties can, like many others, use unstressed demonstratives for anaphoric (AD) reference (see Section 3.1). In addition, these dialects also show a reflex of *\*fard* that has the same PSI/PNI semantic scope as *\*fard* in Iraqi varieties (Section 4.7). Central Asian also shows its own reflex of dialectal *tanwīn*, which sees the same syntactic conditioning as elsewhere (i.e., before adnominals), but has a wider semantic range because it can also occur with true definites.<sup>10</sup> It is not attested with SNI nouns, but considering these are unlikely to be adnominally modified in the first place (see Section 3.5), it would not be unreasonable to say that DT in Central Asian has fully lost its referential dimensions, and can be envisioned purely as a syntactic linker, hence the question mark in Figure 14.

$$
\underbrace{\underbrace{\text{AD}}_{\text{DEM+}}\ \text{ND}\ \overbrace{\text{PSI}\ \text{PNI}}^{\text{*fard}}}_{\text{DT}}\ \underbrace{\text{SNI}}_{\text{?}}
$$

**Figure 14.** Reference Hierarchy for Central Asian varieties.

#### *4.11. Nubi*

Finally, while Wellens (2003), among others, has classified Nubi (Figure 15) as an Arabic-lexifier creole rather than a "true" Arabic variety, it is worthwhile to consider points of overlap with the above varieties in its expression of definiteness. Like Central Asian, Nubi has lost the article *\*al*-, differentiating it from the greater body of Arabic; nonetheless, also like Central Asian, the markers it does use have commonality with strategies attested in Arabic at large. The "definite article" '*de* that Wellens identifies is, in my reading, primarily an AD article, and shares semantic scope with the many other demonstrative forms that mark anaphoric definiteness in Arabic dialects. In addition, the apparently polysemic PSI/PNI article '*wai* has clear parallels with the postposed use of *wāḥid* in Hassaniya (Section 4.4).

**Figure 15.** Reference Hierarchy for Nubi.

#### **5. Definiteness and Classification**

In theory, if the definiteness systems of Arabic dialects can be modeled, they should be relatively easy to classify. In practice, various complications arise that mean any attempt at classification will necessarily be subject to caveats and in need of ongoing refinement. As indicated more than once above, some of the systems themselves need more focused study to confirm how fully applicable the provisional models I have provided are to the dialect group as a whole. Scholars of Levantine Arabic, for example, face an open question as to just how close the unstressed anaphoric demonstrative complex *hal*- has come to acting as an obligatory article; similarly, scholars of Moroccan and Iraqi dialects may be able to further quantify uses of their respective indefinite articles in the same way by looking at them through a primarily semantic lens.

A related question is the concept of 'obligatory' vs. 'auxiliary', which I have attempted to frame here as a sort of continuum, the intermediate range of which might be described as 'conventionalized'. For the purpose of grouping and classification, it seems that obligatory articles (those that are required when a speaker wants to denote a particular referential meaning) should take priority, as they represent a sort of linguistic consensus on the part of the speaker community that is not present for other markers. Nonetheless, it is not always immediately clear what 'obligatory' means. It seems unwise to treat it as an absolute notion that a single contrasting token would disqualify, especially when diglossic practices allow speakers to switch between registers (and their respective definiteness systems) at will. Instead, it seems more reasonable to look at the preponderance of the evidence: what forms *most often* arise in everyday conversation between native speakers of the variety in question? I suggest that these highly conventionalized strategies should also be prioritized for the purposes of classification.

This is not to say, either, that less frequent auxiliary strategies have no value, else I would not have included them here. To the contrary, it does appear worthwhile to point out that a majority of Arabic varieties optionally use unstressed demonstratives for anaphoric definite meanings, and that both varieties that do not (such as Egyptian) and varieties that oblige them (such as some in the Levant) are the outliers. It does seem relevant to note that not just one, but at least two, Arabic varieties (Egyptian and Sanaani) show the same typological pattern of co-opting a demonstrative adverb as a marker of specific indefinites, even if these are not required or even all that frequently used, statistically speaking, to express that meaning. Most importantly, although these are synchronic patterns, all fully crystallized innovations were presumably in flux at one time, so for the historical record alone it is worth noting that such strategies exist.

With these qualifications in mind, then, we can approach the question of classification more directly. I propose that there are two primary methodologies for grouping dialects when looking at a set of interrelated semantic features, as is the case with definiteness. The first is a 'single-tier' approach, meaning we simply limit our view to a particular type of meaning within the Reference Hierarchy, survey the forms that are attested for it, and order them into groups. This approach is not particularly distinct from the survey I provided in Section 3, and can be useful as a starting point for hypotheses, especially because it is suitable for identifying outliers. The Central Asian group, for example, clearly stands out in that it does *not* obligatorily mark definite (AD/ND) nouns (see Sections 3.1 and 3.2), and Moroccan clearly stands out in that it *can* mark full indefinite (SNI) nouns (see Section 3.5). Nonetheless, while this approach might be initially useful for looking beyond forms and toward semantic function (e.g., for noting that *ši* and *\*fard* have at least partial semantic overlap), it is not particularly useful for comparing systems as a whole.

Instead, I offer that a preferable approach is to look at the distribution of forms holistically, in what might be called a 'multi-tier' approach. It is still necessary, of course, that we prioritize some features over others as a means of subgrouping, but as a general principle I hold that each primary subgroup should be selected to describe as many varieties as possible while whittling away the outliers. One possible schema, based on the comparative systems given in Section 4 (minus Nubi) and taking into account the above points about obligatory and conventionalized forms, is as follows:

- 1. Strict formal distinction between definites and indefinites . . .
	- a. No highly conventionalized marking of indefinites . . .
		- i. No attested auxiliary strategies: **Libyan, Kuwaiti**
		- ii. Attested auxiliary strategies: **Egyptian, Hassaniya, Sanaani**
	- b. Highly conventionalized marking of some indefinites . . .
		- i. Marking syntactically determined: *tanwīn* **dialects; Najdi**
		- ii. Marking semantically or pragmatically determined . . .
			- 1. Single marker for specific (PSI) and existential (PNI) indefinites: **Iraqi**
			- 2. Marker for existential (PNI) indefinites only: **Levantine**
- 2. No strict formal distinction between definites and indefinites . . .
	- a. Marked definites: **Moroccan**
	- b. Unmarked definites: **Central Asian**
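
The schema above can also be read as a decision procedure, which the following minimal Python sketch tries to make explicit. It is an illustration only, not part of the original analysis: the class name `DefinitenessSystem`, its field names, the leaf labels, and the feature values in `SYSTEMS` are all assumptions, with the values read off the descriptions in Section 4 (for example, the PSI/PNI scope of Iraqi *\*fard* and the PNI scope of Levantine *ši*).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DefinitenessSystem:
    """Hypothetical feature record for one dialect group (all names are assumptions)."""
    al_obligatory_on_definites: bool  # reflex of *al- obligatorily marks AD/ND nouns
    al_extends_to_indefinites: bool   # reflex of *al- is also used with indefinites
    indefinite_conditioning: Optional[str] = None  # None, "syntactic", or "semantic"
    indefinite_scope: frozenset = frozenset()      # statuses covered by the conventionalized marker
    auxiliary_indefinite: bool = False             # optional, non-conventionalized strategy attested


def classify(s: DefinitenessSystem) -> str:
    """Walk the multi-tier schema from its broadest split down to a leaf label."""
    # Tier 1: is the reflex of *al- an obligatory and exclusive marker of definites?
    if not s.al_obligatory_on_definites:
        return "non-strict: unmarked definites"
    if s.al_extends_to_indefinites:
        return "non-strict: marked definites"
    # Tier 2: is any indefinite marker highly conventionalized?
    if s.indefinite_conditioning is None:
        if s.auxiliary_indefinite:
            return "strict: no conventionalized marking, auxiliary strategies attested"
        return "strict: no conventionalized marking, no auxiliary strategies attested"
    # Tier 3: what conditions the conventionalized marker?
    if s.indefinite_conditioning == "syntactic":
        return "strict: conventionalized marking, syntactically determined"
    # Tier 4: semantic scope of a semantically conditioned marker.
    if s.indefinite_scope == frozenset({"PSI", "PNI"}):
        return "strict: conventionalized marking, single PSI/PNI marker"
    if s.indefinite_scope == frozenset({"PNI"}):
        return "strict: conventionalized marking, PNI marker only"
    raise ValueError("scope not covered by the schema")


BOTH = frozenset({"PSI", "PNI"})

# Assumed feature values, following the descriptions in Section 4.
SYSTEMS = {
    "Libyan": DefinitenessSystem(True, False),
    "Kuwaiti": DefinitenessSystem(True, False),
    "Egyptian": DefinitenessSystem(True, False, auxiliary_indefinite=True),
    "Hassaniya": DefinitenessSystem(True, False, auxiliary_indefinite=True),
    "Sanaani": DefinitenessSystem(True, False, auxiliary_indefinite=True),
    "Najdi": DefinitenessSystem(True, False, "syntactic", BOTH),
    "Iraqi": DefinitenessSystem(True, False, "semantic", BOTH),
    "Levantine": DefinitenessSystem(True, False, "semantic", frozenset({"PNI"})),
    "Moroccan": DefinitenessSystem(True, True, "semantic", BOTH),
    "Central Asian": DefinitenessSystem(False, False, "semantic", BOTH),
}


def group_dialects(systems: dict) -> dict:
    """Group dialect names by the leaf label that classify() assigns to them."""
    groups: dict = {}
    for name, system in systems.items():
        groups.setdefault(classify(system), []).append(name)
    return groups


if __name__ == "__main__":
    for label, names in group_dialects(SYSTEMS).items():
        print(f"{label}: {', '.join(names)}")
```

Ordering the tests differently (for instance, asking about indefinite marking before asking about the status of *\*al*-) would yield a different tree from the same feature values, which is one way to see that the grouping depends on how the metrics are prioritized.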

There are admittedly other ways in which this same set of metrics could be ordered, and the varieties in question consequently be grouped, but this one has a few advantages. The first is that the present classification does give some credence to traditionalist views of Arabic as having a normative system where *\*al*- is a "definite article," while leaving room for exceptions and, at the same time, expanding the profile of what a "normative" dialect is by showing that a majority of these do have at least some means of marking indefinite referents, a pattern that stretches from the Atlantic to the Gulf. A second advantage is that the classification serves to group together varieties that might not necessarily share features, but which do share basic semantic patterns, in turn opening the door for diachronic questions, especially when these varieties are geographically distant from each other. I do not mean to imply by this a hitherto undiscovered genetic relationship between Moroccan and Central Asian varieties, but I do mean to point out that both groups have seen the strict categorical distinction between definites and indefinites unravel, and they are both at the far ends of the Arabic-speaking world.

Interpreted this way, the definiteness data align most closely with a 'core-periphery' classification model, in that a strict formal distinction between definites and indefinites is maintained across a large, contiguous cultural area and frays only at its edges. Within the core area, there is frequent variation in the particular means of marking referential indefiniteness, and somewhat of a northern–southern split as one moves from the unmarked or optional marking strategies of Egypt, Yemen, and the Gulf to the more conventionalized strategies of the Levant and Mesopotamia, but the strict and exclusive association of *\*al*- with definiteness goes unchallenged. Meanwhile, on the geographic fringes of this core, dialects break away typologically by either (1) extending *\*al-* to indefinite meanings or (2) detaching it from definite meanings.<sup>11</sup> The concept of peripheral dialects has been explored in volumes such as Owens (2000) and Anghelescu and Grigore (2007), and even though such varieties are as often defined by what they are not as by what they have in common, the addition of definiteness as a metric does at least support the idea of the 'core' against which they are defined as a viable linguistic entity.

Other classification proposals do not align as well with a scheme based on definiteness systems. The oft-proposed east–west division of dialects (see Palva 2006) is not easily evident here, especially given that the minimal expression of indefiniteness in Hassaniya varieties falls into the same general pattern as dialects much further east, including those of Egypt, Yemen, and Kuwait. The bedouin–sedentary division (again see Palva 2006) is tenable only on the basis of the *tanwīn* feature, which is largely limited to bedouin-type varieties and is unique among indefinite markers in that it is conditioned by syntactic factors in addition to semantic ones. Nonetheless, in a purely typological sense, the presence of a conventionalized indefinite marker actually places DT-expressive bedouin varieties such as Najdi closer to the indefinite-marking sedentary dialects of the Levant and Mesopotamia than it does to other bedouin varieties that lack it, such as western Hassaniya or Kuwaiti. Finally, one may consider whether, within the sedentary dialects, an urban-rural division is relevant; this too seems unlikely, given that the systems found in a given geographic region *do* tend to be contiguous across urban and rural areas. The Levantine PNI article *ši*, for example, is used by speakers both in Beirut and in small mountain villages, in the same way that the Moroccan PSI article *wāḥəd* is found both in the old cities and in the rural countryside.

In summary, the system-level configuration of definiteness marking does ultimately seem to be an areal pattern, and even minor differences between systems might consequently be useful for further subdividing clusters of geographically adjacent dialects. This possibility has already been raised for eastern vs. western Hassaniya (Section 4.4), as well as Levantine (Section 4.6) varieties. I also offer the observation that somewhere between central Algeria and Tunisia, dialects see an abrupt shift from complex, Moroccan-like systems (Section 4.9) to simplex, Libyan-like systems (Section 4.1). Precisely where these lines may lie—and why—is a question for future studies to address. Many of the systems in question seem to be the product of innovation, whether via semantic extension or leveling, and whether prompted by contact or otherwise. As it seems reasonable to expect that groups that innovate together, along the same timeline and to the exclusion of nearby groups, are indeed more likely to share history and social ties, further studies on definiteness and referentiality in spoken Arabic will be of value to the larger project of dialect classification.

#### **6. Conclusions**

In this paper I have outlined the process of building a novel classification scheme for Arabic dialects, using semantic typology as a metric for grouping rather than relying on the presence of forms alone. Taking definiteness as a case study, I discussed a selection of possible models, and adopted Dryer's (2014) 'Reference Hierarchy' as the most suitable of these for the task of envisioning definiteness systems in Arabic. I thereafter showed that, for expression of each semantic status along the Reference Hierarchy, the dialectological literature attests multiple strategies across the Arabic-speaking world. This variability can be made more useful for classification by modeling the semantic distribution of forms for discrete dialects holistically and then placing those models side by side, in turn allowing us to look past the forms themselves and instead class the dialects by shared typological characteristics. Key metrics that emerge are whether varieties maintain a strict formal delineation between true definites and indefinites, whether they overtly distinguish referential indefinites, and whether the latter is subject to syntactic conditions beyond the semantic ones. This particular classification approach does not align well with some traditional proposals, such as an east–west or bedouin–sedentary split, but it does lend some credence to the idea of a 'core' dialect area that contrasts with a 'periphery'.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** No new data were created or analyzed in this study. Data sharing is not applicable to this article.

**Acknowledgments:** I would like to acknowledge Kristen Brustad, Mahmoud Al-Batal, Pattie Epps, and Cinzia Russi, all of whom served on the committee for the dissertation in which many of these ideas were developed. I would also like to thank the two anonymous reviewers for their valuable insights and suggestions.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Notes**


#### **References**

Abu-Haidar, Farida. 1979. *A Study of the Spoken Arabic of Baskinta*. Leiden and London: E. J. Brill.


Vicente, Ángeles. 2000. *El Dialecto Árabe de Anjra (Norte de Marruecos)*. Zaragoza: Universidad de Zaragoza.

Watson, Janet C. E., and 'Abd al-Salām 'Amri. 1993. *A Syntax of Ṣan'ānī Arabic*. Leipzig: Otto Harrassowitz Verlag.

Watson, Janet C. E., and 'Abd al-Salām 'Amri. 2000. *Waṣf Ṣan'ā: Texts in Ṣan'ānī Arabic*. Wiesbaden: Harrassowitz.

Wellens, Inneke Hilda Werner. 2003. An Arabic Creole in Africa: The Nubi Language of Uganda. Ph.D. dissertation, Radboud University, Nijmegen, The Netherlands.

Wilmsen, David. 2014. *Arabic Indefinites, Interrogatives, and Negators: A Linguistic History of Western Dialects*. Oxford: Oxford University Press.

Woidich, Manfred. 2006. *Das Kairenisch-Arabische. Eine Grammatik*. Leipzig: Otto Harrassowitz Verlag.

Yoda, Sumikazu. 2005. *The Arabic Dialect of the Jews in Tripoli (Libya): Grammar, Text and Glossary*. Leipzig: Otto Harrassowitz Verlag, vol. 35.

## *Article* **Interrogating the Egypto-Sudanic Arabic Connection**

**Thomas A. Leddy-Cecere**

Bennington College, Bennington, VT 05201, USA; thomasleddycecere@bennington.edu

**Abstract:** The Arabic dialectology literature repeatedly asserts the existence of a macro-level classificatory relationship binding the Arabic speech varieties of the combined Egypto-Sudanic area. This proposal, though oft-encountered, has not previously been formulated in reference to extensive linguistic criteria, but is instead framed primarily on the nonlinguistic premise of historical demographic and genealogical relationships joining the Arabic-speaking communities of the region. The present contribution provides a linguistically based evaluation of this proposed dialectal grouping, to assess whether the postulated dialectal unity is meaningfully borne out by available language data. Isoglosses from the domains of segmental phonology, phonological processes, pronominal morphology, verbal inflection, and syntax are analyzed across six dialects representing Arabic speech in the region. These are shown to offer minimal support for a unified Egypto-Sudanic dialect classification, but instead to indicate a significant north–south differentiation within the sample—a finding further qualified via application of the novel method of Historical Glottometry developed by François and Kalyan. The investigation concludes with reflection on the implications of these results on the understandings of the correspondence between linguistic and human genealogical relationships in the history of Arabic and in dialectological practice more broadly.

**Keywords:** dialect classification; subgrouping; Sudanic Arabic; Egyptian Arabic

**Citation:** Leddy-Cecere, Thomas A. 2021. Interrogating the Egypto-Sudanic Arabic Connection. *Languages* 6: 123. https://doi.org/10.3390/languages6030123

Academic Editors: Roberta Morano and Simone Bettega

Received: 10 June 2021 Accepted: 16 July 2021 Published: 23 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

This investigation intends a twofold contribution to the advancement of Arabic dialect classification. In the finer grain, I present an empirical evaluation of the frequently asserted macro-level classificatory grouping comprising the Arabic dialects of Egypt and those of the greater Sudanic region (Fischer and Jastrow 1980; Kaye and Rosenhouse 1997; Dickins 2011; Versteegh 2014). At a broader scale, I seek to interrogate the principal theoretical premise invoked in support of this proposed classification: that the shared (human) genealogical history of speech communities constitutes a reliable a priori basis for the classification of those communities' dialects in terms of diachronic relatedness and/or synchronic similarity. While the former priority will primarily engage specialists in Arabic and related languages, it is hoped that the latter will provide reflection pertinent both within and beyond the Arabist sphere, and initiate mutually informative conversations with colleagues of diverse foci, perspectives and expertise.

Dialectological description of Arabic dialects spoken in the Egyptian and Sudanic areas is remarkable for its unevenness. This broad, contiguous zone extends from the Mediterranean in the north to the Sudan–South Sudan border region in the south, and from the Red Sea westward to the Libyan Desert and, further, the vicinity of Lake Chad in Central Africa, with the north–south stretch of the Nile Valley constituting an organizing "spine" and center of geographic and demographic gravity. Arabic varieties spoken in this region are utilized by a combined speaker population well in excess of 100 million (Eberhard et al. 2021). Knowledge of dialect diversity in the Egyptian portion of the zone has benefitted immensely from the achievement of Behnstedt and Woidich's (1985–1999) multivolume dialect atlas, text collection and glossary, and analysis of the dialect of Cairo has been particularly thorough (esp. Woidich 2006b). In comparison, Fischer and Jastrow could write of the vast Sudanic Arabophone territory as late as 1980 that "[w]ir haben zwar aus diesem Raum eine Anzahl Texte und einige Lehrbücher sowie Vokabulare, aber nicht eine einzige halbwegs moderne grammatische Monographie" [We have from this area a number of texts and some textbooks, as well as vocabularies, but not a single halfway modern grammatical monograph] (Fischer and Jastrow 1980, p. 31). The state of scholarship has improved somewhat since, with the publication of two key book-length treatments of varieties local to the east (Reichmuth 1983) and far west (Owens 1993a) of the Sudanic region alongside thematic analyses of structural phenomena in, respectively, urban and semi-nomadic lects of Sudan's center (Dickins 2007a, 2009, 2010) and west (Manfredi 2014, 2018). Even so, the differential in scholarly attention to the Egyptian and Sudanic dialect areas remains severe, and the two, respectively, contain some of the best and least described speech varieties of modern Arabic.

The relevance of this imbalance is heightened when taken in combination with the fact that dialects of the combined Egypto-Sudanic zone are commonly associated with one another in discussions of Arabic dialect classification and subgrouping, frequently culminating in their collective classification as an identifiable dialectological unit superordinate to more localized groups. Illustrative articulations of this view are, among others, Kaye and Rosenhouse's assertion that "[a]s a whole, Sudanese dialects, at least those in the north, form one macro-grouping with the Egyptian dialects" (Kaye and Rosenhouse 1997, p. 265), and Fischer and Jastrow's positioning of the dialects of central and eastern Sudan as "[d]ie südliche Fortsetzung der oberägyptischen Dialekte" [the southern continuation of the Upper Egyptian dialects] (Fischer and Jastrow 1980, p. 31). To some extent, this lumping may stem from the shared failure of a number of varieties in both the Egyptian and Sudanic areas to clearly align with either of two primary classificatory dichotomies espoused by Arabic dialectologists, the Bedouin vs. sedentary split and the Eastern Arabic vs. Western Arabic split (cf. Palva 2006): Fischer and Jastrow, for instance, describe the collected dialects of Egypt and the Sudan as taking "eine Sonderstellung zwischen denen des Ostens und des Maghrib" [a special position between those of the East and those of the Maghrib] (Fischer and Jastrow 1980, p. 29). Such negative characterizations, however, framed on these dialects' incongruity with external typologies, do little to positively establish dialectal unity within the Egypto-Sudanic region. In this regard, analysts like the latter authors instead place particular emphasis on the identification of Egypt as the primary source for the historic in-migration of Arabic speakers to the greater Sudan (Fischer and Jastrow 1980, p. 22).
It is this second criterion—the putative common genealogical history of the Egyptian and Sudanic Arabophone speech communities—which has most frequently and most prominently featured as the anchoring factor of proposed classificatory relationships between Egyptian and Sudanic Arabic dialects.

Present in the influential early work of Kaye (1976), reliance on genealogical connection persists as the dominant narrative of more recent scholarship linking Egyptian and Sudanic Arabic. This reasoning is encapsulated in Dickins' position that, "[r]eflecting the fact that the major penetration route of Arabic speakers was from Upper Egypt, through Nubia into Central Sudan, CUSA [Central Urban Sudanese Arabic] is more closely related to Egyptian Arabic—and particularly the Ṣaʿīdī [Upper Egyptian] dialects, than any other non-Sudanese dialects" (Dickins 2011, p. 936). Likewise, Versteegh, in describing varieties of the combined Egypto-Sudanic area under the heading "Egyptian dialects," frames his account with the assertion that "[f]rom Egypt, the Arabic language was brought along the Nile to the South, into Sudan and Chad" (Versteegh 2014, p. 205). Certainly, the correlation of linguistic isoglosses to paths of migration and human movement remains a venerable and valuable practice in the Arabist tradition (Behnstedt and Woidich 2005; Palva 2006) and dialectology more generally (Chambers and Trudgill 2004; Britain 2016). In the Egypto-Sudanic case, however, the practice has not precisely been realized. Likely connected to the comparative lack of reliable dialectological description in the Sudanic portion of the area, observations of genealogical links between Egyptian and Sudanic speech communities have most often been proffered in place of a detailed accounting of shared linguistic features, rather than alongside one, thus positioning common genealogy as a direct *indicator* of dialectal classificatory relationship, not an *explanans* to be utilized in the interpretation of a relationship separately established on linguistic grounds. On the whole, it is observed that macro-level co-classifications of Egyptian and Sudanic dialects have tended to proceed from the extra-linguistically founded premise of shared population origin to treat the collective body of Sudanic Arabic, definitionally, as "originally a dialect of an Egyptian dialect of Arabic" (Kaye 1976, p. 177), and to subsequently adduce linguistic evidence of this (linguistic) relationship only in a secondary, corroborating fashion (if at all).

Gratefully, a small number of exceptions to this general pattern are to be found. Owens (1993b) undertakes a thorough investigation of the dialectological relationships of Nigerian Arabic, which he situates within concentric spheres of affiliation incorporating, successively, other West Sudanic varieties, Sudanic Arabic writ large, and (primarily Upper) Egypt. In a further (2003) work, the same author presents a focused and convincingly argued account of the migratory dispersal of a particular feature, inflection of the first person imperfect, between specified subregions of the Egypto-Sudanic area. These contributions prove marked advancements in the understanding of dialectal inter-relationships within the region, and stand out for their reliance on concrete linguistic data. However, given their targeted framing and methodological emphasis on "patchwork" features which typify particular pairings/subsets of Egypto-Sudanic varieties but not the area as a whole (Owens 1993b, p. 158), these studies are not positioned to stand as full corrective or confirmation to the more broadly construed claims of macro-level classificatory unity so often advanced elsewhere in the literature. Approaching that task in more direct yet far more perfunctory fashion is Reichmuth, who in the introductory pages of his descriptive grammar of the East Sudanese dialect of the Šukriyya (Reichmuth 1983, pp. 24–29) sketches the extra-Sudanic incidence of several isoglosses characteristic of that variety as a baseline evaluation of its compatibility with the proposal of an Egypto-Sudanic subgroup, among other potential affiliates. Though he does identify a degree of isoglossic overlap between Šukriyya and Egyptian forms, he deems the comparison inconclusive and unable to demonstrate a direct taxonomic dependency. 
Valuable as Reichmuth's work may be in conception, the preliminary state of his evaluation and its conscious limitation to the focal point of the Šukriyya variety unfortunately constrain its usefulness as a linguistically anchored counterpoint to the genealogy-centered accounts of Egypto-Sudanic subgrouping that continue to dominate Arabist discourse.

It is in relation to this lacuna that I frame the present contribution: a linguistic investigation of the validity of the proposed linking of the Arabic dialects of the Egypto-Sudanic region as a macro-level classificatory unit, as has been prominently and repeatedly proposed in the Arabist literature on the nonlinguistic grounds of shared genealogical history. As described in detail in the following subsections, I shall present data from a selection of Sudanic and Egyptian varieties for analysis via both conventional and more innovative dialectological methods to determine whether their common classification as a macro-level dialect grouping is linguistically justified—or whether, in Reichmuth's words, "[s]o bleibt nur die Annahme gemeinsamer Ursprünge übrig" [all that remains is the assumption of common origins] (Reichmuth 1983, p. 29).

#### **2. Methods and Sources**

Consistent with the framing described just above, this investigation does not seek to re-litigate the historical basis of shared genealogies and migration paths that dialectologists and others have considered to bind Arabic speakers of the Egypto-Sudanic region. That these have their root in the first major demographic influx of Arabs westward into Egypt in the seventh century, thence southward into the Sudanic area—incipient as early as the tenth century, more saliently from the fourteenth onward (with prominent place given to tribal entities including the Juhayna and the Jaʿaliyyīn)—is largely accepted in the historical literature and has not substantively fluctuated over the previous century of scholarship (Holt and Daly 2011; and cf. MacMichael 1922). Though this consistency does not elevate the accepted narrative of events or its central tenets beyond any question or criticism (see, e.g., Spaulding 2000), it does make it likely that any meaningful revision of these understandings would need to be based in a specialist comprehension of historical demography and supported by the advent of novel or reinterpreted historical data—neither of which I claim here. Instead, the present inquiry, true to its conception, centers on the evaluation of the linguistic relationship purported to mirror these historical genealogical linkages and connect the region's speech varieties in a manner worthy of reflection in macro-level schemes of Arabic dialect classification.

To accomplish this, I compare dialectological data from a sampling of Arabic varieties local to the proposed Egypto-Sudanic dialect area, in order to establish the definition and incidence of isoglosses which might weigh for or against the identification of a regionwide dialectal unity. I have selected six dialects to serve as core sources of data for this inquiry: three from the Egyptian portion of the zone and three from the Sudanic. The choice of the latter, especially, is constrained on the basis of available descriptive material. Thus, I have opted for the two varieties of Sudanic Arabic most comprehensively documented via book-length descriptive grammars—the dialect of the Šukriyya of eastern Sudan's Buṭāna region, as described by Reichmuth (1983), and that of Arabic speakers living in northeastern Nigeria's Borno state, documented by Owens (1993a)—in addition to the dialect of Khartoum, here mainly reflecting the grammatical sketches of Dickins (2007b, 2011), as occasionally supplemented by material from Bergman (2002) and Hillelson (1935). Together, these exemplify the West Sudanic type (Nigerian) and both traditional (Šukriyya) and urban (Khartoum) speech forms of the core Sudanic area. The three varieties representing the Egyptian portion of the region comprise, from north to south, those of Cairo (Woidich 2006b), Qift (Nishio 1995) on the east bank of the Nile in Upper Egypt, and the il-Biʿeṛāt territory on the Nile's west bank opposite Luxor (Woidich 2006a). Drawn from a larger pool of available descriptive material, these Egyptian varieties have been selected to provide a focus on the Nile Valley, due to its centrality in existing discussions of dialectal interrelationship within the Egypto-Sudanic sphere (Owens 2003; Versteegh 2014). While the dialects of Qift and il-Biʿeṛāt are spoken quite near to one another in absolute terms, each is recognized as belonging to a distinct dialectal subregion of Upper Egyptian Arabic (cf. Behnstedt and Woidich 2018). This sampling of six varieties is not intended to be comprehensive, but rather sufficiently representative to establish the minimum viability of a proposed Egypto-Sudanic dialect classification—in the view that any isogloss with the potential to support a unified Egypto-Sudanic grouping should provide a detectable signal in *at least* this subset of six dialects, and that the artificial reduction in dialect diversity this (or any) sampling entails is more likely to overestimate the incidence of globally unifying features than to ignore them.

As to the nature of such potential features, this inquiry will address variation across the six dialects examined in the areas of phonology (segmental phonology and synchronic phonological processes), pronominal systems (personal, demonstrative, relative and interrogative), verbal inflectional morphology (agreement and tense-aspect-mood marking), and selected areas of syntax (negation, analytic possession, and demonstrative and interrogative word orders) in the attempt to identify shared features which might serve to join all or most of the six in support of a unified Egypto-Sudanic dialect grouping. The first three of these domains, and the features within them, have been chosen for (a) their consistently important roles in existing frameworks of Arabic dialect classification and (b) their attestation via comparable qualities of data across the individual dialect descriptions consulted. The fourth domain, that of syntax, is less commonly relied upon than these first three in general Arabic dialectological surveys,<sup>1</sup> but is included here due to its prominence in discussion of Egypto-Sudanic varieties, specifically (e.g., Versteegh 2014). Coverage within each domain will strive to be inclusive of all potentially relevant variables but will generally restrict discussion to features which typify two or more of the speech varieties under examination, favoring a focus on cross-dialectal commonality rather than individually defining features.

Once the data pertaining to each of these domains have been presented and appropriately described, the global results will be evaluated to determine their consistency with an Egypto-Sudanic classificatory unit proposed on the basis of shared genealogical history of the region's speech communities. Consistent with the migration-based narrative's inherent implication of diachronic linguistic relatedness, elements of these findings will also be assessed in the light of directly attested historical data, as well as comparative and internal reconstructive analyses. Following this, further insight will be derived via application of the novel model of Historical Glottometry developed by François and Kalyan (François 2014; Kalyan and François 2018), which will be shown to offer interpretatively relevant perspective on the complex data at hand. Following discussion of these points, I will reflect on their implications for direct reliance on shared genealogical history in the shaping of linguistic classificatory schemes—in the Egypto-Sudanic case, in Arabic at large, and, by extension, as a practice adopted by students and scholars of dialectology more generally.
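
The two core measures of Historical Glottometry can be illustrated with a short sketch. It assumes the definitions as they are usually summarized from François (2014) and Kalyan and François (2018): cohesiveness is the share of supporting innovations among supporting plus cross-cutting ones, and subgroupiness is the number of exclusively shared innovations weighted by cohesiveness. These definitions should be checked against the cited works. The language labels A to D, the innovations i1 to i5, and the function name `glottometry` are hypothetical placeholders introduced for illustration, not data or tooling from this study.

```python
def glottometry(subgroup, innovations):
    """Return glottometric counts and scores for one candidate subgroup.

    subgroup    -- iterable of language labels forming the candidate group
    innovations -- dict mapping an innovation label to the set of languages
                   that show it (its "scope")
    Assumed definitions (to be verified against the cited sources):
      supporting  = innovations whose scope includes every member of the group
      exclusive   = supporting innovations whose scope is exactly the group
      conflicting = innovations shared by some, but not all, members of the
                    group together with at least one language outside it
    """
    group = frozenset(subgroup)
    exclusive = supporting = conflicting = 0
    for scope in innovations.values():
        scope = frozenset(scope)
        if group <= scope:
            supporting += 1
            if scope == group:
                exclusive += 1
        elif scope & group and scope - group:
            conflicting += 1
        # Innovations confined to part of the group, or lying wholly
        # outside it, count neither for nor against the group.
    total = supporting + conflicting
    cohesiveness = supporting / total if total else 0.0
    return {
        "exclusive": exclusive,
        "supporting": supporting,
        "conflicting": conflicting,
        "cohesiveness": cohesiveness,
        "subgroupiness": exclusive * cohesiveness,
    }


# Hypothetical toy data: four languages and five innovations.
toy_innovations = {
    "i1": {"A", "B"},
    "i2": {"A", "B"},
    "i3": {"A", "B", "C"},
    "i4": {"B", "C"},
    "i5": {"A"},
}

result = glottometry({"A", "B"}, toy_innovations)
print(result)
```

In this toy case the group {A, B} is supported by three innovations, two of them exclusive, and contradicted by one cross-cutting innovation, which is the kind of overlapping pattern a strictly tree-shaped classification cannot represent.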

#### **3. Results**

The following subsections present results of the investigation of phonological, pronominal, verbal inflectional, and syntactic variables in the six dialects of the Egypto-Sudanic area currently under consideration. Unless otherwise specified in the text or via a table note, data for each dialect are derived from the descriptive source mentioned in association with that variety in Section 2, above. As relevant, the incidence of a given dialect feature in Arabic varieties spoken outside the immediate study area will also be noted.

#### *3.1. Phonology*

#### 3.1.1. Segmental Phonology

This section describes the variable realization of consonantal and vocalic segments in the six studied varieties. In terms of consonants, these variables comprise the reflexes of Old Arabic \*/g/ (<ج>), \*/q/, the interdental series \*/θ, ð, ð̣/, and \*/ṭ/ (a subscript dot indicating the phonemic feature of "emphasis", the phonetic quality of which has been variously described as pharyngealization, velarization, uvularization, or dorsalization; cf. discussion in Jongman et al. 2011). Vowels examined include reflexes of the Old Arabic diphthongs \*/ay, aw/ and short vowels \*/i, a, u/. Results are summarized in Table 1.


**Table 1.** Segmental Phonology.

<sup>1</sup> Bergman (2002). <sup>2</sup> Hillelson (1935). <sup>3</sup> Owens and Hassan (2009).

Following Behnstedt and Woidich (2018, pp. 69–70) in identifying the Old Arabic articulation of <ج> as [g], rather than the received [dʒ] of the Classical Arabic tradition, we may conservatively view the /g/ realization of \*/g/ in Cairo as a retention. Outside Cairo, more fronted realizations are evidenced. The palatal articulation /ɟ/ dominates in the core Sudanic region represented by the dialects of Khartoum and the Šukriyya, and is variably present in Nigeria and in Bʿēri Arabic, the southernmost of the three Egyptian varieties examined.<sup>2</sup> Realization as an alveopalatal affricate /dʒ/ is variably attested for Nigeria and Qift, and increasingly alveolar articulations /dʲ/ and /d/ are additionally observed in the Bʿēri and Qift varieties, respectively. None of these realizations, then, is ubiquitous. Palatal /ɟ/ is perhaps of high salience, given its comparative rarity outside this region (primarily also known from a limited number of varieties of the Arabian Peninsula—Ingham 1971; Zaborski 2007), but cannot be described as a typifying feature of the collected Egypto-Sudanic dialects as a whole, or even of a substantial majority.

Perhaps of greater potential in this sense is the voiced reflex /g/ of Old Arabic \*/q/, robustly characteristic of all varieties in the sample outside of Cairo. Typical of all six dialects inclusive of Cairo is the merger of the Old Arabic interdental series \*/θ, ð, ð̣/ with corresponding alveolar stops \*/t, d, ḍ/. Setting apart the three varieties of the Sudanic area, however, is an additional emphatic reflex /ḍ/ of \*/ð/, the conditioning of which vis-à-vis plain /d/ is not immediately clear, but which is clearly and consistently attested across all three varieties and must logically have preceded the more general merger of \*/ð/ > /d/. Neither the voicing of \*/q/ nor the fortition of the interdentals is unique to the Egypto-Sudanic region, as these features are pervasive throughout the modern Arabic-speaking world. The coincidence of the two is perhaps more noteworthy, breaking as it does from the oft-discussed Bedouin/sedentary dichotomy which associates voiced reflexes of \*/q/ with the preservation of interdentals and voiceless realizations with their loss. Taine-Cheikh (2000), however, illuminates in detail the more general co-occurrence of these two isoglosses across a wide, northeast African geographic zone stretching from western Libya to points in the Sinai Peninsula and easternmost Hijaz, thereby rendering the coincidence less unusual in the Egypto-Sudanic area's immediate geographic context.

Far scarcer, but not unknown, in broader dialectological light are glottalic/glottalized realizations of \*/ṭ/, comprising the Nigerian implosive /ɗ̣/ alongside the glottalized articulation /ṭˀ/ typical of Bʿēri and variably noted for Qift—all viewed similarly here for the conspicuous involvement of the glottis in the production of each (downward retraction of the glottis in /ɗ̣/, closure and release of the glottis in /ṭˀ/). Similar realizations are noted outside the Egypto-Sudanic area in some Moroccan varieties as well as in scattered locations in the Levant and southern Arabia (cf. Zeroual 2006); these remain minority forms cross-dialectally, however, and may therefore be indicative of a linkage between the three specific Egypto-Sudanic dialects that display them. These realizations do not, however, serve to typify Egypto-Sudanic varieties as a whole, and neither is their status as innovation or retention—a crucial distinction in this instance—immediately clear.

In terms of vocalism, we may note the ubiquitous monophthongization of inherited diphthongs \*/ay, aw/ to long mid vowels /ē, ō/. This feature links all six members of the present sample, though it does not meaningfully distinguish them from neighboring dialects to the east (Kaye and Rosenhouse 1997) or immediate west (Owens 1984). Retention of all three Old Arabic short vowels has been invoked as a more distinctive regional feature in the case of Egyptian varieties (Versteegh 2014), and this generalization bears out in the current sample for all dialects save that of Nigeria, in which reflexes of Old Arabic \*/i, u/ are largely noncontrastive. This consistency is noteworthy in the context of widespread merger of \*/a, i/ to the west of the Egypto-Sudanic area and of \*/i, u/ to its north and east. Retention of all three vowels is not unknown outside the area, however, being attested, for example, in Yemen and other portions of the Arabian Peninsula (Behnstedt and Woidich 2018), and the nature of the feature as a common inheritance rather than a shared innovation limits its utility in support of a diachronically oriented, migration-based model of dialect classification, as will be discussed in Section 4.1 below.
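
The distribution of the segmental features discussed in this subsection can be tabulated in a short sketch. The encoding below is a simplified reading of the prose above, not a reproduction of Table 1: variable attestations (for instance the palatal reflex in Nigeria and Bʿēri, or the glottalized reflex in Qift) are counted as present, and the feature labels and the helper `describe` are names introduced here for illustration.

```python
EGYPTIAN = ("Cairo", "Qift", "Bʿēri")
SUDANIC = ("Khartoum", "Šukriyya", "Nigeria")
ALL_SIX = EGYPTIAN + SUDANIC

# Feature -> dialects reported as showing it (variable attestation = present).
FEATURES = {
    "*q > /g/": set(ALL_SIX) - {"Cairo"},
    "interdentals > stops": set(ALL_SIX),
    "emphatic reflex of *ð": set(SUDANIC),
    "glottalic reflex of *ṭ": {"Nigeria", "Bʿēri", "Qift"},
    "*ay, *aw > ē, ō": set(ALL_SIX),
    "three short vowels contrastive": set(ALL_SIX) - {"Nigeria"},
    "palatal reflex of jīm": {"Khartoum", "Šukriyya", "Nigeria", "Bʿēri"},
}


def describe(dialects):
    """Summarize how a feature is spread over the two subregions."""
    n_egyptian = sum(d in dialects for d in EGYPTIAN)
    n_sudanic = sum(d in dialects for d in SUDANIC)
    if n_egyptian == 3 and n_sudanic == 3:
        return "all six"
    if n_egyptian == 0:
        return f"Sudanic only ({n_sudanic}/3)"
    if n_sudanic == 0:
        return f"Egyptian only ({n_egyptian}/3)"
    return f"cross-regional ({n_egyptian} Egyptian, {n_sudanic} Sudanic)"


for name, dialects in FEATURES.items():
    print(f"{name:32s} {describe(dialects)}")
```

A tabulation of this kind records only where a feature occurs. It says nothing about whether the feature is an innovation or a retention, or how common it is outside the area, both of which the discussion above treats as decisive for classification.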

#### 3.1.2. Phonological Processes

Our review of phonological features also includes three synchronically active phonological processes: the raising of /a/ > /e/ in word-final position (often referred to as final *imāla*); the elision of unstressed /i, u/ in nonfinal open syllables following a vowel; and the shortening of phonemically long vowels in unstressed position. The incidence of these processes is summarized in Table 2 (<+> denoting the presence of a given process in each dialect and <−> its absence).

**Table 2.** Phonological Processes.


A process which may be broadly described as word-final /a/-raising, affecting reflexes of both Old Arabic \*/a, ā/, is indicated for the Bʿēri, Qift and Nigerian dialects. Cross-dialectally, processes with similar phonetic outcomes may be identified in a number of Levantine varieties, alongside looser correlates in Mesopotamia, Arabia and elsewhere in the Arabic-speaking world (cf. Levin 2007). However, significant differences in conditioning complicate the coidentification of the three Egypto-Sudanic processes as a single, shared feature, either within or without the sample: raising in Qift is reported to occur word-finally (though it would appear from Nishio's data that the rule is variably applied), Bʿēri Arabic raises in pausal position, and in Nigerian Arabic /a/ is raised word-finally as triggered by the presence of a front vowel in the preceding syllable. Regardless of this feature's ultimate (dis)unity, it is not sufficiently widespread in the sample to be considered characteristic of a potential Egypto-Sudanic dialect grouping, occurring as it does in three dialects at most.

Elision of /i, u/, however, presents a different picture. Five of the six dialects in the sample display a similar form of conditioned deletion affecting the two short high vowels to the exception of their low counterpart, the core environment of which involves occurrence in a nonfinal, unstressed open syllable preceded by a vowel. In Nigerian Arabic, all short vowels, including /a/, are potentially subject to elision processes, and the conditioning environment is somewhat distinct from and more limited than that observed elsewhere in the set, requiring the presence of a preceding long vowel or sonorant (see Owens 1993a, pp. 33–36). The consistent occurrence of the elision feature across the remainder of the dialects surveyed is noteworthy, though not necessarily distinctive, as it further typifies an extensive array of additional dialects spoken across the Levant, Northwest Arabia, and elsewhere (frequently identified under Cantineau's traditional designation *parlers différentiels* for their distinct treatment of high and low short vowels under these conditions).

Finally, the shortening of unstressed long vowels is observed to occur across the sample's three Egyptian varieties, and is in fact commonly referenced as a distinctive phonological process of that area. While this generalization is borne out for Egyptian varieties by the current data, it would not seem to extend to the Sudanic contingent of the dialects examined, all three of which maintain vocalic length distinctions in both stressed and unstressed positions.
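
The incidence of the three processes can also be expressed as presence/absence profiles and compared pairwise. The sketch below encodes only what the prose above states, with two simplifications that should be kept in mind: final /a/-raising is coded as a single feature despite the differences in conditioning, and Nigerian Arabic is coded as lacking the /i, u/ elision because its elision process is described as distinct. The names `PROCESSES`, `profile`, and `agreement` are introduced here for illustration.

```python
from itertools import combinations

DIALECTS = ["Cairo", "Qift", "Bʿēri", "Khartoum", "Šukriyya", "Nigeria"]

# Process -> dialects in which the text reports it as present.
PROCESSES = {
    "final /a/ raising": {"Qift", "Bʿēri", "Nigeria"},
    "elision of /i, u/": {"Cairo", "Qift", "Bʿēri", "Khartoum", "Šukriyya"},
    "unstressed long-vowel shortening": {"Cairo", "Qift", "Bʿēri"},
}


def profile(dialect):
    """Presence (True) or absence (False) of each process, in dict order."""
    return tuple(dialect in members for members in PROCESSES.values())


def agreement(a, b):
    """Number of processes on which two dialects have the same value."""
    return sum(x == y for x, y in zip(profile(a), profile(b)))


for a, b in combinations(DIALECTS, 2):
    print(f"{a:9s} {b:9s} {agreement(a, b)}/{len(PROCESSES)}")
```

On this coding, Qift and Bʿēri agree on all three processes, as do Khartoum and the Šukriyya, while Cairo and Nigeria agree on none. With only three binary variables such counts are coarse, so they serve as a check on the prose summary rather than as a measure of relatedness.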

#### *3.2. Pronominal Morphology*

#### 3.2.1. Personal Pronouns

Table 3 summarizes the independent personal pronoun paradigms for the six dialects under review. Discussion here will primarily focus on these morphologically free forms, used in subject function, though mention of their enclitic counterparts utilized in object and possessive roles is also made as relevant below. Note that the Qift forms cited ending in /a/ vary with equivalents ending in /e/ (see discussion of /a/ > /e/ raising in Section 3.1.2), and that one speaker of this variety attests a 3.pl form *humman*.

On the whole, the observations arising from comparative review of these paradigms tend toward the identification of distinct Egyptian and Sudanic norms over pan-regional unity. The first such generalization that can be made is the association of 1.pl forms lacking initial /n/ (typically viewed as innovative) with the dialects of the Egyptian portion of the area, and forms maintaining it with those of the Sudanic portion—here considering Nigerian *anīna* an /n/-ful form, perhaps remodeled by analogy with 1.sg *ana*, and additionally recognizing the occasional occurrence of /n/-ful reflexes in Egyptian territory, as described variably for Qift. In the second place, we may also observe the distinct distributions of "short" and "long" forms of the third person pronouns, with short, monosyllabic forms (e.g., Šukriyya *hū*, *hī*, *hun*, *hin*) typical of the Sudanic portion of the region and long, disyllabic forms (e.g., Cairo *huwwa*, *hiyya*, *humma*) typical of Egyptian territory. In the singular, both short and long forms may be considered innovations from earlier \*huwa, \*hiya (which appear to be variably retained in Khartoum and Qift alongside innovative short and long forms, respectively). For the plural, typically reconstructed as \*hum, \*hinna, the long masculine and short feminine forms may be seen as innovations and the short masculine and long feminine forms as retentions (Fischer and Jastrow 1980; Procházka 2014). The geographic distribution of third person patterns is somewhat complicated by the "mixed" composition of the Nigerian paradigm, presenting short forms in the singular and long forms in the plural, and the existence of both short and long variants of the Bʿēri singulars, but overall the general principle of bifurcation between the two subregions—rather than commonality across them—is maintained. Also consistent with this pattern are shifts of \*/nt/ > /tt/ in second person forms and 3.m.pl \*hum > *hun* in the dialects of the Šukriyya and Khartoum, the latter perhaps deriving via analogy with feminine *hin*. An exception is the shift of initial \*/a/ > /i/ in all six dialects' second person forms, an innovation common to the majority of modern Arabic varieties.


**Table 3.** Independent Personal Pronouns.

<sup>1</sup> Bergman (2002) [transcription of final vowel length regularized for comparability].

The loss of the masculine/feminine distinction in second and third person plurals, with consequent generalization of the inherited masculine form, may be noted in Cairo and Qift and would appear to be currently progressing in the dialect of Khartoum, where distinctive feminine plural forms appear obsolescent and are sociolinguistically associated with rurality (Dickins 2007b, p. 561). Indeed, given the relative population structures of the speech communities under discussion, such a rural/urban dichotomy may underlie the distribution of this feature in the current sampling more meaningfully than would the geographic divide described in relation to the previous two, though it should not escape notice that the two geographically atypical cases, Khartoum and Bʿēri, represent the northernmost Sudanic and southernmost Egyptian varieties sampled, respectively. In either case, the picture is once again one of heterogeneity rather than conformity of personal pronoun systems across the Egypto-Sudanic zone.

Turning briefly to the bound personal pronoun forms, not presented in Table 3, three distinctive features are observed. The unusual retention of 2.sg.f -*ki* and the innovation of 3.m.sg -*a* (< \*-hu) serve to bind Nigerian and Bʿēri, while the remaining varieties instead attest the innovative forms -*ik* and -*u* (though Qift -*o*) near-ubiquitous in modern Arabic, likely as pre-diasporic developments (cf. Behnstedt and Woidich 2005; Owens 2006). Thirdly, the three Sudanic dialects of Nigeria, Khartoum and (more marginally) the Šukriyya share with one another the variable loss of initial /h/ in third person bound forms, and display similar interactions of this feature with stress assignment.

#### 3.2.2. Demonstrative Pronouns

The proximal and distal demonstrative pronoun series of the six dialects examined are presented in Table 4. This presentation summarizes a highly diverse array of available data, particularly as pertains to the three Egyptian varieties. Intra-dialectally varying forms deemed to represent progressive degrees of reduction from a common source etymon have been simplified with a single representation here, and those displaying singular sporadic phonetic developments or synchronically predictable pausal realizations are likewise not shown; for a full accounting of all variants, relevant to this analysis and otherwise, see Woidich (2006b, pp. 44–46, 303), Nishio (1995, p. 190), and Bergman (2002, p. 43). When a sole plural form is indicated, its use comprises both masculine and feminine values, its position within the table selected on the basis of cognacy.


**Table 4.** Demonstrative Pronouns.

<sup>1</sup> Bergman (2002).

As a point of departure for analysis, it seems likely that the full array of forms presented here (with the possible exception of the distal plurals, as discussed below) ultimately originates in a paradigm similar to that attested for the Šukriyya variety in Table 4. Relevant features at a broad level of Arabic demonstrative classification involve the leveling of initial /d/ (< \*/ð/) across all members of the paradigm, the absence of a reflex of the Old Arabic presentative particle \*hā-, and the use of vowel alternation to indicate gender distinction in both the singular and plural while, for the most part, simultaneously maintaining consonantal marking of plurality (for further discussion of these traits in cross-dialectal context, see Magidow 2013). Taken individually, none of these characteristics is restricted to the Egypto-Sudanic area; their converging incidence, however, largely is, identifiable elsewhere in comparable fashion only at scattered points in southwestern Arabia and possibly the central Levant (Magidow 2016). The essential tenets of this shared basic paradigm, then, together rise as a potentially significant piece of linguistic evidence supporting the common classification of the dialects of the Egypto-Sudanic area.

At the same time, substantial secondary divergences parallel the north–south splits between Egyptian and Sudanic varieties already observed in relation to several personal pronoun forms. Primary among these is the rise of forms etymologically comprising a demonstrative element of the type witnessed above supplemented by the incorporation of a following independent personal pronoun. With the exception of the Qift variant *dāk*, these forms have entirely supplanted the presumably unsupplemented original distals in the three Egyptian varieties (e.g., Cairo *dukha* < \*dāk huwwa, *dikha* < \*dīk hiyya, etc.), and occur variably in the proximal series of the two northernmost Egyptian varieties as well (e.g., Qift *dīye* < \*di hīye). Magidow (2013, p. 400) has previously proposed that these composite forms evolved from an original presentative structure of the same composition, on the basis of like presentatives in use in other dialects, such as Ḥassāniyya. This assessment is supported by the presence of presentatives of this type closer to home in the current sample, in the form of Nigerian *dawa* < \*da huwa 'here he is . . . ', *ɗ̣akwa* < \*dāk huwa 'there he is . . . ', etc. (cf. Cairo *dawwa* 'this (m.sg)', *dukha* 'that (m.sg)'). This development—for which Nigerian shares the precursor, but in which it does not participate—thus serves to differentiate the three Egyptian dialects of the sample from their Sudanic counterparts.

Following from analysis of Egyptian distals in this manner is the further insight that the Cairo, Qift and Bʿēri paradigms may display a distinctive, vowel-alternating mode of plural formation. In contrast to the distal plurals of the three Sudanic varieties, which are transparently formed via the addition of the distal morph -*k* to the existing proximal plural, the Egyptian forms do not contain any visible reflex of the proximal plural's distinctive /l/; instead, we encounter a plural-marking back vowel of the type *dukham*, *dokkum*, *dukkumma*. It is certainly possible that an original /l/ of the plural form has simply vocalized, or that, given the occurrence of /u/ in some Egyptian m.sg forms (e.g., Bʿēri *dukkāti*), these plurals contain reflexes of a generalized singular \*dāk. It is also plausible, however, to connect the vowel-alternating inflection attested in, e.g., Qift *dakka*, *dikke*, *dokkum*, to that known from a number of North African varieties, as in Ḥassāniyya *ðāk* (m.sg), *ðīk* (f.sg), *ðūk* (pl.) (Taine-Cheikh 2007). If this development is indeed reflected in the Egyptian forms, it would mirror that encountered in the Khartoum variant *dēk*, which may in turn have a counterpart in the initial element of Bʿēri f.pl *dikkinna*. Such plurals are not the norm in the three Sudanic varieties, however, which instead maintain the /l/-marked plural intact (including in the Nigerian presentative set perhaps cognate to the Egyptian distals, m.pl *ɗ̣olakkahumma*, f.pl *ɗ̣elakkahinna*), thereby presenting a further potential north–south distinguishing feature among the dialects examined.

The final secondary development of note in relation to the demonstrative pronouns is the occasional loss of gender distinction in the plural, accompanied by the generalization of a single plural form to encompass both gender values. In the Egyptian varieties that have lost their original gender distinction, an original masculine form has generalized, whereas in Khartoum, when gender distinctions are lost, it is an original feminine form that has done so (compare Šukriyya m.pl *dōl*, f.pl *dēl* with Cairo c.pl *dōl(a)*, Khartoum c.pl *dēl*). Qift, with m.pl *dōl*, f.pl *dōla* ~ *dōle*, would seem to have initially followed Cairo in generalizing a masculine form, but subsequently reallocated originally variable *dōl ~ dōla* to distinct gender values (perhaps via analogy with the f.sg nominal marker -*a*). Given the differentiated pathways taken in the generalization of formerly gendered forms in the dialects of Cairo and Qift, on the one hand, and Khartoum, on the other, it is perhaps advisable to view these two developments as parallel yet independent.

#### 3.2.3. Relative Pronoun

Rather than unifying the Egypto-Sudanic zone, relative pronoun forms further perpetuate the previously witnessed divide between the three Egyptian varieties of Cairo, Qift and il-Biʿeṛāt and the three Sudanic ones of Nigeria, Khartoum and the Šukriyya. The former set all display the identical relative form *illi*, reflecting a development near-ubiquitous across modern Arabic varieties (Vicente 2009). The dialects of the Sudanic area, on the other hand, all present the identical form *al*-, which has for all intents and purposes functionally merged with the definite article (Dickins 2009). This latter development is far less common in comparative scope, but is also apparent in a small number of dialects of the northern Fertile Crescent area (Vicente 2009).

#### 3.2.4. Interrogative Pronouns

Table 5 summarizes the interrogative pronouns 'who?', 'what?' and 'which?' attested for the six dialects under investigation. Cautiously excluded here are forms transparently mirroring Classical Arabic *ayy* 'which?' noted for Cairo and Nigeria, on the grounds that this is frequently identified in the modern Arabophone world as a diglossic import, not indicative of these varieties' inter-dialectal relationships but rather of their individual connections to a shared acrolect (cf. Woidich 2006b, p. 35). In the case of Qift, such a form is the only one given for 'which?' by Nishio (1995); interpretation of this fact is discussed below.


**Table 5.** Interrogative Pronouns.

<sup>1</sup> Demonstrates agreement phenomena.

Replicating the geographic patterning now familiar from other aspects of the pronominal system, forms for 'who?' in the Egyptian portion of the Egypto-Sudanic region are unified in displaying an innovative, sporadic long vowel /ī/. This long vowel is not present in any of the three more southerly varieties, though these likewise agree with one another in the inclusion of an original personal pronoun, incorporated alongside inherited \*min as a marker of gender and number agreement. Such inflectional behavior is maintained in its full form in the dialect of the Šukriyya (m.sg *minū*, f.sg *minī*, m.pl *minun*, f.pl *minin*), alongside an uninflecting form *min* (the distinct syntactic behavior of which is treated below in Section 3.4.4). Inflecting forms are noted as well in older descriptions of the speech of Khartoum (Hillelson 1935), though modern sources (Dickins 2007b; Bergman 2002) indicate that an invariant (originally m.sg) *minu* at least alternates with these, if it has not replaced them entirely. The latter outcome would seem to have been the case for Nigerian *mine*, which does not inflect for number or gender but appears to display the reflex of an earlier incorporated pronoun. As isoglosses, both short and long vocalic reflexes, as well as personal pronoun incorporation, are well known outside the Egypto-Sudanic region.

Forms for 'what?' follow a similar north–south division: those of the three Egyptian dialects feature a reflex of earlier \*ēš < \*ʔayy šayʔ 'which thing?', while those of the Sudanic varieties seem to ultimately reflect a version of a similar etymological source phrase with the inclusion of nunation: \*šin < \*ʔayy šayʔin 'which thing?'. As observed for the Sudanic 'who?' forms, 'what?' forms of this area also display the incorporation of personal pronouns, with comparable patterns of productivity in agreement inflection to those described just above. Though neither the nunated nor the non-nunated derivation serves to unify the study area, both are widespread in modern Arabic more broadly.

Pronouns meaning 'which?' may additionally be distinguished into northern and southern blocks within the Egypto-Sudanic zone, though along a slightly different boundary. Complicating evaluation in the context of the present study is the recording of a single form *ayy* for Qift, which, as has been noted, likely represents a borrowing from Classical Arabic *ʔayy* (more transparently so in the case of Cairo *ʔayy*, which displays an initial glottal stop regularly lost in the variety). It is probable that Qift also includes (or included, until recently) a form cognate with Cairo *anhi*, Bʕēri *innhi*, perhaps similar to the *inhī* reported for nearby ʕIzbat al-Būṣa (Khalafallah 1969). Regardless, it would appear that forms of this type, reflecting Old Arabic \*ʔayyun (or Aramaic *aynā*) combined with an etymological personal pronoun, are typical of the Egyptian portion of the area, and likely also include the Nigerian variant *yēnu*. The southern dialects, including Nigerian via its variant *yatu*, are instead distinguished by reflexes of earlier \*ʔayyat (plus incorporated pronoun). Products of both etymologies inflect for agreement when following a modified noun (e.g., Bʕēri m.sg *innhū*, f.sg *innhī*, pl. *innhumma*, Khartoum m.sg *yātu*, f.sg *yāti*, pl. *yātum*), but occur invariantly when preceding one—the Sudanic varieties fixing an original masculine singular form in this usage, the Egyptian ones more often an original feminine. Forms of the \*ʔayyun type are well known beyond the confines of the Egypto-Sudanic region. Reflexes of \*ʔayyat are much more unusual, known elsewhere only from a few locations in western (and especially northwestern) Arabia (cf. Reichmuth 1983, p. 118).

#### *3.3. Verbal Inflectional Morphology*

#### 3.3.1. Agreement Inflection

Table 6 summarizes the major distinctive elements of verbal agreement inflection across the Egypto-Sudanic varieties surveyed. The feature "f.pl" refers to the presence of distinct masculine and feminine agreement morphemes in the second and third person plurals of all conjugation paradigms; those dialects that do not display this feature have generalized inherited m.pl forms across both contexts. The remaining features relate specifically to either the perfect or the imperfect conjugation of Form I sound verbs, as indicated in the table.


**Table 6.** Verbal Agreement Inflection.

<sup>1</sup> Behnstedt and Woidich (1985–1999, Map 207). <sup>2</sup> Bergman (2002).

Parallel to the pronominal development described in Section 3.2.1, above, the dialects of Cairo and Qift do not retain a gender distinction in plural agreement morphology, and this distinction appears to be fading from use in the dialect of Khartoum. This development may thus be seen as a source of differentiation within the six varieties sampled, perhaps reflecting a rough north–south geographic divide, perhaps on the basis of difference between urban and rural populations. All dialects which distinguish the feminine plural do so in a formally identical manner, via use of a suffix -*an* (3.f.pl)/-*tan* (2.f.pl).

In assessing agreement markers of the perfect conjugation, features distinctive of this dataset in the pan-Arabic view include the conjugation of the 1.sg (identical in all cases to the 2.m.sg), the 3.f.sg, and the 3.m.pl. The 1.sg forms of the three dialects of the Egyptian area show the expected -*t* (< \*-tu) typical of the great majority of modern Arabic varieties. Among the three dialects of the Sudanic area, however, we view a pair of innovative local developments. In Khartoum, the 1.sg agreement value is marked with the suffix -*ta*, the /a/ of which likely represents the morphologized product of a former paragogic vowel, following an earlier development \*-tu > \*-t (similar to the 3.m.sg -*a* of geminated verbs in the same dialect). In Nigeria, we witness the loss of earlier 1.sg -*t* and consequent rise of contrastive stress distinguishing 1.sg *kaˈtab* from 3.m.sg *ˈkatab*; the original -*t* resurfaces prevocalically, as when preceding a bound object suffix or occasionally in connected speech. The same inflectional pattern is recorded among the Šukriyya, there alternating with more standard -*t*. The 3.f.sg suffix is -*at* in all dialects save that of Cairo, where it is -*it*. The Cairene reflex is innovative; retention of inherited -*at* thus typifies the rest of the group, though as a feature it does not serve to differentiate these dialects from neighboring varieties of Libya or the Arabian Peninsula.

South of Cairo, one encounters lowered realizations of the inherited 3.m.pl suffix \*-u. These begin marginally in Qift, in a minority variant -*ōw* of more general -*u*. In Bʕēri, this suffix is -*aw*, with an allomorph -*ō*- in nonfinal position (i.e., when followed by an additional suffix), and Nigerian Arabic likewise appears to show lowered realizations, -*o* and -*ō*-, in both conditions—though definitive interpretation of the Nigerian data is potentially confounded by the influence of vowel harmony (Owens 1993a, p. 105). In the dialects of Khartoum and the Šukriyya, a lowered realization only emerges as a nonfinal allomorph, contrasting final -*u* with nonfinal -*ō*-. The universally lowered reflexes identified in Bʕēri and Nigerian are a traditionally acknowledged "Bedouin" feature characteristic of a wide array of Arabic varieties from North Africa to the Arabian Peninsula to Mesopotamia. The conditioned lowering exemplified in the speech of Khartoum and the Šukriyya is of far more limited distribution, though it does also occur in the dialect of Mecca and the Jewish communolect of Baghdad (Reichmuth 1983, p. 28). Were these two types of lowering to be identified as a single dialectal feature, then they would serve as an additional southerly isogloss linking the three Sudanic varieties, as well as the southernmost Egyptian variety—however, it is not clear that it is warranted to overlook the potentially significant allomorphic differences between the two.

Turning to the imperfect conjugation, the six dialects do not pattern uniformly with regard to the quality of the vowel utilized in the formation of imperfect agreement prefixes with Form I sound verbs. As noted in relation to several previous features, the general shape of the distribution would seem to be one of a north–south divide: the varieties of Cairo and Qift, aligning with the majority of modern Arabic varieties, show a prefix vowel /i/, while those of Khartoum and the Šukriyya have /a/. The dialect of Nigeria offers variation on this count, speakers utilizing both /i/ and /a/ reflexes. In Bʕēri, the prefix vowel shows harmony with the theme vowel of the inflected verb, thus manifesting as /i/ or /a/ in predictable fashion. All of these patterns may be considered innovative in relation to the oldest reconstructable state of this variable in Arabic, which has been proposed to consist of alternation between /a/ and /i/ in inverse relation to the height of the imperfect theme vowel, in accordance with the Barth-Ginsberg Law (Bloch 1967; Pat-El 2017). Though salient isoglosses within the Egypto-Sudanic area, none of these developments are confined to the zone: generalization of /a/ is known in Western Arabia and the Yemeni Tihama, while that of /i/ dominates elsewhere, and harmonization of the Bʕēri type is also known in North Africa (cf. Behnstedt and Woidich 2005, pp. 12–13).
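The contrast between the reconstructed alternation and the generalized prefix vowels can be made concrete with a minimal sketch. It assumes only what the paragraph above states: a low theme vowel (/a/) pairs with a high prefix vowel (/i/), and a high theme vowel (/i, u/) pairs with a low prefix vowel (/a/). The function names and the dictionary are illustrative and not part of any source description. Nigerian variation and Bʕēri harmony are left out because their exact conditioning is not specified here.

```python
# Sketch of the prefix-vowel patterns for Form I sound verbs discussed above.
# Names are hypothetical; only the vowel values come from the text.

HIGH_THEME_VOWELS = {"i", "u"}
LOW_THEME_VOWELS = {"a"}

# Varieties described as generalizing a single prefix vowel.
GENERALIZED_PREFIX = {
    "Cairo": "i",
    "Qift": "i",
    "Khartoum": "a",
    "Šukriyya": "a",
}


def barth_ginsberg_prefix(theme_vowel: str) -> str:
    """Return the prefix vowel predicted by the reconstructed alternation:
    a low theme vowel takes high /i/, a high theme vowel takes low /a/."""
    if theme_vowel in LOW_THEME_VOWELS:
        return "i"
    if theme_vowel in HIGH_THEME_VOWELS:
        return "a"
    raise ValueError(f"unexpected theme vowel: {theme_vowel!r}")


def matches_reconstruction(variety: str, theme_vowel: str) -> bool:
    """True where a generalized prefix vowel coincides with the value
    the reconstructed alternation would predict for that theme vowel."""
    return GENERALIZED_PREFIX[variety] == barth_ginsberg_prefix(theme_vowel)


if __name__ == "__main__":
    for variety in GENERALIZED_PREFIX:
        for theme in ("a", "i", "u"):
            print(variety, theme, matches_reconstruction(variety, theme))
```

On this simplified view, a variety that generalizes /i/ agrees with the reconstructed pattern only for verbs with theme vowel /a/, and one that generalizes /a/ agrees only for verbs with theme vowel /i/ or /u/, which is why each generalization counts as an innovation.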

The innovative first person agreement marking scheme 1.sg *n*-/1.pl *n-. . . -u*, typical of North Africa west of the Egypto-Sudanic area, also appears in our data as an inflectional norm in Bʕēri and as an available variant in Nigerian. Though its presence is of dialectological note, this feature does little to clarify broader understandings of a potential macro-level Egypto-Sudanic dialect classification, as heterogeneity on this point is already well established in both the Egyptian and Sudanic portions of the area. For excellent discussion of this development's history and distribution in the region, see Owens (2003).

#### 3.3.2. Tense, Aspect, Mood and Voice Inflection

Beyond agreement, Table 7 summarizes additional verbal inflectional morphology utilized in the expression of tense, aspect, mood and voice. The prefix of the imperative mood is provided first, followed by the passive marker. Next, an array of "preverbal" modifiers are included which indicate a complex (and often varying) set of tense, aspect and mood values, details of which will be explicated as part of the following discussion.


**Table 7.** Tense, Aspect, Mood and Voice Inflection.

<sup>1</sup> Bergman (2002). <sup>2</sup> Behnstedt and Woidich (1985–1999, Map 221).

Vowel qualities of the imperative prefix display a similar north–south differentiation to that previously noted for the prefix vowel of the imperfective, and in synchronic terms these two traits are likely not systemically independent; total convergence of this kind is likely best interpreted as innovative in each case, the probable product of analogy (cf. Bar-Asher 2008). A prefix *i*- is thus encountered in the three dialects of Egypt, while the form *a*- is found in the three varieties of the Sudanic portion of the region. The latter is known outside the area in the same limited distribution described for the C*a*- prefix vowel (Section 3.3.1), while the former occurs in modern Arabic more widely.

The passive morpheme splits the area latitudinally in a similar manner, though in this instance Khartoum, the northernmost variety of the Sudan, is seen to pattern with the body of Egyptian varieties in displaying -*it*. The dialects of Nigeria and the Šukriyya, on the other hand, share in presenting /n/-based forms. Both features are generally considered to be innovations on the Old Arabic type, and each shares a wide distribution in the modern Arabophone world more broadly.

The first TAM modifier to be discussed, (shallowly) reconstructable to \*bi-, is the most widely distributed in the present sampling with detectable reflexes in all but one of the six varieties—not being recorded for Bʕēri. In Nigerian, this morpheme has partly been subsumed into the person marking system, occurring as a quasi-fixed component of originally vowel-initial agreement prefixes of the imperfect conjugation; elements of productive use do remain, though their precise functions in Nigerian remain far from clear (see discussion in Owens 1993a, pp. 106–10). Values of *bi*- in Khartoum and among the Šukriyya more plainly include continuous (ongoing, repetitive, habitual) aspect, and futurity. Cairene *bi*- echoes the former of these, though notably not the latter, and adds a meaning of general realis or indicative mood (Brustad 2000, pp. 246–47). Little information on preverbal modifiers is provided as part of Nishio's descriptive materials for Qift. Behnstedt and Woidich's (1985–1999) immediately neighboring sample point of il-Barāhma, however, attests a "Verbmodifikator Präsens" *ba*-, which, given its treatment in the atlas, likely expresses semantics similar to those of Cairo *bi*-. While the functions of these various items are thus differentiated across the dialects examined, their simple exponence as a feature does unite the greater part of the area. Reflexes of innovative \*bi- are, of course, well known outside the Egypto-Sudanic zone as well—most especially in the Levant, the Arabian Peninsula, and Libya. Semantically, the functional range described for the core Sudanic varieties is the more typical cross-dialectally, which in some cases leans even more heavily toward future and volitional readings.

An additional continuous aspect marker is found in Egypt, reconstructable to \*ʕammāl. This item is reflected in Cairene *ʕammāl*, the meanings of which are far more narrowly defined than those of *bi*- and express a notion of intensity, iterativity, and repetition. Bʕēri *ʕa*- ~ *ʕama*- is of more generalized usage, and is reported to carry functions largely comparable to those filled by reflexes of \*bi- in other dialects of the sample. Though \*ʕammāl is not entirely absent in Sudanic territory (cf. Hillelson 1935), it is a definitive rarity there, and does not occur in the three Sudanic dialects sampled. It is elsewhere known outside Egypt from the Levant and scattered points in southwest Arabia.

Future tense markers reflecting \*rāħ are attested in all three Egyptian dialects and in Khartoum. In the latter location, *ħa*- is reported by both Dickins (2007b, p. 569) and Bergman (2002, p. 38) as a recent Egyptianism. In light of its absence in the other Sudanic varieties sampled, and the lack of a clear dialect-internal grammaticalization chain (*maša* largely outcompeting lexical *rāħ* as the general term for 'go' in Sudanic dialects), this attribution is likely correct. Regardless of its ultimate originality in Khartoum, this innovative feature serves to differentiate the two southernmost varieties of the sample from the four northernmost, which join a wide array of modern dialects to attest products of this development from Algeria to Mesopotamia (cf. Leddy-Cecere 2020).

#### *3.4. Syntax*

The following subsections address a selection of syntactic features relevant in the evaluation of a potential Egypto-Sudanic dialectal unity, namely negation strategies, analytic genitive structures, the ordering of adnominal demonstratives, and WH-question formation. Though all elements of the ensuing discussion will likely be familiar to Arabic dialectologists, it bears note that, while syntactic variation of these types has frequently been addressed in formal (Aoun et al. 2010), comparative (Brustad 2000) and diachronic light (Wilmsen 2014), it has less often been the stuff of broad-based efforts toward top-tier Arabic dialect classification. In the Egypto-Sudanic case specifically, however, shared syntactic features—particularly the latter two considered in this section—are consistently among the few concrete pieces of linguistic evidence invoked in support of the identification of a unified subgroup (cf. Versteegh 2014, p. 209); as such, they merit a full treatment here.

#### 3.4.1. Negation

Rather than uniting the six Egypto-Sudanic dialects surveyed, negation strategies are seen once again to divide the region into northern and southern camps, reminiscent of geographic patterns previously established in relation to numerous phonological and morphological variables already considered. The three Egyptian varieties surveyed display a "split" negation system typical of both modern and historical forms of Arabic, whereby two distinct strategies exist for the negation of verbal and nonverbal predicates. In all three dialects, the first of these involves a discontinuous negation structure, the second a unitary particle deriving diachronically from a negated third person singular pronoun (having since shed such morphological specification). The following examples from Cairene, showing verbal *ma . . . -š* and nonverbal *miš* (~*muš*), are typical:


'He did not write.' (Cairo: Woidich 2006b, p. 335)

'That is not good.' (Cairo: Woidich 2006b, p. 334)

Equivalent markers in Qift and Bʕēri are *ma . . . -š*/*muš* and *ma . . . -(i)š*/*miš*. Bʕēri stands out for allowing at least a limited application of verbal *ma . . . -(i)š* to nonverbal predicates (e.g., *ma zēn-iš* 'not good', Woidich 2006a, p. 303) alongside more standard *miš*, although potential pragmatic specificities of such usage remain undescribed (cf. discussion in Brustad 2000, pp. 291–94).

In the three Sudanic varieties of the sample, by contrast, no such verbal/nonverbal distinction in negation strategies exists, and both predicate types are negated by a unitary operator with no discontinuous element. Consider, from Khartoum:


Negator *ma* ~ *mā* is used similarly in the dialects of the Šukriyya and Nigeria. In the latter variety, alongside *ma* we also encounter a generalized negator *mi* grammaticalized from an earlier negated third person pronoun and used in nonverbal negation, thus analogous in origin to Egyptian *miš* ~ *muš* but with no sign of an original discontinuous element -*š*. In sum, then, we find negation dividing the sampled dialects into Egyptian and Sudanic camps on two fronts. In the first place, the three Egyptian varieties are defined by the presence of discontinuous negation, and the three Sudanic varieties by its absence; in the second, the Egyptian dialects characteristically comprise distinct verbal and nonverbal negation strategies while the Sudanic dialects do not—Bʕēri and Nigerian each demonstrating a degree of variable "slippage" from these otherwise generalizable norms. On the first count, the innovative Egyptian trait is broadly typical of dialects of the Arabic-speaking West, the more conservative Sudanic one those of the East. On the second count, it is Sudanic which stands out against the general backdrop of modern Arabic in utilizing a single strategy for the unmarked negation of both verbal and nonverbal predicates, though such may in fact represent a retention of inherited properties of Old Arabic *mā* (cf. discussion in Brustad 2000, pp. 277–83; Ouhalla 2008).
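As a rough illustration of how such north–south patterning can be checked feature by feature, the sketch below tabulates the two negation isoglosses just summarized, together with the imperative prefix, passive marker and \*rāħ future described in Section 3.3.2. The values are simplified from the prose (the Bʕēri and Nigerian "slippage" is ignored), and the helper names `row` and `splits_egyptian_sudanic` are hypothetical conveniences rather than an established method.

```python
# Simplified feature table; values follow the prose descriptions only.
EGYPTIAN = ("Cairo", "Qift", "Bʕēri")
SUDANIC = ("Khartoum", "Šukriyya", "Nigeria")
VARIETIES = EGYPTIAN + SUDANIC


def row(*values):
    """Pair one value per variety, in the fixed order of VARIETIES."""
    if len(values) != len(VARIETIES):
        raise ValueError("expected one value per variety")
    return dict(zip(VARIETIES, values))


FEATURES = {
    "imperative prefix": row("i-", "i-", "i-", "a-", "a-", "a-"),
    "passive marker": row("it", "it", "it", "it", "n", "n"),
    "future marker < *rāħ": row(True, True, True, True, False, False),
    "discontinuous negation": row(True, True, True, False, False, False),
    "split verbal/nonverbal negation": row(True, True, True, False, False, False),
}


def splits_egyptian_sudanic(values):
    """True if a feature is uniform within each block and differs between them."""
    north = {values[v] for v in EGYPTIAN}
    south = {values[v] for v in SUDANIC}
    return len(north) == 1 and len(south) == 1 and north != south


if __name__ == "__main__":
    for name, values in FEATURES.items():
        print(f"{name}: {splits_egyptian_sudanic(values)}")
```

Under these simplifications, the imperative prefix and both negation features divide the sample cleanly into Egyptian and Sudanic blocks, whereas the passive marker and the \*rāħ future draw the line one variety further south, with Khartoum patterning alongside the Egyptian dialects.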

A third negation type, that of a negated personal pronoun paradigm fulfilling what has often been described as a negative copular function, is also in evidence in dialects of the Egypto-Sudanic area; however, a paucity of coverage in descriptive sources renders a comprehensive evaluation here impossible. On the basis of those dialects for which sufficient data are available (those of Cairo, Nigeria and the Šukriyya), it seems likely that a north–south split of the dimensions already described characterizes treatment of this negation strategy as well. This would be true both in terms of the pragmatic markedness of such usage (largely unmarked in the two Sudanic varieties, while in Cairene indicating the negation of a presupposition) and in terms of case assignment (the negative structure generally triggering accompanying accusative pronouns in the two Sudanic varieties but nominative ones in Cairene, e.g., Šukriyya *māk*, Cairo *mantāš* 'you (m.sg) are not'). Though thus not inconsistent with the geographic division outlined in relation to the better known strategies, more definitive analysis of this third negation type awaits further descriptive information.

#### 3.4.2. Analytic Genitive

All six dialects of the Egypto-Sudanic area examined present use of an analytic genitive structure alongside the inherited Old Arabic synthetic (juxtaposed) genitive. Such structures as a general scheme are a widespread innovation in modern Arabic, though individual forms and properties vary widely from dialect to dialect (Behnstedt and Woidich 2005; Eksell Harning 1980). The essential components of the construction are a possessum, which governs a following genitive exponent, which in turn governs a following (nominal or pronominal) possessor, on the model of the following:


Beyond the existence of the general schema, which all six dialects attest, the overall picture of analytic genitive structures across the varieties sampled is one of both formal and functional diversity. In the first place, a wide array of different exponents occur, of diverse etymology. The most widely spread are those reconstructable to \*bitāʕ, ultimately < \*matāʕ 'property'. Reflexes of the latter are distributed broadly from Morocco to the southern Levant, but known with the sporadic mutation of initial \*/m/ > /b/ in the eastern portion of this region only (Egypt and the Sudan, alongside some Levantine attestations). Such forms are instantiated in Cairo *bitāʕ* and Qift *bitāʕ* ~ *ibtāʕ*, the sole genitive exponents reported at these locations, and in variation with products of other etymologies in the dialects of il-Biʕērāt (*ibtāʕ*), Khartoum (*bitāʕ*), and (more marginally) the Šukriyya (*bitāʕ* ~ *butāʕ*), thus leaving Nigerian the sole dialect sampled not to attest a reflex. There, the genitive exponent is instead *hana* < \*hana 'thing', which is also reflected in Bʕēri *ihnīn*, and encountered outside the region in the interior northern Levant. Alongside *bitāʕ*, Khartoum sports an exponent *ħagg* < \*ħaqq 'property, right', well known from dialects of the Arabian Peninsula, which is also attested as a marginal variant *ħagg* among the Šukriyya. The primary exponent in this last variety is *hūl* (likely < \*hū li- 'it (3.m.sg) [is] for'), which may also be reflected in its f.sg guise *hīl* as a suppletive variant member of the Nigerian *hana* paradigm: m.sg *hana*, f.sg *hīl* ~ *hinta*. Šukriyya further attests yet another variant *allīl* (< \*allī li- 'which [is] for'), also known from dialects of southern Egypt not included in this sample (Behnstedt and Woidich 2005).<sup>3</sup> Thus, the picture which emerges is one in which reflexes of \*bitāʕ are typical and (to a degree) distinctive of the bulk of the varieties surveyed, though not to the exclusion of other exponents in use in the same speech communities. Meanwhile, reflexes of \*hana, \*hūl and \*ħaqq serve to unite pairs of dialects within the sample, but do not broadly typify the set as a whole.

As far as syntactic behavior and semantic functions are concerned, the information provided by descriptive sources is uneven, but the following generalizations may be made. In all cases save that of Qift, for which Nishio does not specify, exponents are observed to agree in gender and number with their governing possessum (e.g., Bʕēri m.sg *ihnīn*, f.sg *ihnīt*, m.pl *ihniyyīn*, f.pl *ihniyyāt*); such inflection in fact reveals underlying variation in the use of *bitāʕ*, which inflects for m.pl as *bitāʕīn*/*ibtāʕīn* in the Khartoum, Bʕēri and Šukriyya dialects, but as *bitūʕ* in Cairo. These agreement properties are not unique to the Egypto-Sudanic area, though nor are they universal cross-dialectally. Cairene *bitāʕ* constructions have been demonstrated to show a strong dispreference for the governing of indefinite and/or nonspecific possessors, outside of an idiomatic meaning of 'one who likes . . . ' (Brustad 2000, pp. 80–82); information is lacking for Qift and Bʕēri, but Dickins's (2007b) description of Khartoum *bitāʕ* and *ħagg* would seem to indicate a similar state of affairs. In the Šukriyya and Nigerian dialects, however, such uses are noted, in Nigerian even extending as far as fully nonreferential classificatory function:


Examples like these indicate a clear heterogeneity of analytic genitive functional properties within the Egypto-Sudanic zone, and mirror potential correlates in dialects as far-flung as Morocco and Kuwait (cf. Brustad 2000). Though a geographic, social, or other ordering may ultimately underlie these patterns, information is insufficient to offer such a determination at the present time.

#### 3.4.3. Adnominal Demonstrative Order

The etymological form and paradigmatic organization of demonstratives has been described above (Section 3.2.2) as a potentially strong instance of innovative uniformity across members of a proposed Egypto-Sudanic dialect classification. In addition to these commonalities noted in the morphological dimension, the syntactic properties of demonstratives in adnominal usage also display distinctive and uniform characteristics across dialects of this region—a fact which has arisen in the Arabist literature as one of a small number of concrete linguistic traits identified as definitive of a macro-level Egypto-Sudanic grouping. Specifically, demonstratives in all six dialects sampled occur post-nominally, as in (9) and (10), thus opposed to the typical Arabic pre-nominal pattern exemplified by Moroccan in (11):


While available as a pragmatically marked alternative to the pre-nominal position in many dialects, as well as older forms of Arabic, utilization of the post-nominal structure as an unmarked norm, without a genuinely productive pre-nominal counterpart, is highly unusual cross-dialectally and virtually restricted to the Egypto-Sudanic area (Brustad 2000; Vicente 2006). Within the area, minor but potentially significant exceptions in the form of rhetorically/stylistically specified usages and fixed expressions with pre-nominal ordering may be noted for—at least—the dialects of Cairo and the Šukriyya; the implications of these will be considered in Section 4.1. Irrespective of this fact, post-nominal demonstrative order in its present incarnation does appear to present a key point of unity across dialects of the Egypto-Sudanic zone, and a key point of distinction between these and the collective body of Arabic varieties spoken elsewhere.

#### 3.4.4. WH-Movement

Alongside post-nominal demonstrative position, Versteegh notes for "Egyptian Arabic . . . as well as . . . the related Sudanese dialects" an additional conspicuous syntactic trait: the nonfronting of WH-elements in content questions (Versteegh 2014, p. 209). Such in situ question formation is not typical of Arabic, in which the fronting of interrogative elements, whether accompanied by resumption or gapping, is more usually the unmarked norm (Aoun et al. 2010). Retaining a degree of cautious agnosticism regarding Qift, the descriptive source for which does not provide sentence level examples, in situ question formation is attested across the full set of Egypto-Sudanic varieties sampled. Representative instantiations are provided in (12) and (13), accompanied by a WH-fronted sentence from Lebanese Arabic in (14) for comparison:


In addition to this pattern, Šukriyya departs from the rest of the dialects in containing a parallel set of interrogative pronouns, morphologically distinguished by the lack of an incorporated personal pronoun (see Section 3.2.4), which are not utilized in situ but only in fronted position. Compare the following (with the /n/ of *šin* assimilating to following /b/ in (16)):


Despite its status as a minor and pragmatically marked variant, the structural properties of this usage have important ramifications for the interpretation of the otherwise regular and distinctive feature of in situ WH-question formation in Egypto-Sudanic dialects. They, and other points noted throughout our review of these varieties' phonological, morphological and syntactic characteristics, will provide a critical qualitative dimension to the global evaluation of linguistic evidence for an Egypto-Sudanic dialect classification based in shared genealogical history. It is to this task we shall turn in the paper's remaining sections.

#### **4. Discussion**

#### *4.1. Global Evaluation of Results*

Having reviewed the major phonological, pronominal, verbal inflectional and syntactic characteristics of the Arabic dialects of Cairo, Qift, il-Biʕērāt, Khartoum, the Šukriyya, and Nigeria, we will now direct the information adduced toward a linguistic evaluation of existing proposals of an Egypto-Sudanic dialect classification, as has been repeatedly asserted on the nonlinguistic basis of shared genealogical history uniting the region's Arabic speakers. In the event that such nonlinguistic factors as migration history and common descent prove viable grounds for the classification and grouping of language varieties used in the region, the expectation is that a substantial number of shared linguistic features will arise to characterize the varieties in question. This would justify the prediction of a meaningful degree of dialectological similarity as a consequence of the historical and demographic unity ascribed to their speakers by extra-linguistic lines of research.

This expectation, however, is not substantively met by the linguistic data gathered through the process of this inquiry. Of over fifty phonological, morphological, and syntactic features identified and discussed in the preceding subsections, only seven may be recognized as uniformly present across all Egypto-Sudanic varieties sampled. These are:


To these, we might, for the sake of consideration, generously add six more—those features which proved characteristic of all but one of the surveyed dialects, and whose incidence may thus have proved broader in a different sampling. These are:


The question, then, stands: Are these features sufficient to corroborate the existence of a linguistically significant Egypto-Sudanic dialect classification, proceeding from a common dialectal input carried by those historical communities who introduced Arabic first to Egypt, then to the Sudanic area via subsequent migration?

Though no conventionalized, objective threshold exists by which to make such a determination, the evidence in the Egypto-Sudanic case is not compelling—neither in terms of its quantity nor, critically, its quality. Of the thirteen isoglossic features identified as uniform or near-uniform across the six varieties examined, two—3.f.sg -*at*, and distinction of \*/a, i, u/—are clear retentions from a common Old Arabic inheritance, not innovations distinctive of further dialectal diversification. While thus not contradicting a narrative of dialectal relatedness due to shared migration history, neither do they positively support one: rather, they simply reflect the fact that dialects of the Egypto-Sudanic area have remained largely unimpacted by the mergers of \*/a, i/ emanating from the west of the modern Arabic-speaking world and \*/i, u/ associated with its north and east, as well as the change -*at* > -*it* typical of a number of Eastern Mediterranean varieties. None of these facts are surprising, and they do nothing to indicate a shared developmental history of Arabic varieties in the region—simply a shared, central geography.

Of the remaining features which may be considered genuinely innovative, some are so ubiquitous across modern Arabic as to hold little meaningful value in establishing an identifiable Egypto-Sudanic dialect classification based in shared demographic heritage. Among these are the monophthongization of \*/ay, aw/, retained as diphthongs only in scattered relict zones; the use of a 1.sg/2.m.sg perfect suffix -*t* (< \*-tu), typical of virtually all modern Arabic varieties save those of the northern Fertile Crescent and parts of Yemen; and the change of initial \*/a/ > /i/ in the second person independent pronouns, identifiable in the vast majority of dialects outside the Arabian Peninsula (and many within it). These traits do not serve to differentiate dialects of the Egypto-Sudanic area from their immediate geographic neighbors in eastern Libya, the Hijaz or the Sinai (Owens 1984; Schreiber 1970; de Jong 2000), nor from the bulk of modern Arabic more broadly. A further number of features are not quite so universal in attestation, but still spread far beyond the bounds of the Egypto-Sudanic region. Fortition of interdental fricatives to corresponding stops, though not typical of the Egypto-Sudanic varieties' closest orbit of northern neighbors in eastern Libya or the Sinai (Owens 1984; de Jong 2000), is shared with the majority of varieties (both "sedentary" and some traditionally "Bedouin") of the remainder of North Africa and the Levant, as well as urban Hijazi speech across the Red Sea (Schreiber 1970).<sup>4</sup> Elision of /i, u/ (but not /a/) in unstressed, open-syllable environments is well known outside the region and is present in the Egypto-Sudanic varieties' easterly dialectal neighbors in the Sinai and Mecca; the same is true for the voicing of \*/q/ > /g/, which is commonplace westward into Libya as well (de Jong 2000; Schreiber 1970; Owens 1984). Reflexes of the verb-modifying prefix \*bi- extend beyond the Egypto-Sudanic zone's eastern edges into the urban Hijaz and the Sinai (Schreiber 1970; de Jong 2000), and further into Arabia and the Levant. Though absent from eastern Libya, their presence resumes in that country's west (Owens 1984). These features, then—while of obvious descriptive relevance—do not much contribute toward the definition of a classificatory unit which interprets the Egypto-Sudanic varieties as a discretely identifiable group, distinguished from other, neighboring dialects by the products of a separate developmental history.

The original thirteen features which might have been invoked in this regard, then, have fallen to four: a proximal demonstrative paradigm on the pattern \*dā, dī, dōl, dēl, unmarked and obligatory post-nominal demonstrative order, in situ WH-question formation, and use of the genitive exponent \*bitāʕ. These traits, held in common across all or nearly all members of the sampled group, are both innovative and, largely, distinctive: they are not generally encountered beyond these dialects' immediate environs, nor are they typical even of closely neighboring varieties. A similar demonstrative paradigm is reported for Mecca alongside more common variants with an initial \*ha- element, and \*bitāʕ is variably attested in some dialects of the Sinai, but neither trait dominates in either region (Schreiber 1970; de Jong 2000). Both Meccan and eastern Libyan Arabic allow in situ WH-question and post-nominal demonstrative orders, but these are not unmarked or obligatory to the degree identified among the Egypto-Sudanic dialects considered here (Schreiber 1970; Owens 1984). From a synchronic descriptive standpoint, then, these four isoglosses stand as strong candidates to delineate linguistically meaningful boundaries between dialects of the Egypto-Sudanic area and adjacent Arabic varieties.

Such does not automatically, however, render these four features supportive of an Egypto-Sudanic dialect classification of the form so often proposed, predicated on the shared genealogical history of the Egyptian and Sudanic Arabic speech communities. Under such a framework, the claim advanced is that the migration of Arabic speakers from Egypt to the Sudanic region from the early Middle Ages onward carried to the latter a linguistic input characterized by recognizable dialectological features which may be observed to meaningfully describe and unite Arabic varieties of the Egypto-Sudanic zone to this day. There are clear reasons to doubt, however, that three of the four diagnostic features remaining to us represent the products of such a history. The \*bitāʕ-type genitive exponents, for example, may be of reasonable antiquity (possibly attested as early as the eleventh century; Lentin 2018), yet at the same time show every indication of representing a (Lower) Egyptianism only much later adopted by Arabic speakers of Upper Egypt and the Sudan. In the present sample, reflexes of \*bitāʕ exist below Qift only in variation with other, heterogeneous genitive exponents, and are consistently identified by researchers and speakers alike as carrying urban and Egyptian sociolinguistic valuation (for empirical investigation of this sociolinguistic dimension, see Miller and Abu-Manga 1992; Miller 2005). These facts, combined with the relative novelty of \*bitāʕ forms noted by Hillelson (1935) and their absence from Nigerian, would support a scenario of spread accompanying the colonial expansion and consolidation of Cairene political influence throughout the region under the Ottoman/Khedival and Anglo-Egyptian state apparatuses (ca. 1820 onward), rather than as part of an original linguistic input carried southward during the first waves of Arabization several centuries earlier.

Certain data likewise complicate the identification of two further syntactic features, post-nominal demonstrative order and in situ WH-question formation, as having arrived to Sudanic territory as part of a founding in-migration of Arabic speakers from Egypt. While post-nominal demonstrative ordering is normative throughout the Egypto-Sudanic region today (as the sampled dialects attest), this is known to not always have been the case. Doss has demonstrated that pre-nominal demonstrative ordering in Egypt long existed as a historical alternative alongside the presently familiar post-nominal, and was "alive and productive" (Doss 1979, p. 356) in direct historical attestations dating as late as the seventeenth and eighteenth centuries; the pre-nominal structure, in fact, still exists in modern Cairene in a number of formulaic usages and fixed expressions, including the grammaticalized *dilwaʔti* 'now' (< \*di l-waʔt 'this time'). Though lacking a pre-modern textual record to provide comparable direct evidence, similar synchronic clues (e.g., Šukriyya and earlier Khartoum *daħīn* 'now' < \*dal-ħīn 'this time') indicate that exclusively post-nominal demonstrative order has likewise not always been uniform in the Sudanic area (Reichmuth 1983, pp. 122–26). In this light, the present-day regime of obligatory post-nominal demonstrative ordering becomes a far less viable candidate to have been imported to the Sudanic area from Egypt as part of the former region's initial Arabicization, not only because it does not appear to always have existed in Sudanic Arabic varieties, but also because it would not seem to have been so established in Egyptian varieties of the relevant era to begin with.

Direct historical attestation of WH-question formation is unfortunately less forthcoming, but internal reconstruction of the multimorphemic Sudanic interrogative pronouns \*šinu and \*minu may prove similarly revelatory. In contrast to their Egyptian counterparts of the types \*ē(h) (< \*ēš) and \*mīn, these forms incorporate a reflex of a personal pronoun, which in some varieties still inflects to demonstrate agreement with the interrogated noun phrase. This difference is a critical one, in that it points to a structural dissimilarity in the diachronic source constructions that have given rise to the respective sets of interrogatives. Namely, the presence of the incorporated pronoun in the Sudanic varieties indicates the (historical) presence of a syntactic transformation in WH-questions, by which the noninterrogative element undergoes movement and is resumed by a third person pronoun in its deep-structure position. The following alternation of interrogatives with/without incorporated pronouns in the dialect of the Šukriyya is instructive:


The pronoun-incorporating structure in (18) would, presumably, have originally had its roots in a more complex, cleft-like structure on the order of (19), which has subsequently been subject to syntactic reanalysis/rebracketing:

19. \*[al-ħaddas-ak]*<sup>i</sup>* min [hū]*<sup>i</sup>*?

    REL-told.3MSG-you who he

    'He that told you, who is he?'

While sentences like (15), above, make it demonstrably clear that pronoun-incorporating interrogatives in present-day Sudanic varieties do not (or do not necessarily) carry a synchronic clausal interpretation of this type, the diachronic implication of this developmental pathway should not be overlooked. While questions formed in the manner of *raʔy-ak ē?* and *raʔy-ak šinu?* 'What's your opinion?' (Cairo and Khartoum, own knowledge) may both be validly described synchronically as displaying in situ formation, the latter presupposes an earlier cleft structure (\*[raʔy-ak]*<sup>i</sup>* šin [hu]*<sup>i</sup>* 'Your opinion, what is it?'), which in turn presupposes the existence of a once-productive, WH-fronted, pronounless *šin* (cf. older Sudanese *šin gōl-ak* 'What do you say [lit. What's your saying]?'; Hillelson 1935, p. 62). The former does not, and the congruous modern products are thus assigned to two demonstrably incongruous developmental paths.

In the cases of WH-questions and demonstrative order, then, we must heed Pat-El's warning that "syntactic reconstruction based on cognate patterns may conflate genuine inherited syntactic material with cases of parallel development" (Pat-El 2020, p. 332) or, we may add, with cases of contact-induced convergence. Either or both of these syntactic patterns may have emerged in dialects of the Egypto-Sudanic area independently, or either or both may be the products of mutually influenced development through centuries of intra-regional contacts. In light of the historical and internally reconstructed data, however, neither appears to have been imported intact from Egypt to the greater Sudan with the onset of Arab settlement.

In terms of common Egypto-Sudanic features identified by this investigation which do in fact support such a narrative, we are subsequently left with a single linguistic trait: a proximal demonstrative paradigm on the model \*dā, dī, dōl, dēl. This commonality is a genuinely striking one, being both innovative and distinctive, and demonstrative pronouns are undoubtedly a substantial feature of relevance to any serious attempt at Arabic dialect classification (see Magidow 2013, 2016). Yet, most would agree that they do not, in isolation, provide a viable solitary basis for the formulation of such groupings. This remaining commonality is thereby rendered less proof positive of classificatory relationship and more enigmatic isogloss to be marked for future investigation in light of broader Arabic demonstrative typologies. The traditional Egypto-Sudanic classification of the Arabic dialectology literature, predicated on the nonlinguistic genealogical relatedness and shared migration history of the region's Arabic-speaking communities, is thus left roundly unsupported following focused linguistic review.

#### *4.2. Whence from Here? An Excursus in Historical Glottometry*

Rejection of the traditionally formulated, genealogy-based Egypto-Sudanic dialect classification at a macro-level does not, however, refute or diminish the multifarious and noteworthy dialectal commonalities linking and cross-cutting smaller subsets of Arabic varieties spoken in this region, in varying combinations. These isoglosses, and the linguistic relationships they identify, are real and significant, and merit further study and elaboration—more than can be accomplished in a single contribution, by a single researcher, or, perhaps, via a single perspective on the information at hand. In cases like the present one, in which a long-standing hypothesis has been determined to lack fit, a fresh view on existing data is often as essential, and as conducive to progress, as the gathering of new. Here, one such opportunity (among many) comes in the form of "Historical Glottometry," a novel approach to linguistic subgrouping recently elaborated by François and Kalyan (François 2014; Kalyan and François 2018).

Historical Glottometry was developed by its creators for application in scenarios in many ways analogous to the Egypto-Sudanic case described heretofore, in which the potential for "tree-like" relationships between once-unitary dialectal entities and "wave-like" patterns of convergence between previously more distinctive groups both loom large, and need both be considered in any comprehensive interpretation of the data. The method accomplishes this by integrating the key dialectological notion of the isogloss with the comparative method's focus on the common innovation, and labors to produce a diachronically interpretable measure of the relative strengths of multiple potential classificatory units revealed by analysis of a given dataset. Such an approach has been called for previously in the study of Arabic dialects (for a forcefully argued articulation, see Magidow 2017), and Historical Glottometry in particular has fruitfully filled this role in the examination of Boni dialect linkages (Elias 2019) and the Sogeram language family (Daniels et al. 2019), among others. I offer a preliminary application of the method here not as a route to a definitive classificatory model, but instead as an exploratory exercise into new views which may inform future analysis of the Egypto-Sudanic data, failing the identification of a meaningful macro-level relationship based in shared migration history. For example, review of the isoglosses presented in Section 3 offered numerous examples of two-way divisions separating northern dialects of the Egypto-Sudanic area from southern, but the precise positioning of isoglosses within this general pattern was observed to frequently shift on the basis of individual features, and to display a number of variable exponences. Can a technique like Historical Glottometry offer additional, informative perspective which might lead to clarity in the comprehension and description of cases like these?

The tradition of quantitative dialectometry of which Historical Glottometry is a part is not alien to the Arabic dialectological tradition (see Behnstedt and Woidich 2005, pp. 106–35, for discussion), yet similarly has not been widely embraced by the field's practitioners—for a host of valid critiques. As an analytical tool, Historical Glottometry joins these approaches in the effort to produce a linguistically meaningful yet condensed mathematical summation of data researchers "already know" (Daniels et al. 2019, p. 124) but which is copious and complex enough to defy ready intra-set comparability without transformation. Historical Glottometry accomplishes this via the production of two related values, each attending to a different aspect of linguistic classification generally agreed to hold significance in the field: "cohesiveness," a measure of the proportion of relevant isoglosses held in common by the members of a potential classificatory unit, and "subgroupiness," a measure of the number of isoglosses unique to the members of a proposed grouping. For a fully elaborated discussion of these measures' conception and justifications, see Kalyan and François (2018, pp. 68–71); to summarize, cohesiveness is calculated as the number of innovative isoglosses shared by all members of a proposed grouping divided by the total number of isoglosses attested by any member of the group, thus taking into account both the quantity of isoglosses supporting a group and those conflicting with it; subgroupiness is derived by multiplying a grouping's cohesiveness value by the number of exclusively shared isoglosses unique to the members of that group, thereby recognizing the importance of distinctiveness to most models of dialect classification while weighting the value of such features to reflect their position in broader dialectological context.
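The two measures summarized above can be written compactly. The notation here is an assumption introduced for illustration (the symbols $s_G$, $t_G$, and $e_G$ do not appear in the article), and it follows the prose summary literally rather than reproducing the full formulation in Kalyan and François (2018):

$$
\mathrm{cohesiveness}(G) = \frac{s_G}{t_G}, \qquad \mathrm{subgroupiness}(G) = \mathrm{cohesiveness}(G) \times e_G
$$

where, for a proposed grouping $G$, $s_G$ is the number of innovative isoglosses shared by all members of $G$, $t_G$ is the total number of isoglosses attested by any member of $G$, and $e_G$ is the number of isoglosses shared exclusively by the members of $G$. As a purely hypothetical worked case, a grouping with $s_G = 3$, $t_G = 4$, and $e_G = 2$ would have a cohesiveness of $3/4 = 0.75$ and a subgroupiness of $0.75 \times 2 = 1.5$.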

To apply this approach and calculate cohesiveness and subgroupiness scores for the array of dialect linkages attested by the Egypto-Sudanic data, I have accumulated the combined set of isoglosses considered in Section 3, focusing on those features which are clearly identifiable as innovative and which are attested in a minimum of two varieties, and determined their presence/absence in each of the six dialects sampled. This tabulation of 324 values (6 dialects × 54 isoglosses) is included in Appendix A. I then calculated cohesiveness and subgroupiness scores for each of the subgroupings attested in the collected data, summarized in Figure 1, below. Cohesiveness scores are shaded in black, subgroupiness scores in white. Acronyms identify the composition of each classificatory group supported in the data by at least one exclusively shared feature (e.g., CBQK is a group consisting of the dialects of Cairo, Qift, il-Biʿērāt and Khartoum; KSN those of Khartoum, the Šukriyya and Nigeria, etc.).
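The scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis actually run for this study: the isogloss labels and their distributions are invented toy data (the real values are those tabulated in Appendix A), the function names are hypothetical, and cohesiveness is computed exactly as summarized in the text rather than from the fuller formulation in Kalyan and François (2018).

```python
# Dialect codes follow the article's acronyms: Cairo, Qift, il-Bi'erat,
# Khartoum, Shukriyya, Nigeria. The fixed order is used only to build acronyms.
DIALECTS = "CQBKSN"

# HYPOTHETICAL toy data: isogloss label -> set of dialects attesting the innovation.
# These six entries are invented for illustration and are not the study's data.
ISOGLOSSES = {
    "iso1": {"C", "Q", "B"},
    "iso2": {"C", "Q", "B"},
    "iso3": {"K", "S", "N"},
    "iso4": {"K", "S"},
    "iso5": {"C", "Q", "B", "K"},
    "iso6": {"B", "N"},
}


def cohesiveness(group, isoglosses):
    """Isoglosses shared by ALL members / isoglosses attested by ANY member."""
    members = frozenset(group)
    shared = sum(1 for dist in isoglosses.values() if members <= dist)
    attested = sum(1 for dist in isoglosses.values() if members & dist)
    return shared / attested if attested else 0.0


def exclusively_shared(group, isoglosses):
    """Count isoglosses whose distribution is exactly the proposed group."""
    members = frozenset(group)
    return sum(1 for dist in isoglosses.values() if dist == members)


def subgroupiness(group, isoglosses):
    """Cohesiveness weighted by the number of exclusively shared isoglosses."""
    return cohesiveness(group, isoglosses) * exclusively_shared(group, isoglosses)


def acronym(dist):
    """Render a set of dialect codes as an acronym in the fixed DIALECTS order."""
    return "".join(code for code in DIALECTS if code in dist)


def attested_groups(isoglosses):
    """Groups supported by at least one exclusively shared isogloss."""
    names = {acronym(dist) for dist in isoglosses.values()}
    return sorted(names, key=lambda name: (len(name), name))


for name in attested_groups(ISOGLOSSES):
    print(name, round(cohesiveness(name, ISOGLOSSES), 2),
          round(subgroupiness(name, ISOGLOSSES), 2))
```

In keeping with the note to Table A1, an innovation would be entered in a dialect's set even where its occurrence there is variable. Representing each isogloss as a set of attesting dialects keeps the three counts (shared by all, attested by any, exclusively shared) down to simple subset, intersection, and equality tests.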

The first and most evident take-away from the Historical Glottometry analysis of the Egypto-Sudanic dialects is that two potential classificatory units stand out as particularly strong and "subgroupy": these are CQB and KSN, in other words, the three dialects of the Egyptian area taken as a group, and the three of the Sudanic. Not only do these respective sets of varieties share a meaningful proportion of their total features, but they also display a high number of exclusively shared features not identifiable outside the confines of the grouping (9 for each group). This geographical polarization of the dialect region, divided into groups representing the three northernmost and the three southernmost varieties of the sample, is replicated in the four-way groupings that emerge, which, with one (weaker) exception, consist of all three members of CQB or KSN in addition to one member of the other triad. The substantial diminution of both cohesiveness and subgroupiness incurred via such additions, though, reinforces the interpretation of the Egyptian/Sudanic split as a primary faultline in the data, rather than a single stage in a more gradual fading between northern and southern features. Indeed, turning to pairwise relationships, we similarly see that, excepting two linkages involving Nigerian, all other two-dialect groupings attested are internal to the CQB or KSN headings. Despite high cohesion, these are on the whole substantially weaker than either of the three-way groupings in terms of subgroupiness. Even the most significant pairing, KS, emerges as notably less strong than its superordinate KSN. These are key indications that the pan-Egyptian and pan-Sudanic dialect entities KSN and CQB represent are not illusory extracts of a gradated continuum, nor secondary linkages of core plus orbit, but rather demonstrable, classificatorily significant units across which multiple distinctive, innovative features obtain. Historical Glottometry, then, has offered incisive, actionable insight to be further pursued in reshaping understandings of what dialect classifications may succeed the macro-level Egypto-Sudanic hypothesis: a scenario under which an Egyptian and a Sudanic group, though sharing a few broad characteristics and more numerous partially cross-cutting trends, stand out as robustly and independently definable in the absence of overarching linkage.

**Figure 1.** Historical Glottometry Scores of Attested Egypto-Sudanic Dialect Subgroupings.

The exception to this pattern is, as mentioned, Nigerian, which the four-way grouping CQBN shows to pattern more closely to the body of Egyptian varieties than does any other Sudanic dialect sampled; the still stronger pairwise grouping BN shows this affinity to exist more precisely with the dialects of Upper Egypt, particularly those represented by Bʿēri. This finding is notable in light of the well-described linguistic and demographic linkages between Upper Egypt and the Western Sudanic area detailed by Owens (1993b, 2003), including isoglosses beyond those considered here and a set of thoroughly sketched population movements from north to south occurring most prominently in the years leading up to 1500. The significance is thus twofold, serving as: (a) corroboration (admittedly circumstantial) of Historical Glottometry's compatibility with otherwise-derived understandings of the region's linguistic interrelationships, and (b) a reminder that the impact of migration events and shared genealogical history is not to be ignored in the interpretation of linguistic classificatory relationships. This last, then, underlines the urgency of the question of how a linguistically meaningful Egypto-Sudanic classification at large, girded by similar nonlinguistic factors, could fail to emerge in our broader analysis.

#### **5. Conclusions**

Given that population movements and shared genealogical histories of speech communities can and do influence dialect development in meaningful ways, it is understandable that these factors were utilized as proxy for genuine linguistic data in initial postulations of a classificatory affinity between Egyptian and Sudanic varieties of Arabic. These were generated at a time when such data were not forthcoming, and much of both areas remained dialectological *terra incognita* to Western Arabist scholarship. The window of usefulness for such stand-ins, though, is past. Existing descriptive works treating dialects of the Egypto-Sudanic region, though limited, are shown here to be sufficient to transition beyond this stage to engage in genuine linguistic evaluation of at least a subset of the varieties in question: their similarities, their differences, and their interrelationships. Yet, until this point, such has not been attempted in more than cursory fashion. Instead, once-preliminary assumptions based on nonlinguistic details were carried forward as received linguistic interpretation—fed by confirmation bias to the casual dialectological observer in the form of salient shared retentions, participation of dialects in broad areal trends, and instances of convergence via matter- and pattern-based borrowing or, conceivably, parallel development.

The present inquiry has demonstrated that, when faced with concerted linguistic investigation, little meaningful support can be found for the proposal that contemporary dialects of the Egyptian and Sudanic zones together constitute a viable classificatory grouping that reflects a common linguistic input carried by founding migrations of Arabic-speaking populations from the former region to the latter. The study is not without its limitations (its relatively narrow sampling and inattention to lexical variables, to start) but regardless has advanced a fairly unambiguous conclusion: that the historical demographic and genealogical ties seen to bind the area's Arabic-speaking communities in human relation to one another do not similarly define the relationships of those communities' dialects. Instead, these appear to pattern in discrete Egyptian and Sudanic blocs without significant superordinate connection, as occasionally disrupted by point-specific linkages and recent convergences contravening their general independence.

How is this contradiction between two dimensions of connectedness, the demographic-historical and the linguistic, to be reconciled? The first response of many will, perhaps, be to question the veracity of one set of understandings or the other. The linguistic findings of this study are, of course, not beyond reproach, and room similarly exists to interrogate historical conceptions of the Arabicization of Egypt and the Sudan from initial Muslim conquests (Booth 2013) to consolidation under the early Caliphate (Power 2012) to southward migrations of the medieval period (Spaulding 2000). But prior to (or, perhaps, in conjunction with) such, I would call for a pause. As dialectologists, we should not miss the opportunity to reflect on the assumptions and theoretical stances that have led us to such a conflicting position, and to ask whether the more fruitful questioning is that of the data or that of the frames through which we are wont to interpret it.

Much remains unknown about today's Arabic dialects' collective linguistic past, and much of that unknown is undoubtedly relevant to the sound comprehension and interpretation of their dialectological present. We must not, however, allow pursuit of those unknowns to become a preoccupation that unduly limits our imagination of what dialect classification strives to describe, or how the linguistic reality it represents enters into being. The amount to be learned from painstaking and revelatory excavation of the dialectal foundation laid by the earliest and subsequent waves of Arab migration and expansion is enormous—but it will never constitute a complete account. Arabic's arrival and establishment beyond its pre-Islamic environs via the physical movement of peoples is an obvious, massive watershed; all the same, myopic focus on the legacy of this era risks an artificial confidence that "by the 10th century [or perhaps, in the Egypto-Sudanic case, the fourteenth] dialectal areas were already shaped" (Abboud-Haggar 2006, p. 620).

The linguistic traces of past movements and demographic linkages are often long-lasting and significant, but they are not guaranteed to be present, nor are they, when present, indelible. Contemporary sociolinguistic scholarship (Trudgill 1986; Al-Wer 2007) has repeatedly shown language use in the wake of demographic upheaval to be highly variable and diffuse, often so much so as to defy stable dialectological description. Similar states have been demonstrated for the Caribbean Englishes of Le Page and Tabouret-Keller (1985) and, at greater time-depth, in the case of Indo-European and Proto-Greek (Garrett 2006), to the extent that the *prima materia* of future dialect formations is reduced to classificatory nondistinctness. Moreover, as much as a dialect linkage is a product of its input, it is in equal or greater proportion an emergent entity which manifests over time, the earlier connectivities and commonalities shaped by its social and interactional past fully prone to being remolded and over-written, or occasionally, as we may be witnessing in the instances of syntactic convergence covered above, created anew as speakers' present dictates. As Behnstedt and Woidich remind their colleagues following discussion of the development of the Egyptian dialect area, "[in] the historical evaluation of Arabic dialect phenomena, one cannot always assume that a feature was introduced from the original home of the speakers and implanted somewhere. One should also entertain the possibility that a given feature is the result of dialect mixing and dialect contact which eventually led to new dialects and new dialect areas" (Behnstedt and Woidich 2018, p. 95). My hope is that the present investigation of Egyptian and Sudanic Arabics serves to answer and emphasize this timely and pressing call.

**Funding:** This material is based in part upon research supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1110007. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available in the study body and Appendix A.

**Acknowledgments:** I am grateful to the organizers and attendees of the 11th Conference of the Association Internationale de Dialectologie Arabe for comments on an early stage of this work, as well as to Rama Hamarneh for valuable feedback on the study in its current form.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Appendix A**

**Table A1.** Values for Historical Glottometry Analysis of Arabic Dialects of the Egypto-Sudanic Region.




<sup>1</sup> As necessary, innovations have been reformulated from their in-text descriptions to match Historical Glottometry's sole focus on innovations rather than retentions; only those innovations attested in 2+ varieties are listed, and an innovation is considered present in a given dialect even if its occurrence there is variable. Innovations are presented in the order they are discussed in the article text.

#### **Notes**


<sup>4</sup> Sporadic \*/ð/ (> \*/ð̣/?) > /ḍ/ in Sudanic varieties, but not Egyptian ones, also indicates that more general loss of interdentals in those dialects likely post-dates arrival of their speakers to Sudanic territory.

#### **References**


Hillelson, Sigmar. 1935. *Sudan Arabic Texts: With Translation and Glossary*. Cambridge: Cambridge University Press.


Le Page, Robert Brock, and Andrée Tabouret-Keller. 1985. *Acts of Identity: Creole-Based Approaches to Language and Ethnicity*. Cambridge: Cambridge University Press.

Leddy-Cecere, Thomas. 2018. Contact-Induced Grammaticalization as an Impetus for Arabic Dialect Development. Doctoral dissertation, University of Texas at Austin, Austin, TX, USA.


MacMichael, Harold Alfred. 1922. *A History of the Arabs in the Sudan: And Some Account of the People Who Preceded Them and of the Tribes Inhabiting Dárfūr, vol. I*. Cambridge: Cambridge University Press.

Magidow, Alexander. 2013. Towards a Sociohistorical Reconstruction of Pre-Islamic Arabic Dialect Diversity. Doctoral dissertation, University of Texas at Austin, Austin, TX, USA.

Magidow, Alexander. 2016. Diachronic Dialect Classification with Demonstratives. *Al-ʿArabiyya* 49: 91–115.


Miller, Catherine. 2005. Between Accommodation and Resistance: Upper Egyptian Migrants in Cairo. *Linguistics* 43: 903–56.

Miller, Catherine, and Al-Amin Abu-Manga. 1992. *Language Change and National Integration: Rural Migrants in Khartoum*. Reading and Khartoum: Garnet-Khartoum University Press.

Nishio, Tetsuo. 1995. Characteristics of the Arabic Dialect of Qifṭ (Upper Egypt). *Journal of Asian and African Studies* 48–49: 173–219.


Procházka, Stephan. 2014. Feminine and Masculine Plural Pronouns in Modern Arabic Dialects. In *From Tur Abdin to Hadramawt: Semitic Studies; Festschrift in Honour of Bo Isaksson on the Occasion of His Retirement*. Edited by Tal Davidovich, Ablahad Lahdo and Torkel Lindquist. Wiesbaden: Harrassowitz, pp. 129–48.

Reichmuth, Stefan. 1983. *Der arabische Dialekt der Šukriyya im Ostsudan*. Hildesheim: Olms.


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Languages* Editorial Office E-mail: languages@mdpi.com www.mdpi.com/journal/languages
