A Quantitative Approach to Microvariation: Negative Marking in Central Romance

Pescarini, Diego

doi:10.3390/languages7020087

by Diego Pescarini ¹,²

¹ Bases, Corpus, Langage (BCL), Université Côte d’Azur, 06108 Nice, France
² Bases, Corpus, Langage (BCL), Centre National de la Recherche Scientifique (CNRS), 75016 Paris, France

Languages 2022, 7(2), 87; https://doi.org/10.3390/languages7020087

Submission received: 17 October 2021 / Revised: 1 February 2022 / Accepted: 25 February 2022 / Published: 6 April 2022

(This article belongs to the Special Issue Double-Negation and (Negative) Polarity Phenomena in the Romance Languages and Their Dialects)

Abstract: This work presents an exploratory data analysis of the syntactic distribution of pre- and postverbal negation (N1 and N2) in a corpus of data gathered from two linguistic atlases, the Linguistic Atlas of France (ALF) and the Italo-Swiss Atlas (AIS). Metadata concerning the distribution of N1 and N2 across dialects and syntactic contexts are analyzed with the R package Rbrul. Multiple logistic regression allows us to assess how independent variables affect the presence/absence of N1/N2. Geographical and grammatical factors are examined; the latter concern mainly clause typing and negative concord, i.e., the co-occurrence of clausal negation and a negative word. The data from the two atlases are first analyzed separately and eventually merged in order to strengthen the statistical significance. Both geographical and grammatical factors prove to be significant. In particular, the preliminary findings show that N1 is more likely retained in sentences containing another negative word, the incidence of N1 varies according to the type of co-occurring negative word, and veridicality has a mild effect on N2 but not N1.

Keywords: syntax; negation; negative concord; microvariation; Romance

1. Introduction

Languages exhibit a natural tendency to evolve from preverbal to postverbal negative marking through a stage of discontinuous negation in which pre- and postverbal markers co-occur (Jespersen 1917). To avoid misleading terminology, the descriptive terms “preverbal negation” and “postverbal negation” are here replaced by the conventional labels N1 and N2, which are defined as follows:

N1 usually derives from the Latin non (Fr. ne, It. non) and co-occurs with negative polarity items (NPIs), giving rise to negative concord configurations; it is usually placed preverbally and, in finite clauses, is proclitic to the finite verb (only other clitics can occur between N1 and the verb).
N2 derives from various kinds of elements, such as nouns denoting a minimal quantity (dubbed minimizers, e.g., pas, “step”, “point”; mie/mica/brisa, “crumb”; etc.), or, to a lesser extent, negative quantifiers and polarity particles. It sometimes co-occurs with NPIs (Fr. je ne mange (*pas) rien, “I eat nothing”), and it normally occurs postverbally in finite clauses.

The position of N2 is subject to cross-linguistic and cross-contextual variation. Despite usually being referred to as postverbal negation, N2 can occur preverbally in infinitives, as in (1) below, or imperatives, as in (2), or can be focus-fronted under specific pragmatic conditions, in particular in languages such as standard Italian, as in (3), in which N2 is in an early stage of grammaticalization (Pescarini and Penello 2012).1

(1)	a.	Je	ne	bouge	pas	mon	bras. (Fr.)
		I=	N1	move.1sg	N2	my	arm
		“I cannot move my arm.”
	b.	pour	ne	pas	bouger	mon	bras
		to	N1	N2	move.inf	my	arm
		“(in order to) not to move my arm”

(2)	a.	eu̯̯	kiˈpeːʃ	bek (AIS: map 1658; Ems, Rhaeto-Romance)
		I	understand.1sg	N2
		“I do not understand.”
	b.	ˈbeːkɐ	sɐˈmwεntɐ	(AIS: map 1647; Ems, Rhaeto-Romance)
		N2	move.2pl
		“don’t move”

(3)	a.	Non	lo	mangio	mica. (Italian)
		N1	it=	eat.1sg	N2
	b.	Mica	lo	mangio!
		N2	it=	eat.1sg
		“I do not eat it/I am not eating it.”

Postverbally, N2 can be preceded by a past participle and various classes of aspectual adverbs that in Romance usually follow the inflected verb (Cinque 1999). The number and type of adverbs that precede negation are subject to cross-linguistic variation (Zanuttini 1997; Manzini and Savoia 2005, 2012). Most N2s precede complements, except the negator corresponding to the polarity particle “no”, which in northern Italian dialects, such as Milanese, always occurs in sentence-final position (e.g., Mil. [ˈdɔrmi nɔ] “I do not sleep”). Despite compelling syntactic evidence that N2s do not form a homogeneous class, all instantiations of N2 are treated in this paper as members of a single, uniform class. The paper therefore focuses exclusively on the presence/absence of N1/N2, regardless of the position of the latter in the structure of the clause and regardless of its etymon. Variations in the syntax (and morphology) of different types of N2s will be addressed in a separate work.

The goal of this paper is twofold:

(a): It intends to assess the feasibility of multivariate analysis of syntactic variables in a sample of data collected almost a century ago for dialectological purposes.
(b): It aims to verify whether grammatical factors such as clause typing and negative concord play any role in the cross-linguistic and cross-contextual distribution of N1/N2.

A statistical analysis of microvariation can shed light on the internal mechanisms that rule the distribution of N1 and N2 across space (i.e., across linguistic systems) and time. Jespersen (1917) first described the tendency to replace N1 with N2 (a tendency that has lately been dubbed “Jespersen’s cycle”) and argued that the change is triggered by a weakening of N1 associated with a progressive grammaticalization of certain NPIs (minimizers, quantifiers, pro-sentences). Originally, the distribution of emergent N2s was constrained by pragmatic conditions: the presence of a double negator tends to trigger the activation of an inferential link between the negative clause and a discourse-old proposition (Schwenter 2005; Larrivée 2008). Given their pragmatic value, N2s in statu nascendi are often ruled out in factive sentences, which are already presuppositionally linked (Cinque [1976] 1991). When this pragmatically marked value is lost, however, such elements become fully fledged N2s, i.e., exponents of the default clausal negation (Larrivée 2008), until N1 is eventually lost (see Gelderen (2008) and Breitbarth (2013) for formal analyses).

This paper does not deal with change across time; it investigates how the syntax of N1 and N2 varies across space. To do so, I conducted an exploratory data analysis of 22,500 negative sentences contained in the Atlas linguistique de la France (ALF; Gilliéron and Edmont 1902–1910) and the Sprach- und Sachatlas Italiens und der Südschweiz (AIS; Jaberg and Jud 1928–1940). These atlases, which have been (partially) digitized in recent times, contain parallel translations of sentences, phrases, and isolated words elicited in, respectively, 639 and 407 datapoints across France, Belgium, Italy, southern Switzerland, and nearby regions. All data were gathered by interviewing a single NORM (non-mobile, old rural man) speaker per dialect following a written questionnaire. The ALF and AIS questionnaires include 46 negative sentences, 24 in Italian for the AIS and 22 in French for the ALF (more on this in Section 2).

The geolinguistic distribution of N1 and N2 is quite regular, as shown in Figure 1, which overviews the distribution of dialects that exhibit N1 (purple dots), N1 and N2 (green dots), and N2 (yellow dots) among the 1046 ALF/AIS datapoints.

The map in Figure 1 plots the data from one map from the AIS and one from the ALF, both reporting translations of negative declarative clauses. However, if variation across sentences (i.e., across AIS/ALF maps) is examined, cross-linguistic variation appears to be more nuanced, as shown in Figure 2. While Figure 1 shows the distribution of N1 and N2 in indicative clauses, Figure 2 was generated by superimposing the data from various maps, e.g., those reporting data on imperatives, questions, “if” clauses, etc.

The light green and dark yellow points in Figure 2 indicate that in certain datapoints, the distribution of N1 and N2 varies across syntactic environments. Therefore, besides geolinguistic variation, we need to account for other kinds of variation that may be conditioned by either external (sociolinguistic) or internal factors.

Several works have focused on sociolinguistic variation in negation marking, especially in French. Ashby’s (1981) seminal paper, for instance, examined a corpus of over 100 interviews in colloquial French recorded in the Tours region, showing that the occurrence of N1 varies according to a complex set of linguistic, stylistic, and social factors, including discourse setting (formal/informal), age, social class, etc. The AIS/ALF dataset, however, is not amenable to a sociolinguistic analysis because, as previously mentioned, both AIS and ALF interviews were conducted with a single NORM speaker per dialect and with a standardized questionnaire in order to minimize diastratic and diaphasic variation. Nonetheless, since various kinds of negative clauses are contained in both AIS and ALF questionnaires, we can try to compare the occurrence of N1 and N2 in different syntactic environments, including some of those that, according to previous studies such as Ashby (1981) and Pescarini and Donzelli (2017), may affect the behavior of N1 and N2.

Ashby’s study also showed that the incidence of N1 is affected by internal (i.e., grammatical) factors: N1 is found less frequently in sentences containing an NPI, in embedded clauses (especially in subjunctive clauses), and when a subject clitic is present. Similar results are found when, instead of comparing speakers of a single variety, we compare data from nearby dialects. Pescarini and Donzelli (2017), for instance, found that variation in a sample of AIS data from Ticinese dialects (Lombard-type dialects spoken in southern Switzerland) appeared to be related to syntactic factors such as mood or the adjunct/argument status of NPIs. Pescarini and Donzelli’s (2017) conclusions need to be confirmed with a bigger dataset, such as the one examined in the present work.

To find out whether cross-linguistic and cross-contextual variations correlate with grammatical factors, I performed a multivariate analysis of the AIS/ALF dataset with the intent of verifying whether the distribution of the two dependent variables, namely the incidence of N1 and N2, is linked to other syntactic factors or not. The main concern with this methodology is that syntactic factors are intertwined, as sentences are often characterized by multiple grammatical factors at the same time. This holds particularly true for structured corpora, such as linguistic atlases, which were not conceived for the purpose of syntactic analysis. Hence, since the inventory of input sentences is limited while the spectrum of syntactic parameters is relatively wide, the factors coded in corpora such as the AIS/ALF tend to be heavily collinear. Due to this collinearity, the role of each predictor is difficult to ascertain, even if the overall number of tokens (i.e., dialect clauses) is relatively high.

Furthermore, syntactic factors are intertwined with geographical factors; by performing multiple regression, I tried to ascertain whether grammatical factors are significant or, alternatively, whether the geographical distribution of dialects (i.e., that dialects belong to independently established groups) suffices to account for microvariations.

The paper is structured as follows: Section 2 provides an overview of sources, data, metadata, and methods. Section 3 and Section 4 present some early results of the multivariate analysis of AIS and ALF data, respectively. In Section 5, the results regarding the two areas are compared and then combined to strengthen their statistical significance. Section 6 presents the conclusion.

2. Materials and Methods

The dataset (Pescarini 2022) can be downloaded from the Zenodo repository: https://zenodo.org/record/5820466#.Yfj1t_jjJc8 (accessed on 16 February 2021). Table 1 reports the number of negative sentences contained in each atlas.

The primary data contained in the AIS and ALF were collected and transcribed almost a century ago. The mapping from the primary AIS/ALF data to the metadata contained in my dataset (Pescarini 2022) was carried out manually in order to ensure accuracy. Some data, however, remained unclear and were therefore excluded when certain factors or variables were being tested. For instance, in examining the distribution of N1, I focused on a subset of 19,432 sentences, excluding various examples that contained other preverbal nasal formatives, such as first person plural subject clitics, impersonal clitics deriving from the Latin homo (“man”), and partitive/genitive clitics, e.g., ALF 97: (Des pommes) nous n’en aurons (guère). All of these elements have a nasal formative that can be easily mistaken for a preverbal negation marker, and if a single nasal segment occurred in the transcription, annotators were not always able to conclude whether that segment was a negation formative or not.

Table 2 shows how negation systems (N1, discontinuous: N1 + N2, N2) are distributed in the two datasets (sentences containing NPIs are excluded because N2 is often in complementary distribution with NPIs). The examples from the AIS have been organized into two subsets, those collected in northwestern regions vs. the others (more on this below).

For the sake of consistency, in this paper I focus on a subset of the AIS data, those languages spoken in northwestern regions (Piedmont, Lombardy, the Aosta Valley, Liguria, Emilia–Romagna, and Trentino), thus excluding most dialects that exhibit only N1. Northwestern dialects are directly comparable to what we found in Gallo-Romance. However, even if we focus only on Italo-Romance dialects, the co-occurrence of N1 and N2 in the two atlases remains uneven because patterns of discontinuous negation are quite frequent in the ALF, as evidenced by the greenish dots in Figure 2, but relatively rare in the AIS.

The data reported in the AIS and the ALF were collected following a well-established fieldwork methodology in dialectology, questionnaire-based interviews with NORM (non-mobile, old rural men) informants, carried out in the late 19th and early 20th century (for the ALF and AIS, respectively). Interviews were conducted by experienced linguists who were specifically selected and trained for the purposes of the project. All data were transcribed on the spot using ad hoc phonetic alphabets.

The AIS/ALF questionnaires consist of thousands of items (words, phrases, and sentences). Each interview took several hours, usually divided into multiple sessions over three days. The interviewers therefore had the opportunity to interact with their informants and with other members of the community for a relatively long period of time. By cross-checking the elicited material with spontaneous speech (and with data from nearby dialects surveyed during the same campaign), AIS/ALF interviewers were able to provide a faithful picture of each dialect.

In principle, mixed methodologies (Poletto and Cornips 2005) would probably elicit better results, in particular with respect to phenomena such as negation marking, which, as mentioned in Section 1, is probably subject to sociolinguistic, stylistic, and contextual variation. However, with atlases such as the AIS and ALF, one must “make the best of … incomplete data” (Garzonio and Poletto 2018) and accept the limits of tools designed more than a century ago. A quantitative perspective, in my opinion, is the safest way to revive the data contained in atlases such as the AIS and ALF by limiting possible biases that are intrinsic to traditional dialectological enterprises.

The data contained in the above sources were first mapped into discrete (binary) variables and factors, which were organized on a single spreadsheet. Supervised annotators worked mainly on the original maps of the AIS and ALF, which had already been digitized and can be freely downloaded or consulted online. By looking at areal distributions, annotators had a better chance to understand the phenomena under study, provide a more precise segmentation of the transcribed material (sometimes phonetic transcriptions can blur morphosyntactic structure), and, in the end, ensure a more correct annotation of the data. Each annotation was cross-checked by another annotator and eventually by myself.

We obtained a single spreadsheet containing three kinds of metadata:

Source: AIS/ALF, sentence and datapoint identifiers, geographical coordinates, region/province;
Dependent variables: presence/absence of N1 and N2;
Factors (independent variables): e.g., clause type, mood, and presence and type of co-occurring NPI.

Clause typing was coded by using descriptive labels such as “indicative”, “question”, or “modal”. I believe that such theory-neutral terms suffice for an exploratory analysis such as the one carried out in this work, whereas a finer and more abstract taxonomy is probably needed for hypothesis testing. Declarative indicative present tense clauses were considered as the baseline; the remaining sentences were tagged to signal that they differed from the baseline with respect to one or more features. If a sentence differed from the baseline with respect to two or more features (e.g., with respect to tense and force), the annotators were asked to choose the highest one on the scale force > mood/modality > tense/aspect. For instance, all interrogatives were tagged as “question” regardless of their specific tense/aspect/mood. A screenshot of the dataset (which can be freely downloaded from https://zenodo.org/record/5820466#.Yfj1t_jjJc8, accessed on 16 February 2021) is shown in Figure 3.
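The tagging rule described above can be sketched in a few lines of Python. This is only an illustration of the scale force > mood/modality > tense/aspect: the feature names, the "past" tag, and the use of "indicative" as the baseline tag are assumptions made for the example and need not match the labels in the actual spreadsheet.

```python
# Hypothetical feature names, ordered from highest to lowest on the scale
# force > mood/modality > tense/aspect.
PRIORITY = ("force", "mood_modality", "tense_aspect")

# Assumed tag for the baseline (declarative indicative present tense clauses).
BASELINE_TAG = "indicative"


def clause_tag(deviations):
    """Return the tag of the highest-ranked feature that differs from the baseline.

    `deviations` maps a feature name to its descriptive tag and lists only
    the features on which the sentence departs from the baseline.
    """
    for feature in PRIORITY:
        if feature in deviations:
            return deviations[feature]
    return BASELINE_TAG


# A past-tense interrogative departs from the baseline in force and tense;
# force ranks higher, so the sentence is tagged as a question.
print(clause_tag({"tense_aspect": "past", "force": "question"}))
```

A sentence with no deviations falls back to the baseline tag, which mirrors the idea that only departures from the baseline are tagged.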

The statistical analysis was carried out using the R package Rbrul (Johnson 2009). The software performs multiple logistic regression in order to assess how independent variables affect the distribution of dependent variables, in this case the incidence of N1 and N2. Rbrul is the latest descendant of the family of variable rule programs that sociolinguists have been using extensively since the early 1970s to evaluate the effects of social and linguistic factors on binary variables. Rbrul, like its predecessors, identifies which groups of factors significantly affect the chosen dependent variable, and for each group of factors (e.g., age, clause type, phonological context), it weights to what degree specific factors affect the distribution of the dependent variable. Since Rbrul cannot handle polychotomous variables, each possible negation pattern (N1, N1 + N2, N2) was reduced to a combination of two binary variables (N1 = 1/0, N2 = 1/0).
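The reduction of the three negation patterns to two binary variables can be illustrated as follows. The pattern labels and function names are hypothetical and chosen only for this sketch; the dataset itself may encode the patterns differently.

```python
# Each negation pattern corresponds to one combination of two binary variables.
# The string labels are illustrative, not the dataset's own codes.
PATTERN_TO_BINARY = {
    "N1": (1, 0),     # preverbal negation only
    "N1+N2": (1, 1),  # discontinuous negation
    "N2": (0, 1),     # postverbal negation only
}


def encode(pattern):
    """Split a three-way negation pattern into the binary variables N1 and N2."""
    n1, n2 = PATTERN_TO_BINARY[pattern]
    return {"N1": n1, "N2": n2}


def decode(n1, n2):
    """Recover the pattern label from the two binary variables."""
    for pattern, values in PATTERN_TO_BINARY.items():
        if values == (n1, n2):
            return pattern
    raise ValueError("a negative clause must contain N1, N2, or both")


print(encode("N1+N2"))
```

With this encoding, each binary variable can serve as the dependent variable of its own logistic regression.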

Section 3, Section 4 and Section 5 examine the effects of two types of factors on the distribution of each dependent variable (N1 or N2): an external factor such as region (i.e., where a dialect is spoken) and one or more grammatical factors. With the help of the Rbrul package, I first ascertained whether models with multiple factors (geographic and grammatical) fit the data better than models with “region” as a single factor. Second, I tried to establish, for each group of grammatical factors, which correlated better with the presence of N1 or N2.
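One standard way to decide whether a model with geographic and grammatical factors fits better than a nested model with “region” alone is a likelihood-ratio test. The sketch below assumes that the log-likelihoods of the two fitted models are already available; whether Rbrul performs the comparison in exactly this way is not stated here, and the numerical values in the usage example are invented for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: df = 2 (exponential) or df = 1 (erfc).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Compare a reduced model (e.g., region only) with a fuller nested model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, extra_params)


# Invented log-likelihoods: region only vs. region + clause type,
# where clause type adds four parameters to the model.
stat, p_value = likelihood_ratio_test(-520.0, -512.0, 4)
print(stat, p_value)
```

A small p-value would indicate that adding the grammatical factor improves the fit beyond what region alone achieves.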

As for grammatical factors, previous qualitative analyses of Romance dialects provide insightful data about how negation marking interacts with other syntactic properties (Zanuttini 1997; Manzini and Savoia 2005, vol. III, pp. 130–338; Garzonio 2008; Garzonio and Poletto 2018; Dagnac 2015; Palasis 2015; Pescarini and Donzelli 2017; Poletto and Olivièri 2018; Guilliot and Becerra-Zita 2019). Even though many studies have been carried out on the topic in recent times, “there are still too many unidentified factors that might play a role in the doubling mechanism” (Poletto 2016, p. 837).

The present work, like most of the previous literature, concentrates on two main factors:

1. Clause typing. As mentioned in Section 1, most N2s derive from NPIs, such as minimizers, and were originally confined to specific pragmatic contexts. The syntax of N2 (and its interaction with N1) in present-day dialects may therefore reflect the original conditions in which N2 underwent grammaticalization. For instance, N2 (like minimizers in general) is expected to occur less frequently in factive environments (Cinque [1976] 1991), or (like minimizers) to occur without N1 in nonveridical contexts, such as questions, imperatives, if clauses, etc. (Giannakidou 1998). An exploratory data analysis can therefore provide some preliminary indications regarding the possible role of semantic factors, such as factivity and veridicality, that can be confirmed by successive hypothesis testing.
2. Negative concord, i.e., the interaction between N1/N2 and NPIs, such as negative quantifiers (anything/nothing), adverbs ((n)ever, yet), and coordinators (Fr. ni … ni). N2 is often in complementary distribution with NPIs, but at some datapoints, we find instances of negative concord between N2 and an NPI (Dagnac and Burnett 2016) and, more rarely, between NPIs and discontinuous clausal negation (i.e., N1 + N2).

Other factors seem to play a role in the distribution of N1 and N2, but for feasibility reasons, they cannot be properly addressed here, given the limits of the AIS and ALF datasets. However, it is worth mentioning some of the other factors that have been proposed in the literature.

Poletto (2016, pp. 842–43) argued that inner aspect may have played a role in the diachronic change that turned negative quantifiers into N2s, as in Piedmontese dialects; in fact, in other northern Italo-Romance dialects, where this change has not yet happened, negative quantifiers can function as emphatic N2s only with activity verbs. Focus might have played a role in the grammaticalization of N2s derived from polar particles. Other factors that may correlate with the distribution of N1 and N2 are the presence of subject clitics (Ashby 1981, a.o.) or indefinite objects introduced by the preposition de. As for subject clitics, the topic has so many ramifications that a separate paper is needed (this work is limited to verifying whether person and number agreement has some effect on the distribution of N1 in the AIS). As for indefinite objects, Garzonio and Poletto (2018, p. 13) hypothesized that indefinites are introduced by de in dialects exhibiting N2 (see also Manzini and Savoia 2005, pp. 280–85), but the AIS and ALF data cannot shed light on this. Finally, imperative clauses deserve special attention. As Zanuttini (1997) pointed out, negated imperatives tend to be suppletive in dialects lacking N2. Imperatives, however, still need to be encoded in the dataset. Therefore, issues other than those listed above under 1 and 2 cannot be properly addressed in this work, which focuses on one external factor (region) and two grammatical factors (clause typing and type of NPI). The effect of person/number is briefly discussed in Section 3. The list of factors and examples of specific tags/values are given in Table 3.

3. Negation Marking in Italo-Romance (AIS)

As shown in Figure 1, N2 is attested in northwestern regions such as the Aosta Valley, Piedmont, Lombardy, southern Switzerland (where both Lombard and Rhaeto-Romance dialects are spoken), Emilia–Romagna, and two southern datapoints where northern communities immigrated in the Middle Ages. Some instances of N2 are attested in Trentino and Liguria, whereas in the eastern regions, such as Friuli and Veneto, as with all central and southern varieties, there is a (predominant) N1 system. I therefore focused on the 191 AIS datapoints that belong to the former group of regions in order to obtain more robust statistical results (the relevant subset of dialects can be selected by using the “northwestern” tag in the dataset). The following two subsections focus on N1 and N2, respectively.

3.1. N1 (AIS)

As mentioned in Section 2, the sample of northern Italo-Romance dialects exhibiting discontinuous negation is quite narrow. A second issue with the AIS dataset (in fact, an issue with all linguistic atlases) is collinearity, i.e., too many tokens are elicited by using the same input sentence, and therefore exhibit the same syntactic properties. For instance, all sentences containing a certain NPI are also subjunctive clauses and vice versa. In this condition, logistic regression provides no reliable result, because the algorithm cannot dissociate the two factors at play.
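The kind of collinearity described above, where two factors always co-vary because they stem from the same input sentence, can be detected with a simple cross-tabulation before any model is fitted. The following sketch uses invented column names and toy rows; it only checks the extreme case in which two factors are perfectly confounded.

```python
from collections import defaultdict


def perfectly_confounded(rows, factor_a, factor_b):
    """Return True if every value of one factor pairs with exactly one value
    of the other, in both directions, so that regression cannot separate them."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for row in rows:
        a_to_b[row[factor_a]].add(row[factor_b])
        b_to_a[row[factor_b]].add(row[factor_a])
    one_to_one_ab = all(len(values) == 1 for values in a_to_b.values())
    one_to_one_ba = all(len(values) == 1 for values in b_to_a.values())
    return one_to_one_ab and one_to_one_ba


# Toy rows (hypothetical tags): the NPI "anything" only ever occurs in
# subjunctive clauses and vice versa, so the two factors are inseparable.
toy_rows = [
    {"npi": "anything", "clause": "subjunctive"},
    {"npi": "anything", "clause": "subjunctive"},
    {"npi": "none", "clause": "indicative"},
]
print(perfectly_confounded(toy_rows, "npi", "clause"))
```

When the check succeeds for a pair of factors, testing them on separate subsets of the data is one way to limit the problem.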

To avoid collinearity (or at least limit its effect), I decided to test two separate subsets of data, as shown in Table 4: (a) indicative clauses containing NPIs and (b) all clauses containing no NPIs. In the remaining clauses, in fact, the presence/absence of N1 might be conditioned either by the presence of an NPI or by a nonveridical operator, such as a modal verb.

For each subset of data, I tested mixed models containing both geolinguistic and grammatical factors to find out whether region alone accounted for the data or mixed models including external and grammatical factors have better predictive power. I found that region is always a significant factor, as expected in a dataset containing data from tightly related languages. For this reason, the results concerning the factor region will be systematically omitted from now on. In general, however, I found that region is seldom sufficient to account for the distribution of N1 and N2, and mixed models including syntactic factors usually fit the data best. This preliminary conclusion confirms the results of previous statistical analyses of microvariation, such as those of van Craenenbroeck et al. (2019).

First of all, I tested whether the factors region and clause are significant predictors of the distribution of N1 in the subset of tokens that do not contain NPIs. Rbrul found that both factors are statistically significant (region p = 2.3 × 10⁻²⁴⁹; clause p = 0.00053). The weight of each value of clause is given in Table 5. All of the following tables are organized in the same way: for each group of factors, they report the number of tokens (i.e., the number of negative clauses per type) and the relative frequency of N1 in each type of clause (the number of clauses in which N1 is present divided by the number of clauses with available data), and the last column reports an index ranging from 0 to 1 indicating the probability that N1 will occur in a given type of sentence. Notice that probability may differ from frequency, because the former is calculated by taking into account all types of factors, e.g., both clause and region. By excluding and including factors, Rbrul tests different scenarios until it finds the model that fits the data best (the winning model can be the one without factors) and eventually weights the single factors in the distribution of the dependent variable.
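The first two columns of such tables (number of tokens and relative frequency of N1) can be computed directly from the annotated tokens, as in the sketch below. The field names and the toy data are assumptions made for illustration; unclear tokens are represented by `None` and are left out of the denominator, in line with the definition of frequency given above. The probability column, by contrast, comes from the fitted regression model and is not reproduced here.

```python
from collections import Counter


def n1_frequency_by_level(tokens, factor):
    """Map each level of `factor` to (number of tokens, relative frequency of N1).

    Tokens whose N1 value is None (no available data) are skipped.
    """
    tokens_per_level = Counter()
    n1_per_level = Counter()
    for token in tokens:
        if token["N1"] is None:
            continue
        level = token[factor]
        tokens_per_level[level] += 1
        n1_per_level[level] += token["N1"]
    return {
        level: (count, n1_per_level[level] / count)
        for level, count in tokens_per_level.items()
    }


# Invented tokens, not taken from the AIS/ALF dataset.
toy_tokens = [
    {"clause": "subjunctive", "N1": 1},
    {"clause": "subjunctive", "N1": 1},
    {"clause": "imperative", "N1": 0},
    {"clause": "imperative", "N1": 1},
    {"clause": "imperative", "N1": None},
]
print(n1_frequency_by_level(toy_tokens, "clause"))
```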

Table 5 shows that (embedded) subjunctive clauses are the environment in which N1 is most probably found, whereas imperatives are the context with the lowest incidence of N1.

Imperatives, however, deserve further attention (and probably a separate analysis), because there is a great deal of variation in the ways that negative imperatives are syntactically encoded. In most dialects, negative imperatives are not obtained by adding a negative marker to the positive imperative form, but are instead expressed by a periphrasis with the verb “stay” (e.g., Ven. No sta partir, “do not leave”, lit. “not stay to leave”), a subjunctive form, or an infinitive. Zanuttini (1997, pp. 105–7) claimed that suppletive imperatives are found in dialects without N2, while in dialects in which N2 is available, negative and positive imperatives may have the same form (see Garzonio and Poletto (2018, p. 5) for apparent counterexamples and discussion). At present, I cannot verify whether Zanuttini’s generalization is confirmed or not by the AIS/ALF dataset because the data on imperatives still need to be coded in our spreadsheet. However, without a clear indication about the incidence of suppletive imperatives, it seems to me that the comparison between imperatives vs. other clauses is not trustworthy because other orthogonal factors are probably at play.

Another issue with imperatives is that they cannot exhibit subject clitics, which—according to previous corpus studies on French—may play a role in the retention of N1 (Ashby 1981). Northern Italo-Romance, as well as northern Occitan, is a promising area to investigate the relationship between subject clitics and negation. In these areas, subject clitics are mandatory (even if a DP subject occurs), but inventories of subject clitics are often defective (Poletto 2000; see Pescarini (2019) for a quantitative overview). In principle, we can therefore verify whether the probability of finding N1 increases or decreases in the contexts and dialects in which subject clitics are missing. This kind of study, however, requires a painstaking reconstruction of each clitic system, a goal that goes beyond the limits of the present paper. I therefore limited myself to checking how N1 varies depending on person (and number), a factor that might be in turn related, albeit indirectly, to the syntax of clitics. I found that person is a significant factor in a model containing the factors region, clause, and person (region p = 7.7 × 10⁻¹⁶⁹; person p = 0.0066; clause is not significant). The data in Table 6 therefore provide some first indications regarding the interaction between subject clitics and N1: the frequency of N1 is lower at the 2/3sg and higher at the 3pl and 1sg (sentences containing 1pl subjects were omitted, as 1pl clitics are easily mistaken for N1). The data in Table 6 point toward various avenues of research. Phonologically, 1sg and 3pl clitics in many dialects have a vocalic formative that provides a nucleus on which the N1 marker -n can syllabify. A higher incidence of N1 may therefore have a morphophonological explanation. 
Alternatively, one can argue that the ranking in Table 6 correlates with syntactic factors, as vocalic clitics often occupy a higher syntactic position than other clitics (Poletto 2000) and therefore tend to precede N1, whereas the other clitics occur between N1 and V. As previously mentioned, these issues will remain open until a fine-grained analysis of subject clitics in the AIS datapoints is carried out.

I then tested how NPIs affect the distribution of N1 in the AIS dataset. The data in Table 7 show the frequency and probability of N1 with respect to various classes of NPIs and in negative sentences without NPIs. Consider that the following probability rates were obtained in a model containing the factors region and NPI type; only the subset of indicative clauses was taken into consideration to avoid collinearity (Table 4). Both factors (region and NPI type) proved significant.

If we take negative sentences lacking NPIs as the baseline, we find that, on average, NPIs tend to favor the retention of N1, although the difference between clauses containing and not containing NPIs is quite narrow. N1 is most likely to occur when an NPI is embedded in an adjunct prepositional phrase, e.g., in nessun luogo, “in any place”.

The syntax of the adverb “yet”, which seems to disfavor N1, needs elaboration, because in most Italo-Romance dialects, it corresponds to a polysemous adverb (It. ancora) that conveys two aspectual values, repetitive (“again”) and continuative (“up till now”), that are not sensitive to polarity. Analogously, most Italo-Romance dialects do not display a polarity-sensitive alternation of the kind “still”/“yet”, except for 25 datapoints that exhibit negative adverbs, e.g., [ɲaˈmɔ], [ɲaŋˈkora], which are negative counterparts of positive adverbs [ˈmɔ] “now” and [aŋˈkora] “again”. These negative adverbs occur predominantly in dialects without N1 and might be therefore analyzed as n-words (like Eng. nothing, never, etc.) that do not need to be licensed by N1. This explains why the occurrence of N1 in clauses containing “yet” is less frequent/probable than in other negative clauses.

After removing sentences containing “yet”, I grouped the values “anymore”, “anything”, “never”, and “anywhere” (PP) together, thereby obtaining a binary factor of presence vs. absence of NPI, which proved not to be statistically significant in a model containing the factors region (p = 2.1 × 10⁻¹⁷⁰) and NPI type (as above, the model was tested on the subset of indicative clauses). This amounts to saying that the presence of generic NPIs is not predictive of the distribution of N1 in the AIS dataset. Specific NPIs, on the contrary, are good predictors of the behavior of N1, which is more likely retained when the NPI is embedded in an adjunct PP.
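The pooling step described above can be illustrated with a minimal sketch. It uses only the rows reported in Table 7 (so "anymore", which does not appear in that table, is left out), and it approximates the N1 counts by multiplying tokens by the rounded frequencies. The function name `collapse_to_binary` is hypothetical, and the sketch reproduces only the raw frequencies, not the Rbrul model.

```python
# Rows transcribed from Table 7 (AIS, northwestern datapoints):
# NPI type -> (tokens, frequency of N1).
TABLE_7 = {
    "anywhere (PP)": (175, 0.49),
    "never": (836, 0.47),
    "absence of NPI": (193, 0.44),
    "anything (obj.)": (179, 0.41),
    "yet": (174, 0.37),
}


def collapse_to_binary(table, excluded=("yet",), baseline="absence of NPI"):
    """Pool NPI types into a binary +/-NPI factor (hypothetical helper).

    Returns the pooled frequency of N1 for each of the two levels.
    N1 counts are approximated as tokens * rounded frequency.
    """
    pooled = {"NPI present": [0, 0.0], "NPI absent": [0, 0.0]}
    for npi_type, (tokens, freq) in table.items():
        if npi_type in excluded:
            continue  # "yet" is removed before pooling, as in the text
        level = "NPI absent" if npi_type == baseline else "NPI present"
        pooled[level][0] += tokens
        pooled[level][1] += tokens * freq
    return {level: n1 / tokens for level, (tokens, n1) in pooled.items()}


binary = collapse_to_binary(TABLE_7)
for level, freq in binary.items():
    print(f"{level}: {freq:.3f}")
```

On these figures, the pooled frequency of N1 with an NPI is only about two percentage points above the NPI-less baseline, which is consistent with the binary factor failing to reach significance.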

Before addressing N2 (Section 3.2), the remainder of this section elaborates on the occurrence of N1 in sentences with N2 doubling, i.e., sentences in which N2 co-occurs with an NPI. In languages with discontinuous negation, N2 usually triggers a double-negative reading when it co-occurs with another NPI, e.g., Fr. Il n’a pas rien vu, “It is not the case that he saw nothing”. However, in a few AIS datapoints, we find examples of N2 + NPI combinations that, instead of triggering a double negation effect, yield a pattern of bona fide negative concord. Theoretically, this may mean that in these varieties, N2 has become a fully-fledged clausal negator that is able to license NPIs (a property that normally characterizes N1-type negators, according to the preliminary typology given at the beginning of Section 1). We may therefore hypothesize that N2 doubling is allowed only if N1 is missing. In fact, however, I found a number of sentences in which both N1 and N2 co-occurred with an NPI, mostly in examples in which the NPI was the argument of a preposition (as in the case of expressions such as It. in nessun luogo, “anywhere”, literally “in no place”) and in sentences containing a “neither … nor” coordination. If PPs and coordinations are removed, the number of cases of N2 doubling drops to six. N2 doubling therefore proved to be a significant factor in the distribution of N1 (along with region; p = 0.00021). As previously mentioned, the frequency and probability of N1 were found to drop in sentences featuring N2 doubling, as shown in Table 8, although it is worth recalling that N1 is more likely to be found if the NPI is embedded in an adjunct PP.

3.2. N2 (AIS)

In order to verify whether clause typing affects N2, I excluded from my sample all tokens containing NPIs, which are often in complementary distribution with N2 (the co-occurrence of N2 and NPIs will be addressed at the end of this section). Imperatives and examples subject to collinearity were removed for the reasons discussed at the beginning of Section 3.1. Then I tested the factors region and clause, both of which had significant results with respect to the distribution of N2 (region p = 4.9 × 10⁻¹⁸⁹; clause p = 5.8 × 10⁻⁵). The weights in Table 9 indicate that core nonveridical environments such as question, if clause, and subjunctive are the contexts in which N2 is less likely to occur.

I then tested NPIs, which, as previously mentioned, are not free to co-occur with N2. The data in Table 10 confirm that not all NPIs are incompatible with N2; negative concord involving N2 is marginally allowed when NPIs are embedded in adjunct PPs or with a negative coordination of the type “neither … nor”.

3.3. Interim Conclusion (AIS)

Mixed models including grammatical and geographical factors often perform better than models containing only geographical factors. As for N1, the factors clause, person, and NPI type all proved significant, but I was not able to model all grammatical factors together due to collinearity.

As for clause, (embedded) subjunctive clauses proved to be the environment in which N1 is most likely to be retained, but no clear distinction emerged between, e.g., veridical and nonveridical contexts. The range of probability, in general, is quite narrow, and the overall ranking of values is difficult to interpret under current analyses of negation marking. Conversely, I noticed that nonveridical clauses, such as if clauses, questions, and embedded subjunctive clauses, are the contexts in which N2 occurs less frequently.

Person seems to play a role in the distribution of N1, and I briefly commented on the possible morphophonological and syntactic reasons that might link person and negation marking in dialects with subject clitics. The role of subject clitics, however, remains open to further research.

The absence vs. presence of a generic NPI does not play a significant role in the distribution of N1 in the AIS dataset, but N1 is retained more frequently in combinations with specific types of NPIs, such as adjuncts. Analogously, N2 and certain NPIs (adjunct PPs and negative coordinators) can marginally co-occur in a negative concord configuration. If NPIs are licensed by N2, N1 seldom occurs.

4. Negation Marking in Gallo-Romance (ALF)

4.1. N1 (ALF)

Table 11 shows the structure of the ALF dataset with respect to the factors clause and NPI type.

As I did with the AIS dataset in Section 3.1, I first verified whether clause is significant in determining the distribution of N1 in sentences not containing NPIs. Then, I focused on the role of NPIs in indicative and modal clauses, which occur in the dataset with and without NPIs.

First of all, I tested whether clause is significant in the distribution of N1. After removing all tokens containing NPIs, I found that the factors region and clause significantly affected the distribution of N1 (region p = ~0; clause p = 3.3 × 10⁻⁵⁹). Frequency and probability of N1 with respect to clause are reported in Table 12, which shows that indicative clauses are the context in which N1 is most disfavored (the usual caveats apply for imperatives, Section 3.1).

Then I checked the role of NPIs in indicative clauses. Rbrul showed that the factors region and NPI type are both significant in the distribution of N1 (region p = 1.4 × 10⁻²⁶⁰; NPI type p = 8.9 × 10⁻¹⁶⁵). The data in Table 13 show that N1 is more likely to be found in sentences containing NPIs. This holds particularly true for sentences with a preverbal negative quantifier (the factor nobody (subj.)); in this context, the occurrence of N1 is almost at the upper limit. The same condition cannot be checked in the AIS dataset, which does not contain a comparable sentence. This gap matters because most Gallo-Romance languages are strict (strong) negative concord languages, in which a preverbal NPI must co-occur with N1, whereas most Italo-Romance dialects are weak negative concord languages; unfortunately, the distribution of strong/weak negative concord systems cannot be ascertained from the AIS.

In terms of frequency and probability, the data in Table 13 show that besides preverbal NPIs, N1 is likely to occur with the adverb “yet”. Unlike northern Italo-Romance (Section 3.1), Gallo-Romance dialects do not have an n-word for “yet”, but the fact that the incidence of N1 is higher in sentences containing “yet” than in sentences lacking NPIs suggests that “yet”/“still” is not polarity-neutral. To better understand the status of the adverb “yet”, we need to examine its interaction with N2 (Section 4.2).

Finally, I tested a mixed model with three factors: region, NPI type, and clause. The model was tested on indicative and modal clauses, which occur in the ALF with and without NPIs (see Table 11). The structure of the relevant subset is given in Table 14. The NPI type factor was simplified by eliminating the value “yet” and reducing the remaining values to a binary choice: presence vs. absence of NPI. Notice that the subset in Table 14 is not well-balanced, because we already know that the incidence of N1 varies depending on the type of NPI it co-occurs with. Therefore, the ±NPI condition is not uniform across clausal contexts. Bearing in mind the above caveat, I tested a model with the factors region, clause (modal vs. indicative), and NPI (presence vs. absence). Region and NPI proved significant; the data in Table 15 confirm that N1 is more likely to occur in sentences containing an NPI. The significance of clause was unstable: it varied depending on whether sentences including preverbal “nobody” were included in the sample. In any case, the values of the probability index for the factor clause (modal vs. indicative) were always too close to support any solid conclusion.

The remainder of this section focuses on the occurrence of N1 in combination with N2 and an NPI (N2 doubling). As we will see in Section 4.2, the phenomenon I dubbed N2 doubling is marginally allowed in the ALF dataset only in two contexts: with the adverb plus “anymore” and with the “ni … ni” coordination. As in the case of Italo-Romance, N2 doubling proved to be significant with respect to the retention/omission of N1 in various models. Table 16 shows that N1 is disfavored in sentences displaying N2 doubling.
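To make the relation between the frequency and probability columns more concrete, the following sketch derives centered weights for a single binary factor from the raw frequencies in Table 16. It rests on two assumptions that are not stated in the text: that the probability values are inverse-logit transforms of log-odds coefficients, and that those coefficients are centered with unweighted sum-to-zero contrasts. It also ignores the factor region, so it is not a reproduction of the Rbrul model; the helper name `centered_weights` is hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def centered_weights(frequencies):
    """Centered weights for one factor, ignoring all other factors.

    Assumption: weights are inverse-logit transforms of log-odds that
    are centered on their unweighted mean (sum-to-zero contrasts).
    """
    logits = {level: logit(p) for level, p in frequencies.items()}
    mean = sum(logits.values()) / len(logits)
    return {level: inv_logit(x - mean) for level, x in logits.items()}


# Frequency of N1 by N2 doubling, transcribed from Table 16 (ALF).
TABLE_16_FREQ = {"absent": 0.74, "present": 0.41}

weights = centered_weights(TABLE_16_FREQ)
for level, w in weights.items():
    print(f"N2 doubling {level}: weight {w:.2f}")
```

Under these assumptions the two weights come out at roughly 0.67 and 0.33, close to the 0.66 and 0.34 reported in Table 16. With a binary factor the centered weights are complementary by construction, which matches the pattern in Tables 8, 15, 16, and 20.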

4.2. N2 (ALF)

In the Gallo-Romance corpus, the incidence of N2 is at the upper limit in almost all contexts, as shown in Table 17.
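As a quick check of the ceiling effect, the counts in Table 17 can be turned into rates. The sketch below is only arithmetic on the published counts; the variable names are illustrative.

```python
# Counts transcribed from Table 17 (ALF): clause -> (no N2, N2).
TABLE_17 = {
    "Conditional": (3, 634),
    "If clause": (6, 630),
    "Imperative": (5, 561),
    "Indicative": (3, 3501),
    "Infinitive": (2, 3501),
    "Modal": (90, 630),
    "Question": (5, 630),
}

# Share of sentences containing N2 in each clause type.
n2_rate = {
    clause: with_n2 / (without_n2 + with_n2)
    for clause, (without_n2, with_n2) in TABLE_17.items()
}

for clause, rate in sorted(n2_rate.items(), key=lambda item: item[1]):
    print(f"{clause:<12} {rate:.3f}")

lowest = min(n2_rate, key=n2_rate.get)
```

On these counts, every clause type except modal clauses shows N2 in more than 99% of the sentences, and modal clauses show it in 87.5%.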

The distribution of N2 in the ALF dataset is quite homogeneous even if NPIs are introduced into the model. Cases of N2 doubling, in fact, are quite sporadic, although the data in Table 18 show some interesting tendencies that can be compared with those observed in the literature (see Section 5). In the ALF dataset, “anymore” and, to a lesser extent, negative coordinators are marginally found to co-occur with N2, whereas negative quantifiers (in subject position) and the adverb guère are in complementary distribution with N2. “Yet” always co-occurs with N2; this amounts to saying that, on the one hand, the adverb encoding “yet” is not an NPI, but on the other hand, it is polarity-sensitive; witness the relatively high incidence of N1 in sentences containing “yet” (Section 4.1).

4.3. Interim Conclusions (ALF)

Clause typing has some effect on the distribution of N1 in the ALF dataset: indicatives are the context in which N1 is least often retained, but the ranking of the other clause environments is difficult to interpret in the light of current theorizing. N2, by contrast, is at ceiling in all clausal contexts.

Regarding the effect of NPIs, N1 is retained more frequently in clauses containing an NPI, especially a preverbal one (a trait that is typical of strong negative concord languages). N2 sporadically co-occurs with non-argumental NPIs. If an NPI and N2 co-occur, N1 is often, but not always, dropped.

5. Combining Datasets and Reducing Factors

In carrying out the statistical analysis in Section 3 and Section 4, I kept the data from the AIS and ALF separate for practical and methodological reasons. Although they seem contiguous on a map (Figure 1 and Figure 2), Gallo- and Italo-Romance systems are separated by the Alps; it is therefore reasonable to hypothesize that the conditions ruling the distribution of N1/N2 in the two areas are not necessarily alike. For the sake of clarity, Table 19 summarizes the conclusions of Section 3 and Section 4.

Despite some difficulties and limitations, the preliminary observations made in Section 3 and Section 4 allowed me to reach some provisional conclusions regarding both methodological and theoretical aspects. First and foremost, I proved that mixed models including grammatical factors always perform better than models that are based entirely on geographical factors. Both types of factors (grammatical and geographical) can be improved and refined, but it seems reasonable to conclude that combined models fit the data best (see van Craenenbroeck et al. (2019) for similar conclusions).

Second, by examining the distribution of N1 and N2 in the two datasets, I noticed that the AIS and ALF are not homogeneous test beds, especially in the analysis of N1. Northwestern Italy provides no clear indication regarding the “loss” of N1, as most AIS datapoints exhibit either N1 or N2, whereas the ALF is characterized by a relatively higher incidence of languages with discontinuous negation. Differences between linguistic areas can be discovered or confirmed by means of an analysis of variance. Alternatively, the data from the AIS and ALF can be merged in order to obtain better statistical results. This holds particularly true for one group of factors (clause type) whose effects are not clear and do not converge in Gallo- and Italo-Romance dialects (see Table 19).

Furthermore, to strengthen the statistical results, we may also want to focus, deductively, on only those specific factors that, according to the recent literature, are expected to affect the incidence of N1 and N2. This kind of approach exceeds the scope of the exploratory data analysis carried out in this paper. However, to get the gist of possible hypothesis testing approaches, I would like to elaborate here on two specific topics.

The first is the role of veridicality, which is often considered a factor affecting negation marking. With our dataset, we can test whether the incidence of N1 and N2 varies significantly in two subgroups of clauses, indicative (declarative) clauses vs. interrogatives, without testing the whole AIS/ALF dataset. I found that even if all other types of clauses were removed and data from the AIS and ALF were merged, the distinction between indicative declaratives and interrogatives was not significant with respect to the incidence of N1 in a model with the factors clause and region. Interestingly, however, clause proved to be statistically significant with respect to the distribution of N2 (p = 0.015), which is more likely to occur in indicatives than in questions, as shown in Table 20.

If N2 were analyzed as an NPI (recall that most N2s etymologically derive from NPIs, such as minimizers), we would expect N1 to be disfavored in nonveridical contexts such as questions, where NPIs can be licensed even if N1 is missing, as in English. Since the incidence of N1 in questions is not lower than in declaratives (which have a higher incidence of N2), we can exclude that veridicality has any role in the loss/retention of N1 across Romance vernaculars.

The second topic I would like to address concerns negative concord. The data from the AIS and ALF seem to converge (Table 19), but both atlases provide incomplete indications. By merging the data from both atlases, we can thus fill some gaps and eventually obtain a contingency table reporting the frequency of each pattern of negation (absence of negation, N1, N2, discontinuous negation) per type of NPI (Table 21).

The data in Table 21 are plotted in Figure 4.

Figure 4 shows that negation marking varies significantly depending on the type of NPI. Preverbal quantifiers are in complementary distribution with N2, and in strict negative concord languages always require N1. Postverbal quantifiers (and the adverb “never”) do not trigger negative concord in around 50% of the sentences, whereas negative coordinators and the adverb “anymore” favor negative concord with N1. “Anywhere” stands out because it co-occurs with N2 quite frequently, and it seldom co-occurs with both N1 and N2.

Dagnac and Burnett (2016, and references therein) report similar data from Picard dialects and Montréal French, which are given in Table 22. Notice that while N1 is completely lost in Montréal French, Picard dialects still retain N1, but this seems completely orthogonal to the co-occurrence of N2 and NPIs.

The AIS/ALF data in Table 21 and Dagnac and Burnett’s (2016) data in Table 22 confirm that “anywhere” is the type of NPI with which N2 co-occurs more frequently. Table 21 seems to differ from Table 22 with respect to the behavior of “anybody”, which in Montréal and Picard triggers negative concord with N2 with relatively high frequency, whereas in the ALF “anybody” never co-occurs with N2. However, recall that in the ALF corpus, “anybody” is always a preverbal subject, whereas Dagnac and Burnett reported aggregate data. They nonetheless noticed that preverbal position blocks (in Montréal French) or disfavors (in Picard) negative concord with N2 (Dagnac and Burnett 2016, p. 9).
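The comparison between the two tables can be retraced with a short sketch. It assumes that the overall incidence of N2 per NPI type in Table 21 can be approximated by adding the N2 and N1 + N2 columns; this derived measure is not reported in the text, and the published values are rounded, so rows may not sum to exactly 1.

```python
# Table 21 (AIS western + ALF): NPI -> (no negation, N1, N2, N1 + N2).
TABLE_21 = {
    "Anymore": (0.29, 0.59, 0.07, 0.05),
    "Anywhere": (0.22, 0.42, 0.29, 0.07),
    "Neither ... nor": (0.29, 0.57, 0.13, 0.02),
    "Anything (post-V)": (0.51, 0.41, 0.07, 0.01),
    "Never": (0.48, 0.47, 0.05, 0.01),
    "Nobody (pre-V)": (0.04, 0.95, 0.00, 0.01),
}

# Table 22 (Dagnac and Burnett 2016): NPI -> (Montréal, Picard).
# None stands for "ungrammatical" (*) or "unattested" (-).
TABLE_22 = {
    "Anywhere": (0.83, 0.90),
    "Anybody": (0.59, 0.33),
    "Anything": (0.15, 0.13),
    "Any": (0.11, None),
    "Never": (0.10, 0.10),
    "Anymore": (None, 0.01),
}

# Assumed derived measure: any pattern that includes N2.
n2_incidence = {npi: n2 + both for npi, (_, _, n2, both) in TABLE_21.items()}


def top_npi(values):
    """Return the NPI with the highest attested value."""
    attested = {npi: v for npi, v in values.items() if v is not None}
    return max(attested, key=attested.get)


top_atlas = top_npi(n2_incidence)
top_montreal = top_npi({npi: m for npi, (m, _) in TABLE_22.items()})
top_picard = top_npi({npi: p for npi, (_, p) in TABLE_22.items()})
print(top_atlas, top_montreal, top_picard)
```

In all three sources "anywhere" ranks first, although its N2 incidence in the atlas data (about 0.36 on this measure) is far lower than in Montréal French or Picard.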

Another difference between Table 21 and Table 22 concerns the relatively higher incidence of N2 doubling with “anymore”. This may be a particularity of certain geolinguistic areas. In fact, most sentences with N2 doubling in the AIS/ALF are concentrated in two specific areas, shown in Figure 5, but crucially, they are almost unattested in the Picard area studied by Dagnac and Burnett (2016).

The fact that NPIs and N2 can co-occur in certain dialects and with certain NPIs deserves further attention, but it indicates that N2s are marginally involved in negative concord configurations. This provides further evidence supporting the hypothesis that N2s are not always polarity-sensitive items but fully fledged negative elements that conform with the licensing of NPIs under certain syntactic conditions, which need to be clarified.

6. Conclusions and Open Issues

The focus of this paper was twofold: methodological and descriptive. On the methodological side, this work was aimed at exploring the feasibility of applying multiple logistic regression to syntactic raw data contained in traditional dialectological atlases. I showed that in order to perform statistical analysis, we need to adapt datasets, merge various primary sources, and aggregate factors in order to avoid collinearity and obtain significant statistical results.

On the descriptive side, I focused on the distribution of N1 and N2 across central Romance dialects and across syntactic contexts. I proved that models including both geolinguistic and grammatical factors always fit the data better than models in which grammatical variables are not taken into account.

In my study, I focused on two kinds of grammatical variables: (i) clause typing (in particular, I tried to verify whether veridicality affects the distribution of N1/N2) and (ii) negative concord, i.e., the interaction between N1/N2 and NPIs. My preliminary results indicate that the role of veridicality in negation marking is not proven with respect to the loss of N1.

On average, the presence of NPIs correlates positively with the presence of N1, but not all NPIs favor N1. Certain NPIs (coordinators and adverbs) favor negative concord with N1. Preverbal quantifiers disfavor negative concord with N2. Adverbial PPs allow concord with N2 and marginally with N1 and N2. These conclusions are in line with the literature on Gallo-Romance (Dagnac and Burnett 2016).

I mentioned in Section 2 and Section 3.1 that many empirical and theoretical questions will remain open because we still lack detailed metadata concerning other grammatical phenomena that may interact with negation marking, such as the morphosyntax of imperatives, the distribution of subject clitics, verb movement (e.g., in infinitives), the role of N2 in licensing indefinite objects introduced by the preposition de, etc. Some of these properties are arguably related to the typology/etymon/shape of N2s, which needs to be encoded in our dataset.

To shed light on these issues, we also need to integrate data from more recent dialectological enterprises, such as the Syntactic Atlas of Italy (ASIt), the Thesaurus Occitan (Thesoc), or the wealth of data published in reference works such as Manzini and Savoia (2005). I believe it is worth distinguishing the data elicited from monolingual speakers in the late 19th and early 20th century (such as most of those interviewed for the AIS/ALF) from the data gathered in the late 20th and early 21st century from speakers that, to various extents, were probably dialect/standard language bilingual. Therefore, in a separate study, I will perform a comparison between the two sets of data in order to verify whether and to what extent negation marking has changed over a century.

Funding

This research was funded by the IDEX program of the Université Côte d’Azur (CSI Recherche 2021).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset analyzed in this work can be downloaded here: https://zenodo.org/record/5820466#.YfkaTPjjJc8 (accessed on 16 February 2021). The file is shared under the Creative Commons Attribution CC BY-NC-SA license.

Acknowledgments

I wish to thank three anonymous reviewers and audiences in Zurich and Cambridge for their feedback.

Conflicts of Interest

The author declares no conflict of interest.

Note

1

In languages with residual V-to-C movement, such as Rhaeto-Romance dialects, subject (clitic) inversion is barred when N2 is fronted as in (i) b:

(i)	a.	sɐ	mwɐnˈtaːv	-ɐl	ˈbecɐ	ple (AIS: map 1665; Camischolas)
		refl=	move.impf.3sg	=he	N2	anymore
	b.	Al	ˈbekɐ	sa	mwɐnˈtau̯	ple (AIS 1665; Ems)
		he=	N2	refl=	move.past.3sg	anymore

References

Ashby, William. 1981. The loss of the negative particle ne in French: A syntactic change in progress. Language 57: 674–87.
Breitbarth, Anne. 2013. Indefinites, Negation and Jespersen’s Cycle in the History of Low German. Diachronica 30: 171–201.
Cinque, Guglielmo. 1991. Mica: Note di sintassi e pragmatica. In Teoria Linguistica e Sintassi Italiana. Edited by Guglielmo Cinque. Bologna: Il Mulino, pp. 311–23. First published 1976.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford: Oxford University Press.
Dagnac, Anne. 2015. Pas, mie, point et autres riens: De la négation en picard. In La Négation: Etudes Linguistiques, Pragmatiques et Didactiques. Edited by Mariana Pitar and Jan Goes. Arras: Artois Presses Université, pp. 129–52.
Dagnac, Anne, and Heather Burnett. 2016. Concordance négative optionnelle: Contrastes forts et faibles entre picard et québécois. Web of Conferences 27: 13003.
Garzonio, Jacopo. 2008. A case of incomplete Jespersen’s cycle in Romance. Rivista di Grammatica Generativa 33: 117–35.
Garzonio, Jacopo, and Cecilia Poletto. 2018. Exploiting microvariation: How to make the best of your incomplete data. Glossa: A Journal of General Linguistics 3: 112.
Giannakidou, Anastasia. 1998. Polarity Sensitivity as (Non)Veridical Dependency. Amsterdam: Benjamins.
Gilliéron, Jules, and Edmond Edmont. 1902–1910. Atlas Linguistique de la France. Paris: Champion.
Guilliot, Nicolas, and Samantha Becerra-Zita. 2019. Negative Concord and Sentential Negation in Gallo. In Romance Languages and Linguistic Theory 15. Edited by Ingo Feldhausen, Martin Elsig, Imme Kuchenbrandt and Mareike Neuhaus. Amsterdam: Benjamins, pp. 54–71.
Jaberg, Karl, and Jakob Jud. 1928–1940. Sprach- und Sachatlas Italiens und der Südschweiz. Bern: Zofingen.
Jespersen, Otto. 1917. Negation in English and Other Languages. København: A. F. Høst & Søn.
Johnson, Daniel Ezra. 2009. Getting off the GoldVarb Standard: Introducing Rbrul for Mixed-Effects Variable Rule Analysis. Language and Linguistics Compass 3: 359–83.
Larrivée, Pierre. 2008. The pragmatic motifs of the Jespersen cycle: Default, activation, and the history of negation in French. Lingua 120: 2240–58.
Manzini, Maria Rita, and Leonardo Savoia. 2005. I dialetti Italiani e Romanci. Morfosintassi Generativa. Alessandria: Edizioni dell’Orso.
Manzini, Maria Rita, and Leonardo Savoia. 2012. Sentential negation in Piedmontese varieties. Quaderni di Lavoro ASIt 13: 135–59.
Palasis, Katerina. 2015. Subject clitics and preverbal negation in European French: Variation, acquisition, diatopy and diachrony. Lingua 161: 125–43.
Pescarini, Diego. 2019. Microvariation and Microparameters. Some Quantitative Remarks. Quaderni di Linguistica e Studi Orientali 5: 255–77.
Pescarini, Diego. 2022. Negation and Negative Concord in AIS and ALF Examples. January 1. Available online: https://zenodo.org/record/5820466#.YkVn4jURWUk (accessed on 5 January 2022).
Pescarini, Diego, and Giulia Donzelli. 2017. La negazione nei dialetti della Svizzera italiana. Vox Romanica 76: 74–96.
Pescarini, Diego, and Nicoletta Penello. 2012. L’avverbio mica fra widening semantico e restrizioni sintattiche. In Linguaggio e Cervello/Semantica. Paper presented at XLII Annual Meeting of the Società di Linguistica Italiana, Pisa, Italy, September 25–27. Edited by Pier Marco Bertinetto, Valentina Bambini and Irene Ricci. Rome: Bulzoni, ISBN 978-88-7870-652-1.
Poletto, Cecilia. 2000. The Higher Functional Field. Evidence from Northern Italian Dialects. Oxford: Oxford University Press.
Poletto, Cecilia. 2016. Negation. In Oxford Guide to the Romance Languages. Edited by Martin Maiden and Adam Ledgeway. Oxford: Oxford University Press.
Poletto, Cecilia, and Leonie Cornips. 2005. On standardising syntactic elicitation techniques. Part I. Lingua 115: 939–57.
Poletto, Cecilia, and Michèle Oliviéri. 2018. Negation patterns across dialects. In Structuring Variation in Romance Linguistics and Beyond: In Honour of Leonardo M. Savoia. Edited by Mirko Grimaldi, Rosangela Lai, Ludovico Franco and Benedetta Baldi. Amsterdam: Benjamins, pp. 133–48.
Schwenter, Scott A. 2005. The pragmatics of negation in Brazilian Portuguese. Lingua 115: 1427–56.
van Craenenbroeck, Jeroen, Marjo van Koppen, and Antal van den Bosch. 2019. A quantitative-theoretical analysis of syntactic microvariation: Word order in Dutch verb clusters. Language 95: 333–70.
van Gelderen, Elly. 2008. Negative Cycles. Linguistic Typology 12: 195–243.
Zanuttini, Raffaella. 1997. Negation and Clausal Structure: A Comparative Study of Romance Languages. Oxford: Oxford University Press.

Figure 1. Distribution of N1 and N2 in declarative main clauses: dialects exhibiting N1 (purple dots), discontinuous negation (green dots), and N2 (yellow dots) among 1046 ALF/AIS datapoints.

Figure 2. Distribution of N1/N2 across syntactic contexts among 1046 ALF/AIS datapoints.

Figure 3. Dataset. Each row corresponds to an observation (i.e., a reply to the questionnaire).

Figure 4. Patterns of negation marking per type of NPI (ALF + AIS datasets).

Figure 5. Distribution of N2 doubling patterns (ALF + AIS dataset).

Table 1. Overview of dataset.

Source	Negative Sentences	Datapoints	Tokens
ALF	22	639	13,185
AIS	24	407	9315
Total	46	1046	22,500

Table 2. Counts of tokens per negation system (excluding clauses containing NPIs).

Source	N1	N1 + N2	N2
ALF	64	4404	3416
AIS: northwestern	559	190	1100
AIS: others	2050	5	18
Total	2673	4599	4534

Table 3. Factors.

Factor	Tags
Region	e.g., Lombardy (AIS), Burgundy (ALF)
Clause typing	subjunctive, question, if clause, etc.
Type of NPI	never, anymore, nobody, etc.
Person number	1sg, 2sg, etc.

Table 4. Number of tokens per sentence/NPI type (AIS western dataset).

Clause	NPI (Type)	No NPI
Subjunctive	0	382
Imperfect	191 (anymore)	0
Conditional	0	191
Indicative	199 (anything) 191 (anywhere (PP)) 945 (never) 191 (yet)	193
Future	0	191
Question	0	191
Modal	4 (anymore) 191 (neither … nor)	187
Imperative	0	764

Table 5. Frequency/probability of N1 in different types of clauses in a model with factors region and clause (AIS western dataset, excluding sentences containing NPIs; see Table 4).

Clause	Tokens	Frequency of N1	Rbrul Probability
Subjunctive	331	0.49	0.64
Indicative	193	0.44	0.54
Question	168	0.42	0.50
Future	187	0.42	0.50
Conditional	174	0.41	0.49
Modal	161	0.37	0.42
Imperative	641	0.35	0.41

Table 6. Frequency of N1 with regard to person (AIS, excluding sentences containing NPIs).

Person Number	Tokens	Frequency of N1	Rbrul Probability
3pl	331	0.49	0.62
1sg	371	0.43	0.49
2sg	168	0.42	0.47
3sg	344	0.39	0.42

Table 7. Frequency/probability of N1 with regard to NPIs in a model containing factors region and NPI type (AIS dataset, northwestern datapoints, only declarative clauses).

NPI Type	Tokens	Frequency of N1	Rbrul Probability
Anywhere (PP)	175	0.49	0.60
Never	836	0.47	0.57
Absence of NPI	193	0.44	0.50
Anything (obj.)	179	0.41	0.47
Yet	174	0.37	0.36

Table 8. Frequency/probability of N1 with regard to N2 doubling in a model containing factors region and N2 doubling (AIS dataset, northwestern).

N2 Doubling	Tokens	Frequency of N1	Rbrul Probability
Absent	3414	0.44	0.64
Present	140	0.13	0.36

Table 9. Frequency/probability of N2 with regard to clause in a model with factors region and clause (AIS dataset, only northwestern datapoints, excluding tokens containing NPIs and collinear values).

Clause	Tokens	Frequency of N2	Rbrul Probability
Conditional	174	0.75	0.65
Future	187	0.74	0.63
Modal	161	0.71	0.53
Indicative	342	0.68	0.49
Question	168	0.65	0.42
If clause	174	0.65	0.41
Subjunctive	477	0.62	0.37

Table 10. Frequency/probability of N2 with regard to NPIs in a model containing factors region and NPI type (AIS dataset, only northwestern datapoints, excluding “yet” and collinear values).

NPI Type	Tokens	Frequency of N2	Rbrul Probability
Anywhere (PP)	175	0.36	0.79
Neither … nor	160	0.36	0.78
Anything (obj.)	172	0.09	0.27
Never	836	0.05	0.17

Table 11. Number of tokens per sentence/NPI type (ALF dataset).

Clause	NPI (Type)	No NPI
Conditional	0	638
Imperative	0	638
Imperfect	638 (anymore)	0
Indicative	638 (anymore) 359 (nobody) 638 (yet)	3550
Infinitive	1276	0
Question	0	638
Modal	638 (neither … nor)	1276

Table 12. Frequency/probability of N1 in different types of clauses in a model with factors region and clause (ALF dataset, excluding sentences containing NPIs).

Clause	Tokens	Frequency of N1	Rbrul Probability
Imperative	568	0.79	0.80
Conditional	637	0.66	0.57
Infinitive	1275	0.58	0.44
Modal	1265	0.56	0.40
Question	635	0.54	0.38
Indicative	3537	0.52	0.37

Table 13. Frequency/probability of N1 with regard to NPI type in a model with factors region and NPI type (ALF dataset, only indicative clauses).

NPI	Tokens	Frequency of N1	Rbrul Probability
Nobody (subj.)	359	0.96	0.93
Yet	637	0.78	0.45
Anymore	633	0.71	0.33
Absence of NPI	3537	0.52	0.14

Table 14. Subset of ALF sentences.

Clause	NPI	No NPI
Indicative	638 (anymore) 359 (nobody)	3550
Modal	638 (neither … nor)	1276

Table 15. Frequency/probability of N1 with regard to factor ±NPI in a model with factors region, clause, ±NPI (ALF subset in Table 14).

NPI	Tokens	Frequency of N1	Rbrul Probability
Present	1629	0.72	0.66
Absent	4802	0.53	0.34

Table 16. Frequency/probability of N1 with regard to factor N2 doubling in a model with factors region and N2 doubling (ALF dataset, only sentences containing an NPI).

N2 Doubling	Tokens	Frequency of N1	Rbrul Probability
Absent	2698	0.74	0.66
Present	206	0.41	0.34

Table 17. Numbers of sentences containing/not containing N2 (ALF dataset).

Clause	No N2	N2
Conditional	3	634
If clause	6	630
Imperative	5	561
Indicative	3	3501
Infinitive	2	3501
Modal	90	630
Question	5	630

Table 18. Incidence of N2 with regard to NPI type (ALF dataset).

NPI Type	Tokens	Frequency of N2
Yet	636	0.99
Anymore	1846	0.14
Neither … nor	632	0.09
Nobody (subj.)	359	0.01
Guère	343	0.00

Table 19. Interim conclusions.

Factor	Effects on Variables
Clause Typing	N1 varies across contexts, but not as predicted by current analyses of the Jespersen cycle. N2 seems to be disfavored in nonveridical clauses (AIS), whereas it is always at ceiling in the ALF.
Negative Concord	N1 and N2 are favored when co-occurring with non-argumental NPIs. N1 is mandatory with preverbal NPIs in the ALF, thus confirming the strong negative concord status of Gallo-Romance dialects. N1 seldom occurs if both N2 and an NPI are present.

Table 20. Frequency/probability of N2 with regard to a model containing factors clause and region (AIS western and ALF datasets, excluding sentences containing NPIs).

Clause	Tokens	Frequency of N2	Rbrul Probability
Indicative	3697	0.98	0.60
Question	803	0.92	0.40

Table 21. Incidence of N2 doubling with regard to NPI types in clauses containing/not containing N1 (AIS western and ALF datasets).

NPI	0	N1	N2	N1 + N2
Anymore	0.29	0.59	0.07	0.05
Anywhere	0.22	0.42	0.29	0.07
Neither … nor	0.29	0.57	0.13	0.02
Anything (post-V)	0.51	0.41	0.07	0.01
Never	0.48	0.47	0.05	0.01
Nobody (pre-V)	0.04	0.95	0.00	0.01
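
The overall incidence of each negator can be derived from Table 21 with simple arithmetic, sketched below. The sketch assumes that the four columns are mutually exclusive patterns (no negator, N1 only, N2 only, N1 and N2 together), a reading suggested by the fact that each row adds up to roughly 1 once rounding is taken into account. The variable and function names are illustrative.

```python
# Proportions per NPI, copied from Table 21.
# Assumed column order: no negator (0), N1 only, N2 only, N1 + N2.
table21 = {
    "Anymore": (0.29, 0.59, 0.07, 0.05),
    "Anywhere": (0.22, 0.42, 0.29, 0.07),
    "Neither ... nor": (0.29, 0.57, 0.13, 0.02),
    "Anything (post-V)": (0.51, 0.41, 0.07, 0.01),
    "Never": (0.48, 0.47, 0.05, 0.01),
    "Nobody (pre-V)": (0.04, 0.95, 0.00, 0.01),
}

def marginals(row):
    """Return the overall share of N1 and of N2, plus the row total."""
    none, n1_only, n2_only, both = row
    return {
        "N1": n1_only + both,   # clauses with N1, with or without N2
        "N2": n2_only + both,   # clauses with N2, with or without N1
        "total": sum(row),      # should be close to 1 (rounding aside)
    }

for npi, row in table21.items():
    m = marginals(row)
    print(f"{npi}: N1={m['N1']:.2f}, N2={m['N2']:.2f}, total={m['total']:.2f}")
```

Read this way, the rightmost column is the share of clauses in which N2 doubles N1, while the marginal N2 value also counts clauses where N2 is the only negator.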

Table 22. Incidence of N2 with various types of NPI in Montréal French and Picard.

NPI	Montréal	Picard
Anywhere	0.83	0.90
Anybody	0.59	0.33
Anything	0.15	0.13
Any	0.11	-
Never	0.1	0.10
Anymore	*	0.01

* Ungrammatical; - unattested. Adapted from Dagnac and Burnett (2016).

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Pescarini, D. A Quantitative Approach to Microvariation: Negative Marking in Central Romance. Languages 2022, 7, 87. https://doi.org/10.3390/languages7020087
