*Article* **Island Extractions in the Wild: A Corpus Study of Adjunct and Relative Clause Islands in Danish and English**

**Christiane Müller \* and Clara Ulrich Eggers**

School of Communication and Culture, Aarhus University, DK-8000 Aarhus, Denmark; cue@cc.au.dk

**\*** Correspondence: christiane.muller@cc.au.dk

**Abstract:** Adjuncts and relative clauses are traditionally classified as strong islands for extraction across languages. However, the Mainland Scandinavian (MSc.) languages have been reported to differ from, e.g., English in allowing extraction from adjunct and relative clauses. In order to investigate the distribution of possible island extractions in these languages based on naturally produced material, we conducted two exploratory corpus studies on adjunct and relative clause extraction in Danish and in English. Results suggest that extraction from both finite adjuncts and relative clauses appears at a non-trivial rate in naturally produced Danish, which supports the claim that these structures are not strong islands in Danish. In English, we also found a non-trivial number of examples displaying extraction from finite adjuncts, as well as a small number of cases of relative clause extraction. This finding presents a potential challenge to the claim that English differs from MSc. in never allowing extraction from strong islands. Furthermore, our results show that the two languages appear to share certain trends in the extraction examples regarding the type of extraction dependency, the type of adjunct clause featured in adjunct clause extraction, and the type of matrix predicate featured in relative clause extraction.

**Keywords:** adjunct clauses; corpus study; Danish; English; islands; relative clauses

#### **1. Introduction**

Adjunct clauses and relative clauses have traditionally been considered *strong islands* for extraction across languages, based on data such as (1) from English.


In (1a) and (1b), a dependency cannot be established between the initial phrase *Who* (1a) or *Which book* (1b) and its thematic position in the embedded clause (\_*i*). The unacceptability of extraction from adjuncts and relative clauses has traditionally been captured by syntactic island constraints such as the *Adjunct Condition* (Cattell 1976; Huang 1982; Chomsky 1986) and the *Complex NP Constraint* (Ross 1967), banning adjunct and relative clause extraction universally.

However, based on examples like (2), it has been found that Danish and the other Mainland Scandinavian (MSc.) languages allow extraction from both adjunct clauses (2a) and relative clauses (2b) under certain conditions.


**Citation:** Müller, Christiane, and Clara Ulrich Eggers. 2022. Island Extractions in the Wild: A Corpus Study of Adjunct and Relative Clause Islands in Danish and English. *Languages* 7: 125. https://doi.org/10.3390/languages7020125

Academic Editors: Anne Mette Nyvad, Ken Ramshøj Christensen, Juana M. Liceras and Raquel Fernández Fuertes

Received: 8 February 2022 Accepted: 5 May 2022 Published: 18 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Examples such as (2) appear to violate the Adjunct Condition and the Complex NP Constraint, respectively, yet are intuitively acceptable to native speakers. Prima facie, this seems to suggest that there is cross-linguistic variation in island effects, thus challenging the assumption that island constraints are universal.

Recent investigations on the possibility of island extraction in MSc. have primarily focused on formally and informally collected judgments of constructed sentences. However, even though native speakers agree in describing such structures as acceptable in MSc. and examples are easily found in authentic material (e.g., Jensen 1998, 2002; Lindahl 2017; Kush et al. 2021), formal studies of adjunct and relative clause islands in MSc. commonly yield acceptability ratings that are unexpectedly low (e.g., Poulsen 2008; Christensen and Nyvad 2014; Müller 2015, 2019; Tutunjian et al. 2017; Wiklund et al. 2017; Kush et al. 2018). This apparent mismatch in formal and informal data raises questions about how this contrast can be explained and under which conditions extraction is possible in MSc., and shows that the findings from acceptability studies in this field should be complemented with data obtained using different methods. At the same time, there is also a range of anecdotal evidence of purportedly acceptable extractions from finite adjunct clauses and relative clauses in English (e.g., Chaves and Putnam 2020), raising further questions as to how different English and MSc. really are in this regard.

To further investigate the space of possible island extractions in these languages, we conduct an exploratory corpus study on adjunct clause and relative clause extraction in one of the MSc. languages, Danish, and in English. We expect that an exploration of naturally produced examples of island extractions will provide further insights into what is in fact possible in Danish and English, and how the distribution of possible island extractions can be characterized. To contextualize our corpus study, we first provide a review of recent research on the topic of strong island extraction in the MSc. languages and in English, and identify some remaining open questions with regard to this topic.

#### **2. Background**

#### *2.1. Island Extraction in MSc.*

Island constructions in the MSc. languages started to attract attention among linguistic researchers in the 1970s and 1980s, when reports accumulated that these languages are unusually permissive with regard to extraction from relative clauses (e.g., Erteschik-Shir 1973, 1982; Maling and Zaenen 1982; Taraldsen 1982; Engdahl 1997) and adjunct clauses (e.g., Hagström 1976; Anward 1982; Jensen 1998, 2002). It is now seen as uncontroversial that extraction from at least some types of adjunct clauses and relative clauses is acceptable in MSc. (although the data on adjunct clause extraction is a bit more sparse than on relative clause extraction). The sentences in (3–4) show some Danish examples reported in the literature.


(Hansen and Heltoft 2011, p. 1815)


However, in subsequent formal investigations of island extraction in MSc., a more nuanced picture has emerged.

First, even though adjunct and relative clause extractions are intuitively acceptable to speakers of MSc. and occur in spontaneous speech, formal experimental studies commonly report that MSc. participants rate such extractions as less acceptable than one would expect if adjunct and relative clauses were not islands in these languages. Formal judgments for extraction data thus appear inconsistent with the informal judgments reported in the literature. For instance, sentences involving extraction from relative clauses are intuitively acceptable in Swedish (Allwood 1982; Teleman et al. 1999) and Danish (Erteschik-Shir 1973; Nyvad et al. 2017), but received unexpectedly low ratings in experimental studies by Christensen and Nyvad (2014), Müller (2015), Tutunjian et al. (2017), and Wiklund et al. (2017). A similar situation obtains for adjunct clause extractions, which have been reported to be acceptable, at least for certain types of adjuncts, in Swedish (e.g., Anward 1982; Teleman et al. 1999) and Danish (Jensen 1998, 2002; Nyvad et al. 2017), but yielded acceptability ratings on the lower end of the scale in the formal studies by Müller (2019) on Swedish and by Poulsen (2008) on Danish. In a formal study of Norwegian by Kush et al. (2018), extraction from relative clauses and from conditional adjunct clauses yielded not only acceptability scores that were at the bottom end of the scale, but also superadditive island effects.<sup>1</sup> It is still debated how these mismatching data from formal and informal judgments can be explained.

Second, a few recent studies of the phenomenon indicate that the acceptability of both adjunct and relative clause extraction in MSc. appears to vary as a function of several factors, among others the specific type of adjunct clause (in the case of adjunct clause extraction), the matrix verb (in the case of relative clause extraction), and the type of extraction dependency.

Specifically, in recent studies on Norwegian (Bondevik et al. 2020) and Swedish (Müller 2017), topicalization from conditional adjunct clauses headed by *om* 'if' and from temporal adjuncts headed by *når* 'when' (in Norwegian) or *efter* 'after' (in Swedish) yielded ratings on the upper half of the scale, whereas topicalization from clauses introduced by *fordi* 'because' in Norwegian and *eftersom* 'because' in Swedish received ratings on the low end of the scale. This has given rise to conjectures that different types of adjunct clauses, or adjunct clauses headed by different subordinators, may vary with regard to extraction possibilities (e.g., Bondevik 2018; Bondevik et al. 2020; Müller 2017, 2019).

In the case of relative clause extraction, it has repeatedly been observed that extraction may be more acceptable or more common with certain types of matrix verbs or certain embedding environments (Erteschik-Shir 1982; Engdahl 1997; Hofmeister and Sag 2010; Kush et al. 2013; Löwenadler 2015). Lindahl (2017) investigated relative clause extraction in Swedish based on a collection of naturally occurring examples and found that the major part (around 75%) of her sample of 100 extractions involved extraction from an existential/presentational relative clause, see the example in (5).


(Lindahl 2017, p. 77)

Existential/presentational relative clauses serve to introduce new referents into the discourse and typically feature an existential verb such as *be* as the matrix predicate introducing the relative clause. Lindahl also found examples of relative clause extraction involving non-existential main verbs in her sample, e.g., with *beundra* 'admire' (6a) or *störa sig på* 'be annoyed by' (6b), but these occurred much less frequently.


Finally, there are indications that the acceptability of island extraction in MSc. might also depend on the type of extraction. Studies on Norwegian by Kush et al. (2018, 2019) and Bondevik et al. (2020) suggest that topicalization from adjunct clauses is easier to accept than *wh*-question formation. Lindahl (2017, p. 47) found that in her sample of 100 naturally occurring relative clause extractions in Swedish, topicalization was by far the most common form of extraction (with 93 cases), followed by relativization with 7 instances. None of her examples involved *wh*-extraction from the relative clause, although Lindahl (2017, p. 166) claims that it is possible to construct acceptable examples (see Engdahl 1997 for a similar observation).

Apart from Lindahl (2017), the only other systematic studies of naturally occurring island extractions in MSc. to our knowledge are Lindahl (2010), Jensen (1998, 2002) and Kush et al. (2021). By and large, these studies seem to lend support to the trends previously observed. Lindahl (2010) collected instances of extraction from presentational/existential relative clauses in the Swedish *Parole corpus* at *Språkbanken* and found 134 examples involving topicalization (other dependency forms were not included in the search). Jensen (1998, 2002) investigated extraction in Danish based on material comprising 9 interviews from the spoken language corpus BySoc, 2 conversations, and an excerpt from a TV program. In the 18 h of spoken Danish that her material amounts to, Jensen found 230 instances of extraction from embedded clauses. Only 10 of these 230 instances involved extraction from a relative clause, and the only case of adjunct clause extraction found involved a non-finite adjunct clause (Jensen 1998, p. 18). Finite adjunct clause extraction is thus unattested in Jensen's material, although Jensen claims that extraction from finite adjunct clauses is acceptable in Danish under certain conditions, based on introspective judgments and examples reported in the traditional literature. While relative clause extraction does appear in her sample, it occurs at a much lower frequency than extraction from declarative complement clauses. Furthermore, all 10 cases of relative clause extraction involved extraction from an existential/presentational relative clause with *være* 'be' as the matrix predicate; see the examples in (7) (glosses our own).


(Jensen 1998, p. 23)

Kush et al. (2021) recently investigated the occurrence of relative clause extraction (as well as extraction from embedded questions) in Norwegian by searching a child fiction corpus for extraction instances. Relative clause extraction was attested in Kush et al.'s material; however, with 63 examples, it was considerably less frequent than extraction from non-island declarative complement clauses (411 instances; Kush et al. 2021, p. 17). Their findings moreover showed that relative clause extraction was rather restricted in terms of the type of extraction dependency: in all cases of relative clause extraction found, the extracted element was topicalized. *Wh*-extraction and relativization from relative clauses remained unattested in their sample. In this respect, relative clauses differed from complement clauses, for which all three types of extraction dependencies occurred at a comparable rate in Kush et al.'s material. Furthermore, the relative clause extractions attested in Kush et al.'s (2021) corpus study also seemed to be very restricted in terms of the embedding environment, as all of the cases of relative clause extraction involved an existential/presentational relative clause (8a) or an *it*-cleft construction (8b).


(Kush et al. 2021, p. 25)

The corpus studies by Lindahl (2010), Jensen (1998, 2002) and Kush et al. (2021), as well as Lindahl's (2017) collection, show that investigations of naturally produced long-distance dependencies can complement acceptability judgment experiments and provide additional evidence regarding the conditions on relative clause and adjunct clause extraction. However, additional corpus studies are needed to increase our understanding of the phenomenon.

#### *2.2. Island Extraction in English*

Both finite adjunct clauses and relative clauses are standardly treated as strong islands in English, banning all extraction (see examples in 1). Truswell (2007, 2011) claims that extraction is acceptable from non-finite adjuncts in English, provided that the matrix and adjunct VP are parts of a single event. According to Truswell, this is the case when the matrix and the adjunct clause event can be interpreted to be related by a *contingent relation* (e.g., causation or enablement), as opposed to a purely temporal relation. However, English disallows any extraction from finite adjuncts according to Truswell, regardless of the semantic relation between matrix and adjunct clause (see also Ernst 2022). For instance, extraction is reported to be acceptable from the non-finite *after*-adjunct clause in (9a) under the premise that John's going home was caused by him talking to someone, but extraction is unacceptable from the finite counterpart of that clause in (9b) (from Truswell 2007, p. 166; see also Manzini 1992).

9. a. [Who]*<sup>i</sup>* did John go home [after talking to *\_i*]?

b. \*[Who]*<sup>i</sup>* did John go home [after he talked to \_*i*]?

At first glance, formal acceptability studies seem to support the picture that finite adjunct clauses are islands in English: *Wh*-extraction from *if*-adjunct clauses received ratings on the lower end of the scale in a study by Sprouse et al. (2012), and similarly, *wh*-extraction from *after*-clauses was rated below the mid-point of the scale in studies by Michel and Goodall (2013) and Müller (2019), even when a causal interpretation of matrix and adjunct clause was available. However, Sprouse et al. (2016) compared *wh*-extraction and relativization out of adjunct clauses in English and found that relativization from *if*-clauses does not result in island effects in terms of the factorial definition of islands (although ratings for both relativization and *wh*-extraction remained relatively low). Nyvad et al. (2022) tested extraction in the form of relativization from *if*-, *when*- and *because*-clauses in English while also providing a supporting context for the extraction stimuli. In this study, extraction from *if*-clauses yielded average ratings on the upper half of the scale and showed no significant difference from *that*-clause extraction. The findings by Sprouse et al. (2016) and Nyvad et al. (2022) seem to suggest that at least for relativization, *if*-adjuncts are not absolute islands in English.

To date, only a few formal acceptability studies of relative clause extraction in English exist. Christensen and Nyvad (2022) investigated *wh*-extraction and topicalization from English relative clauses and found that both types of extraction resulted in ratings on the lower end of the scale, on par with ungrammatical control sentences. Furthermore, the ratings did not differ significantly across different types of transitive matrix verbs. Christensen and Nyvad (2022) take these results to support the standard assumption that relative clauses are strong islands in English. However, Vincent (2021) provides initial evidence that relative clause extraction in English may be ameliorated in the same environments in which MSc. relative clause extractions have also been observed to be particularly felicitous. Vincent (2021) employs the factorial design developed by Sprouse (2007) and Sprouse et al. (2012, 2016) to measure island effects in *wh*-extraction from three different types of relative clauses: relative clauses in existential environments (10a), relative clauses attached to DP predicates in the complement of a copula (10b), and relative clauses embedded under transitive predicates (10c).

b. [Which painting]*<sup>i</sup>* do you think that Courtney believes that she is the only art collector [who bid on *\_i*]?

c. [Which painting]*<sup>i</sup>* do you think that Courtney saw the only art collector [who bid on *\_i*]? (Vincent 2021, p. 68)

Vincent found statistically significant superadditivity effects (taken to be diagnostic of island effects in the factorial design) for all three tested types of relative clause extraction; however, the size of the effect differed between the three environments. Extraction from existential or predicative relative clauses (type 10a and 10b) yielded smaller island effects than extraction in the transitive environment (type 10c) (Vincent 2021, p. 74). Moreover, extraction from existential or predicative relative clauses yielded higher average ratings than transitive extraction. Vincent takes these results to indicate that at least relative clauses in existential environments may not be strong islands in English.
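The superadditivity diagnostic referred to here is commonly quantified as a differences-in-differences (DD) score over the four cells of the 2 × 2 factorial design (Sprouse et al. 2012). The following minimal Python sketch only illustrates the arithmetic. The function name, the argument names, and the ratings are hypothetical and are not taken from Vincent (2021) or any other study cited here.

```python
def dd_score(nonisland_short, nonisland_long, island_short, island_long):
    """Differences-in-differences score for a 2 x 2 island design.

    Each argument is the mean (typically z-scored) rating of one cell:
    structure (non-island vs. island) crossed with dependency length
    (short/matrix gap vs. long/embedded gap).
    """
    # Cost of the island structure when the gap is inside the embedded clause
    structure_effect_long = nonisland_long - island_long
    # Cost of the island structure when no extraction from it takes place
    structure_effect_short = nonisland_short - island_short
    # A positive difference indicates a superadditive interaction
    return structure_effect_long - structure_effect_short


# Hypothetical cell means, for illustration only
example = dd_score(
    nonisland_short=1.0,
    nonisland_long=0.5,
    island_short=0.75,
    island_long=-0.75,
)
print(example)
```

A score near zero would mean that the cost of a long dependency and the cost of the island structure simply add up, whereas a clearly positive score points to an additional penalty for extracting from the island. Whether such a difference is reliable still has to be established statistically.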

The picture is further complicated by informal reports of acceptable island extractions in English. Despite the standard assumption that adjuncts and relative clauses are islands in English, there is a range of anecdotal evidence of purportedly acceptable extractions from finite adjunct clauses (11) and relative clauses (12) in English (see also Chaves and Putnam 2020, pp. 67, 91). The examples in (11) and (12) were either attested in authentic language use, or are reported to be acceptable to at least some speakers of English, which raises further questions as to how different English and MSc. really are in this regard.

11. *Adjunct clause extraction*

a. This is [the watch]*<sup>i</sup>* that I got upset [when I lost \_ *<sup>i</sup>*].

(Truswell 2011, p. 175, fn.1)


(Haegeman 2004, p. 70)

12. *Relative clause extraction*

a. Isn't that [the song]*<sup>i</sup>* that Paul and Stevie were the only ones [who wanted to record \_*i*]? (Chung and McCloskey 1983, p. 708)

(McCawley 1981, p. 108)

In all these examples, extraction is in the form of relativization from the island. This matches the above-mentioned finding from formal studies that island extraction in English, to the extent that it is possible, seems to be restricted to or more available with relativization dependencies (Sprouse et al. 2016; Nyvad et al. 2022).

#### *2.3. Interim Summary*

Recent investigations on the possibility of island extraction in MSc. and English have primarily focused on formally and informally collected judgments of constructed sentences, with the exception of the studies by Lindahl (2010, 2017), Jensen (1998, 2002), and Kush et al. (2021) that investigated naturally produced language. Findings show that the traditional picture, according to which island extraction is impossible in English and widely acceptable in MSc., is not as clear-cut as previously thought. In formal studies on islands in MSc., relative clause and adjunct clause extraction constructions were shown to receive very low ratings, or seem to be restricted to specific environments. On the other hand, the island status of adjuncts and relative clauses in English is challenged by anecdotal reports of acceptable cases of extraction and by studies showing that at least some form of adjunct and relative clause island extraction may be acceptable under certain conditions in English. All in all, these data raise new questions regarding what is in fact possible in MSc. and English, and how different these languages really are with regard to island constraints. To address these questions, the findings from acceptability studies should be complemented with data obtained using different methods.

A study of extractions in authentic material may be able to circumvent some of the potential issues associated with investigating the phenomenon by means of formal acceptability studies. For example, it has been suggested that the low ratings obtained for island extractions in some of the acceptability studies on MSc. are due to using stimulus sentences that are overall unnatural or unusually complex (see e.g., Poulsen 2008, p. 96; Müller 2019, p. 69; Tutunjian et al. 2017; Wiklund et al. 2017; Kush et al. 2018). Another factor that has been suggested to explain the low ratings in formal studies is the absence of contextual cues for the stimulus sentences, which may be required for some forms of extraction (e.g., Tutunjian et al. 2017; Wiklund et al. 2017; Kush et al. 2018). By investigating extractions in authentic material, we should be able to get a better picture of what natural extraction sentences look like (in terms of their syntactic and semantic/pragmatic properties), and what kind of contexts they are typically embedded in. Evidence from production data may thus help us to better understand the mismatch often encountered between formal and informal judgments of MSc. island extractions, as well as to develop more natural-sounding stimuli for future formal studies. Another potential issue with formal acceptability studies is that traditional Scandinavian grammars often associate both adjunct and relative clause extraction with colloquial style, or even advise language users to avoid such extractions (see Lindahl 2017; Müller 2019 for an overview). It is thus possible that participants in acceptability experiments let such prescriptive rules influence their ratings, even though speakers may still produce such constructions—especially in informal settings. Corpus studies can make these production data available.

The existing investigations of strong island extractions in naturally produced material (Jensen 1998, 2002; Lindahl 2010, 2017; Kush et al. 2021) show that corpus studies can provide interesting insights in this regard; however—with the exception of Jensen (1998, 2002)—these studies have focused on relative clause extraction while disregarding adjunct islands, and none of them investigate naturally produced island extractions in English.

We conduct two exploratory corpus studies on extraction from strong islands in one of the MSc. languages, Danish, and in English, with the goal to further investigate the distribution of possible island extractions in these languages. In the first study, we investigate adjunct clause extraction in Danish and English corpora. In the second study, we focus on relative clause extraction. We expect that an exploration of naturally produced examples of island extractions will provide further insights into what is in fact possible in Danish and English, and what patterns or trends can potentially be observed among the found extraction instances regarding the syntactic and semantic/pragmatic properties of such constructions.

#### **3. Corpus Study 1—Adjunct Clause Extraction**

#### *3.1. Materials and Methods*

The Danish corpus study of adjunct clause extractions was conducted using the written language corpus *KorpusDK* (56 million words) and the spoken language corpora *BySoc* and *SamtaleBank* (MacWhinney and Wagner 2010). BySoc consists of transcriptions of ca. 80 spoken conversations (1.3 million words) and SamtaleBank of transcriptions of 24 conversations (altogether 6 h, 20 min). The English data were collected using the *British National Corpus* (BNC; British National Corpus 2001, 100 million words) and the *Corpus of Contemporary American English* (COCA; COCA 2008, 1 billion words), which both contain text from spoken and written language of various genres. For both languages, the query was complemented by examples retrieved from a Google search.

To search for examples of adjunct island extraction, we employed a combination of different search strings designed to target extraction from finite *if*-, *when*-, and *because*-clauses in English and from their Danish counterparts. Our query was restricted by the fact that the corpora we used are not annotated in a way that makes it possible to search for extraction constructions directly. As a consequence, we cannot provide quantitative data on the different types of extraction, or directly compare frequencies across constructions or across languages. Rather, our results provide informal insights into what appears to be possible at all or common among the found extraction instances. We start by describing our strategies for searching for adjunct clause extractions in Danish, before moving on to our search strategies for the English material.

The search strings used in KorpusDK were mostly based on the example string given in (13), designed to retrieve instances of topicalization from Danish *hvis* ('if')-clauses, and subsequent modifications of it.

13. [ortho="(Den|Det|De)"] []{0,2} [pos="N"] [pos="V"] [pos="PERS"] []{0,3} [ortho="hvis"]

This search string will return constructions initiated by a noun phrase consisting of one of the Danish determiners *Den*, *Det* or *De*, followed by up to two optional unspecified words (to allow for e.g., potential adjectival modifiers), and a noun. The noun phrase is followed by a verb (since Danish is a verb-second language), a personal pronoun (in order to target sentences with a pronominal subject), up to three optional unspecified words (in order to allow for e.g., potential sentence adverbs or auxiliary verbs in the matrix), and finally by *hvis* 'if'. The search was restricted to constructions with a pronominal matrix subject, since most acceptable examples of adjunct clause extraction reported in the literature also involve pronouns in the matrix subject position. The query was repeated with other possible determiners in the initial position (viz. *Denne*/*Dette* 'This', *Disse* 'These', *Sådan*/*Sådanne* 'Such') as well as with the lowercased versions of all determiners mentioned to account for extractions occurring after a conjunction. Moreover, search strings were constructed where the noun phrase construction was replaced by a simple pronoun, i.e., *Mig* 'Me', *Dig* 'You', *Hende* 'Her', *Ham* 'Him', *Den* 'It', *Det* 'It', *Os* 'Us', *Jer* 'You', or *Dem* 'Them', as well as the lower-cased versions of all of these.
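To make the logic of the search string in (13) concrete, the following Python sketch emulates it over a list of (word, tag) pairs. It is only an approximation for illustration, not the KorpusDK query engine. The tag labels other than `N`, `V`, and `PERS`, the helper names, and the example sentence are hypothetical (the sentence is constructed, not a corpus hit).

```python
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word form, POS tag)
# One pattern element: (predicate, minimum repetitions, maximum repetitions)
Element = Tuple[Callable[[Token], bool], int, int]


def word_in(*forms: str) -> Callable[[Token], bool]:
    return lambda tok: tok[0] in forms


def pos_is(tag: str) -> Callable[[Token], bool]:
    return lambda tok: tok[1] == tag


def anything(tok: Token) -> bool:
    return True


# Element-by-element counterpart of the search string in (13)
PATTERN_13: List[Element] = [
    (word_in("Den", "Det", "De"), 1, 1),  # sentence-initial determiner
    (anything, 0, 2),                     # up to two optional words
    (pos_is("N"), 1, 1),                  # noun
    (pos_is("V"), 1, 1),                  # finite verb (verb-second)
    (pos_is("PERS"), 1, 1),               # pronominal matrix subject
    (anything, 0, 3),                     # up to three optional words
    (word_in("hvis"), 1, 1),              # subordinator 'if'
]


def match_at(tokens: List[Token], start: int, pattern: List[Element]) -> bool:
    """Return True if the pattern matches a token span beginning at `start`."""
    def step(i: int, p: int) -> bool:
        if p == len(pattern):
            return True
        pred, lo, hi = pattern[p]
        for n in range(lo, hi + 1):  # try every allowed repetition count
            if i + n > len(tokens):
                break
            if all(pred(t) for t in tokens[i:i + n]) and step(i + n, p + 1):
                return True
        return False
    return step(start, 0)


def find_hits(tokens: List[Token], pattern: List[Element]) -> List[int]:
    """Return the start indices of all matches."""
    return [i for i in range(len(tokens)) if match_at(tokens, i, pattern)]


# Constructed example with hypothetical tags
sentence = [("Den", "DET"), ("bog", "N"), ("bliver", "V"), ("jeg", "PERS"),
            ("glad", "ADJ"), ("hvis", "CONJ"), ("du", "PERS"), ("læser", "V")]
print(find_hits(sentence, PATTERN_13))
```

As with the actual query, a match only shows that the surface pattern is present. Whether the initial noun phrase really originates inside the *hvis*-clause still has to be decided by manual inspection.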

In order to search for *wh*-question formation out of adjunct clauses, the search string described above was modified such that the determiner position in the initial noun phrase was replaced by the Danish *wh*-elements *Hvilken*, *Hvilket* and *Hvilke* 'which' (as well as their lowercased counterparts), or such that the entire noun phrase was replaced by the question word *Hvad*/*hvad* 'what'.

In order to search for relativization from adjunct clauses, the following search string (initiated by the Danish relative complementizer *som*) was employed:

14. [ortho="som"] [pos="PERS"] []{0,3} [pos="V"] []{0,3} [ortho="hvis"]

Finally, all of the above-mentioned queries were repeated with the complementizers *når* 'when' and *fordi* 'because' replacing *hvis* 'if', in order to also retrieve potential instances of extraction from adjunct clauses introduced by *når* 'when' or *fordi* 'because'. The returned hits were filtered manually for any instances of adjunct clause extraction.
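The family of Danish queries described in this section can be thought of as one template that is varied along three dimensions: the sentence-initial element, its capitalization, and the subordinator. The Python sketch below generates such a set of query strings. It rests on simplifying assumptions: all determiners are bundled into a single alternation per query (the original searches may have been run separately), and the function names are hypothetical.

```python
from itertools import product

DETERMINERS = ["Den", "Det", "De", "Denne", "Dette", "Disse", "Sådan", "Sådanne"]
WH_DETERMINERS = ["Hvilken", "Hvilket", "Hvilke"]
PRONOUNS = ["Mig", "Dig", "Hende", "Ham", "Den", "Det", "Os", "Jer", "Dem"]
SUBORDINATORS = ["hvis", "når", "fordi"]


def alternation(forms, lowercase=False):
    """Build an alternation such as (Den|Det|De), optionally lowercased."""
    forms = [f.lower() for f in forms] if lowercase else list(forms)
    return "(" + "|".join(forms) + ")"


def np_query(initial_forms, subordinator, lowercase=False):
    """Full noun phrase in initial position (topicalization or wh-question)."""
    return (f'[ortho="{alternation(initial_forms, lowercase)}"] []{{0,2}} [pos="N"] '
            f'[pos="V"] [pos="PERS"] []{{0,3}} [ortho="{subordinator}"]')


def bare_query(initial_forms, subordinator, lowercase=False):
    """Single word (pronoun or 'hvad') replacing the whole noun phrase."""
    return (f'[ortho="{alternation(initial_forms, lowercase)}"] '
            f'[pos="V"] [pos="PERS"] []{{0,3}} [ortho="{subordinator}"]')


def relative_query(subordinator):
    """Relativization, introduced by the relative complementizer 'som'."""
    return (f'[ortho="som"] [pos="PERS"] []{{0,3}} [pos="V"] '
            f'[]{{0,3}} [ortho="{subordinator}"]')


def all_queries():
    queries = []
    for sub, lower in product(SUBORDINATORS, (False, True)):
        queries.append(np_query(DETERMINERS, sub, lower))     # topicalized NP
        queries.append(np_query(WH_DETERMINERS, sub, lower))  # 'which'-NP question
        queries.append(bare_query(PRONOUNS, sub, lower))      # topicalized pronoun
        queries.append(bare_query(["Hvad"], sub, lower))      # 'what' question
    for sub in SUBORDINATORS:
        queries.append(relative_query(sub))
    return queries


for query in all_queries():
    print(query)
```

Calling `np_query(["Den", "Det", "De"], "hvis")` yields the string in (13), and `relative_query("hvis")` yields the one in (14).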

The other two Danish corpora used in this study, BySoc and SamtaleBank, are not grammatically annotated and the search strategy employed for KorpusDK could thus not be applied to them. Instead, we searched these corpora in their entirety for all instances of *hvis*, *når*, and *fordi* in order to find all instances of adjunct clauses introduced by these elements, and then searched the list of resulting hits manually for cases of extraction.
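For unannotated transcripts, this exhaustive strategy amounts to a keyword-in-context listing of every occurrence of the three subordinators, which is then read through by hand. A minimal Python sketch of such a listing is given below. The function name, the context window, and the example lines are hypothetical, and the actual search interfaces of BySoc and SamtaleBank may work differently.

```python
import re

SUBORDINATORS = ("hvis", "når", "fordi")
# Whole-word match, case-insensitive, so that sentence-initial forms are found too
PATTERN = re.compile(r"\b(" + "|".join(SUBORDINATORS) + r")\b", re.IGNORECASE)


def keyword_in_context(lines, window=40):
    """List every subordinator occurrence with its left and right context."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for m in PATTERN.finditer(line):
            left = line[max(0, m.start() - window):m.start()]
            right = line[m.end():m.end() + window]
            hits.append((line_no, m.group(1).lower(), left, right))
    return hits


# Constructed transcript lines, for illustration only
transcript = [
    "det bliver jeg glad hvis du gør",
    "Når vi kommer hjem spiser vi",
    "ingen konjunktion her",
]
for line_no, keyword, left, right in keyword_in_context(transcript):
    print(f"{line_no:>4}  {left:>40} [{keyword}] {right}")
```

The listing deliberately over-generates: it returns every adjunct clause introduced by these words, and only manual inspection can identify the rare cases that involve extraction.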

These corpus queries were complemented by a Google search for adjunct clause extractions. The search strings used in these Google queries were for the most part constructed such that they yielded an adjunct clause combined with a matrix that was composed of a pronominal subject and an adjectival psych-predicate of the sort *være/blive glad* 'be/become happy'. Many of the acceptable instances of adjunct clause extraction reported in the literature involve an adjectival psych-predicate of this type, see e.g., the examples in (15).


(Jensen 1998, p. 19)

We now turn to the search strategies employed in the English corpora. As the query syntax for COCA and BNC is different from the one for KorpusDK, different search strings had to be constructed to search the English corpora. We also decided to restrict the English search to extraction in the form of relativization and *wh*-extraction, since topicalization is rather marked in English (see e.g., Engdahl 1997; Poole 2017, p. 15), and it was deemed unlikely that the corpora would contain any instances of topicalization from an island. In (16), two examples of the search strings that were used to target *wh*-extraction from *if*-clauses are shown.

16. a. Which NOUN \_vd \_pp \_vv if

b. Which NOUN \_vm \_pp \_vv if

These strings target question constructions with a *Which-*NP and a form of *do* (\_vd) or a modal verb (\_vm) as the finite matrix verb and a personal pronoun (\_pp) as the matrix subject. These search strings were subsequently augmented with additional unspecified positions in the matrix predicate.

Relativizations from adjunct clauses were targeted with strings like the following, involving a noun followed by one of the relative complementizers *that*, *which* or *who*.

17. NOUN that|which|who \_pp \_vv if

As in the Danish part of the study, the same search strings were also employed with the alternative adjunct clause subordinators *when* and *because*.
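The English search strings in (16) and (17) and their variants with *when* and *because* can be assembled in the same way as the Danish ones. The Python sketch below builds them as plain strings. The tags `_vd`, `_vm`, `_pp`, and `_vv` are those given in (16) and (17); the function names are hypothetical, and the use of `*` as a placeholder for an unspecified word, as well as its position after the lexical verb, is an assumption rather than a documented detail of the original queries.

```python
SUBORDINATORS = ["if", "when", "because"]
FINITE_VERB_TAGS = ["_vd", "_vm"]  # a form of 'do' or a modal verb


def wh_query(finite_tag, subordinator, extra_slots=0, wildcard="*"):
    """Which-NP question, cf. (16); optional unspecified words are assumed
    to follow the lexical verb."""
    parts = ["Which", "NOUN", finite_tag, "_pp", "_vv"]
    parts += [wildcard] * extra_slots
    parts.append(subordinator)
    return " ".join(parts)


def relativization_query(subordinator, extra_slots=0, wildcard="*"):
    """Noun followed by a relative complementizer, cf. (17)."""
    parts = ["NOUN", "that|which|who", "_pp", "_vv"]
    parts += [wildcard] * extra_slots
    parts.append(subordinator)
    return " ".join(parts)


def english_queries(max_extra_slots=0):
    queries = []
    for sub in SUBORDINATORS:
        for slots in range(max_extra_slots + 1):
            for tag in FINITE_VERB_TAGS:
                queries.append(wh_query(tag, sub, slots))
            queries.append(relativization_query(sub, slots))
    return queries


for query in english_queries(max_extra_slots=1):
    print(query)
```

With `max_extra_slots=0`, the function returns the three base strings per subordinator, nine in total; each additional slot adds another nine variants.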

As in the Danish study, Google was used to find further examples of adjunct clause extractions. The search strings used for Google mostly targeted relativization with *which* and a matrix clause involving a pronominal subject and, parallel to the Danish Google search, a psych-predicate such as *be glad*.

#### *3.2. Results*

#### 3.2.1. Adjunct Clause Extraction in Danish

Our search on KorpusDK, BySoc, and SamtaleBank combined with the Google search yielded a total of 37 instances of sentences displaying extraction from adjunct clauses in Danish. Of these, only 3 instances were retrieved from KorpusDK and 1 from BySoc; the remaining 33 examples were retrieved from the Google search. No instances were found on SamtaleBank.

In 7 instances, the extracted phrase is an adjunct rather than an argument in the embedded adjunct clause, see the examples in (18). The sentence-initial adjunct in both (18a) and (18b) must be interpreted as modifying the predicate in the adjunct clause, rather than the matrix predicate.


[http://curis.ku.dk/ws/files/59251974/JensPeterAndersenThesis.pdf, accessed on 3 December 2020]

Regarding the adjunct clause type that is featured in the extraction sentences, 31 sentences involved extraction from *hvis* ('if')-clauses and 6 sentences featured *når* ('when') adjunct clauses. No cases of extraction from *fordi* ('because')-clauses were found. As for the extraction dependency, 16 instances involved topicalization and 21 involved relativization. None of the extractions were in the form of *wh*-extraction. Table 1 shows the distribution of the different clause types and extraction dependencies in the sample.

**Table 1.** Distribution of the Danish adjunct clause extractions.


Sentences (19a–d) demonstrate an example from each category that was attested.


'The board at CC has several times publicly announced that the tronc system is a guideline that they are happy when the players follow, but not a requirement.' [https://www.pokernet.dk/forum/smidt-ud-af-casinoet.html, accessed on 3 December 2020]

#### 3.2.2. Adjunct Clause Extraction in English

In total, 49 sentences involving extraction from an adjunct clause were found in English, out of which 5 stem from COCA, 1 from BNC and the remaining 43 from Google. A caveat with using Google to retrieve English examples is of course that it cannot be guaranteed that all examples were authored by native English speakers (whereas this is unlikely to be an issue for Danish). We tried to minimize the risk of including examples produced by non-native speakers of English by examining the source, the context, and (where possible) the author of all English examples, and consequently excluding any cases where we had reasons to suspect that they were authored by a non-native speaker. However, we concede that it is not possible to fully rule out the possibility of including examples from non-native speakers in our sample, and the results retrieved from Google for the English part of this study thus allow only for limited conclusions.

Regarding the adjunct clause type that is featured in the extraction sentences, 42 sentences involved extraction from *if*-clauses and 7 sentences featured *when*-adjunct clauses. No cases of extraction from *because*-clauses were found. As for the extraction dependency, all English cases involved relativization, and no examples of *wh*-extraction from adjunct clauses were found (as mentioned above, we did not search for instances of topicalization in English). Table 2 shows the distribution of the different clause types and extraction dependencies in the sample.

**Table 2.** Distribution of the English adjunct clause extractions.


Sentences (20a–b) demonstrate an example of the two types of adjunct clause extraction found in our sample.


to live in a house that they—a house that didn't have room for those. [COCA 2008]

#### *3.3. Discussion*

The fact that we found a variety of adjunct clause extraction sentences in Danish demonstrates that adjunct island extraction in MSc. is not just a peripheral phenomenon restricted to isolated constructed examples, but that naturally occurring cases can be attested in authentic language use. To have a rough point of comparison for the frequency of occurrence of the Danish island extractions, another search was done on KorpusDK that targeted extraction from (non-island) declarative complement clauses introduced by *at* 'that'. To this end, the search strings earlier employed for adjunct clause extractions were reused, but with *at* 'that' replacing the adjunct clause subordinator *hvis* 'if', *når* 'when' or *fordi* 'because'. Note that this excludes any instances of extraction from complement clauses with a non-overt *at* 'that' from the search results. This search resulted in 1250 instances of extraction from declarative *at*-clauses in KorpusDK. Not taking into account the base frequency of complement clauses vs. adjunct clauses in Danish, it can thus be asserted that extraction from (some) adjunct clauses appears to be possible in Danish, but occurs considerably less frequently than *at*-clause extraction.

Perhaps more surprisingly, a non-trivial amount of naturally produced examples of adjunct clause extraction were also found in English. This finding potentially challenges the claim that English differs from the MSc. languages in never allowing extraction from finite adjunct clauses, as argued e.g., by Truswell (2007, 2011) and Ernst (2022). However, in light of the small number of English examples found in relation to the size of the English materials that we used, and the fact that it cannot be guaranteed that all English examples were produced by native speakers, our results do not permit a clear conclusion as to whether extraction from finite adjunct clauses is generally acceptable in English. At the same time, our results are in line with recent experimental findings that at least *if*-adjunct clauses in English do not behave like absolute islands for extraction in the form of relativization (Sprouse et al. 2016; Nyvad et al. 2022).

In both languages, by far most of the extraction examples were found on Google rather than in the investigated corpora. This could in part be due to the corpora not being large enough to feature many examples of a construction as infrequent as adjunct clause extraction. It is also possible that adjunct clause extractions are more common in very informal text types that resemble spoken language (such as blogs and discussion forums), which are more accessible via Google than a corpus consisting mostly of written language.

Some trends and patterns can be discerned among the found extraction instances regarding the type of adjunct clause featured as well as the type of extraction dependency. When it comes to clause type, all examples of adjunct clause extraction found in English as well as in Danish featured *if*- and *when*-clauses, with extraction from *if*-clauses clearly being the most common in both languages (see Tables 1 and 2). The fact that no examples of extraction from *because*-clauses were found in either language is consistent with previous observations that conditional and (certain) temporal adjuncts appear to be more permissive for extraction than causal adjuncts, and that extraction possibilities thus may differ across different types of adjunct clauses (e.g., Müller 2019; Bondevik et al. 2020).

If Danish and English indeed allow extraction from *if*- and *when*-adjunct clauses, at least under some conditions, then this possibility could potentially be accommodated under accounts proposing that adjunct clause extraction is facilitated when a causal relation between the events described in the matrix and in the adjunct clause can be established (e.g., Jensen 1998, 2002; Truswell 2007, 2011). Both conditional *if*-clauses and *when*-clauses (at least *when*-clauses allowing for a generic reading) usually specify general causes or circumstances for a state of affairs expressed in the matrix clause and can thus easily be construed as being causally related to the matrix. A look at the extraction examples that we retrieved from our corpus search confirms that in all cases, a causal interpretation of the events is very natural. Often, the matrix predicate expresses a psychological condition such as *be glad* (see examples 18a–b and 19c–d) or *be angry* (see 19a) that is presented as a consequence of the proposition expressed in the adjunct clause. However, contra Truswell (2007, 2011) and Ernst (2022), extraction from causally interpreted adjuncts in English does not seem to be restricted to non-finite adjuncts.

At first glance, an account of permissible adjunct island extractions in terms of a causation relationship between adjunct and matrix clause leaves unexplained the absence of extraction from *because*-clauses in both English and Danish in our study. Since *because*-clauses explicitly mark a causal relation, they should permit extraction as easily as e.g., *if*-clauses according to the above proposal. However, there are some indications that causal adjunct clauses differ from conditional and temporal adjuncts in possessing a more elaborate internal structure that could be responsible for blocking extraction, see Müller (2017, 2019). For instance, Johnston (1994) and Sawada and Larson (2004) argue that causal clauses differ from e.g., temporal clauses semantically in having a "closed event" structure and asserting, rather than presupposing the existence of the event described in them. As Müller (2017) points out, a closed event structure could possibly also entail syntactic opacity, if semantic closure interacts with cyclic Spell-Out such that it induces Transfer of the relevant structure to the interfaces, thereby making it unavailable for subsequent extraction. Furthermore, Sawada and Larson (2004) suggest that this semantic difference may also have a syntactic parallel, in that causal connectives such as *because* combine not only with a larger semantic domain than temporal connectives, but also with a larger syntactic domain that contains additional layers of structure. This extended syntactic domain could constrain extraction possibilities, for instance, if it contains a feature or projection that causes an intervention effect with the extracted phrase.

As for the type of extraction dependency, most cases of adjunct clause extraction found involved relativization from the island, followed by topicalization in Danish. Notice that in English, only relativization and *wh*-extraction were included in the search. No cases of *wh*-extraction were found in either language. This finding lends further support to previous observations that extraction possibilities may differ across extraction dependency, with topicalization (Kush et al. 2018, 2019; Bondevik et al. 2020) and relativization (Sprouse et al. 2016; Abeillé et al. 2020) from islands reported to be more acceptable than *wh*-extraction. We return to a discussion of the role of extraction dependency in Section 5.

#### **4. Corpus Study 2—Relative Clause Extraction**

#### *4.1. Materials and Methods*

The Danish data for relative clause extractions were collected using KorpusDK. In light of the large quantity of relative clause extraction sentences that we found on KorpusDK, we decided to restrict the search to KorpusDK and did not extend the investigation to include BySoc, SamtaleBank or Google for the purpose of finding Danish relative clause extractions. Note also that Jensen (1998, 2002) has previously searched BySoc for extraction instances and retrieved the cases of relative clause extraction occurring in BySoc (see Section 2.1). The English part of the search was once again carried out using the corpora BNC and COCA, as well as Google. Because we were not able to find any instances of relative clause extraction on BNC or COCA, we extended the search to also include the corpus of Global Web-based English (Davies 2013, GloWbE, 1.9 billion words), which contains material from various websites in 20 different English-speaking countries.

For the search on KorpusDK, the search strings described in Section 3.1 were reused with some adaptations to target relative clause extraction instead of adjunct clause extraction. Thus, the position specifying the adjunct clause subordinator (e.g., *hvis*) was replaced by the relative pronouns *som* and *der* and preceded by 1–4 optional words (rather than 0–3) to allow for a head noun of the relative clause, see the example search string in (21) (targeting topicalization from a relative clause).

21. [ortho = "(Den|Det|De)"] []{0, 2} [pos = "N"] [pos = "V"] [pos = "PERS"] []{1, 4}[ortho = "(som|der)"]
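To make the token-by-token logic of (21) concrete, the following minimal sketch approximates it with a regular expression over a POS-tagged sentence. It makes several assumptions that are not from the study: the function names are hypothetical, the tag labels (`N`, `V`, `PERS`, and the others) are placeholders rather than the actual KorpusDK tagset, words are assumed not to contain spaces, and the Danish sentence in the usage note is constructed for illustration rather than taken from the corpus.

```python
import re

# Each token is encoded as "word/TAG " so that one sub-pattern equals one token.
ANY_TOKEN = r"\S+/\S+ "

QUERY_21 = re.compile(
    r"(?<!\S)(?:Den|Det|De)/\S+ "   # [ortho = "(Den|Det|De)"]
    + f"(?:{ANY_TOKEN}){{0,2}}"     # []{0, 2}  up to two optional tokens
    + r"\S+/N "                     # [pos = "N"]
    + r"\S+/V "                     # [pos = "V"]
    + r"\S+/PERS "                  # [pos = "PERS"]
    + f"(?:{ANY_TOKEN}){{1,4}}"     # []{1, 4}  room for the head noun
    + r"(?:som|der)/\S+ "           # [ortho = "(som|der)"]
)


def encode(tagged_tokens):
    """Turn [(word, tag), ...] into the 'word/TAG ' string format."""
    return "".join(f"{word}/{tag} " for word, tag in tagged_tokens)


def matches_query_21(tagged_tokens):
    """True if the tagged sentence contains a span fitting pattern (21)."""
    return QUERY_21.search(encode(tagged_tokens)) is not None
```

With a constructed sentence such as `[("Den", "DEM"), ("bog", "N"), ("kender", "V"), ("jeg", "PERS"), ("ingen", "PRON"), ("som", "REL"), ("har", "V"), ("læst", "V")]`, the function returns `True`; dropping the tokens from *som* onwards makes it return `False`. Like the original query, the sketch overgenerates, so hits would still need manual inspection.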

Also for English, the search strings used to target adjunct clause extractions (described in Section 3.1) were reused, to the extent that this was possible, and adapted to target relative clauses instead of adjunct clauses. For instance, relativization from relative clauses was targeted with strings like (22a–b), searching for a noun phrase followed by one of the relative complementizers *that*, *which* or *who*. Instead of an adjunct clause subordinator like *if*, the search string ends on a relative clause complementizer (*who* or *that*), which in turn is preceded by a pronoun position (\_p), since the head noun in the relative clause extractions reported in the literature is often an indefinite pronoun of some kind.

22. a. NOUN that|which|who \_pp \_vv \_p who
	- b. NOUN that|which|who \_pp \_vv \_p that

As in the adjunct clause search, these strings were augmented with additional unspecified positions in the matrix clause in subsequent queries. Additional searches were carried out targeting specific matrix constructions that appear to be common in relative clause extraction sentences, e.g., by using strings that target relative clauses headed by NPs with *the only* (23a) or *a lot of* (23b), and modifications of these strings.

	- b. that|which|who there [be] a lot of \* who|that

Again, Google was used to carry out further searches, mostly targeting relativization from relative clauses in constructions that appear to be common in the extraction sentences found in the literature or in our Danish material, e.g., *which I don't know anyone who* or *which there are many people who*. As in the adjunct clause study, the English search was restricted to *wh*-extraction and relativization and did not include topicalization.

#### *4.2. Results*

#### 4.2.1. Relative Clause Extraction in Danish

In total, we found 940 instances of Danish relative clause extraction on KorpusDK, out of which 910 involved topicalization from the relative clause and 30 involved relativization. Again, no instances of extraction by *wh*-question formation were found. Table 3 shows the distribution of Danish relative clause extractions across the different extraction dependencies and the type of matrix verb under which the relative clause was embedded.


**Table 3.** Distribution of the Danish relative clause extractions.

As Table 3 illustrates, the overwhelming majority of extraction instances (933 out of 940) has a form of *være* 'be' as the matrix predicate. In most of these (viz. in 866 cases), the matrix predicate *være* is part of an existential construction with *der* 'there' as the subject, as exemplified in (24).


In most of the remaining cases with matrix verb *be*, the relative clause is attached to a DP in a copular construction with a referential pronominal subject, see e.g., (25).


Only 7 cases of extraction were found that involved a different matrix verb than *være* 'be'. The matrix verbs in these were *kende* 'know', *møde* 'meet', *have* 'have', *blive* 'become', and *finde* 'find', see the examples in (26).


'It turned out to be a very costly affair that we could not find anyone who wanted to take on.' [KorpusDK 2021]

There was also a clear trend in regard to the extracted element: In 797 cases (ca. 85%), the fronted element was a simple demonstrative or personal pronoun of the type *den*/*det*/*dem* 'that'/'that'/'those' (most commonly, *det*), rather than e.g., a full noun phrase.

Finally, our query also returned 27 sentences in which a manner adverbial adjunct (*sådan* 'like that') (27) or a PP adjunct (28) rather than an argument has been extracted from the relative clause.


28. [Mod nerverne]*<sup>i</sup>* er der også noget naturmedicin, [som du kan tage \_*i*]. against nerves.the is there also some natural medicine that you can take 'There is also some natural medicine that you can take against the nerves.' [KorpusDK 2021]

#### 4.2.2. Relative Clause Extraction in English

Only 18 cases of relative clause extraction were found in the English material, all of which involved extraction by relativization. Almost all of the English examples were found on Google; the search on BNC, COCA and GloWbE only returned one instance of relative clause extraction (found on GloWbE). Only 7 of the English examples involved a form of *be* as the matrix verb (see Table 4). The remaining sentences had a form of *know* (10 cases) or *meet* (1 case) as the matrix predicate.

**Table 4.** Distribution of the English relative clause extractions.


In all 7 cases with matrix verb *be*, the relative clause modifies a *the only* DP in a copular construction, see the examples in (29).

29. a. And I always make [a plum pudding with hard sauce]*i*, which **I am the only** person [who eats \_ *<sup>i</sup>*]. [https://www.ihavenet.com/recipes/Peppermint-Pie-Christmas-Dessert-Recipe-One-for-the-Table-Recipes.html, accessed on 17 November 2021]
	- b. The compressed-encrypted stream would be as if we are using [a different language (still Zeros and Ones = 0101)]*<sup>i</sup>* which **we are the only** ones [who can decompress-decrypt \_ *<sup>i</sup>*]. [https://www.linkedin.com/pulse/recent-russian-hacking-our-country-numberus-sam-eldin?trk=public\_profile\_article\_view, accessed on 17 November 2021]

The remaining cases (featuring *know* or *meet* as matrix verb) all involve extraction of a VP from a relative clause headed by *anyone* or *anybody*, as shown in (30).

30. a. I've [done five records in the last six years]*i*, which I don't **know** anybody [who has \_*i*].

[https://www.spin.com/2021/07/david-crosby-5-albums-i-cant-live-without/, accessed on 19 November 2021]


#### *4.3. Discussion*

The quantity of sentences featuring relative clause extraction that we found in Danish (940 cases on KorpusDK alone) allows us to conclude that relative clause extraction, especially with topicalization, is a commonly produced construction in naturally occurring Danish. In fact, the frequency of relative clause extraction is roughly comparable to the rate of extraction from declarative *at* ('that')-clauses that we found using equivalent search strings, with 1250 cases found on KorpusDK (see Section 3.3). Note that this does not take into account the base frequency of complement clauses vs. relative clauses in Danish.

The considerable amount of relative clause extractions found on KorpusDK seems to support the previous observations that relative clauses do not behave like strong islands in Danish. However, it is striking that a vast majority of the found relative clause extractions (more than 92%) feature a relative clause embedded under an existential construction introduced by *der er* 'there is'. Relative clause extraction in Danish thus seems to be particularly productive in this specific environment. This is in line with similar observations for Swedish, where relative clause extraction is also reported to be most common in existential environments (e.g., Engdahl 1997; Lindahl 2017). In light of the few extraction examples that we found featuring other matrix verbs (e.g., *kende* 'know', *møde* 'meet', or *finde* 'find'), our data are compatible with the claims that MSc. relative clause extraction is in principle also possible with other (non-existential) predicates (see also Lindahl 2017). However, the production of such cases in written language appears to be exceedingly rare (given that less than 1% in our sample featured a matrix verb other than *være* 'be').
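The proportions cited in this section follow directly from the counts reported in Section 4.2.1. The short sketch below recomputes them as a sanity check; the dictionary keys and the helper `share` are illustrative labels of ours, while the counts themselves are the ones reported above.

```python
# Counts reported in Section 4.2.1 for Danish relative clause extraction.
TOTAL_EXTRACTIONS = 940

COUNTS = {
    "matrix_vaere": 933,            # any form of 'vaere' (be) as matrix predicate
    "existential_der_er": 866,      # existential construction with 'der' (there)
    "other_matrix_verb": 7,         # kende, moede, have, blive, finde
    "pronominal_fronted_item": 797, # simple demonstrative or personal pronoun
}


def share(count, total=TOTAL_EXTRACTIONS):
    """Return count as a percentage of total."""
    return 100 * count / total


if __name__ == "__main__":
    for label, count in COUNTS.items():
        print(f"{label}: {count}/{TOTAL_EXTRACTIONS} = {share(count):.1f}%")
```

The existential share comes out just above 92%, the share of matrix verbs other than *være* stays below 1%, and the share of pronominal fronted elements rounds to 85%, in line with the figures given in the text.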

One possible explanation for the uneven distribution of matrix verbs is that there is a syntactic difference between relative clauses embedded under an existential construction and other relative clauses, such that the latter form islands for extraction and the former do not. McCawley (1981) makes a proposal along these lines, by suggesting that extraction from relative clauses is possible when the relative clause is embedded in an existential or negative existential clause, as the extraction domain in that case is not a regular relative clause, but a *pseudo-relative*. Extraction from a pseudo-relative may not violate an island constraint if it is assumed that pseudo-relatives are not actually complex NPs (e.g., Casalicchio 2016).<sup>2</sup> However, this approach would not cover the cases where the matrix verb is not an existential verb, such as e.g., *finde* 'find', see example (26e). See also Lindahl (2017) for a review of Swedish extraction examples that cannot be covered by McCawley's (1981) proposal. In other words, the preference for an existential matrix verb in MSc. relative clause extractions seems to be more of a strong tendency than an absolute restriction.

This tendency can more likely be explained by a pragmatic account of islands, as suggested by Chaves and Putnam (2020). Chaves and Putnam propose that many island constraints traditionally assumed to be syntactic in nature, including the Complex NP Constraint responsible for relative clause islands, can be reduced to *Relevance Islands*: The referent that is singled out by the extraction must be sufficiently relevant for the main action described by the utterance. Chaves and Putnam's proposal builds on a line of other accounts that derive island effects from information-structural factors (e.g., Erteschik-Shir 1973; Deane 1991; Goldberg 2006, 2013; Van Valin 1994, 1996, 2005). Generally, these accounts share the assumption that extraction is only felicitous if it occurs from a constituent that is in some sense prominent or relevant in the discourse. For example, Goldberg (2006) suggests that extraction is illicit from *backgrounded* domains, since extracted phrases are typically in discourse-prominent positions, and extraction from a backgrounded domain thus causes a pragmatic clash. Chaves and Putnam (2020) recast the account by Goldberg (2006, 2013) and other related pragmatic accounts of islands in terms of the concept of *relevance*: An extracted referent "must be highly relevant (e.g., part of the evoked conventionalized world knowledge) relative to the main action that the sentence describes" (Chaves and Putnam 2020, p. 206). According to Chaves and Putnam (2020, p. 68), this relevance constraint can account for the difficulty to extract from most relative clauses: Because relative clauses tend to express presupposed or backgrounded information, a referent belonging to a relative clause can typically not be construed as sufficiently relevant for the main event. 
However, extraction may be acceptable if the relative clause is embedded under an existential *there is*/*are* or other matrix predicates that are low in semantic content, such as e.g., *know* or *have*, since in those cases, the embedded clause can be deemed to be more informative than the matrix clause, and drawing attention to a referent from it by extraction thus does not violate the pragmatic relevance principle (Chaves and Putnam 2020, p. 68). Indeed, this explanation seems to accommodate not just the cases of relative clause extraction under *there is*/*are*, but also the examples involving other matrix predicates that we found: All other matrix verbs attested in our extraction examples (*kende* 'know', *møde* 'meet', *have* 'have', *blive* 'become', and *finde* 'find') are semantically rather abstract and are thus compatible with the embedded relative clause expressing the main assertion in the utterance.

As a way to identify relevant or prominent constituents, Chaves and Putnam (2020) and Goldberg (2006, 2013) both suggest that the distinction between discourse-prominent (relevant) and backgrounded content aligns with the distinction between asserted and presupposed content, such that asserted information tends to correspond to the main action (or in Goldberg's terms, the *potential focus domain* of a sentence), whereas presupposed clauses are backgrounded (Chaves and Putnam 2020, pp. 71, 208; Goldberg 2006, p. 130). Assertions in turn can be identified by testing whether a proposition can be negated by sentential negation. This negation test correctly predicts that existential/presentational relative clauses as well as predicative relative clauses (which together represent the bulk of relative clauses involved in our examples) should allow extraction, as they express assertions and can thus successfully be negated by negating the matrix (see Lindahl 2017, p. 160; Kush et al. 2021, p. 38). However, as Lindahl (2017, pp. 160–61) points out, the negation test runs into difficulties with cleft structures, which appear to allow extraction in the MSc. languages, but are incorrectly identified as islands by the negation test, since they are presupposed (see also Kush et al. 2021, p. 38). While extraction from cleft relative clauses was rare in our material, we did find 8 instances of extraction from a cleft clause in the Danish corpus that would be left unaccounted for under the proposal that islands correspond to presupposed clauses. Consider e.g., example (31a), which involves extraction from a cleft clause. As (31b) shows, it is not possible to negate the proposition expressed in the cleft relative clause by negating the matrix of the non-extracted version of this sentence.


'It is not primarily the center-left parties that have answered them.'

→ Someone has replied to them.

However, Kush et al. (2021) point out that another related information-structural factor seems to make the right cut between the types of relative clauses that allow for extraction and those that do not, viz. whether or not the clause in question conveys new information: "The [relative clauses] that allow movement are those that contribute wholly or partially new information to the discourse (or at least information that need not be known to the hearer)" (Kush et al. 2021, p. 39). This proposal can account for the possibility to extract from cleft structures, since cleft clauses (despite being presupposed) may convey new information. For example, it is possible to utter the sentence in (31a) even if the information provided in the cleft clause (that someone has answered them) is new in the discourse. Kush et al. (2021) further suggest that clauses can be said to provide new information when they contain the sentence's *main point of utterance* (MPU; Simons 2007). Since the matrix predicate in (31a) and in other cleft sentences is almost void of semantic content, the embedded relative clause necessarily constitutes the MPU in cleft sentences like (31a). An account of transparent relative clauses in terms of new information or MPU can arguably also account for our extraction examples where the matrix predicate is not *være* 'be'. As pointed out above, the other matrix verbs found in our extraction examples (e.g., *kende* 'know', *have* 'have', and *blive* 'become') are also low in semantic content and are thus compatible with the relative clause constituting the MPU. Finally, this proposal also has the potential to account for some of the Swedish examples of relative clause extraction that Lindahl (2017) has shown to be problematic for an account in terms of backgrounded or presupposed constituents, viz. examples with *beundra* 'admire' or *störa sig på* 'be annoyed by' as matrix predicates (see 6a–b). As Lindahl (2017, p. 161) shows, the clauses embedded under these verbs fail the negation test and are therefore considered to be backgrounded, yet they seem to allow extraction. However, it seems possible that clauses in the complement of these verbs still may constitute a sentence's MPU: According to Simons (2007), an embedded clause can be the MPU if the matrix predicate conveys the speaker's emotional orientation towards the information in the embedded clause, as the matrix verb is considered parenthetical in that case. This seems to fit the function of the matrix verbs in the Swedish examples mentioned above. Even though these examples are quite different from the ones discussed in Simons (2007), the predicates 'admiring' and 'being annoyed' can also be argued to express an emotional orientation towards the content of the relative clause.

Although we acknowledge that these remarks are of a very preliminary nature and require further investigation, we tentatively suggest (along with Kush et al. 2021) that a pragmatic account in terms of new information or MPU could make the relevant distinction between relative clauses that allow for extraction and those that do not (to the extent that a language allows extraction from relative clauses at all). However, more work is required to flesh out how exactly Simons' (2007) proposal could be adapted to relative clauses.

Unlike with adjunct clause extraction, the extraction dependency that was by far most frequent in our sample of Danish relative clause extractions was topicalization, with relativization taking second place. However, the results from our relative clause study share with the adjunct clause study that none of the found extraction instances involved *wh*-extraction. This finding thus further strengthens the conjecture that topicalization and relativization from islands appear to be easier than *wh*-extraction. Our results regarding the distribution of different extraction dependencies in relative clause extraction match Lindahl's (2017, p. 47) finding that topicalization was by far the most common extraction dependency among her sample of Swedish relative clause extractions, whereas relativization occurred only in a few cases and *wh*-extraction from relative clauses remained unattested.

Finally, our finding that not just arguments, but also adverbial or PP adjuncts are extracted from relative clauses in our sample (see examples 27 and 28) demonstrates that relative clause extraction in Danish apparently is not restricted to argument DPs, but that phrases of different categories can be topicalized (see Lindahl 2017 for a similar observation for Swedish relative clauses). If MSc. relative clauses indeed allow for extraction of (some) adjuncts, as these observations suggest, this would distinguish MSc. relative clauses not just from strong islands, but also from traditional weak islands that generally permit extraction of arguments, but not adjuncts (Huang 1982; Szabolcsi 2006).

As for English, the extraction instances that we found on Google and GloWbE seem to support the anecdotal evidence that naturally produced cases of extraction from English relative clauses do occasionally appear in authentic language use. However, the low frequency of the results prevents a clear conclusion as to whether relative clause extraction in English is a phenomenon that extends beyond constructed examples and only sporadically found natural cases. It should, however, also be mentioned that the frequency for relative clause extractions cannot be directly compared between our Danish and English study, since the search methods we could employ for the English corpus study were somewhat restricted in comparison to the Danish study—in part because of limits on the length and generality of the search strings that can be used to query BNC, COCA and GloWbE, and in part due to the polysemy of English *that*. We were thus not able to conduct a search for English relative clause extractions that was as systematic and extensive as the Danish part of this study.

It is, however, remarkable that no cases of extraction from relative clauses in existential environments were found in English, given that a large part of our search methods specifically targeted such constructions, and that extraction from existential relative clauses was so common in the Danish material. Instead, the majority of the examples we found involved VP-extraction from a relative clause embedded under *know* and headed by *anyone* or *anybody*. In light of the low number of cases that we found in English, we refrain from any further analysis and conclusions regarding potential patterns or trends in English relative clause extraction.

#### **5. General Discussion**

Extraction from relative clauses and adjunct clauses is intuitively acceptable to native speakers of the MSc. languages, at least under certain conditions. Such cases of acceptable adjunct and relative clause extraction pose a puzzle for syntactic research, as these structures seem to violate island constraints assumed to apply universally. However, a few recent experimental studies report very low acceptability ratings for island extraction sentences in MSc., which has raised the question of how these experimental findings can be reconciled with the observation that adjuncts and relative clauses do not seem to behave like islands in MSc. Our corpus studies show that both extraction from finite adjuncts and from relative clauses can be attested at a non-trivial rate in naturally produced Danish. This finding lends support to the notion that extraction from adjunct and relative clauses is possible in Danish, at least in some environments, and that these constructions thus are not absolute islands in Danish.

Perhaps more surprisingly, we were able to show that extraction from some types of finite adjunct clauses and relative clauses also occurs in naturally produced English (although the number of both types of extractions we could find in English remained very small). Due to the small number of examples we could find and the fact that most of our English examples were retrieved from Google, our findings do not permit us to draw any further conclusions regarding whether or not English generally allows extraction from traditional strong islands. However, we want to point out that the instances of extraction from finite *if*-adjunct clauses that we found in English support recent experimental findings suggesting that at least *if*-adjunct clauses in English may be transparent for extraction by relativization (Sprouse et al. 2016; Nyvad et al. 2022).

Moreover, many of the trends and patterns that can be observed among our found extraction instances are shared by English and Danish. First, to the extent that extraction from adjunct clauses and from relative clauses is attested in these languages, it seems to occur only with relativization (and in Danish, topicalization) from the island, but remains unattested with *wh*-extraction. Second, adjunct clause extraction in both languages appeared most frequently from *if*-clauses and only in a few instances from a *when*-clause. Extraction from *because*-clauses was unattested in both Danish and English. Third, relative clause extraction occurred only with a very limited number of different verbs as the matrix predicate in both languages. These similarities can be taken as an indication that English and Danish are possibly more similar with regard to strong islands than previously assumed, in the sense that these languages facilitate island extraction in similar environments.

We already discussed the trends that we found pertaining to specifically adjunct or relative clause extraction above (viz. the type of adjunct clause typically involved in adjunct clause extraction, or the type of matrix verb featured in relative clause extraction). One feature shared by both types of islands is the type of extraction dependency found in naturally occurring extractions. Specifically, we found that all retrieved instances of relative and adjunct clause extraction involved topicalization or relativization from the island, whereas *wh*-extraction remained unattested. This observation reinforces the notion that island extraction possibilities appear to vary across different types of extraction dependency, with *wh*-extraction being more restricted than the other dependency forms (Kush et al. 2018, 2019; Bondevik et al. 2020; Sprouse et al. 2016; Abeillé et al. 2020).

A difference between for instance relativization and *wh*-extraction could perhaps be derived from a pragmatic account of island constraints, as proposed by Chaves and Putnam (2020) (see also Goldberg 2006). As outlined above, Chaves and Putnam (2020) suggest that relative clauses and adjuncts are islands because they usually convey backgrounded information, and extraction from them thus draws attention to a referent that cannot be construed as sufficiently relevant for the main action, thereby violating a pragmatic relevance principle. However, if extraction is in the form of relativization, this relevance constraint can according to Chaves and Putnam (2020) be circumvented, since relatives "express assertions rather than backgrounded information" (p. 91). Similarly, Abeillé et al. (2020) provide an account according to which relativization from backgrounded constituents may be easier to accept than *wh*-extraction, since, unlike *wh*-extraction, relativization does not put the fronted element into focus and can thus avoid a conflict with the backgrounded status of the domain that is extracted from. This account could be extended to cover the apparent possibility of topicalization from some islands (in the MSc. languages). While topicalization can target focused elements, it is crucially also used frequently to front non-focal elements, and can thus pattern with relativization in not requiring the extracted element to be focused.

One potential problem of this approach is that the adjunct clauses and relative clauses primarily featured in our attested examples are of a type that should already help circumvent the backgroundedness or relevance constraint. As we discussed in Section 3.3, all adjunct clauses featured in our sample express a cause or general condition for the (often psychological) condition in the matrix clause, and can thus be considered to express at-issue information for the main action (see Chaves and Putnam 2020, p. 91). Similarly, in Section 4.3 we observed that the relative clause extractions we found have in common that the matrix predicate is void of or very low in semantic content and thus allows for the relative clause to express new and relevant information, or the main point of utterance. Following a proposal along the lines of Chaves and Putnam (2020) or Abeillé et al. (2020), one would expect that it should not be problematic to extract an element from these kinds of adjunct or relative clauses even if the extraction dependency puts the fronted element into focus (as is the case with *wh*-questions), since the domain that it is extracted from is no longer backgrounded and thus should no longer constrain constructions that make the extracted element prominent in the discourse. This would leave unexplained the absence of *wh*-extraction in our sample.

A possible alternative explanation is that the relative and adjunct clauses in our examples, despite being less backgrounded, are still behaving like a type of *weak island* (Szabolcsi 2006), possibly because of a syntactic property. Weak islands are known to permit extraction in some configurations (e.g., if the extracted element is a discourse-linked argument), but not in others. Adjuncts and relative clauses in MSc. have previously been suggested to be weak islands based on their selective extraction behavior (e.g., Lindahl 2017; Müller 2019).

A prominent approach to weak islands is in terms of *featural Relativized Minimality* (Starke 2001; Rizzi 2013). According to featural RM, the acceptability of weak island extraction is dependent on the featural specification of the involved elements: An element can be extracted as long as the movement path does not cross an intervening element that is more richly specified than the extracted element, as this would lead to an intervention effect. Weak island effects caused by adjunct and relative clauses can be accounted for under featural RM by assuming that these clauses involve an operator in their left periphery that can cause said intervention effect (see e.g., Demirdache and Uribe-Etxebarria 2004; Bhatt and Pancheva 2006; Haegeman 2010, 2012 for evidence that conditional and temporal adjunct clauses are derived by operator movement). In order to derive differences between dependency types from an RM account, one would have to assume that topicalization and relativization differ from *wh*-extraction in that they both involve movement of an element that carries a richer featural specification than the operator present in relative or adjunct clauses. There are several proposals that group topicalization and relativization together, based on their close relation (e.g., Kuno 1976; Williams 2011; Abels 2012; Douglas 2016). Williams (2011, 2013) shows several ways in which relatives and topicalization pattern together against *wh*-questions, and Douglas (2016, p. 147) specifically suggests that topicalization and relativization share a feature (presumably for a discourse-linking property) that *wh*-questions do not have.
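The intervention logic described above can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions, not an implementation of any published formalization: feature bundles are modelled as sets, the feature labels `Op` and `D` are hypothetical placeholders, and we assume that an intervener blocks movement whenever it carries every feature of the moved element (so that identical specifications also count as intervention, which goes slightly beyond the wording in the text).

```python
from typing import FrozenSet, Iterable

Features = FrozenSet[str]


def is_blocked(extracted: Features, intervener: Features) -> bool:
    """True if the intervener causes an intervention effect.

    Assumption of this sketch: intervention arises when every feature
    of the extracted element is also present on the intervener, i.e.
    the intervener is at least as richly specified.
    """
    return bool(extracted) and extracted <= intervener


def can_extract(extracted: Features, path: Iterable[Features]) -> bool:
    """True if no element on the movement path blocks the extraction."""
    return not any(is_blocked(extracted, z) for z in path)


# Hypothetical feature bundles, chosen for illustration only.
ISLAND_OPERATOR = frozenset({"Op"})         # operator in the clause's left periphery
BARE_WH = frozenset({"Op"})                 # bare wh-phrase
TOPIC_OR_RELATIVE = frozenset({"Op", "D"})  # carries an extra (discourse-related) feature

if __name__ == "__main__":
    print(can_extract(TOPIC_OR_RELATIVE, [ISLAND_OPERATOR]))  # True
    print(can_extract(BARE_WH, [ISLAND_OPERATOR]))            # False
```

Under these assumptions, an element with a richer specification than the clause-internal operator escapes the island, whereas an element with the same or a poorer specification does not.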

The additional feature that is assumed to be present in the types of phrases that can escape from weak islands is often taken to be a referential or *discourse-linking* (Pesetsky 1987) feature. This should also enable extraction by question formation from the island, as long as the extracted phrase is a lexically restricted *wh*-phrase of the type *which NP*, as these favor a D-linked interpretation. Nevertheless, extraction of complex *wh*-phrases is just as absent from our sample as extraction of bare *wh*-phrases. However, there have been suggestions that more fine-grained interpretive properties govern the possibility of weak island extraction: Starke (2001) and Baunaz (2011) observe that extraction from some weak islands seems to trigger presupposition of a specific referent that must be familiar to both speaker and hearer, rather than just a D-linked interpretation, where the extracted referent is expected to be part of a set of alternatives that is salient in the discourse. It is possible that topicalization and relativization fulfill these requirements for a specific reference more easily than question formation. In fact, Starke (2001) and Baunaz (2011) point out that questions which trigger the specific reading of the *wh*-phrase require a very rich context and are akin to echo-questions. This could then explain why *wh*-extraction from e.g., relative clauses is judged to be possible with constructed examples, especially with an echo-question reading (see Engdahl 1997; Lindahl 2017), but occurs rarely in production, as the appropriate kinds of context can be constructed if necessary, but probably do not often occur in (written) natural language. We acknowledge the speculative nature of these remarks and leave it to future research to investigate to what extent a more fine-grained approach to feature composition can account for potential asymmetries between different extraction dependencies.

As a final note, we would like to mention that we think these results have the potential to complement and inform future formal studies of island phenomena, in the sense that many acceptability studies aim to use stimuli that are modelled on naturally occurring data. However, our corpus study revealed for instance that most naturally attested island extractions in Danish and English involve relativization or (in Danish) topicalization, while *wh*-extraction remained unattested. In this regard, the naturally produced extractions contrast with the test sentences used in most acceptability judgment experiments on island extractions, which—at least in English—have focused on testing extraction in the form of *wh*-movement. At the same time, further formal acceptability studies of island constructions e.g., in Danish, are also necessary to get a better understanding of the source of the apparent mismatches between experiments and naturally occurring production.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/languages7020125/s1.

**Author Contributions:** Conceptualization, C.M. and C.U.E.; methodology, C.M. and C.U.E.; formal analysis, C.M. and C.U.E.; investigation, C.M. and C.U.E.; data curation, C.M. and C.U.E.; writing original draft preparation, C.M.; writing—review and editing, C.M. and C.U.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Independent Research Fund Denmark (grant DFF-9062-00047B).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available in the Supplementary Materials.

**Acknowledgments:** We wish to thank Anne Mette Nyvad, Ken Ramshøj Christensen, and Sten Vikner for assistance with the design of the corpus studies and for providing valuable comments on this research. For further comments and suggestions, we would like to thank two anonymous reviewers as well as the editors for this *Languages* issue.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Notes**


#### **References**

Abeillé, Anne, Barbara Hemforth, Elodie Winckel, and Edward Gibson. 2020. Extraction from subjects: Differences in acceptability depend on the discourse function of the construction. *Cognition* 204: 104293. [CrossRef] [PubMed]

Abels, Klaus. 2012. The Italian left periphery: A view from locality. *Linguistic Inquiry* 43: 229–54. [CrossRef]


Baunaz, Lena. 2011. *The Grammar of French Quantification*. Dordrecht and New York: Springer.

Bhatt, Rajesh, and Roumyana Pancheva. 2006. Conditionals. In *The Blackwell Companion to Syntax*. Edited by Martin Everaert and Henk van Riemsdijk. Oxford and Boston: Blackwell Publishing, pp. 638–87.

Boeckx, Cedric. 2012. *Syntactic Islands*. Cambridge: Cambridge University Press.


Casalicchio, Jan. 2016. Pseudo-relatives and their left-periphery. A unified account. In *Romance Languages and Linguistic Theory 10: Selected Papers from 'Going Romance' 28, Lisbon*. Edited by Ernestina Carrilho, Alexandra Fiéis, Maria Lobo and Sandra Pereira. Philadelphia: Benjamins, pp. 23–42.

Cattell, Ray. 1976. Constraints on movement rules. *Language* 52: 18–50. [CrossRef]

Chaves, Rui P., and Michael T. Putnam. 2020. *Unbounded Dependency Constructions*. Oxford: Oxford University Press.

Chomsky, Noam. 1986. *Barriers*. Cambridge: MIT Press.

Christensen, Ken Ramshøj, and Anne Mette Nyvad. 2014. On the nature of escapable relative islands. *Nordic Journal of Linguistics* 37: 29–45. [CrossRef]

Christensen, Ken Ramshøj, and Anne Mette Nyvad. 2022. The Island Is Still There: Experimental Evidence for the Inescapability of Relative Clauses in English. *Studia Linguistica* 2022: 1–25. [CrossRef]

Chung, Sandra, and James McCloskey. 1983. On the interpretation of certain island facts in GPSG. *Linguistic Inquiry* 14: 704–13.

Corpus of Contemporary American English. 2008. Available online: http://corpus.byu.edu/coca/ (accessed on 19 November 2021).

Culicover, Peter W. 1999. *Syntactic Nuts: Hard Cases, Syntactic Theory and Language Acquisition*. Oxford: Oxford University Press.

Davies, Mark. 2013. Corpus of Global Web-Based English: 1.9 Billion Words from Speakers in 20 Countries (GloWbE). Available online: https://corpus.byu.edu/glowbe/ (accessed on 19 November 2021).

Deane, Paul. 1991. Limits to attention: A cognitive theory of island phenomena. *Cognitive Linguistics* 2: 1–63. [CrossRef]

Demirdache, Hamida, and Myriam Uribe-Etxebarria. 2004. The syntax of time adverbs. In *The Syntax of Time*. Edited by Jacqueline Guéron and Jacqueline Lecarme. Cambridge: MIT Press, pp. 143–80.

Douglas, Jamie A. 2016. The Syntactic Structures of Relativisation. Ph.D. dissertation, University of Cambridge, Cambridge, UK.

Engdahl, Elisabeth. 1997. Relative clause extractions in context. *Working Papers in Scandinavian Syntax* 60: 59–86.

Ernst, Thomas. 2022. The adjunct condition and the nature of adjuncts. *The Linguistic Review* 39: 85–128. [CrossRef]

Erteschik-Shir, Nomi. 1973. On the Nature of Island Constraints. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA.

Erteschik-Shir, Nomi. 1982. Extractability in Danish and the pragmatic principle of dominance. In *Readings on Unbounded Dependencies in Scandinavian Languages*. Edited by Elisabeth Engdahl and Eva Ejerhed. Stockholm: Almqvist & Wiksell, pp. 175–91.

Goldberg, Adele. 2006. *Constructions at Work. The Nature of Generalization in Language*. New York: Oxford University Press.


Haegeman, Liliane. 2010. The internal syntax of adverbial clauses. *Lingua* 120: 628–48. [CrossRef]


Heinat, Fredrik, and Anna-Lena Wiklund. 2015. Scandinavian relative clause extraction. Apparent restrictions. *Working Papers in Scandinavian Syntax* 94: 36–50.

Hofmeister, Philip, and Ivan A. Sag. 2010. Cognitive constraints and island effects. *Language* 86: 366–415. [CrossRef]

Huang, James C.-T. 1982. Logical Relations in Chinese and the Theory of Grammar. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, USA.

Jensen, Anne. 1998. Knudekonstruktioner—En syntaktisk, semantisk og pragmatisk analyse av sætningsknuder i dansk. Master's thesis, Københavns Universitet, Copenhagen, Denmark.

Jensen, Anne. 2002. Sætningsknuder i dansk [Sentence intertwining in Danish]. *NyS—Nydanske Studier & Almen Kommunikationsteori* 29: 105–24.

Johnston, Michael. 1994. The Syntax and Semantics of Adverbial Adjuncts. Ph.D. dissertation, University of California, Santa Cruz, CA, USA.

KorpusDK. 2021. Available online: https://ordnet.dk/korpusdk/ (accessed on 19 November 2021).

Kuno, Susumu. 1976. Subject, theme, and the speaker's empathy: A re-examination of relativization phenomena. In *Subject and Topic*. Edited by Charles N. Li. New York: Academic Press, pp. 417–44.

Kush, Dave, Akira Omaki, and Norbert Hornstein. 2013. Microvariation in Islands? In *Experimental Syntax and Island Effects*. Edited by Jon Sprouse and Norbert Hornstein. Cambridge: Cambridge University Press, pp. 239–65.

Kush, Dave, Charlotte Sant, and Sunniva Briså Strætkvern. 2021. Learning Island-insensitivity from the input: A corpus analysis of child- and youth-directed text in Norwegian. *Glossa* 6: 105. [CrossRef]

Kush, Dave, Terje Lohndal, and Jon Sprouse. 2018. Investigating variation in island effects: A case study of Norwegian Wh-extraction. *Natural Language & Linguistic Theory* 36: 743–79. [CrossRef]


Vincent, Jake Wayne. 2021. Extraction from Relative Clauses: An Experimental Investigation into Variable Island Effects in English. Ph.D. dissertation, University of California, Santa Cruz, CA, USA.

Wiklund, Anna-Lena, Fredrik Heinat, Eva Klingvall, and Damon Tutunjian. 2017. An acceptability study of long-distance extractions in Swedish. In *Language Processing and Disorders*. Edited by Linda Escobar, Vincenç Torrens and Teresa Parodi. Newcastle upon Tyne: Cambridge Scholars Publishing, pp. 103–20.

Williams, Edwin S. 2011. *Regimes of Derivation in Syntax and Morphology*. London: Routledge.

Williams, Edwin S. 2013. Generative Semantics, Generative Morphosyntax. *Syntax* 16: 77–108. [CrossRef]
