What Fires Together, Wires Together: The Effect of Idiomatic Co-Occurrence on Lexical Networks

Sprenger, Simone A.; Beck, Sara D.; Weber, Andrea

doi:10.3390/languages9030105


by Simone A. Sprenger ¹,*, Sara D. Beck ² and Andrea Weber ²

¹ Faculty of Arts, University of Groningen, 9712 CP Groningen, The Netherlands

² Faculty of Humanities, University of Tübingen, 72074 Tübingen, Germany

* Author to whom correspondence should be addressed.

Languages 2024, 9(3), 105; https://doi.org/10.3390/languages9030105

Submission received: 31 August 2023 / Revised: 5 March 2024 / Accepted: 8 March 2024 / Published: 18 March 2024

(This article belongs to the Special Issue Idiomatic and Formulaic Language: Learning, Processing and Representation)


Abstract

This study investigated the processing of lexical elements of idioms in isolation. Using visual word priming, spreading activation for idiomatically related word pairs (e.g., pop–question) was compared to that for semantically related (e.g., answer–question) and unrelated word pairs (e.g., trim–question) in two experiments varying in SOA (500 ms and 350 ms). In line with hybrid theories of idiom representation and processing, facilitatory priming was found in both experiments for idiomatic primes, suggesting a tight link between the words of an idiom that is mediated by a common idiom representation. While idiomatic priming was stable across SOAs, semantic priming was stronger for the short SOA, implying fast and early activation. In conclusion, one lexical element of an idiom can facilitate the processing of another, even if the elements are not presented within a phrasal context (i.e., within an idiom), and without the words being semantically related. We discuss our findings in light of theories about idiom processing, as well as current findings in the field of semantic priming.

Keywords: idioms; (semantic) priming; lexical decision

1. Introduction

Idioms, such as “to bury the hatchet” (meaning to end a conflict), form an important part of the native speaker’s phrasal vocabulary, which is frequently used in everyday communication (Pawley and Syder 1983). One of the key questions in psycholinguistic research on idioms concerns how this knowledge is represented and accessed by the language user. Hybrid theories of idiom representation and processing (e.g., Cacciari and Tabossi 1988; Titone and Connine 1999; Sprenger et al. 2006) assume that idioms are represented both in terms of their lexical elements and in terms of an overarching idiom representation that serves to connect these elements with the phrasal meaning (e.g., a superlemma). These theories therefore predict that lexical co-occurrence within a fixed expression has a lasting effect on the structure of the mental lexicon: words that share aspects of neither their meaning nor their form will be “wired together”, because they are processed as part of the same fixed linguistic structure. While the main argument for such a structure is parsimony (i.e., idioms make use of existing lexical representations with their own literal meanings), it also can explain how idioms contribute to fluent speech (Dechert 1983; Pawley and Syder 1983; Kuiper 1995) and more effortless comprehension: whenever one element of an idiom is activated, it is predicted to spread activation to the idiom’s remaining lexical elements, making the retrieval of those words from long-term memory faster and less error-prone.

In line with this hypothesis, the facilitatory effects of idioms in their phrasal contexts have consistently been observed in language comprehension research (see Conklin and Schmitt 2012 for a review). For example, in an eye-tracking study, Siyanova-Chanturia et al. (2011) showed that native speakers of English read idiomatic sequences (left a bad taste in my mouth) significantly faster than matched control phrases (the bad taste left in his mouth). Likewise, Carrol and Conklin (2020) tested reading times for idioms (spill the beans) but also for binomials (bread and butter) and collocations (defined by the authors as “combinations of words that are entirely compositional and semantically ‘free’, but which co-occur in conventional and recurrent patterns” (p. 97); e.g., classic example). They found significant processing advantages for all three types of formulaic language, with phrase frequency being a particularly strong predictor of idiom reading times, and with greater idiom familiarity leading to a higher rate of final-word skipping. The latter finding implies that readers could predict the final word because they had already recognized the idiom.

Taken together, experiments on idiomatic processing advantages in phrasal contexts support hybrid models of idiom representation and processing, suggesting tight links between the words of an idiom that are mediated by a common idiom representation. Here, we want to take this research a step further by studying the processing of lexical elements of idioms in isolation. More specifically, we ask whether the long-term storage of idiomatic expressions affects the organization of single lexical items in the mental lexicon. If idiom representations indeed tie their lexical elements together, enabling spreading activation from one element to another, the facilitatory effects should not depend on idiom recognition in a phrasal context but should independently occur for lexical elements. In other words, bury should prime hatchet, even if presented in isolation. To this aim, we studied the representation and processing of idiom words without a phrasal context.

We tested our hypothesis that idiom words can prime each other when presented in isolation by means of a primed lexical decision task with idiomatic prime–target pairs mixed with pairs that are semantically related, are unrelated, or contain nonwords. We report the results of two experiments in which we varied the Stimulus Onset Asynchrony (SOA) between the prime and target. If idiom knowledge results in the creation of connections between the idiom’s lexical elements in the mental lexicon, we should see facilitatory priming effects between idiom words in a primed lexical decision task. An additional question concerns the nature of these effects. In our view, an idiomatic priming effect should be qualitatively different from the effect of semantic priming, as it originates at a different level of processing, involving different types of representations. This assumption is based on Collins and Loftus’s (1975) highly influential model of semantic memory, which is also reflected in Levelt et al.’s (1999) model of word production and, accordingly, the superlemma model (Sprenger et al. 2006). Collins and Loftus clearly distinguish between a semantic (conceptual) network and a lexical network (i.e., what we refer to as the mental lexicon). In the case of idiomatic priming, the connection between the two words involved is located within the lexicon, mediated by a phrasal representation, and therefore independent of possible semantic links between the underlying concepts. In contrast, in the case of semantic priming, the representations involved are abstract concepts that represent the meaning of a word. If these concepts share semantic features, they are expected to prime each other (Lucas 2000). 
If the nature of the connection between idiom words differs from that of an associative semantic connection because it is mediated by a common idiom representation, rather than semantic feature overlap, the time course of idiom word priming should differ from the time course of semantic priming. Whether semantic or idiomatic priming should be faster cannot easily be predicted, though: while activation spreading via a common idiom representation might be somewhat slower than the activation of a semantic network due to the additional node, the level of lexical processing precedes that of semantic analysis during comprehension. We therefore predict priming effects for both types of relationships, with a possible time course difference in either direction.

Importantly, research on semantic memory (which typically uses verbal materials) has looked into prime–target pairs that are quite similar to those under consideration in our study (Roelke et al. 2018; Hofmann et al. 2022), based on the idea that the frequency with which words co-occur in a language plays an important role in the organization of their long-term storage. Specifically, compound-cue theory (McKoon and Ratcliff 1989, 1992; Ratcliff and McKoon 1988, 1995) states that priming effects result from the familiarity of the prime and target as a compound (not in the linguistic sense), and that such a compound “is formed by the simultaneous presence of the prime and target in short-term memory as a test item” (McKoon and Ratcliff 1992, p. 1155). We will return to the similarities and differences between an approach based on models of idiom representation and processing and the compound-cue theory of semantic priming in the discussion. It should be noted that both approaches have one thing in common: they consider the extent of priming that can be observed for non-associatively related word pairs to be affected by their familiarity, which McKoon and Ratcliff (1992) operationalized as the frequency of co-occurrence in a linguistic corpus. However, while the compound-cue theory locates the familiarity effect at the conceptual level, theories of idiom representation and processing locate it in the mental lexicon (i.e., at the level of linguistic representations).

2. Experiment 1

The experiment was conducted online using Gorilla Experiment Builder (https://gorilla.sc/, accessed on 23 January 2021). A primed visual lexical decision task was conducted to explore the nature of the connections between words within idioms, followed by a familiarity-rating task of the idioms used.

2.1. Method

2.1.1. Participants

Ninety native speakers of English (age 18–35, mean 30.2, SD = 3.26, 43 female) were recruited via Prolific (https://www.prolific.co/, accessed on 1 February 2021). Participation was restricted to residents of the United States and those who grew up in a monolingual English-speaking household. The recruitment, payment, and procedure followed the standard practices of ethical consent according to the LingTüLab’s approval from the DFG (German Research Foundation).

2.1.2. Pre-Study

In order to develop idiom and prime stimuli, 89 highly familiar idioms were first pre-tested with the three primes (idiomatic, semantic, unrelated) for association strength. Data were collected in an online rating study using Alchemer (https://alchemer.com/, accessed on 7 March 2024, formerly called SurveyGizmo LLC), and L1 speakers of American English were recruited via Prolific. An independent sample of 59 adult participants (33 male, 2 undisclosed, average age = 31, SD = 5.60) rated the word pairs for association strength (each target once with one of the three primes, across lists). Participants rated the association of each prime with one target on a scale from 1 (not associated at all) to 7 (very strongly associated). In the instructions, some semantically and idiomatically related primes were given as examples of associated pairs in order to clarify what was meant by association strength. The participants from the pre-test did not take part in the priming study.

For the final selection, the idioms with the most similar association strengths between idiomatic and semantic primes and targets were selected. Overall, 60 idioms showed high association strengths for both the idiomatic and semantic priming pairs (5.19 and 6.51 on a scale of 1–7, respectively) and a low association strength for unrelated pairs (1.52). Both the idiomatic and semantic priming pairs differed significantly from the unrelated pairs (t(97) = −33.99, p < 0.001, t(108) = 71.97, p < 0.001, respectively). The semantically primed pairs also showed a higher overall associative rating than the idiomatic pairs (t(81) = 13.04, p < 0.001), but this disparity is in line with other research using idiomatic priming pairs (see, e.g., Beck and Weber 2016). Additionally, all target word conditions were controlled for lexical frequency (see Supplementary Materials).

2.1.3. Materials

Sixty familiar idioms were selected for use in the experiment. The final idiomatic word was used as the target, and three different types of prime words were selected to create a total of 180 word pairs (3 word pairs per idiom): 60 idiomatically related word pairs, 60 semantically related word pairs, and 60 unrelated word pairs (see Table 1). Idiomatic pairs were developed primarily from familiar idioms using existing English idiom databases (Beck and Weber 2016; Libben and Titone 2008; Nordmann and Jambazova 2017; Titone and Connine 1994) and included the final word of the idiom as the target and a preceding content word as the prime (occurring 1–3 words prior to the final idiom word). For instance, from the idiom to pop the question, meaning to make a marriage proposal, the idiomatic pair was POP (prime) and QUESTION (target). Semantically related pairs were created by using backward associations from the same target word (QUESTION) using the word association database Small World of Words (www.smallworldofwords.org; e.g., ANSWER). Table 1 shows the example set of all three prime–target pairs. Critically, idiomatically related word pairs were not also semantically associated (e.g., in the idiom two peas in a pod, PEAS and POD are both idiomatically and semantically associated and therefore were not included in the materials). The absence of a semantic association in the idiomatically related word pairs was verified by the second author, who is a native speaker of American English and a linguist by training. Finally, the unrelated primes were chosen at random (e.g., TRIM). To check that no idiomatic or semantic relations had been selected by chance, all choices were based on association strength (see the Pre-Study Section above) and additionally reviewed by a native speaker (i.e., author 2).

These target items were divided into three counterbalanced lists, each containing 20 idiomatic pairs, 20 semantically related pairs, and 20 unrelated pairs, so that each target word was presented only once per list. In addition to the three sets of 60 target word pairs, 100 filler pairs were developed. Twenty fillers were idiomatic word pairs (e.g., FIT and FIDDLE from the idiom fit as a fiddle), such that the ratio of idiomatic to non-idiomatic word pairs was balanced. Finally, 80 word and nonword pairs (e.g., DIRTY and ROGGLE) were added so that half of each list consisted of pseudoword targets that are nonexistent words in English but are phonotactically acceptable. Including target and filler pairs, each of the three lists then consisted of 160 word pairs.
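The counterbalancing described above can be illustrated with a short Python sketch. It assumes a simple Latin-square rotation of the three prime conditions over the 60 targets; the paper does not state how items were actually assigned to lists, and the target labels and the function name `build_lists` are placeholders introduced here.

```python
from collections import Counter

CONDITIONS = ("idiomatic", "semantic", "unrelated")


def build_lists(targets, n_lists=3):
    """Rotate conditions over targets so each list contains every target once."""
    lists = []
    for list_idx in range(n_lists):
        trials = []
        for item_idx, target in enumerate(targets):
            # Shift the condition by one position for each successive list.
            condition = CONDITIONS[(item_idx + list_idx) % len(CONDITIONS)]
            trials.append((target, condition))
        lists.append(trials)
    return lists


# Placeholder labels; the real items are in the Supplementary Materials.
targets = [f"TARGET_{i:02d}" for i in range(60)]
lists = build_lists(targets)

for trials in lists:
    counts = Counter(condition for _, condition in trials)
    # 60 critical pairs + 20 idiomatic fillers + 80 nonword pairs per list
    total_trials = len(trials) + 20 + 80
    print(dict(counts), total_trials)
```

Under this rotation, every list holds 20 pairs per condition, and each target appears with all three primes across the three lists.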

2.1.4. Procedure and Design

The experiment was designed and hosted online via Gorilla Experiment Builder (https://gorilla.sc/, accessed on 12 April 2021) and included the lexical decision task, a familiarity-rating task, and a short language background questionnaire, taking about 20 min in total. Participants were asked to participate in a quiet environment on a laptop or desktop using Google Chrome in full-screen mode (Gorilla did not allow participants to continue without fulfilling the latter two requirements).

For the lexical decision task, participants were instructed that they would see two strings of letters consecutively on the screen, and they should decide as quickly and accurately as possible whether the second string of letters was an existing English word or not. Responses were recorded with the letters “F” and “J” on each participant’s keyboard. For right-handed participants, “J” (the right key) corresponded to “YES” and “F” corresponded to “NO”, and vice versa for left-handed participants, so that the dominant hand was used for a positive response. Reaction times for keyboard presses were recorded by Gorilla and used for analysis (for the timing accuracy of online reaction time experiments using Gorilla, see, e.g., Anwyl-Irvine et al. 2021).

After participants gave consent and selected their dominant hand, the experiment began. Prior to the 160 target word pairs, 6 practice items (3 target words and 3 target pseudowords) were presented, and feedback was given on accuracy to make sure that participants understood the task and were participating to the best of their abilities. Following the practice trials, the 160 experimental trials were presented in a randomized order for each individual participant. Following every 40 trials, participants were able to take a short break and press a key to continue the experiment. Each trial began with a fixation cross, presented for 1500 ms, followed by a visual presentation of the prime word for 350 ms, an inter-stimulus interval (ISI) of 150 ms, followed by a visual presentation of the target word. Thus, the SOA for Experiment 1 was 500 ms. The target was presented until a keyboard press advanced to the next trial or for a maximum of 1500 ms. The task took participants less than 10 min to complete.

Next, participants completed a familiarity-rating task on the idiomatic pairs they had seen in the experiment. Although the idioms included were generally highly familiar, the task provided further confirmation that the idioms were indeed familiar to the participants in this study. Participants were presented with each of the 20 idioms seen in the idiomatic condition individually and instructed to rate their familiarity with the idiom. Familiarity was defined as the frequency with which a person has heard, seen, or used an idiom, and we instructed participants that knowledge of the meaning of the idiom is not the same as familiarity. Participants responded by moving the bar on a sliding scale from never (1) to very frequently (7). The task was short and took about 2–3 min to complete.

Following the familiarity task, some general demographic and language background information, such as age, gender, occupation, and questions confirming nativeness, was collected before participants were redirected to Prolific for payment.

2.2. Results

We collected data from a total of 90 participants. All analyses were performed in R (version 4.2.3, 15 March 2023). The data of seven participants were excluded from further analyses, because they answered correctly in less than 70% of all trials in one or more conditions. This was equivalent to 8% of all data collected. For the remaining participants, we found high percentages of correct responses: In the filler condition (80 nonwords and 20 words), participants responded correctly on average in 94% of all trials. In the experimental conditions (semantic, idiomatic, and unrelated), 97% of all responses were correct (on average, both overall and separately per condition). All erroneous trials were removed from the analysis. The average familiarity score of the idioms was high (M = 5.36, SD = 1.71, range = 1–7). A separate analysis of the response times for the idioms (see online Supplementary Materials) did not show a reliable effect of familiarity.
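The exclusion rule described above (accuracy below 70% in one or more conditions) can be sketched as follows. This is an illustrative Python version, not the authors' R code; the function names and the two example participants are invented.

```python
def accuracy_by_condition(trials):
    """trials: iterable of (condition, correct) pairs for one participant."""
    totals, hits = {}, {}
    for condition, correct in trials:
        totals[condition] = totals.get(condition, 0) + 1
        hits[condition] = hits.get(condition, 0) + int(correct)
    return {c: hits[c] / totals[c] for c in totals}


def should_exclude(trials, threshold=0.70):
    """Exclude when accuracy falls below the threshold in any condition."""
    accuracies = accuracy_by_condition(trials)
    return any(acc < threshold for acc in accuracies.values())


# Made-up participants for illustration only.
careful = [("idiomatic", True)] * 19 + [("idiomatic", False)] + [("semantic", True)] * 20
guessing = [("idiomatic", True)] * 13 + [("idiomatic", False)] * 7 + [("semantic", True)] * 20

print(should_exclude(careful))   # False: 95% and 100% correct
print(should_exclude(guessing))  # True: 65% correct in the idiomatic condition
```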

We analyzed the response times by means of a series of linear mixed-effect models using the lme4 package (lme4 1.1.32, Bates et al. 2015). Starting with a random-effects model (with Participant and Target word as random factors), we added the factor Condition (unrelated, semantic, idiomatic) to the model. All factors made significant contributions to the model. We also tested random slopes for Condition and a fixed effect for target word Frequency, neither of which improved the model. The final model included all theoretically motivated factors:

M1: RT ~ Condition + (1 | Participant) + (1 | Target)
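Written out, M1 corresponds to a model of roughly the following form. The notation is introduced here for illustration and assumes treatment (dummy) coding with the unrelated condition as the reference level, which matches the later description of the intercept as the unrelated-condition estimate; p indexes participants and t indexes target words.

```latex
\mathrm{RT}_{pt} = \beta_0
  + \beta_{\mathrm{sem}}\, S_{pt}
  + \beta_{\mathrm{idi}}\, I_{pt}
  + u_p + w_t + \varepsilon_{pt},
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2),\;
w_t \sim \mathcal{N}(0, \sigma_w^2),\;
\varepsilon_{pt} \sim \mathcal{N}(0, \sigma^2)
```

Here S and I are indicator variables equal to 1 for semantic and idiomatic primes, respectively, and 0 otherwise, so that the two slope coefficients are the priming effects relative to the unrelated condition, while u and w are the random intercepts for Participant and Target.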

After checking the residuals, we decided to trim the model at 2.5 standard deviations (removing 2.9% of the data, following Baayen 2008). While this improved the tails of the distribution, we still saw evidence of some heteroscedasticity in the data. We therefore analyzed the trimmed dataset with rlmer from the robustlmm package (version 3.2.0, Koller 2016), as this method is supposed to be more robust in this case. The pattern of results confirms that of the lmer model. The parameters of this final model are listed in Table 2.
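The trimming step can be illustrated with a minimal Python sketch. It assumes that trimming means dropping observations whose standardized residual exceeds 2.5 in absolute value; the response times are invented, and a constant stands in for the fitted values of the mixed model.

```python
from statistics import mean, stdev


def trim_by_residual(rts, fitted, cutoff=2.5):
    """Keep observations whose standardized residual is within +/- cutoff."""
    residuals = [rt - fit for rt, fit in zip(rts, fitted)]
    centre, spread = mean(residuals), stdev(residuals)
    keep = [abs((r - centre) / spread) <= cutoff for r in residuals]
    kept_rts = [rt for rt, k in zip(rts, keep) if k]
    removed_share = 1 - len(kept_rts) / len(rts)
    return kept_rts, removed_share


# Invented response times (ms) with one slow outlier; the constant fitted
# value is a placeholder for real model predictions.
rts = [520, 531, 540, 528, 535, 529, 533, 538, 526, 530, 534, 1400]
fitted = [535] * len(rts)

kept, share = trim_by_residual(rts, fitted)
print(len(kept), round(share, 3))
```

In practice the residuals would come from the fitted lmer model, and the model would be refitted on the trimmed data.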

The results show an average estimated response time in Experiment 1 of 535 ms in the unrelated condition (model intercept, rounded). In both the idiomatic (6 ms) and the semantic (11 ms) conditions, response times are significantly faster than in the unrelated condition, and the effect is numerically stronger for the semantic condition. Paired comparisons of the effects of the three levels of Condition across experiments, however, are somewhat contradictory about the effect of idiom priming: while a comparison based on the lmer model confirms the reliability of the idiom priming effect, a comparison based on the rlmer model does not. As we chose the rlmer model due to its robustness to heteroscedasticity, we report the corresponding averages and test results here as well (Table 3 and Table 4). Please refer to the online Supplementary Materials for the corresponding values for the lmer model.

Taken together, we find a clear effect of semantic priming. In comparison, the effect of idiomatic priming is smaller and seemingly less reliable: while the effect is significant in both the lmer and rlmer models, it is only marginally significant when the pairwise comparisons are based on the rlmer model. This may be related to the fact that the residuals are still somewhat non-normally distributed. Changing the dependent variable to log-RT did not change the pattern of effects. We therefore cautiously interpret the pattern of results in Experiment 1 as evidence of a robust semantic priming effect, next to a numerically smaller and less robust idiom priming effect that requires further scrutiny.

3. Experiment 2

As Experiment 1 showed evidence of both idiomatic and semantic priming, but with different effect sizes and reliabilities, Experiment 2 was designed to test the hypothesis that the two priming effects arise at different levels of processing: whereas semantic priming should result from activation in semantic long-term memory, idiomatic priming should arise at the level of lexical or phrasal processing. In other words, semantic and idiomatic priming may differ from each other with respect to the time window in which they can reliably be observed. We therefore designed our second primed lexical decision experiment to be identical to Experiment 1, except that the SOA was reduced to 350 ms by removing the ISI. In this way, we intended to capture fast and early priming processes.

3.1. Method

3.1.1. Participants

Ninety-one native speakers of English (age 18–35, mean 28.8, SD = 5.46, 50 female) were recruited via Prolific (https://www.prolific.co/, accessed on 12 April 2021). As in Experiment 1, participation was restricted to residents of the United States and those who grew up in a monolingual English-speaking household. The recruitment, payment, and procedure followed the standard practices of ethical consent according to the LingTüLab’s approval from the DFG (German Research Foundation).

3.1.2. Materials

The materials were identical to those in Experiment 1.

3.1.3. Procedure and Design

The procedure and design were nearly identical to Experiment 1, with the exception of two details: SOA and familiarity. As in Experiment 1, each trial began with a fixation cross, presented for 1500 ms, followed by the prime, presented for 350 ms, and then immediately followed by the target (ISI = 0 ms, SOA = 350 ms). The target was again presented until a keyboard press advanced to the next trial or for a maximum of 1500 ms.
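The timing difference between the two experiments comes down to simple arithmetic: the SOA is the prime duration plus the ISI. The following Python sketch makes this explicit using the durations reported in the two procedure sections; the class and field names are illustrative and not part of the Gorilla setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTiming:
    fixation_ms: int
    prime_ms: int
    isi_ms: int
    target_timeout_ms: int

    @property
    def soa_ms(self) -> int:
        # SOA runs from prime onset to target onset.
        return self.prime_ms + self.isi_ms

    @property
    def max_trial_ms(self) -> int:
        # Upper bound on trial length if no response is given.
        return self.fixation_ms + self.soa_ms + self.target_timeout_ms


exp1 = TrialTiming(fixation_ms=1500, prime_ms=350, isi_ms=150, target_timeout_ms=1500)
exp2 = TrialTiming(fixation_ms=1500, prime_ms=350, isi_ms=0, target_timeout_ms=1500)

print(exp1.soa_ms, exp2.soa_ms)              # 500 350
print(exp1.max_trial_ms, exp2.max_trial_ms)  # 3500 3350
```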

In contrast to Experiment 1, familiarity was not collected for Experiment 2. Note that all idioms had been chosen based on their familiarity from the start. While Experiment 1 confirmed that the idioms were highly familiar to participants, subjective familiarity did not improve the fit of the model.

3.2. Results

The results of Experiment 2 have been analyzed both separately and in combination with those of Experiment 1. In the interest of brevity, however, the separate analysis—which is confirmed in all important aspects by the common analysis—can be found in the online Supplementary Materials. Analyzing Experiments 1 and 2 in one model allows us to test for the presence of idiomatic and semantic priming in interaction with Experiment (and therefore SOA) and generally increases the statistical power.

We collected data from a total of 181 participants in Experiments 1 and 2 (90 and 91, respectively). All analyses were performed in R (version 4.2.3, 15 March 2023). The data of a total of fifteen participants were excluded from further analyses, because they answered correctly in less than 70% of all trials in one or more conditions. This was equivalent to 7.8% of all data collected. For the remaining participants, we found high percentages of correct responses: In the filler condition (80 nonwords and 20 words), participants responded correctly on average in 94% (Exp1) and 93% (Exp2) of all trials. In the experimental conditions (semantic, idiomatic, and unrelated), between 97% and 98% of all responses per condition were correct. All erroneous trials were removed before the response time analysis.

We analyzed the response times by means of a series of linear mixed-effect models using the lme4 package (lme4 1.1.32, Bates et al. 2015). Starting with a random-effects model (with Participant and Target word as random factors), we incrementally added the factor Condition (unrelated, semantic, idiomatic) and its interaction with Experiment (exp1 vs. exp2) to the model. We also tested random slopes for Condition, as well as a fixed effect for Frequency, but they did not improve the model. The final model, M2, included all theoretically motivated factors and their interaction:

M2: RT ~ Condition*Experiment + (1 | Participant) + (1 | Target)

After checking the residuals, we decided to trim the model (i.e., its residuals) at 2.5 standard deviations (removing 3.1% of the data, following Baayen 2008). While this improved the tails of the distribution, we still saw evidence of some heteroscedasticity in the data. We therefore analyzed the trimmed dataset with rlmer from the robustlmm package (version 3.2.0, Koller 2016). The pattern of results confirms that of the lmer model. The parameters of this final model are listed in Table 5. In addition, the response time pattern is illustrated in Figure 1.

The results show an average estimated response time in Experiment 1 of 535 ms in the unrelated condition (model intercept, rounded). In both the idiomatic (−6 ms) and the semantic (−12 ms) conditions, response times are significantly faster than in the unrelated condition, but the effect is again numerically stronger for the semantic condition. In Experiment 2, we see that, in addition to an overall slowdown (between participants) of 26 ms, the semantic effect is further modulated (adding a significant −9.6 ms to the facilitatory effect observed in Experiment 1), but the idiomatic effect is not (a non-significant effect of −1.27 ms). Paired comparisons of the effect of the three levels of Condition across experiments confirm the reliability of the observation that—overall—the semantic priming effect is larger than the idiomatic priming effect, but both effects are significant. Averages and test results are shown in Table 6 and Table 7.
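To see how the coefficients quoted above combine, the following Python sketch adds them up per cell. It assumes treatment coding with the unrelated condition in Experiment 1 as the reference cell and uses the rounded values from the text, so the outputs are approximations rather than the exact estimates in Table 5; the dictionary keys and function names are illustrative.

```python
# Rounded fixed-effect estimates (ms) as quoted in the text. Treatment coding
# with Experiment 1 / unrelated as the reference cell is an assumption.
COEF = {
    "intercept": 535.0,
    "idiomatic": -6.0,
    "semantic": -12.0,
    "exp2": 26.0,
    "idiomatic:exp2": -1.27,
    "semantic:exp2": -9.6,
}


def predicted_rt(condition, experiment):
    """Sum the coefficients that apply to one Condition x Experiment cell."""
    rt = COEF["intercept"]
    if condition != "unrelated":
        rt += COEF[condition]
    if experiment == "exp2":
        rt += COEF["exp2"]
        if condition != "unrelated":
            rt += COEF[f"{condition}:exp2"]
    return rt


def priming_effect(condition, experiment):
    """Difference from the unrelated condition within the same experiment."""
    return predicted_rt(condition, experiment) - predicted_rt("unrelated", experiment)


for exp in ("exp1", "exp2"):
    for cond in ("idiomatic", "semantic"):
        print(exp, cond, round(priming_effect(cond, exp), 2))
```

With these rounded inputs, the semantic effect grows from about −12 ms to about −21.6 ms at the shorter SOA, whereas the idiomatic effect moves only from about −6 ms to about −7.3 ms.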

Taken together, we see a small but reliable facilitatory effect of idiom word priming, as well as a clear effect of semantic priming. The two effects differ not only in size (with the semantic effect on average being about twice as large) but also in the way in which they depend on our Stimulus Onset Asynchrony (SOA) manipulation: in Experiment 1 (ISI = 150 ms, SOA = 500 ms), the semantic priming effect is smaller than in Experiment 2 (ISI = 0 ms, SOA = 350 ms), indicating that the effect is fast and early. In contrast, the idiom priming effect does not vary with SOA.

4. Discussion

In the current study, we tested the prediction that prime words that are related to a target word via an idiomatic representation show facilitatory priming, even if the prime and target are presented without any phrasal context and the words themselves are not semantically related. That is, we expected to see, for example, the word “bury” prime the word “hatchet”, as they are both part of the familiar idiom “to bury the hatchet”. We derived this prediction from psycholinguistic models of idiom representation and processing that assume connections between idiom words at the lexical level of processing (e.g., the superlemma model, Sprenger et al. 2006). We tested our prediction, with native speakers of English, in two visually primed lexical decision experiments that differed with respect to the timing of the prime and target words: in Experiment 1, the prime preceded the target by 500 ms (with an ISI of 150), and in Experiment 2, by 350 ms (with an ISI of 0). In both experiments, we included a semantic priming condition of the type DOCTOR–NURSE for comparison. A common analysis of the results of both experiments showed reliable facilitatory priming effects for both idiomatically and semantically related prime–target pairs. In addition, we observed two effects that point to possible differences in the underlying mechanisms responsible for the priming effects: first, the idiomatic priming effect that we found is, on average, only half as strong as the semantic priming effect. Second, the semantic priming effect, but not the idiomatic priming effect, was modulated by our SOA manipulation, with stronger effects in Experiment 2 (shorter SOA). We will discuss these findings and their implications in turn.

First, the finding that idiom word pairs that are not semantically related show facilitatory priming confirms our prediction and supports hybrid models of idiom representation and processing (e.g., Sprenger et al. 2006). According to this lexical view, native speakers of American English have linked the words bury and hatchet by means of a phrasal representation in the mental lexicon, because the two words appear together in the same idiom. In addition, the idiom word priming effect is consistent with the literature on processing advantages for fixed phrases, such as faster reading times (e.g., Siyanova-Chanturia et al. 2011) and more fluent production (e.g., Pawley and Syder 1983; Kuiper 1995).

Second, the observed difference in effect size between semantic and idiomatic priming is in line with the idea of a possible difference in the underlying priming mechanisms: direct connections between conceptual representations in the case of semantic priming, and indirect connections between lexical representations that are mediated by a phrasal representation in the case of idiomatic priming. At the same time, however, other factors may contribute to the difference as well. For example, semantic associations of the type DOCTOR–NURSE are presumably more frequent than the idioms of our idiomatic word pairs, and they most probably have an earlier age of acquisition than most idioms (Sprenger et al. 2019; Carrol 2023). We did not control for frequency, and—due to the differences in the nature of the linguistic contexts in which these items can be expected to appear and the need to control for idiomaticity—doing so would not be straightforward, but we agree with an anonymous reviewer that this could be a valuable addition to future research. In contrast, controlling for age-of-acquisition effects is unfortunately impossible, as early idiom acquisition data are virtually nonexistent.

Third, similar to the difference in effect size, the effect of our SOA manipulation suggests a possible difference in the time course of semantic and idiomatic priming and, therefore, a difference in the underlying mechanisms as well. While we find that the semantic priming effect is stronger at the shorter SOA (and has been shown to be even faster in lab-based studies, e.g., Neely 1991), the idiomatic priming effect overall is still comparatively small at the SOAs that we tested. When analyzing the results of the experiments separately, we found the effect to be more reliable at the shorter SOA, but this was not reflected in a significant interaction with Experiment in the common model. Thus, while the effects of the two types of priming are not the same, they are also not clearly dissimilar in terms of their temporal distribution. We therefore interpret these differences with the necessary caution, leaving the question about a time course difference between the two types of priming open. Future research will have to show whether the idiom priming effect can be further optimized at longer SOAs. This will be informative for models of idiom representation and processing, as such studies on idiom words without context can provide us with an upper boundary for the speed with which one word of a phrase can activate its remaining elements. More importantly, such data could help us to understand the nature of the priming processes involved. As we mentioned in the introduction, the assumption of an additional phrasal representation that mediates idiomatic priming effects may lead to the prediction that idiomatic priming is inherently slower than associative semantic priming due to the additional computational cost that is related to the idiom node’s activation. 
In contrast, the opposite effect may be predicted as well: as the level of lexical processing precedes that of semantic analysis during comprehension, any effect of a lexical association within the lexicon itself (i.e., non-conceptual) may be faster than the effects of semantic relatedness. As a third option, both processes may act in concert, with idioms losing some speed to the extra node on the one side while gaining some speed, due to the “early” nature of the processes, on the other. Follow-up studies could help us to reduce the number of options, as could computational modeling.
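The first of these predictions can be made concrete with a toy spreading-activation network. The sketch below is only an illustration under strong assumptions: the update rule, the link weight of 0.5, the decay of 0.5, and the node name BURY_THE_HATCHET are hypothetical choices made for this example, not parameters of the superlemma model or of any model tested in this study. It compares a direct prime–target link with a link that is mediated by a phrasal node.

```python
def spread(edges, source, steps, decay=0.5):
    """Return the activation of every node after each update step.

    edges maps (from_node, to_node) -> link weight. At each step a node
    keeps `decay` of its own activation and receives weight * activation
    from every node that links into it. The source stays clamped at 1.0.
    All values are arbitrary illustration parameters.
    """
    nodes = {node for edge in edges for node in edge}
    activation = {node: 0.0 for node in nodes}
    activation[source] = 1.0
    history = [dict(activation)]
    for _ in range(steps):
        updated = {}
        for node in nodes:
            incoming = sum(
                weight * activation[src]
                for (src, dst), weight in edges.items()
                if dst == node
            )
            updated[node] = decay * activation[node] + incoming
        updated[source] = 1.0  # the prime remains fully active
        activation = updated
        history.append(dict(activation))
    return history


def first_step_at_or_above(history, node, threshold):
    """Index of the first update at which `node` reaches `threshold`."""
    for step, activation in enumerate(history):
        if activation[node] >= threshold:
            return step
    return None


# Direct link between two related words (hypothetical weights).
direct = {("doctor", "nurse"): 0.5}
# Link mediated by a phrasal node (hypothetical node name and weights).
mediated = {
    ("bury", "BURY_THE_HATCHET"): 0.5,
    ("BURY_THE_HATCHET", "hatchet"): 0.5,
}

direct_history = spread(direct, "doctor", steps=4)
mediated_history = spread(mediated, "bury", steps=4)

print(first_step_at_or_above(direct_history, "nurse", 0.4))
print(first_step_at_or_above(mediated_history, "hatchet", 0.4))
```

With these arbitrary settings, the directly linked target crosses the threshold after one update and the mediated target after three. The toy network does not capture the second and third options, in which lexical-level links act earlier than conceptual ones.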

While our study was motivated by theoretical models of idiom representation and processing, our findings are also consistent with the compound-cue theory of semantic priming: we cannot exclude the possibility that the frequent co-occurrence of bury and hatchet is by itself enough reason for their representations to become linked together in memory. Whether such associations are conceptual or lexical in nature does not follow clearly from the theory. In the words of Ratcliff and McKoon (1988), “the automatic component of facilitation would be neither pre- nor postlexical [i.e., before or after lexical access], as those terms are usually used, but a product of the joint association of prime and target”. Most importantly, however, no additional phrasal storage seems required within their framework. If we commit to the principle that simpler explanations are to be preferred over more complex ones (“Occam’s razor”), one might therefore conclude that the odds seem to favor the statistical explanation over a phrasal memory explanation at this point.

However, before we throw in the towel, it is worthwhile to take a closer look at those statistics. More recent studies on semantic memory (Roelke et al. 2018; Hofmann et al. 2022) have compared the effects of “direct association” (i.e., association without semantic overlap) to semantic only or associative and semantic relations between prime and target words. In their approach, Roelke et al. follow the strategy employed by McKoon and Ratcliff (1992) to base the association on statistical measures of co-occurrence, but based on a larger database. That is, they extracted “directly associated” word pairs from a German 43-million-sentence corpus that consists of more than 7.5 million word types (Quasthoff et al. 2006) by calculating the likelihood of all possible word pairs and subsequently selecting the top associates for their “pure associative” condition (i.e., word pairs that do not show semantic feature overlap but nevertheless are associates). Lexical decision data with SOA = 200 ms and SOA = 1000 ms show that “pure associative” (high co-occurrence, but no feature overlap) and “semantic” (high semantic feature overlap) priming were equally effective at the short SOA. However, the effect changed with the SOA, with associative priming being significantly stronger than semantic priming at the 1000 ms SOA. The authors conclude that associative and semantic priming can be dissociated from each other. In other words, they seem to demonstrate that statistical co-occurrence is the driving force behind facilitatory priming between words that do not have any common associates (and thus appear to be unrelated), and that this effect follows a different time course than actual semantic priming does.
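To illustrate what a co-occurrence-based selection of associates can look like, the following sketch scores word pairs by how much more often they share a sentence than chance would predict. It is a minimal example under stated assumptions: the association measure used here is Dunning's log-likelihood ratio (G²), chosen as a common corpus statistic rather than as a reproduction of the measure used by Roelke et al., and the six-sentence corpus and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; a real analysis would use millions of sentences.
TOY_CORPUS = [
    "they bury the hatchet",
    "we bury the hatchet",
    "you bury the hatchet",
    "they read the book",
    "we read the letter",
    "you see the book",
]


def cooccurrence_scores(sentences):
    """Score word pairs by the log-likelihood ratio (G^2) of their
    sentence-level co-occurrence. Only pairs that co-occur more often
    than expected under independence are returned."""
    n = len(sentences)
    word_count = Counter()
    pair_count = Counter()
    for sentence in sentences:
        types = sorted(set(sentence.lower().split()))
        word_count.update(types)
        pair_count.update(combinations(types, 2))

    scores = {}
    for (a, b), k11 in pair_count.items():
        k12 = word_count[a] - k11      # sentences with a but not b
        k21 = word_count[b] - k11      # sentences with b but not a
        k22 = n - k11 - k12 - k21      # sentences with neither word
        rows = [k11 + k12, k21 + k22]
        cols = [k11 + k21, k12 + k22]
        if k11 * n <= rows[0] * cols[0]:
            continue                   # not over-represented: skip
        observed = [[k11, k12], [k21, k22]]
        g2 = 0.0
        for i in range(2):
            for j in range(2):
                o = observed[i][j]
                if o > 0:
                    # observed count times log(observed / expected)
                    g2 += o * math.log(o * n / (rows[i] * cols[j]))
        scores[(a, b)] = 2 * g2
    return scores


scores = cooccurrence_scores(TOY_CORPUS)
top_pair = max(scores, key=scores.get)
print(top_pair, round(scores[top_pair], 2))
```

In this toy corpus, the highest-scoring pair is bury and hatchet, which always appear together, whereas a word such as the, which occurs in every sentence, carries no association with any other word. Pairs selected in this way are "associated" purely by distribution, without any check on semantic feature overlap.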

Yet, an inspection of their stimulus set shows that for the large majority of their associative word pairs (at least 36 out of 50), the words form constituents of well-known German multi-word expressions. For example, Kuh and Eis (cow and ice) are part of the idiom die Kuh vom Eis holen (to get the cow off the ice, to save the situation), Flut and Ebbe (flow and ebb) are part of the binomial Ebbe und Flut (ebb and flow), and Tafel and Kreide (chalkboard and chalk) form the compound Tafelkreide (chalkboard chalk). In other words, the statistical approach to word association reveals that the most common “pure” associates are predicted by the language’s phrasal vocabulary. This includes not only figurative language, such as idioms, but also, for example, common literal expressions (allen Bedenken zum Trotz, in spite of all considerations), literary movements (Sturm und Drang, storm and stress), and movie titles (Der Schuh des Manitu, The Shoe of Manitou).² Put differently, idiomatic, or rather phrasal, associations are the best explanation for the way in which strong associations between words that are not semantically related come about. In addition, they are also a fairly good explanation for the association strength between words that are both associatively and semantically related. In Roelke et al.’s list of stimuli, many pairs in the Associative+Semantic category (at least 12 out of 50) seem to have phrasal origins as well. For example, Kaffee and Tasse (coffee and cup) form the compound Kaffeetasse (coffee cup), Banane and Schale (banana and peel) form the compound Bananenschale (banana peel), and Hopfen and Malz (hop and malt) are part of the idiom da ist Hopfen und Malz verloren (hop and malt are lost there, that situation cannot be saved).

With respect to the effect that these associations have in a primed lexical decision task, our findings converge with those of Roelke et al. (2018): both semantic and associative pairs show facilitation, with semantic effects being strongest at a short SOA and associative effects profiting from a longer SOA. More importantly, however, our approaches diverge with respect to the processing level at which we locate the observed effects. While Roelke et al. aimed at investigating direct associations in semantic memory (as questioned by Lucas 2000; Hutchison 2003; McNamara 2005), we locate the source of our priming effects in the mental lexicon. In (psycho)linguistics, it has long been acknowledged that our linguistic long-term memory comprises not simply words and rules but also vast collections of fixed phrases (e.g., Pawley and Syder 1983; Jackendoff 1995; Wray 2003). Theories of idiom processing focus on figurative expressions, but the effects of phrasal storage can be observed for all kinds of chunks and at all ages. For example, Bannard and Matthews (2008) found frequency effects on the repetition of four-word chunks (a drink of tea) already in 2-year-olds. Idioms take much longer to learn (e.g., Sprenger et al. 2019; Carrol 2023), yet it is the ambiguity between their literal and figurative meanings that has so far attracted most of the research. As idioms cannot be taken literally and, at the same time, depend on a specific configuration of words and grammar, language users must be able to access their phrasal representations fast and effortlessly during both comprehension and production. In comprehension, that includes quickly discarding the literal word meanings in favor of the idiomatic interpretation, once the idiom has been recognized (Rommers et al. 2013). In other words, idiom word processing within an idiomatic context is not driven by lexical semantics.
Accordingly, the relationships between an idiom’s constituent words in semantic memory alone cannot explain these processes. Instead, we need to acknowledge the linguistic nature of these relationships. More generally, our findings therefore contribute to current discussions about the way in which lexical and conceptual representations are connected in the human mind (e.g., Eviatar et al. 2023). In future work, it would be interesting to combine the approach by Roelke et al. (2018), who sought to separate the effects of co-occurrence and semantic feature overlap, with an approach that explicitly takes the phrasal vocabulary into account.

Here, we have shown that—in line with current theories of idiom processing—one idiom word facilitates the processing of another, even if they are not presented within a phrasal context, and without the words being semantically related. While statistical co-occurrence is probably an important factor for the acquisition of such sequences, we argue that the mechanism that is responsible for the observed effects can be found in the fact that these words are bound together by a common phrasal representation in the mental lexicon.

Supplementary Materials

The following supporting information can be downloaded at https://osf.io/qd3y8/?view_only=de1ecfb300184879a6ce610f3e97f36d (accessed on 7 March 2024): analysis scripts and output.

Author Contributions

S.A.S. and A.W. were involved in the conceptualization of the experiment, and all authors participated in the methodology and draft preparation. S.D.B. was responsible for the data curation, supervised by A.W., and S.A.S. was responsible for the data analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This study received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the German Research Foundation (DFG) (#2022-08-220617, issued on 17 June 2022) for studies involving humans.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study, and the procedure followed the standard practices of ethical consent according to the LingTüLab’s approval from the DFG (German Research Foundation).

Data Availability Statement

Our data are available via the following link: https://osf.io/qd3y8/?view_only=de1ecfb300184879a6ce610f3e97f36d (accessed on 7 March 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Notes

1	Frequency made a significant contribution to the separate model for Experiment 2 but contributed neither to that for Experiment 1 nor to the common model.
2	In its diversity, the collection of items is in fact highly reminiscent of Jackendoff’s Wheel of Fortune corpus of fixed phrases (Jackendoff 1995).

References

Anwyl-Irvine, Alexander, Edwin S. Dalmaijer, Nick Hodges, and Jo K. Evershed. 2021. Realistic precision and accuracy of online experiment platforms, web browsers, and devices. Behavior Research Methods 53: 1407–25.
Baayen, R. Harald. 2008. Analyzing Linguistic Data: A Practical Introduction to Statistics Using R. Cambridge: Cambridge University Press.
Bannard, Colin, and Danielle Matthews. 2008. Stored word sequences in language learning. Psychological Science 19: 241–48.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software 67: 1–48.
Beck, Sara D., and Andrea Weber. 2016. Bilingual and Monolingual Idiom Processing Is Cut from the Same Cloth: The Role of the L1 in Literal and Figurative Meaning Activation. Frontiers in Psychology 7: 1350.
Cacciari, Cristina, and Patrizia Tabossi. 1988. The comprehension of idioms. Journal of Memory and Language 27: 668–83.
Carrol, Gareth. 2023. Old Dogs and New Tricks: Assessing Idiom Knowledge Amongst Native Speakers of Different Ages. Journal of Psycholinguistic Research 52: 2287–302.
Carrol, Gareth, and Kathy Conklin. 2020. Is all formulaic language created equal? Unpacking the processing advantage for different types of formulaic sequences. Language and Speech 63: 95–122.
Collins, Allan M., and Elizabeth F. Loftus. 1975. A spreading-activation theory of semantic processing. Psychological Review 82: 407.
Conklin, Kathy, and Norbert Schmitt. 2012. The processing of formulaic language. Annual Review of Applied Linguistics 32: 45–61.
Dechert, Hans W. 1983. How a story is done in a second language. In Strategies in Interlanguage Communication. Edited by Claus Faerch and Gabriele Kasper. London: Longman, pp. 175–95.
Eviatar, Zohar, Nahal Binur, and Orna Peleg. 2023. Interactions of lexical and conceptual representations: Evidence from EEG. Brain and Language 243: 10530.
Hofmann, Markus J., Mareike A. Kleemann, André Roelke-Wellmann, Christian Vorstius, and Ralph Radach. 2022. Semantic feature activation takes time: Longer SOA elicits earlier priming effects during reading. Cognitive Processing 23: 309–18.
Hutchison, Keith A. 2003. Is semantic priming due to association strength or feature overlap? A microanalytic review. Psychonomic Bulletin & Review 10: 785–813.
Jackendoff, Ray. 1995. The boundaries of the lexicon. In Idioms: Structural and Psychological Perspectives, pp. 133–65.
Koller, Manuel. 2016. robustlmm: An R Package for Robust Estimation of Linear Mixed-Effects Models. Journal of Statistical Software 75: 1–24.
Kuiper, Koenraad. 1995. Smooth Talkers: The Linguistic Performance of Auctioneers and Sportscasters. Hillsdale: Erlbaum.
Levelt, Willem J. M., Ardi Roelofs, and Antje S. Meyer. 1999. A theory of lexical access in speech production. Behavioral and Brain Sciences 22: 1–38.
Libben, Maya R., and Debra A. Titone. 2008. The multidetermined nature of idiom processing. Memory & Cognition 36: 1103–21.
Lucas, Margery. 2000. Semantic priming without association: A meta-analytic review. Psychonomic Bulletin & Review 7: 618–30.
Lüdecke, Daniel. 2023. sjPlot: Data Visualization for Statistics in Social Science. R Package Version 2.8.14. Available online: https://CRAN.R-project.org/package=sjPlot (accessed on 7 March 2024).
McKoon, Gail, and Roger Ratcliff. 1989. Semantic associations and elaborative inference. Journal of Experimental Psychology: Learning, Memory and Cognition 15: 326–38.
McKoon, Gail, and Roger Ratcliff. 1992. Inference during reading. Psychological Review 99: 440–66.
McNamara, Timothy P. 2005. Semantic Priming. Psychology Press eBooks. London: Psychology Press.
Neely, James H. 1991. Semantic priming effects in visual word recognition: A selective review of current findings and theories. In Basic Processes in Reading: Visual Word Recognition. Edited by Derek Besner and Glyn W. Humphreys. Hillsdale: Erlbaum, pp. 264–336.
Nordmann, Emily, and Antonia A. Jambazova. 2017. Normative data for idiomatic expressions. Behavior Research Methods 49: 198–215.
Pawley, Andrew, and Frances Hodgetts Syder. 1983. Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In Language and Communication. Edited by Jack C. Richards and Richard W. Schmidt. New York: Longman, pp. 191–226.
Quasthoff, Uwe, Matthias Richter, and Christian Biemann. 2006. Corpus portal for search in monolingual corpora. Paper presented at LREC-06, Genoa, Italy, May 24–25; pp. 1799–802.
Ratcliff, Roger, and Gail McKoon. 1988. A retrieval theory of priming in memory. Psychological Review 95: 385.
Ratcliff, Roger, and Gail McKoon. 1995. Sequential effects in lexical decision: Tests of compound-cue retrieval theory. Journal of Experimental Psychology: Learning, Memory and Cognition 21: 1380–88.
Roelke, Andre, Nicole Franke, Chris Biemann, Ralph Radach, Arthur M. Jacobs, and Markus J. Hofmann. 2018. A novel co-occurrence-based approach to predict pure associative and semantic priming. Psychonomic Bulletin & Review 25: 1488–93.
Rommers, Joost, Ton Dijkstra, and Marcel Bastiaansen. 2013. Context-dependent semantic processing in the human brain: Evidence from idiom comprehension. Journal of Cognitive Neuroscience 25: 762–76.
Siyanova-Chanturia, Anna, Kathy Conklin, and Norbert Schmitt. 2011. Adding more fuel to the fire: An eye-tracking study of idiom processing by native and non-native speakers. Second Language Research 27: 251–72.
Sprenger, Simone A., Amélie la Roi, and Jacolien Van Rij. 2019. The development of idiom knowledge across the lifespan. Frontiers in Communication 4: 29.
Sprenger, Simone A., Willem J. M. Levelt, and Gerard Kempen. 2006. Lexical access during the production of idiomatic phrases. Journal of Memory and Language 54: 161–84.
Titone, Debra A., and Cynthia M. Connine. 1994. Descriptive norms for 171 idiomatic expressions: Familiarity, compositionality, predictability, and literality. Metaphor and Symbolic Activity 9: 247–70.
Titone, Debra A., and Cynthia M. Connine. 1999. On the compositional and noncompositional nature of idiomatic expressions. Journal of Pragmatics 31: 1655–74.
Wray, Alison. 2003. Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.

Figure 1. Predicted values of the response times (in ms) per Condition and Experiment based on the final model (M2) for Experiments 1 and 2, as shown in Table 5. This figure was generated with the sjPlot package (version 2.8.14, Lüdecke 2023).

Table 1. Example set of prime–target pairs for “pop the question”.

Word Pair Type	Prime	Target
Idiomatic	POP	QUESTION
Semantic	ANSWER	QUESTION
Unrelated	TRIM	QUESTION

Table 2. Model parameters for the final linear mixed-effect model (M1, based on rlmer). This table was generated using the sjPlot package (version 2.8.14, Lüdecke 2023).

	RT
Predictors	Estimates	CI	p
(Intercept)	535.04	518.13–551.96	<0.001
Condition [idiom]	−6.44	−12.26–−0.63	0.030
Condition [sem]	−11.42	−17.23–−5.61	<0.001
Random Effects
σ²	6521.56
τ_{00 Subject}	5228.62
τ_{00 Target}	217.76
ICC	0.46
N_Subject	83
N_Target	60
Observations	4689
Marginal R²/Conditional R²	0.002/0.456

Note. The reference level for the factor Condition is [unrelated].

Table 3. Estimated marginal means per Condition in Experiment 1.

Condition	emmean	SE	df	asymp.LCL	asymp.UCL
unrel	535	8.63	Inf	518	552
idiom	529	8.63	Inf	512	546
sem	524	8.63	Inf	507	541

Note. Table 3 and Table 4 were generated by means of the package emmeans (version 1.8.4.1). emmean = estimated marginal mean. LCL/UCL = lower/upper confidence limits. Degrees-of-freedom method: asymptotic. Confidence level used: 0.95.

Table 4. Pairwise comparisons between the levels of Condition in Experiment 1.

Comparison	Estimate	SE	df	z.Ratio	p Value
unrel–idiom	6.44	2.97	Inf	2.171	0.0762
unrel–sem	11.42	2.97	Inf	3.851	0.0003
idiom–sem	4.98	2.97	Inf	1.679	0.2133

Note. Results are averaged over the levels of Experiment. Degrees-of-freedom method: asymptotic. p Value adjustment: Tukey method for comparing a family of three estimates.

Table 5. Model parameters for the final linear mixed-effect model, M2 (based on rlmer), for Experiments 1 and 2. This table was generated with the sjPlot package (version 2.8.14, Lüdecke 2023).

	RT
Predictors	Estimates	CI	p
(Intercept)	534.87	517.27–552.47	<0.001
Condition [idiom]	−6.43	−12.24–−0.62	0.030
Condition [sem]	−11.53	−17.33–−5.72	<0.001
Exp [Exp2]	26.10	1.89–50.31	0.035
Condition [idiom] × Exp [Exp2]	−1.27	−9.49–6.95	0.762
Condition [sem] × Exp [Exp2]	−9.60	−17.82–−1.38	0.022
Random Effects
σ²	6507.83
τ_{00 Participant}	5670.00
τ_{00 Target}	248.44
ICC	0.48
N_Participant	166
N_Target	60
Observations	9379
Marginal R²/Conditional R²	0.014/0.484

Table 6. Estimated marginal means (emmean) per Condition in Experiments 1 and 2.

Condition	emmean	SE	df	asymp.LCL	asymp.UCL
unrel	548	6.52	Inf	535	561
idiom	541	6.52	Inf	528	554
sem	532	6.52	Inf	519	544

Note. Table 6 and Table 7 were generated by means of the package emmeans (version 1.8.4.1). emmean = estimated marginal mean; LCL/UCL = lower/upper confidence limits. Results are averaged over the levels of Experiment. Degrees-of-freedom method: asymptotic. Confidence level used: 0.95.

Table 7. Pairwise comparisons between the levels of Condition in Experiments 1 and 2.

Comparison	Estimate	SE	df	z.Ratio	p Value
unrel–idiom	7.06	2.10	Inf	3.369	0.0022
unrel–sem	16.33	2.10	Inf	7.785	<0.0001
idiom–sem	9.26	2.09	Inf	4.434	<0.0001

Note. Results are averaged over the levels of Experiment. Degrees-of-freedom method: asymptotic. p Value adjustment: Tukey method for comparing a family of three estimates.
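The relation between the coefficients in Table 5 and the averaged values in Tables 6 and 7 can be traced with a short calculation. The sketch below is a reader's aid, not the authors' analysis script: it assumes treatment (dummy) coding with unrelated and Experiment 1 as reference levels, and equal weighting of the two experiments when averaging. The function names are hypothetical.

```python
# Fixed-effect estimates (in ms) copied from Table 5. Assumed reference
# levels: Condition = unrelated, Experiment = 1.
COEF = {
    "intercept": 534.87,
    "idiom": -6.43,
    "sem": -11.53,
    "exp2": 26.10,
    "idiom:exp2": -1.27,
    "sem:exp2": -9.60,
}


def predicted_rt(condition, experiment):
    """Predicted RT for one design cell under treatment coding."""
    rt = COEF["intercept"]
    if condition != "unrel":
        rt += COEF[condition]
    if experiment == 2:
        rt += COEF["exp2"]
        if condition != "unrel":
            rt += COEF[condition + ":exp2"]
    return rt


def priming_effect(condition, experiment):
    """Facilitation relative to the unrelated baseline (positive = faster)."""
    return predicted_rt("unrel", experiment) - predicted_rt(condition, experiment)


def averaged_effect(condition):
    """Priming effect averaged over both experiments with equal weights."""
    return (priming_effect(condition, 1) + priming_effect(condition, 2)) / 2


for cond in ("idiom", "sem"):
    print(
        cond,
        round(priming_effect(cond, 1), 2),
        round(priming_effect(cond, 2), 2),
        round(averaged_effect(cond), 2),
    )
```

Under these assumptions, the idiomatic effect is about 6.4 ms in Experiment 1 and 7.7 ms in Experiment 2, and the semantic effect is about 11.5 ms and 21.1 ms, respectively. The averages of roughly 7.1 ms and 16.3 ms correspond to the unrel–idiom and unrel–sem estimates in Table 7.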

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
