**1. Introduction**

In sign language emergence, linguistic variation at the lexical level appears to be the default, where synonyms for a word coexist within a population. However, over time, certain pressures seem to push towards lexical uniformity (Meir et al. 2012). We can thus imagine two extreme cases as languages evolve: one in which the variation present in language emergence is fully retained and a second where all the variation is lost in favor of uniformity. What are the pressures that may drive languages away from linguistic variability? It has been proposed that the communicative context in which languages are used shapes the features of a language (Lupyan and Dale 2010; Trudgill 2011; Wray and Grace 2007). Specifically in this paper, we explore how shared social and psychological information makes it possible to use iconic signs and how this may be a driving factor in retaining the lexical variation present in language emergence.

Traditionally, in the study of lexical variation in spoken languages, it has been assumed that true synonyms do not exist (Clark 1987). Rather, it is accepted that synonyms for a concept coexisting in a population would be conditioned by sociolinguistic and pragmatic factors. However, in the first stage of language emergence, where individuals improvise forms to refer to concepts, it appears that synonyms can coexist. It is possible that the iconic affordances of the manual modality facilitate the coexistence of synonyms in a population. Without data on the emergence of spoken languages, it is unclear how iconic affordances play a role in their emergence. For these reasons, in this paper, we focus on the emergence of sign languages and how different factors influence the degree of lexical variability across a population.

**Citation:** Mudd, Katie, Connie de Vos, and Bart de Boer. 2022. Shared Context Facilitates Lexical Variation in Sign Language Emergence. *Languages* 7: 31. https://doi.org/10.3390/languages7010031

Academic Editors: Wendy Sandler, Mark Aronoff, Carol Padden, Juana M. Liceras and Raquel Fernández Fuertes

Received: 4 October 2021 Accepted: 24 January 2022 Published: 10 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

de Vos (2011) suggests that a high degree of variation at the lexical level may be characteristic of sign languages used in communities with a small population size and a high degree of shared context. Here, we refer to sign languages in such communities as *shared sign languages*, following Nyst (2012). For instance, Ergin et al. (2021) report that the shared sign language Central Taurus Sign Language is "remarkable in its mixture of more or less conventionalized<sup>1</sup> signs or sign sequences, improvised sign sequences, and competing lexical variants".

Similarly, in Kata Kolok, a sign language which emerged in a relatively small, insular village community in northern Bali due to a high incidence of hereditary deafness (de Vos 2012; Marsaja 2008), a high degree of lexical variation has been observed (Mudd et al. 2020); in response to a picture description task, up to nine lexical variants for a stimulus were produced, while other stimuli in the task elicited a uniform response (Mudd et al. 2020). This high degree of lexical variation seems typical of shared sign languages and has also been reported in Al-Sayyid Bedouin Sign Language (ABSL) (Meir et al. 2012), San Juan Quiahije Chatino Sign Language (SJQCSL) (Hou 2016) and Providence Island Sign Language (PISL) (Washabaugh 1986), to name a few.

In contrast, sign languages used predominantly by a large and dispersed group of deaf individuals, most of whom are born to hearing parents, or *Deaf community sign languages* (Meir et al. 2010; Mitchell and Karchmer 2004), appear to exhibit lower levels of lexical variation than shared sign languages (Meir et al. 2012). However, it should be noted that this claim is mostly based on anecdotal evidence (for one exception, see Washabaugh 1986). What can be said is that variation in this category of sign languages is typically structured along different sociolinguistic lines than in shared sign languages, as variation is often the result of schooling practices (Meir et al. 2010). For example, gender-based school segregation in Dublin has resulted in a gendered Irish Sign Language lexicon (LeMaster 2006), and different varieties of American Sign Language (ASL) have emerged due to race-based school segregation (McCaskill et al. 2011).

There undoubtedly also exists structured variation in shared sign languages, such as within families (Sandler et al. 2011) and also along sociolinguistic lines (Mudd et al. 2020). Despite evidence of structured variation, it seems like the degree of lexical variation in shared sign languages is higher within a small community across the board, with frequent interlocutors using different forms to refer to a concept (de Vos 2011). Crucially, despite the existence of multiple forms associated with a concept, signers are able to understand each other. Tkachman and Hudson Kam (2020) posit that a decrease in lexical variation may only be necessary in cases where communication fails. This may be less the case in shared sign language, where pressures for convergence seem to be somewhat alleviated. Meanwhile, in Deaf community sign languages, frequent interlocutors seem to have more synchronized lexical preferences, with higher degrees of variation evident when comparing larger, more dispersed subgroups of the community. What aspects of shared signing communities could reduce the pressure for linguistic uniformity?

One possibility that we explore in the present study is that shared context alone, allowing for the use of iconic forms, may be sufficient to maintain high degrees of lexical variation in a community (Sandler et al. 2011; Tkachman and Hudson Kam 2020). In tight-knit communities, individuals can make use of shared social and psychological information, facilitating the use of strategies such as pointing to concepts and using iconic signs (de Vos 2011). Iconic signs, in which aspects of a sign's form resemble aspects of that sign's meaning (Dingemanse et al. 2015), would only be successfully communicated (if not already conventionalized) when individuals share the same salient features (specific to the individual) associated with a concept (the entity or concept in the real world). For instance, in the shared sign language ABSL, the sign for kettle was shown to differ across families, but within families, members were uniform in their productions (Sandler et al. 2011). Regarding this variation, Sandler et al. (2011) state: "It is likely that all the different versions would be intelligible across the community, due to iconicity, context, or the existence of synonymy in the signers' mental lexicons—possibly all the of the above". We refer to these as *productive synonyms*, i.e., variants that may be used interchangeably, in contrast to *perceptual synonyms*, i.e., variants which signers may be aware of in a more abstract sense but not use (Mudd et al. 2020).

Figure 1 shows three signs for *pig* used in the Kata Kolok community. Many villagers in this community make a living as farmers, and this is reflected in the iconic motivations underlying the forms produced for PIG-1 and PIG-2. Given that the members of this community share a high degree of cultural context, it is probable that individuals exploit iconic mappings, understanding each other by retrieving the meaning (comprised of culturally salient features) from the form even if they have not seen or produced the form themselves. On the other hand, when shared context is not available (i.e., individuals are from different backgrounds and have different experiences), there is no advantage to using iconic signs, as the culturally salient features of interlocutors are different. Continuing with this example, imagine someone from a different community who does not have experience with farming. The underlying iconic motivations related to this practice will be meaningless to them, and therefore, the meaning comprised of culturally salient features to the individual in the farming community which are expressed in the form (e.g., PIG-1, whose underlying iconic motivation refers to how pigs are killed) would not be understood unless the mapping is learned. As explained by Occhino et al. (2017), iconicity is subjective as it is dependent on one's language and culture-specific experience.

**Figure 1.** Three variants for *pig* in Kata Kolok produced in response to a picture description task (Lutzenberger et al. 2021; Mudd et al. 2020). The iconic motivation underlying PIG-1 is how a pig is killed, for PIG-2 is how a pig eats and for PIG-3 is the ears of a pig. It is clear that the cultural context of the Kata Kolok community has shaped lexical preferences, illustrated with iconic signs (i.e., mappings between culturally salient features and forms). For instance, the iconic motivations of PIG-1 and PIG-2 stem from farming practices in the community.

Here, we aim to operationalize the relationship between shared context (allowing for iconic mappings) and lexical variation using an agent-based model. In our model, the language representation is adapted from the semiotic triangle (Ogden and Richards 1925). Traditionally, the semiotic triangle consists of a referent (something concrete or abstract referred to in a particular instance of a conversation), a meaning (a representation of that referent by a given individual) and a form (the signal conveyed) (terminology following Steels and Kaplan 1999, definitions following Vogt 2002; Vogt and Divina 2007). The relationship between these components has been used to study the symbol grounding problem (Harnad 1990), i.e., the problem that symbols are internal representations but need to be linked to entities in the real world (Vogt 2002).

In the semiotics literature, there is a heavy emphasis on the conventionalized and/or arbitrary link between the form and referent (see Peirce 1931), which is unsurprising considering the long-held assumption that arbitrariness is a design feature of language (de Saussure 1916; Hockett and Hockett 1960). However, the emphasis on arbitrariness has been reduced due to the overwhelming presence of iconic forms in sign languages as well as in spoken languages (see Perniss et al. 2010 for a review). It should be noted that the role of iconicity in language emergence may differ in signed and spoken languages, given the different affordances of the modalities, which may have ramifications on the degree of lexical variability.

In the present study, we adapt the semiotic triangle to reflect what we posit is representative of the linguistic situation in sign language emergence. The semiotic triangle presented here consists of three components: (1) a concept, i.e., an abstract notion; (2) culturally salient features, i.e., culturally salient features of a concept; and (3) a form, i.e., the signal conveyed. For example, a hypothetical semiotic triangle from an individual in the Kata Kolok community could consist of (1) the concept *pig*, an abstract representation of the animal; (2) culturally salient features of a pig in this farming community, such as how a pig is killed and how a pig eats; and (3) the form PIG-1 (see Figure 1), whose iconic motivation stems from how a pig is killed. Notably, the inclusion of culturally salient features in the language model allows for the use of iconic mappings between the culturally salient features and the form. As such, the original contributions of this model are the introduction of culturally salient features and the *iconic–inferential pathway* (presented in the right triangle in Figure 2). In addition to the *conventional link* between form and concept, the iconic–inferential pathway goes from form to culturally salient features to concept (or vice versa). Here, an individual can make use of the culturally salient features (unique to them depending on their culture and experiences), which can be retrieved from the form given that cultural knowledge is shared.

**Figure 2.** The semiotic triangle used in the current study, consisting of a concept, culturally salient features and a form. The triangle on the left shows the traditional view of the relationship, in which an arbitrary link between the form and concept is made. The triangle on the right depicts an alternative route connecting the form to the concept through culturally salient features, which we call the iconic–inferential pathway. Figure based on Vogt and Divina (2007), adapted from Ogden and Richards (1925), updated with terminology used in the current study.

Here, we illustrate how these pathways could be used in interaction, returning to the example of *pig* from the Kata Kolok community and using a hypothetical conversation between individual A and individual B, both from this community. In conversation, individual A uses the sign PIG-1 (iconic motivation referring to how a pig is killed). However, individual B is not familiar with this form and, using the conventional link (form to concept), does not know at this stage what individual A is referring to. Subsequently, individual B uses the iconic–inferential pathway to consider whether the form produced by individual A overlaps with the culturally salient features of any concept. Because individual B is from the Kata Kolok community, where individuals have knowledge about farming, including the way in which pigs are killed, individual B recognizes that the form PIG-1 produced by individual A refers to how a pig is killed, and thus likely refers to *pig*. In this way, when individuals share a cultural context, the iconic–inferential pathway can serve as a supporting route in case the conventional pathway fails. In the event that neither of these pathways leads individual B to the concept *pig*, it is probable that these individuals will need to initiate repair in order to understand each other. Though many strategies may be used, one option would be for individual B to learn the form produced by individual A. Although in the operationalization of this model the conventional link has priority over the iconic–inferential pathway, in the real world, meaning can also undoubtedly be inferred using the iconic–inferential pathway prior to the conventional link or a combination of both.

This theory generates a prediction about the level of iconicity present in different types of communities. Frishberg (1975) showed that in ASL, a Deaf community sign language, signs tend to become less iconic over time. Pleyer et al. (2017) point out that studies from young sign languages and homesign systems show that "signs gradually shed their iconic mapping", potentially in favor of facilitating a larger vocabulary (Gasser 2004). However, what about for shared sign languages? Does the level of iconicity remain high or decrease over time? We predict that in shared sign languages, the level of iconicity will remain relatively high because iconic forms are successfully communicated, as community members share a high degree of cultural context. In contrast, in Deaf community sign languages, we predict that iconicity will decrease, as found by Frishberg (1975) for ASL, because in these larger communities, individuals typically come from diverse backgrounds. Therefore, retrieving culturally salient features from the form will not be useful when communicative partners do not share cultural context. Rather, individuals are more likely to adapt their form moving closer to the form of their communicative partner. This helps them to successfully communicate, as their forms move towards becoming aligned. However, as individuals do not likely share a cultural context (and hence likely have different salient features), adapting one's form would typically result in a move away from its initial highly iconic state. Iconicity is often talked about on a large scale, irrespective of individual experience. While iconic affordances can be grounded in human experience (e.g., men have beards), it must be stressed that iconicity remains subjective (Occhino et al. 2017). Thus, here, iconicity is considered on an individual level, as opposed to across entire communities where individuals may not share much cultural context.

In sum, we propose that in communication individuals may exploit an iconic–inferential pathway, making use of iconic mappings between a form and culturally salient features if a conventional pathway is not available. In communities such as shared signing communities where individuals share psychological and social information, we predict that communicative partners will successfully communicate using the iconic–inferential pathway if the conventional pathway fails. Because communication can succeed using these two routes, lexical variation should remain high, as well as the degree of iconicity in the community. On the other hand, in communities such as those with Deaf community sign languages, because there is less shared information, the iconic–inferential pathway is less useful. Hence, in the case of failure using the conventional pathway, individuals are more likely to proceed to adapt their lexical form in order to be understood. We therefore predict that communities with little shared context will move towards lexical uniformity and low degrees of iconicity.

In addition to shared context, it has been proposed that population size may affect linguistic features (Lupyan and Dale 2010; Wray and Grace 2007). In sign languages, anecdotal evidence suggests that small populations exhibit a higher degree of lexical variation than large populations (Meir et al. 2012). The relationship between population size and lexical variation has been supported by a recent computational model (Tkachman and Hudson Kam 2020), though previous computational models have found that conventions emerge faster in smaller populations (Baronchelli et al. 2006). Although not the main focus of this study, we also consider the effect of population size on the degree of lexical variability, as typically shared sign languages emerge in smaller populations, and Deaf community sign languages emerge in larger populations. Modeling shared context and population size may help to tease apart the contribution of each on the degree of lexical variation.

In the next section, we describe how this theory is operationalized using an agent-based model. Following this, we begin the results section with two example model runs focusing on the results of the language game component of the model. Then, we study the effect of shared context on lexical variation by altering the number of groups in the model, which determines how many agents share the same cultural context. Concluding the results section, we briefly consider the effect of population size on lexical variation. Finally, in the discussion section, we first focus on comparing the model results to the evidence from variation in signing communities. Then, we discuss the limitations of this model and how it can be extended to account for these limitations.

## **2. Model Description**

The model description is inspired by the ODD (Overview, Design concepts, Details) protocol for describing agent-based simulations (Grimm et al. 2006; Grimm et al. 2010). The description has been adapted to include links between the model and real-world examples, with the aim of making it easier to follow. The model was implemented in Mesa, a Python framework for agent-based modeling (Kazil et al. 2020). The model code is available on figshare: https://doi.org/10.6084/m9.figshare.15163872.v1, accessed on 23 January 2022.

**Purpose**. The purpose of this model is to investigate how shared context affects lexical variation in sign language emergence. As shown in Figure 3, the agent-based model takes the following values as input parameters:


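• The number of groups (*n\_groups*), i.e., the number of subsets of the population that share the same culturally salient features.

• The number of agents (*n\_agents*).

• The number of concepts (*n\_concepts*).

• The number of bits (*n\_bits*).

• The initial degree of overlap between the culturally salient features and the form (*initial\_degree\_of\_overlap*).

• The number of time steps in the model (*n\_steps*).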

**Figure 3.** Visualization of the steps and parameters in the agent-based model. During the initialization phase, the number of groups (*n\_groups*) determines how many subsets of the population have the same set of identical culturally salient features associated with concepts. Then, a number of agents are created (*n\_agents*). Each agent is randomly assigned to a group, and their language representation is set given the following parameters: the number of concepts (*n\_concepts*), the number of bits (*n\_bits*) and the initial degree of overlap between the culturally salient features and form (*initial\_degree\_of\_overlap*). At each time step, all agents initiate a language game (i.e., they take a turn as the sender). At the end of each time step, data on the mean degree of iconicity and the mean lexical variability are calculated. The model continues for a number of steps (*n\_steps*).

**Entities, state variables and scales**. The only entity in the model is the agent, which represents one individual in the real world. Each agent has a unique id and a group, to which it is assigned during the initialization stage (the first stage of the model). Furthermore, each agent has a language representation, which is explained under Initialization below. Figure 4 shows an example of an agent.

**Figure 4.** Example of an agent.

The agent's unique id is 1 as it is the first agent created in this run of the model. In this example, there is only one group (*n\_groups* = 1), so the agent is assigned to group 1. As there is only one group, all agents in the model would have the same culturally salient features corresponding to each concept. This is akin to individuals of a population having shared social and psychological information, thus they are likely to have similar notions for a given concept. Some examples of concepts in real life are *pig*, *tree* and *destiny*, as discussed in the introduction. In the example in the figure, there are two concepts (*n\_concepts* = 2); each concept is associated with culturally salient features and a form, both of which are made up of three bits (*n\_bits* = 3). For each bit of the form, the probability that it will have the same value as the corresponding bit in the culturally salient features is determined by *initial\_degree\_of\_overlap*. Hence, the form, corresponding in real life to a sign produced or a word uttered, is determined by the association with the culturally salient features. The idea is that when individuals initially improvise forms, the forms often bear some degree of resemblance to culturally salient features of the concept. For example, in Kata Kolok, signs for *pig* refer to how a pig is killed, how a pig eats or a pig's ears, all features that are culturally salient in the Kata Kolok community.

**Process overview and scheduling**. The set-up of the model is outlined in initialization below. After the initialization phase, each time step consists of the processes outlined in Table 1. For details of these processes, see the Submodels sections. A schematic overview of the order of processes and parameter input is provided in Figure 3.

**Initialization**. For each group (*n\_groups*), a bit vector of length *n\_bits* is generated per concept (*n\_concepts*). Following the example provided in Figure 4, the culturally salient features associated with concept A are 001 and those associated with concept B are 000. In the real world, this could be analogous to two concepts, say, *pig* and *butterfly*, which have different culturally salient features (dependent on the background of a person), such as wings for a butterfly and pigs rolling in mud or how they are killed in farming. Roughly, the string of 0s and 1s representing the culturally salient features can be thought of as a unique representation of the characteristics of that concept, given the group one is in.

Each agent has a language representation which consists of, for each concept, a set of culturally salient features and a form, as shown in Figure 4. *n\_concepts* determines the number of concepts in the language representation. This is akin to the number of words in a person's vocabulary. Each concept is associated with culturally salient features and a form, each consisting of a number of bits (0s and 1s), determined by *n\_bits*. The culturally salient features corresponding to each concept are fixed, based on the group that the agent belongs to. The culturally salient features and concepts are never updated or changed throughout the simulation. Only the forms can be updated. The idea here is a simplification of reality, in which an individual is born in a certain context, determining what features are salient culturally for the entirety of their life (e.g., in communities where farming is practiced, one's concept of an animal is likely related to how that animal is farmed). However, despite this, the form (produced sign or uttered word) can change over the course of one's life (e.g., I may say "rad", "radical" or "cool" to refer to the same concept).
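The language representation described above can be sketched in plain Python. This is an illustration only and is not taken from the published model code: the names `SketchAgent`, `make_group_features` and `make_agents` are hypothetical, the Mesa framework is deliberately not used, and the forms are left empty because they are assigned in a later initialization step.

```python
import random
from dataclasses import dataclass, field


def random_bits(n_bits, rng):
    """Return a tuple of n_bits random 0/1 values."""
    return tuple(rng.randint(0, 1) for _ in range(n_bits))


def make_group_features(n_groups, n_concepts, n_bits, rng):
    """One fixed bit vector of culturally salient features per concept, per group."""
    return {
        group: {concept: random_bits(n_bits, rng) for concept in range(n_concepts)}
        for group in range(n_groups)
    }


@dataclass
class SketchAgent:  # hypothetical name, not the class used in the published code
    unique_id: int
    group: int
    features: dict                             # concept -> bits, never updated
    forms: dict = field(default_factory=dict)  # concept -> bits, may be updated


def make_agents(n_agents, group_features, rng):
    """Create agents and assign each one to a randomly chosen group."""
    agents = []
    for unique_id in range(1, n_agents + 1):
        group = rng.choice(list(group_features))
        # Agents in the same group share the same culturally salient features.
        agents.append(SketchAgent(unique_id, group, group_features[group]))
    return agents
```

Because the features are fixed for the whole run, agents in one group can safely point to the same dictionary; only `forms` would need to be stored per agent.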


Iconicity is represented in the model by the similarity between the forms (sign produced or word uttered) and the culturally salient features for a concept. For example, if a butterfly's wings are salient in one's culture and the sign for butterfly refers to the wings of the insect, then the similarity between the culturally salient features and the form is strong and thus highly iconic for individuals with the same background. In the model, to understand how iconicity affects lexical variability, the parameter determining the degree of overlap between forms and culturally salient features is varied. This is operationalized in the model in the following way: The relationship between each bit of the culturally salient features and the form is determined by the *initial\_degree\_of\_overlap*, such that the probability that the form's bit is the same as the bit from the culturally salient features is equal to the value of *initial\_degree\_of\_overlap*. For a bit of the form that is not chosen to be the same as the bit of the culturally salient features in the initial event, then that bit is randomly<sup>2</sup> assigned a 0 or 1. As such, a non-iconic form does not have a structured relationship between the form and the culturally salient features; rather, its relationship is arbitrary. If the *initial\_degree\_of\_overlap* is set to 1, then there is a 100% chance that each bit of the form will be the same as that of the culturally salient features. If the *initial\_degree\_of\_overlap* is set to 0, then each bit of the form is randomly assigned a 0 or 1.
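As a quick check on what this parameter implies, the per-bit probability that a form bit matches the corresponding bit of the culturally salient features can be worked out. The derivation assumes that the random assignment is a fair coin flip, which is not stated explicitly above. Writing $p$ for *initial\_degree\_of\_overlap*:

$$
P(\text{form bit} = \text{feature bit}) = p \cdot 1 + (1 - p) \cdot \tfrac{1}{2} = \frac{1 + p}{2}
$$

Under this assumption, $p = 0$ gives an expected initial overlap of 0.5 (chance level, i.e., an arbitrary form), and $p = 1$ gives an overlap of 1 (a fully iconic form).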

To illustrate with an example following Figure 4, take concept A, which is associated with the culturally salient features 001. Before assigning the forms, the language representation looks like this: A, 001, NA NA NA, with 001 referring to the culturally salient features and NA NA NA referring to placeholders for each bit of the form. Starting with the first bit of culturally salient features (0), there is a 33% chance that the corresponding bit of the form will be identical to the bit of the culturally salient features in this initial event. The outcome of this event is that the bit of the culturally salient features and of the form are not identical. From here, a new event occurs, randomly assigning a 0 or 1 to this bit; a 1 is randomly assigned (note that at this stage a 0 could also be chosen randomly). Now, the language representation looks like this: A, 001, 1 NA NA. The same process is repeated to determine the second bit of the form, and here the outcome of this event is that this bit is identical, i.e., the second bit of the form is set to 0 as the second bit of the culturally salient features is 0. Finally, this process is repeated a third time, and here the outcome of this event is that the bit of the culturally salient features and of the form are not identical. From here, a new event occurs, randomly assigning a 0 or 1 to this bit; here, it happens to be a 0 that is chosen (note that at this stage, a 1 could also be chosen randomly). Thus, the final language representation of this agent for concept A is: A, 001, 100.
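The two-stage procedure walked through above can be sketched as a short function. This is a minimal illustration, not the published implementation: `initial_form` is a hypothetical name, and the random 0/1 assignment is assumed to be a fair coin flip.

```python
import random


def initial_form(features, initial_degree_of_overlap, rng=random):
    """Build a form bit by bit from the culturally salient features.

    For each bit, a first event decides (with probability
    initial_degree_of_overlap) whether the form copies the feature bit;
    otherwise a second event assigns a random 0 or 1 (assumed fair).
    """
    form = []
    for feature_bit in features:
        if rng.random() < initial_degree_of_overlap:
            form.append(feature_bit)        # bit copied from the features
        else:
            form.append(rng.randint(0, 1))  # bit assigned at random
    return tuple(form)


# With the features 001 of concept A and an overlap of 0.33, one possible
# outcome is the form 100 from the example above; other outcomes are possible.
form_a = initial_form((0, 0, 1), 0.33)
```

With an overlap of 1 the function always returns the features unchanged, and with an overlap of 0 every bit is drawn at random, matching the two limiting cases described earlier.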

**Submodel Language game**. The language games consist of two agents interacting—a sender and a receiver, simulating a simplified exchange between two individuals. At each time step in the model, all agents take one turn as a sender in the language game. As shown in Figure 5, the language game consists of four steps. First, the sender randomly chooses a concept and produces the corresponding form. In Figure 5, the sender has randomly chosen concept A. In real life, this would be analogous to an individual wanting to communicate about a given concept and producing the corresponding sign or uttering the corresponding word.

Second, in the language game, the receiver selects the form which is closest to the form of the sender, by calculating the distance between the sender's form and each of the receiver's forms. Crucially, in this model, the distance is calculated by comparing the bits at the same index. In the event of a tie between two or more forms as having most in common with the form of the sender, one of the tied forms is chosen at random. Following the example provided in Figure 5, the sender's form is 100. The distance to the receiver's first form, 001, is 2/3, and the distance to the receiver's second form, 100, is 0/3, so the second form is selected. The concept of the selected form of the receiver is then compared to the chosen concept of the sender. If the concepts of the sender and receiver are the same, then the language game is over and no update is made. When the language game succeeds here, we refer to this as *form success*. However, if the concepts of the sender and receiver do not match (as is the case in the example presented in Figure 5 where the sender chose concept A and the receiver's closest match is concept B), then the language game proceeds to the third step. Success at this step of the language game represents the conventional link or memorizing the association between a concept and a form. Typically in language games, it is the conventional link that is modeled.
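The second step can be sketched as follows, using the values from the Figure 5 example. The function names `distance` and `closest_concept` are hypothetical, and the dictionary layout (concept label mapped to a tuple of bits) is an assumption made for illustration.

```python
import random
from fractions import Fraction


def distance(a, b):
    """Proportion of positions at which two equal-length bit tuples differ."""
    return Fraction(sum(x != y for x, y in zip(a, b)), len(a))


def closest_concept(signal, candidates, rng=random):
    """Return the concept whose bits are closest to the signal.

    candidates maps concept -> bits; ties are broken at random.
    """
    distances = {c: distance(signal, bits) for c, bits in candidates.items()}
    best = min(distances.values())
    return rng.choice([c for c, d in distances.items() if d == best])


# Figure 5 example: the sender chose concept A and produced the form 100.
receiver_forms = {"A": (0, 0, 1), "B": (1, 0, 0)}
guess = closest_concept((1, 0, 0), receiver_forms)
form_success = guess == "A"
print(guess, form_success)  # prints: B False
```

Here the receiver's closest form belongs to concept B, so there is no form success and the game would move on to the third step.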

This next step presents the original contribution of this model, which models the ability of individuals to make use of iconic affordances. In this third step, the form of the sender is compared to all the sets of culturally salient features of the receiver. As in step two, the distances between the form of the sender and all of the sets of culturally salient features of the receiver are calculated, and the closest culturally salient features are selected. Again, following the example in Figure 5, the sender's form is 100. The distance to the receiver's first set of culturally salient features, 001, is 2/3, and the distance to the receiver's second set, 000, is 1/3, so the second set of culturally salient features is selected. As in step two, the concept of the receiver's selected culturally salient features is compared to the sender's chosen concept. If these concepts are the same, then the language game is over and no update is made. When the language game succeeds here, we refer to this as *culturally salient features success*. Success at this step of the language game represents the iconic–inferential pathway, where a form and concept are linked via the culturally salient features. Crucially, no memorization is required. However, at this stage, if the concepts of the agents do not match (as is the case in the example presented in Figure 5 where the sender chose concept A and the receiver's closest match is concept B), then the language game proceeds to the fourth step.
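The order in which the two pathways are tried can be sketched as a single function. As before, this is an illustration under assumptions: `interpret` and `closest_concept` are hypothetical names, and the returned labels simply reuse the outcome names from the text.

```python
import random


def closest_concept(signal, candidates, rng=random):
    """Concept whose bits differ from the signal at the fewest positions (random tie-break)."""
    distances = {
        c: sum(x != y for x, y in zip(signal, bits)) for c, bits in candidates.items()
    }
    best = min(distances.values())
    return rng.choice([c for c, d in distances.items() if d == best])


def interpret(sender_concept, sender_form, receiver_forms, receiver_features, rng=random):
    """Try the conventional link first, then the iconic-inferential pathway."""
    # Step two: compare the sender's form with the receiver's forms.
    if closest_concept(sender_form, receiver_forms, rng) == sender_concept:
        return "form success"
    # Step three: compare the sender's form with the receiver's features.
    if closest_concept(sender_form, receiver_features, rng) == sender_concept:
        return "culturally salient features success"
    # Step four: both pathways failed, so the receiver will update one bit.
    return "bit update"


# Figure 5 example: both pathways point to concept B, so the game reaches step four.
outcome = interpret(
    "A",
    (1, 0, 0),
    receiver_forms={"A": (0, 0, 1), "B": (1, 0, 0)},
    receiver_features={"A": (0, 0, 1), "B": (0, 0, 0)},
)
print(outcome)  # prints: bit update
```

If the receiver's features for concept A had instead been close to the sender's form, the second branch would have returned a culturally salient features success without any form being changed.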

The last step of the language game represents when communication is unsuccessful via the conventional link and the iconic–inferential pathway. In this case, as is typical in language games, one agent updates their form to hopefully allow for successful communication in the future. In real life, this corresponds to aligning speech with an interlocutor. Concretely, in this fourth step, for the sender's chosen concept (concept A), the receiver updates one bit of the form which is different from the form of the sender. If the language game advances to this stage, we call this *bit update*. In Figure 5, the sender's form corresponding to concept A is 100. The receiver's form corresponding to concept A is 001. The bits that are different between the sender's and the receiver's form are identified (the first and third bits), and one is randomly selected to be changed to correspond to the sender's. In the example, the first bit was chosen and is changed to a 1; now, the receiver's form for concept A is 101.
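A minimal sketch of the bit update, again with a hypothetical function name (`bit_update`) and forms represented as tuples of bits:

```python
import random


def bit_update(receiver_form, sender_form, rng=random):
    """Move the receiver's form one bit closer to the sender's form."""
    differing = [
        i for i, (r, s) in enumerate(zip(receiver_form, sender_form)) if r != s
    ]
    if not differing:
        # Forms already identical: nothing to update.
        return tuple(receiver_form)
    index = rng.choice(differing)  # one differing position, chosen at random
    updated = list(receiver_form)
    updated[index] = sender_form[index]
    return tuple(updated)


# Figure 5 example: receiver 001, sender 100. The first and third bits differ,
# so the result is either 101 (first bit changed) or 000 (third bit changed).
new_form = bit_update((0, 0, 1), (1, 0, 0))
```

Only one bit changes per failed interaction, so forms that differ at several positions need repeated failures to become fully aligned.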




**Figure 5.** The steps of the language game with an accompanying example.

**Submodel Collect data**. In the data collection phase of each time step, two calculations are made: the mean degree of iconicity and the mean lexical variability. Calculation examples are demonstrated with the agents in Figure 6.


**Figure 6.** Example agents for calculating the mean degree of iconicity and the mean lexical variability.

First, the mean degree of iconicity is calculated for each concept of each agent and averaged across all agents. To calculate the degree of iconicity for a concept, the culturally salient features and the form are compared at each index, with the similarity (or overlap) calculated. For example, for agent 1 in Figure 6, for concept A, the associated culturally salient features are 001 and the form is 100. The similarity between these is 1/3. For concept B, the similarity is 2/3. Thus, the mean degree of iconicity for agent 1 is 1/2.
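The iconicity calculation can be sketched as follows. The function names and the dictionary layout are hypothetical. For concept B, the form 001 is the one given for agent 1 in the lexical variability example, while the features 000 are an assumption carried over from the earlier examples; together they reproduce the stated similarity of 2/3.

```python
from fractions import Fraction


def iconicity(features, form):
    """Proportion of positions at which features and form have the same bit."""
    return Fraction(sum(f == b for f, b in zip(features, form)), len(features))


def agent_iconicity(agent):
    """Mean iconicity over all concepts of one agent."""
    scores = [iconicity(agent["features"][c], agent["forms"][c]) for c in agent["forms"]]
    return sum(scores) / len(scores)


def mean_iconicity(agents):
    """Mean of the per-agent iconicity across the population."""
    return sum(agent_iconicity(a) for a in agents) / len(agents)


agent_1 = {
    "features": {"A": (0, 0, 1), "B": (0, 0, 0)},  # B's features are assumed
    "forms": {"A": (1, 0, 0), "B": (0, 0, 1)},
}
print(agent_iconicity(agent_1))  # prints: 1/2
```

Exact fractions are used here only so that the result matches the worked example; floating-point values would serve equally well in a simulation.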

Next, the mean lexical variability in the population is calculated by comparing all forms for each concept between all pairs of agents in the population. If the agents' forms for a concept are the same, i.e., all bits match at each index, then the distance between the productions is 0. If two agents' forms for a concept are not the same, i.e., the bits differ at one index or more, then the distance between the productions is 1. Thus, the result of the comparison between two agents' forms is binary (distance of 0 or 1)<sup>3</sup>. For each pair, the mean of the distances is taken. We will illustrate this calculation with the agents depicted in Figure 6: For concept A, agent 1's form is 100 and agent 2's form is 001. As these forms differ at the first and last positions, the distance between them is 1. Subsequently, for concept B, agent 1's form is 001 and agent 2's form is 100, which differ at the first and last positions, so the distance between them is 1. Thus, the mean lexical variability between these agents is 1.
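The pairwise calculation can be sketched as below. The function names are hypothetical, each agent is represented only by its forms (concept mapped to a tuple of bits), and the population-level value is assumed to be the mean over all pairs of agents.

```python
from itertools import combinations


def pair_variability(forms_a, forms_b):
    """Mean binary distance over concepts: 0 if two forms are identical, else 1."""
    distances = [0 if forms_a[c] == forms_b[c] else 1 for c in forms_a]
    return sum(distances) / len(distances)


def mean_lexical_variability(all_forms):
    """Mean of the pairwise variability over all pairs of agents."""
    pairs = list(combinations(all_forms, 2))
    if not pairs:
        raise ValueError("at least two agents are needed")
    return sum(pair_variability(a, b) for a, b in pairs) / len(pairs)


# Figure 6 example: the two agents differ on both concepts.
agent_1_forms = {"A": (1, 0, 0), "B": (0, 0, 1)}
agent_2_forms = {"A": (0, 0, 1), "B": (1, 0, 0)}
print(mean_lexical_variability([agent_1_forms, agent_2_forms]))  # prints: 1.0
```

Because the distance is binary, two forms that differ in a single bit count as fully distinct variants, just like forms that differ in every bit.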
