*7.2. Definition of Similarity*

We estimate several functional forms of similarity and several measures of distance between the attributes used in the definition of similarity between problems. We test how similarity as characterized by Equation (14) performs in terms of in-sample fit of the data. The similarity functions differ primarily in how similarity decays with distance: above, we assume that similarity decays exponentially, whereas in Equation (14) it decays according to the logarithm. (Ones were added to avoid division-by-zero and log-of-zero problems.)

$$S^2(x, y) = \frac{1}{\ln(d(x, y) + 1) + 1} \tag{14}$$
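To make the difference in decay concrete, the following minimal sketch evaluates both similarity functions at a few distances. The logarithmic form follows Equation (14); the exponential form `1 / exp(d)` is an assumption about the main specification, which is not reproduced in this section, and the function names are illustrative only.

```python
import math


def similarity_exponential(d: float) -> float:
    """Assumed main-specification form: similarity = 1 / exp(d)."""
    return math.exp(-d)


def similarity_log(d: float) -> float:
    """Equation (14): similarity = 1 / (ln(d + 1) + 1)."""
    return 1.0 / (math.log(d + 1.0) + 1.0)


if __name__ == "__main__":
    # Both functions equal 1 at d = 0; the logarithmic form decays far more slowly.
    for d in (0.0, 1.0, 5.0, 20.0):
        print(f"d={d:5.1f}  exp={similarity_exponential(d):.4f}  "
              f"log={similarity_log(d):.4f}")
```

Under these assumptions, distant cases retain non-negligible similarity with the logarithmic form, while the exponential form discounts them almost entirely.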

In addition to the decay of similarity, we can also test a different definition of distance between elements of the problem. In our main specification, we use weighted Euclidean distance, as defined in Equation (4). Another definition of distance that is popular in psychology is the Manhattan distance, given by Equation (15).

$$d^2(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{\#Dims} w_i |x_i - y_i| \tag{15}$$
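The two distance measures can be compared with a short sketch. The Manhattan form follows Equation (15). The weighted Euclidean form shown here, the square root of the weighted sum of squared differences, is an assumption, since Equation (4) is not reproduced in this section; the sketch also assumes non-negative weights, and the function names are illustrative.

```python
import math
from typing import Sequence


def weighted_euclidean(x: Sequence[float], y: Sequence[float],
                       w: Sequence[float]) -> float:
    """Assumed form of Equation (4): sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def weighted_manhattan(x: Sequence[float], y: Sequence[float],
                       w: Sequence[float]) -> float:
    """Equation (15): sum_i w_i * |x_i - y_i|."""
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))


if __name__ == "__main__":
    x, y = (1.0, 2.0), (4.0, 6.0)
    unit_weights = (1.0, 1.0)
    print(weighted_euclidean(x, y, unit_weights))  # straight-line distance
    print(weighted_manhattan(x, y, unit_weights))  # sum of per-dimension gaps
```

With unit weights, the Manhattan distance is never smaller than the Euclidean distance, so it penalizes differences spread across several attributes more heavily.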

Using these definitions, we report that the in-sample fit of the data in Table 1, measured by the quadratic scoring rule, is robust to the various definitions of similarity and distance. The column headings in Table 2 indicate which equations, and hence which functional forms of similarity and distance, were used. The functional form of distance between elements appears to be of minor importance to fit in the mixed-strategy equilibrium games explored here. By contrast, there is greater variation in the performance of the different similarity functions. The similarity function in Equation (14) performs better than the exponential. However, we also find that the weight *W*<sup>2</sup> under Equation (14) is negative and statistically significant, which is unexpected. To avoid overfitting the data with parameters that do not make psychological sense, we use Equation (3) in our main specification. We conclude that CBL is robust to different definitions of similarity and that the inverse exponential function is a good fit for the experimental data at hand. This also corresponds to previous findings in psychology and economics [4,33].
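For readers unfamiliar with the fit measure, the sketch below computes a mean quadratic score from predicted choice probabilities. It uses one common form of the quadratic scoring rule, one minus the squared error between the predicted probability vector and the observed choice; this is an assumption, since the exact form used in the estimation is not stated in this section, and the function names and example data are hypothetical.

```python
from typing import Sequence


def quadratic_score(probs: Sequence[float], chosen: int) -> float:
    """One common form: 1 - sum_j (p_j - 1[j == chosen])^2."""
    return 1.0 - sum(
        (p - (1.0 if j == chosen else 0.0)) ** 2 for j, p in enumerate(probs)
    )


def mean_quadratic_score(predictions: Sequence[Sequence[float]],
                         choices: Sequence[int]) -> float:
    """Average quadratic score over all observed choices."""
    scores = [quadratic_score(p, c) for p, c in zip(predictions, choices)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical predictions for two periods of a two-action game.
    predictions = [[1.0, 0.0], [0.5, 0.5]]
    choices = [0, 1]
    print(mean_quadratic_score(predictions, choices))
```

Under this form, a higher score indicates a better fit: a prediction that places probability one on the observed action scores 1, and a uniform prediction over two actions scores 0.5.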

**Table 2.** Similarity Definitions Measured by Mean Quadratic Score.


Note: *S* denotes the similarity function in Equation (3) and *S*<sup>2</sup> denotes the similarity function in Equation (14). *d* denotes the Euclidean distance function in Equation (4) and *d*<sup>2</sup> denotes the Manhattan distance function in Equation (15).
