*3.4. Experimental Timeline*

A total of 111 subjects were recruited for this experiment. Sessions were conducted at the Incentive Lab at the Rady School of Management, University of California, San Diego (San Diego, CA, USA). The experiment was programmed and conducted using z-Tree [23]. A complete session lasted 90 min. Subjects received a \$5 show-up fee for attending the experiment and an additional \$5 if they passed the understanding test and completed the experiment. They earned an additional \$8 on average, depending on their decisions in the guessing games. Subjects who did not pass the understanding test spent about 30 min in the experiment and left with the \$5 show-up fee.
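
The payment rule described above can be written out as a short sketch. The function and constant names are illustrative assumptions, not part of the experimental software; the dollar amounts are the ones stated in the text.

```python
# Illustrative sketch of the payment rule; names are hypothetical.
SHOW_UP_FEE = 5.0        # paid to every subject who attended
COMPLETION_BONUS = 5.0   # paid only after passing the test and finishing


def total_payment(passed_test: bool, game_earnings: float = 0.0) -> float:
    """Return a subject's total payment in USD."""
    if not passed_test:
        # Subjects who failed the understanding test left with the show-up fee.
        return SHOW_UP_FEE
    return SHOW_UP_FEE + COMPLETION_BONUS + game_earnings


print(total_payment(False))      # 5.0
print(total_payment(True, 8.0))  # 18.0, using the average game earnings
```

Under these figures, a subject who completed the experiment with average game earnings left with about \$18 in total.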

Subjects were given instructions on the two-person guessing game first. After explaining the rules, I introduced four unincentivized practice rounds. During the practice rounds, subjects played against the computer and were told that the computer would always choose the mean of the target interval. After the subjects made a guess, feedback was provided so that they could reflect on the game rule and the payoff rule. An understanding test was then administered. The test was composed of six questions, similar to the understanding test in CGC06. Standard questions included calculations of best responses and payoffs. Although subjects in the experiment were not restricted to following a level-*k* reasoning process, for the purpose of the experiment, I wanted to make sure the subjects were capable of calculating the best responses. A screenshot of the understanding test is shown in Figure 1. Subjects needed to answer four out of six questions correctly to proceed to the main part of the experiment.
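
The kind of best-response calculation the understanding test asks for can be sketched as follows. This assumes the CGC06 game structure, in which a player's ideal guess equals their own target times the opponent's guess, restricted to the player's own lower and upper limits. The function names and the example parameters are illustrative assumptions, not taken from the experimental software.

```python
def best_response(target: float, lower: float, upper: float,
                  opponent_guess: float) -> float:
    """Best response in a two-person guessing game (assumed CGC06 structure).

    The ideal guess is target * opponent_guess; if it falls outside the
    player's own limits, it is moved to the nearest limit.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    ideal = target * opponent_guess
    return float(min(upper, max(lower, ideal)))


def passes_understanding_test(num_correct: int, required: int = 4,
                              total: int = 6) -> bool:
    """Pass rule from the text: at least four of six questions correct."""
    if not 0 <= num_correct <= total:
        raise ValueError("num_correct must lie between 0 and total")
    return num_correct >= required


# Hypothetical example: limits [100, 500], target 1.5.
print(best_response(1.5, 100, 500, 300))  # 450.0 (inside the limits)
print(best_response(1.5, 100, 500, 400))  # 500.0 (600 clipped to the upper limit)
print(passes_understanding_test(4))       # True
```

The clipping step matters because a subject who ignores their own limits would report an infeasible guess even with a correct belief about the opponent.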

**Figure 1.** Screenshot of the understanding test.

Before playing the incentivized guessing games, subjects were introduced to the memorization task. They were given two unincentivized practice rounds for the low-load and high-load treatments. During the practice rounds, they had the standard 15 s to memorize the string of letters and were asked for immediate recall when the time was up. They did not, however, get to practice the guessing game with the cognitive load implemented.

The main experiment consisted of two parts, as discussed in Section 3.3. There were 18 two-person guessing games in total. For the first 16 games, subjects were randomly assigned into pairs and stayed within the same pair for all 16 decisions (one as player 1 and the other as player 2). For each game, subjects were given the same information set, which consisted of the type of memorization task (a string of three or seven letters, or a probability distribution) for themselves and their opponents, whether their opponents knew their exact memorization task, and the targets and limits for both players. An example of the actual decision screen is provided in Figure 2. Subjects were also asked to report their opponents' types of memorization task after they made their guesses and recalled the letters. This allowed me to check whether the subjects received and processed the correct information about their strategic environment. No feedback was given between the 18 guessing games. This prevented the subjects from learning anything about their opponents' past actions. It also limited the subjects' learning of the guessing game, as no payoff information was provided. (There was limited learning of the game. When a subject's levels were checked against the order in which they played the games, playing a later game was not associated with higher *k*-levels. The coefficient from the OLS regression was 0.005 and it was not statistically significant.)
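
A learning check of this kind can be sketched as a simple OLS regression of the estimated *k*-level on the position of the game in the playing order. The text does not report the exact specification (controls, clustering of standard errors by subject), so the bivariate version below is an assumption, and the data in the usage example are made up purely for illustration.

```python
import math


def ols_slope(x, y):
    """Bivariate OLS of y on x; returns (slope, standard error of slope).

    Uses the classical homoskedastic standard error. This is an
    illustrative assumption, not the specification used in the paper.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom.
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope, std_err


# Hypothetical data: game order and estimated k-level for one subject.
game_order = [1, 2, 3, 4]
k_level = [1, 3, 2, 4]
slope, std_err = ols_slope(game_order, k_level)
print(f"slope = {slope:.3f}, t = {slope / std_err:.2f}")
```

A slope close to zero with a small t-statistic, as reported above, is what one would expect if subjects did not become more sophisticated over the course of the games. With repeated observations per subject, standard errors clustered at the subject level would be a natural refinement.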

**Figure 2.** Screenshot of the incentivized two-person guessing game.

Subjects took a 10-question Mensa practice test at the end of the experiment. The test measures a subject's analytical ability. Some questions ask the subject to identify the missing element that completes a sequence of patterns or numbers. Others are verbal math questions. A couple of studies in the economics literature have used a similar test as a measure of cognitive ability [22]. I used this test to examine whether treatment effects differed across subjects with different exogenous cognitive abilities.
