**7. Conclusions**

This paper presented a large-scale cognitive experiment for estimating the entropy rate of English. Using AMT, we conducted Shannon's experiment online and collected a total of 172,954 character predictions from 683 subjects. It is by far the largest cognitive experiment of this kind conducted to date, and its scale enabled us to analyze the factors that influence the estimate.

While Shannon implied that subjects' prediction performance improves with increasing context length, others disagreed. Our experiment showed that subjects' prediction performance improved consistently with increasing context length, at least up to 100 characters.

Further, we used the bootstrap technique to investigate how the number of observations influences the estimate. One of the most important insights is that at least 1000 prediction observations are needed to produce an estimate with a reasonable margin of error. With small samples, the value of *h* can be underestimated. Hence, Shannon's original experiment and other previous experiments may have underestimated *h*. We believe that the present work reports a statistically reliable estimate with a reasonable margin of error.
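The effect of sample size described above can be illustrated with a minimal sketch. It assumes that the upper bound is computed as the plug-in entropy of the guess-number distribution (the form of Shannon's upper bound) and that the bootstrap resamples individual predictions with replacement. The data are synthetic, and the names `upper_bound_bpc` and `bootstrap_interval` are hypothetical; none of this reproduces the actual data or analysis code of this study.

```python
import math
import random
from collections import Counter


def upper_bound_bpc(guess_counts):
    """Plug-in entropy (bits per character) of the guess-number distribution."""
    n = len(guess_counts)
    freq = Counter(guess_counts)
    return -sum((c / n) * math.log2(c / n) for c in freq.values())


def bootstrap_interval(guess_counts, sample_size, n_boot=200, alpha=0.05, rng=None):
    """Resample `sample_size` observations with replacement `n_boot` times.

    Returns (mean estimate, lower percentile, upper percentile).
    """
    rng = rng or random.Random(0)
    estimates = sorted(
        upper_bound_bpc(rng.choices(guess_counts, k=sample_size))
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return sum(estimates) / n_boot, lower, upper


# Synthetic pool (an assumption, NOT experimental data): the number of guesses
# needed per character, drawn from a geometric-like law over 27 symbols.
data_rng = random.Random(42)
ranks = list(range(1, 28))
weights = [0.5 ** k for k in range(27)]
pool = data_rng.choices(ranks, weights=weights, k=20000)

full_estimate = upper_bound_bpc(pool)
results = {}
for size in (100, 1000, 10000):
    results[size] = bootstrap_interval(pool, size, rng=random.Random(size))
    mean, lower, upper = results[size]
    print(f"n={size:>5}: mean={mean:.3f}  95% interval=[{lower:.3f}, {upper:.3f}]")
print(f"full pool: {full_estimate:.3f}")
```

In this sketch, the percentile interval narrows as the number of observations grows, and the mean estimate for the smallest sample falls below the full-pool value. The second effect reflects the known downward bias of the plug-in entropy estimator and is in line with the underestimation risk noted above for small samples.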

Because the experiment was conducted online, the performance of the subjects varied, so the upper bound should be evaluated on filtered results. With a sufficient number of well-performing samples, we obtained an upper bound of *h* ≈ 1.22 bpc, which is slightly smaller than Shannon's reported value of *h* = 1.3 bpc.

Future work could include finding a new experimental design, one in which the participants use longer contexts to predict the next character, thereby reducing the cognitive load. Such an experiment would contribute to a tighter evaluation of the upper bound of the entropy rate. It would also be interesting to examine the entropy rates of other languages, and at the word level, while still using a cognitive experiment.

**Author Contributions:** Conceptualization, K.T.-I.; Data curation, G.R.; Funding acquisition, K.T.-I.; Investigation, G.R., K.T.-I. and S.T.; Methodology, G.R., K.T.-I. and S.T.; Project administration, K.T.-I.; Resources, K.T.-I.; Software, G.R. and S.T.; Supervision, K.T.-I.; Validation, S.T. and K.T.-I.; Visualization, G.R., S.T. and K.T.-I.; Writing—original draft, S.T. and K.T.-I.; Writing—review & editing, K.T.-I. and S.T.

**Funding:** This research was funded by RISTEX-HITE 176100000214 of the Japan Science and Technology Agency, Japan.

**Conflicts of Interest:** The authors declare no conflict of interest.
