#### **4. Order within Disordered Sequences**

In intuitive terms, Ramsey theory states that a certain degree of order exists in all sets/sequences/strings, regardless of their composition. Heuristically speaking, this is so because it is impossible for a collection of data not to have "spurious" correlations, that is, relational properties among its constituents which are determined only by the size of the data. The simplest example of such a (spurious) correlation is given by *Dirichlet's pigeonhole principle*, which states that *n* pigeons sitting in *m* < *n* holes result in at least one hole being filled with at least two pigeons. Another example is the party result: in any party of six people, some three of them are either mutual acquaintances or complete strangers to each other [64,65].<sup>7</sup> These seemingly obvious statements can be used to demonstrate unexpected results; for example, the pigeonhole principle implies that there are two people in Paris who have the same number of hairs on their heads. The pigeonhole principle needs at least two pigeons and one hole; the party result needs at least six people. A common drawback of both results is their non-effectivity: we know that two people in Paris have the same number of hairs on their heads, but we do not know who they are.
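The party result is small enough to check exhaustively. The following is a minimal sketch, not taken from the cited references: the function names are hypothetical, and each pair of guests is encoded as 1 (acquainted) or 0 (strangers). It confirms that six guests always contain such a trio, while five guests need not.

```python
from itertools import combinations, product

def has_mono_triangle(n, colour):
    """colour maps each pair (i, j) with i < j to 0 (strangers) or 1 (acquainted)."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_party_has_trio(n):
    """Exhaustively check all 2-colourings of the pairs among n guests."""
    pairs = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        if not has_mono_triangle(n, dict(zip(pairs, bits))):
            return False  # found a party with no uniform trio
    return True

print(every_party_has_trio(5))  # False
print(every_party_has_trio(6))  # True
```

The check only establishes that a trio exists in every party of six; like the theorem itself, it says nothing about which trio it will be in a party that has not been inspected.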

An important result in Ramsey theory is Van der Waerden's theorem (see [66]), which states that *in every binary sequence at least one of the two symbols must occur in arithmetical progressions of every length.*<sup>8</sup> The theorem describes a set of arbitrarily large strong correlations: in the sequence $x_1 x_2 \dots x_n \dots$ there exist arbitrarily large $k$, $N$ such that the equidistant positions $k, k+t, k+2t, \dots, k+Nt$ contain the same element (0 or 1), that is, $x_k = x_{k+t} = x_{k+2t} = \dots = x_{k+Nt}$.<sup>9</sup> Crucial here is the fact that the property holds true for *every* sequence, ordered or disordered.<sup>10</sup> Are these correlations "spurious"? According to Oxford Dictionary, *spurious* means "Not being what it purports to be; false or fake. False, although seeming to be genuine. Based on false ideas or ways of thinking." The (dictionary) definition of the word "spurious" is semantic, that is, it depends on an assumed theory: one correlation can be spurious according to one theory, but meaningful with respect to another one.
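The finite version of the theorem can be illustrated on very short strings. The sketch below uses hypothetical helper names and 0-based positions, and takes `length` to be the number of terms in the progression. It searches a binary string for equally spaced positions carrying the same symbol, then checks by brute force that every string of length 9 contains such a progression with three terms, while some strings of length 8 do not.

```python
from itertools import product

def has_mono_ap(bits, length):
    """True if `length` equally spaced positions of `bits` carry the same symbol."""
    if length < 2:
        raise ValueError("length must be at least 2")
    n = len(bits)
    for start in range(n):
        for step in range(1, n):
            if start + (length - 1) * step >= n:
                break  # progression would run past the end of the string
            if len({bits[start + i * step] for i in range(length)}) == 1:
                return True
    return False

def all_strings_have_mono_ap(n, length):
    """Check every binary string of length n."""
    return all(has_mono_ap(s, length) for s in product("01", repeat=n))

print(has_mono_ap("01100110", 3))      # False
print(all_strings_have_mono_ap(8, 3))  # False
print(all_strings_have_mono_ap(9, 3))  # True
```

The search is exponential in the string length, so it only illustrates the phenomenon for tiny cases; it is not a practical way to locate long progressions.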

Can we give a definition of "spurious correlation" which is independent of *any* theory? Following [46], a *spurious correlation* is defined in a very restrictive way as follows: *a correlation is spurious if it appears in a randomly generated string/sequence*. Indeed, in the above sense a spurious correlation is "meaningless" according to any reasonable interpretation because, by construction, its values have been generated at "random", as all data in the sequence. As a consequence, such a correlation cannot provide reliable information on future developments of any type of behaviour. Of course, there are other reasons making a correlation spurious, even within a "non-random" string/sequence. But are there correlations as defined above? Van der Waerden's theorem proves that in every sequence there are spurious correlations in the above sense; they can be said to "emerge". Therefore, these spurious correlations can also be re-interpreted as "emerging laws." It is important to keep in mind that these "laws" are not properties of a particular sequence; indeed, they exist in *all* sequences, as Van der Waerden's theorem proves. How do the spurious correlations manifest themselves in a number world? By the finite version of Van der Waerden's theorem, the more bits of the sequence describing the number world we can observe, the longer the monochromatic arithmetical progressions it contains. So, once there are (sufficiently many) data, regardless of their intrinsic structure, "laws from nowhere" (*ex nihilo*) emerge. In what follows we will work only with the above definition of spurious correlation.

Are these spurious correlations just simple accidents or more customary phenomena? We can answer this question by analysing the "sizes" of the sets of random sequences/strings in which spurious correlations arise. As our definition of spurious correlation is independent of any theory, in answering the above questions we will use a model of randomness for sequences and strings provided by algorithmic information theory [67,68] which has the same property.

<sup>7</sup> In fact, there is a second trio who are either mutually acquainted or unacquainted [64].

<sup>8</sup> If we interpret 0 and 1 as colours, then the theorem says that in every binary sequence there exist arbitrarily long monochromatic arithmetical progressions.

<sup>9</sup> Again, the proof is not constructive.

<sup>10</sup> The finite version of Van der Waerden's theorem shows that the same phenomenon appears in long enough strings. See more in [46].

First, how "large" is the set of random sequences? If we work with Martin-Löf random sequences<sup>11</sup>, then the answer is "almost all sequences": the probability that a sequence is Martin-Löf random is one.<sup>12</sup> This means that *the probability that an arbitrary sequence does not have spurious correlations is zero.*<sup>13</sup>

Second, as human access to sequences is limited to their finite prefixes, it is necessary to answer the same question for strings: what is the "size" of "random" strings? Using the incompressibility criterion again [46], a string *x* of length *n* is *α*-random if no Turing machine can produce *x* from an input with fewer than *n* − *α* · *n* bits.<sup>14</sup> The number of *α*-random strings *x* of length *n* is larger than $2^n (1 - 2^{-\alpha \cdot n}) + 1$, and hence, with finitely many exceptions, they outnumber the binary strings of length *n* which are not *α*-random.<sup>15</sup> More interestingly, the probability that a string *x* of length *n* is *α*-random is larger than $1 - 2^{-\alpha \cdot n} + 2^{-n}$, an expression which tends exponentially to 1 as *n* tends to infinity. This means that *the probability that an arbitrary string does not have spurious correlations is as close to zero as we wish provided that its length is large enough, that is, excluding finitely many strings.*
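The counting behind these bounds can be sketched as follows. The sketch assumes a fixed machine and, for simplicity, that $m = n - \alpha n$ is an integer; both are simplifications made here, not statements from [46]. There are only $2^m - 1$ binary inputs shorter than $m$ bits, and each produces at most one output, so

$$
\#\{x : |x| = n,\ x \text{ not } \alpha\text{-random}\} \;\le\; \sum_{i=0}^{m-1} 2^i \;=\; 2^{\,n-\alpha n} - 1,
$$

$$
\#\{x : |x| = n,\ x\ \alpha\text{-random}\} \;\ge\; 2^n - 2^{\,n-\alpha n} + 1 \;=\; 2^n\left(1 - 2^{-\alpha n}\right) + 1 .
$$

Dividing by the $2^n$ strings of length $n$ gives the probability bound $1 - 2^{-\alpha n} + 2^{-n}$. This elementary count yields the bounds in non-strict form. For instance, with $\alpha = 1/2$ and $n = 20$ the bound is $1 - 2^{-10} + 2^{-20} \approx 0.99902$.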

Furthermore, the increase of some types of spurious correlations, i.e., emergent "laws", can be quantified: Goodman's inequality [69,70] yields lower bounds on how many spurious correlations are observed as a function of the size of data. Conversely, Pawliuk recently suggested [71] that Goodman's inequality can be utilised for testing the (null) hypothesis that a dataset is random: if the bounds are over-satisfied, the correlations might not be spurious, and thus the dataset might not be stochastic. Can we distinguish between meaningful laws and emerging "laws"? The answer seems to be negative at least from a computational point of view.
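As an illustration, Goodman's bound is commonly stated for two-coloured complete graphs: any colouring of the pairs among *n* points contains at least $\binom{n}{3} - \lfloor \tfrac{n}{2} \lfloor (\tfrac{n-1}{2})^2 \rfloor \rfloor$ monochromatic triangles. The sketch below assumes that form, which is not quoted from [69,70], and uses hypothetical function names. It compares the bound with the count observed in a randomly coloured dataset; it does not implement the test proposed in [71].

```python
import random
from itertools import combinations
from math import comb

def goodman_lower_bound(n):
    """Minimum number of monochromatic triangles in any 2-colouring of K_n."""
    return comb(n, 3) - (n * ((n - 1) ** 2 // 4)) // 2

def count_mono_triangles(n, red_edges):
    """Count monochromatic triangles through the red degrees of the vertices.

    Every non-monochromatic triangle has exactly two vertices at which its
    two edges differ in colour, so summing red * blue degrees counts each
    such triangle twice.
    """
    red_deg = [0] * n
    for a, b in red_edges:
        red_deg[a] += 1
        red_deg[b] += 1
    mixed_angles = sum(r * (n - 1 - r) for r in red_deg)
    return comb(n, 3) - mixed_angles // 2

# Hypothetical dataset: each pair among 50 points coloured by a fair coin.
rng = random.Random(0)
n = 50
red = [pair for pair in combinations(range(n), 2) if rng.random() < 0.5]
print(goodman_lower_bound(n), count_mono_triangles(n, red))
```

For six points the bound equals 2, which matches the remark that a party of six always contains a second uniform trio.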

#### **5. The Emergence of Turing Complete (Universal) Computation**

In view of the "quantification" of information content [67,72], how could complexity and structures, such as universal computation, evolve even in principle? The answer to this question lies in the algorithmic information content (complexity) of the number world.

The proof of Turing completeness<sup>16</sup> of the Game of Life provided by Conway in ([73], Chapter 25, What Is Life?) is a useful method for exploring how complex behaviour like Turing completeness can emerge from very simple rules, in this case, the rules of cellular automata (see more in [74]). With a universal Turing machine and all *α*-random strings one can generate *all* strings [67].
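To make "very simple rules" concrete, here is a minimal sketch of one generation of the Game of Life on an unbounded grid. The set-of-cells representation and the name `life_step` are choices made for this illustration and do not come from [73]. A live cell survives with two or three live neighbours, and a dead cell becomes live with exactly three. The glider, one of the moving patterns on which universality constructions rely, reappears one cell further along the diagonal after four generations.

```python
from collections import Counter

def life_step(live):
    """Advance a set of live (x, y) cells by one generation."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
print(state == {(x + 1, y + 1) for x, y in glider})  # True
```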

Is this phenomenon also possible for sequences, that is, for number worlds? The answer is affirmative. According to a theorem by Kučera-Gács-Hertlinger ([67], p. 179), there effectively exists a process *F*, a continuous computable operator, which generates all sequences from the set of Martin-Löf random sequences: in other words, every sequence is the image under *F* of a Martin-Löf random sequence.

#### **6. Is the World Number Computable?**

Of course, there exist infinitely many (countably many) computable world numbers.

<sup>11</sup> A Turing machine with a prefix-free domain is called self-delimiting. A (self-delimiting) Turing machine which can simulate any other (self-delimiting) Turing machine is called universal. A sequence is Martin-Löf random if there exists a fixed constant such that no finite prefix (string) of the sequence can be compressed by a self-delimiting universal Turing machine by more than that constant [67].

<sup>12</sup> This holds true even constructively.

<sup>13</sup> Probability zero is not the same as impossibility: there exist infinitely many sequences—like the computable ones—which contain no spurious correlations.

<sup>14</sup> The minimum length of an input a Turing machine needs to compute a string of length *n* lies in the interval $(0, n + c)$, where *c* is a fixed constant. From this it follows that $\alpha \in (0, 1)$.

<sup>15</sup> More precisely, when $n \ge 2/\alpha$.

<sup>16</sup> A model of computation is Turing complete—sometimes called universal—if it can simulate a universal Turing machine.

Can we decide whether the sequence describing a given world number is computable? Answering this question is probably impossible both theoretically and empirically. However, we can answer a simpler variant of the question: What is the probability that a world number is computable? If we take as probability the Lebesgue measure [67], then the answer is zero.<sup>17</sup>
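One standard way to see why the measure is zero is a countability argument; this is a sketch added here, and the text itself relies on [67]. There are only countably many algorithms, hence countably many computable sequences $\mathbf{x}_1, \mathbf{x}_2, \dots$ For any $\varepsilon > 0$, cover each $\mathbf{x}_i$ by the set $C_i$ of all sequences sharing its first $n_i$ bits, where $n_i$ is chosen so that $2^{-n_i} \le \varepsilon\, 2^{-i}$. Writing $\mu$ for the Lebesgue measure,

$$
\mu\bigl(\{\mathbf{x}_1, \mathbf{x}_2, \dots\}\bigr) \;\le\; \sum_{i \ge 1} \mu(C_i) \;=\; \sum_{i \ge 1} 2^{-n_i} \;\le\; \sum_{i \ge 1} \varepsilon\, 2^{-i} \;=\; \varepsilon .
$$

Since $\varepsilon$ is arbitrary, the set of computable sequences has measure zero.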

The above result shows that the probability that a world number can be generated by an algorithm is zero. If we weaken the above requirement and ask about the probability that there exists an algorithm which generates infinitely many bits of a world number, then the answer remains the same: this probability is nil. This result follows from a theorem in algorithmic information theory saying that the complement of the above set, namely the set of bi-immune sequences,<sup>18</sup> has probability one [67]. A consequence of this fact, corroborated by an extension of the Kochen-Specker theorem proving value indefiniteness of quantum observables relative to rather weak physical assumptions [75], is that with probability one a number world is produced by repeatedly measuring such a value-indefinite observable.
