*2.3. Maximum-Entropy Models*

To build the MaxEnt model at a given level *λ* of coarse-graining, we replace every word in our corpus with its binary representation, so that the text becomes a binary string. For example, with the coarse-graining in which nouns and verbs are kept and all other words are abstracted into *cat*<sub>3</sub>, we have

$$\text{green colorless ideas sleep furiously} \longrightarrow 001\;001\;100\;010\;001. \tag{11}$$

We denote the *i*-th word in a text by *w*(*i*). Its grammatical class at the description level *λ* is denoted:

$$
c^\lambda(i) \equiv \pi^\lambda(w(i)),
\tag{12}
$$

and its binary representation:

$$
\sigma^{\lambda}(i) \equiv \tilde{\pi}^{\lambda}(w(i)). \tag{13}
$$

Both mappings *π<sup>λ</sup>* and *π̃<sup>λ</sup>* contain the same information, and both of them play the role of *π* in Figure 1. Note that *c<sup>λ</sup>*(*i*) = *c<sup>λ</sup><sub>j</sub>* for some *j*, and that while *i* ∈ {1, ... , *N<sub>w</sub>*} indexes words as they appear in a text (of length *N<sub>w</sub>*), *j* ∈ {1, ... , *N<sup>λ</sup>*} indexes unique grammatical classes in *χ<sup>λ</sup>*. Each binary representation consists of *N<sup>λ</sup>* bits. When necessary, we will use a subindex *k* and write *σ<sup>λ</sup><sub>j,k</sub>* for the *k*-th bit of the *j*-th class's binary representation at a given coarse-graining level *λ*.
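
The encoding step can be illustrated with a minimal sketch. It assumes a one-hot code (a single 1 per class), which is what Equation (11) suggests but is not stated explicitly; the toy lexicon `WORD_TO_CLASS`, the class labels, and the function names are hypothetical and chosen only to reproduce that example.

```python
# Hypothetical toy lexicon: word -> grammatical class at this coarse-graining level.
WORD_TO_CLASS = {
    "green": "other",
    "colorless": "other",
    "ideas": "noun",
    "sleep": "verb",
    "furiously": "other",
}

# The class order fixes which bit is set; chosen here to match Equation (11).
CLASSES = ["noun", "verb", "other"]


def one_hot(class_name, classes=CLASSES):
    """Return a len(classes)-bit string with a single 1 at the class position."""
    return "".join("1" if c == class_name else "0" for c in classes)


def encode(words):
    """Map each word to the binary representation of its grammatical class."""
    return [one_hot(WORD_TO_CLASS[w]) for w in words]


encoded = encode("green colorless ideas sleep furiously".split())
print(" ".join(encoded))  # 001 001 100 010 001
```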

We next produce binary samples that include each word and the one next to it in a text: *σ<sup>λ</sup>*(*i*)|*σ<sup>λ</sup>*(*i* + 1), where ·|· indicates concatenation. Thus, the coarse-grained sentence from Equation (11) yields the samples:

$$\{001001, 001100, 100010, 010001\}.\tag{14}$$
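
As a sketch of the concatenation step, the snippet below pairs each binary word with its successor. The function name `bigram_samples` is an illustrative choice, and the input is simply the encoded sentence from Equation (11) written out by hand.

```python
def bigram_samples(codes):
    """Concatenate each binary word with the next one: sigma(i)|sigma(i+1)."""
    return [left + right for left, right in zip(codes, codes[1:])]


# Encoded sentence from Equation (11).
codes = ["001", "001", "100", "010", "001"]
samples = bigram_samples(codes)
print(samples)  # ['001001', '001100', '100010', '010001']
```

A text of *N<sub>w</sub>* words yields *N<sub>w</sub>* − 1 such samples, since the last word has no successor.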

Each sample has size 2*N<sup>λ</sup>* (when needed, the index *k* over bits will also label positions from 1 to 2*N<sup>λ</sup>*). Large corpora will produce huge collections of such samples. We can summarize these collections by giving the empirical frequency *F*(*σ<sup>λ</sup><sub>j</sub>*|*σ<sup>λ</sup><sub>j′</sub>*) with which each of the (*N<sup>λ</sup>*)<sup>2</sup> possible bit strings of length 2*N<sup>λ</sup>* shows up. These collections behave like samples of what are known in statistical mechanics as spin glasses. Powerful mathematical tools exist to infer MaxEnt models for spin glasses, which is the reason for encoding the text in this way.
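
The summary by empirical frequencies can be sketched as follows. This assumes that *F* is the relative frequency of each sample (counts divided by the total number of samples); the function name `empirical_frequencies` is hypothetical, and the input is the small set of samples from Equation (14) rather than a real corpus.

```python
from collections import Counter


def empirical_frequencies(samples):
    """Relative frequency of each distinct bit string among the samples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {sample: n / total for sample, n in counts.items()}


# Samples from Equation (14); a real corpus would supply far more.
samples = ["001001", "001100", "100010", "010001"]
freqs = empirical_frequencies(samples)

# Each of the four observed strings appears once, so each has frequency 0.25.
# Strings that never occur are absent from the dictionary (frequency 0).
print(freqs["001001"])          # 0.25
print(freqs.get("100100", 0.0)) # 0.0
```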
