*3.1. Envariance*

A pure entangled state of a system $\mathcal{S}$ and another system (which we call "an environment $\mathcal{E}$", anticipating connections with decoherence) can always be written as:

$$
|\psi\_{\mathcal{S}\mathcal{E}}\rangle = \sum\_{k=1}^{N} a\_k |s\_k\rangle |\varepsilon\_k\rangle \,. \tag{3.1}
$$

Here $a\_k$ are complex amplitudes while $\{|s\_k\rangle\}$ and $\{|\varepsilon\_k\rangle\}$ are orthonormal states in the Hilbert spaces $\mathcal{H}\_{\mathcal{S}}$ and $\mathcal{H}\_{\mathcal{E}}$. This *Schmidt decomposition* of a pure entangled $|\psi\_{\mathcal{SE}}\rangle$ is a consequence of a theorem of linear algebra that predates quantum theory.

Schmidt decomposition demonstrates that any pure entangled bipartite state is a superposition of *perfectly correlated outcomes* of judiciously chosen measurements on each subsystem: Detecting $|s\_k\rangle$ on $\mathcal{S}$ implies, with certainty, outcome $|\varepsilon\_k\rangle$ for $\mathcal{E}$, and *vice versa*.
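The Schmidt decomposition can be checked numerically. The following minimal sketch (an illustration assuming numpy, not part of the original text) obtains the Schmidt form of a random bipartite pure state from the singular value decomposition of its coefficient matrix:

```python
import numpy as np

# Illustrative check: a bipartite pure state sum_ij c_ij |i>|j> has Schmidt
# form Eq. (3.1) given by the SVD of the coefficient matrix c.
rng = np.random.default_rng(0)
c = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
c /= np.linalg.norm(c)                    # normalize the pure state

u, a, vh = np.linalg.svd(c)               # c = u @ diag(a) @ vh
# Schmidt states: |s_k> has components u[:, k], |eps_k> has components
# vh[k, :]; the amplitudes a_k are real and non-negative here.
reconstructed = sum(a[k] * np.outer(u[:, k], vh[k, :]) for k in range(3))
assert np.allclose(reconstructed, c)      # state is rebuilt from Schmidt form
assert np.isclose(np.sum(a**2), 1.0)      # the |a_k|^2 sum to one
```

The number of nonzero singular values is the Schmidt rank; for a product state it is one, and entanglement corresponds to rank greater than one.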

Even readers unfamiliar with Equation (3.1) have likely relied on its consequences: The Schmidt basis $\{|s\_k\rangle\}$ appears on the diagonal of the reduced density matrix $\rho\_{\mathcal{S}} = {\rm Tr}\_{\mathcal{E}} |\psi\_{\mathcal{SE}}\rangle\langle\psi\_{\mathcal{SE}}|$. (We have used this fact "in reverse" in the preceding section to "purify" mixed states.) But tracing is tantamount to averaging over the states of the traced-out systems with weights given by the absolute squares of their amplitudes. Therefore, the physical interpretation of the resulting reduced density matrix (which is central in the usual treatments of decoherence) presumes Born's rule, which we aim to derive (see [5] for a discussion of how Born's rule is used to justify the physical significance of reduced density matrices). Consequently, to avoid circularity, in this section we shall not employ the tools of decoherence or rely on the statistical interpretation of $\rho\_{\mathcal{S}}$. Instead, we derive Born's rule from the symmetries of $|\psi\_{\mathcal{SE}}\rangle$.

Symmetries reflect invariance. Rotations of a circle by an arbitrary angle, or of a square by multiples of $\pi/2$, are familiar examples. Entangled quantum states exhibit a new kind of symmetry—*entanglement-assisted invariance* or *envariance*: When a state $|\psi\_{\mathcal{SE}}\rangle$ of a pair $\mathcal{S}$, $\mathcal{E}$ can be transformed by $U\_{\mathcal{S}} = u\_{\mathcal{S}} \otimes \mathbf{1}\_{\mathcal{E}}$ acting solely on $\mathcal{S}$,

$$
U\_{\mathcal{S}} \, |\psi\_{\mathcal{S}\mathcal{E}}\rangle = (u\_{\mathcal{S}} \otimes \mathbf{1}\_{\mathcal{E}}) \, |\psi\_{\mathcal{S}\mathcal{E}}\rangle = |\eta\_{\mathcal{S}\mathcal{E}}\rangle \,, \tag{3.2}
$$

but the effect of $U\_{\mathcal{S}}$ can be undone by acting solely on $\mathcal{E}$ with an appropriately chosen $U\_{\mathcal{E}} = \mathbf{1}\_{\mathcal{S}} \otimes u\_{\mathcal{E}}$:

$$
U\_{\mathcal{E}} \, |\eta\_{\mathcal{S}\mathcal{E}}\rangle = (\mathbf{1}\_{\mathcal{S}} \otimes u\_{\mathcal{E}}) \, |\eta\_{\mathcal{S}\mathcal{E}}\rangle = |\psi\_{\mathcal{S}\mathcal{E}}\rangle \,, \tag{3.3}
$$

then $|\psi\_{\mathcal{SE}}\rangle$ is called *envariant* under $U\_{\mathcal{S}}$ [7,35].

Envariance can be seen in any entangled $|\psi\_{\mathcal{SE}}\rangle$. Any unitary operation diagonal in the Schmidt basis $\{|s\_k\rangle\}$:

$$u\_{\mathcal{S}} = \sum\_{k=1}^{N} \exp\left(i\phi\_k\right) |s\_k\rangle\langle s\_k| \tag{3.4a}$$

is envariant: It can be undone by a *countertransformation*

$$u\_{\mathcal{E}} = \sum\_{k=1}^{N} \exp\left(-i\phi\_{k}\right)|\varepsilon\_{k}\rangle\langle\varepsilon\_{k}| \,,\tag{3.4b}$$

acting solely on the environment.
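A minimal numerical illustration of Equations (3.4) (an assumed example using numpy; the amplitudes and phases are arbitrary choices, not from the text): the phase unitary $u\_{\mathcal{S}}$ alters the global state, and the counter-rotation $u\_{\mathcal{E}}$, acting on $\mathcal{E}$ alone, restores it:

```python
import numpy as np

# Two-term Schmidt state psi[s, e] = a_s when e == s (Eq. (3.1)).
a = np.array([0.6, 0.8])                      # arbitrary Schmidt amplitudes
psi = np.diag(a).astype(complex)              # ~ sum_k a_k |s_k>|eps_k>

phases = np.array([0.3, -1.1])                # arbitrary phases phi_k
u_S = np.diag(np.exp(1j * phases))            # Eq. (3.4a), acts on S
u_E = np.diag(np.exp(-1j * phases))           # Eq. (3.4b), acts on E

transformed = u_S @ psi          # u_S acts on the first (system) index
restored = transformed @ u_E.T   # u_E acts on the second (environment) index
assert not np.allclose(transformed, psi)      # global state was changed...
assert np.allclose(restored, psi)             # ...and restored by acting on E
```

Note that this works for arbitrary (not necessarily equal) amplitudes $a\_k$, matching the claim that Schmidt-diagonal phase unitaries are envariant for any entangled state.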

In contrast to familiar symmetries (where a transformation has no effect on the state of an object) envariance is an *assisted symmetry*: The global state of $\mathcal{SE}$ *is* transformed by $U\_{\mathcal{S}}$, but it can be restored by acting on $\mathcal{E}$, physically distinct (e.g., spatially separated) from $\mathcal{S}$. When the global state of $\mathcal{SE}$ is envariant under some $U\_{\mathcal{S}}$, the local state of $\mathcal{S}$ alone must obviously be invariant under it. There are analogies between envariance and gauge symmetries, with $\mathcal{E}$ assuming the role of a gauge field.

Entangled states might seem an unusual starting point for the study of probabilities. After all, the textbook statement of Born's rule deals with pure states of individual systems. Nevertheless, already in Schrödinger's famous "cat" paper [20] the discussion of entangled quantum states leads to the realization that "Best possible knowledge of the whole does not necessarily lead to the same for its parts...", as well as "The whole is in a definite state, the parts taken individually are not". Our aim is to recast such essentially negative qualitative statements (which can be read as a realization of the limitations imposed by the tensor structure of quantum states on their predictive powers) into a derivation of Born's rule, the quantitative tool used for predictions.

Entanglement is also the essence of decoherence responsible for the emergence of "the classical" in a quantum Universe. It is therefore natural to investigate symmetries of entangled quantum states, and explore their implication for how much can be known about parts when the whole is entangled. In addition, as we shall see below, envariance allows one to reassess the role of the environment and sheds new light on the origin of decoherence.

#### *3.2. Decoherence as a Result of Envariance*

Envariance of entangled states leads to our first conclusion: Changes of the phases of Schmidt coefficients, induced by the local (Schmidt) unitaries of Equations (3.4), are envariant. Therefore, when a composite system is in an entangled state $|\psi\_{\mathcal{SE}}\rangle$, the state of $\mathcal{S}$ (or $\mathcal{E}$) alone is *invariant under the change of phases* of $a\_k$. In other words, the *state* of $\mathcal{S}$ (understood as the set of all measurable properties of $\mathcal{S}$ alone) cannot depend on the phases of Schmidt coefficients: It can depend only on their absolute values and on the outcome states—on the set of pairs $\{|a\_k|, |s\_k\rangle\}$. In particular (as we demonstrate below) probabilities cannot depend on these phases.

We have just seen that the loss of phase coherence between Schmidt states—decoherence—is a consequence of envariance: Decoherence is, after all, a selective loss of relevance of phases for the state of $\mathcal{S}$. We stumbled here on its essence while exploring unfamiliar territory, without the usual backdrop of dynamics and without employing the trace and reduced density matrices.

However, decoherence viewed from the vantage point of envariance may look unfamiliar. What other landmarks of decoherence can we find without using the trace and reduced density matrices (which rely on Born's rule, something we do not yet have)? The answer is—all the essential ones [42,48,49]. We have already seen in Section 2 that pointer states (states that retain correlations, are predictable, and, hence, good candidates for the classical domain) are singled out directly by repeatability—by the nature of information transfers. So, we already have a set of preferred pointer states, and we have seen that when they are aligned with the Schmidt basis, phases between them lose relevance for $\mathcal{S}$ alone. Indeed, models of decoherence ([7,24,25,27–29,32,33]) predict that after a brief (decoherence time) interlude the Schmidt basis will typically settle down to coincide with pointer states determined through other criteria (such as predictability in spite of the coupling to the environment). There are exceptions that we have already mentioned in the previous section. They tend to arise when decoherence is incomplete, and/or when the reduced density matrix (that is diagonalized by Schmidt states) is so close to maximally mixed (i.e., some of its eigenvalues are nearly degenerate) that any complete set of states (including pointer states selected for their predictability) can express it while leaving it in a nearly diagonal form (see, e.g., [85]).

This encounter with decoherence on the way to Born's rule is a good omen: Quantum phases must be rendered irrelevant for additivity of probabilities to replace additivity of complex amplitudes. Of course, one could postulate additivity of probabilities by fiat. This was done by Gleason (1957) [84], but such an assumption is at odds with the overarching additivity principle of quantum mechanics—with the principle of superposition (as is illustrated by the double slit experiment). So, if we set out to understand the emergence of the classical domain from the quantum substrate defined by axioms (o)–(iii), additivity of probabilities should be derived (as it is done in Laplace's approach, see Gnedenko, 1968 [52]) rather than imposed as an axiom (as happens in Kolmogorov's measure-theoretic approach, and in Gleason's theorem).

Assuming decoherence to get $p\_k = |\psi\_k|^2$ (Zurek, 1998 [36]; Deutsch, 1999 [86]; Wallace, 2003 [87]) would mean, at best, starting halfway, and raises concerns of circularity [7,32,35,48,88–90], as the physical significance of the reduced density matrix—the standard tool in the usual treatment of decoherence—is justified using Born's rule. By contrast, the envariant derivation, if successful, can be fundamental, independent of the usual tools of decoherence: It can justify, starting from the basic quantum postulates, the use of the trace and the physical significance of reduced density matrices in the study of the quantum-classical transition.

As the perceptive analysis by Drezet (2021) [91] shows, envariance has recently been adopted (Wallace, 2010; 2012) [19,92] even in the (modified) decision theory approach, which initially dealt with states of a single system and, therefore, ran into circularity of the argument by invoking decoherence before establishing Born's rule ([35,48,89,90]). We noted this problem already and will discuss it briefly below, in Section 3.7.

#### *3.3. Swaps, Counterswaps, and Equiprobability*

Envariance of pure states is purely quantum: The state of a composite classical system is given by a Cartesian (rather than tensor) product of the states of its constituents. Therefore, to completely know the state of a composite classical system one must know the state of each subsystem. It follows that when one part of a classical composite system is affected by a transformation—a classical analogue of a swap $u\_{\mathcal{S}}$—the state of the whole cannot be restored by acting on some other part. Hence, *pure classical states are never envariant*.

However, a mixed state (of, say, two coins) can mimic envariance: When we only know that a dime and a nickel are 'same side up', we can 'undo' the effect of the flip of a dime by flipping a nickel. This classical analogue depends on partial ignorance: To emulate envariance, we cannot know individual states of the two coins—just the fact that they are the same side up—just their correlation.

In quantum physics, the tensor structure of composite-system states means that 'pure correlation' is possible. We shall now prove that a maximally entangled "even" state with equal absolute values of the Schmidt coefficients:

$$|\psi\_{\mathcal{SE}}\rangle \propto \sum\_{k=1}^{N} e^{-i\phi\_k} |s\_k\rangle |\varepsilon\_k\rangle \tag{3.5}$$

*implies* equal probabilities for any orthonormal set of states $|s\_k\rangle$ of $\mathcal{S}$ and any corresponding set of $|\varepsilon\_k\rangle$ of $\mathcal{E}$.

Such an *even* state is envariant under a *swap* operation

$$
u\_{\mathcal{S}}(k \rightleftharpoons l) = |s\_{k}\rangle\langle s\_{l}| + |s\_{l}\rangle\langle s\_{k}| \,. \tag{3.6a}
$$

A swap is a quantum version of the operation that exchanges two cards (Figure 2). It is a unitary that permutes states $|s\_k\rangle$ and $|s\_l\rangle$ of the system. A swap $|{\rm Heads}\rangle\langle{\rm Tails}| + |{\rm Tails}\rangle\langle{\rm Heads}|$ would flip a coin.

A swap on $\mathcal{S}$ is envariant when $|a\_k| = |a\_l|$ because $u\_{\mathcal{S}}(k \rightleftharpoons l)$ can be undone by a *counterswap* on $\mathcal{E}$:

$$u\_{\mathcal{E}}(k \rightleftharpoons l) = e^{i(\phi\_k - \phi\_l)} |\varepsilon\_l\rangle\langle\varepsilon\_k| + e^{-i(\phi\_k - \phi\_l)} |\varepsilon\_k\rangle\langle\varepsilon\_l| \,,\tag{3.6b}$$

as is seen in Figure 2c.
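The swap-counterswap pair of Equations (3.6) can also be verified numerically. In this sketch (an assumed two-state example, not from the paper) the state is even, so the counterswap on $\mathcal{E}$ exactly restores the state swapped on $\mathcal{S}$:

```python
import numpy as np

# Even state of Eq. (3.5): psi[s, e] = exp(-i phi_s)/sqrt(2) when e == s.
phi = np.array([0.4, 2.0])                      # arbitrary phases
psi = np.diag(np.exp(-1j * phi)) / np.sqrt(2)

X = np.array([[0, 1], [1, 0]], dtype=complex)   # swap u_S(0 <-> 1), Eq. (3.6a)
d = phi[0] - phi[1]
u_E = np.array([[0, np.exp(-1j * d)],
                [np.exp(1j * d), 0]])           # counterswap, Eq. (3.6b)

swapped = X @ psi                               # acting on S alters the state
assert not np.allclose(swapped, psi)
restored = swapped @ u_E.T                      # acting on E restores it
assert np.allclose(restored, psi)
```

For uneven amplitudes ($|a\_0| \neq |a\_1|$) no unitary on $\mathcal{E}$ can undo the swap, which is why envariance of swaps singles out states with coefficients of equal magnitude.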

We want to *prove* that probabilities of envariantly swappable outcome states must be equal. But let us proceed with caution: Invariance under a swap is not enough—probability could depend on some other 'intrinsic' property of the state. For instance, in a superposition $|g\rangle + |e\rangle$, the ground and excited states can be invariantly swapped, as $|g\rangle + |e\rangle = |e\rangle + |g\rangle$, but the *energies* of $|g\rangle$ and $|e\rangle$ are different. Why should probability—unlike energy—not depend on some intrinsic property<sup>5</sup> of the state?

Envariance can be used to prove that this cannot happen—that probabilities of envariantly swappable states are indeed equal. To this end we first define what is meant by "the state" and "the system" more carefully. The quantum role of these concepts is elucidated by three "Facts"—three definitions that recognize what is known about systems and their (mixed) states, but phrase it in a way that does not appeal to Born-rule dependent tools and concepts (e.g., reduced density matrices):

**Fact 1:** *Unitary transformations must act on a system to alter its state (when they act only on other systems, the state of the system of interest remains the same).*

**Fact 2:** *The state of the system $\mathcal{S}$ is all that is needed (and all that is available) to predict measurement results, including probabilities of outcomes of measurements on $\mathcal{S}$.*

**Fact 3:** *The state of a larger composite system that includes $\mathcal{S}$ as a subsystem determines the state of $\mathcal{S}$.*
Note that the states defined this way need not be pure. In addition, note that Facts—while 'naturally quantum'—are not in conflict with the role of states in classical physics.

Facts are a consequence of quantum theory. They are not, in any sense, additional assumptions. Rather, they clarify operational meanings of concepts (such as "a state") that will play a key role in the derivation of Born's rule: Facts are the attributes that any sensible notion of a "state" (and, in particular, a "mixed state") should possess. They also help

<sup>5</sup> In a sense, hidden variable theories (such as the Bohm-de Broglie approach) use a strategy of "tagging" an element of the total superposition in this manner. Therefore, hidden variable theories can violate assumptions of our derivation, as they allow physical properties of a state that determine measurement outcomes to depend on more than just the state vector.

distinguish the purely quantum view based on the core quantum postulates from, e.g., hidden variable theories (where, for example, Fact 2 would not hold).

We can now *prove*:

**Theorem 2.** *When Schmidt coefficients satisfy $|a\_k| = |a\_l|$ in an even state $|\bar{\psi}\_{\mathcal{SE}}\rangle \propto \sum\_{k=1}^{N} e^{-i\phi\_k} |s\_k\rangle|\varepsilon\_k\rangle$, Equation (3.5), the local state of $\mathcal{S}$ is invariant under a swap $u\_{\mathcal{S}}(k \rightleftharpoons l) = |s\_k\rangle\langle s\_l| + |s\_l\rangle\langle s\_k|$.*

**Proof.** A swap changes partners in the Schmidt decomposition (and, therefore, alters the global state). However, when the coefficients of the swapped outcomes differ only by a phase, a swap can be undone (without acting on $\mathcal{S}$) by a corresponding counterswap, Equation (3.6b), in $\mathcal{E}$. As the global state of $\mathcal{SE}$ is restored, it follows (from Fact 3) that the local state of $\mathcal{S}$ must also have been restored. However, (by Fact 1) the state of $\mathcal{S}$ could not have been affected by a counterswap acting only on $\mathcal{E}$. So, (by Fact 2) the state of $\mathcal{S}$ must be left intact by a swap, in $\mathcal{S}$, of Schmidt states that have the same absolute values of Schmidt coefficients.

We conclude that envariance of a pure global state of $\mathcal{SE}$ under swaps implies invariance of the corresponding local state of $\mathcal{S}$. We could now follow Laplace, appeal to subjective indifference, apply equal likelihood, and "declare victory"—claim that subjective probabilities must be equal. However, as we have just seen with the example of the superposition $|g\rangle + |e\rangle$ of energy eigenstates, invariance of a local state under a swap implies only that the property associated with the swapped states (i.e., energies or probabilities) gets swapped when the states are swapped. Without an appeal to subjective ignorance this does not yet establish that the properties of interest (probabilities or energies) must be equal.

Entanglement (via envariance) allows us to get rid of such subjectivity altogether. The simplest way to prove the desired equality of probabilities is based on the perfect correlation between the Schmidt states of $\mathcal{S}$ and $\mathcal{E}$. These relative Schmidt states in, e.g., Equations (3.1) or (3.5) are orthonormal, and $|s\_k\rangle$ are correlated with $|\varepsilon\_k\rangle$ one-to-one. This implies the same probability for each member of every such Schmidt pair. Moreover (and for the very same reason—perfect correlation) after a swap on $\mathcal{S}$ the probabilities of the swapped states must be the same as the probabilities of their two new partners in $\mathcal{E}$. That is, after a swap $u\_{\mathcal{S}}(k \rightleftharpoons l) = |s\_k\rangle\langle s\_l| + |s\_l\rangle\langle s\_k|$ the probability of $|s\_k\rangle$ must be the same as the probability of $|\varepsilon\_l\rangle$, and the probability of $|s\_l\rangle$ must be the same as that of $|\varepsilon\_k\rangle$.

We now focus on an even state, Equation (3.5). It is envariant under all local unitaries (and, hence, under all swaps). Thus (by Fact 1) the state of $\mathcal{E}$ (and, by Fact 2, the probabilities it implies) is not affected by swaps in $\mathcal{S}$. So, *swapping Schmidt states of* $\mathcal{S}$ *exchanges their probabilities, and when the state is even it also keeps them the same!* This can be true only when the probabilities of envariantly swappable states are equal.

We can now state our conclusion:

**Corollary 1.** *When all N coefficients in the Schmidt decomposition have the same absolute values (as in the even states of Equation (3.5)), the probability of each Schmidt state must be the same and, therefore, by normalization, it is $p\_k = 1/N$.*

Readers may regard this as obvious, but (as recognized by Barnum (2003) [93], Schlosshauer and Fine (2005) [94], Drezet (2021) [91] and others) this is the key to Born's rule. Equation (3.5) is envariant under swaps. This symmetry allowed us to extract physical consequences from quantum mathematics with a very minimal set of ingredients at hand. In the language of the Kolmogorov measure-theoretic axioms we have now established that—when the entangled state is even, Equation (3.5)—positive numbers (the 'measures' of probability for events corresponding to individual Schmidt states) must be equal.

Still, this may seem like a lot of work to arrive at something seemingly obvious. The case of unequal coefficients is our next goal; however—as we will see—it can be reduced to the equal coefficient case we have just settled. The symmetry of entanglement inherent in the equal coefficients case provides the crucial link between the quantum state vectors and the experimental consequences. Simple algebra along with the special case of probability—certainty—will lead us directly to Born's rule<sup>6</sup>.

We emphasize that, in contrast to many other approaches to both classical and quantum probability, our envariant derivation is based not on a subjective assessment of an observer, but on an objective, experimentally verifiable symmetry of entangled states. The observer is forced to conclude that probabilities of local outcomes are equal not because of subjective ignorance, but because of certainty about something else: Certainty about the symmetries of the global state of the composite system implies—via symmetries of entanglement encapsulated in envariance—that local Schmidt states are equiprobable<sup>7</sup>.

Envariant probability is also a probability of an individual event—there is no need for an ensemble of many events, so that relative frequency of "favorable" events can be used to define probability. Rather, we are decomposing the future possibilities into equiprobable alternatives, and deducing probability from the ratio of the number of favorable alternatives to the total. Above, just one favorable out of *N* equiprobable alternatives leads to *pk* = 1/*N* for even states. We now extend this approach to general states.

#### *3.4. Born's Rule from Envariance*

To illustrate the general "fine-graining" strategy of reducing cases with unequal coefficients to the previously described equal coefficient case (see (Zurek, 1998) [36] for its density matrix version) we start with an example involving a two-dimensional Hilbert space of the system spanned by states $\{|0\rangle, |2\rangle\}$ and (at least) a three-dimensional Hilbert space of the environment:

$$|\psi\_{\mathcal{SE}}\rangle = \sqrt{\frac{2}{3}}\,|0\rangle\_{\mathcal{S}}|+\rangle\_{\mathcal{E}} + \sqrt{\frac{1}{3}}\,|2\rangle\_{\mathcal{S}}|2\rangle\_{\mathcal{E}}\,.\tag{3.7a}$$

The system is represented by the leftmost kets, and $|+\rangle\_{\mathcal{E}} = (|0\rangle\_{\mathcal{E}} + |1\rangle\_{\mathcal{E}})/\sqrt{2}$ exists in an (at least two-dimensional) subspace of $\mathcal{E}$ that is orthogonal to the state $|2\rangle\_{\mathcal{E}}$, so that $\langle 0|1\rangle\_{\mathcal{E}} = \langle 0|2\rangle\_{\mathcal{E}} = \langle 1|2\rangle\_{\mathcal{E}} = \langle +|2\rangle\_{\mathcal{E}} = 0$. We already know we can ignore phases in view of their irrelevance for the states of subsystems, so we omitted them above. From now on we shall also drop the overall normalization, as the probabilities will only depend on the relative values of the coefficients associated with the alternatives.

To reduce this case to an even state we extend the "uneven" $|\psi\_{\mathcal{SE}}\rangle$ above to a state $|\Psi\_{\mathcal{SCE}}\rangle$ with equal coefficients by letting $\mathcal{E}$ act on an ancilla $\mathcal{C}$. (By Fact 1, since $\mathcal{S}$ is not acted upon, the probabilities we shall infer for it cannot change.) The transformation into an even state can be accomplished by a generalization of the controlled-not acting between $\mathcal{E}$ (control) and $\mathcal{C}$ (target), so that (in an obvious notation):

$$|k\rangle|0'\rangle \Rightarrow |k\rangle|k'\rangle \,,$$

<sup>6</sup> There is an amusing corollary to the above corollary: One can now *prove* that states appearing in the Schmidt decomposition with vanishing coefficients (that is, states with $a\_k = 0$) have vanishing probability, $p\_k = 0$: Consider a part of the sum representing the Schmidt decomposition given by $a(|0\rangle|0\rangle + |1\rangle|1\rangle + |2\rangle|2\rangle)$, where the first ket corresponds to $\mathcal{S}$ and the second to $\mathcal{E}$. Let us now carry out a c-not on the first two terms: $|0\rangle|0\rangle + |1\rangle|1\rangle \Rightarrow |0\rangle|0\rangle + |1\rangle|0\rangle = (|0\rangle + |1\rangle)|0\rangle = \sqrt{2}\,|+\rangle|0\rangle$. Note that $|+\rangle$ is normalized, and the initial sequence is transformed as $a(|0\rangle|0\rangle + |1\rangle|1\rangle + |2\rangle|2\rangle) \Rightarrow \sqrt{2}\,a|+\rangle|0\rangle + a|2\rangle|2\rangle$ (that is, it has one less term). However, as $a = 0$, $\sqrt{2}\,a = a = 0$. Therefore, $a(|0\rangle|0\rangle + |1\rangle|1\rangle + |2\rangle|2\rangle) = a(|+\rangle|0\rangle + |2\rangle|2\rangle)$. The operations we have employed could not have changed the probability assigned to this part of the Schmidt decomposition. Yet, there are now only two rather than three terms with equal coefficients (hence, equal probability) in that sum. More generally, there will be $n - 1$ rather than $n$ terms when analogous operations involve a sum with $n$ vanishing coefficients. So the probability $p$ of any state with zero amplitude would have to satisfy $np = (n-1)p$ (i.e., $3p = 2p$ in our example). This equality holds only when $p = 0$. (Note that this argument would obviously fail—as it should—when $a \neq 0$ in $a|0\rangle|0\rangle + a|1\rangle|1\rangle + a|2\rangle|2\rangle$.)

<sup>7</sup> We leave it to the reader to find out why this strategy proves the equality of probabilities, but fails to establish equality of energies in an "even" entangled state such as $|e\rangle|e\rangle + |g\rangle|g\rangle$.

leading to:

$$\left(\sqrt{2}\,|0\rangle|+\rangle+|2\rangle|2\rangle\right)|0'\rangle \Longrightarrow \sqrt{2}\,|0\rangle\frac{|0\rangle|0'\rangle+|1\rangle|1'\rangle}{\sqrt{2}}+|2\rangle|2\rangle|2'\rangle=|\Psi\_{\mathcal{SCE}}\rangle\,.\tag{3.8a}$$

Above, and from now on, we skip subscripts: The state of $\mathcal{S}$ will always be listed first, and the state of $\mathcal{C}$ will be primed. The cancellation of $\sqrt{2}$ makes it obvious that this is an equal coefficient ("even") state:

$$|\Psi\_{\mathcal{SCE}}\rangle \propto |0,0'\rangle|0\rangle + |0,1'\rangle|1\rangle + |2,2'\rangle|2\rangle. \tag{3.9a}$$

Note that we have now combined the states of $\mathcal{S}$ and $\mathcal{C}$, and (in the next step) we shall swap states of $\mathcal{SC}$ together.

Clearly, for the joint states $|0,0'\rangle$, $|0,1'\rangle$, and $|2,2'\rangle$ of $\mathcal{SC}$ this is a Schmidt decomposition of $(\mathcal{SC})\mathcal{E}$. The three orthonormal product states have coefficients with the same absolute value. Therefore, they can be envariantly swapped. It follows that the probabilities of these Schmidt states—$|0\rangle|0'\rangle$, $|0\rangle|1'\rangle$, and $|2\rangle|2'\rangle$—are all equal, so by normalization they are $1/3$. Moreover, and for the same envariant reason, the probability of state $|2\rangle$ of the system is $1/3$. As $|0\rangle$ and $|2\rangle$ are the only two outcome states for $\mathcal{S}$, it also follows that the probability of $|0\rangle$ must be $2/3$. Consequently:

$$p\_0 = \frac{2}{3};\ p\_2 = \frac{1}{3}.\tag{3.10a}$$


This is Born's rule! Probabilities are proportional to the squares of the amplitudes from $|\psi\_{\mathcal{SE}}\rangle$, Equation (3.7a).

Note that above we have avoided assuming additivity of probabilities: $p\_0 = 2/3$ not because it is a sum of two fine-grained alternatives each with probability $1/3$, but rather because there are only two (mutually exclusive and exhaustive) alternatives for $\mathcal{S}$, $|0\rangle$ and $|2\rangle$, and it was separately established that $p\_2 = 1/3$. So, by normalization, $p\_0 = 1 - 1/3$.

Bypassing the appeal to additivity of probabilities is essential in interpreting a theory with another principle of additivity—the quantum superposition principle—which trumps additivity of probabilities, or at least classical intuitive ideas about what should be additive (e.g., in the double slit experiment). Here, this conflict is averted: Probabilities of Schmidt states can be added because of the loss of phase coherence that follows directly from envariance, as we have established earlier and as was discussed in (Zurek, 2005) [48]. We return to this point in the additivity Lemma below.
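The fine-graining of Equations (3.7a)-(3.9a) can be checked numerically. The sketch below (an assumed implementation using numpy; the gate construction is one of several possible unitary extensions of $|k\rangle|0'\rangle \Rightarrow |k\rangle|k'\rangle$) verifies that the resulting state has three equal coefficients and reproduces Equation (3.10a):

```python
import numpy as np

s = np.eye(2)                       # system states |0>, |2>
e = np.eye(3)                       # environment basis |0>, |1>, |2>
c = np.eye(3)                       # ancilla basis |0'>, |1'>, |2'>
plus = (e[0] + e[1]) / np.sqrt(2)   # |+> = (|0> + |1>)/sqrt(2)

# Uneven state of Eq. (3.7a), factor order S, E:
psi = np.sqrt(2/3) * np.kron(s[0], plus) + np.sqrt(1/3) * np.kron(s[1], e[2])

def copy_gate():
    """E-controlled permutation on C implementing |k>|0'> => |k>|k'>."""
    blocks = []
    for k in range(3):
        perm = np.eye(3)
        if k:
            perm[[0, k]] = perm[[k, 0]]   # a unitary extension of |0'> -> |k'>
        blocks.append(np.kron(np.outer(e[k], e[k]), perm))
    return np.kron(np.eye(2), sum(blocks))    # trivial on S; acts on E, C

state = copy_gate() @ np.kron(psi, c[0])      # attach ancilla in |0'>
amps = state.reshape(2, 3, 3)                 # indices (S, E, C)

nonzero = amps[~np.isclose(amps, 0)]
assert len(nonzero) == 3                      # three fine-grained terms...
assert np.allclose(nonzero, 1/np.sqrt(3))     # ...with equal coefficients
p2 = np.sum(np.abs(amps[1])**2)               # probability of |2> for S
assert np.isclose(p2, 1/3) and np.isclose(1 - p2, 2/3)   # Eq. (3.10a)
```

The three equal-amplitude terms are the envariantly swappable alternatives of Equation (3.9a); counting them gives $p\_2 = 1/3$ and, by normalization, $p\_0 = 2/3$.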

Consider now a more general case of arbitrary coefficients. For simplicity we focus on an entangled state with just two non-zero Schmidt coefficients:

$$|\psi\_{\mathcal{SE}}\rangle = \alpha|0\rangle|\varepsilon\_0\rangle + \beta|1\rangle|\varepsilon\_1\rangle \,, \tag{3.7b}$$

and assume $\alpha = \sqrt{\frac{\mu}{\mu+\nu}}$; $\beta = \sqrt{\frac{\nu}{\mu+\nu}}$, with integer $\mu$, $\nu$.

As before, the strategy is to convert a general entangled state into an even state, and then to apply envariance under swaps. To implement it, we assume $\mathcal{E}$ has sufficient dimensionality<sup>8</sup> to allow decomposition of $|\varepsilon\_0\rangle$ and $|\varepsilon\_1\rangle$ in a different orthonormal basis $\{|e\_k\rangle\}$:

$$|\varepsilon\_0\rangle = \sum\_{k=1}^{\mu} |e\_k\rangle \Big/ \sqrt{\mu}\,; \qquad |\varepsilon\_1\rangle = \sum\_{k=\mu+1}^{\mu+\nu} |e\_k\rangle \Big/ \sqrt{\nu}\,.$$

<sup>8</sup> This assumption is not essential. One could instead use two ancillas $\mathcal{C}$ and $\mathcal{C}'$ with sufficient dimensionality and a c-not like gate to obtain a fine-grained state $\propto \sum\_{k=1}^{\mu} |0, c\_k\rangle|e\_k, c'\_k\rangle + \sum\_{k=\mu+1}^{\mu+\nu} |1, c\_k\rangle|e\_k, c'\_k\rangle$ that allows for envariant swaps and leads to identical conclusions, but with a more cumbersome notation.

The envariance we need is associated with counterswaps of $\mathcal{E}$ that undo swaps of the joint state of the composite system $\mathcal{SC}$. To exhibit it, we let the ancilla $\mathcal{C}$ interact with $\mathcal{E}$ as before, e.g., by employing $\mathcal{E}$ as a control to carry out a 'controlled-not-like' operation:

$$|e\_k\rangle|c\_0\rangle \to |e\_k\rangle|c\_k\rangle \,,$$

where $|c\_0\rangle$ is the initial state of $\mathcal{C}$ in some suitable orthonormal basis $\{|c\_k\rangle\}$. Thus:

$$|\Psi\_{SC\mathcal{E}}\rangle \propto \sqrt{\mu}|0\rangle \sum\_{k=1}^{\mu} \frac{|c\_k\rangle|e\_k\rangle}{\sqrt{\mu}} + \sqrt{\nu}|1\rangle \sum\_{k=\mu+1}^{\mu+\nu} \frac{|c\_k\rangle|e\_k\rangle}{\sqrt{\nu}}.\tag{3.8b}$$

Such a $\mathcal{CE}$ interaction can happen far from $\mathcal{S}$, so by Fact 1 it cannot influence probabilities in $\mathcal{S}$. $|\Psi\_{\mathcal{SCE}}\rangle$ is envariant under swaps of states $|s, c\_k\rangle$ (where $s$ stands for 0 or 1, as needed) in the composite system $\mathcal{SC}$. This is even more apparent after the obvious cancellations:

$$\left| \Psi\_{\mathcal{SCE}} \right\rangle \propto \sum\_{k=1}^{\mu} \left| 0, c\_k \right\rangle \left| e\_k \right\rangle + \sum\_{k=\mu+1}^{\mu+\nu} \left| 1, c\_k \right\rangle \left| e\_k \right\rangle. \tag{3.9b}$$

Hence, $p\_{0,k} = p\_{1,k} = \frac{1}{\mu+\nu}$. Therefore, it follows that the probabilities of $|0\rangle$ and $|1\rangle$ are:

$$p\_0 = \frac{\mu}{\mu + \nu} = |\alpha|^2; \quad p\_1 = \frac{\nu}{\mu + \nu} = |\beta|^2. \tag{3.10b}$$

Born's rule thus emerges here from the most quantum aspects of the theory—entanglement and envariance.
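A numerical sketch of the general case (an assumed example with $\mu = 2$, $\nu = 3$; the ancilla bookkeeping is skipped here by writing the fine-grained environment states directly):

```python
import numpy as np

# mu = 2, nu = 3: fine-graining spreads |eps_0> and |eps_1> of Eq. (3.7b)
# over mu and nu orthonormal states |e_k>, giving mu + nu equal-amplitude
# terms, of which mu carry |0> -- hence p_0 = mu/(mu+nu) = |alpha|^2.
mu, nu = 2, 3
alpha, beta = np.sqrt(mu / (mu + nu)), np.sqrt(nu / (mu + nu))

eps0 = np.concatenate([np.ones(mu) / np.sqrt(mu), np.zeros(nu)])
eps1 = np.concatenate([np.zeros(mu), np.ones(nu) / np.sqrt(nu)])
s0, s1 = np.eye(2)

psi = alpha * np.kron(s0, eps0) + beta * np.kron(s1, eps1)   # Eq. (3.7b)
amps = psi.reshape(2, mu + nu)

# All nonzero fine-grained amplitudes are equal (envariantly swappable)...
assert np.allclose(amps[~np.isclose(amps, 0)], 1 / np.sqrt(mu + nu))
# ...so counting alternatives reproduces Eq. (3.10b):
assert np.isclose(np.sum(np.abs(amps[0])**2), mu / (mu + nu))   # = |alpha|^2
assert np.isclose(np.sum(np.abs(amps[1])**2), nu / (mu + nu))   # = |beta|^2
```

Each of the $\mu + \nu$ terms carries amplitude $1/\sqrt{\mu+\nu}$, so the counting of equiprobable alternatives gives $p\_0 = \mu/(\mu+\nu)$ without ever invoking the trace.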

In contrast with other approaches, probabilities in our envariant derivation are a consequence of complementarity, of the incompatibility of the purity of the entangled state of the whole with the purity of the states of parts. Born's rule appears in a completely quantum setting, without any *a priori* imposition of symptoms of classicality that would violate the spirit of quantum theory.

The envariant derivation (in contrast to Gleason's successful but unphysical proof, and in contrast to unsuccessful "frequentist" attempts in the Everettian "Many-Worlds" setting) does not require additivity as an assumption: The strategy that bypasses the appeal to additivity used in the simple case of Equation (3.10a) can be generalized (for details see below and Zurek, 2005 [48]). In a quantum setting this is an important advance. Additivity of probabilities is a consequence of the envariance of phases of Schmidt coefficients that leads to decoherence. The case of more than two outcomes is straightforward, as is the extension by continuity to incommensurate probabilities.

#### 3.4.1. Additivity of Probabilities from Envariance

Kolmogorov's axiomatic formulation of probability theory (see Gnedenko, 1968 [52]), as well as the proof of Born's rule due to Gleason (1957) [84], *assume* additivity of probabilities. This assumption is motivated by the assertion that probability is a *measure*. On the other hand, in the standard approach of Laplace (1820) [69] additivity can be established starting from the definition of the probability of a composite event as the fraction of the favorable equiprobable events to the total (see discussion in Gnedenko, 1968 [52]). The key ingredient that makes this derivation of additivity possible is equiprobability.

We have already established—using *objective* symmetries (in contrast to Laplace, who had to rely on the subjective 'state of mind')—that envariantly swappable events are equiprobable. We can now follow Laplace's strategy and use equiprobability along with decoherence already justified directly by envariance (see Section 3.2) to prove additivity. This is important, as additivity of probabilities should not be automatically and uncritically adopted in the quantum setting. After all, quantum theory is based on the principle of superposition, our core postulate (i): The principle of superposition asserts supremacy of the additivity of state vectors, which is *prima facie* incompatible with additivity of probabilities, as is illustrated by the double slit experiment.

Phases between the record (pointer) states (or, more generally, between any set of Schmidt states) do not influence the outcome of any local measurement that can be carried out on the apparatus (or on decohered records in the memory of the observer). This independence of the local state from the global phases in the Schmidt decomposition invalidates the principle of superposition when the system of interest (or the pointer of the apparatus, or the memory of the observer) is 'open', entangled with its environment. Therefore, we can now *establish* (rather than postulate) the probability of a composite event:

**Lemma 1.** *Probability of a composite (coarse-grained) event consisting of a subset*

$$\kappa \equiv \{k\_1 \lor k\_2 \lor \cdots \lor k\_{n\_k}\} \tag{3.11}$$

*of nκ of the total N envariantly swappable mutually exclusive exhaustive fine-grained events associated with records corresponding to pointer states of the global state;*

$$\left| \Psi\_{\mathcal{SAE}} \right\rangle \rightleftharpoons \sum\_{k=1}^{N} \epsilon^{i\phi\_{k}} \left| s\_{k} \right\rangle \left| A\_{k} \right\rangle \left| \varepsilon\_{k} \right\rangle \quad = \sum\_{k=1}^{N} \epsilon^{i\phi\_{k}} \left| s\_{k} \right\rangle \left| s\_{k} \right\rangle \left| \varepsilon\_{k} \right\rangle,$$

*is given by:*

$$p(\kappa) = \frac{n\_{\kappa}}{N}.\tag{3.12}$$

To prove additivity of probabilities using envariance we consider the state:

$$|\Upsilon\_{\bar{\mathcal{A}}\mathcal{A}\mathcal{S}\mathcal{E}}\rangle \propto \sum\_{\kappa} |\bar{A}\_{\kappa}\rangle \sum\_{k \in \kappa} |A\_{k}\rangle |s\_{k}\rangle |\varepsilon\_{k}\rangle \tag{3.13}$$

representing both the fine-grained and the coarse-grained records. The coarse-graining is implemented by the apparatus $\bar{\mathcal{A}}$ with pointer states $|\bar{A}\_{\kappa}\rangle$.

We first note that the form of $|\Upsilon\_{\bar{\mathcal{A}}\mathcal{ASE}}\rangle$ justifies assigning zero probability to $|s\_j\rangle$'s that do not appear—i.e., appear with zero amplitude—in the initial state of the system. Quite simply, there is no state of the observer with a record of such zero-amplitude Schmidt states of the system in $|\Upsilon\_{\bar{\mathcal{A}}\mathcal{ASE}}\rangle$, Equation (3.13).

To establish this Lemma we exploit basic implications of envariance: When there is a total of $N$ envariantly swappable outcome states, and they exhaust all of the possible outcomes, each should be assigned probability $1/N$. We also note that when coarse-grained events are defined via $|A^{\in\kappa}\rangle$ as unions of fine-grained events, the conditional probability of the coarse-grained event is:

$$
p(\kappa|k) = 1 \quad k \in \kappa\,,\tag{3.14a}
$$

$$
p(\kappa|k) = 0 \quad k \notin \kappa\,. \tag{3.14b}
$$

To demonstrate the above Lemma we need one more property—the fact that when an event $\mathcal{U}$ that is certain ($p(\mathcal{U}) = 1$) can be decomposed into two mutually exclusive events,

$$\mathcal{U} = \kappa \vee \kappa^{\perp},\tag{3.15}$$

their probabilities must add to unity:

$$p(\mathcal{U}) = p(\boldsymbol{\kappa} \vee \boldsymbol{\kappa}^{\perp}) = p(\boldsymbol{\kappa}) + p(\boldsymbol{\kappa}^{\perp}) = 1 \,. \tag{3.16}$$

This assumption introduces (in a very limited setting) additivity. It is equivalent to the statement that "something will certainly happen". Note that this very limited version of additivity holds in the quantum setting (i.e., it is not challenged by the double slit experiment and, more generally, by the superposition principle, provided that the events are indeed mutually exclusive).

**Proof.** Proof of the Lemma starts with the observation that the probability of any composite event $\kappa$ of the form of Equation (3.11) can be obtained recursively—by subtracting, one by one, the probabilities of all the fine-grained events that belong to $\kappa^{\perp}$, and exploiting the consequences of the implications, Equations (3.14)–(3.16). Thus, as a first step, we have:

$$p(\{k\_1 \lor k\_2 \lor \dots \lor k\_{n\_\kappa} \lor \dots \lor k\_{N-1}\}) + p(k\_N) = 1 \,.$$

Moreover, for all fine-grained events $p(k) = 1/N$. Hence:

$$p(\{k\_1 \lor k\_2 \lor \dots \lor k\_{n\_\kappa} \lor \dots \lor k\_{N-1}\}) = 1 - \frac{1}{N} \,.$$

Furthermore (and this is the next recursive step) the conditional probability of the event $\{k\_1 \lor k\_2 \lor \cdots \lor k\_{n\_\kappa} \lor \cdots \lor k\_{N-2}\}$ given the event $\{k\_1 \lor k\_2 \lor \cdots \lor k\_{n\_\kappa} \lor \cdots \lor k\_{N-1}\}$ is:

$$p(\{k\_1 \lor k\_2 \lor \cdots \lor k\_{N-2}\} \,|\, \{k\_1 \lor k\_2 \lor \cdots \lor k\_{N-1}\}) = 1 - \frac{1}{N-1}\,,$$

and so the unconditional probability must be:

$$p\left(\{k\_1 \lor k\_2 \lor \cdots \lor k\_{n\_\kappa} \lor \cdots \lor k\_{N-2}\}\right) = \left(1 - \frac{1}{N}\right)\left(1 - \frac{1}{N-1}\right)\,.$$

Repeating this procedure until only the desired composite event $\kappa$ remains we have:

$$p(\{k\_1 \lor k\_2 \lor \dots \lor k\_{n\_\kappa}\}) = \left(1 - \frac{1}{N}\right) \cdots \left(1 - \frac{1}{N - (N - n\_\kappa - 1)}\right). \tag{3.17}$$

After some elementary algebra (the product telescopes) we finally recover:

$$p(\{k\_1 \lor k\_2 \lor \dots \lor k\_{n\_\kappa}\}) = \frac{n\_\kappa}{N} \,.$$

Hence, Equation (3.12) holds.
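The telescoping recursion in the proof is easy to check numerically. Below is a minimal sketch (plain Python; the function name is ours) that implements the recursive subtraction of Equations (3.14)–(3.17) and confirms the result of Equation (3.12):

```python
from fractions import Fraction

def coarse_grained_probability(n_kappa: int, N: int) -> Fraction:
    """Recursively subtract fine-grained events outside kappa, following
    Eq. (3.17): p = (1 - 1/N)(1 - 1/(N-1)) ... (1 - 1/(n_kappa + 1))."""
    p = Fraction(1)
    for remaining in range(N, n_kappa, -1):
        p *= 1 - Fraction(1, remaining)  # condition on dropping one more event
    return p

# The telescoping product collapses to n_kappa / N, Eq. (3.12):
assert coarse_grained_probability(3, 8) == Fraction(3, 8)
assert all(coarse_grained_probability(n, 12) == Fraction(n, 12) for n in range(1, 13))
```

Exact rational arithmetic (`Fraction`) makes the check independent of floating-point rounding.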

**Corollary 2.** *Probabilities of compatible, mutually exclusive events $\kappa$, $\lambda$, $\mu$, ... that can be decomposed into unions of envariantly swappable elementary events are additive:*

$$p(\kappa \lor \lambda \lor \mu \lor \dots) = p(\kappa) + p(\lambda) + p(\mu) + \dots \tag{3.18}$$

Note that in establishing the additivity Lemma we have only considered situations that can be reduced to certainty or impossibility (that is, cases corresponding to the absolute value of the scalar product equal to 1 or 0). This is in keeping with our strategy of deriving probability and, in particular, of arriving at Born's rule from certainty and symmetries.

#### 3.4.2. Algebra of Records as the Boolean Algebra of Events

Algebra of events (see, e.g., Gnedenko, 1968 [52]) can then be defined by simply identifying events with records inscribed in the coarse-grained pointer states of the apparatus $\bar{\mathcal{A}}$, such as $|A^{\in\kappa}\rangle$ in Equation (3.13). The logical product of any two coarse-grained events $\kappa$, $\lambda$ corresponds to the product of the projection operators that act on the memory Hilbert space—on the corresponding records:

$$
\kappa \wedge \lambda \stackrel{\text{def}}{=} P\_{\kappa} P\_{\lambda} = P\_{\kappa \wedge \lambda} \,. \tag{3.19a}
$$

Logical sum is represented by a projection onto the union of the Hilbert subspaces:

$$
\kappa \lor \lambda \stackrel{\text{def}}{=} P\_{\kappa} + P\_{\lambda} - P\_{\kappa} P\_{\lambda} = P\_{\kappa \lor \lambda} \,. \tag{3.19b}
$$

Last but not least, the complement of the event $\kappa$ corresponds to:

$$
\kappa^{\perp} \stackrel{\text{def}}{=} P\_{\mathcal{U}} - P\_{\kappa} = P\_{\kappa^{\perp}} \,. \tag{3.19c}
$$

With this set of definitions it is now fairly straightforward to show:

**Theorem 3.** *Events corresponding to the records stored in the memory pointer states define a Boolean algebra.*

**Proof.** To show that the algebra of records is Boolean we need to show that coarse-grained events satisfy any of the (several equivalent, see, e.g., Sikorski, 1964 [95]) sets of axioms that define Boolean algebras:

(a) Commutativity:

$$P\_{\kappa\lor\lambda} = P\_{\lambda\lor\kappa}\,; \quad P\_{\kappa\land\lambda} = P\_{\lambda\land\kappa}\,.\tag{3.20a,a'}$$

(b) Associativity:

$$P\_{(\kappa\vee\lambda)\vee\mu} = P\_{\kappa\vee(\lambda\vee\mu)};\quad P\_{(\kappa\wedge\lambda)\wedge\mu} = P\_{\kappa\wedge(\lambda\wedge\mu)}.\tag{3.20b,b'}$$

(c) Absorptivity:

$$P\_{\kappa\vee(\kappa\wedge\lambda)} = P\_{\kappa}\,; \quad P\_{\kappa\wedge(\kappa\vee\lambda)} = P\_{\kappa} \,. \tag{3.20c,c'}$$

(d) Distributivity:

$$P\_{\kappa\vee(\lambda\wedge\mu)} = P\_{(\kappa\vee\lambda)\wedge(\kappa\vee\mu)};\quad P\_{\kappa\wedge(\lambda\vee\mu)} = P\_{(\kappa\wedge\lambda)\vee(\kappa\wedge\mu)} \,.\tag{3.20d,d'}$$

(e) Orthocompleteness:

$$P\_{\kappa\vee(\lambda\wedge\lambda^{\perp})}=P\_{\kappa}\,;\quad P\_{\kappa\wedge(\lambda\vee\lambda^{\perp})}=P\_{\kappa}\,.\tag{3.20e,e'}$$

Proofs of (a)–(e) are straightforward manipulations of projection operators. We leave them as an exercise to the interested reader. As an example we give the proof of distributivity: $P\_{\kappa\wedge(\lambda\vee\mu)} = P\_{\kappa}(P\_{\lambda} + P\_{\mu} - P\_{\lambda}P\_{\mu}) = P\_{\kappa}P\_{\lambda} + P\_{\kappa}P\_{\mu} - (P\_{\kappa})^2 P\_{\lambda}P\_{\mu} = P\_{\kappa\wedge\lambda} + P\_{\kappa\wedge\mu} - P\_{\kappa\wedge\lambda}P\_{\kappa\wedge\mu} = P\_{(\kappa\wedge\lambda)\vee(\kappa\wedge\mu)}$. The other distributivity axiom is demonstrated equally easily.

These record projectors commute because records are associated with the orthonormal pointer basis of the memory of the observer or of the apparatus: It is impossible to consult a memory cell in any other basis, so the problems with distributivity pointed out by Birkhoff and von Neumann simply do not arise—when records are kept in orthonormal pointer states, there is no need for 'quantum logic'.
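The projector identities of Equations (3.19) and (3.20) can be verified directly for commuting projectors onto spans of an orthonormal pointer basis. A small numerical sketch (NumPy; the subspace choices below are our illustrative example, not from the text):

```python
import numpy as np

def P(cells, dim=4):
    """Projector onto the subspace spanned by the given pointer-basis cells."""
    M = np.zeros((dim, dim))
    for c in cells:
        M[c, c] = 1.0
    return M

# Three coarse-grained events as subsets of a 4-cell pointer basis:
P_k, P_l, P_m = P({0, 1}), P({1, 2}), P({2, 3})

land = lambda A, B: A @ B          # logical product, Eq. (3.19a)
lor  = lambda A, B: A + B - A @ B  # logical sum, Eq. (3.19b)

# Distributivity (3.20d'): kappa AND (lambda OR mu) == (kappa AND lambda) OR (kappa AND mu)
assert np.allclose(land(P_k, lor(P_l, P_m)),
                   lor(land(P_k, P_l), land(P_k, P_m)))
# Absorption (3.20c'): kappa OR (kappa AND lambda) == kappa
assert np.allclose(lor(P_k, land(P_k, P_l)), P_k)
```

The checks pass precisely because these projectors are diagonal in one basis, i.e., they commute, which is the point made above.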

Theorem 3 entitles one to think of the outcomes of measurements—of the records kept in various pointer states—in classical terms. Projectors corresponding to pointer subspaces define overlapping but compatible volumes inside the memory Hilbert space. Algebra of such composite events (defined as coarse grained records) is indeed Boolean. The danger of the loss of additivity (which in quantum systems is intimately tied to the principle of superposition) has been averted: Distributive law of classical logic holds.

#### *3.5. Inverting Born's Rule: Why Is the Amplitude a Square Root of the Frequency of Occurrence?*

The strategy we have pursued above to derive Born's rule for unequal coefficients was to consider two different splits, SE|C and SC|E, of the same composite tripartite system SEC. In the beginning we had a bipartite state of SE with unequal coefficients, Equations (3.7). We have fine-grained it into an equal coefficient envariantly swappable state by introducing an additional system C, which entangled with SE in such a way that the resulting state of SEC was even, Equations (3.9). In that even state the fine-grained probabilities were provably equal. One could then count the number of contributions that included the two alternatives of interest—states $|0\rangle$ and $|1\rangle$ of S. Unequal probabilities were proportional to the numbers of equiprobable fine-grained contributions in the resulting even states, Equations (3.9).

This reasoning can be reversed: We shall now use the strategy of moving the line dividing the two subsystems of interest in a tripartite composite system to show that the amplitudes must be proportional to the square roots of the frequencies of occurrence of the corresponding events.

We consider the probability of getting a count of $m$ 1's in a measurement by an apparatus A on an ensemble of $M$ identically prepared two-state systems:

$$\left| \check{\psi}\_{S} \right\rangle = \bigotimes\_{k=1}^{M} \left( \alpha \left| 0 \right\rangle + \beta \left| 1 \right\rangle \right)\_{k}.$$

In the course of the (pre-)measurement the memory cells of A entangle with S, so, in obvious notation, the resulting state is a product of $M$ identical copies:

$$|\check{\Psi}\_{\mathcal{SA}}\rangle = \bigotimes\_{k=1}^{M} (\alpha|0\rangle|a\_0\rangle + \beta|1\rangle|a\_1\rangle)\_{k} = \bigotimes\_{k=1}^{M} |\check{\Psi}\_{\mathcal{SA}}\rangle\_{k}.\tag{3.21a}$$

We shall work with the case of $\alpha = \beta$ to avoid cumbersome notation. With this simplification the state vector $|\check{\Psi}\_{\mathcal{SA}}\rangle\_k = (|0\rangle|a\_0\rangle + |1\rangle|a\_1\rangle)\_k$ is envariant under a swap $(|0\rangle\langle 1| + |1\rangle\langle 0|)\_k$ acting on the $k$-th member of the ensemble, as such a swap can be undone by a counterswap $(|a\_0\rangle\langle a\_1| + |a\_1\rangle\langle a\_0|)\_k$ acting on A. This envariance must also hold when the state vector is expanded into a sum:

$$
|\check{\Psi}\_{\mathcal{SA}}\rangle \propto \sum\_{m=0}^{M} |\tilde{s}\_m\rangle. \tag{3.21b}
$$

In this expression, each unnormalized vector $|\tilde{s}\_m\rangle$ represents all sequences of outcomes and records that have resulted in $m$ detections of "1", that is:

$$\begin{aligned} \left| \tilde{s}\_0 \right\rangle &= \left| 00\ldots0 \right\rangle \left| A\_{00\ldots0} \right\rangle \\ \left| \tilde{s}\_1 \right\rangle &= \left| 10\ldots0 \right\rangle \left| A\_{10\ldots0} \right\rangle + \left| 01\ldots0 \right\rangle \left| A\_{01\ldots0} \right\rangle + \ldots + \left| 00\ldots1 \right\rangle \left| A\_{00\ldots1} \right\rangle \\ &\;\;\vdots \\ \left| \tilde{s}\_m \right\rangle &= \left| 11\ldots10\ldots0 \right\rangle \left| A\_{11\ldots10\ldots0} \right\rangle + \ldots + \left| 00\ldots01\ldots1 \right\rangle \left| A\_{00\ldots01\ldots1} \right\rangle \\ &\;\;\vdots \\ \left| \tilde{s}\_M \right\rangle &= \left| 11\ldots1 \right\rangle \left| A\_{11\ldots1} \right\rangle. \end{aligned} \tag{3.22}$$

The memory state of A contains a record of all the outcomes—it is the product of memory states of individual cells, e.g., $|A\_{10\ldots0}\rangle = |a\_1\rangle\_1 |a\_0\rangle\_2 \ldots |a\_0\rangle\_M$. All the sequences of the outcomes and the corresponding sequences of the records are orthonormal. Therefore, the sum of the sequences of outcome states and corresponding record states—Equation (3.21b) expressed in terms of Equation (3.22)—constitutes a Schmidt decomposition of $|\check{\Psi}\_{\mathcal{SA}}\rangle$.

The number of distinct outcome sequence states in $|\tilde{s}\_m\rangle$ is $\frac{M!}{m!(M-m)!} = \binom{M}{m}$. Every outcome sequence state is equiprobable since it can be envariantly swapped with any other outcome sequence state. Therefore, the probability of detecting $m$ 1's must be proportional to $\binom{M}{m}$, the number of equiprobable records with $m$ detections of 1.<sup>9</sup> Moreover, as we have set $\alpha = \beta$, the relative normalization of every such sequence is the same. Consequently, every permutation of the outcomes—every specific sequence of 0's and 1's, regardless of the number of 1's—has a probability of $2^{-M}$. This includes "maverick" states with unlikely total counts such as $|\tilde{s}\_0\rangle$ and $|\tilde{s}\_M\rangle$.

<sup>9</sup> For instance, $|00\ldots0\rangle$ in $|\tilde{s}\_0\rangle$ can be swapped with $|10\ldots0\rangle$ in $|\tilde{s}\_1\rangle$. The pre-swap $|\check{\Psi}\_{\mathcal{SA}}\rangle$ can be restored by a counterswap of the corresponding $|A\_{00\ldots0}\rangle$ with $|A\_{10\ldots0}\rangle$.
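The counting behind this argument can be spelled out explicitly for a small ensemble. A brief sketch (plain Python; $M = 6$ is an arbitrary illustrative choice): enumerating all $2^M$ envariantly swappable outcome sequences confirms that $\binom{M}{m}$ of them contribute to $|\tilde{s}\_m\rangle$:

```python
from itertools import product
from math import comb

M = 6
# All envariantly swappable (hence equiprobable) outcome sequences:
sequences = list(product((0, 1), repeat=M))
assert len(sequences) == 2**M   # each carries probability 2**(-M)

# The number of sequences contributing to |s~_m> is the binomial coefficient:
for m in range(M + 1):
    count = sum(1 for s in sequences if sum(s) == m)
    assert count == comb(M, m)
```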

We now relate the probability of a specific count of $m$ 1's to the amplitude of the corresponding state. We prepare to address this question by adding a quantum system that counts 1's and enters their number in the *register* R. The result is an entangled state of S, A, and R:

$$
|\check{\Upsilon}\_{\mathcal{SAR}}\rangle \propto \sum\_{m=0}^{M} |\tilde{s}\_{m}\rangle |m\rangle\_{\mathcal{R}}.\tag{3.23a}
$$

Above, $|m\rangle\_{\mathcal{R}}$ are the orthogonal states of R with distinct total counts.

We shall now use envariance of $|\check{\Upsilon}\_{\mathcal{SAR}}\rangle$ to relate the probability of a specific count $m$ to the amplitude of the corresponding state $|m\rangle\_{\mathcal{R}}$. To this end we first normalize the states $|\tilde{s}\_m\rangle$: Without normalization, the amplitudes we are trying to deduce would have no meaning.

Normalizing states of SA is not difficult: Every individual sequence of 0's and 1's that corresponds to a possible record has the same norm, and the number of sequences that yield a total count of $m$ 1's determines the norm of $|\tilde{s}\_m\rangle$:

$$
\langle \tilde{s}\_m | \tilde{s}\_m \rangle \propto \binom{M}{m}.
$$

This is the first step in a purely *mathematical* operation that converts $|\tilde{s}\_m\rangle$ into the corresponding normalized state $|s\_m\rangle$ that can later be legitimately used to implement the Schmidt decomposition.

It is now easy to see that the states:

$$|s\_m\rangle = \binom{M}{m}^{-\frac{1}{2}} |\tilde{s}\_m\rangle, \tag{3.24}$$

have the same normalization. The state of the whole ensemble is then:

$$|\check{\Upsilon}\_{\mathcal{SAR}}\rangle \propto \sum\_{m=0}^{M} \binom{M}{m}^{\frac{1}{2}} |s\_m\rangle |m\rangle\_{\mathcal{R}} = \sum\_{m=0}^{M} \gamma\_m |s\_m\rangle |m\rangle\_{\mathcal{R}}.\tag{3.23b}$$

This is also a Schmidt decomposition, as $|s\_m\rangle$ and $|m\rangle\_{\mathcal{R}}$ are orthonormal. Given our previous discussion we already know that the probability $p\_m$ of any specific count $m$ is given by the fraction of such sequences. That is:

$$p\_m = 2^{-M} \binom{M}{m} \,.$$

This follows from counting the number of envariant (and, hence, equiprobable) permutations of 0's and 1's contributing to $|s\_m\rangle$ and, hence, corresponding to $|m\rangle\_{\mathcal{R}}$.

Indeed, Equation (3.23b) is a coarse-grained version of Equations (3.22) and (3.23a). So, the above expression for $|\check{\Upsilon}\_{\mathcal{SAR}}\rangle$ shows that the amplitude $\gamma\_m$ of $|m\rangle\_{\mathcal{R}}$—of the state of the register that holds the count of $m$ 1's—is proportional to the square root of the number of equiprobable sequences that lead to that count;

$$|\gamma\_m| \propto \sqrt{\binom{M}{m}} = \sqrt{\frac{M!}{m!(M-m)!}}\,,\tag{3.25a}$$

or:

$$|\gamma\_m| = \sqrt{p\_m} \,. \tag{3.25b}$$

Equation (3.25) is the main result of our discussion. We have now deduced that the absolute values $|\gamma\_m|$ of the Schmidt coefficients are proportional to the *square roots* of relative frequencies—to the square roots of the cardinalities of the subsets of the $2^M$ equiprobable sequences that yield such 'total count = $m$' composite events. In a sense, our calculation "inverts" the derivation of Born's rule we have presented before.
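As a sanity check of Equations (3.23b)–(3.25), one can verify numerically that the normalized coefficients $\gamma\_m = \binom{M}{m}^{1/2} 2^{-M/2}$ square to the probabilities $p\_m$ (plain Python; $M = 10$ is an illustrative choice):

```python
from math import comb, sqrt, isclose

M = 10
# Normalized Schmidt coefficients of Eq. (3.23b): gamma_m ∝ sqrt(C(M, m)):
gammas = [sqrt(comb(M, m)) * 2**(-M / 2) for m in range(M + 1)]

assert isclose(sum(g**2 for g in gammas), 1.0)  # proper normalization
for m, g in enumerate(gammas):                  # "inverted" Born's rule: |gamma_m|^2 = p_m
    assert isclose(g**2, comb(M, m) * 2**(-M))
```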

As in the earlier derivation of Born's rule, the key was to express the same tripartite global state $|\check{\Upsilon}\_{\mathcal{SAR}}\rangle$ as two different Schmidt decompositions. Thus,

$$\begin{aligned} \left| \check{\Upsilon}\_{\mathcal{S}|\mathcal{AR}} \right\rangle \propto\; & \left| 00\ldots0 \right\rangle (\left| A\_{00\ldots0} \right\rangle \left| 0 \right\rangle\_{\mathcal{R}}) \\ +\, & \left| 10\ldots0 \right\rangle (\left| A\_{10\ldots0} \right\rangle \left| 1 \right\rangle\_{\mathcal{R}}) + \left| 01\ldots0 \right\rangle (\left| A\_{01\ldots0} \right\rangle \left| 1 \right\rangle\_{\mathcal{R}}) + \ldots + \left| 00\ldots1 \right\rangle (\left| A\_{00\ldots1} \right\rangle \left| 1 \right\rangle\_{\mathcal{R}}) \\ & \;\vdots \\ +\, & \left| 11\ldots10\ldots0 \right\rangle (\left| A\_{11\ldots10\ldots0} \right\rangle \left| m \right\rangle\_{\mathcal{R}}) + \ldots + \left| 00\ldots01\ldots1 \right\rangle (\left| A\_{00\ldots01\ldots1} \right\rangle \left| m \right\rangle\_{\mathcal{R}}) \\ & \;\vdots \\ +\, & \left| 11\ldots1 \right\rangle (\left| A\_{11\ldots1} \right\rangle \left| M \right\rangle\_{\mathcal{R}}) \end{aligned} \tag{3.26a}$$

for the split S|AR of the whole into two subsystems, and:

$$\left|\check{\Upsilon}\_{\mathcal{SA}|\mathcal{R}}\right\rangle \propto \sum\_{m=0}^{M} \sqrt{\binom{M}{m}} \left| s\_m \right\rangle \left| m \right\rangle\_{\mathcal{R}} = \sum\_{m=0}^{M} \gamma\_m \left| s\_m \right\rangle \left| m \right\rangle\_{\mathcal{R}}, \tag{3.26b}$$

for the alternative SA|R.

The location of the border between the two parts of the whole SAR is the key difference. It redefines the "events of interest". The top decomposition, $|\check{\Upsilon}\_{\mathcal{S}|\mathcal{AR}}\rangle$, treats binary sequences of outcomes as "events of interest", and, by envariance, assigns equal probability $2^{-M}$ to each outcome sequence state. By contrast, in $|\check{\Upsilon}\_{\mathcal{SA}|\mathcal{R}}\rangle$ the total count $m$ is an "event of interest", but now its probability can be deduced from $|\check{\Upsilon}\_{\mathcal{S}|\mathcal{AR}}\rangle$, as both Schmidt decompositions represent the same state—the same physical situation. This implies (a converse of) Born's rule: The amplitude of a state $|m\rangle\_{\mathcal{R}}$ of the register R is proportional to the square root of the number of sequences of counts that yield $m$.

Generalization to the case $\alpha \neq \beta$ is conceptually simple (although notationally cumbersome). The global state after the requisite adjustment of the relative normalizations is:

$$|\check{\Upsilon}\_{\mathcal{SAR}}\rangle \propto \sum\_{m=0}^{M} \binom{M}{m}^{\frac{1}{2}} \alpha^{M-m} \beta^m |s\_m\rangle |m\rangle\_{\mathcal{R}} = \sum\_{m=0}^{M} \Gamma\_m |s\_m\rangle |m\rangle\_{\mathcal{R}}.$$

The coefficients $\Gamma\_m$ that multiply $|s\_m\rangle|m\rangle\_{\mathcal{R}}$ combine, on an equal footing, the preexisting amplitudes $\alpha$ and $\beta$ of $|0\rangle$ and $|1\rangle$ from the initial state $|\check{\psi}\_{\mathcal{S}}\rangle = \bigotimes\_{k=1}^{M}(\alpha|0\rangle + \beta|1\rangle)\_k$ with the square roots of the binomial coefficients that arise from counting—with the numbers of the corresponding outcome sequences. Once the state representing the whole ensemble is written as $\sum\_{m=0}^{M} \Gamma\_m|s\_m\rangle|m\rangle\_{\mathcal{R}}$, the origin of the coefficients $\Gamma\_m$ (or $\gamma\_m$ before) is irrelevant: An observer presented with the state $\sum\_{m=0}^{M} \Gamma\_m|s\_m\rangle|m\rangle\_{\mathcal{R}}$ and asked to assess probabilities of the outcomes $|s\_m\rangle|m\rangle\_{\mathcal{R}}$ has no reason to delve into the combinatorial origins of $\Gamma\_m$. For a measurement with outcome states $|s\_m\rangle|m\rangle\_{\mathcal{R}}$ the origin of the amplitudes $\Gamma\_m$ that multiply them does not matter. Their absolute values, however, do matter: The observer could implement the envariant derivation "from scratch", starting with whatever coefficients are there in the initial state, and fine-graining (as before, Equations (3.8)) to deduce probabilities of the various outcomes.
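A quick numerical illustration of the generalized coefficients $\Gamma\_m$ (plain Python; the amplitudes $\alpha = 0.6$, $\beta = 0.8$ are our illustrative choice, satisfying $|\alpha|^2 + |\beta|^2 = 1$):

```python
from math import comb, isclose

alpha, beta = 0.6, 0.8   # illustrative real amplitudes, alpha^2 + beta^2 = 1
M = 12
# Coarse-grained Schmidt coefficients: Gamma_m = sqrt(C(M, m)) * alpha^(M-m) * beta^m
Gammas = [comb(M, m)**0.5 * alpha**(M - m) * beta**m for m in range(M + 1)]

# The coefficients stay normalized, since (alpha^2 + beta^2)^M = 1...
assert isclose(sum(G**2 for G in Gammas), 1.0)
# ...and |Gamma_m|^2 is the binomial probability of m 1's, Eq. (3.28):
for m, G in enumerate(Gammas):
    assert isclose(G**2, comb(M, m) * alpha**(2*(M - m)) * beta**(2*m))
```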

#### *3.6. Relative Frequencies from Relative States*

We shall now use envariance to deduce relative frequencies from amplitudes. In view of the discussion immediately above the relation between amplitudes and frequencies is already apparent, so this may seem superfluous, but we shall sketch it anyway "for the sake of completeness", and also because it provides a different—experimentally motivated, one could say—point of view of the alternatives. A much more complete derivation of relative frequencies from envariance is also available in Ref. [48].

We emphasize that we do not need relative frequencies to define probabilities: Probabilities are already in place. They are "single shot", defined not by counting the number of

"favorable events" (as in the relative frequency approach), but, rather, by first establishing equiprobability of a certain class of events, and then by counting the number of equiprobable favorable possibilities. Therefore, the calculation immediately below is, in a sense, only a consistency check.

Consider $M$ distinguishable SCE triplets, each already in the fine-grained state:

$$\left|\Psi\_{\mathcal{SCE}}^{(\ell)}\right\rangle = \left(\sum\_{k=1}^{\mu} |0\rangle|c\_{k}\rangle|\varepsilon\_{k}\rangle + \sum\_{k=\mu+1}^{\mu+\nu} |1\rangle|c\_{k}\rangle|\varepsilon\_{k}\rangle\right)^{(\ell)}$$

of Equation (3.9). The state of the whole ensemble is then given by their product:

$$\left|\Omega\_{\mathcal{SCE}}^{M}\right\rangle = \bigotimes\_{\ell=1}^{M} \left|\Psi\_{\mathcal{SCE}}^{(\ell)}\right\rangle = \bigotimes\_{\ell=1}^{M} \left( \sum\_{k=1}^{\mu} |0\rangle|c\_{k}\rangle|\varepsilon\_{k}\rangle + \sum\_{k=\mu+1}^{\mu+\nu} |1\rangle|c\_{k}\rangle|\varepsilon\_{k}\rangle \right)^{(\ell)}.\tag{3.27}$$

As in the derivation of the inverse of Born's rule, we can now carry out the product and obtain a sum that will contain $(\mu + \nu)^M$ (instead of just $2^M$) terms. As in Equation (3.26a), these terms can now be sorted according to the number of 0's and 1's they contain. We could even attach a register R, and repeat all the steps we have taken above. We shall bypass these intermediate calculations that are conceptually straightforward but notationally cumbersome. What matters in the end is how many equiprobable terms contain, say, $m$ 1's. The answer is, clearly, $\binom{M}{m}\mu^{M-m}\nu^m$. Therefore, the probability of detecting $m$ 1's in $M$ trials is given by:

$$p\_M(m) = \binom{M}{m} \frac{\mu^{M-m} \nu^m}{(\mu + \nu)^M} = \binom{M}{m} |\alpha|^{2(M-m)} |\beta|^{2m}. \tag{3.28}$$
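The counting step leading to Equation (3.28) can be replayed by brute force for small $\mu$, $\nu$, $M$ (plain Python; the values below are illustrative): expanding the $M$-fold product and sorting terms by the number of 1's reproduces $\binom{M}{m}\mu^{M-m}\nu^m$:

```python
from itertools import product
from math import comb

mu, nu, M = 2, 3, 5   # fine-grained multiplicities and ensemble size (illustrative)

# Expand the M-fold product: each factor contributes one of mu "0"-terms
# or one of nu "1"-terms.
labels = ['0'] * mu + ['1'] * nu
terms = [''.join(t) for t in product(labels, repeat=M)]
assert len(terms) == (mu + nu)**M

# Terms with m 1's number C(M, m) * mu^(M-m) * nu^m, as claimed above Eq. (3.28):
for m in range(M + 1):
    count = sum(1 for t in terms if t.count('1') == m)
    assert count == comb(M, m) * mu**(M - m) * nu**m
```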

We now assume $M$ is large, not because the envariant derivation requires this—we have already obtained Born's rule for individual events, $M = 1$—but because the relative frequency approach needs it (von Mises, 1939 [71]; Gnedenko, 1968 [52]). In that limit the binomial distribution can be approximated by a Gaussian:

$$p\_M(m) \simeq \frac{\exp\left[- \frac{1}{2} \left(\frac{m - |\beta|^2 M}{\sqrt{M} |\alpha\beta|}\right)^2\right]}{\sqrt{2\pi M}\, |\alpha\beta|}. \tag{3.29}$$

The average number of 1's is, according to Equation (3.29), $\bar{m} = |\beta|^2 M$, as expected, establishing a link between the relative frequency of events in a large number of trials and Born's rule. This connection between quantum states and relative frequencies does not rest on circular and *ad hoc* assumptions that relate the size of the coefficients in the global state to probabilities (e.g., by asserting that the probability corresponding to a small enough amplitude is 0 (Geroch, 1984 [73])), on modifications of quantum theory (Weissman, 1999 [96]; Buniy, Hsu, and Zee, 2006 [83]), or on the unphysical infinite limit (Hartle, 1968 [79]; Farhi, Goldstone, and Guttmann, 1989 [80]). Such steps have left past frequentist approaches to Born's rule (including also those of Everett, DeWitt, and Graham) open to criticism (Stein, 1984 [74]; Kent, 1990 [38]; Squires, 1990 [39]; Joos, 2000 [75]; Auletta, 2000 [97]).
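The quality of the Gaussian approximation, Equation (3.29), is easy to gauge numerically (plain Python; the parameter values are illustrative):

```python
from math import comb, exp, pi, sqrt

alpha2, M = 0.36, 400          # |alpha|^2 and ensemble size (illustrative)
beta2 = 1 - alpha2
ab = sqrt(alpha2 * beta2)      # |alpha * beta|

def binomial(m):               # exact p_M(m), Eq. (3.28)
    return comb(M, m) * alpha2**(M - m) * beta2**m

def gaussian(m):               # Gaussian approximation, Eq. (3.29)
    return exp(-0.5 * ((m - beta2 * M) / (sqrt(M) * ab))**2) / (sqrt(2 * pi * M) * ab)

# Near the peak m ~ |beta|^2 M the two agree closely for large M:
m = round(beta2 * M)
assert abs(binomial(m) - gaussian(m)) / binomial(m) < 0.01
```

The peak of the distribution sits at $\bar{m} = |\beta|^2 M$, in accord with the discussion above.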

Note that we avoid the problem of two independent measures of probability (number of branches *and* size of the coefficients) that derailed previous relative state attempts. We simply count the number of *envariantly swappable* (and hence *provably equiprobable*) sequences of potential events. This settles the issue of "maverick universes"—atypical branches with numbers of, e.g., 0's very different from the average $\bar{n}$. They are there (as they should be) but they are very improbable. This is established through a physically motivated envariance under swaps. So, maverick branches did not have to be removed either "by assumption" (DeWitt, 1970 [14]; 1971 [15]; Graham, 1973 [72]; Geroch, 1984 [73]) or by an equally unphysical $M = \infty$. Maverick branches are there, but pose no threat to our envariant derivation.

#### *3.7. Envariance—An Overview*

Envariance settles a major outstanding quantum foundational problem: The origin of probabilities. Born's rule can now be established on the basis of solid and simple physical reasoning, and without assuming additivity of probabilities (in contrast to Gleason, 1957 [84]). We have derived $p\_k = |\psi\_k|^2$ without relying on the tools of decoherence.

There were other attempts to apply the Laplacean strategy of invariance under permutations to prove "indifference". This author (Zurek, 1998) [36] noted that all of the possibilities listed on the diagonal of a unit density matrix (e.g., $\propto |0\rangle\langle0| + |1\rangle\langle1|$) must be equiprobable, as such a density matrix is invariant under swaps. This equiprobability-based approach was then extended to the case of unequal coefficients by fine-graining, and leads to Born's rule.

However, a reduced density matrix is not the right starting point for the derivation: A pure state, prepared by the observer in the preceding measurement, is. In addition, to get from such a pure state to a mixed (reduced) density matrix one must "trace"—average over, e.g., the environment. Born's rule is involved in this averaging, which leads to a concern that such a derivation may be circular [7,35,48,88].

One could attempt to deal with a pure state of a single system instead. Deutsch (1999) [86] and his followers (Wallace, 2003 [87]; Saunders, 2004 [98]) pursued this approach in terms of decision theory. The key was again invariance under permutations. It is indeed there for certain pure states (e.g., $|0\rangle + |1\rangle$) but it disappears when a relative phase is involved. That is, $|0\rangle + |1\rangle$ equals the post-swap $|1\rangle + |0\rangle$, but $|0\rangle + e^{i\varphi}|1\rangle \neq |1\rangle + e^{i\varphi}|0\rangle$, and the difference is *not* just an overall phase. Indeed, $|0\rangle + i|1\rangle$ is *orthogonal* to $i|0\rangle + |1\rangle$, so there is no invariance under swaps, and the argument that the swap does not matter because the pre-swap state is in the end recovered is simply wrong. In isolated systems this problem cannot be avoided. (Envariance of course deals with it very naturally, as the phase of Schmidt coefficients is envariant—see Equations (3.1)–(3.4).)
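The orthogonality claim above takes one line to confirm (a NumPy sketch):

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

pre  = ket0 + 1j * ket1    # |0> + i|1>
post = 1j * ket0 + ket1    # the "swapped" i|0> + |1>

# The two differ by more than a global phase: they are orthogonal.
assert np.isclose(np.vdot(pre, post), 0)
# By contrast, with no relative phase a swap changes nothing:
assert np.allclose(ket0 + ket1, ket1 + ket0)
```

(`np.vdot` conjugates its first argument, so it computes the quantum inner product.)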

The other problem with decision theory approaches put forward to date is selection of events one of which will happen upon measurement—the choice of the preferred states. These two problems must be settled, either through appeal to decoherence (as in Zurek, 1998 [36], and in Wallace, 2003 [87]), or by ignoring phases essentially *ad hoc* (Deutsch, 1999) [86], which then makes readers suspect that some "Copenhagen" assumptions are involved. (Indeed, Barnum et al. (2000) [99] criticized Deutsch (1999) [86] by interpreting his approach in the "Copenhagen spirit".) In addition, decoherence—invoked by Wallace, 2003 [87]—employs reduced density matrices, hence, Born's rule. So, as noted by many, it should be "off limits" in its derivation [32,35,48,89,90].

In more recent papers, advocates of the decision theory approach adopt a strategy (Wallace 2007 [100]; 2010 [92]; 2012 [19]) that in effect relies on envariance. This affinity of the updated decision-theoretic approach with envariant derivation of Born's rule has been noticed and analyzed [91].

Envariant derivation of Born's rule we have presented is an extension of the swap strategy in (Zurek, 1998) [36]. However, instead of tracing out the environment, we have incorporated it in the discussion (albeit in the role similar to a gauge field).

Envariance leads to Born's rule, but also to new appreciation of decoherence. Pointer states can be inferred directly from the dynamics of information transfers as was shown in Section 2 (see also Ref. [48]) and, indeed, in the original definition of pointer states [24]. Not everyone is comfortable with envariance (see, e.g., Herbut, 2007 [101], for a selection of views on envariance). This is understandable—interpretation of quantum theory was always rife with controversies.

#### 3.7.1. Implications and the Scope of Envariance: Why Entanglement? Why Schmidt States?

Envariance is firmly rooted in quantum physics. It is based on the symmetries of entanglement. One may nevertheless be concerned about the scope of the envariant approach: $p\_k = |\psi\_k|^2$ holds for Schmidt states, but what about measurements with other outcomes? The obvious starting point for the derivation of probabilities is not an entangled state of S and E, but a pure state of S. In addition, such a state can be expressed in *any* basis that spans $\mathcal{H}\_S$. So, why entanglement? And why Schmidt states?

Envariance of Schmidt coefficient phases is closely tied to the einselection of pointer states: After decoherence has set in, pointer states diagonalize reduced density matrices nearly as well as the Schmidt states (which diagonalize them exactly). Residual misalignment is not going to be a major problem. At most, it might cause minor violations of the laws obeyed by the classical probability for events defined by the pointer states.

Such violations are intriguing, and perhaps even detectable, but unlikely to matter in the macroscopic settings we usually deal with. To see why, we revisit the pointer states-Schmidt states (or einselection-envariance) link in the setting of measurements: Observer O uses an (ultimately quantum) apparatus A, initially in a known state $|A\_0\rangle$, to entangle with S, which then decoheres as A is immersed in E (Zurek, 1991 [27]; 2003 [7]; Joos et al., 2003 [29]; Schlosshauer, 2005 [31]; 2007 [32]). This sequence of interactions leads to:

$$|\psi\_S\rangle |A\_0\rangle |\varepsilon\_0\rangle \Rightarrow \left( \sum\_k a\_k |s\_k\rangle |A\_k\rangle \right) |\varepsilon\_0\rangle \Rightarrow \sum\_k a\_k |s\_k\rangle |A\_k\rangle |\varepsilon\_k\rangle \,.$$

In a properly constructed apparatus the pointer states $|A\_k\rangle$ are unperturbed by E while the $|\varepsilon\_k\rangle$ become orthonormal on a decoherence timescale. So in the end we have a Schmidt decomposition of SA (treated as a single entity) and E.

The apparatus is built to measure a specific observable $\sigma\_S = \sum\_k \varsigma\_k |s\_k\rangle\langle s\_k|$. Suppose O knows that S starts in $|\psi\_S\rangle = \sum\_k a\_k |s\_k\rangle$. The choice of A (of Hamiltonians, etc.) commits the observer to a definite set of potential outcomes: Probabilities will refer to $\{|A\_k\rangle\}$, or, equivalently, to $\{|A\_k s\_k\rangle\}$ in the Schmidt decomposition.

To answer the questions we started with (why entanglement, why Schmidt states?): Entanglement is a result of the interactions that cause measurement and decoherence, and only pointer states *of the apparatus* (e.g., states that are near the diagonal, and can play the role of Schmidt states to a very good approximation after decoherence) can be outcomes.

This emphasis on the role of the apparatus in deciding what happens parallels Bohr's view captured by "No phenomenon is a phenomenon until it is a recorded phenomenon" (Wheeler, 1983) [47]. In our case A is quantum and symptoms of classicality—e.g., einselection as well as the loss of phase coherence between pointer states—arise as a result of entanglement with E.

The envariant approach applies even when the $|s\_k\rangle$ are not orthogonal: Orthogonality of the record states of the apparatus is assured by their distinguishability. This is because (as noted in Section 2) the events agents have direct access to are records in A (rather than states of S). Indeed, as we shall discuss in the next section, we access records in the measuring devices indirectly, by intercepting small fragments of, e.g., the photon environment that has helped decohere them and has einselected distinct states of the apparatus pointer. States of A that can leave multiple imprints on the environment (so that we can find out measurement outcomes from a tiny fraction of E) must be distinguishable (hence, orthogonal).

Other simplifying assumptions we invoked can also be relaxed [48]. For example, when E is initially mixed (as will be generally the case), one can 'purify' it by adding extra E in the usual manner (see Section 2). Given that we already have a derivation of Born's rule for pure states, the use of the purification strategy (when it is justified by the physical context) does not require apologies, and does not introduce circularity. Indeed, it is tempting to claim that all probabilities in physics *can* be interpreted envariantly.

Probabilities described by Born's rule quantify ignorance of the observer O *before* he or she finds out the measurement outcome. Therefore, envariant probabilities admit an ignorance interpretation—O is ignorant of the *future* outcome (rather than of an unknown *pre-existing* real state, as was the case classically). Of course, once O's memory becomes correlated with A, its state registers what O has perceived (say, |*o*7⟩ that registers |*A*7⟩). Re-checking of the apparatus will confirm it. Moreover, when many systems are prepared in the same initial state, frequencies of the outcomes will accord with Born's rule.

The envariant approach uses incompatibility between observables of the whole and its parts. It has now been adopted and discussed by others (Barnum, 2003 [93]; Schlosshauer and Fine, 2005 [94]; Schlosshauer, 2005 [31]; 2006 [37]; 2007 [32]; 2019 [33]; Paris, 2005 [102]; Jordan, 2006 [103]; Steane, 2007 [104]; Bub and Pitovsky, 2007 [105]; Horodecki et al., 2009 [106]; Blaylock, 2010 [107]; Seidewitz, 2011 [108]; Hsu, 2012 [109]; Sebens and Carroll, 2018 [110]; Drezet, 2021 [91]).

In retrospect, it seems surprising that envariance was not noticed earlier and used to derive probabilities or to provide insights into decoherence and einselection: Entangling interactions are key to measurements and decoherence, so entanglement symmetries would seem relevant. However, entanglement was viewed as a paradox, as something that needs to be explained, and not used in an explanation. This attitude is, fortunately, changing.

#### 3.7.2. Towards the Experimental Verification of Envariance

Purifications, use of ancillas, fine-graining, and other steps in the derivation need not be carried out in the laboratory each time probabilities are computed using the state vector: Once established, Born's rule is a *law*. It follows from the tensor product and the geometry of the Hilbert spaces for composite quantum systems. We used assumptions about C, E, etc., to demonstrate *pk* = |*ψk*|<sup>2</sup>, but this rule must be obeyed even when no one is there to immediately verify compliance. So, even when there is no ancilla C at hand, or when E is initially mixed or too small for fine-graining, one could (at some later time, using purification, extra environments and ancillas) verify that bookkeeping implicit in assigning probabilities to |*ψ*S⟩ or pre-entangled |*ψ*SE⟩ abides by the symmetries of entanglement.

The obvious next question is how to verify envariance directly. The essence of the experimental difficulty is testing whether the global state is recovered following a swap and a counterswap, using tools that favor dealing locally with individual systems. The few tests of envariance carried out to date approach these challenges differently. The first (and, to date, most precise) test uses pairs of entangled photons to perform the requisite transformations (Vermeyden et al., 2015) [111]. The final global state is acquired by measurements on the individual photons, and characterized using quantum tomography.
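The swap/counterswap logic that such tests probe can be sketched in a few lines of linear algebra (a minimal numerical illustration with two qubits standing in for S and E; it is not a model of the photonic setup):

```python
import numpy as np

# Entangled state of S and E with equal amplitudes: (|0>|1> - |1>|0>)/sqrt(2).
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

X = np.array([[0, 1], [1, 0]], dtype=complex)   # swap |0><1| + |1><0|
I2 = np.eye(2, dtype=complex)

swap_S = np.kron(X, I2)          # u_S acting on S alone
counterswap_E = np.kron(I2, X)   # u_E acting on E alone

after_swap = swap_S @ psi                 # the swap alone alters the global state...
restored = counterswap_E @ after_swap     # ...the counterswap on E undoes it

changed = not np.allclose(after_swap, psi)
fidelity = abs(np.vdot(psi, restored))
```

Here `restored` equals the original state up to a global phase (for this antisymmetric state, up to an overall sign), so the fidelity with the initial state is exactly 1: the equal-amplitude entangled state is envariant under the swap.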

Envariance was indeed confirmed with impressive accuracy. Aware of the dangers of circularity, Vermeyden et al. used the Bhattacharyya coefficient to analyze the experimental data. They also tested (and constrained) the theory of Son (2014) [112], which allows for powers other than the standard square, *pk* = |*ψk*|<sup>2</sup>, in the relation between probability and amplitude. Entangled quantum states were found to be (99.66 ± 0.04)% envariant as measured using the quantum fidelity, and (99.963 ± 0.005)% as measured using a modified Bhattacharyya coefficient. According to the authors, the systematic deviations are due to the less-than-maximal entanglement in their photon pairs.

The experiment of Harris et al. (2016) [113] verified envariance by showing that pure quantum states consisting of two maximally entangled degrees of freedom are left unaltered by the action of successive 'swapping' operations, each of which is carried out on a different (entangled) degree of freedom. Moreover, it tested and confirmed the perfect correlation used in the derivation of Born's rule. That is, it showed that Schmidt partners—states that belong to different Hilbert spaces but are linked to one another via tensor product in a Schmidt decomposition—are detected together upon measurement. Bhattacharyya coefficients were again used in testing and the accuracies of well over 99% were attained.

The advent of quantum computers has allowed theorists to act as experimenters. A pioneering example is the test of envariance using 5 qubits of IBM's Quantum Experience carried out by Deffner (2016) [114]. In addition to testing envariance on entangled pairs, he has also investigated larger GHZ-like states with up to five entangled qubits. The accuracy to which envariance holds decreased with the increasing number of qubits from ∼95% for pairs to ∼75% for quintuples. Final measurement was again done on individual spins. Clearly, quantum computers are at this point no match for serious laboratory experiments, but as they improve, even small quantum devices may be useful as tests of fundamental physics.

While present-day quantum computers are imperfect, recent progress has been relatively rapid. It therefore seems likely that much more accurate implementations of strategies that test this fundamental symmetry, as well as more advanced tests (needed to verify "fine-graining" in the case of unequal absolute values of the coefficients), should be within reach soon. A possible circuit that can be used to verify envariance is illustrated in Figure 3A.


**Figure 3. Testing envariance:** A. "Quantum agent" circuit test of envariance: Hadamard gates put agent A and the system S in a superposition, and the leftmost c-not creates an entangled SE state. The next c-not performs a swap on S (when A is in the state |1⟩) or does nothing (when A is in |0⟩). After the second c-not the SE pair is in a superposition of swapped and untouched, and the state of A is mixed. However, after the last gate performs a conditional counterswap on E, the entangled state of SE should be restored, and the state of A should disentangle from SE and become pure again. B. Testing envariance with an entangled photon pair and a Hong-Ou-Mandel [115] interferometer. The initial state is |0⟩|1⟩ − |1⟩|0⟩. As photons in the two arms of the interferometer are distinguishable (have "opposite" polarizations) they should not emerge together at the end of the interferometer. As a result, detectors should click in coincidence. However, after a swap *UA* = |0⟩⟨1| + |1⟩⟨0| on one of the photons changes the state into |0⟩|0⟩ − |1⟩|1⟩, both photons should emerge at the same output. This will suppress coincidence clicks of the two detectors. This swap can, of course, be undone by a counterswap *UB* = |0⟩⟨1| + |1⟩⟨0| on the second photon, restoring the initial state, and restoring coincidences between the two detectors. Partial rotations of polarization and phase shifts can probe envariance for more general transformations.

In the meantime, it may be useful to consider other experimental settings and other designs that require laboratory tests but that allow one to verify that (in the wake of a swap and a counterswap) nothing happens—that the global state is restored. A possible design of such a test that employs Hong-Ou-Mandel (HOM) interferometry (Hong, Ou, and Mandel, 1987 [115]; Milonni and Eberly, 2010 [116]) is shown in Figure 3B.

#### **4. Quantum Darwinism**

Objective existence in classical physics was an attribute of the state of a system. An unknown classical state could be measured and found out by many, and yet retain its identity—remain unchanged—even when observers were initially ignorant of what it was.

In contrast to classical states, quantum states are fragile—they cannot be, in general, found out without getting perturbed by the measurement. Objective existence in a quantum world emerges—we shall see—as a consequence of correlations between a system and its environment. It is not (as in classical physics) "sole responsibility" of the system.

Quantum states can, in effect, exist objectively—retain their identity and result in compatible records of independent measurements by many observers—provided observers measure only observables that commute with the preexisting state of the system. In the aftermath of decoherence, this means restriction to its pointer observable. In that case, observers will agree about the outcomes—their measurements will not invalidate one another and will not be erased by decoherence. Therefore, a consensus about the state based on independent measurements—the essence of objectivity—can be established. Such a consensus is the only operational requirement for the "objective existence of classical reality" in our quantum Universe. However, why should observers measure only pointer observables?

Quantum Darwinism provides a simple and natural explanation of this restriction, and, hence, of objective existence—bulwark of classicality—for the einselected states. Quantum Darwinism recognizes that the information we acquire about the Universe comes to us indirectly, through the evidence systems of interest deposited in their environments, and that the only states capable of depositing multiple copies—many quantum memes or qmemes—are the einselected pointer states. Observers access directly only the record made in a fraction of the environment—an imprint of the original state of S on a fragment F of E. There are usually multiple copies of that original (e.g., of this text) that are disseminated throughout E (e.g., by the photon environment—by the light scattered by a printed page or emitted by a computer screen). Observers can find out states of various systems indirectly and agree about their findings because correlations of S with E (which we quantify below using mutual information) allow E to be a *witness* to the pointer state of the system.

In this section we define mutual information, and use it to characterize the information that can be gained about S from E. Objectivity arises because of redundancy—the same information can be obtained independently by many observers from many fragments of E. So, in a sense, objectivity of an observable is quantified by the redundancy of its records—the number of its copies—in E.

The multiplicity of records of the pointer observable of S in E accounts for all the symptoms of the "wavepacket collapse" that are accessible via localized measurements of the environment: Observers who have detected a fragment of E that bears an imprint of the pointer observable of the system will—in the future—encounter only states of the rest of the environment fragments that convey a message consistent with the pointer state they have initially seen, and will be able to communicate only with others who have recorded the same pointer state (and are therefore on the same "branch"). Moreover, an observer who decides to verify the state of S (either by measuring it directly, or by intercepting additional fragments of E) will obtain data which confirm that S is in the pointer state that was revealed by the initial measurement on the fragment of E that was measured first.

The system itself is untouched—it is not measured directly. Observers acquire their information indirectly, from the qmemes in the environment that has "measured" the system while decohering it. What we find out about our quantum Universe is thus a consequence of decoherence (that restricts stable states of macroscopic systems to the einselected pointer states) and of quantum Darwinism (selective proliferation of the information about these pointer states). What we see looks classical—it is our familiar classical world: The environment communicates information about pointer states that were selected by decoherence.

We perceive our reality as classical because we are immersed in the information bearing halos of macroscopic systems—because our world consists of extantons, composite entities that combine the source of information (the pointer observable of the macroscopic object that resides in the extanton core) with the means of its delivery (information-bearing halo of, e.g., photons). We only pay attention to the message (the state of the extanton core) and take for granted—ignore—its means of delivery (extanton halo laden with data about the pointer state of the core).

Fragility of individual quantum states is no longer a problem. Observers will generally destroy the evidence (e.g., absorb photons in their retina) while acquiring information. However, there are now many copies of the same information—all imprinted with the data about the underlying state of the system. Therefore, even though evidence of the state of the system may be in part erased, consensus about it will emerge in the end, even as observers measure different fragments of the environment in ways that obliterate carriers of that information.

Last but not least, even when observers do not know what the pointer states of the system are, the environment does, and will let them know: Consensus between the evidence carried by different fragments of E emerges because these measurements contain redundant information only about the pointer states<sup>10</sup>.

Quantum Darwinism can be developed starting from the same assumptions as decoherence theory. Nevertheless, results of the two previous sections are essential when one aims to arrive at a consistent and comprehensive *quantum theory of the classical*. Derivation of the pointer states via repeatability in Section 2 allows us to anticipate preferred states capable of leaving multiple records in E with minimal assumptions that tie directly to the narrative of quantum Darwinism. Only states that are monitored by E without getting perturbed can survive long enough to deposit copious qmemes, their information-theoretic progeny, in the environment: The no-cloning theorem is not an obstacle when "cloning" involves not an unknown quantum state, but an einselected observable. The copies are then messages with the information about the pointer states. The environment becomes an *amplification channel*—a quantum communication channel that carries multiple qmemes of the classical information about the 'events' corresponding to the pointer states of S.

The inevitable price of the amplification of the preferred observable is the destruction of the information about the complementary observables and about the initial superposition of the pointer states of S. A single copy of that state is diluted in all of E, so quantum information can be obtained only through global measurements that are inaccessible to local observers. Thus, a quantum environment can transmit only classical information about the pointer observable of S. Or, more precisely, the ability to spawn and disseminate qmemes endows pointer states with all the prerogatives of objective classical existence. Such a preferred observable is the "fittest" in the Darwinian sense—the original pointer state

<sup>10</sup> It has been remarked (Healey, 2012) [117] that, in the quantum Darwinist approach to objective existence "One is reminded of Wittgenstein's remark in his *Philosophical Investigations*: *As if someone were to buy several copies of the morning paper to assure himself that what it said was true*". The difference with Wittgenstein's reader of classical newspapers is, of course, that an observer in a quantum Universe does not get a newspaper with a robust, objectively existing content. What an observer will get out of a single copy of the hypothetical quantum newspaper (or a fragment of E) is a combination of what was imprinted on it (by the system) and quantum randomness—by Born's rule and by how the state of that F collapses when "read" (measured). Therefore, in general, different copies of the same "quantum paper" will "collapse" into distinct messages (because the state of the copy will "collapse" when measured, and also because different observables may be used to read them), even when the underlying state of each copy was the same. Objective reality emerges—the news about the state of the system can be ascertained—only from the consensus between many copies that reveal the "true" state of S. It can be confirmed by additional measurements (on the environment or even directly on the system). That information will also agree with what other observers have found out about S. In practice, we do not need to search for such consensus between different choices of observables: Our eyes have evolved to measure photons that are imprinted with and can only reveal localized states of systems.

has survived evolutionary pressures of its environment and has spawned copious qmemes, information-theoretic offspring, advertising its objective existence.

To make these ideas rigorous we shall calculate the number of copies of S in E. To do that we will compute entropies of S, E, and various fragments F of E (see Figure 4) using reduced density matrices and relevant probabilities. It is therefore fortunate that, in Section 3, we have already derived Born's rule from the symmetries of entanglement. This gives us the right to employ the usual tools of decoherence—trace and reduced density matrices interpreted as statistical entities—to compute entropy.

**Figure 4. Quantum Darwinism and the structure of the environment**. The decoherence paradigm distinguishes between a system (S) and its environment (E) as in (**a**), but makes no further recognition of the structure of E; it could as well be monolithic. In the environment-as-a-witness paradigm, we recognize subdivision of E into subenvironments—its natural subsystems, as in (**b**). The only essential requirement for a subsystem is that it should be individually accessible to measurements; observables corresponding to different subenvironments commute. To obtain information about the system S from its environment E one can then carry out measurements of *fragments* F of the environment—non-overlapping collections of its subsystems. Sufficiently large fragments of the environment that has monitored (and, therefore, decohered) S can often provide enough information to infer the state of S, by combining subenvironments as in (**c**). There are then many copies of the information about S in E, which implies that information about the "fittest" observables that survived monitoring by E has proliferated throughout E, leaving multiple qmemes, their (quantum) informational offspring. This proliferation of the information about the fittest (pointer) states defines quantum Darwinism. Multiple copies allow many observers to find out the state of S: Environment becomes a reliable witness with redundant copies of information about preferred observables, which accounts for the objective existence of preferred pointer states.

#### *4.1. Mutual Information, Redundancy, and Discord*

Quantities that play a key role in quantum Darwinism are often expressed in terms of the von Neumann entropy:

$$H(\rho) = -\mathrm{Tr}\,\rho \lg \rho \,. \tag{4.1}$$

The density matrix *ρ* can describe the state of just one, or of a collection (ensemble) of many quantum systems.
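In numerical work, Equation (4.1) is evaluated from the eigenvalues of *ρ* (a small illustrative sketch; the base-2 logarithm yields entropy in bits, and the function name is our own choice):

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -Tr rho lg rho, computed from the eigenvalues (base-2 log, i.e., bits)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 lg 0 = 0 by convention
    return float(-np.sum(evals * np.log2(evals)))

pure = np.array([[1.0, 0.0], [0.0, 0.0]])   # a pure state: H = 0
mixed = np.eye(2) / 2                       # maximally mixed qubit: H = 1 bit
```

The two test states bracket the possibilities for a single qubit: zero entropy for a pure state, one bit for the maximally mixed one.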

As was done (since at least Laplace) classically, probabilities underlying quantum von Neumann entropy can be regarded as a measure of ignorance<sup>11</sup>. However, the density matrix *ρ* provides more than just its eigenvalues. It is an operator—it has eigenstates. One may be tempted to add that *ρ* also determines what one is ignorant of: This is generally not the case. An observer can be interested in an observable whose eigenstates do not diagonalize *ρ*. Indeed, as Section 2 demonstrated, even the einselected pointer states do not always diagonalize *ρ*S. Thus, states that are predictable (because of their stability) and therefore useful may not coincide with instantaneous eigenstates of the reduced density matrix.

What is the ignorance of someone interested in an observable with eigenstates {|*πk*⟩} that differ from the eigenstates of *ρ*? The corresponding entropy is then the Shannon entropy given by:

$$H(p\_k) = -\sum\_k p\_k \lg p\_k \,, \tag{4.2}$$

where the probabilities of {|*πk*⟩} are:

$$p\_k = \text{Tr}\langle \pi\_k | \rho | \pi\_k \rangle \,. \tag{4.3a}$$

As noted above, {|*πk*⟩} may be pointer states (indeed, we shall adopt this notation for the pointer states in this section). They will be (almost) as good in diagonalizing the reduced density matrix of the system as its eigenstates after decoherence. As was noted in Section 3, it is only then that one can associate the usual interpretation of probabilities with pointer states. Otherwise additivity of probabilities may be in danger, as consistent histories approach (Griffiths, 1984 [118]; 2002 [119]; Gell-Mann and Hartle, 1990 [120], 1993 [121]; Omnès 1992 [122]; Griffiths and Omnès, 1999 [123]) makes especially clear.

Note that above—in Equations (4.1) and (4.2)—we have used the same *H* to denote both von Neumann and Shannon entropy. Sometimes different letters (e.g., *S* and *H*) are used for this purpose. We will not do that because, to begin with, S is used to denote the system. Moreover, immediately below we shall consider entropies that are in a sense partly von Neumann and partly Shannon. Last but not least, throughout most of this section our focus will be on von Neumann entropy and on the corresponding mutual information.

#### 4.1.1. Mutual Information

Mutual information will help us find out how much a fragment of the environment knows about the system, and what it knows. It is the difference between the entropies of two systems treated separately and jointly:

$$I(\mathcal{S}:\mathcal{A}) = H(\mathcal{S}) + H(\mathcal{A}) - H(\mathcal{S}, \mathcal{A})\,. \tag{4.4}$$

Mutual information *I*(S : A) quantifies how much S and A know about one another.
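For a maximally entangled pair of qubits, Equation (4.4) gives *I* = 2 bits, twice the classical maximum for two bits. A sketch (the helper names are our own; the entropy routine repeats the eigenvalue recipe above):

```python
import numpy as np

def H(rho):
    """von Neumann entropy in bits, from the eigenvalues of rho."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def reduced(rho_sa, keep):
    """Partial trace of a two-qubit density matrix; keep=0 returns rho_S, keep=1 returns rho_A."""
    r = rho_sa.reshape(2, 2, 2, 2)   # indices (s, a, s', a')
    return np.trace(r, axis1=1, axis2=3) if keep == 0 else np.trace(r, axis1=0, axis2=2)

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
rho_SA = np.outer(bell, bell.conj())

# I(S:A) = H(S) + H(A) - H(S,A) = 1 + 1 - 0 = 2 bits
I_SA = H(reduced(rho_SA, 0)) + H(reduced(rho_SA, 1)) - H(rho_SA)
```

The joint state is pure (H(S, A) = 0) while each reduced state is maximally mixed, which is what pushes the mutual information to the quantum maximum of 2 bits.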

<sup>11</sup> We are not "backtracking" here—probabilities in the quantum theory are based on objective quantum symmetry—envariance: Even in the case when global state of SE is pure, von Neumann entropy of S quantifies ignorance associated with the probabilities of the eigenstates of its reduced density matrix. The difference between the quantum case and the classical "Laplacean" approach is clear: Subjective assessment of likelihood can be, in quantum setting, replaced by the objective symmetries of the global state.

For classical systems the above definition of *I*(S : A) is equivalent to the definition of mutual information that employs conditional entropy (e.g., *H*(S|A)). Conditional entropy quantifies the ignorance about S left after the state of A is found out:

$$H(\mathcal{S}, \mathcal{A}) = H(\mathcal{A}) + H(\mathcal{S}|\mathcal{A})\,. \tag{4.5a}$$

A similar formula reverses the roles of the two parties:

$$H(\mathcal{S}, \mathcal{A}) = H(\mathcal{S}) + H(\mathcal{A}|\mathcal{S})\,. \tag{4.5b}$$

In classical settings, when states can be characterized by probability distributions, one can simply substitute either of the Equations (4.5) for *H*(S, A) in Equation (4.4) and obtain an equivalent expression for mutual information. In quantum physics "knowing" is not as innocent—it requires performing a measurement that in general alters the joint density matrix into an outcome-dependent *conditional density matrix* describing the state of the system given the measurement outcome—e.g., given the state |*Ak*⟩ of the apparatus A:

$$
\rho\_{\mathcal{S}|A\_k} = \langle A\_k|\rho\_{\mathcal{SA}}|A\_k\rangle / p\_k \,. \tag{4.6}
$$

Above, in accord with Equation (4.3a),

$$p\_k = \text{Tr}\langle A\_k | \rho\_{S\mathcal{A}} | A\_k \rangle \,. \tag{4.3b}$$

Given the outcome |*Ak*⟩, the conditional entropy is:

$$H(\mathcal{S}|A\_k) = -\mathrm{Tr}\,\rho\_{\mathcal{S}|A\_k} \lg \rho\_{\mathcal{S}|A\_k} \,, \tag{4.7}$$

which leads to the average conditional entropy:

$$H(\mathcal{S}|\{|A\_k\rangle\}) = \sum\_k p\_k H(\mathcal{S}|A\_k)\,. \tag{4.8}$$

This is how much information about S one expects will still be missing after a measurement of an observable with the eigenstates {|*Ak*⟩} on A. Note that, as in the discussion of probabilities and envariance, this estimate of the expected missing information is relevant for a "bystander"—someone who knows what was measured, but does not yet know the result. (An observer who knows the outcome would use Equation (4.7) instead.) An average over all the outcomes, Equation (4.8), gives the expected remaining ignorance about S as long as one does not know the outcome.
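Equations (4.6)–(4.8) can be illustrated with a perfectly correlated SA pair: every conditional state is then pure, so the average conditional entropy vanishes even though a full bit of prior ignorance about S remains. A sketch (assuming S and A are single qubits and A is read out in its computational basis):

```python
import numpy as np

def H(rho):
    """von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_SA = np.outer(bell, bell.conj()).reshape(2, 2, 2, 2)   # indices (s, a, s', a')

basis_A = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]  # {|A_k>}

avg_cond = 0.0
for b in basis_A:
    block = np.einsum('a,satb,b->st', b.conj(), rho_SA, b)  # <A_k|rho_SA|A_k>, numerator of Eq. (4.6)
    p_k = float(np.real(np.trace(block)))                   # Eq. (4.3b)
    if p_k > 1e-12:
        avg_cond += p_k * H(block / p_k)                    # Eqs. (4.7) and (4.8)

rho_S = np.trace(rho_SA, axis1=1, axis2=3)
# H(S) = 1 bit of prior ignorance, yet H(S|{|A_k>}) = 0: the outcome on A pins down S
```

The contrast between `H(rho_S)` and `avg_cond` is exactly the asymmetric mutual information *J* of Equation (4.12a) below, here equal to one full bit.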

Once the observer perceives the outcome, the relevant state of S (and the corresponding relevant entropy) will be given by Equations (4.6) and (4.7). This is the infamous "collapse"—the range of possibilities is reduced to a single actuality. We note that even for a bystander—observer's friend who knows that the measurement has already happened, but who has not yet found out the outcome<sup>12</sup>—the joint state of SA is usually affected, as the reconstituted density matrix ∑*k pk*|*Ak*⟩⟨*Ak*| ⊗ *ρ*S|*Ak* differs in general from the premeasurement *ρ*SA. In particular, the entropy of the reconstituted mixed state is usually larger than the entropy of the pre-measurement *ρ*SA.

This increase of entropy is characteristically quantum. It was pointed out already by von Neumann (1932) [4]. Decoherence explains it as an inevitable consequence of the correlations with the environment E that "monitors" A. From the point of view of the bystander, correlations of A with the environment or with the observer have a similar effect—they can invalidate some of the information the bystander had about SA, and hence increase entropy.

<sup>12</sup> That is, for whom the possibilities have not yet collapsed to a single actuality, but who knows the apparatus A has already passed on the information to someone else (via a measurement) or to the environment (as a consequence of decoherence).

The entropy in Equation (4.8) can be viewed as half von Neumann, half Shannon: It involves (quantum) conditional density matrices as well as (effectively classical) probabilities of outcomes. Given *ρ*SA, one can also address a more specific question, e.g., how much information about a specific observable of S (characterized by its eigenstates {|*sj*⟩}) will still be missing after a specific observable with the eigenstates {|*Ak*⟩} of A is measured. This can be answered by using *ρ*SA to compute the joint probability distribution:

$$p(s\_j, A\_k) = \langle s\_j, A\_k | \rho\_{\mathcal{SA}} | s\_j, A\_k \rangle \,. \tag{4.9}$$

These joint probabilities are in effect classical. They can be used to calculate Shannon joint entropy for any two observables (one in S, the other in A), as well as entropy of each of these observables separately, and to obtain the corresponding (Shannon) mutual information:

$$I(\{|s\_j\rangle\} : \{|A\_k\rangle\}) = H(\{|s\_j\rangle\}) + H(\{|A\_k\rangle\}) - H(\{|s\_j\rangle\}, \{|A\_k\rangle\}) \,. \tag{4.10}$$

We shall find uses for all of these variants of mutual information. The von Neumann entropy based *I*(S : A), Equation (4.4), answers the question "how much do the systems know about each other", while the Shannon version immediately above quantifies mutual information between two specific observables. The Shannon version is (by definition) basis-dependent. It is straightforward to see that, for the same underlying joint density matrix:

$$I(\mathcal{S}:\mathcal{A}) \ge I(\{|s\_j\rangle\}: \{|A\_k\rangle\})\,. \tag{4.11}$$

Equality is attained only for a special choice of the two measured observables, and only when the eigenstates of *ρ*SA are direct products |*sk*⟩|*Ak*⟩ of the orthogonal sets of states {|*sk*⟩} and {|*Ak*⟩} of S and A. In that case correlations between S and A can be regarded as completely classical.
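Inequality (4.11) is easy to check numerically: for a Bell state the von Neumann mutual information is 2 bits, while the Shannon mutual information of Equation (4.10) for any pair of local observables cannot exceed 1 bit. A sketch (measuring both qubits in the computational basis; the 2-bit figure on the von Neumann side follows from H(S) = H(A) = 1 and H(S, A) = 0 for this pure state):

```python
import numpy as np

def shannon(p):
    """Shannon entropy in bits of a probability vector, skipping zero entries."""
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(bell, bell.conj())

# Joint probabilities p(s_j, A_k) of Eq. (4.9) in the computational product basis.
p_joint = np.real(np.diag(rho)).reshape(2, 2)   # p(0,0) = p(1,1) = 1/2

p_s = p_joint.sum(axis=1)   # marginal for S
p_a = p_joint.sum(axis=0)   # marginal for A

I_shannon = shannon(p_s) + shannon(p_a) - shannon(p_joint.ravel())   # 1 bit
I_vonNeumann = 2.0   # H(S) + H(A) - H(S,A) for the pure Bell state
```

The 1-bit gap is the extra, globally stored quantum information that no pair of local measurements can extract, consistent with Inequality (4.11).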

With the help of Equation (4.8) one can define "half way" (Shannon—von Neumann) mutual informations that presume a specific measurement on one of the two systems (e.g., A), but make no such commitment for the other one. For instance,

$$J(\mathcal{S}: \{ |A\_k\rangle \}) = H(\mathcal{S}) - H(\mathcal{S}|\{ |A\_k\rangle \}) \tag{4.12a}$$

would be one way to express mutual information defined "asymmetrically" in this way. A corresponding formula,

$$J(\mathcal{A}:\{|s\_k\rangle\}) = H(\mathcal{A}) - H(\mathcal{A}|\{|s\_k\rangle\}) \tag{4.12b}$$

is relevant when S is measured in the basis {|*sk*⟩}.

#### 4.1.2. Quantum Discord

Quantum discord is the difference between the mutual information defined using the symmetric von Neumann formula, Equation (4.4), and one of the half-way Shannon—von Neumann versions [124–126]. For example:

$$\mathcal{D}(\mathcal{S}; \{ |A\_k\rangle \}) = I(\mathcal{S}:\mathcal{A}) - J(\mathcal{S}:\{ |A\_k\rangle \}) \,. \tag{4.13a}$$

Discord is a measure of how much information about the two systems is inaccessible locally—how much of the globally accessible mutual information is lost when one attempts to find out the state of SA starting with a local measurement on A with outcomes {|*Ak*⟩}.

Discord is asymmetric and basis-dependent, as information gain about S depends on what gets measured on A. The least discord (corresponding to optimal {|*Ak*⟩}):

$$\mathcal{D}(\mathcal{S}; \mathcal{A}) = \min\_{\{|A\_k\rangle\}} \mathcal{D}(\mathcal{S}; \{|A\_k\rangle\}) \tag{4.13b}$$

disappears iff *ρ*SA commutes with *A* = ∑*k αk*|*Ak*⟩⟨*Ak*|:

$$
\left[ \rho\_{\mathcal{SA}}, A \right] = 0. \tag{4.14a}
$$

When that happens, the quantum correlation is classically accessible from A. This can be assured iff:

$$[\rho\_{\mathcal{S},\mathcal{A}}, \rho\_{\mathcal{A}}] = 0.\tag{4.14b}$$

Decoherence of A that einselects the preferred pointer basis {|*Ak*⟩} will evolve *ρ*SA to where Equations (4.14) are satisfied to a good approximation [126].

When the composite system is classical, so that its state can be found out without disturbing it and can be—prior to measurements—characterized by a probability distribution that is independent of the measuring process, the symmetric *I* (defined using joint entropy, Equation (4.4)) and the asymmetric *J* (defined using conditional entropy, Equation (4.12)) coincide. The proof (see, e.g., Cover and Thomas, 1991) [127] relies on Bayes' rule relating conditional and joint probabilities. Thus, non-vanishing discord signifies breakdown of Bayes' rule in quantum physics.

We emphasize that the asymmetric "half way" (Shannon—von Neumann) mutual information *J* is indeed asymmetric—it depends on whether measurements are carried out on A or S. In the classical case the asymmetric-looking definition of *J* results, courtesy of Bayes' rule, in a symmetric mutual information: *J*(S : A) = *I*(S : A) = *J*(A : S), so the choice does not matter. In the quantum case, in general, *J*(S : A) ≠ *J*(A : S).

As a consequence of asymmetry between the system that is measured and its partner whose state is inferred indirectly, based on the outcome of that measurement, it is possible to have correlations that are classically accessible only "from one end" [128]. For instance:

$$\rho\_{S\mathcal{A}} = \frac{1}{2} (|\uparrow\rangle\langle\uparrow| |A\_{\uparrow}\rangle\langle A\_{\uparrow}| + |\nearrow\rangle\langle\nearrow| |A\_{\nearrow}\rangle\langle A\_{\nearrow}|)$$

will be classically accessible through a measurement on A with orthogonal record states {|*A*↑⟩, |*A*↗⟩} (i.e., when ⟨*A*↑|*A*↗⟩ = 0), but classically inaccessible to any measurement on S when ⟨↑|↗⟩ ≠ 0. Indeed, the original motivation for introducing discord was the observation that decoherence of the apparatus makes the correlations accessible from A [124].

Minimization used in Equation (4.13b) raises an obvious question: Could one do better with positive operator valued measures (POVMs) rather than Hermitian observables with orthogonal eigenstates? The answer is, unsurprisingly, "Yes". In the case of POVMs {*π<sub>k</sub>*} the asymmetric mutual information coincides with the familiar Holevo quantity *χ* (Holevo, 1973 [129]):

$$\chi(\rho\_{\mathcal{A}}) = \max\_{\{\pi\_k\}} \left( H(\rho\_{\mathcal{A}}) - \sum\_k p\_k H(\mathcal{A}|\pi\_k) \right), \tag{4.12c}$$

so that the minimum discord can be now expressed as:

$$\mathcal{D}(\mathcal{S}; \mathcal{A}) = I(\mathcal{S} : \mathcal{A}) - \chi(\rho\_{\mathcal{A}}).\tag{4.13c}$$

The Holevo quantity bounds the capacity of quantum channels to carry classical information. This role of Holevo's *χ* fits naturally into the discussion of quantum Darwinism, where fragments F of the environment play the role of quantum channels transmitting information about S as it is being decohered by E. In particular, Zwolak and Zurek (2013) [130] point out that *χ* and D—the classical and quantum information transmitted by this channel—are complementary, while Touil et al. (2022) [131] show that the Holevo bound is a reasonable estimate of the information about S that can be accessed by an optimal measurement of F.

Quantum discord has become an active area of research following indications that the "quantumness" it defines may play a fundamental role in the operation of quantum thermodynamic demons ([128], also Brodutch and Terno, 2010 [132]), in defining completely positive maps (Rodríguez-Rosario et al., 2008) [133], and, especially, in quantum information processing (Datta, Shaji, and Caves, 2008) [134] as well as quantum communication (Piani, Horodecki and Horodecki, 2008 [135]; Luo and Sun, 2010 [136]; Gu et al., 2012 [137]; Dakic et al., 2012 [138]). Our brief introduction to discord is clearly incomplete (see, however, Modi et al., 2012 [139] and Bera et al., 2018 [140] for reviews), but it will suffice for our purpose.

Questions we shall analyze using mutual information and discord concern (i) the redundancy of information (e.g., how many copies of the record about S the environment holds), and (ii) what this information is about (that is, what observables of the system are recorded in the environment with large redundancy).

Objectivity appears as a consequence of large redundancy. In the limit of large redundancy the precise value of redundancy has as little physical significance as the precise number of atoms has for the thermodynamic properties of a large system. So, it will often be enough to show that redundancy is large (rather than to calculate exactly what it is). Discord will turn out to be a measure of the unattainable quantum information that cannot be extracted by local measurements from the fragments of the environment.

One might be concerned that having different measures—different mutual informations—could be a problem, as they could lead to contradictory answers. In practice this never becomes a serious issue, for two related reasons: There is usually a well-defined pointer observable that obviously minimizes discord, so the various possible definitions of mutual information tend to agree where it matters. Moreover, the effect we are investigating—quantum Darwinism—is not subtle: We shall see that there are usually many copies of pointer states of S in E, so (as discussed by Touil et al. (2022) [131] and Zwolak (2022) [141]) the discrepancy between redundancies computed using different methodologies—differences between the numbers of copies defined through different measures—is irrelevant.

#### 4.1.3. Evidence and Its Redundancy

We study a quantum system S interacting with a composite quantum environment E = E1 ⊗ E2 ⊗···⊗ EN. The question we consider concerns the information one can obtain about S from a fragment F of the environment E consisting of several of its subsystems (see Figure 4). To be more specific, we partition E into non-overlapping fragments. Redundancy of the record is then defined as the number of disjoint fragments each of which can supply sufficiently complete information (i.e., all but the *information deficit δ* < 1) about S:

$$I(\mathcal{S}:\mathcal{F}\_{\delta}) \ge (1-\delta)H\_{\mathcal{S}}\ . \tag{4.15}$$

A small information deficit, *δ* ≪ 1, implies that nearly all of the classically available information can be obtained from F*δ*. This will not always be the case, and *δ* ≪ 1 is not a condition for effectively classical behavior or even for agreement between different observers<sup>13</sup>.

We now define redundancy as the number of fragments that can independently supply almost all—all but *δ*—of the missing information *H*S about the system:

$$R\_{\delta} = \frac{1}{f\_{\delta}} \,. \tag{4.16}$$
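Equations (4.15) and (4.16) translate directly into a procedure: scan fragments of increasing size, find the smallest one satisfying (4.15), and invert its fractional size. The sketch below (our toy model, not from the text) does this for a qubit S whose pointer states are imperfectly recorded, with overlap *c*, by each of *N* environment qubits; the closed-form entropies used here follow from the fact that, in this equal-weight two-branch model, every relevant reduced state has eigenvalues (1 ± *c*<sup>*n*</sup>)/2:

```python
import numpy as np

def h2(x):
    """Binary entropy (bits) of the distribution {x, 1 - x}."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

N, c, delta = 16, 0.5, 0.1       # environment size, record overlap, info deficit
H_S = h2((1 + c**N) / 2)         # entropy of S decohered by all of E

def I_SF(m):
    """I(S:F) for a fragment of m qubits in the equal-weight branching model:
    I = H_S + H_F - H_SF, with H_SF = H_{E/F} since SE is pure."""
    return H_S + h2((1 + c**m) / 2) - h2((1 + c**(N - m)) / 2)

# Smallest fragment satisfying (4.15), and the resulting redundancy (4.16):
m_delta = next(m for m in range(1, N + 1) if I_SF(m) >= (1 - delta) * H_S)
R_delta = N / m_delta            # = 1 / f_delta
```

With these (hypothetical) parameters a two-qubit fragment already supplies 90% of *H*S, so the environment holds eight independent records.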

Definition of redundancy can be illustrated using partial information plots that show the dependence of the mutual information on the size of the intercepted fraction of the

<sup>13</sup> In some situations—e.g., where information about a text is concerned—this demand is certainly justified, and *δ* ≪ 1 assures distinguishability of the letters. However, there are situations (e.g., astronomical observations) where accessing a larger fraction of the relevant environment (e.g., using a telescope to collect more light emitted by the Andromeda Nebula) provides more information, and this alone is conclusive evidence that *δ* is not always small. Observers generally recognize that, in such situations, disagreements between what they infer about the object of their indirect observations may be a result of the incompleteness of their data (see also Girolami et al., 2022 [142] for discussion of this issue).

environment: Redundancy is the length of the plateau of *I*(S : F) measured in the units set by the support of the initial, rising portion of the graph—the part starting at *I*(S : F) = 0 and ending when *I*(S : F*δ*) = (1 − *δ*)*H*S (see Figure 5). Thus, to estimate redundancy we will need to determine how much information about S one can get from a typical fragment F that contains a fraction

$$f\_{\delta} = \frac{{}^{\sharp}\mathcal{F}\_{\delta}}{{}^{\sharp}\mathcal{E}} = \frac{number \ of \ subsystems \ in \ \mathcal{F}\_{\delta}}{number \ of \ subsystems \ in \ \mathcal{E}}$$

of E. For this we need the dependence of the mutual information *I*(S : F) on *f* = <sup>♯</sup>F/<sup>♯</sup>E.

**Figure 5. Partial Information Plot (PIP) and redundancy** *Rδ* **of the information about a system** S **stored in its environment** E. When the global state of SE is pure, the mutual information that can be attributed to a typical fraction *f* of the environment is antisymmetric around *f* = 1/2 and monotonic in *f*. For pure states picked out at random from the combined Hilbert space HSE, there is very little mutual information between S and a typical fragment F smaller than about half of E. However, once the threshold fraction of 1/2 is attained, nearly all information is in principle at hand. Thus, such random states (green line above) exhibit only small redundancy. (Strictly speaking, redundancy is 2: The environment can be split into two halves, each supplying *H*S of information.) By contrast, states of SE created by decoherence (where the environment E monitors preferred observables of S) allow one to gain almost all (all but *δ*) of the information about S accessible through local measurements from a small fraction *fδ* = 1/*Rδ* of E. The corresponding PIP (red line above) quickly asymptotes to *H*S—the entropy of the system S (either preexisting or caused by decoherence)—which is all of the information about S available from measurements on either E or S. (More information about SE can be ascertained only through global measurements on S and a fragment F corresponding to more than half—indeed, nearly all—of E.) *H*S is therefore the *classically accessible information*. As (1 − *δ*)*H*S of information can be obtained from a fraction *fδ* = 1/*Rδ* of E, there are *Rδ* such fragments in E, and *Rδ* is the *redundancy* of the information about S. Large redundancy implies objectivity: The state of the system can be found out independently and indirectly (from fragments of E) by many observers, who will agree about it. In contrast to direct measurements, S will not be perturbed.
Thus, *quantum Darwinism accounts for the emergence of objective existence*.

Examples of partial information plots (or "PIPs") of the von Neumann mutual information for a pure composite system consisting of S and E are shown in Figure 5. The first observation is that these plots are antisymmetric around *f* = 1/2. This can be demonstrated<sup>14</sup> using elementary properties of the von Neumann mutual information and assuming that (i) F is typical, and (ii) the whole of SE is pure [143].

There is a striking difference between the character of PIPs for random pure states in the whole joint Hilbert space HSE = HS ⊗ HE and for states resulting from decoherence-like evolution: For a random state very little information obtains from fragments with *f* < 1/2. By contrast, for PIPs that result from decoherence, already a small fragment F of E (*f* ≪ 1) will often supply nearly all the information that can be obtained from less than almost all of SE.

The character of such decoherence-generated PIPs suggests dividing information into (i) easily accessible *classical information H*S that can be inferred (up to the information deficit *δ*) from local measurements—from any sufficiently large fragment F*δ* that is still small compared to half of E, and (ii) *quantum information* that is locally inaccessible, but is present, at least in principle, in the global observables of the whole of SE.

This shape of PIPs is a result of einselection: When there is a preferred observable in S that is monitored but not perturbed by the environment, the information about it is recorded over and over again by different subsystems of E. A combined state of S and E resulting from decoherence will have a *branching* structure:

$$\left| \Psi\_{\mathcal{S}\mathcal{E}} \right\rangle = \sum\_{k} e^{i\phi\_{k}} \sqrt{p\_{k}}\, |\pi\_{k}\rangle \left| \varepsilon\_{k}^{(1)} \right\rangle \left| \varepsilon\_{k}^{(2)} \right\rangle \dots \left| \varepsilon\_{k}^{(l)} \right\rangle \dots \tag{4.17}$$

with the pointer states of the system |*π<sub>k</sub>*⟩ at the base of each branch. Subsystems of E are correlated with these pointer states, but, individually, will often contain only poor (far from distinguishable) records of |*π<sub>k</sub>*⟩. Nevertheless, even when records contained in individual subsystems are insufficient—⟨*ε<sub>j</sub>*<sup>(*l*)</sup>|*ε<sub>k</sub>*<sup>(*l*)</sup>⟩ is nowhere near *δ<sub>jk</sub>*—sufficiently long fragments F of branches labelled by distinct pointer states |*π<sub>k</sub>*⟩ will be approximately orthogonal. As a result, nearly all of the easily accessible classical information can often be recovered from small fragments—a fraction of the environment.
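The way a branching state produces a classical plateau can be checked by brute force in a small model. The sketch below (our illustration, with assumed parameters: a qubit S, equal branch weights, *N* = 8 environment qubits, and imperfect records with overlap ⟨*e*0|*e*1⟩ = 1/2) builds the state (4.17) and computes the PIP of Figure 5 directly:

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def rdm_first(psi, n_keep, n_total):
    """Reduced density matrix of the first n_keep qubits of an n_total-qubit pure state."""
    m = psi.reshape(2**n_keep, 2**(n_total - n_keep))
    return m @ m.conj().T

def trace_first(rho):
    """Trace out the first qubit of a multi-qubit density matrix."""
    d = rho.shape[0] // 2
    return np.einsum('iaib->ab', rho.reshape(2, d, 2, d))

N = 8                                          # environment qubits
e0 = np.array([1.0, 0.0])                      # record of pointer state |0>
theta = np.pi / 3
e1 = np.array([np.cos(theta), np.sin(theta)])  # imperfect record, <e0|e1> = 1/2

# Branching state (4.17): (|0>|e0>...|e0> + |1>|e1>...|e1>)/sqrt(2)
branch0, branch1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
for _ in range(N):
    branch0 = np.kron(branch0, e0)
    branch1 = np.kron(branch1, e1)
psi = (branch0 + branch1) / np.sqrt(2)

H_S = vn_entropy(rdm_first(psi, 1, N + 1))
pip = []
for m in range(N + 1):                         # fragment F = first m env qubits
    rho_sf = rdm_first(psi, 1 + m, N + 1)
    pip.append(H_S + vn_entropy(trace_first(rho_sf)) - vn_entropy(rho_sf))
```

The resulting `pip` rises from zero, flattens at the classical plateau *I*(S : F) ≈ *H*S well before half the environment is captured, and climbs to the quantum peak 2*H*S only at *f* = 1, as described in the text.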

#### 4.1.4. Mutual Information, Pure Decoherence, and Branching States

In quantum Darwinism, fragment F plays the role of an apparatus or of a communication channel designed to access the same pointer observable that can survive intact in spite of the immersion of S in E. Decoherence singles out preferred observables of S. They are determined by the dynamics of decoherence, so—in the presence of fixed interaction Hamiltonians—they remain unchanged even as more and more copies of the information about the system are deposited in E. This is fortunate, as calculating mutual information is in general difficult. However, when decoherence is the only significant process—when we are dealing with *pure decoherence* that results in perfect branching states—calculations simplify [131,141,144–146].

Pure decoherence is defined by a system–environment Hamiltonian that commutes with the pointer states {|*π<sub>k</sub>*⟩} of S and does not directly correlate subsystems E*l* of the environment:

$$H\_{\mathcal{SE}} = \sum\_{k} \sum\_{l} \varsigma\_{kl} |\pi\_{k}\rangle \langle \pi\_{k}| \otimes \hat{\gamma}\_{\mathcal{E}\_{l}}.\tag{4.18a}$$

<sup>14</sup> For a pure SE, the joint entropy of SF must be the same as the entropy of E/F—the rest of the closed pure SE (where E/F is the remainder of the environment—E less F). Consequently, mutual information is given by:

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{S}} + H\_{\mathcal{F}} - H\_{\mathcal{S},\mathcal{F}} = H\_{\mathcal{S}} + [H\_{\mathcal{F}} - H\_{\mathcal{E}\_{/\mathcal{F}}}].$$

When we assume that the fragments of the environment are typical, the entropies in the term in the square brackets are a function of the fraction of the environment contained in the fragment F, so that *H*F − *H*E/F = *H*(*f*) − *H*(1 − *f*), which establishes the antisymmetry illustrated in Figure 5.

Above, the Hermitian operators *γ̂*<sub>E<sub>*l*</sub></sub> act on individual subsystems of E. The resulting evolution operator *U*SE(*t*) factors:

$$\mathcal{U}\_{\mathcal{SE}}(t) = e^{-iH\_{\mathcal{SE}}t/\hbar} = \mathcal{U}\_{\mathcal{SF}}(t)\,\mathcal{U}\_{\mathcal{SE}\_{/\mathcal{F}}}(t). \tag{4.18b}$$

This independence of the evolution of a fragment of the environment F from its remainder E/F will greatly simplify our calculations. Addition of a self-Hamiltonian *H*S that commutes with *H*SE would not affect our discussion. Similarly, one could add self-Hamiltonians of the environment subsystems. As long as *U*SE (*t*) factors as above, our conclusions will be valid.

We now consider the evolution of *ρ*SF starting from an initially uncorrelated state:

$$\rho\_{S\mathcal{E}}(0) = \rho\_{\mathcal{S}}(0)\rho\_{\mathcal{E}1}(0)\rho\_{\mathcal{E}2}(0)... $$

Given our assumptions, which include pure decoherence, the evolved state is given by:

$$
\rho\_{\mathcal{SF}}(t) = \mathcal{U}\_{\mathcal{SF}}(t)\,\rho\_{\mathcal{S}d\mathcal{E}\_{/\mathcal{F}}}(t)\,\rho\_{\mathcal{F}}(0)\,\mathcal{U}\_{\mathcal{SF}}(t)^{\dagger} \tag{4.19}
$$

where;

$$\rho\_{\mathcal{S}d\mathcal{E}\_{/\mathcal{F}}}(t) = \text{Tr}\_{\mathcal{E}\_{/\mathcal{F}}} \mathcal{U}\_{\mathcal{S}\mathcal{E}\_{/\mathcal{F}}}(t) \rho\_{\mathcal{S}}(0) \rho\_{\mathcal{E}\_{/\mathcal{F}}}(0) \mathcal{U}\_{\mathcal{S}\mathcal{E}\_{/\mathcal{F}}}(t)^\dagger$$

represents the state of S decohered by E/F , the remainder of the environment. The structure of Equation (4.19) implies that the joint entropy of the system S and the environment fragment F is given by:

$$H\_{\mathcal{S}\mathcal{F}}(t) = H\_{\mathcal{S}d\mathcal{E}\_{/\mathcal{F}}}(t) + H\_{\mathcal{F}}(0).$$

This identity is valid for branching states, Equation (4.17), resulting from pure decoherence. It allows us to write (Zurek, 2007 [49]; Zwolak, Quan, and Zurek, 2009 [144]; 2010 [145]):

$$I(\mathcal{S}:\mathcal{F}) = \overbrace{\left(H\_{\mathcal{F}} - H\_{\mathcal{F}}(0)\right)}^{\text{local/classical}} + \overbrace{\left(H\_{\mathcal{S}} - H\_{\mathcal{S}d\mathcal{E}\_{/\mathcal{F}}}\right)}^{\text{global/quantum}}. \tag{4.20}$$

The mutual information is given by the sum of two contributions, which (as indicated above) can often be regarded as, respectively, classical and quantum.

The first contribution, *H*F − *H*F (0), is the increase of the entropy of the fragment F. In our case of pure decoherence it is entirely due to the correlation with S—due to the information F acquires about S. This information about S can be accessed indirectly, by measuring F, and is available locally (hence, it is within reach of local observers). In the PIP representing decoherence in Figure 5, the increase of *H*F is responsible for the rapid rise of *I*(S : F) that starts at *f* = 0 and for its leveling off at the classical plateau at *H*S, which can happen already at *f* ≪ 1. This information about S is easily accessible because it has been widely disseminated—many independent fragments of E share it.

The second term, *H*S − *H*S*d*E/F, turns out (Zwolak, Quan, and Zurek, 2010) [145] to be the discord in the pointer basis of S. It is usually negligible when *f* < 1, but can become significant when *f* → 1 (F → E). Consequently, it represents information that can be obtained only via global measurements—measurements that involve nearly all of SE. When decoherence by the environment E is solely responsible for *H*S, one can rewrite *H*S − *H*S*d*E/F as *H*S*d*E − *H*S*d*E/F—as the difference between the decoherence caused by all of E and by its remainder, E/F. As long as the remainder E/F keeps S decohered, *H*S − *H*S*d*E/F = 0. Only when, with the increase of *f*, E/F becomes too small to effectively decohere S does *H*S*d*E/F begin to be substantially less than *H*S, eventually disappearing. In that limit of a vanishing remainder *H*S − *H*S*d*E/F → *H*S, and *I*(S : F) approaches 2*H*S for *f* → 1.
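The behavior of the two contributions to Equation (4.20) can be made explicit in a solvable special case. For an equal-weight qubit branching state with *N* identical environment qubits, each recording the pointer states with overlap *c*, every reduced state entering (4.20) is rank-2 with eigenvalues (1 ± *c*<sup>*n*</sup>)/2, where *n* counts the recording qubits involved; the following sketch (our toy model, assuming an initially pure environment, so *H*F(0) = 0) uses this closed form:

```python
import numpy as np

def h2(x):
    """Binary entropy (bits) of the distribution {x, 1 - x}."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

N, c = 8, 0.5                    # environment qubits; record overlap <e0|e1>
H_S = h2((1 + c**N) / 2)         # entropy of S decohered by all of E

# The two contributions to I(S:F) in Equation (4.20), for fragments of m qubits:
classical = [h2((1 + c**m) / 2) for m in range(N + 1)]            # H_F - H_F(0)
quantum = [H_S - h2((1 + c**(N - m)) / 2) for m in range(N + 1)]  # H_S - H_SdE/F
```

The classical part climbs quickly to the plateau at *H*S, while the quantum part remains negligible until nearly all of E is captured and then jumps to *H*S, reproducing the shape of the decoherence PIP in Figure 5.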

To sum up, the initial climb of *I*(S : F) to the classical plateau at *H*S is due to *H*F − *H*F (0). The final climb from that classical plateau to the quantum peak *I*(S : F) = 2*H*S happens when the remainder of the environment E/F is not enough to keep S decohered, so that *H*S*d*E/F falls below *H*S. This quantum value of *I*(S : F) = 2*H*S can be reached at *f* = 1, and only when SE, as a whole, is pure (so that *H*SE = 0, and *H*S = *H*E). This additional quantum information can be revealed only by measurements that have global entangled states of SE as outcomes.

In addition to the need for global measurements there is another reason that suggests that *H*S − *<sup>H</sup>*S*d*E/F represents quantum information. It is best illustrated by contrasting the behavior of *H*S − *<sup>H</sup>*S*d*E/F for *f* → 1 when the system starts in a mixture of pointer states (so that its entropy is *H*S already before it couples to E, and cannot increase any more due to decoherence by E) with the alternative—when S starts in a corresponding pure "Schrödinger cat" state (so its entropy is due to decoherence by E, *H*S = *H*S*d*E ).

In the case of initially pure E and mixed S the entropy *<sup>H</sup>*S*d*E/F remains unchanged— the system was pre-decohered—and *I*(S : F) levels off at the classical plateau even as *f* → 1—it can never exceed *H*S. However, when S is initially pure, *<sup>H</sup>*S*d*E/F disappears as F→E and E/F becomes too small to be an effective decoherer. In that case the additional information that can be in principle recovered from SE concerns phases between the pointer states (or, to put it differently, observables complementary to the pointer observable).

In the "opposite" case—when E is initially mixed but S is initially pure—the first term in Equation (4.20) vanishes, and *H*F (0) is already as large as *H*F can get, so that classical plateau disappears—the mutual information remains negligible until *f* is nearly 1. However, even now *H*S − *<sup>H</sup>*S*d*E/F eventually attains *H*S = *H*S*d*E , as *<sup>H</sup>*S*d*E/F vanishes when E/F shrinks with F→E. So, at the end of the PIP there is still a quantum peak, but now it rises not above the classical plateau at *I*(S : F) = *H*S, but, rather from the "sea level", *I*(S : F) = 0, so that at *f* = 1 mutual information reaches the peak value of only *H*S.

#### 4.1.5. Surplus Decoherence and Redundant Decoherence

In realistic situations, observers can intercept only a fraction of the environment. Thus, *f* < 1 (indeed, usually *f* ≪ 1). It is therefore often natural to assume that the remainder of the environment suffices to keep S decohered. This implies that *ρ*SF has eigenstates that are products of pointer states of S with some states of F (and not, e.g., entangled states of SF); indeed, in this case *ρ*SF has vanishing discord—it is classically accessible from S. This will be true provided that there is *surplus decoherence*, so that one does not need all of E to keep S decohered—the remainder E/F of the environment is enough.

The essence of surplus decoherence is easily traced (and closely tied) to the branching structure of the states of SE that we have already recognized as a consequence of branching states, Equation (4.17), and of pure decoherence, Equation (4.18). Surplus decoherence implies that the states of E/F, the remainder of E, are nearly orthogonal (so they can constitute nearly perfect records of pointer states of S). This guarantees that the same states that are selected by decoherence and diagonalize the decohered *ρ*S also help diagonalize *ρ*SF (i.e., [*ρ*S, *ρ*SF ] = 0, Equation (4.14)—the joint state *ρ*SF is classically accessible from S, and its discord disappears in the pointer basis). This assumption breaks down when the states of the remainder E/F correlated with the pointer states are no longer orthogonal, so that the eigenstates of *ρ*SF are entangled or discordant states of SF, and [*ρ*S, *ρ*SF ] ≠ 0. For an initially pure SE this will always eventually happen as *f* → 1, provided that S started in a superposition of pointer states.

Quantum Darwinism recognizes that situations when there are many copies of S inscribed in E are commonplace in our Universe. A single accurate copy in the environment suffices to decohere S. Thus, when there are many copies, one can expect not just "surplus decoherence" but a situation when S is decohered "many times over". This situation (which often turns out to be generic in our Universe) defines *redundant decoherence*.

Redundant decoherence may sound like an oxymoron—once coherence is lost from S, one might say, there is no way to decohere S even more. Indeed, redundant decoherence will have no additional effect on S; *ρ*S will remain diagonal in the pointer basis, and *H*S will no longer increase. However, it will turn out to be useful (e.g., in discussions of irreversibility) to appeal to the redundancy of decoherence *R<sub>δD</sub>*. Redundancy of decoherence is defined—by analogy with the redundancy of information about S in E, Equations (4.15) and (4.16)—by enquiring what typical fraction *f<sub>δD</sub>* of the environment suffices to decohere the system to the extent given by the *decoherence deficit δD*:

$$H\_{\mathbb{S}d\mathcal{F}\_{\delta\_D}} = (1 - \delta\_D) H\_{\mathbb{S}} = (1 - \delta\_D) H\_{\mathbb{S}d\mathcal{E}}.\tag{4.21}$$

A very small fragment of E, *f<sub>δD</sub>* ≪ 1, will often suffice. The redundancy of decoherence is then defined by:

$$R\_{\delta\_D} = \frac{1}{f\_{\delta\_D}},\tag{4.22}$$

and is at least as large as (and can be much larger than) the previously defined redundancy of information about S:

$$
\mathcal{R}\_{\delta\_{\mathcal{D}}} \ge \mathcal{R}\_{\delta}.\tag{4.23}
$$

The equality, *R<sub>δD</sub>* = *Rδ*, can be attained only when the environment is initially pure and its subsystems do not interact (so that all of the entropy in its fragments is due to the information it acquires about S). A mixed environment with interacting subsystems can still decohere S very effectively, but it is then generally more difficult to retrieve information about S, so in this case one can have *R<sub>δD</sub>* > *Rδ* (or even *R<sub>δD</sub>* ≫ *Rδ*).

#### 4.1.6. Information Gained by Pure and Mixed Environments

Further simplifications of Equation (4.20) are often possible. Thus, when the environment is initially pure, we get:

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{F}} + \left(H\_{\mathcal{S}} - H\_{\mathcal{S}d\mathcal{E}\_{/\mathcal{F}}}\right).$$

Moreover, when *H*S = *H*S*d*E/F (due to surplus decoherence):

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{F}} \tag{4.24a}$$

follows. This simple expression for the mutual information is valid for pure decoherence in an initially pure environment, for *f* starting at 0 and until F gets to be so large that decoherence by the (shrinking) remainder E/F of the environment is no longer effective and *H*S ≠ *H*S*d*E/F.

The opposite case of a completely mixed E yields:

$$I(\mathcal{S}:\mathcal{F}) = \left(H\_{\mathcal{S}} - H\_{\mathcal{S}d\mathcal{E}/\mathcal{F}}\right). \tag{4.25}$$

When S is initially pre-decohered in the pointer basis this implies *I*(S : F) = 0. However—and this may seem surprising—for an initially pure S the mutual information will still rise as *f* → 1, but now (with a completely mixed E) it will reach only *H*S (and not 2*H*S, as was the case for pure SE). This means that the quantum phase information is still "out there" and, at least in principle, could be recovered (see Ref. [147]), in spite of the completely mixed E.

One might be surprised that completely mixed environments can be effective decoherers. After all, decoherence is caused by the environment "finding out" about the system, and in a completely mixed environment there does not seem to be any place left to accommodate the data. The right way of thinking about this relies on the "Church of the Larger Hilbert Space" view of mixtures, and is very much in tune with the envariant derivation of probabilities and Born's rule in the preceding section. A mixed environment can be regarded as one half of an entangled pair (so that probabilities are due to the symmetries of the entangled state involving E and its purifying "doppelganger" E′ that we have explored using Schmidt decomposition in the envariant derivation of Born's rule). That entangled pair will acquire information about S. Thus, even when the environment E is completely mixed, it will decohere the system as if it were entangled with and "purified" by E′.

The last case (that will be relevant for some of the examples we are about to consider) assumes initially pure S and E, as well as redundant (or at least surplus) decoherence. In that case:

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{F}} = H\_{\mathcal{S}d\mathcal{F}}.\tag{4.24b}$$

This last equality follows from the Schmidt decomposition of SE with F and SE/F as subsystems. Under pure decoherence the reduced density matrix *ρ*F does not depend on whether the system is coupled to the remainder of the environment, E/F . Thus, *ρ*F would be the same if it evolved only in contact with isolated S. However, in that case the system would be decohered only by F, and *H*F = *H*S (by Schmidt decomposition), which establishes *H*F = *H*S*d*F .

This last equality allows one to use standard tools of decoherence to compute *H*F. The equality *I*(S : F) = *H*S*d*F follows from Equation (4.20) under the assumption of surplus decoherence, *H*S − *H*S*d*E/F ≈ 0.

#### 4.1.7. Environment as a Communication Channel

We conclude this section by noting that (with the few simplifications that were already justified) the way in which the information about S is transmitted by the fragment F of E is analogous to the transfer of the (classical) information about the pointer states through a quantum channel. The joint state of SF has the form:

$$
\rho\_{\mathcal{SF}} = \text{Tr}\_{\mathcal{E}/\mathcal{F}} \rho\_{\mathcal{SE}} \simeq \sum\_{k} p\_k |\pi\_k\rangle\langle\pi\_k| \rho\_{\mathcal{F}\_k} \,. \tag{4.26}
$$

Effectively classical pointer states |*π<sub>k</sub>*⟩⟨*π<sub>k</sub>*| are encoded in the quantum states *ρ*F*k* of the channel F. A theorem (due to Holevo, 1973 [129] and Schumacher, 1995 [148]) shows that the capacity of a quantum channel to carry classical information is bounded from above by:

$$\chi(\mathcal{F}:\{\pi\_k\}) = H\_{\mathcal{F}} - \sum\_k p\_k H(\mathcal{F}|\pi\_k) \,. \tag{4.27a}$$

When *π<sub>k</sub>* = |*π<sub>k</sub>*⟩⟨*π<sub>k</sub>*| are orthogonal projectors, *χ* coincides with the asymmetric mutual information *J*(F|{|*π<sub>k</sub>*⟩}), Equation (4.12). Generalization to the case where the *π<sub>k</sub>* are POVMs is possible. Quantum discord can then be bounded from below by:

$$\mathcal{D}(\mathcal{F}:\{\pi\_k\}) = I(\mathcal{F}:\mathcal{S}) - \chi(\mathcal{F}:\{\pi\_k\})\,. \tag{4.27b}$$

Discord is the mutual information that cannot be communicated classically.
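For pure channel states *ρ*F*k* = |*r<sub>k</sub>*⟩⟨*r<sub>k</sub>*| the second term of (4.27a) vanishes, so the Holevo quantity reduces to the entropy of the average channel state, which is easy to evaluate. A small sketch (our own illustration; the record states `r0`, `r1` are hypothetical) showing that overlapping records carry less than one bit, while perfectly distinguishable records saturate the single-bit bound:

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def holevo(ps, rhos):
    """Holevo quantity: chi = H(sum_k p_k rho_k) - sum_k p_k H(rho_k)."""
    avg = sum(p * r for p, r in zip(ps, rhos))
    return vn_entropy(avg) - sum(p * vn_entropy(r) for p, r in zip(ps, rhos))

theta = np.pi / 3
r0 = np.array([1.0, 0.0])                       # record of one pointer state
r1 = np.array([np.cos(theta), np.sin(theta)])   # overlapping record, <r0|r1> = 1/2
chi_overlap = holevo([0.5, 0.5], [np.outer(r0, r0), np.outer(r1, r1)])

o0, o1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # orthogonal records
chi_orth = holevo([0.5, 0.5], [np.outer(o0, o0), np.outer(o1, o1)])
```

The mixture of the two overlapping pure records has eigenvalues (1 ± 1/2)/2, so its *χ* is about 0.81 bits, strictly below the 1 bit transmitted by orthogonal records.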

One can rewrite the definition of quantum discord as:

$$I(\mathcal{F}:\mathcal{S}) = \chi(\mathcal{F}:\{\pi\_k\}) + \mathcal{D}(\mathcal{F}:\{\pi\_k\})\,. \tag{4.27c}$$

This conservation law (Zwolak and Zurek, 2013) [130] provides a new view of complementarity: The left hand side is fixed and basis-independent, while both terms on the right hand side depend on {*π<sub>k</sub>*}. The first term represents information about the observable *ς* that can be gained by measurements on F. This information is maximized for the pointer observable **Π**. Information inscribed in F about any other observable *ς* will be less. One can show [130] that the information about *ς* that can be obtained from F is:

$$\chi(\varsigma:\mathcal{F}) = H(\varsigma) - H(\varsigma|\mathbf{\Pi}) \,. \tag{4.28}$$

Above, *<sup>H</sup>*(*ς*|**Π**) is the conditional entropy, the information about *ς* still missing when **Π** is known. For instance, observables complementary to **Π** cannot be found out from F.

We can now understand why only the pointer observable (and, possibly, observables closely aligned with it) can be found out by intercepting a fraction of the environment. Redundant imprinting of an observable *ς* is possible only when there is a fragment F that is large enough so that the information about *ς* can be extracted from it:

$$
\chi(\varsigma:\mathcal{F}) \ge (1-\delta)H(\varsigma).\tag{4.29a}
$$

Using the earlier expression for *χ*(*ς* : F) one obtains an inequality *<sup>H</sup>*(*ς*|**Π**) ≤ *<sup>δ</sup>H*(*ς*). In other words, redundant information about observables that are so poorly correlated with the pointer observable that the conditional entropy satisfies the inequality:

$$H(\varsigma|\mathbf{\Pi}) > \delta H(\varsigma),\tag{4.29b}$$

cannot be obtained from the environment fragments [48,130,149]. We shall return to this subject below.
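Spelling out the step behind this criterion: substituting Equation (4.28) into the redundancy condition (4.29a) gives

$$H(\varsigma) - H(\varsigma|\mathbf{\Pi}) = \chi(\varsigma:\mathcal{F}) \ge (1-\delta)H(\varsigma) \quad \Longrightarrow \quad H(\varsigma|\mathbf{\Pi}) \le \delta H(\varsigma),$$

so Equation (4.29b) is simply the negation of this requirement: observables whose conditional entropy given the pointer observable exceeds *δH*(*ς*) cannot be redundantly imprinted.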

#### 4.1.8. Quantum Darwinism and Amplification Channels

The usual focus of communication theory is to optimize the channel capacity. Thus, in quantum communication theory one would consider messages (such as our *π<sub>k</sub>* above) sent with probabilities *p<sub>k</sub>*, one at a time. Capacity is defined in the limit of many uses of the channel. That is, in this setting one would be dealing with an ensemble described by a density matrix of the form:

$$
\rho\_{\mathcal{SF}}^{N} = \left(\sum\_{k} p\_k |\pi\_k\rangle\langle\pi\_k|\,\rho\_{\mathcal{F}\_k}\right)^{\otimes N}.\tag{4.30a}
$$

The theorems are established in the limit of N → ∞.

By contrast, in quantum Darwinism we are dealing with a state that can be approximately expressed as:

$$\rho\_{\mathcal{SF}} \simeq \sum\_{k} p\_k |\pi\_k\rangle\langle\pi\_k| \bigotimes\_{l=1}^{R\_\delta} \rho\_{\mathcal{F}\_k}^{(l)}.\tag{4.30b}$$

That is, the same message |*π<sub>k</sub>*⟩⟨*π<sub>k</sub>*| is inscribed over and over, ∼ *Rδ* times, in the environment as a whole. This is how its multiple copies can reach many observers.

This is also why an observer who consults one fragment of E and infers the state of the system from it will get data consistent with the first finding from the consecutive fragments. In view of this structure of the states of the environment, the nomenclature "amplification channel" is well justified: Shared quantum information becomes effectively classical, since quantum discord cannot be shared (Streltsov and Zurek, 2013) [150]. Quantum Darwinism implements amplification that provides the means of such sharing, shedding new light on the ubiquity of amplification in the transition from quantum to classical.

We shall now illustrate these insights in models of quantum Darwinism. We shall also investigate situations where some or even all of the simplifying assumptions employed above break down. We also note that the above discussion focused on the mutual information defined via von Neumann entropy, and thus, that it prepared us to answer the question about the amount of information that was deposited in, and can be extracted from F. This largely bypasses the question: What is this information about? One can anticipate that the answer is "pointer states". We have already produced evidence of this. We shall confirm and quantify it (by enquiring how much information one can extract from F about other observables) while discussing quantum Darwinism in specific models.

#### *4.2. Quantum Darwinism in Action*

Dissemination of information throughout the environment has not been analyzed until recently. Given the complexity of this process, it is no surprise that the number of results to date is still rather modest, but they have already led to new insights into the nature of the quantum-to-classical transition. The models discussed here show that: (i) *dynamics responsible for decoherence is capable of imprinting multiple copies of the information about the system in the environment*. Whether that environment can serve as a useful witness depends on the memory space it has available to store this information, and whether the information is stored unscrambled and unperturbed and is accessible to observers.

Quantum Darwinism will always lead to decoherence, but the reverse is not true: There are situations where the environment cannot store any information about S (e.g., because E is completely mixed to begin with) or where the information it stores becomes scrambled by the dynamics (so that it is effectively inaccessible to local observers).

So, the redundancy of records that is so central to quantum Darwinism is not necessarily implied by decoherence. Moreover: (ii) *redundancy can keep on increasing long after decoherence has completely decohered the system*: Copies of the einselected pointer observable can continue to be added—imprinted on E. As we have already noted, the redundancy of decoherence is at least as large as *Rδ*, and can be much larger: It is possible to have hugely redundant decoherence while *Rδ* remains negligible. However, typically, both redundancies will continue to increase as the system and the environment continue to interact.

Last but not least: (iii) *only the einselected pointer states can be redundantly recorded in* E. While multiple copies of the information about the preferred pointer observable are disseminated throughout E, only one copy of the complementary information is (at best) shared by all the subsystems of the environment, making it effectively inaccessible.

Using imperfect analogies with classical devices, one can say that the information flow from S to E acts as an amplifier for the pointer observable, and, simultaneously, as a shredder for the complementary observables, dispersing slivers of a single copy of the phase information in the correlations with many subsystems of the environment. All of these fragments would have to be brought together and coherently reassembled to recover the preexisting state of S. By contrast, many copies of the information about the pointer observable are readily available from the fragments of E.

In addition to these general characteristics of quantum Darwinism we shall see that realistic models—e.g., photon scattering—can lead to huge redundancies, and that environment that is partially mixed can still serve as an effective communication channel, allowing many observers independent access to the information about the preferred observable of the system.

#### 4.2.1. C-Nots and Qubits

The simplest model of quantum Darwinism is a rather contrived arrangement of many (*N*) target qubits that constitute subsystems of the environment, each interacting via a *controlled-not* ("c-not") with a single control qubit S. As time goes on, consecutive target qubits become imprinted with the state of the control S:

$$\left(a|0\rangle + b|1\rangle\right) \otimes |0\_{\mathfrak{e}\_1}\rangle \otimes |0\_{\mathfrak{e}\_2}\rangle \otimes \dots \otimes |0\_{\mathfrak{e}\_N}\rangle \implies$$

$$\implies \left(a|0\rangle \otimes |0\_{\mathfrak{e}\_1}\rangle + b|1\rangle \otimes |1\_{\mathfrak{e}\_1}\rangle\right) \otimes |0\_{\mathfrak{e}\_2}\rangle \otimes \dots \otimes |0\_{\mathfrak{e}\_N}\rangle \implies$$

$$\implies a|0\rangle \otimes |0\_{\mathfrak{e}\_1}\rangle \otimes \dots \otimes |0\_{\mathfrak{e}\_N}\rangle + b|1\rangle \otimes |1\_{\mathfrak{e}\_1}\rangle \otimes \dots \otimes |1\_{\mathfrak{e}\_N}\rangle.$$

It is evident that this pure decoherence dynamics is creating branching states with multiple records of the logical (as well as pointer) states {|0⟩, |1⟩} of the system in the environment. Mutual information between S and a subsystem E*k* can be easily computed. As the *k*'th c-not is carried out, *I*(S : E*k*) increases from 0 to:

$$I(\mathcal{S}:\mathcal{E}\_k) = H\_{\mathcal{S}} + H\_{\mathcal{E}\_k} - H\_{\mathcal{S},\mathcal{E}\_k} = -|a|^2 \lg |a|^2 - |b|^2 \lg |b|^2 \,.$$

Thus, each E*k* is a sufficiently large fragment of E to supply complete information about the pointer observable of S.
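This perfect correlation can be checked directly. Below is a minimal numerical sketch (the helper names and the amplitudes *a* = 0.6, *b* = 0.8 are illustrative, not from the text): it builds the branching state *a*|00...0⟩ + *b*|11...1⟩, traces out subsystems, and verifies that *I*(S : E*k*) equals *H*S for every environment qubit.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits (lg = log base 2)."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def reduced(state, keep, n):
    """Density matrix of the qubits listed in `keep`, tracing out the rest."""
    keep = sorted(keep)
    drop = [q for q in range(n) if q not in keep]
    psi = np.transpose(state.reshape([2] * n), keep + drop)
    psi = psi.reshape(2 ** len(keep), 2 ** len(drop))
    return psi @ psi.conj().T

# Branching state a|00...0> + b|11...1> produced by the c-nots;
# qubit 0 is the system S, qubits 1..N are the environment.
a, b, N = 0.6, 0.8, 4
state = np.zeros(2 ** (N + 1))
state[0], state[-1] = a, b

H_S = -a**2 * np.log2(a**2) - b**2 * np.log2(b**2)
for k in range(1, N + 1):
    # I(S:E_k) = H_S + H_Ek - H_{S,Ek}
    I = (entropy(reduced(state, [0], N + 1)) + entropy(reduced(state, [k], N + 1))
         - entropy(reduced(state, [0, k], N + 1)))
    assert abs(I - H_S) < 1e-9  # each environment qubit holds the full pointer information
```

Here the record in each qubit is perfect, so a fragment of a single subsystem already supplies all the pointer information.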

The very first c-not causes complete decoherence of S in its pointer basis {|0⟩, |1⟩}. This illustrates points (i)–(iii) above—the relation between the (surplus and redundant) decoherence and quantum Darwinism, the continued increase of redundancy well after coherence between pointer states is lost, and the special role of the pointer observable.

As each environment qubit is a perfect copy of the pointer states of S, redundancy *R* in this simple example is eventually given by the number of fragments that have complete information about S—that is, in this case, by the number of environment qubits, *R* = *N*. There is no need to define redundancy in a more sophisticated manner, using *δ*, when each environment subsystem contains a perfect copy of the pointer state: That need will arise only in the more realistic cases when the analogues of c-not's are imperfect. We also note that decoherence is redundant, as each environment qubit suffices to decohere the system, so the redundancy of decoherence is also given by *N*.

Partial information plots in our example would be trivial: *I*(S : F) jumps from 0 to the "classical" value given by *H*S = −|*a*|<sup>2</sup> lg |*a*|<sup>2</sup> − |*b*|<sup>2</sup> lg |*b*|<sup>2</sup> at *f* = 1/*N*, continues along the plateau at that level until *f* = 1 − 1/*N*, and eventually jumps up again to the quantum peak at twice the level of the classical plateau as the last qubit is included: The whole SE is still in a pure state, so when F = E, *H*S,F = *H*S,E = 0. However, this much information exists only in global entangled states, and is therefore accessible only through global measurements.
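The shape of this partial information plot can be verified with the same kind of computation. The sketch below (again with illustrative amplitudes and helper names of our own choosing) evaluates *I*(S : F) for fragments of every size, confirming the jump to *H*S at one qubit, the plateau, and the final jump to 2*H*S when the last qubit is included.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def reduced(state, keep, n):
    """Density matrix of the qubits listed in `keep`, tracing out the rest."""
    keep = sorted(keep)
    drop = [q for q in range(n) if q not in keep]
    psi = np.transpose(state.reshape([2] * n), keep + drop)
    psi = psi.reshape(2 ** len(keep), 2 ** len(drop))
    return psi @ psi.conj().T

a, b, N = 0.6, 0.8, 4                      # illustrative amplitudes, N environment qubits
state = np.zeros(2 ** (N + 1))
state[0], state[-1] = a, b                 # a|00...0> + b|11...1>, qubit 0 is S
H_S = entropy(reduced(state, [0], N + 1))

def I_SF(m):
    """I(S:F) for a fragment F made of the first m environment qubits."""
    F = list(range(1, m + 1))
    return (entropy(reduced(state, [0], N + 1)) + entropy(reduced(state, F, N + 1))
            - entropy(reduced(state, [0] + F, N + 1)))

pip = [I_SF(m) for m in range(N + 1)]
assert abs(pip[0]) < 1e-9                          # empty fragment: no information
assert all(abs(v - H_S) < 1e-9 for v in pip[1:N])  # classical plateau at H_S
assert abs(pip[N] - 2 * H_S) < 1e-9                # quantum peak: F = E, SE is pure
```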

The preferred pointer basis of the control S is of course its logical basis {|0⟩, |1⟩}. These pointer states are selected by the "construction" of the c-not's. They remain untouched by copying into consecutive environment subsystems E*k*. After decoherence takes place:

$$I(\mathcal{S}:\mathcal{F}) = I(\{ |0\rangle, |1\rangle \}:\mathcal{F})$$

for any fragment of the environment, as long as at least one subsystem of E correlated with S is left outside of F, which suffices to decohere S. When this is the case, the minimum quantum discord disappears:

$$\mathcal{D}(\mathcal{F}:\mathcal{S}) = I(\mathcal{S}:\mathcal{F}) - I(\{ |0\rangle, |1\rangle \}:\mathcal{F}) = 0,$$

and one can ascribe probabilities to correlated states of S and F in the pointer basis of S that are singled out by the c-not "dynamics". Discord would reappear only if all of E got included, as then *I*(S : F) = 2*H*S, twice the information of the classical plateau of the PIP. Thus, all of E is needed to detect coherence in S: When a single environment qubit is missing, it is impossible to tell if the initial state was a superposition or a mixture of |0⟩ and |1⟩ of S.

As soon as decoherence sets in, *H*S = *H*S,F for any fragment F that leaves enough of the rest of the environment E/F to einselect pointer states in S. Consequently:

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{F}}\,,$$

and *H*F = *H*S*d*F , illustrating Equations (4.20) and (4.24).

#### 4.2.2. Central Spin Decohered by Noninteracting Spins

A generalization of a model with c-not gates and qubits is a model with a central spin system interacting with the environment of many other spins. In effect, perfect c-not's discussed above become imperfect when a collection of spins interacts with the central spin system via Ising Hamiltonian:

$$\mathbf{H} = \sigma^z \sum\_{i} d\_i \sigma\_i^z \,. \tag{4.31a}$$

Above, *σ<sup>z</sup>* and *σ<sup>z</sup>i* act on the spin of the system and on the subsystems of the environment, respectively.

Several different versions of such models were studied as examples of quantum Darwinism ([131,143–146,149,151–157]). In this section, we focus on the steady state situation when the evolution results, at long times, in a PIP that is largely time-independent.

Hamiltonian of Equation (4.31a) provides an example of pure decoherence. It can imprint many copies of the preferred observable *σ<sup>z</sup>* onto the environment. Given the example of c-not's and qubits, this is no surprise.

Copies of the pointer states of S are, of course, no longer perfect: A single subsystem of the environment is typically no longer perfectly correlated with the system. It is therefore usually impossible for a single subsystem of E to supply all the information about S. Nevertheless, when the environment is sufficiently large, the asymptotic form of *I*(S : F) has—as a function of the size of the fragment F—the same character we have already encountered with c-not's: A steep rise (where, in accord with Equations (4.20) and (4.24), every bit of information stored in F reveals new information about S) followed by a plateau (where the information only confirms what is already known). This is clearly seen in Figure 6A: Only when the environment is too small to convincingly decohere the system do PIPs lack a plateau.

In a central spin model with a large, initially pure and receptive environment (so that there is a pronounced classical plateau) and for a small information deficit *δ*, the mutual information of a fragment F with ♯F finite-dimensional subsystems is approximately [152]:

$$I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{S}} - \frac{1}{2} (e^{H\_{\mathcal{S}}} - 1)(d\_{\mathcal{E}}^{-\sharp \mathcal{F}} - d\_{\mathcal{E}}^{-(\sharp \mathcal{E} - \sharp \mathcal{F})})\,. \tag{4.32}$$

Above, *d*E is the size of the Hilbert space of the effective memory of a single environment subsystem (e.g., *d*E = 2 for a spin-1/2 that can use all its memory to store information about S). This is a good approximation only when *I*(S : F) is close to the plateau: For *f* near 0 or near 1 the mutual information is approximately linear in *f*, see Figure 6B, although the actual dependence on *f* turns out to be more complicated in exactly solvable models (see, e.g., Touil et al., 2022) [131].
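A quick numeric check of Equation (4.32) (with illustrative values of *H*S and ♯E of our own choosing) confirms the behavior near the plateau: the approximation gives exactly *H*S at *f* = 1/2, and its two correction terms make the PIP antisymmetric about the plateau, so that *I*(*f*) + *I*(1 − *f*) = 2*H*S, as expected for a globally pure state.

```python
import math

def I_approx(nF, nE, H_S, d_E=2):
    """Equation (4.32): mutual information I(S:F) near the plateau."""
    return H_S - 0.5 * (math.exp(H_S) - 1) * (d_E ** -nF - d_E ** -(nE - nF))

H_S, nE = math.log(2), 100        # one bit of missing information (entropy in nats)
mid = I_approx(nE // 2, nE, H_S)
assert abs(mid - H_S) < 1e-12                   # exactly H_S at f = 1/2
assert I_approx(10, nE, H_S) < H_S              # below the plateau for small fragments
# Antisymmetry about the plateau:
assert abs(I_approx(10, nE, H_S) + I_approx(nE - 10, nE, H_S) - 2 * H_S) < 1e-12
```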

When *δ* ≪ 1 and *f* < 1/2, we can use Equation (4.32) to estimate redundancy. To this end we retain dominant terms and set *I*(S : F) = (1 − *δ*)*H*S to get:

$$(1 - \delta)H\_{\mathcal{S}} \approx H\_{\mathcal{S}} - \frac{1}{2}(e^{H\_{\mathcal{S}}} - 1)\,d\_{\mathcal{E}}^{-\sharp\mathcal{F}\_{\delta}}.$$

A simple formula for ♯F*δ*, the number of subsystems that reduce the information deficit to *δ*, follows when *H*S ≫ 1:

$$\sharp\mathcal{F}\_{\delta} \approx \log\_{d\_{\mathcal{E}}} \frac{e^{H\_{\mathcal{S}}} - 1}{2\delta H\_{\mathcal{S}}} \approx \frac{H\_{\mathcal{S}} - \ln 2\delta H\_{\mathcal{S}}}{\ln d\_{\mathcal{E}}} \approx \frac{H\_{\mathcal{S}}}{\ln d\_{\mathcal{E}}} \,. \tag{4.33a}$$

This last approximate answer shows that in the central spin model redundancy is close to what one might guess: It is given by the number of environment fragments that have enough subsystems—*H*S/ ln *d*E—to store the information about S. In other words, there are approximately:

$$R\_{\delta} = \frac{{}^{\sharp}\mathcal{E}}{{}^{\sharp}\mathcal{F}\_{\delta}} \approx \frac{{}^{\sharp}\mathcal{E}\ln d\_{\mathcal{E}}}{H\_{\mathcal{S}}}\tag{4.33b}$$

fragments of E that "know" the state of the system. Note that the information deficit *δ* does not appear in this approximate answer: We have dropped the subdominant ln *δ* in Equation (4.33a).
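The chain of approximations in Equations (4.33a,b) is easy to evaluate. The sketch below (with illustrative values *H*S = 5, *d*E = 2, *δ* = 0.1, ♯E = 10,000, none of which come from the text) shows that the exact logarithm and the dominant term *H*S/ ln *d*E indeed agree closely.

```python
import math

def fragment_size(H_S, d_E, delta):
    """#F_delta from Equation (4.33a): subsystems needed to reach deficit delta."""
    return math.log((math.exp(H_S) - 1) / (2 * delta * H_S), d_E)

# Illustrative values: pointer entropy H_S = 5 nats, qubit subsystems, 10% deficit.
H_S, d_E, delta, n_E = 5.0, 2, 0.1, 10_000
nF = fragment_size(H_S, d_E, delta)
dominant = H_S / math.log(d_E)                # the last approximation in (4.33a)
assert abs(nF - dominant) / dominant < 0.05   # dominant term is already accurate
R_delta = n_E / nF                            # Equation (4.33b)
assert R_delta > 1_000                        # many independent records in a large E
```

Note that, consistent with the remark above, the choice of *δ* barely moves the answer: it enters only through the subdominant logarithm.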

In the discussion above ln *d*E enters as a measure of the memory capacity of a subsystem of E. This and the universality of the re-scaling in Figure 6B suggest a conjecture: The environment will fill in the space available to store information with qmems of S.

We conclude that, when only a part of the Hilbert space of the environment subsystem is available to record the state of the system S, one should be able to use just this "accessible memory" of the subsystem instead of its maximal information storage capacity *hm* = ln *d*E in the estimates of redundancy. We shall now corroborate this conjecture.

**Figure 6.** **Partial information plots for pure decoherence:** (**A**) **Partial information plots: qutrits and qubits.** A qutrit coupled to *N* = 4 ... 128 qutrit environment plotted against the fraction *f* = ♯F/♯E (see Blume-Kohout and Zurek, 2006 [152], for details). As the number of the environment subsystems increases, redundancy grows, which is reflected by the increasingly steep slope of the initial part of the plot. The inset [152] depicts rescaled mutual information of a qubit plotted against the number *m* of the environment qubits (rather than their fraction). Elongation of the plateau leads to the increase of redundancy: *Rδ* is the length of the PIP measured in units defined by the size (e.g., in the number of environment subsystems needed, *mδ*; see Figure 5) of the part of the PIP that corresponds to *I*(S : F) rising from 0 to (1 − *δ*)*H*S. (**B**) **Universal rise of a partial information plot.** Partial information plots of a qutrit coupled to nine different environments with different *subsystem size d*E and cardinality ♯E, but with nearly identical total information capacity, plotted as a function of their fractional information capacity.

#### 4.2.3. Quantum Darwinism in a Hazy Environment

Inaccessibility of the memory of E can have different causes [144,145,153,158]. The most obvious one is the possibility that the environment starts in a partly mixed state, so that some of its Hilbert space is already "taken up" by the information that is of no interest to observers<sup>15</sup>. It is then tempting to use the available memory of the environmental subsystem, given by the maximal information storage capacity *hm* less *h*, the preexisting entropy, instead of its maximal memory *hm* = ln *d*E in Equation (4.33b). When this substitution is made, the estimated redundancy is:

$$R\_{\delta}^{h} \approx \frac{\sharp \mathcal{E}}{\sharp \mathcal{F}\_{\delta}} \approx \sharp\mathcal{E}\, \frac{(h\_{m} - h)}{H\_{\mathcal{S}}} \,, \tag{4.33c}$$

or;

$$R\_{\delta}^{h} \approx \left(1 - \frac{h}{h\_{m}}\right) R\_{\delta} \,. \tag{4.33d}$$

That is, the redundancy in a partly mixed environment, *R<sup>h</sup>δ*, decreases compared to the redundancy in the pure environment *Rδ* in proportion to the memory that is still available to accept the information about the system.
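This scaling is a one-line correction to Equation (4.33b). The sketch below (with a hypothetical baseline redundancy, chosen only for illustration) simply encodes Equation (4.33d) and its two limiting cases: a pure environment leaves *Rδ* unchanged, while a fully mixed one stores nothing.

```python
import math

def hazy_redundancy(R_delta, h, h_m):
    """Equation (4.33d): redundancy scales with the memory still available."""
    return (1 - h / h_m) * R_delta

h_m = math.log(2)                    # maximal memory of a qubit subsystem, in nats
assert hazy_redundancy(1000.0, 0.0, h_m) == 1000.0   # pure environment: unchanged
assert hazy_redundancy(1000.0, h_m, h_m) == 0.0      # fully mixed: no usable records
assert abs(hazy_redundancy(1000.0, h_m / 2, h_m) - 500.0) < 1e-9
```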

In the manageable case of a single qubit system, *H*S = ln 2, the substitution of *hm* − *h* for ln *d*E works [145], although the expression we have used above for, e.g., *Rδ* has to be modified, as the derivation of Equation (4.33a) assumes *H*S ≫ 1. A rather large environment (hundreds of qubits, with symmetries of the initial state and of the interaction Hamiltonian exploited to keep the size of the memory down) was used to explore a range of values of *h*. This was necessary because the principal effect of a mixed environment is to lower the slope of the initial part of PIP's by (1 − *h*/*hm*), so now it takes more subsystems of E to get to the "plateau".

The results for a central spin-1/2 in the environment of spin-1/2 subsystems confirmed that the redundancy decreased by approximately 1 − *h*/*hm* (Zwolak, Quan, and Zurek, 2010) [145]. This change in *R<sup>h</sup>δ* was due to the change in the slope of the early part of the PIP. Equations (4.33c,d) became a more accurate approximation when the environment was more mixed, i.e., when *h* was, to begin with, closer to *hm*. Of course, when *h* = *hm*, no information about S can be recovered from the environment, as *H*F(*t*) cannot increase when it starts at a maximum. Nevertheless, as already noted, mixed environments are still very effective in decohering the system. Thus, even as the classical contribution, *H*F(*t*) − *H*F(0), disappears, the quantum contribution, *H*S − *H*S*d*E/F, to *I*(S : F) remains similar to the case when E was initially pure.

As we have seen before with c-not's and qubits, the system decoheres as soon as a single copy of its state is imprinted with a reasonable accuracy in E, and—when the environment is initially pure—a few imperfect imprints establish the initial rising part of the PIP. However, as new subsystems of E become correlated with S, the size of the plateau increases and its elongation (when plotted as a function of ♯F) occurs without any real change to the early part of the PIP (see inset in Figure 6A). Thus, the number of copies of the information E has about S can grow long after the system was decohered. This increase of the number of copies leads to the corresponding increase of the redundancy of decoherence. Moreover, the redundancy of decoherence *R<sup>D</sup>δ* exceeds the redundancy of the information available to the observers *Rδ*, as even mixed environments retain their undiminished ability to decohere.

<sup>15</sup> A related question arises in the situation when—in addition to the "quantum system of interest"—there are many other systems "of no interest" that can imprint information on the common environment. There is therefore a danger that the information of interest would be diluted with irrelevant bits, suppressing the redundancy responsible for objectivity. Zwolak and Zurek (2017) [155] show that mixing of the relevant and irrelevant bits of information makes little difference to the redundancy of the information of interest.

#### 4.2.4. Quantum Darwinism and Pointer States

What information is redundantly acquired by E and can be recovered by observers from its fragments? In systems with discrete observables such as spins one can prove that it concerns pointer states of the system. The proof was first given in the idealized case of perfect environmental records, and then extended to the case of imperfect records (Ollivier, Poulin, and Zurek, 2004 [149]; 2005 [151]).

A natural way to characterize such correlations is to use the mutual information between an observable *σ* of S and a measurement e on E: Shannon mutual information *I*(*σ* : e) measures the ability to predict the outcome of a measurement of *σ* on S after a measurement of e. For a given density matrix of S⊗E, the measurement results are characterized by a joint probability distribution

$$p(\sigma\_i, \mathfrak{e}\_j) = \text{Tr}\{ (\sigma\_i \otimes \mathfrak{e}\_j)\, \rho^{S\mathcal{E}} \},$$

where *σi* and e*j* are the spectral projectors of observables *σ* and e. By definition, the mutual information is the difference between the initially missing information about *σ* and the remaining uncertainty about it when e is known. Shannon mutual information is defined using Shannon entropies of subsystems (e.g., *H*(*σ*) = − ∑*i p*(*σi*) log *p*(*σi*)) and the joint entropy *H*(*σ*, e) = − ∑*i*,*j p*(*σi*, e*j*) log *p*(*σi*, e*j*):

$$I(\sigma : \mathfrak{e}) = H(\sigma) + H(\mathfrak{e}) - H(\sigma, \mathfrak{e}) .$$
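These definitions translate directly into code. The sketch below (function names and the two example distributions are illustrative) computes *I*(*σ* : e) from a joint distribution *p*(*σi*, e*j*): a perfect record yields *I* = *H*(*σ*), while uncorrelated outcomes yield zero.

```python
import numpy as np

def shannon(p):
    """Shannon entropy in bits of a probability array (zero entries skipped)."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(p_joint):
    """I(sigma:e) = H(sigma) + H(e) - H(sigma, e) from the joint distribution."""
    p_joint = np.asarray(p_joint)
    return (shannon(p_joint.sum(axis=1)) + shannon(p_joint.sum(axis=0))
            - shannon(p_joint))

# Perfectly correlated outcomes (a perfect record of sigma in e): I = H(sigma) = 1 bit.
perfect = np.array([[0.5, 0.0], [0.0, 0.5]])
assert abs(mutual_information(perfect) - 1.0) < 1e-12
# Statistically independent outcomes: measuring e reveals nothing about sigma.
independent = np.outer([0.5, 0.5], [0.5, 0.5])
assert abs(mutual_information(independent)) < 1e-12
```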

The information about observable *σ* of S that can be optimally extracted from *ν* environmental subsystems is

$$\mathcal{I}\_{\nu}(\sigma) = \max\_{\{\mathfrak{e} \in \mathfrak{M}\_{\nu}\}} I(\sigma : \mathfrak{e})$$

where M*ν* is the set of all measurements on those *ν* subsystems. In general, I*ν*(*σ*) will depend on which particular *ν* subsystems are considered. For simplicity, we will assume that any *typical ν* environmental subsystems yield roughly the same information. This may appear to be a strong assumption, but relaxing it does not affect our conclusions. By setting *ν* = ♯E = *N*, the total number of subsystems of E, we get the information content of the entire environment. The condition:

$$\mathcal{I}\_N(\sigma) \approx H(\sigma)$$

expresses the *completeness* prerequisite for objectivity: All (or nearly all) missing information about *σ* of S must be in principle obtainable from all of E.

As a consequence of the basis ambiguity, information about many observables *σ* can be deduced by a suitable (generally, global) measurement on the entire environment [24,25]. Therefore, completeness, while a prerequisite for objectivity, is not a very selective criterion (see Figure 7a for evidence). To claim objectivity, it is not sufficient to have a complete imprint of the candidate property of S in the environment. There must be many copies of this imprint that can be accessed independently by many observers: *information must be redundant*.

To quantify redundancy, we count the number of copies of the information about *σ* present in E:

$$
\mathcal{R}\_\delta(\sigma) = {}^\sharp \mathcal{E} / \nu\_\delta(\sigma) = N / \nu\_\delta(\sigma).
$$

Above, *νδ*(*σ*) is the smallest number of typical environmental subsystems that contain almost all the information about *σ* (i.e., I*ν*(*σ*) ≥ (1 − *δ*)I*N*(*σ*)).
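Given a partial-information curve I*ν*(*σ*), the definitions of *νδ*(*σ*) and *Rδ*(*σ*) amount to a simple search. The sketch below applies them to a hypothetical c-not-like curve (of our own construction) in which a single subsystem already carries all the information about *σ*.

```python
def nu_delta(I_curve, delta):
    """Smallest nu with I_nu >= (1 - delta) * I_N; I_curve[nu] for nu = 0..N."""
    I_N = I_curve[-1]
    for nu, I in enumerate(I_curve):
        if I >= (1 - delta) * I_N:
            return nu
    return len(I_curve) - 1

def redundancy(I_curve, delta):
    """R_delta = N / nu_delta, the number of independent records in E."""
    return (len(I_curve) - 1) / nu_delta(I_curve, delta)

# Hypothetical c-not-like curve for N = 10 subsystems: each holds a perfect copy,
# so one subsystem already supplies all the information about sigma (here 1 bit).
curve = [0.0] + [1.0] * 10
assert nu_delta(curve, 0.1) == 1
assert redundancy(curve, 0.1) == 10.0
```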

The key question now is: What is the structure of the set O of observables that are **completely**, I*N*(*σ*) ≈ *H*(*σ*), and **redundantly**, *Rδ*(*σ*) ≫ 1 with *δ* ≪ 1, imprinted on the environment? The answer is provided by the theorem:

**Theorem 4.** *The set* O *is characterized by a unique observable* **Π***, called by definition the* **maximally refined observable***, as the information* <sup>I</sup>*ν*(*σ*) *about any observable σ in* O *obtainable from a fraction of* E *is equivalent to the information about σ that can be extracted from its correlations with the maximally refined observable* **Π***:*

$$\mathcal{I}\_{\nu}(\sigma) = I(\sigma : \mathbf{\Pi}) \tag{4.34}$$

*for νδ*(**Π**) ≤ *m* ≪ *N.*

**Proof.** Let *σ*(1) and *σ*(2) be two observables in O for *δ* = 0. Since *σ*(1) and *σ*(2) can be inferred from two disjoint fragments of E, they must commute. Similarly, let e(1) (resp. e(2)) be a measurement acting on a fragment of E that reveals all the information about *σ*(1) (resp. *σ*(2)) while causing minimum disturbance to *ρ*SE . Then, e(1) and e(2) commute, and can thus be measured *simultaneously*. This combined measurement gives complete information about *σ*(1) *and <sup>σ</sup>*(2). Hence, for any pair of observables in O, it is possible to find a more refined observable which is also in O. The maximally refined observable **Π** is then obtained by pairing successively all the observables in O. By construction **Π** satisfies equality <sup>I</sup>*ν*(*σ*) = *<sup>I</sup>*(*σ* : **Π**) for any *σ* in O.

Theorem 4 can be extended to nearly perfect records for assumptions satisfied by usual models of decoherence (Ollivier, Poulin, and Zurek, 2005 [151]). The proof is based on the recognition that only the already familiar pointer observable can have a redundant and robust imprint on E. This Theorem can be understood as *a consequence of the ability of the pointer states to persist while immersed in the environment*. This resilience allows the information about the pointer observables to proliferate, very much in the spirit of the "survival of the fittest".

Note that the above Theorem does not guarantee the existence of a *non-trivial* observable **Π**: When the system does not properly correlate with E, the set O will only contain the identity operator.

Two important consequences of this theorem follow: (*i*) An observer who probes only a fraction of the environment is able to find out the state of the system as if he measured **Π** on S; (*ii*) Information about any other observable *ς* of S will be inevitably limited by the available correlations existing between *ς* and **Π**. In essence, our theorem proves the uniqueness of redundant information, and therefore the selectivity of its proliferation.

We can illustrate this preeminence of the pointer observable in our simple model: a single central spin-1/2 interacting with a collection of *N* such spins, Equation (4.31a). As seen in Figure 7a, the environment as a whole contains information about any observable of S. The preferred role of the pointer observable becomes apparent only when one seeks observables that are recorded *redundantly* in E. Figure 7b shows that only the pointer observable **Π** = *σ<sup>z</sup>* (and observables that are nearly co-diagonal with it) are recorded redundantly, illustrating the theorem quoted above. The "ridge of redundancy" is strikingly sharp.

Further confirmation and extension of the theorem quoted above is the relation (derived under the assumption of surplus decoherence; Zwolak and Zurek, 2013 [130]) between the available information (characterized by the Holevo quantity *χ*, a measure of the capacity of the quantum information channel for classical information) about an arbitrary observable *ς* and the pointer observable **Π** of the system, see Equation (4.28), and its consequences, Equations (4.29a,b).

Comparison of Figure 7a,b also shows that redundancy of *σz* increases long after the environment as a whole is strongly entangled with S. This is seen in a steady rise of the redundancy *Rδ* with the action. Thus, as anticipated, redundancy can continue to increase long after the system has decohered.

The origin of the consensus between different observers is the central lesson that follows from our considerations. In everyday situations, observers have no choice in the observables of systems of interest they will measure. This is because they rely on the "second hand" information they obtain from the same environment that is responsible for decoherence. In addition, the environment that selects a certain pointer observable will record redundantly only the information about that observable.

Information about complementary observables is in principle still "out there", but one would have to intercept essentially all of the environment (to be more precise, at least its 1 − 1/*R<sup>D</sup>δ* ≥ 1 − 1/*Rδ* fraction) and measure it in the right way (that is, using an observable with entangled eigenstates) to have any hope of acquiring that information. By contrast, to find out about the pointer observables, a small fraction of the environment, ∼ 1/*Rδ*, is enough.

**Figure 7. Selection of the preferred observable in Quantum Darwinism in a simple model of decoherence** (Ollivier, Poulin, and Zurek, 2004 [149]). The system S, a spin-1/2 particle, interacts with *N* = 50 qubits of E through the Hamiltonian **H** = *σ<sup>z</sup>* ∑*k gkσ<sup>z</sup>k* for a time *t*. The initial state of S⊗E is (|0⟩ + |1⟩)/√2 ⊗ |0⟩E1 ⊗ ... ⊗ |0⟩E*N*. Couplings are selected randomly with uniform distribution in the interval (0,1]. All the plotted quantities are a function of the system's observable *σ*(*μ*) = cos(*μ*)*σ<sup>z</sup>* + sin(*μ*)*σ<sup>x</sup>*, where *μ* is the angle between its eigenstates and the pointer states of S—here the eigenstates of *σ<sup>z</sup>*. (**a**) Information acquired by the optimal measurement on the whole environment, I*N*(*σ*), as a function of the inferred observable *σ*(*μ*) and the average interaction action ⟨*gkt*⟩ = *a*. Nearly all information about every observable of S is accessible in the *whole* environment for any observable *σ*(*μ*) except when the action *a* is very small (so that E does not know much about S). Thus, complete imprinting of an observable of S in E is not sufficient to claim objectivity. (**b**) Redundancy of the information about the system as a function of the inferred observable *σ*(*μ*) and the average action ⟨*gkt*⟩ = *a*. It is measured by *Rδ*=0.1(*σ*), which counts the number of times 90% of the total information can be "read off" independently by measuring distinct fragments of the environment. For all values of the action ⟨*gkt*⟩ = *a*, redundant imprinting is sharply peaked around the pointer observable. Redundancy is a very selective criterion. The number of copies of relevant information is high only for the observables *σ*(*μ*) falling inside the theoretical bound (see Equation (4.29)) indicated by the dashed line. (**c**) Information about *σ*(*μ*) extracted by an observer restricted to local random measurements on *m* environmental subsystems. The interaction action *ak* = *gkt* is randomly chosen in [0, *π*/4] for each *k*. Because of redundancy, pointer states—and only pointer states—can be found out through this far-from-optimal measurement strategy. Information about any other observable *σ*(*μ*) is restricted by the theorem discussed in this subsection (see also Refs. [130,149,151]) to be equal to the information about it revealed by the pointer observable **Π** = *σ<sup>z</sup>*.

As we shall see, redundancies for, e.g., the photon environment are astronomical. It is therefore no surprise that we rely on the information that can be obtained with little effort from a small fraction of E. In addition, it is also no surprise that the complementary information is inaccessible. Therefore (and as is established through a sequence of results in Girolami et al., 2022 [142]), the evidence of quantumness is unavailable from the fragments of E. Moreover, observers that find out about their Universe in this way will agree about the outcomes. This is how objective classical reality emerges from a quantum substrate in our quantum Universe.

#### 4.2.5. Redundancy vs. Relaxation in the Central Spin Model

We have seen the defining characteristics of quantum Darwinism in the central spin model. Thus: (i) decoherence begets redundancy, which (ii) can continue to increase long after decoherence saturated the entropy of the system at *H*S. Moreover, (iii) both decoherence and quantum Darwinism single out the same pointer observable. There are multiple copies of the pointer observable in the environment, and only the information about the pointer states is amplified by the decoherence process. This happens at the expense of the information about the complementary observables (i.e., information about the phases between the pointer states).

Our model is illustrated in Figure 8. It differs from the central spin models we have investigated as now the spins of the environment interact and can exchange information. The effect of the interactions between the environmental subsystems on partial information plots and on the redundancy is seen in Figure 9. The coupling of the central spin to the environmental spins is stronger than their couplings to each other. As a result, partial information plots quickly assume the form characteristic of quantum Darwinism caused by pure decoherence, and redundancy rises to values of the order of the size of the environment.

**Figure 8. Central spin model with interacting environment subsystems.** Decoherence is no longer pure: An environment of 16 spins E*i* coupled to a single system qubit S with the Hamiltonian given by Equation (4.31b) is the basis of the results presented in this subsection and in Figure 9. As before, fragment F is a subset of the whole environment E. The couplings *di* and *mij* were selected from normal distributions with zero mean and standard deviations *σd* = 0.1 and *σm* = 0.001. Crucially, the interactions between S and the E*i* are much stronger than those within E. That is, *σd* ≫ *σm*, so that information acquisition by the individual subsystems of the environment happens faster than the exchange of information between them. As a consequence, pure decoherence is initially a reasonable approximation. Eventually, however, interactions between the spins of the environment scramble the information so that fragments of E composed of individual spins reveal almost nothing about S.

Initially, pure decoherence is a reasonable approximation. However, as time goes on, correlations between spins of the environment gradually build up. Interactions between the spins of E mean that the state of a fragment of the environment begins to entangle with—begins to acquire information about—the rest of the environment. As a consequence, the information individual spins had about the system becomes delocalized as it is shared with the rest of E. The structure of the states of SE is no longer branching—it is no longer possible to represent them by Equation (4.17). More general entangled states of the Hilbert space of SE are explored. Eventually, interactions may take the systems closer to equilibrium (although we do not expect complete equilibration in our model—the central spin decoheres, but its pointer states remain unaffected by the environment). Consequently, over time, PIPs are expected to change character from the steep rise followed by a long plateau characteristic of pure decoherence (red graph in Figure 5) to the approximate step function shape (green graph in Figure 5) that is characteristic of a collection of spins in equilibrium (see Page, 1993 [159]).

All of the models investigated so far were "pure decoherence"—subsystems of the environment interacted only with the system S via an interaction that left the pointer basis untouched. This assured validity of Equation (4.20), so that the information gained by the environment about the pointer states of S remained localized, available from the fragments of the environment.

The idealization of pure decoherence is often well-motivated. The photon environment, for example, consists of subsystems (photons) that interact with various systems of interest but do not interact with one another. There are, however, other environments—such as air—that contribute to or even dominate decoherence, but consist of interacting subsystems (air molecules). Thus, while the pointer basis is still untouched by decoherence, information about it will no longer be preserved in the individual subsystems of the environment—it will become delocalized and, hence, impossible to extract from the local fragments of the environment consisting of its natural subsystems.

To investigate what happens when subsystems of E interact and exchange information we relax the assumption of pure decoherence and consider a model that adds to the central spin model of Equation (4.31a) interactions between the environmental spins:

$$\mathbf{H} = \sigma^z \sum\_i d\_i \sigma\_i^z + \sum\_{j,k} m\_{jk} \sigma\_j^z \sigma\_k^z. \tag{4.31b}$$
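Because every term in Equation (4.31b) is a product of *σ*<sup>z</sup> operators, the Hamiltonian is diagonal in the computational product basis, and the model can be simulated by phase evolution alone. The following sketch is our own toy illustration (not the paper's numerics): *N* = 6 environment spins instead of 16, an arbitrary random seed, and all qubits initialized in |+⟩. It computes the mutual information *I*(S : F) for fragments of growing size:

```python
import numpy as np

N = 6                 # environment spins (the paper's model uses 16)
dims = N + 1          # qubit 0 is the system S
rng = np.random.default_rng(1)
d = rng.normal(0.0, 0.1, size=N)          # S-E couplings    (sigma_d = 0.1)
m = rng.normal(0.0, 0.001, size=(N, N))   # intra-E couplings (sigma_m = 0.001)

def zval(b, q):
    """Eigenvalue of sigma^z of qubit q in computational basis state b."""
    return 1.0 if (b >> (dims - 1 - q)) & 1 == 0 else -1.0

# H of Eq. (4.31b) is diagonal in the z product basis; build its diagonal.
diag = np.zeros(2**dims)
for b in range(2**dims):
    e = sum(d[i] * zval(b, 0) * zval(b, i + 1) for i in range(N))
    e += sum(m[j, k] * zval(b, j + 1) * zval(b, k + 1)
             for j in range(N) for k in range(j + 1, N))
    diag[b] = e

# Start every qubit in |+>; evolution just multiplies amplitudes by phases.
psi = np.exp(-1j * diag * 10.0) * np.full(2**dims, 2.0**(-dims / 2))

def reduced(psi, keep):
    """Reduced density matrix of the qubits listed in `keep`."""
    t = np.moveaxis(psi.reshape([2] * dims), keep, list(range(len(keep))))
    t = t.reshape(2**len(keep), -1)
    return t @ t.conj().T

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

HS = entropy(reduced(psi, [0]))
for fsize in (1, 2, 3):                   # fragments of growing size
    F = list(range(1, 1 + fsize))
    I = HS + entropy(reduced(psi, F)) - entropy(reduced(psi, [0] + F))
    print(fsize, round(I, 3))
```

Since the global state stays pure, the sketch can be checked against *I*(S : E) = 2*H*S when the fragment is the whole environment.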

As a consequence, the information about the system of interest is still present in the correlations with the environment, but it will gradually become encrypted in non-local states of E that are inaccessible to local observers—i.e., that do not provide information about S via measurements that access small (*f* < 1/2) fragments of E consisting of its subsystems.

The timescale over which pure decoherence is a good approximation depends on the strength of the couplings. In our case, the coupling between the system and the spins of the environment is significantly stronger than the couplings between the spins of E. As a result, states of SE acquire initially an approximately branching structure and have PIPs that allow significant redundancy to develop. However, over time, interactions within E take the system closer to equilibrium, and PIPs change. More detailed discussion of this can be found in the caption of Figure 9, and, especially, in Riedel, Zurek, and Zwolak, 2012 [153].

Our simple model with weakly interacting environment subsystems illustrates why observers gather information using environments whose subsystems (photons) are in effect non-interacting, rather than environments (such as air) that may be more effective in causing decoherence but that scramble the information acquired in the process because their subsystems (air molecules) interact with each other.

**Figure 9. Rise and fall of redundancy in the spin universe** of Figure 8 (see Riedel, Zurek, and Zwolak, 2012 [153]). (a) The redundancy *Rδ* is the number of fragments of E that provide (here, up to a fractional deficit *δ* = 0.1) information about the system. The exact redundancy is supplemented with an estimate based on the interpolated value of *I*(S : F*f*). The vertical dashed lines mark five instants. (b) The mutual information *I*(S : F*f*) versus fragment size F, and the entropy *H*S of the system at five instants corresponding to different qualitative behavior. (c) The mutual information *I*(S : F*f*) versus fragment size F and time *t*. Thick black lines mark five instants. *Low correlation* (*t* = 0.5) for small times means E "knows" very little about S. Each spin added to F reveals a bit more about S, resulting in the linear dependence of *I*(S : F*f*) on *f*. *Decoherence* (*t* = 2) sets in near *τd* ≡ (√*N* *σd*)<sup>−1</sup> = 2.5. By then the density matrix of S approaches a mixture of the two pointer states |↑⟩ and |↓⟩ singled out by the interaction and the state is approximately branching. Mutual information is still nearly linear in F and *Rδ* ∼ *O*(1). Mixing within E can be neglected as *t* ≪ *σm*<sup>−1</sup> = 1000. *Quantum Darwinism* (*t* = 10) is characterized by *I*(S : F*f*) that rises to the plateau; the first few spins reveal nearly all classical information: Additional spins just confirm it. The quantum information (above the plateau) is still present in the global state but accessible only via an unrealistic global measurement of almost all of SE. After *t* ∼ *σd*<sup>−1</sup> = 10, few spins suffice to determine the state of S no matter how large *N* is, so *Rδ* ∼ *N*. In the absence of the couplings *mij* this (approximately pure decoherence) would persist. (For some environments, such as photons, this is indeed the case.) *Relaxation* (*t* = 150) occurs near *t* ∼ *τm* ≡ (√*N* *σm*)<sup>−1</sup> = 250. Mixing within the environment entangles any given spin's information about S with the rest of E, reducing the usefulness of the fragments. The mutual information plateau is destroyed, so redundancy plummets. *Equilibrium* (*t* = 500) is reached for *t* ∼ *σm*<sup>−1</sup> = 1000, when the actions associated with interaction between spin pairs in the environment reach order unity. The state ceases to be branching. The mutual information plot takes the form typical of random states in the combined Hilbert space of SE. An observer can learn virtually nothing about the system unless almost half the environment is accessed.

#### 4.2.6. Quantum Darwinism in Quantum Brownian Motion

Evolution of a single harmonic oscillator (the system) coupled through its coordinate with a collection of many harmonic oscillators (the environment) is a well known exactly solvable model (Feynman and Vernon, 1963 [160]; Dekker, 1977 [161]; Caldeira and Leggett, 1983 [162]; Unruh and Zurek, 1989 [163]; Hu, Paz, and Zhang, 1992 [164], Paz, Habib, and Zurek, 1993 [165]; Tegmark and Wheeler, 2001 [166]; Bacciagaluppi, 2004 [167]).

This is not a pure decoherence model. While the environment oscillators do not interact, the self-Hamiltonian of S does not commute with the interaction Hamiltonian. Therefore, when the oscillations are underdamped, the preferred states selected by their predictability are Gaussian minimum-uncertainty wavepackets (Zurek, Habib, and Paz, 1993 [63]; Tegmark and Shapiro, 1994 [168]; Gallis, 1996 [169]). This is in contrast to spin models (including the model we have just discussed) where exact and orthogonal pointer states can often be identified. So, while decoherence in this model is well understood, quantum Darwinism—where the focus is not on S, but on its relation to a fragment F of E—presents novel challenges.

Here, we summarize results (Blume-Kohout and Zurek, 2008 [170]; Paz and Roncaglia, 2009 [171]) obtained under the assumption that fragments of the environment are "typical" subsets of its oscillators—that is, subsets of oscillators with the same spectral density as the whole E.

The QBM Hamiltonian:

$$\mathbf{H} = H\_S + \frac{1}{2} \sum\_{\omega} \left( \frac{q\_{\omega}^2}{m\_{\omega}} + m\_{\omega} \omega^2 y\_{\omega}^2 \right) + \mathbf{x}\_S \sum\_{\omega} C\_{\omega}\, y\_{\omega}$$

describes a collection of the environment oscillators coupled to the harmonic oscillator system with:

$$H\_{\mathcal{S}} = \left(\frac{p\_{\mathcal{S}}^2}{m\_{\mathcal{S}}} + m\_{\mathcal{S}}\Omega\_0^2 \mathbf{x}\_{\mathcal{S}}^2\right) / 2$$

and the environmental coordinates *yω* and *qω* describe a single band (oscillator) E*ω*. As usual, the bath is defined by its spectral density, $I(\omega) = \sum\_n \delta(\omega - \omega\_n)\, \frac{C\_n^2}{2 m\_n \omega\_n}$, that quantifies the coupling between S and each band of E. An *ohmic* bath with a sharp cutoff Λ, $I(\omega) = \frac{2 m\_{\mathcal{S}} \gamma\_0}{\pi}\, \omega$ for *ω* ∈ [0 ... Λ], was adopted: A sharp cutoff (rather than the usual smooth rolloff) simplifies numerics. Each coupling is a differential element, $dC\_{\omega}^2 = \frac{4 m\_{\mathcal{S}} m\_{\omega} \gamma\_0}{\pi}\, \omega^2 d\omega$ for *ω* ∈ [0 ... Λ]. For numerics, the whole range of frequencies [0 ... Λ] was discretized by dividing it into discrete bands of width Δ*ω*, which approximates the exact model well up to a time *τrec* ∼ 2*π*/Δ*ω*.

The system was initialized in a squeezed coherent state, and E in its ground state. QBM's linear dynamics preserve the Gaussian nature of the state, which is fully described by its means and covariance matrix:

$$
\overrightarrow{z} = \left( \begin{array}{c} \langle x \rangle \\ \langle p \rangle \end{array} \right); \quad \Delta = \left( \begin{array}{cc} \Delta x^2 & \Delta xp \\ \Delta xp & \Delta p^2 \end{array} \right).
$$

Its entropy is a function of its squared symplectic area, $a^2 = 4\hbar^{-2}\det(\Delta)$ (so that *a* = 1 for a minimum-uncertainty state). Thus:

$$H(a) = \frac{1}{2}((a+1)\ln(a+1) - (a-1)\ln(a-1)) - \ln 2 \approx \ln\left(\frac{e}{2}a\right),\tag{4.35}$$

where *e* is the base of the natural logarithm, and the approximation is excellent for *a* > 2. For multi-mode states, numerics yield *H*(*ρ*) exactly as a sum over Δ's symplectic eigenvalues (Serafini et al., 2004 [172]). The theory proposed in Ref. [170] approximates a collection of oscillators as a single mode with a single *a*<sup>2</sup>.
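A quick numerical check of Equation (4.35) and its large-*a* approximation (a sketch; the sample values are illustrative):

```python
import numpy as np

def H(a):
    """Entropy of a single-mode Gaussian state, Eq. (4.35)."""
    return 0.5 * ((a + 1) * np.log(a + 1) - (a - 1) * np.log(a - 1)) - np.log(2)

for a in (2.0, 5.0, 10.0):
    # compare the exact entropy with the approximation ln(e*a/2)
    print(a, H(a), np.log(np.e * a / 2))
```

The two columns agree to a few percent already at *a* = 2, and *H*(*a*) → 0 as *a* → 1 (the ground state carries no entropy).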

Mutual information illustrated in partial information plots (Figure 10) shows that *I*(S : F)—the information about S stored in F—rises rapidly as the fraction of the environment *f* included in F increases from zero, then flattens for larger fragments. Most—all but ∼ 1 nat—of *H*S is recorded redundantly in E. When S is macroscopic, this *non-redundant information* is dwarfed by the total amount of information available from even small fractions of E.

**Figure 10. Partial information plots for quantum Darwinism in quantum Brownian motion** (see Blume-Kohout and Zurek, 2008 [170]). The system S was initialized in an *x*-squeezed state, which decoheres as it evolves into a superposition of localized states. Plot (**a**) shows PIPs for three fully decohered (*t* = 4) states with different squeezing. Small fragments of E provide most of the information about S. Squeezing changes the amount of *redundant* information without changing the PIP's shape. The numerics agree with the simple theory, Equation (4.36), discussed in Ref. [170]. Plot (**b**) tracks one state as decoherence progresses. PIPs' shape is invariant; time only changes the redundancy of information.

Calculations simplify in the macroscopic limit where the mass of the system is large compared to masses of the environment oscillators. This regime (of obvious interest to the quantum—classical transition) allows for analytic treatment based on the Born-Oppenheimer approximation: Massive system follows its classical trajectory, largely unaffected by E. The environment will, however, decohere a system that starts in a superposition of such trajectories. In the process, E that starts in the vacuum will become imprinted with the information about the location of S.

The basic observation is that the area of the 1-*σ* contours in phase space determines entropy. As a result of decoherence, the squared symplectic area corresponding to the state of the system will increase by *δa*<sup>2</sup>S. This is caused by the entanglement with the environment, so the entropies and symplectic areas of environment fragments increase as well. When F contains a randomly selected fraction *f* of E, *ρ*F's squared area is *a*<sup>2</sup>F = 1 + *f* *δa*<sup>2</sup>S, and that of *ρ*SF is *a*<sup>2</sup>SF = 1 + (1 − *f*)*δa*<sup>2</sup>S. Applying Equation (4.35) (where *δa*<sup>2</sup>S ≫ 1) yields:

$$I(\mathcal{S}:\mathcal{F}) \approx H\_{\mathcal{S}} + \frac{1}{2} \ln \left( \frac{f}{1 - f} \right). \tag{4.36}$$
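Equation (4.36) can be recovered in a few lines from the additivity of the squared symplectic areas and the large-*a* form of Equation (4.35), *H*(*a*) ≈ ln(*ea*/2) (a sketch of the reasoning of Ref. [170]):

$$
I(\mathcal{S}:\mathcal{F}) = H\_{\mathcal{S}} + H\_{\mathcal{F}} - H\_{\mathcal{SF}} \approx H\_{\mathcal{S}} + \frac{1}{2}\ln\frac{a\_{\mathcal{F}}^2}{a\_{\mathcal{SF}}^2} = H\_{\mathcal{S}} + \frac{1}{2}\ln\frac{1 + f\,\delta a\_{\mathcal{S}}^2}{1 + (1-f)\,\delta a\_{\mathcal{S}}^2} \approx H\_{\mathcal{S}} + \frac{1}{2}\ln\frac{f}{1-f}\,,
$$

where the last step drops the 1's relative to *δa*<sup>2</sup>S ≫ 1 (i.e., away from *f* = 0 and *f* = 1).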

This "universal" *I*(S : F) (Blume-Kohout and Zurek, 2008 [170]; Paz and Roncaglia, 2009 [171]) is valid for significantly delocalized initial states of S (which imply large *H*S). It is a good approximation everywhere except very near *f* = 0 and *f* = 1 (where it would predict singular behavior). It has a classical plateau at *H*S which rises as decoherence increases the entropy of the system.

In contrast to PIPs we have seen before (e.g., in the central spin model), adding more oscillators to the environment does not simply extend the plateau: The shape of *I*(S : F) is only a function of *f*, and so it is invariant under enlargement of E. This is because the couplings of individual environment oscillators are adjusted so that the damping constant of the system oscillator remains the same. Therefore, increasing the number of oscillators in the environment does not really increase the number of the copies of the state of the system in E—in a sense it only improves the accuracy with which the environment with a continuum distribution of frequencies (e.g., a field) is modeled using discrete means.

When the above equation for *I*(S : F) is solved for *fδ* one arrives at the estimate for the redundancy:

$$R\_{\delta} \approx e^{2\delta H\_{\mathcal{S}}} \approx s^{2\delta}.\tag{4.37}$$

The last equality above follows because an *s*-squeezed state decoheres to a mixed state with *H*S ≈ ln *s*. This simple formula for *Rδ* holds where it matters—after decoherence, but before relaxation begins to force the system to spiral down towards its ground state.
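Explicitly, demanding *I*(S : F) = (1 − *δ*)*H*S in Equation (4.36) determines the needed fragment size *fδ*:

$$
\frac{1}{2}\ln\frac{f\_\delta}{1-f\_\delta} = -\delta H\_{\mathcal{S}} \;\Rightarrow\; f\_\delta = \frac{1}{1 + e^{2\delta H\_{\mathcal{S}}}} \approx e^{-2\delta H\_{\mathcal{S}}}\,, \qquad R\_\delta = \frac{1}{f\_\delta} \approx e^{2\delta H\_{\mathcal{S}}} \approx s^{2\delta}\,,
$$

where the last step uses *H*S ≈ ln *s*.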

As trajectories decay, the plateau flattens compared to what Equation (4.36) would predict. This will initially increase redundancy *Rδ* above the values attained after decoherence (see Figure 11). Eventually, as the whole SE equilibrates, the system will spiral down to occupy a mixture of low-lying number eigenstates, and *Rδ* will decrease.

**Figure 11. Delocalized states of a decohering oscillator redundantly recorded by the environment** (Blume-Kohout and Zurek, 2008 [170]). Plot (**a**) shows redundancy *Rδ* vs. time for three information deficits *δ*, when the initial state of the system is a Gaussian squeezed in *x* by *sx* = 6.3 × 10<sup>3</sup>. Plots (**b**–**d**) show *R*10%—redundancy of 90% of the available information—vs. initial squeezing (*sx* or *sp*). Dots denote numerics; lines—theory. S has mass *m*S = 1000, *ω*S = 4. E comprises oscillators with *ω* ∈ [0 ... 16] and mass *m* = 1. The frictional coefficient is *γ* = 1/40. Redundancy develops with decoherence: *p*-squeezed states [plot (**c**)] decohere almost instantly, while *x*-squeezed states [plot (**b**)] decohere as a *π*/2 rotation transforms them into *p*-squeezed states. Redundancy persists thereafter [plot (**d**)]; dissipation intrudes by *t* ∼ *O*(*γ*<sup>−1</sup>), causing *R*10% to rise above simple theory. Redundancy increases *exponentially*—as *Rδ* ≈ *s*<sup>2*δ*</sup>—with the information deficit [plot (**a**)]. So, while *Rδ* ∼ 10 may seem modest, *δ* = 0.1 implies *very* precise knowledge (resolution of around 3 ground-state widths) of S. This is half an order of magnitude better than recent results for measuring a micromechanical oscillator (see, e.g., Ref. [173]). At *δ* ∼ 0.5—resolving ∼ √*s* different locations within the wavepacket—redundancy reaches *R*50% ∼ 10<sup>3</sup> (the maximum numerical resolution of Ref. [170]).

The quantum Brownian motion model confirms that decoherence leads to quantum Darwinism. However, the details of quantum Darwinism in the QBM setting are different from what we have become accustomed to in the models involving discrete Hilbert spaces of S and of subsystems of E. Partial information plots with a shape that is independent of the size of E, and a scaling of redundancy with the information deficit *δ* that is not logarithmic as before, are clear manifestations of such differences.

Buildup of redundancy still takes longer than the initial destruction of quantum coherence. Nevertheless, various time-dependent processes (such as the increase of redundancy caused by dissipation) remain to be investigated in detail. Moreover, localized states favored by einselection are redundantly recorded by E. So, quantum Darwinism in QBM confirms many of the features of decoherence we have anticipated earlier. On the other hand, we have found an interesting tradeoff between redundancy and information deficit *δ*, Equation (4.37). It suggests that, in situations where QBM is applicable, objectivity (as measured by redundancy) may come at the price of accuracy.

We also note that the surplus decoherence we have described before can be found in the case of QBM. This follows, in effect, from the fact that *a*<sup>2</sup>SF = 1 + (1 − *f*)*δa*<sup>2</sup>S approaches *a*<sup>2</sup>S = 1 + *δa*<sup>2</sup>S for small *f*. Consequently, it is evident that *H*S − *H*S,E/F ≈ 0, and the mutual information *I*(S : F) is given by *H*F. This is not obvious from the simple scale-invariant Equation (4.36) above.

To sum up, we note that while broadly defined tenets of quantum Darwinism— multiple records of S in E, buildup of redundancy to large values, etc.—are satisfied in QBM, there are also interesting differences. Thus, QBM—in contrast to the spin models— does not have an obvious version in which decoherence is pure (as there is no perfect pointer observable that commutes with the whole Hamiltonian of SE). However, when the mass of S is large and the initial state is delocalized in position, at least early on pure decoherence is a good (Born-Oppenheimer—like) approximation.

Eventually the collection of oscillators begins to relax, and the information about the system flows from their individual states to correlations between them. This is because, even though the oscillators of the environment do not directly interact with each other, they do interact indirectly via the system oscillator: There is no perfect pointer observable for our harmonic oscillator S. The obvious consequence of this is the damping suffered by S. Less obvious is its role in coupling of the environment oscillators: Even though they do not interact directly (as did spins of the environment in Equation (4.31b)), they exchange information about one another indirectly, via S, which in time creates entanglements that make it more difficult to extract information about S from the fragments of E defined by collections of the (original) oscillators.

#### 4.2.7. Huge Redundancy in Scattered Photons

The two decoherence models discussed above—central spin model and quantum Brownian motion—are the two standard workhorses of decoherence. They were the early focus of quantum Darwinism primarily because one could analyze them using many of the tools developed to study decoherence. We have thus seen quantum Darwinism in action, and we have already confirmed in these idealized models that the expectations about the shape of partial information plots resulting from decoherence and about the buildup of redundancy are satisfied—with variations—in both cases. This is reassuring. However, while redundancies appeared in both cases, they were modest (*Rδ* ∼ 10), in part as a result of the limited size of the environment. Moreover, neither model is an accurate representation of how we find out about our world.

In our Universe the vast majority of data acquired by human observers comes via the photon environment. A fraction *f*, usually corresponding to a very small fragment F*f* of the photon environment scattered or emitted by the "system of interest", is intercepted by our eyes. This is how we find out about what we have grown accustomed to regard as "objective classical reality".

It is fair to expect that the photon environment should have significant redundancies—we use up only a tiny fraction of photon evidence, and others who look at the same systems generally agree about their states. To investigate quantum Darwinism in (photon) scattering processes we turn to the model of decoherence discussed by Joos and Zeh (1985) [174] that has since been updated (Gallis and Fleming, 1990 [175]; Hornberger and Sipe, 2003 [176]; Dodd and Halliwell, 2003 [177]; Halliwell, 2007 [178]) and applied by others, for example to calculate decoherence in the fullerene experiments (Hornberger et al., 2003 [179]; Hackermüller et al., 2004 [180]). The book by Schlosshauer (2007) [32] provides a good overview.

In contrast to Joos and Zeh (1985) [174], who focused on decoherence caused by the isotropic black-body radiation, we are interested in the information content of the scattered photons. We shall therefore primarily consider a distant point source that illuminates an object—a dielectric sphere of radius *r*—that is initially in a non-local superposition (see Figure 12), although we shall also discuss the isotropic case as a counterpoint. We shall assume a thermal distribution of energies (and, hence, wavelengths) of the incoming radiation and (at least in the results we shall focus on below) we will assume that photons come as a plane wave from a single direction (which approximates illumination by a distant localized light source such as the Sun or a light bulb). Generalizations to other models of illumination have been considered by Riedel and Zurek, 2011 [158].

**Figure 12. Scattering photons from a dielectric sphere 'Schrödinger's cat'** (Riedel and Zurek, 2010; 2011 [158,181]). This is a realistic case of quantum Darwinism: Scattered photons carry multiple copies of the information about the location of S: A dielectric sphere of radius *r* and permittivity *ε* is initially in a superposition with separation Δ*x* = |*x*1 − *x*2|. This object—our system S—scatters plane-wave radiation with thermally distributed photons of wavelength *λ* propagating in a direction *n*ˆ that makes an angle *θ* with the vector Δ*x*.

The scattering process is responsible for the decoherence of S and for the imprinting of information about the location of the scatterer S in the photon environment. As the initial state of S we take a nonlocal "Schrödinger cat" superposition of two locations:

$$\left|\psi\_{\mathcal{S}}(\vec{x})\right\rangle = \left(\left|\vec{x}\_{1}\right\rangle + \left|\vec{x}\_{2}\right\rangle\right) / \sqrt{2}\,.$$

We ignore the self-Hamiltonian of the sphere so that its pointer states are localized in space. One can justify this approximation by pointing to the large mass of the sphere. Large mass also enables our other approximation—we shall ignore the momentum imparted to the sphere by the photons. Note that—under these assumptions—scattering of photons from S results in pure decoherence.

Scattering takes the initial pure state density matrix $\rho^0\_{\mathcal{S}} = |\psi\_{\mathcal{S}}(\vec{x})\rangle\langle\psi\_{\mathcal{S}}(\vec{x})|$ into a mixture with the off-diagonal terms:

$$\left| \left\langle \vec{x}\_1 \right| \rho\_{\mathcal{S}} \left| \vec{x}\_2 \right\rangle \right|^2 = \gamma^N \left| \left\langle \vec{x}\_1 \right| \rho\_{\mathcal{S}}^0 \left| \vec{x}\_2 \right\rangle \right|^2 = \Gamma \left| \left\langle \vec{x}\_1 \right| \rho\_{\mathcal{S}}^0 \left| \vec{x}\_2 \right\rangle \right|^2.$$

Above, *γ* is the decoherence factor corresponding to scattering by a single photon. It is given by:

$$\gamma = |\text{Tr}(\mathcal{S}\_{\vec{x}\_1} \rho\_{\mathcal{E}}^0 \mathcal{S}\_{\vec{x}\_2}^\dagger)|^2 \,, \tag{4.38}$$

where S*x⃗i* is the scattering matrix acting on the photon when the dielectric sphere S is located at *x⃗i*. Scattering by *N* photons results in the decoherence factor Γ = *γ*<sup>*N*</sup>. The entropy of the decohered system turns out to be:

$$H\_{\mathcal{S}} = \ln 2 - \sum\_{n=1}^{\infty} \frac{\Gamma^n}{2n(2n-1)} = \ln 2 - \sqrt{\Gamma} \operatorname{arctanh} \sqrt{\Gamma} - \ln \sqrt{1 - \Gamma} \,. \tag{4.39}$$
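The series and the closed form in Equation (4.39) can be cross-checked numerically (a quick sketch; the partial sum length and sample values of Γ are illustrative):

```python
import numpy as np

def HS_series(Gamma, terms=200):
    """Partial sum of the series in Eq. (4.39)."""
    n = np.arange(1, terms + 1)
    return np.log(2) - np.sum(Gamma**n / (2 * n * (2 * n - 1)))

def HS_closed(Gamma):
    """Closed form of Eq. (4.39)."""
    r = np.sqrt(Gamma)
    return np.log(2) - r * np.arctanh(r) - np.log(np.sqrt(1 - Gamma))

for G in (0.9, 0.5, 0.01):
    print(G, HS_series(G), HS_closed(G))
```

The limits behave as expected: *H*S → 0 as Γ → 1 (no decoherence) and *H*S → ln 2 as Γ → 0 (complete decoherence of the two-branch superposition).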

To compute the decoherence factor Γ we use the classical cross section of a dielectric sphere in the dipole approximation (where the wavelength of photons is much larger than the size of the sphere and the photons are not sufficiently energetic to individually resolve the superposition). Under these assumptions the decoherence factor Γ due to blackbody radiation can be obtained explicitly, and has a form Γ = exp(−*t*/*<sup>τ</sup>D*) where *τD* is the decoherence time. Its inverse, the decoherence *rate*, is given by:

$$\frac{1}{\tau\_D} = C\_\Gamma\,(3 + 11\cos^2\theta)\, \frac{I\, \tilde{a}^6\, \Delta x^2\, k\_B^5\, T^5}{c^6 \hbar^6},\tag{4.40}$$

where *C*Γ = 161,280 *ζ*(9)/*π*<sup>3</sup> ≈ 5210 is a numerical constant, *I* is the *irradiance* (that is, radiative power per unit area), *ã* ≡ *r*[(*ε* − 1)/(*ε* + 2)]<sup>1/3</sup> is the effective radius of the sphere that takes into account its permittivity *ε*, and *θ* is the angle between the direction of the incoming plane wave *n*ˆ and Δ*x* (see Figure 12).

The decoherence rate does not increase with the square of the separation for arbitrarily large Δ*x*, as Equation (4.40) would indicate. Rather, this expression is only valid in the Δ*x* ≪ *λ* limit we are considering here. For Δ*x* ≫ *λ*, the decoherence rate saturates [175]:

$$\frac{1}{\tilde{\tau}\_D} = \tilde{C}\_\Gamma \frac{I \tilde{a}^6 k\_B^3 T^3}{c^4 \hbar^4},\tag{4.41}$$

with *C̃*Γ = 57,600 *ζ*(7)/*π*<sup>3</sup> ≈ 1873. In the intermediate region, where Δ*x* ∼ *λ*, the decoherence time *τD* has a complicated dependence on both Δ*x* and *θ*. The results discussed here are valid for *all* Δ*x* provided that the correct *τD* is used.

To obtain the mutual information we use the (pure decoherence) identity *I*(S : F) = (*H*F − *H*F(0)) + (*H*S − *H*S,E/F), Equation (4.20). The result is:

$$I(\mathcal{S}:\mathcal{F}\_f) = \ln 2 + \sum\_{n=1}^{\infty} \frac{\Gamma^{(1-f)n} - \Gamma^{fn} - \Gamma^n}{2n(2n-1)}.\tag{4.42}$$

Figure 13 shows the plot of this mutual information as a function of the fraction of the environment *f* for several times. For large *t* (small Γ) the sum is dominated by the lowest power of Γ. Thus, for *f* < 1/2 we have:

$$I(\mathcal{S}:\mathcal{F}\_f) = \ln 2 - \frac{1}{2} \Gamma^f. \tag{4.43}$$
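Equations (4.42) and (4.43) are straightforward to evaluate numerically. The sketch below (with an illustrative value of Γ) also verifies the pure-state complementarity *I*(S : F*f*) + *I*(S : F1−*f*) = 2*H*S, which follows from comparing Equation (4.42) with Equation (4.39):

```python
import numpy as np

def I_SF(f, Gamma, terms=400):
    """Mutual information of Eq. (4.42), truncated after `terms` terms."""
    n = np.arange(1, terms + 1)
    return np.log(2) + np.sum(
        (Gamma**((1 - f) * n) - Gamma**(f * n) - Gamma**n)
        / (2 * n * (2 * n - 1)))

Gamma = 1e-6   # long times: Gamma = exp(-t/tau_D) << 1
for f in (0.1, 0.25, 0.4):
    # exact sum vs. the approximation of Eq. (4.43)
    print(f, I_SF(f, Gamma), np.log(2) - 0.5 * Gamma**f)
```

The approximation of Equation (4.43) is already good to better than 1% of ln 2 for these parameters, and *I*(S : F*f*) rises monotonically toward the plateau.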

This allows us to estimate the redundancy for *δ* < 0.5 as:

$$R\_{\delta} \approx \frac{1}{\ln\left(\frac{1}{2\delta \ln 2}\right)}\, \frac{t}{\tau\_D} \,. \tag{4.44}$$

As in the case with the central spin (but not with the quantum Brownian motion) redundancy depends only weakly—logarithmically—on the information deficit *δ*.
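The estimate of Equation (4.44) follows by setting *I*(S : F*fδ*) = (1 − *δ*)*H*S with *H*S = ln 2 in Equation (4.43), and using Γ = exp(−*t*/*τD*):

$$
\Gamma^{f\_\delta} = 2\delta \ln 2 \;\Rightarrow\; f\_\delta = \frac{\ln(2\delta \ln 2)}{\ln \Gamma} = \frac{\tau\_D}{t}\, \ln\frac{1}{2\delta \ln 2}\,, \qquad R\_\delta = \frac{1}{f\_\delta}\,,
$$

which grows linearly in *t*/*τD* and depends on *δ* only through the logarithm.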


**Figure 13. Quantum Darwinism in a photon environment—the origin of the photohalo** (Riedel and Zurek, 2010; 2011 [158,181]). The quantum mutual information *I*(S : F*f*) vs. fragment size *f* at different elapsed times for an object illuminated by point-source black-body radiation, and by isotropic black-body radiation. (**a**) **For point-source illumination** individual curves are labeled by the time *t* in units of the characteristic time *τD*, Equation (4.40). For *t* ≤ *τD* (red dashed lines), the information about the system available in the environment is low. The linearity in *f* means each piece of the environment contains new, independent information. For *t* > *τD* (blue solid lines), the shape of the partial information plot indicates redundancy; the first few fragments of the environment give a lot of information, while the additional fragments only confirm what is already known. Such a photohalo contains many copies—multiple qmemes—of the record of the location of S. The remaining quantum information (i.e., mutual information above the plateau) is highly encrypted in the global state, in the sense that it can only be accessed by capturing almost all of E and measuring SE in the right way. (**b**) **For isotropic illumination** the same time-slicing is used as in (**a**) but there is greatly decreased mutual information because the directional photon states are "full" and cannot store more information about the state of the object.

What is even more important, redundancy continues to increase linearly with time at a rate given by the inverse of the decoherence time *τD*. This is different from either the central spin or the quantum Brownian motion models. This difference is due to the nature of these models: There, the subsystems of the environment continued to interact with the decohered system, so that the information they acquired about S sloshed back and forth between S and correlations with E. As a result, a steady state was reached, and redundancy saturated at modest values. In the case of photons the transfer of information is unidirectional—they scatter and go off to infinity (or fall into the eye of the observer). Even a tiny speck of dust can decohere quickly in a photon environment at very modest cosmic microwave background temperatures, as was noted by Joos and Zeh, 1985 [174].

Redundancy of illuminated objects can quickly become enormous [158,181]. For instance, a 1 μm speck of dust in a superposition of Δ*x* ∼ 1 μm on the surface of Earth illuminated by sunlight (*T* = 5250 K) would produce *Rδ* ∼ 10<sup>8</sup> records of its state—its location—in just 1 microsecond. As a consequence, huge redundancies that grow linearly with time are inevitable.

By contrast with the case of the point source (or, more generally, with the case where illumination comes from more than one direction (Riedel and Zurek, 2011) [158]), isotropic illumination by black body radiation (Figure 13b) does not result in the buildup of redundancy. This is understandable, as the maximum-entropy blackbody radiation that fills all the space near the system has no more room to store the information about the location of S—its initial entropy *H*F(0) is already maximal. As a result, *H*F − *H*F(0) = 0, and the classical, locally accessible contribution to mutual information disappears. This is in spite of the fact that quantum decoherence caused by blackbody radiation is very effective.

Rapid increase and large values of redundancy signify objectivity—many observers (∼ 10<sup>8</sup> after only 1 microsecond!) could in principle obtain the same information and will agree about the state of the system. Thus, quantum Darwinism resulting from the photon environment is very effective, and accounts for the emergence of the "objective classical reality" in our quantum Universe.

The flip side of the huge redundancies is irreversibility—the difficulty of undoing redundant decoherence. Restoring the state of our (modest) ersatz Schrödinger cat to the preexisting superposition would require control and manipulation of all of the fragments of the environment. In our example this means ∼ 10<sup>14</sup> fragments that, after just one second, independently "know" the location of S. Recovery of phase coherence—reversal of decoherence—requires intercepting all of E, including the very last fraction that has a record of S and that corresponds to the "quantum rise" (at *f* → 1) in the plots of the mutual information. Moreover, such manipulations of ES would involve global observables.

Measurement is *de facto* irreversible not just because an observer in possession of the record of its outcome cannot reverse the evolution that led to the wavepacket collapse (as discussed in Section 2), and not because "observers choose to ignore the environment", but because they cannot get hold of and control all of the ∼10<sup>14</sup> (or more) fragments of E with the record of the outcome—and that would be a precondition for the attempted reversal. This is especially obvious with photons: As soon as a minute fraction (∼10<sup>−14</sup> of the photons that scattered) escapes within a second, the irreversible "reduction of the state vector" is a *fait accompli*. Moreover, escaped photons are gone for good—reversibility of the dynamics of this process is trumped by relativistic causality.

#### *4.3. Experimental Tests of Quantum Darwinism*

Experimental study of quantum Darwinism faces a problem already familiar from the tests of decoherence: Both decoherence and quantum Darwinism are so efficient that in everyday life their consequences are taken for granted, and even in the laboratory it is difficult to find situations where the effect of the environment can be adjusted at will and quantified.

In the study of decoherence this problem was bypassed by using carefully tuned microsystems with controllable coupling to the environment (Brune et al., 1996 [182]; Hackermüller et al., 2004 [180]; Haroche and Raimond, 2006 [183]). A similar strategy was adopted in the experimental studies of quantum Darwinism, although the branching states that contain information about S imprinted on E have to contend with decoherence of the composite SE coupled to the more distant degrees of freedom (that cause additional decoherence of SE as a whole): As E with the record of the pointer states of S grows, the differences between the growing branches seeded by the pointer states increase, and the branches become more susceptible to decoherence.

The strategy of finding mesoscopic systems where the process that is key to quantum Darwinism—proliferation of multiple copies of information about S—can be controlled and studied has nevertheless led to several experiments. Thus, the groups of Mauro Paternostro (Ciampini et al., 2018) [184] and Jian-Wei Pan (2019) [185] carried out what amounts to a sequence of logical gates to imprint information about a system initialized in a superposition of pointer states on a collection of photons. The goal was confirmation of the key idea—that a small fraction of E suffices to gain almost all information about S, and that enlarging that fraction only confirms what was already found out. The results are consistent with this expectation. A similar strategy (with similarly positive results) was pursued in an emulation of quantum Darwinism on an IBM 5-qubit quantum computer (Chisholm et al., 2021) [186].
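
In the same spirit as these emulations, a minimal numerical sketch (a toy branching state built with numpy—not any of the published protocols) exhibits the signature the experiments look for: the mutual information *I*(S : F) jumps to *H*S for even a single-qubit fragment, stays on the classical plateau, and shows the quantum "uptick" to 2*H*S only when the fragment is all of E:

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def reduced(psi, keep, n):
    """Reduced density matrix of the qubits in `keep`, for a pure state on n qubits."""
    traced = [q for q in range(n) if q not in keep]
    t = psi.reshape((2,) * n)
    rho = np.tensordot(t, t.conj(), axes=(traced, traced))
    d = 2 ** len(keep)
    return rho.reshape(d, d)

# Branching state: sqrt(p0)|0>|00...0> + sqrt(p1)|1>|11...1>
# (a system qubit whose pointer states were CNOT-copied onto n_env environment qubits).
p0, p1, n_env = 0.3, 0.7, 6
n = n_env + 1
psi = np.zeros(2 ** n, dtype=complex)
psi[0], psi[-1] = np.sqrt(p0), np.sqrt(p1)

H_S = entropy(reduced(psi, [0], n))
for m in range(1, n_env + 1):                 # fragment = first m environment qubits
    frag = list(range(1, m + 1))
    I = H_S + entropy(reduced(psi, frag, n)) - entropy(reduced(psi, [0] + frag, n))
    # plateau at H_S ~ 0.881 bits for m < n_env; uptick to 2*H_S ~ 1.763 at m = n_env
    print(m, round(I, 3))
```

The plateau means any small fragment already delivers (nearly) all classical information about S; the final uptick is the signature of overall purity whose absence in the experiments is discussed below.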

The group of Fedor Jelezko (Unden et al., 2019) [187] used a nitrogen-vacancy (NV) center as the system, relying on its natural interaction with four surrounding <sup>13</sup>C nuclei in the diamond, as illustrated in Figure 14. The density matrix of SE was reconstructed using quantum tomography. The resulting partial information plots demonstrate that the gain of information upon the measurement of the first fragment of E is largest. As anticipated, information gains from the additional fragments show diminishing returns.

**Figure 14. The emergence of redundancy for an NV center decohered by its environment** (after Unden et al., 2019 [187]). (**a**) The measurement protocol starts with the initialization sequence and follows with a free decoherence-inducing evolution, *U*, for a duration *t*. After initial polarization, two *π*/2-pulses transform the state into a product of |+⟩ states which then evolve under the direct hyperfine (HF) interaction between the NV center (the system S) and the nuclear spins (the environment E). The tomography sequence follows. (**b**) Holevo information versus fraction size for a few different times. For small times, there are no correlations between S and E. However, as decoherence proceeds, information is transferred into E, resulting in formation of a *classical plateau*. The plateau signifies the appearance of redundant information. When the data is not normalized to the initial degree of polarization (black), only the initial rise of information is seen. (**c**) Holevo information, *χ*(S : F), versus the environment fragment size F and free evolution time *t*. For small F one can see the initial rise in information with time followed by oscillations. This is due to information flowing into the fragment of the environment and then back into the system (i.e., environment spins will first gain information and then transfer it back to S). For larger fractions, however, one sees just a rise and a plateau with time. This is due to different interaction strengths with the environment spins, which favors one-way information flow. The solid curves in (**b**,**c**) show the results of simulations with and without imperfect initial polarization. The dynamics in simulation are governed by the actual Hamiltonian. The semi-transparent red lines in (**b**) and the plane in (**c**) indicate an information deficit of 1/*e*, i.e., *I* = (1 − 1/*e*)*H*S . Errors are smaller than the data points. (See Unden et al., 2019 [187], for further details.)

The results of all of these experiments are consistent with what was expected, with one exception: If SE as a whole were isolated from the other environments, there should have been a corresponding "uptick" in mutual information as F → E. There was no convincing evidence of that signature of the overall purity. This is hardly surprising: As noted earlier, in addition to the deliberate decoherence resulting in quantum Darwinism, the system-environment composite interacts with other degrees of freedom. Phase coherence between the branches corresponding to the amplified pointer states of SE is then suppressed by that spurious decoherence.

#### *4.4. Summary: Environment as an Amplification Channel*

Decoherence has made it clear that quantum states are far more fragile than their classical counterparts. A state that is stable (and, therefore, can aspire to classicality) is selected—einselected—with the environment having a decisive say in the matter. However, even this dramatic change of view (which limits the usefulness of the idealization of "isolated systems"—the mainstay of classical physics—in explaining how our quantum Universe works) turns out to underestimate the role of the environment.

Quantum Darwinism demonstrates that preferred states are not only selected for their stability (the ability to survive the "hostile environment") but are also communicated by that very same environment, which thus serves as a communication channel. Therefore, the environment acts both as a censor (for some states) and as an advertising agent that disseminates many copies of the information about the pointer states while suppressing complementary information about their superpositions.

Regarding the environment as a communication channel is more than a figure of speech: A quantum communication channel can be regarded (see, e.g., Wilde, 2013 [188]; Preskill, 2020 [189]) as a correlated state of an input and an output. A quantum channel used to transmit classical information can be represented by a state:

$$\rho\_{\mathcal{S}\mathcal{F}} = \operatorname{Tr}\_{\mathcal{E}/\mathcal{F}}\, \rho\_{\mathcal{S}\mathcal{E}} \simeq \sum\_{k} p\_k\, |\pi\_k\rangle\langle\pi\_k| \otimes \rho\_{\mathcal{F}\_k}$$

where |*π<sub>k</sub>*⟩ are effectively classical (i.e., orthogonal) input states—messages that are to be communicated. They are encoded in the output density matrices *ρ*F*k* that are the records of |*π<sub>k</sub>*⟩, and that are to be eventually measured. This was our Equation (4.26).

In the idealized situation when both S and E start pure, the initial state of the whole is also pure:

$$\rho\_{\mathcal{S}\mathcal{E}} = |\Psi\_{\mathcal{S}\mathcal{E}}\rangle\langle\Psi\_{\mathcal{S}\mathcal{E}}|$$

where (see Equation (4.17)):

$$|\Psi\_{\mathcal{S}\mathcal{E}}\rangle = \sum\_{k} e^{i\phi\_{k}} \sqrt{p\_{k}}\, |\pi\_{k}\rangle \left| \varepsilon\_{k}^{(1)} \right\rangle \left| \varepsilon\_{k}^{(2)} \right\rangle \cdots \left| \varepsilon\_{k}^{(l)} \right\rangle \cdots$$

Tracing over a part of the environment yields a state of the form of *ρ*SF above, where the *ρ*F*k* represent states of the fragments correlated with the effectively classical pointer states |*π<sub>k</sub>*⟩. Moreover, the "roots" of the branches |*π<sub>k</sub>*⟩⟨*π<sub>k</sub>*| that appear in *ρ*SF are largely independent of what part of the environment is traced over, for most of the range of the possible sizes of E/F: This is an excellent approximation for *fδ* < *f* < 1 − *fδ*, that is, all along the plateau of the partial information plots. Thus, the same preferred branches of SF are singled out regardless of what part of the environment is detected and what part is out of reach, at least as long as E/F suffices to decohere the rest.
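
This branch structure is easy to check numerically. A minimal numpy sketch (a toy 4-qubit branching state of the idealized form of Equation (4.17); the sizes and weights are of course illustrative): tracing out even a single environment qubit leaves *ρ*SF block-diagonal in the pointer basis, while the untraced pure state still carries the coherence between the branches.

```python
import numpy as np

# Pure branching state on 1 system + 3 environment qubits:
# |Psi> = sqrt(p0)|0>|000> + sqrt(p1)|1>|111>
p0, p1, n = 0.3, 0.7, 4
psi = np.zeros(2 ** n, dtype=complex)
psi[0], psi[-1] = np.sqrt(p0), np.sqrt(p1)

rho_full = np.outer(psi, psi.conj())      # |Psi><Psi| retains branch coherence

# Trace out the last environment qubit (the discarded remainder E/F):
t = psi.reshape(2 ** (n - 1), 2)          # row index = S + fragment F, column = E/F
rho_SF = t @ t.conj().T

d = 2 ** (n - 2)                          # dimension of the 2-qubit fragment F
coherence_before = abs(rho_full[0, -1])   # <0...0|rho|1...1> = sqrt(p0*p1)
coherence_after = np.linalg.norm(rho_SF[:d, d:])  # <0|_S rho_SF |1>_S block

print(round(coherence_before, 3))  # 0.458
print(round(coherence_after, 3))   # 0.0
```

The surviving diagonal blocks carry weights *p*0 and *p*1—the *p<sub>k</sub>* |*π<sub>k</sub>*⟩⟨*π<sub>k</sub>*| ⊗ *ρ*F*k* terms of the channel state above.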

This transformation of the communicated message from quantum to classical is implied by one more feature of *ρ*SE that can be (see Equation (4.30)) approximated as:

$$\rho\_{\mathcal{S}\mathcal{E}} \simeq \sum\_{k} p\_k\, |\pi\_k\rangle\langle\pi\_k| \bigotimes\_{l=1}^{R\_{\delta}} \rho\_{\mathcal{F}\_k}^{(l)}\ .$$

This approximation holds when the off-diagonal terms of *ρ*SE (obviously still present in |ΨSE⟩⟨ΨSE|) are out of reach, as would be the case for an observer who cannot acquire all of SE and measure it in the correct basis. Thus, Equation (4.30) (reproduced above) provides a justification of the existence of branches with many (*Rδ*) copies of the same message, suggesting a natural extrapolation "by induction" of the objective existence of the "root" |*π<sub>k</sub>*⟩ of the branch<sup>16</sup>.

Independence of the communicated message from E/F, the remainder of the environment that is traced over, is key: All the fragments of the environment know about the same observable of the system. Thus, omitting some part of the environment is perfectly justified—up to a correction of the order of the information deficit *δ*, it does not alter what is known about the system of interest.

Indeed, it is usually an excellent approximation to assert that, if we were to look at a macroscopic S—if we intercepted even a very small fraction of E, *f* ≳ *fδ*, just a few of the photons that bounced off of S—the information we would get would be consistent with S occupying a single, as yet unknown, pointer state. Our "looking" is usually insufficient to identify such a single state, but if we continued with more precise measurements, we would eventually conclude, based on all the data gathered from E (and confirmed by direct measurement of S, if desired), that whatever pure state we inferred from the incomplete partial information was gradually revealed by the data gathered along the way.

In other words, in many situations our acquisition of information will correspond to the initial rising part of the partial information plot. Our confidence about the existence of a definite state at the end of the information acquisition process is based on cases when the plateau is reached, or even simply on the fact that the additional, higher resolution information is consistent with the information acquired earlier.

The step from the epistemic ("I have evidence of |*π*<sub>17</sub>⟩.") to the ontic ("The system is in the state |*π*<sub>17</sub>⟩.") is then an extrapolation justified by the nature of *ρ*SE: Observers who detected evidence consistent with |*π*<sub>17</sub>⟩ will continue to detect data consistent with |*π*<sub>17</sub>⟩ when they intercept additional fragments of E. So, while the other branches may in principle be present, observers will perceive only data consistent with the branch to which they got attached by the very first measurement. Other observers that have independently "looked at" S will agree.<sup>17</sup>

Objective existence of classical reality turns out to be an enormously simplifying, exceedingly accurate, and, therefore, very useful approximation. Thus, when agents' success depends on acting in response to perceived "objective reality", they can do that with confidence based on the indirect data obtained from E.

There is a sense in which our strategy explores and vindicates the artificial division of the Universe between quantum and classical introduced by Heisenberg—what John Bell (1990) [193] called a "shifty split"—in measurements. "Shifty" in the description of the "split" was not meant to be a compliment. The split happened somewhere along the von Neumann chain that connected the quantum system with the observer. It divided a quantum part of the chain (where superpositions were allowed) from the classical part (where a single actuality existed). What quantum Darwinism shows is that "shifty" can be regarded as a statement of invariance, and, thus, upgraded from a statement of contempt to a recognition of a symmetry.

Decoherence introduced the environment into the picture of quantum measurement. The original von Neumann chain has fanned out: Parts of the chain separate from the links that connect S via the apparatus A with the observer, and go sideways, into E. They disseminate the same information—*Rδ* copies of it—as the chain connecting S with the observer splits into sub-chains. Indeed, there may be—and often are—other observers benefiting from the information communicated by these sub-chains.

<sup>16</sup> More formal arguments that justify regarding fragments of the environment as channels that deliver preselected information about the pointer states have been put forward by Brandão, Piani, and Horodecki (2015) [190] and by Qi and Ranard (2021) [191]; see also Knott et al. (2018) [192].

<sup>17</sup> That is, an observer who records "17" will only encounter others whose records are consistent with "branch 17".

Quantum Darwinism recognizes and quantifies this fanning out of the von Neumann chain. The shifty split could be placed anywhere along the plateau of the partial information plot. The information about S that can be extracted from the chain is invariant under the shift of that split. In addition, every branch of the chain is firmly attached to the effectively classical pointer state singled out by decoherence.

#### **5. Quantum Darwinism and Objective Existence: Photohalos and Extantons**

We reviewed research aimed at understanding how the classical world we perceive emerges from the counterintuitive laws of quantum mechanics. It is time to take stock. Do we now understand why we perceive our undeniably quantum Universe as classical?

Quantum Darwinism holds the key to this last question. Deducing the discreteness that sets the stage for the wavepacket collapse from repeatability and unitarity reveals the quantum origin of quantum jumps, defines "events", and justifies the emphasis on the Hermitian nature of observables, but this is a derivation of one of the *quantum* textbook axioms. The origin of Born's rule in the symmetries of entanglement matters in interpreting *quantum* probabilities, but—as important as axiom (v) is for experimental predictions—its derivation from the core quantum postulates (o)–(iii) does not directly account for the familiar everyday reality.

Quantum Darwinism, by contrast, explains why our world appears classical, and why XIX century physics (physics still relevant for our everyday routine) seemed at the time like the whole story. It shows how the information is channeled from the "objects of interest" to us, observers, leading to consensus about what exists—to the idealization of objective classical reality.

Our everyday world comprises "systems of interest" endowed with rest mass—the focus of XIX century physics—and environments that often play the role of communication channels. The quantumness of the massive "systems of interest" is suppressed by decoherence that is continually uploading qmemes of their states into the environments that broadcast that information. Thus, every object of interest is ensconced in an expanding halo of qmemes—information-carrying fragments of E. We perceive our world by intercepting fragments of these information-laden halos. We extract data about the systems of interest but tend to ignore the role of the halos—the means of their transmission—in the emergence of the familiar classical reality.

#### *5.1. Anatomy of an Extanton*

The combination of the macroscopic, massive core with the information-bearing halo defines an *extanton* (as in "extant"). Extantons are responsible for how we perceive our Universe. The tandem—the macroscopic decohered system along with its information-laden halo—fulfills Bell's desideratum for "beables" (Bell, 1975) [40]<sup>18</sup>. As long as an observer relies on the halo of photons (or, possibly, other decohering environments) for the information about S, only the pointer states of the extanton core can be found out.

Extantons exist as beables should: The state of the environment is of course perturbed by an agent's measurements (e.g., photons are absorbed by our eyes). However, redundancy means that this does not alter the information about the core still available from the rest of the halo in many redundant (*Rδ* ≫ 1) copies.

The essence of the extanton—the only attribute classical physics cares about—is the state of its core. Information about it is available in multiple copies, and, hence, virtually unaffected by the perturbations of the halo. Indeed, as the environment continues to monitor S, qmemes of its pointer state in E multiply, and the halo continues to expand while the pointer states stay put (or evolve as a classical system would).

Extantons are not "elementary", but neither are atoms or even elementary particles such as protons or mesons (which are made out of quarks), not to mention planets or stars.

<sup>18</sup> We should not presume that Bell (who held strong opinions on interpretational questions [41]) would have approved of extantons as a model for 'entities that be'—"beables". We shall therefore call the composite entity consisting of the macroscopic object enveloped in and heralded by its information-carrying halo an extanton.

Yet protons, mesons, atoms, molecules, planets, stars, etc., are all useful in representing and accounting for various phenomena in our Universe.

Extantons account for the classical world of our everyday experience. Quantum Darwinism explains how they come about. We can reduce extantons to their constituents, but—like other higher-level entities such as composite elementary particles, atoms, molecules, viruses, stars—extantons enable an approximate, but simple and useful description of the classical realm, of the familiar world emerging from within the quantum substrate.

Part of that simplification that led to the Newtonian view of the world is our habit of ignoring halos, carriers of the information, and focusing on the extanton cores. These cores are the "objects of interest". Many are subject to Newtonian dynamics. It was convenient to ignore the means by which the information about the cores is delivered, and just focus on the physics of the cores.

The inseparable bond of the core with the halo is responsible for what we perceive. However, conditioned by Newtonian physics, we tend to ignore the presence and the role of the halos, and just use the data they deliver to find out about the cores. In the end, only these cores count as elements of our everyday reality. Yet, it is the halos that bear responsibility for the suppression of quantum superpositions and for our perception of the familiar—robust and objective—classical world.

The difficulty with the quantum-classical correspondence arises when the role of the halo is overlooked and the extanton core is treated as if it were isolated, and, therefore, subject to unitary evolution. The core is indeed quantum (as is everything else in our Universe), but it is by no means isolated. Therefore, one cannot expect the core of an extanton to follow a unitary quantum evolution. Nevertheless, whenever Newtonian dynamics is a good approximation, its classical evolution can be approximately reversible.

#### 5.1.1. Extantons and "The Classical"

No one has ever worried about interpretation of classical physics. This is because classical states were thought to be real—to exist independently of what was known about them. There was no need to be concerned about the effect of information acquisition. Even though measurement could perturb a system, that perturbation could always be made as small as required.

The main reason for the interpretational discomfort with quantum theory is our faith in the underlying objective reality. It is based on our everyday experience that leads us to believe the world we inhabit exists independently of the information we (or other agents) have.

We owe this confidence in objective classical reality to the fact that we are immersed in the extanton halos and inundated with the information about their cores. As a result of this overload with "free" data we have grown up believing that we can examine systems and determine their states without perturbing them—classical measurement would just update the record (change the state of the apparatus or of the observer's memory) without 'backreaction'. In the classical setting information about a system is obtained, but its state or its evolution is untouched.

One might call this a *myth of immaculate perception*. It asserts a unidirectional information flow (hence, flow of influence, from the measured system to the observer) that reveals the state of the system but leaves it unaffected. This seems to defy the spirit of Newton's principle that action elicits reaction.

Quantum Darwinism accounts for the origin of this myth. All macroscopic objects telegraph their pointer states via their decohering environments. They are enveloped by information halos—by the environment with multiple records of their decoherence-resistant states. Our everyday world does not consist of isolated systems. Rather it is defined by the extantons consisting of a macroscopic object enveloped in and heralded by its halo, imprinted on the environment.

There is a great variety of extantons. They all consist of the core and the halo that "knows" the state of the core, and may be persuaded to share that information with observers. Planets are extanton cores, and so are the grains of photographic emulsion. Any object one can see is likely an extanton core, its state communicated to us by photons—by its photohalo. Some extantons (like planets) follow approximately reversible dynamics. Others (like grains of photographic emulsion or Brownian particles) are heavily damped or embedded in an immobilizing medium competing with the photohalo in monitoring the core (and, hence, in decohering it). Every apparatus pointer is an extanton core.

Observers intercept fractions of the halo to gain information about the core. Direct measurement of the core is in principle also possible, but only measurements of the pointer observable would lead to predictable results: Other outcomes—superpositions of pointer states—are quickly invalidated by decoherence. Thus, except for laboratory settings (where decoherence can be kept at bay by a near-perfect isolation) the only observables with predictive value are the pointer observables readily available from the halo.

The photohalo is the main channel through which observers find out the state of the core. The information available from the photohalo is usually limited to the data that are in effect macroscopic (and, therefore, easily imprinted on the halo, such as the location of the dielectric sphere discussed earlier).

#### 5.1.2. Photohalos, Photoextantons, and Information Detached from Existence

Photons play a preeminent role as information carriers. Every object we know is bathed in a radiation environment. Each is surrounded by an information halo, its photons disseminating qmemes with the speed of light. We eavesdrop on these photohalos, intercepting small fractions that nevertheless reveal the state of the core.

Photoextantons are the family of extantons that advertise the state of their core via their photohalos. One can consider extantons where photohalo is the only relevant environment. There are at least two good reasons for this. To begin with, our eyesight is responsible for most of the information we obtain, so it is of interest to consider photoextantons. Perhaps more importantly, there is a sense in which photoextantons come close to the ideal—one might say Platonic ideal—of the separation of existence and information detached from existence.

Information (in the form of the photohalo) has—apart from the dramatic consequences of decoherence—only a negligible effect on the evolution of the cores. Photohalo detaches from the core and runs off seemingly without any dynamical consequences while the core can evolve in approximate accord with Newton's laws. Questions such as "Does the moon exist when no one's looking?" are motivated by the illusion that one can apply quantum theory to the isolated cores of extantons and recover classical reality.

Quantum states are *epiontic*. Quantum physics has eliminated separation of existence and information. Nevertheless, extanton structure restores this separation (with suitable caveats): The core exists (it persists in a pointer state). And the information about it is continually detached and propagates as the photohalo.

Everyday practice of quantum theory has inherited from classical physics the habit of dealing with isolated systems. Yet, we can never encounter isolated macroscopic extanton cores. Trying to understand cores of extantons as if they were isolated is the cause of the "measurement problem". Decoherence and einselection were the first steps towards its resolution, towards the understanding why our Universe looks classical to us. Quantum Darwinism provides the complete answer, and extantons are its embodiment.

Properties of systems that reside in the extanton cores and give rise to the qmemes in E are untouched by the measurements on the halo, yet they reveal the state of the core. This is guaranteed by the theorems we have already discussed, which also imply (depending on specific assumptions) uniqueness or near uniqueness of the states that can be repeatedly imprinted on the halo and, therefore, deduced from the environment fragments: The optimal strategy for the agent is to find out what the halo redundantly advertises—that is, the pointer states of the extanton core—and then use that information to deduce other observables of interest.

#### 5.1.3. Photohalos and the Quantum Origins of Irreversibility

Information in classical, Newtonian physics was immaterial—it was about physics, but it was not a part of physics. Its acquisition did not influence states or the *reversible* dynamics of classical systems. Trajectories in phase space were unaffected by the information transfer. In particular, classical dynamics involved in the measurement process could be reversed even when the outcome of the measurement was recorded. By contrast (and as we have seen in Section 2) copying of a quantum measurement outcome precludes its reversal.

How could an effectively classical realm of our fundamentally quantum Universe follow so faithfully laws that are at odds with the epiontic nature of quantum states? What detaches information about the states from these states, so that they can follow classical reversible dynamics, oblivious to what is known about them?

The Platonic ideal of the separation of existence and the detached information about what exists would be fulfilled if the information in the photohalo had no effect on the energy or momentum of the core, but still decohered it efficiently enough to assure the effective classicality of its state. Photohalos exert no (easily) noticeable influence on trajectories of the pointer states of the core. However, cores can interact indirectly with other cores (e.g., via gravitational forces) or scatter elastically (billiard balls) or inelastically. Consequences of such interactions for the motion or the state of the core or for its properties (e.g., merger of the cores in inelastic scattering) will be reflected in the photohalo.

When the core is sufficiently massive, emission or scattering of photons has negligible effect on its momentum (although the photohalo will still decohere its quantum state very efficiently). How heavy the core should be to make such an approximation accurate depends on the energy of the emitted or scattered photons. Planets are clearly sufficiently massive, and so are billiard balls or even dust grains [21,23,174]. Depending on detailed criteria, one might even classify fullerenes decohered by the radiation they emit [180] as photoextantons.

Irreversibility associated with the wavepacket collapse is a natural consequence of the photohalo. As soon as a minute fraction of the photohalo escapes, the 'reduction of the state vector' becomes irreversible, since the escaped photons are gone for good. And, as we have seen in Section 2 (and unlike in the classical realm), the presence of even a single record of the measurement outcome—of the pointer state—anywhere in the Universe precludes reversal of the evolution that would bring back superpositions. Thus, the "in principle" reversibility of the equations responsible for inscribing, e.g., 10<sup>14</sup> copies per second of the location of the dielectric sphere on the sunlight, as discussed earlier, is trumped by relativistic causality: Reversal becomes impossible in principle as soon as even one such copy runs off to infinity with the speed of light, never to be turned back<sup>19</sup>.

The information about the classical states (of the cores) becomes detached: Separation of the states of the cores from the information about them is how extantons account for the emergence of the objective classical reality. This mechanism is also the uniquely quantum origin of irreversibility in our Universe.

The photohalo may not be the only environment. However, in our Universe interactions depend on distances. Therefore, localized pointer states imprinted on the photohalo are left intact also by the other environments that may be contributing to decoherence but—like air—are not as useful as witnesses.

#### *5.2. Quantum Darwinism and the Existential Interpretation*

States in classical physics were "real": Their *objective existence* was established operationally—they could be found out by an initially ignorant observer without getting perturbed in the measurement process. Hence, they existed independently of what was known about them.

<sup>19</sup> One can deduce Boltzmann's H-theorem and the Kolmogorov-Sinai entropy increase rate by appealing to such considerations (Zurek and Paz, 1994 [66]; 1995 [21]; Zurek, 1998 [23]), but this subject as well as envariant origin of thermodynamics that emerges along with Born's rule due to the coupling of systems to their environments (Deffner and Zurek, 2016 [194]; Zurek, 2018 [195]) are beyond the scope of this review.

Information was, by contrast, "not real". This was suggested by the immunity of classical states to measurements, and by the fact that, in Newtonian dynamics, information about an evolving system was of no consequence for its evolution. Information was what the observer knew subjectively, a mere shadow of the real state, irrelevant for physics.

This dismissive view of information ran into problems when the classical Universe of Newton confronted thermodynamics. The clash of these two paradigms led to Maxwell's demon, and is echoed in the discussions of the origins of the arrow of time.

The specter of information was and still is haunting physics. The seemingly unphysical record state was beginning to play a role reserved for the real state!

Quantum states are epiontic. They combine information and existence and (like photons or electrons that can be wave-like or particle-like) they can reveal one or the other aspect of their nature depending on circumstances.

We have just seen how, in the quantum setting, in extantons, existence and information about existence intertwine. The state known to observers is defined and made objective by what is known about it—by the information observers can access. "It from bit" comes to mind (Wheeler, 1990) [196].

The main new ingredient is the dramatic upgrade of the role of the environment. It has information—multiple records of S—and is willing to reveal it. It acquires information about the system while causing decoherence and einselection, but—and this is the upgrade— it acts as a communication channel.

In classical Newtonian settings information might have been dismissed as unphysical as it had no significance for dynamics. But *information is physical* (Landauer, 1991) [197]. Moreover, *there is no information without representation*—information must reside somewhere (e.g., in the photohalos). And the presence and availability of such evidence (objective because there are plenty of records of the pointer state) has its legal consequences.

The role of E in quantum Darwinism is not that of an innocent bystander, who simply reports what has happened. Rather, the environment is an accomplice in the "crime" of selecting and transforming fragile epiontic quantum states into robust, objectively existing classical states. Objective existence has its price: Environment-induced decoherence invalidates the quantum principle of superposition, leading to einselection—to censoring of the Hilbert space. Information transfer associated with decoherence selects preferred pointer states, and banishes their superpositions.

Moreover, testimony offered by the environment is biased—it depends on how (through what observable) E monitors S. Fragments of the environment—qmemes carried by the extanton halo—can reveal information only about the very same pointer states E has helped einselect.

The operational criterion for objective existence is the ability to find out a state without disturbing it. According to this operational definition, pointer states exist in more or less the same way their classical counterparts did: They can be found out without getting perturbed by anyone who examines one of the multiple copies of the record of S "on display" in the environment.
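This criterion can be checked directly in the standard toy branching model of the decoherence literature (a minimal numerical sketch under illustrative assumptions, not the specific models of this review): once the pointer state of S has been copied into several environment qubits, a readout of any one copy in the record basis leaves the reduced state of S unchanged, so many observers can consult different copies and agree.

```python
import numpy as np

N = 3                                   # environment qubits holding copies
a, b = np.sqrt(0.8), np.sqrt(0.2)

# Branching state a|0>|000> + b|1>|111>: the pointer state of S is copied
# into every environment qubit.
psi = np.zeros(2 ** (N + 1), dtype=complex)
psi[0], psi[-1] = a, b
rho = np.outer(psi, psi.conj())

def reduced_S(rho):
    """Reduced density matrix of S (the most significant qubit)."""
    r = rho.reshape(2, 2 ** N, 2, 2 ** N)
    return np.einsum('aibi->ab', r)     # trace over the environment

# Non-selective readout of the first environment qubit in the record basis.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
proj = [np.kron(np.eye(2), np.kron(Pk, np.eye(2 ** (N - 1)))) for Pk in P]
rho_after = sum(Pk @ rho @ Pk for Pk in proj)

# Consulting one copy of the record leaves the state of S untouched.
print(np.allclose(reduced_S(rho_after), reduced_S(rho)))  # True
```

By contrast, a readout in a basis complementary to the record basis would disturb the global state; only the einselected records can be read out with impunity.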

#### *5.3. From Quantum Core Postulates to Objective Classical Reality*

In the search for the relation between quantum formalism and the real world we have woven together several ideas that are very quantum to arrive at the *existential interpretation*. Its essence is the operational definition of objective existence of physical states: *To exist, a state must, at the very least, persist or evolve predictably in spite of the immersion of the system in its environment.* Predictability is the key to einselection, but persistence is only a necessary condition.

*Objective existence requires more: It should be possible to find out a state without perturbing it—without threatening its existence.* When that last desideratum is met, many observers will be able to reach consensus, a well-motivated practical criterion of objective existence.

Let us briefly recapitulate how objective existence arises in the quantum setting: We started with axioms (i) and (ii) that sum up the mathematics of quantum theory: They impose the quantum principle of superposition, and demand unitarity, but make no connection with the "real world". Addition of predictability (via the repeatability postulate (iii), the only uncontroversial measurement axiom), and recognition that our Universe consists of systems (axiom (o)) leads to *preferred pointer states*. This is a new insight into the quantum origin of discreteness. Predictability is the "root cause" of the wavepacket collapse and quantum jumps. It also justifies the Hermitian nature of quantum observables and explains the breaking of the unitary symmetry, the crux of the collapse axiom (iv).

Our next task was to understand *the origin of probabilities and Born's rule*, axiom (v). We have done this without appealing to decoherence (as this would have courted circularity in the derivation). Nevertheless, the decoherence-inspired view is reflected in the envariant approach: To assign probabilities to pointer states we first had to show how to get decoherence without using tools of decoherence such as reduced density matrices and demonstrate that relative phases between outcomes do not matter.

Envariance provides the answer, but—strictly speaking—only for Schmidt states. We take this to mean that the usual rules of the probability calculus will strictly hold for pointer states only after their superposition has been thoroughly decohered. The coincidence of pointer states and Schmidt states is expected to be very good indeed: The probabilities one has in mind ultimately refer to pointer states of measuring or recording devices. These are usually macroscopic, so their interaction with the environment will quickly align the Schmidt basis with the pointer states. Born's rule (with all the consequences for the frequencies of events) follows.

Probabilities derived from envariance are objective: They reflect an objective—and experimentally testable—symmetry of the global state (usually involving the measured system S, the apparatus A, and its environment E). Before interacting with A or its E, the observer does not know the outcome, but will know the set of pointer states—the menu of possibilities. This ignorance reflects objective symmetries of the global state of SAE that lead to Born's rule.
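The envariant symmetry appealed to above can be verified in a two-qubit toy model. The sketch below (an illustration of the symmetry, not a reconstruction of the full derivation) checks that for an equal-amplitude Schmidt state a swap applied to S alone can be undone by a counterswap applied to E alone, while for unequal amplitudes it cannot; this is the symmetry that forces equal probabilities for equal amplitudes.

```python
import numpy as np

swap = np.array([[0, 1], [1, 0]], dtype=complex)  # exchanges |0> and |1>
I2 = np.eye(2, dtype=complex)

def swap_then_counterswap(psi):
    """Apply a swap on S alone, then try to undo it with a swap on E alone."""
    return np.kron(I2, swap) @ (np.kron(swap, I2) @ psi)

# Equal amplitudes: (|0>|e0> + |1>|e1>)/sqrt(2), written in the |s>|e> basis.
psi_eq = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
print(np.allclose(swap_then_counterswap(psi_eq), psi_eq))   # True: envariant

# Unequal amplitudes: the counterswap exchanges the amplitudes instead,
# so the transformation acting on S alone cannot be undone by E alone.
psi_neq = np.array([np.sqrt(0.8), 0, 0, np.sqrt(0.2)], dtype=complex)
print(np.allclose(swap_then_counterswap(psi_neq), psi_neq))  # False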

The last question was the origin of *objective existence* in the quantum world: How can we find states of systems we encounter in our everyday experience without redefining them by our measurements?

We started by noting that in contrast to fragile arbitrary superpositions in the Hilbert space of the system, pointer states are robust.

Crucially, in the real world observers find out pointer states by letting natural selection take its course: Pointer states are the "robust species", adapted to their environments. They survive intact through such "environmental monitoring". More importantly, multiple records about S are deposited in E. They record pointer states, which are the "fittest"—they can survive the process of copying and so the information about them can multiply.

There is an extent to which "it had to be so": In order to make one more copy one needs to preserve the original. However, there is a more subtle part of this relation between decoherence, einselection, and quantum Darwinism. Hamiltonians of interaction that allow for copying of certain observables necessarily leave them unperturbed. This conspiracy was noticed early: It is the basis of the commutation criterion for pointer observables (Zurek, 1981; 1982) [24,25]: When **H**SE is a function of some observable **Λ** of the system, it will also necessarily commute with it, [**H**SE , **Λ**] = 0.
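The commutation criterion is easy to verify numerically. The following sketch (with arbitrary, illustrative dimensions and operators, not taken from the original references) builds an interaction Hamiltonian that is a function of a system observable Λ, here simply H_SE = Λ ⊗ B, and checks that it commutes with Λ ⊗ 1_E, so monitoring of Λ leaves its eigenstates unperturbed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pointer observable Lambda of the system S, diagonal in its pointer basis.
Lam = np.diag([1.0, -1.0]).astype(complex)

# An arbitrary Hermitian operator B acting on a 3-dimensional environment E.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = (M + M.conj().T) / 2

# Interaction Hamiltonian that is a function of Lambda: H_SE = Lambda (x) B.
H_SE = np.kron(Lam, B)

# Lambda lifted to the joint Hilbert space: Lambda (x) 1_E.
Lam_SE = np.kron(Lam, np.eye(3))

# The commutator [H_SE, Lambda (x) 1_E] vanishes identically.
comm = H_SE @ Lam_SE - Lam_SE @ H_SE
print(np.allclose(comm, 0))  # True
```

The same cancellation holds for any H_SE built as a function of Λ tensored with environment operators, which is why such couplings can copy Λ without perturbing it.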

#### *5.4. Extantons and the Existential Interpretation*

Existential interpretation of quantum theory assigns "relatively objective existence" [6,7,36,198]—key to effective classicality—to widely broadcast quantum states. Objective existence is relative to the redundant records about what persists and (in that sense) exists— evidence of the pointer states imprinted on the environment.

Existential interpretation is obviously consistent with the relative states interpretation: Redundancy of the records disseminated throughout the environment suggests a natural definition of branches that are classical in the sense that an observer can find out a branch with indirect measurements and stay on it, rather than "cut off the branch he is sitting on" with a direct measurement. This is more than einselection, and much more than decoherence, although the key ingredient—the environment—is still essential, and the key criterion is "survival of the fittest"—immunity of the pointer states to monitoring by E, reflected also in the predictability sieve.

The role of E is, however, upgraded from a passive "quantum information dump" to a communication channel. Information deposited in E in the process of decoherence is not lost. Rather, it is stored there, in multiple copies, often—but not always—for all to see. The emphasis on the information theoretic significance of quantum states cuts both ways: The environment-as-a-witness paradigm supplies an operational definition of objective existence, and shows why and how pointer states can be found out without getting disrupted. However, it also shows that objective existence is not an intrinsic attribute of quantum states, but that it arises along with—and as a result of—information transfer from S to E.

Extantons combine the source of information (the extanton core) with the means of its transmission (the halo, often consisting of photons). An extanton is an extant composite entity with the object of interest in its core; the core decoheres while imprinting qmemes on the information-laden halo—the part of the environment that records the pointer states of the core. Information about them is disseminated by the halos throughout the Universe.

Photohalos are especially efficient in such "advertising". Fragments of the halo intercepted by the observers inform them about the state of its core. Extanton cores persist as classical states would—independent of what is known about them. While they are not fundamental, they are fundamentally important for our perception of the quantum Universe we inhabit as a classical world. Extantons fulfill John Bell's desiderata for "beables".

#### *5.5. Decoherence and Information Processing*

Decoherence affects record keeping and information processing hardware (and, hence, information retention and processing abilities) of observers. Therefore, it is relevant for our consciousness, as agents' consciousness presumably reflects states and processes of their neural networks. As was already noted some time ago by Tegmark (2000) [65], individual neurons decohere on a timescale very short compared to, e.g., the "clock time" on which the human brain operates or other relevant timescales. Moreover, even if somehow one could initiate our information processing hardware in a superposition of its pointer states (which would open up a possibility of being conscious of superpositions), it would decohere almost instantly. The same argument applies to the present-day classical computers. Thus, even if information that is explicitly quantum (that is, involves superpositions or entanglement) was inscribed in computer memory, it would decohere and (at best) become classical. (More likely, it would become random nonsense.)

It is a separate and intriguing question whether a robot equipped with a quantum computer could "do better" and perceive quantumness we are bound to miss. We note, however, that the relevance of this question for the subject at hand—i.e., why does our quantum Universe appear to us as a classical world devoid of quantum weirdness—is at best marginal. After all, as already noted, rapid decoherence in the neural networks of our brains precludes quantum information processing.

Moreover, if such a robot relied (as we do) on the fragments of the environment for the information about the system of interest, it could access only the same information we can access. This information is redundant—hence, classical (see Girolami et al., 2022 [142])—and quantum information processing capabilities would not help; only pointer states can be accessed through this communication channel.

Existential interpretation—as defined originally (e.g., Zurek, 1993 [6])—relied primarily on decoherence. Decoherence of systems immersed in their environments leads to einselection: Only preferred pointer states of a system are stable, so only they can persist. Moreover, when both systems of interest and means of perception and information storage are subject to decoherence, only the correlations of the pointer states of the measured systems with the corresponding pointer states of agent's memory that store the outcomes are stable—only such einselected correlations can persist.

Decoherence, one might say, "strikes twice": It selects preferred states of the systems, thus defining what can persist, hence, exist. It also limits correlations that can persist, so that the observer's memory or the apparatus pointer will only preserve correlations with the einselected states of the measured system when they are recorded in the einselected memory states.

This is really the already familiar discussion of the quantum measurement problem with one additional twist: Not just the apparatus, but also the systems of everyday interest to observers are subject to decoherence. Thus, the post-measurement correlations (when investigated using discord) must now be "classical-classical" in the terminology of Piani, Horodecki, and Horodecki, 2008 [135], while in quantum measurements only the apparatus side was guaranteed to be einselected (hence, certifiably classical).

#### *5.6. Quantum Darwinism and "Life as We Know It"*

Quantum Darwinism has a Darwinian name, but how Darwinian is it really? This is a vague question about nomenclature, and we shall pass it by. A related and more pointed question is: Does quantum Darwinism have any significance for the evolution of living organisms and for their survival? There are then two even more focused questions one can pose: Are the processes of imprinting information and passing it on (that are central to quantum Darwinism) relevant for natural evolution? Moreover, do living organisms take advantage of the proliferation of information about pointer states?

We note at the outset that natural selection and Darwinian evolution require multiple redundant records (e.g., as DNA). Redundant records imply (as we have seen already in Section 2) a preferred set of states—a preferred einselected classical basis. Therefore, natural selection and Darwinian evolution take place within a classical post-decoherence realm.

Darwinian evolution depends on the (nearly perfect) preservation of information and its propagation. DNA is disseminated by various means, but DNA molecules are thoroughly decohered carriers of information about an even more decohered parent organism.

The goals and the nature of natural evolution are somewhat different but the transfer of actionable information (see Section 2, Ref. [43]) is the essence of both the "original" and quantum Darwinism. Quantumness in a DNA "blueprint" is long gone, suppressed by decoherence, but the correlation between the pointer state and individual fragment F of E is also classical (as quantum discord is suppressed by surplus decoherence).

The key difference with quantum Darwinism appears to be the presence of feedback in Darwinian evolution: It depends on the ability of the DNA to reproduce a copy of its progenitor so it can in turn produce more DNA. Evolutionary success is therefore judged not by redundancy in amplifying information in a single episode but through multiple generations of the organisms and their DNA memes. Iteration through multiple generations allows for variation and for natural selection. Fitness is still evaluated based on the ability to amplify, but in an iterative process.

The copies of DNA are imperfect—there are mutations. They are not deliberate—error correction guards against them—but they still happen. That imperfection, in combination with the feedback, is what allows for gradual change of the DNA and for the adaptation of the organisms driven by natural selection and survival of the fittest.

By contrast, amplification and dissemination of quantum information appear to be enough to attain the goal of quantum Darwinism: They suffice to account for the emergence of objective classical reality. The ultimate fate of qmeme carriers such as photons is—at first sight—unimportant, and so, one might think, is the fate of the information: Once the state of the system or an event are made "objective", they are an element of reality, and that is it.

Or is it? After all, following their detection photons (and other carriers of information accessible through other senses that also deliver data about the Universe) are no longer needed. Nevertheless, there is the next generation of records—in the apparatus, in the retina, or in the brain of the observer. Such information repositories inherit the memes—the essence of the information detected by the senses.

That leads to further consequences: Agents (living organisms) react to data. Their actions (hence, the state of the part of the Universe affected by them) depend on what they have perceived. In the evolutionary context such actions are taken to optimize their chances of survival. Hence, evolutionary success in the original Darwinian sense depends on the ability to acquire and process information about extantons that is obtained through quantum Darwinian processes that starts with decoherence and proliferation of qmemes.

Redundancy facilitates accessibility and explains the objective nature of events, and, hence, the emergence of the objective classical reality. Consequences of such events influence multiple generations of their records and have implications for their recipients: After all, actions taken by the living organisms are based on perceptions—on what was recorded. Knowledge of events that take place is essential for survival. This is feedback. It does not necessarily require conscious decisions (e.g., sunflowers following the sun are taking advantage of such feedback), but it allows, and indeed calls for, adaptation. Feedback is also what allows for learning from experience. Thus, while natural selection and Darwinism can be analyzed without any reference to quantum goings-on, they do involve steps that depend on decoherence and proliferation of information.

#### Seeing Is Believing

Our senses did not evolve to test and verify quantum theory. Rather, they evolved through natural selection where survival of the fittest played a decisive role. When there is nothing to be gained from prediction, there is no evolutionary reason for perception. Only classical states that pass through the predictability sieve and deposit redundant, easily accessible records in their environment are robust and easy to access.

The quantum Darwinian requirement of redundancy appears to be built into our senses, and, in particular, into our eyesight: The wiring of the nerves that pass on the signals from the rods in the eye—cells that detect light when illumination is marginal, and that appear sensitive to individual photons (Nam et al., 2014) [199]—tends to dismiss cases when fewer than ∼7 neighboring rods fire simultaneously (Rieke and Baylor, 1998) [200]. Thus, while there is evidence that even individual photons can be (occasionally and unreliably) detected by humans (Tinsley et al., 2016) [201], redundancy (more than one photon) is needed to pass the signal on to the brain.

This makes evolutionary sense—rods can misfire, so such built-in veto threshold suppresses false alarms. Frogs and toads have apparently lower veto thresholds, possibly because they are cold-blooded, so they may not need to contend with as much noise, as thermal excitation of rods appears to be the main source of "false positives".

Quantum Darwinism relies on repeatability. As observers perceive outcomes of measurements indirectly—e.g., by looking at the pointer of the apparatus or at the photographic plate that was used in a double-slit experiment—they will depend, for their perceptions, on redundant copies of photons that are scattered from (or absorbed by) the apparatus pointer or the blackened grains of photographic emulsion. Thus, repeatability is not just a convenient assumption of a theorist: This hallmark of quantum Darwinism is built into our senses. And—as we have seen in Section 2—the discreteness of the possible measurement outcomes—possible perceptions—follows from the distinguishability of the preferred states that can be redundantly recorded in the environment [42,43].

What we are conscious of is then based on redundant evidence. The quantum Darwinist update to the existential interpretation is to demand that states exist provided that one can acquire redundant evidence about them indirectly, from the environment. This of course presumes stability in spite of decoherence (so there is no conflict with the existential interpretation that was originally formulated primarily on the basis of decoherence [6,36]), but the threshold for a state to objectively exist is nevertheless raised.

#### *5.7. Bohr, Everett, and Wheeler*

This paper has largely avoided issues of interpretation, focusing instead on consequences of quantum theory that are "interpretation independent", but may constrain interpretational options. We have been led by quantum formalism to our conclusions, but they are largely beyond interpretational disputes. Our "existential interpretation" is in that sense not an interpretation—it simply points out the consequences of quantum formalism and some additional rudimentary assumptions. It recognizes that quantum states are epiontic: Like photons or electrons that can act as waves or particles, quantum states can exhibit the epistemic or ontic side of their nature, depending on circumstances. In contrast to Bohr (who regarded them as purely epistemic) or Everett (who thought the universal state vector was ontic), they can perform either as robust elements of reality or as information carriers. These two roles are complementary: both are essential for extantons.

It is nevertheless useful to see how the two best known interpretations of quantum theory—Bohr's "Copenhagen Interpretation" (CI) and Everett's "Relative State Interpretation" (RSI) fit within the constraints that we have derived above by acknowledging the paramount role of the environment. To anticipate the conclusion, we quote John Archibald Wheeler (1957) [12], who—comparing CI with RSI—wrote: "*(1) The conceptual scheme of "relative state" quantum mechanics is completely different from the conceptual scheme of the conventional "external observation" form of quantum mechanics and (2) The conclusions from the new treatment correspond completely in familiar cases to the conclusions from the usual analysis*".

Bohr insisted on the preexistence of the classical domain of the Universe to render outcomes of quantum measurements firm and objective. Quantum Darwinism accomplishes that goal: Decoherence takes away quantumness of the system, but a system that is not quantum need not be immediately classical: The objective nature of events and, above all, of extantons arises as a result of redundancy. Consensus about what happened and what exists is reached only in the presence of large redundancy. Large redundancy yields a very good approximation of "the classical", much as finite many-body systems approximate a critical point marking a phase transition that is, strictly speaking, precisely defined only in the infinite size limit. Indeed, quantum Darwinism might be regarded as a purely quantum implementation of the "irreversible act of amplification" that was such an important element of CI.

Physical significance of a quantum state in CI was purely epistemic (Bohr, 1928 [8]; Peres, 1993 [202]; Fuchs and Peres, 2000 [203]; Fuchs and Schack, 2013 [204]): Quantum states were only carriers of information—they correlated outcomes of measurements. Only the classical part of the Universe existed in the sense we are used to.

In the account we have given here there are really several different sorts of states. There are pure states—vectors in the Hilbert space. However, there are also objectively unknown states defined by the "Facts 1–3" of Section 3. They describe a subsystem and derive from the pure state of the whole using envariance, the symmetries of entanglement. And there are states defined through the spectral decomposition of a quantum operator. Last but not least, there are decoherence-resistant pointer states that retain correlations and allow for prediction based on indirect measurements, as they are best known to and widely advertised by the environment.

In contrast to CI, which split the Universe into only two domains—quantum and classical—we have seen that classicality is a matter of degree, and a matter of a criterion. For example, objectivity (which is in a sense the strongest criterion) is attained only in the limit of large redundancy. It is clear why this is a good approximation in the case of macroscopic systems. However, it is also clear that there are intermediate stages on the way from quantum to classical, and that a system can no longer be quantum but still be far away from classical objective existence.

There are two key ideas in Everett's writings. The first one is to let quantum theory dictate its own interpretation. We took this "let quantum be quantum" point very seriously. The second message (that often dominates in popular accounts) is the Many Worlds mythology. In contrast to "let quantum be quantum", it is less clear what it means, so—in the opinion of this author—there is less reason to take it at face value.

It is encouraging for the relative states point of view that the long-standing problem of the origin of probabilities has an elegant solution that is very much "relative state" in spirit. We have relied on symmetries of entangled states. This allowed us to derive objective probabilities for individual events. We note that this is the first such *objective* derivation of probabilities not just in the quantum setting, but also in the history of the concept of probability.

The envariant derivation of Born's rule is based on entanglement (which is at the heart of the relative states approach). We have followed neither the proposals that appeal directly to invariant measures on the Hilbert space [10,11], nor the attempts to derive Born's rule from frequencies by counting many worlds branches (Everett, 1957 [11]; DeWitt, 1970 [14], 1971 [15]; Graham, 1973 [72]; Geroch, 1984 [73]): As noted by DeWitt (1971) [15] and Kent (1990) [38], Everett's appeal to the invariance of Hilbert space measures makes no contact with physics, and makes less physical sense than the mathematically rigorous proof of Gleason (1957) [84]. In addition, frequentist derivations are circular—in addition to counting branches they implicitly use Born's rule to dismiss "maverick universes".

#### *5.8. Closing Remarks*

The emergence of the classical world from within our quantum Universe is a difficult problem. The traditional (and still occasionally encountered) expectation—that it will be somehow resolved by a single new idea—did not pan out. Rather, several interdependent new insights were needed to account for quantum jumps, for the appearance of the collapse, for the preferred pointer states, for probabilities, and for the perception of the objective reality—for all the familiar ingredients of 'the classical'.

Our strategy was to avoid purely interpretational issues and to focus instead on technical questions. They can often be answered in a definitive manner. In this way, we have gained new insights into selection of preferred pointer states that go beyond decoherence, found out how probabilities arise from entanglement, and discovered how objectivity follows from redundancy.

All of that fits well with the relative states point of view and with the similar, although less Everettian, approach of, e.g., Rovelli [205]. There are also questions that are related to the technical developments we have discussed but are, at present, less definite—less technical—in nature. We signal some of them here.

The first point concerns the nature of quantum states, and its implications for the interpretation. One might regard states as purely epistemic (as did Bohr) or attribute to them "existence". Technical results described above suggest that the truth lies somewhere between these two extremes, and these two aspects are complementary in the sense of Bohr. It is therefore doubtful whether one is forced to attribute "reality" to all of the branches of the universal state vector. Indeed, such a view combines a very quantum idea of a state in the Hilbert space with a very classical literal ontic interpretation of that concept.

These two views of the universal state vector are incompatible. As we have emphasized, an unknown quantum state cannot be found out. It can acquire objective existence only by "advertising itself" in the environment<sup>20</sup>. This is obviously impossible for the universal state vector as the Universe has no environment.

The insistence on the absolute existence of the universal state vector as an indispensable prerequisite in interpreting quantum theory brings to mind the insistence on absolute time and space. They seemed indispensable since Newton, yet both became relative and observer-dependent in special relativity. The absolute universal state vector may be—like the Newtonian absolute space and time, or for that matter, like isolated systems of classical physics—an idealization that is untenable in the quantum realm.

As noted in Section 2.6, observations reset the state of the Universe—they re-adjust initial conditions relevant for the future of the observer who has the record of their outcome.

<sup>20</sup> The theorem of Pusey, Barrett, and Rudolph (2012) [206] offers an interesting perspective on the existence of quantum states. An intriguing observation by Frauchiger and Renner (2018) [207] raises the "Wigner Friend" question of whether it is possible to employ quantum theory to model complex systems that include agents who are themselves described using quantum theory. The concerns raised in that paper are allayed in presence of a decohering environment [208,209].

The rest of the state vector becomes unreachable. Thus, the relative state reading of Everett rests on a safer foundation than the "Many Worlds" alternative; the relative state reading is—as Wheeler (1957) [12] pointed out—compatible with the consequences of Bohr's views. There is nothing in the relative state interpretation that would elevate all the branches—especially the ones that "did not happen" to the observer—to the same ontological status as the one that is consistent with the observer's perceptions.

Objective existence can be acquired (via quantum Darwinism) only by a relatively small fraction of all degrees of freedom within the quantum Universe: The rest is needed to "keep records". Extantons (with a classical core and a vast information carrying halo) are a good illustration. Clearly, there is only a limited (if large) memory space available for this at any time. This limitation on the total memory available means that not all quantum states that exist or quantum events that happen now "really happen" in the sense of the existential interpretation: Only a small fraction of what occurs will be still available from the records in the future. So the finite memory capacity of the Universe implies indefiniteness of the present and impermanence of the past.

To sum it up, one can extend John Wheeler's dictum "the past exists only insofar as it is recorded in the present". This is one of the topics that could have been discussed in this review, but was not. Fortunately, Ref. [210] provides an introduction to the quantum Darwinian view of the emergence of objective past in our Universe.

Consensus can be reached about objective histories, a more selective set than histories that are just consistent. This may help settle the so-called class selection problem—that is, selecting candidates for physically relevant histories from among the set of all consistent histories—which is one of the central unresolved issues of the consistent histories interpretation of quantum physics (see Griffiths, 1984 [118]; 2002 [119]; Gell-Mann and Hartle, 1990 [120]; 1993 [121]; 2012 [211]; Omnès, 1992 [122]; Griffiths and Omnès, 1999 [123]; Halliwell, 1999 [212] for a selection of points of view on this subject).

As long as we are discussing subjects that are related, but beyond the scope of this review, we should mention the so-called "strong quantum Darwinism" and "spectrum broadcast structures". They share with quantum Darwinism the main idea—that objectively existing states are reproduced in many copies and can be independently accessed without a disturbance. They differ in the details of its implementation (e.g., by invoking different measures of what and how much the environment fragments know about the system). Fortunately, there are papers by Horodecki, Korbicz, and Horodecki (2015) [213], Le and Olaya-Castro (2018; 2020) [214,215], and there is a recent review by Korbicz (2021) [216]. These references discuss "strong quantum Darwinism" and "spectrum broadcast structures" and provide useful entries into the relevant literature.

Some of the quantum information theoretic tools used to implement criteria for objective existence have also been "beyond the scope". In particular, the quantum Chernoff bound has been successfully used in models [146], and we expect it to be useful in the future. Fortunately, a recent paper by Zwolak (2022) [141] discusses how the quantum Chernoff bound can be used to analyze quantum Darwinism (e.g., in the c-maybe model [131]).
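
As an illustration of the kind of quantity involved, the quantum Chernoff exponent between two states $\rho$ and $\sigma$ is $\xi_{QCB} = -\log \min_{0 \le s \le 1} \mathrm{Tr}(\rho^s \sigma^{1-s})$. The following is a minimal numerical sketch of this definition; the two single-qubit "environment record" states chosen below are invented for illustration and are not a model from Refs. [141,146]:

```python
import numpy as np

def frac_power(rho, s):
    """rho**s for a Hermitian, positive-semidefinite matrix, via eigendecomposition."""
    w, v = np.linalg.eigh(rho)
    w = np.clip(w, 0.0, None)          # guard against tiny negative eigenvalues
    return (v * w**s) @ v.conj().T

def qcb_exponent(rho, sigma, npts=99):
    """Quantum Chernoff exponent: -log min_{0<s<1} Tr(rho^s sigma^(1-s))."""
    ss = np.linspace(0.01, 0.99, npts)
    vals = [np.trace(frac_power(rho, s) @ frac_power(sigma, 1.0 - s)).real
            for s in ss]
    return -np.log(min(vals))

# Two hypothetical, slightly mixed single-qubit states to be distinguished:
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
rho = 0.9 * np.outer(ket0, ket0) + 0.1 * np.eye(2) / 2    # mostly |0>
sigma = 0.9 * np.outer(ketp, ketp) + 0.1 * np.eye(2) / 2  # mostly |+>

print("QCB exponent:", qcb_exponent(rho, sigma))
```

The exponent vanishes for identical states and grows as the two record states become more distinguishable, which is why it serves as a per-copy measure of how quickly accumulating environment fragments pin down the state of the system.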

These results confirm that there is no need for a unique best quantum information-theoretic tool to study quantum Darwinism. We have already seen that Shannon and von Neumann mutual information, and various half-way quantities including Holevo *χ* (as well as the Chernoff bound), lead to useful estimates of redundancy.
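
For readers who wish to experiment, here is a minimal numerical sketch of a redundancy plateau computed with the von Neumann mutual information. The five-qubit GHZ-like branching state and the split into one system qubit and four environment qubits are illustrative assumptions, not a model taken from the text:

```python
import numpy as np

def ghz(n):
    """(|0...0> + |1...1>)/sqrt(2) on n qubits."""
    psi = np.zeros(2**n)
    psi[0] = psi[-1] = 1.0 / np.sqrt(2)
    return psi

def reduced_rho(psi, keep, n):
    """Reduced density matrix of the qubits listed (ascending) in `keep`."""
    t = psi.reshape([2] * n)
    traced = [i for i in range(n) if i not in keep]
    rho = np.tensordot(t, t.conj(), axes=(traced, traced))
    d = 2 ** len(keep)
    return rho.reshape(d, d)

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

n = 5                      # qubit 0 is the system; qubits 1..4 are the environment
psi = ghz(n)
S = [0]
mi = {}
for f in range(1, n):      # fragment F = first f environment qubits
    F = list(range(1, 1 + f))
    mi[f] = (vn_entropy(reduced_rho(psi, S, n))
             + vn_entropy(reduced_rho(psi, F, n))
             - vn_entropy(reduced_rho(psi, S + F, n)))

print(mi)  # plateau at H(S) = 1 bit; jumps to 2 bits only for the whole environment
```

Every fragment short of the full environment already supplies the full classical bit about the system (the plateau at $H(S)$), while phase information remains locked in the global state—precisely the redundancy that underwrites consensus among observers.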

In the presence of sufficiently large redundancy the precise number of records in the environment does not matter: As long as redundancy is large, emergence of the consensus—hence, of objective classical reality—will be confirmed by the perceptions of observers.

All of our conclusions followed from the core quantum postulates (o)–(iii) and the insight that quantum states are epiontic. These two aspects of their nature are complementary (like the wave and particle traits of photons and electrons). They both play a role in the emergence of objective classical reality with information and existence seemingly separated – the state of affairs we are accustomed to in our everyday experience.

In a field known for divergent views and lively discussions it is too early to expect consensus on which of the mysteries of the quantum-classical correspondence have been explained, but deducing the inevitability of discreteness and of quantum jumps, a simple and physically transparent derivation of Born's rule, and—above all—accounting for the emergence of objective existence from fragile quantum states mark significant progress. Moreover, there are no obvious obstacles to pursuing the program outlined here to provide an even fuller account of the interrelation of the epistemic and ontic aspects of quantum states, or to incorporate evolving quantum states—thus exploring the emergence of objective histories—within the framework of quantum Darwinism.

**Funding:** This research was supported by DoE under the LDRD program at Los Alamos, and, in part, by the John Templeton Foundation as well as by FQXi, including most recently the Foundational Questions Institute's 'Consciousness in the Physical World' program, administered in partnership with the Fetzer Franklin Fund.

**Acknowledgments:** I would like to thank Scott Aaronson, Dorit Aharonov, Andreas Albrecht, Robert Alicki, Andrew Arrasmith, Alain Aspect, Alexia Auffeves, Howard Barnum, Charles Bennett, Robin Blume-Kohout, Adan Cabello, Adolfo del Campo, Carlton Caves, Lukasz Cincio, Patrick Coles, Fernando Cucchietti, Diego Dalvit, Bogdan Damski, Luiz Davidovich, Sebastian Deffner, Jacek Dziarmaga, Christopher Fuchs, Bartlomiej Gardas, Murray Gell-Mann, Davide Girolami, Terrence Goldman, Philippe Grangier, Jonathan Halliwell, Serge Haroche, James Hartle, Theodor Hänsch, Ryszard Horodecki, Pawel Horodecki, Michal Horodecki, Karol Horodecki, Bei-Lok Hu, Christopher Jarzynski, Fedor Jelezko, Emmanuel Knill, Chris Monroe, Raymond Laflamme, Seth Lloyd, Shunlong Luo, Eric Lutz, Harold Ollivier, Don Page, Juan Pablo Paz, Roger Penrose, David Poulin, Hai-Tao Quan, Jean-Michel Raimond, Marek Rams, Jess Riedel, Carlo Rovelli, Avadh Saxena, Wolfgang Schleich, Maximilian Schlosshauer, Mark Srednicki, Leonard Susskind, Maximilian Tegmark, Akram Touil, William Unruh, Lev Vaidman, Yoshihisa Yamamoto, Bin Yan, David Wallace, Christoph Wetterich, David Wineland, William Wootters, Jakub Zakrzewski, Anton Zeilinger, Michael Zwolak, Marek Żukowski and Karol Życzkowski for stimulating discussions. Special thanks go to Sebastian Deffner, Jess Riedel, and Akram Touil who offered helpful comments and invaluable assistance in preparing this review.

**Conflicts of Interest:** The author declares no conflict of interest.
