On Magnetic Models in Wavefunction Ensembles

De Carlo, Leonardo; Wick, William D.

doi:10.3390/e25040564

by Leonardo De Carlo ¹,²,*,† and William D. Wick ³,†

¹ Scuola Normale Superiore, Piazza dei Cavalieri, 7, 56126 Pisa, Italy

² Department of Economics and Finance, Luiss Guido Carli, Viale Romania, 32, 00197 Rome, Italy

³ Independent Researcher, Seattle, WA 98119, USA

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Entropy 2023, 25(4), 564; https://doi.org/10.3390/e25040564

Submission received: 22 February 2023 / Revised: 14 March 2023 / Accepted: 15 March 2023 / Published: 25 March 2023

(This article belongs to the Collection Advances in Applied Statistical Mechanics)

Abstract:

In a wavefunction-only philosophy, thermodynamics must be recast in terms of an ensemble of wavefunctions. From this perspective we study how to construct Gibbs ensembles for magnetic quantum spin models. We show that with free boundary conditions and distinguishable “spins” there are no finite-temperature phase transitions because of the high dimensionality of the phase space. Then we focus on the simplest case, namely the mean-field (Curie–Weiss) model, in order to discover whether phase transitions are even possible in this model class. This strategy at least diminishes the dimensionality of the problem. We found that, even assuming exchange symmetry in the wavefunctions, no finite-temperature phase transitions appear when the Hamiltonian is given by the usual energy expression of quantum mechanics (in this case the analytical argument is not totally satisfactory and we relied partly on a computer analysis). However, a variant model with additional “wavefunction energy” does have a phase transition to a magnetized state. (With respect to dynamics, which we do not consider here, wavefunction energy induces a non-linearity which nevertheless preserves norm and energy. This non-linearity becomes significant only at the macroscopic level.) The three results together suggest that magnetization in large wavefunction spin chains appears if and only if we consider indistinguishable particles and block macroscopic dispersion (i.e., macroscopic superpositions) by energy conservation. Our principal technique involves transforming the problem to one in probability theory, then applying results from large deviations, particularly the Gärtner–Ellis Theorem. Finally, we discuss Gibbs vs. Boltzmann/Einstein entropy in the choice of the quantum thermodynamic ensemble, as well as open problems.

Keywords:

quantum magnetism; wavefunction ensembles; large deviations

1. Motivations, Results and Organization of the Work

In recent decades, entanglement has been observed both at the microscopic level [1] and in “semi-macroscopic” objects [2,3,4] (even at room temperature [5]). Remarkably, superposition states have also become fundamental for describing not only atomic-level observations [6,7,8] but also large atom complexes [9,10,11,12], in which “cat” states were observed, i.e., states where the center of mass (COM) displayed spatial dispersion or interference patterns. These developments are leading to an experimental program to locate the classical-quantum boundary and the underlying mechanism [13].

In our view, the above-mentioned phenomena can be understood only by considering the wavefunction as a configuration of matter, encoding a physical state in a high-dimensional Hilbert space whose effect is felt in ordinary three-dimensional space, as Schrödinger originally intended [14]. But this view also implies paradoxes, as pointed out by Schrödinger himself [15]: the unrestricted linearity of quantum mechanics produces situations where macroscopic objects exhibit COM dispersion. (Maybe this was the biggest handicap of his theory and of his view that particles are not individuals.) On the other hand, the unrestricted linearity of quantum mechanics is essential for many of the successful predictions at the atomic–nuclear level, but today there is a shared opinion [13] that quantum mechanics could have a size limit to its application. This idea and the above-mentioned observations of superposition states raise the possibility of deviations from Schrödinger’s equation. This research line was encouraged by S. Weinberg in a recent popular account [16], where he described the struggle of quantum mechanics with the macroscopic world known as the “measurement problem”. He explained that modifications have to be undetectable at the atomic–nuclear level and at the same time “eliminate” macroscopic superpositions without granting the apparatus any special status. We mention that in 1989 Weinberg developed a mathematical framework [17] to test non-linear generalizations of Schrödinger’s equation at the atomic–nuclear level. Subsequent experiments appeared to rule out the models he examined [18]. (However, models with wavefunction energy escape this net [19].)

Here we consider wave-mechanical configurations for simple spin models with the goal of constructing Gibbs statistical ensembles of magnets containing all the states of the Hilbert space of quantum mechanics. We found it relevant to consider a modification of quantum mechanics (small for few degrees of freedom and very large for many degrees of freedom, in a sense to be clarified later) represented by an energy giving a “cost” to superpositions, which we call “wavefunction energy” (WFE), thus eliminating the macroscopic dispersion that afflicted the original wave mechanics of Schrödinger. The corresponding dynamical framework is the Hamiltonian one of Weinberg [17] but with a very important difference, observed in [19], that implies conservation of the norm during the evolution. This general set-up and the principles of this proposal are only briefly described in Section 3.1, since dynamical problems are not in the purview of the present paper.

2. Materials and Mathematical Results

In standard quantum mechanics, one first diagonalizes the Hamiltonian to find its eigenvalues, call them $\{E_{n}\}$; then, if the energy is ever measured, the result is to “find” one of the $E_{n}$, and the system jumps into the corresponding eigenstate. (The realist assumption that a system actually exists in one of the eigenstates leads to contradictions whenever certain pairs of observables are involved, such as position and momentum, or energy and time. Hence the reliance on textual formulations involving “finding” rather than “being”. (John Bell made these points eloquently, especially in his last paper “Against ‘measurement’” [20].)) Thus a thermal ensemble is constructed using these energy-measurement “outcomes”, of the form:

[F]_{β} = Z^{-1} \sum_{n} \exp\{- β E_{n}\} \langle ψ_{n} | F | ψ_{n} \rangle,

(1)

where $β = 1 / k T$ is the inverse temperature and $ψ_{n}$ is the eigenstate corresponding to eigenvalue $E_{n}$ in the Hilbert space. In the special case of the Heisenberg model, with Hamiltonian operator expressed in terms of Pauli matrices, where only the z-axis interacts, $H = - 2 \sum_{i, j; \mathrm{n.n.}}^{N} σ_{z; i} σ_{z; j}$, the model reverts to the Ising model: interpreting the classical spins $S_{i}$ as pointing up or down the z-axis and as labels of Hilbert space vectors, e.g., in Dirac notation, $| S_{1}, \dots, S_{N} \rangle$, they already define an eigenbasis of H.

In our first attempt to treat wave-mechanical spin models, we restrict to N spins $S_{1}, \dots, S_{N}$ taking values $\pm \frac{1}{2}$ and assume interaction only along the z-axis. The state is now not a spin configuration but a wavefunction on such configurations, i.e., of the form $ψ (S_{1}, \dots, S_{N})$, where each value is a complex number. Thus $ψ$ lies in a space of $2 \cdot 2^{N}$ real dimensions: $2^{N}$ for the possible classical configurations, doubled to account for the real and imaginary parts of each value of $ψ$. The energy of a wavefunction $ψ$ is what is called in standard quantum mechanics “the expected energy”. In particular, the energy $E (ψ)$ contains contributions from all spin configurations (“superpositions”). The Gibbs canonical ensemble is not based on a product measure.

The probability generating function we wish to evaluate now takes the form:

Z_{N} (β; λ_{1}, \dots, λ_{N}) = \int_{\| ψ \| = 1} d ψ \exp \{- β E_{N} (ψ) + Λ_{N} (ψ)\},

(2)

where $\| ψ \|^{2} = \sum_{S} | ψ (S_{1}, \dots, S_{N}) |^{2}$, with S a classical configuration of spins. The integral is over the unit sphere in the above-stated number of dimensions. The energy is

E_{N} (ψ) = - \langle ψ | E (S) | ψ \rangle = - \sum_{S} | ψ (S_{1}, \dots, S_{N}) |^{2} E (S),

(3)

with $E (S)$ a classical spin magnetic energy of order N in the configuration S, such as Curie–Weiss, $\frac{1}{N} (\sum_{i = 1}^{N} S_{i})^{2}$, or Ising, $\sum_{\mathrm{n.n.}} S_{i} S_{j}$ (n.n. means nearest-neighbor sites i and j), and

Λ_{N} (ψ; λ_{1}, \dots, λ_{N}) = \langle ψ | \sum_{i = 1}^{N} λ_{i} S_{i} | ψ \rangle = \sum_{S} | ψ (S_{1}, \dots, S_{N}) |^{2} \sum_{i = 1}^{N} λ_{i} S_{i} .

(4)

The sum $\sum_{S}$ in (3) and (4) is the sum over all spin configurations.
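To make definitions (3) and (4) concrete, the following minimal sketch evaluates them by brute-force enumeration for a handful of spins. The function names (`configurations`, `expected_energy`, `field_term`) and the choice of plain Python lists to store $ψ$ are assumptions made for illustration; they are not taken from the paper.

```python
import itertools
import math


def configurations(n_spins):
    """Return all 2^N classical configurations S with entries +1/2 or -1/2."""
    return list(itertools.product((0.5, -0.5), repeat=n_spins))


def curie_weiss(config):
    """Classical Curie-Weiss expression (1/N) (sum_i S_i)^2."""
    return sum(config) ** 2 / len(config)


def expected_energy(psi, configs, classical_energy=curie_weiss):
    """E_N(psi) = -sum_S |psi(S)|^2 E(S), as in Eq. (3)."""
    return -sum(abs(a) ** 2 * classical_energy(s) for a, s in zip(psi, configs))


def field_term(psi, configs, lambdas):
    """Lambda_N(psi) = sum_S |psi(S)|^2 sum_i lambda_i S_i, as in Eq. (4)."""
    return sum(
        abs(a) ** 2 * sum(lam * s_i for lam, s_i in zip(lambdas, s))
        for a, s in zip(psi, configs)
    )


n = 3
configs = configurations(n)  # configs[0] is all up, configs[-1] is all down
dim = len(configs)

# Three illustrative normalized wavefunctions (hypothetical test states).
all_up = [1.0] + [0.0] * (dim - 1)
cat = [1 / math.sqrt(2)] + [0.0] * (dim - 2) + [1j / math.sqrt(2)]
uniform = [1 / math.sqrt(dim)] * dim

for name, psi in (("all up", all_up), ("cat", cat), ("uniform", uniform)):
    print(name, expected_energy(psi, configs), field_term(psi, configs, [1.0] * n))
```

In this toy case the all-up state and the cat state have the same expected energy but different field terms, which illustrates that the energy (3) by itself does not penalize superpositions.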

The motivation for the base integral over normalized wavefunctions differs from the Copenhagenist one, in which $ψ$ represents a probability distribution on spin configurations. Rather, the Schrödingerist reasoning is that we do not want to compare states on the basis of normalizations, e.g., one with norm 0.1 and another with norm 100, but solely by their respective energies. Moreover, both linear quantum mechanics and suitable nonlinear generalizations [19] preserve the norm, and the latter raise the possibility that the flow is ergodic or chaotic, with a possible interest in a classical justification for the Second Law of Thermodynamics. Thus we would not wish to drop the normalization.

The objective of this work is to understand whether it is possible to construct physically meaningful wave-mechanical magnetic models and to explore the implications. In Section 3 we consider distinguishable particles, namely we take the full $2 \cdot 2^{N}$-sphere as state space without imposing any exchange symmetry on the wavefunctions. In this case the models will not magnetize (except at zero temperature): $\lim_{N \to \infty} [\{| m (ψ) | > ϵ\}]_{β} = 0$ for any $ϵ > 0$, where $m (ψ) = \langle ψ | \sum_{i} S_{i} / N | ψ \rangle = \sum_{S} | ψ (S) |^{2} (\sum_{i} S_{i}) / N$ and $[\cdot]_{β}$ is the ensemble thermal average.

In Section 3.1 we introduce a generalized Hamiltonian framework, as developed in [17,19], assuming a non-linearity that penalizes large superpositions in large objects. This non-linearity makes energetically impossible configurations corresponding to macroscopic cat states. (For a modern mathematical definition of cat states see [19].) Even in this case, with distinguishable particles no phase transition appears (except at zero temperature).

In Section 4 we specialize to the mean-field case with exchange symmetry. With this choice, the dimensionality of the Hilbert state space is greatly reduced and the model manifests magnetization at finite temperatures when wavefunction energy is included. (In this case but without WFE, the proof that there is no phase transition assumes some properties of a certain two-argument function, for which we provide computer-generated evidence.) To our knowledge, this is the simplest wavefunction model in which a phase transition appears.

In Section 5, we discuss the choice of ensembles with uniform measure, as opposed to other possibilities. In particular we introduce a base measure that includes a quantum Boltzmann–Einstein entropy and discuss how the usual ensembles should be recovered in a suitable limit.

The paper is organized as follows: in each section the primary results are described, while the mathematical derivations are relegated to Appendix A. Our principal technique involves transforming the problem to one in probability theory, then applying results from large deviations, particularly the Gärtner–Ellis Theorem.

Definition of phase transition. In classical spin magnetic models one asks whether, in the thermodynamic limit ($N \to \infty$), there is a spontaneous magnetization at sufficiently low temperature. This means the following. Define the magnetic field generated by the N spins $S_{i}$ as:

M = \sum_{i = 1}^{N} S_{i} .

(5)

From the up–down symmetry of $E (S)$ one deduces that $[M]_{β} = 0$, so the average is not the relevant quantity. There are two ways to continue. One can introduce + boundary conditions, say by fixing $S_{1} = S_{N} = + \frac{1}{2}$, breaking the symmetry, and ask whether, below a critical temperature,

\lim_{N \to \infty} [M]_{β} > 0 .

(6)

Another approach is to maintain the “free” boundary conditions and examine the behavior of the field per spin, $M / N$: does it exhibit Central Limit Theorem (CLT) behavior, that is, $M / N \approx O (1 / \sqrt{N})$? This would imply $[| M |]_{β} \approx \sqrt{N}$. If one interprets “$O (N)$” as macroscopic, one concludes that spontaneous magnetization does not appear. The CLT would hold if the correlations decayed exponentially,

[S_{i} S_{j}]_{β} \approx \exp \{- ξ d (i, j)\}

(7)

for some constant $ξ > 0$, where $d (i, j)$ denotes the distance between the lattice points labeled i and j, but not if the correlations decayed more slowly or were even bounded away from zero. In that case, assuming that below a critical temperature the dispersion satisfies

[(M / N)^{2}]_{β} \approx \mathrm{constant} > 0,

(8)

one would conclude that the limiting ensemble was a mixture of magnetized states and a phase transition had occurred. Observing the behavior (8) is the concept of phase transition we use throughout the paper, where we replace $M / N$ and $(M / N)^{2}$, respectively, with $m (ψ)$ and $m^{2} (ψ)$.
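For the classical Curie–Weiss model, criterion (8) can be checked directly, because the Gibbs weight depends only on the number of down spins. The sketch below is an illustration for the classical case only (not for the wavefunction ensembles studied in this paper); it assumes spins $\pm \frac{1}{2}$ and classical Gibbs weight $\exp \{β M^{2} / N\}$, and the function name `classical_cw_dispersion` is hypothetical.

```python
import math


def classical_cw_dispersion(n_spins, beta):
    """Exact thermal average [(M/N)^2]_beta for the classical Curie-Weiss model.

    Configurations with n_down down spins share the weight
    C(N, n_down) * exp(beta * M^2 / N), with M = (N - 2 n_down) / 2.
    Log-weights are shifted by their maximum to avoid overflow.
    """
    log_weights = []
    m_values = []
    for n_down in range(n_spins + 1):
        total_spin = (n_spins - 2 * n_down) / 2.0
        log_binomial = (
            math.lgamma(n_spins + 1)
            - math.lgamma(n_down + 1)
            - math.lgamma(n_spins - n_down + 1)
        )
        log_weights.append(log_binomial + beta * total_spin ** 2 / n_spins)
        m_values.append(total_spin / n_spins)
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    partition = sum(weights)
    return sum(w * m * m for w, m in zip(weights, m_values)) / partition


for beta in (0.5, 6.0):
    for n in (100, 400, 1600):
        print(beta, n, classical_cw_dispersion(n, beta))
```

At the smaller value of $β$ the dispersion shrinks as N grows (CLT behavior), while at the larger value it stays bounded away from zero, which is the behavior described by (8).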

3. Models with Distinguishable Particles

The starting point, see Appendix A.3, is to replace the random point on the sphere in (2) with

ψ (S) ⟶ \frac{χ (S)}{\sqrt{\sum_{S'} | χ (S') |^{2}}},

(9)

where $\{χ (S) : S = (S_{1}, \dots, S_{N})\}$ are $2^{N}$ complex, or $2 \cdot 2^{N}$ real, numbers distributed as i.i.d. standard (mean zero, variance one) Gaussians. This procedure defines a measure on the sphere, and the rotational symmetry of the standard Gaussian measure (covariance matrix: the identity) then ensures that we are considering the same distribution of wavefunctions. By this transformation and various lemmas, we will reduce the problems of the next sections to finding probabilities of rare events (the subject of “large deviations” theory). Let

\begin{aligned} m (ψ) &= \langle ψ | \sum_{i} S_{i} / N | ψ \rangle = \sum_{S} | ψ (S) |^{2} (\sum_{i} S_{i} / N); \\ [F (ψ)]_{β} &= \int_{\| ψ \| = 1} d ψ \exp \{- β E_{N} (ψ)\} F (ψ) / Z_{N} . \end{aligned}

(10)
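The substitution (9) is easy to try numerically. The sketch below draws wavefunctions uniformly on the sphere by normalizing i.i.d. Gaussians and estimates the average of $m^{2} (ψ)$ under the uniform measure only (that is, $β = 0$; the Gibbs factor is deliberately left out). The helper names and the sample sizes are assumptions chosen for illustration.

```python
import itertools
import math
import random


def sample_wavefunction(n_spins, rng):
    """Draw psi uniformly on the unit sphere by normalizing i.i.d. Gaussians, Eq. (9)."""
    dim = 2 ** n_spins
    chi = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in chi))
    return [c / norm for c in chi]


def magnetization(psi, n_spins):
    """m(psi) = sum_S |psi(S)|^2 (sum_i S_i) / N, Eq. (10)."""
    configs = itertools.product((0.5, -0.5), repeat=n_spins)
    return sum(abs(a) ** 2 * sum(s) for a, s in zip(psi, configs)) / n_spins


def mean_square_magnetization(n_spins, n_samples, rng):
    """Monte Carlo estimate of the average of m(psi)^2 under the uniform measure."""
    total = 0.0
    for _ in range(n_samples):
        total += magnetization(sample_wavefunction(n_spins, rng), n_spins) ** 2
    return total / n_samples


rng = random.Random(0)
for n in (2, 4, 6, 8):
    print(n, mean_square_magnetization(n, 200, rng))
```

Already for a few spins the estimate drops quickly with N, a small-scale illustration of how the high dimensionality of the sphere concentrates $m (ψ)$ near zero.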

Theorem 1.

Given a magnetic spin energy $E (S)$ of order N, e.g., Curie–Weiss or Ising, in any dimension and at any finite temperature (i.e., except T = 0) there is no magnetization in the thermodynamic limit:

\lim_{N \to \infty} [\{| m (ψ) | > ϵ\}]_{β} = 0,

(11)

for any $ϵ > 0$.

For the proof, see Math Appendix A.3. This case does exhibit zero-temperature magnetization. Consider Curie–Weiss or Ising: as $β \to \infty$, the probability distribution becomes concentrated on states of minimal energy. These have the form:

ψ (θ, α) = \cos (θ) ψ_{+} + \sin (θ) e^{\sqrt{- 1} α} ψ_{-},

(12)

where $ψ_{\pm}$ denotes a wavefunction concentrated on all up, respectively all down, spins. Hence at $T = 0$ the magnetization takes the form:

\int_{0}^{π / 2} d θ f (θ) \langle ψ (θ, α) | \frac{M}{N} | ψ (θ, α) \rangle^{2} = \frac{1}{4} \int_{0}^{π / 2} d θ f (θ) \cos^{2} (2 θ) > 0 .

3.1. Models with Wavefunction Energy

We consider a modification of $E_{N} (ψ)$ that adds terms nonquadratic in the wavefunction. We will refer to these terms as representing “wavefunction energy” (WFE). Our principles for modifying the usual quantum Hamiltonian are:

(a): The modification has to be negligible at the microscopic level but become large enough at the macroscopic level to block dispersion (macroscopic dispersion of the spins is explained in more detail later);
(b): The norm of $ψ$ and the energy of a closed system have to be conserved (and likewise the momentum);
(c): No extra terms are added to the evolution of the center of mass $X (ψ) = \langle ψ | \frac{1}{N} \sum_{j = 1}^{N} x_{j} | ψ \rangle$ of a closed system.

These characteristics are satisfied when the evolution of the quantum state is

i ℏ \frac{\partial ψ}{\partial t} = \frac{\partial}{\partial ψ^{*}} E (ψ),

(13)

with

E (ψ) = E_{Q M} (ψ) + E_{W F E} (ψ),

(14)

E_{W F E} (ψ) = w N^{2} D (ψ), \quad D (ψ) := \langle ψ | (\sum_{i = 1}^{N} O_{i} - \langle ψ | \sum_{i = 1}^{N} O_{i} | ψ \rangle)^{2} | ψ \rangle,

(15)

where w is a very small constant and the $O_{i}$’s are self-adjoint operators diagonal in the same basis as $H_{Q M}$; see the proofs in [19]. When we take $w = 0$, the energy $E (ψ)$ in (14) becomes $\langle ψ | H_{Q M} | ψ \rangle$, where $H_{Q M}$ is the usual quantum Hamiltonian given by the sum of the kinetic and potential terms; taking the derivative with respect to $ψ^{*}$ in (13), one recovers Schrödinger’s equation.

In [19,21] one author explored nonlinear Schrödingerist quantum mechanics as a potential solution to the Measurement Problem: the addition of nonquadratic terms to the Hamiltonian, i.e., the WFE, was proposed to block spatial dispersion in macroscopic objects (and otherwise be too small to matter). The WFE in the present work will be a functional of the wavefunction addressing dispersion of the total spin, instead of center of mass or total momentum (as in [22]). This means $O_{i} = S_{i}$. This choice just reflects the lack of spatial coordinates in our models of magnetism. The straightforward generalization of (16) to models with spatial coordinates is $O_{i} = L_{i} + S_{i}$ in (15), with $[w] = \mathrm{kg}^{- 1} \, \mathrm{m}^{- 2}$. The original proposal [23] is on the momentum $P_{i}$. The general idea behind (15) derived in part from an interview with Hans Dehmelt; see chapter 15 in [24]. He explained to one of the authors that classical physics applies when for some reason wavefunctions cannot spread out (forming cats). For instance, classical-like orbits in electromagnetic traps are due to the trapping fields. The mechanism (15) proposes a universal self-trapping forbidding macroscopic dispersion, so to speak. (The ideal experimental test to falsify WFE is described in [22].)

For our statistical ensembles the first term, $E_{Q M} (ψ)$, now incorporates the usual spin–spin and spin–external field interactions, while the second, $E_{W F E} (ψ)$, takes the form:

\begin{aligned} E_{W F E} (ψ) &= w N^{2} D (ψ); \\ D (ψ) &= \frac{1}{N^{2}} \langle ψ | (\sum_{i} S_{i} - \langle ψ | \sum_{i} S_{i} | ψ \rangle)^{2} | ψ \rangle \\ &= \frac{1}{N^{2}} [\langle ψ | (\sum_{i} S_{i})^{2} | ψ \rangle - \langle ψ | \sum_{i} S_{i} | ψ \rangle^{2}] . \end{aligned}

(16)

Here w is a positive constant small enough to make the mechanism undetectable at the atomic level, and we have written the sum of spins rather than M to emphasize the interpretation as the “dispersion of the center-of-spin”. We observe that $D (ψ)$ is of order 1 on cat states, i.e., states close to (the concept can be made precise by introducing a suitable spherical distance)

ψ = \frac{1}{\sqrt{2}} ψ_{+} + \frac{e^{\sqrt{- 1} α}}{\sqrt{2}} ψ_{-},

(17)

so that there $E_{W F E} (ψ)$ is of order $w N^{2}$. It is of order $1 / N$ on product states, i.e., states close to

P.S. = \{ψ : ψ (S_{1}, \dots, S_{N}) = \prod_{i = 1}^{N} ψ_{i} (S_{i})\}, \quad | ψ_{i} (1 / 2) |^{2} + | ψ_{i} (- 1 / 2) |^{2} = 1, \ \forall i,

(18)

so that there $E_{W F E} (ψ)$ is of order $w N$. Finally, it is of order $1 / N^{2}$ close to eigenstates (classical configurations with no superpositions), so that there $E_{W F E} (ψ)$ is of order w. We observe that for the macroscopic magnetization $M (ψ) = N m (ψ)$ the situation is the opposite: it is very small close to product and cat states, while it is of order N close to eigenstates.
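These orders of magnitude can be illustrated by evaluating $D (ψ)$ from (16) on three representative states for a small number of spins. The sketch below is a brute-force check; the function name `dispersion` and the specific test states (an all-up eigenstate, the symmetric product state, and an equal-weight cat state) are choices made here for illustration.

```python
import itertools
import math


def dispersion(psi, n_spins):
    """D(psi) = ( <(sum_i S_i)^2> - <sum_i S_i>^2 ) / N^2, as in Eq. (16)."""
    configs = itertools.product((0.5, -0.5), repeat=n_spins)
    mean = 0.0
    second = 0.0
    for amplitude, s in zip(psi, configs):
        p = abs(amplitude) ** 2
        total = sum(s)
        mean += p * total
        second += p * total ** 2
    return (second - mean ** 2) / n_spins ** 2


n = 6
dim = 2 ** n
eigenstate = [1.0] + [0.0] * (dim - 1)  # all spins up
cat = [1 / math.sqrt(2)] + [0.0] * (dim - 2) + [1 / math.sqrt(2)]  # (all up + all down)/sqrt(2)
product = [1 / math.sqrt(dim)] * dim  # every spin in (up + down)/sqrt(2)

for name, psi in (("eigenstate", eigenstate), ("product", product), ("cat", cat)):
    print(name, dispersion(psi, n))
```

For these particular states the dispersion is exactly 0 on the eigenstate, $1 / (4 N)$ on the product state and $1 / 4$ on the cat state, consistent with the scaling described above.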

Let us first examine $T = 0$. Again, the ensemble reduces to support on wavefunctions of the form:

ψ (θ, α) = \cos (θ) ψ_{+} + \sin (θ) e^{\sqrt{- 1} α} ψ_{-}.

(19)

Plugging into the definition of D yields:

E_{W F E} (ψ) = \frac{w N^{2}}{4} \sin^{2} (2 θ) .

(20)

By adding a trivial constant (which does not alter the measure), the usual energy can be rewritten, respectively for Ising and Curie–Weiss, as

E_{Q M} (S_{1}, \dots, S_{N}) = \sum_{i, j; \mathrm{n.n.}} (S_{i} - S_{j})^{2} \quad \text{and} \quad N (1 - m^{2} (S)),

(21)

so both forms of energy have the same sign (positive). Hence, in the presence of the WFE terms the support is reduced to the cases with $θ = 0$ or $θ = π / 2$; the ensemble becomes a mixture of two magnetized states, as in (13). “Classical” behavior is restored.
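The zero-temperature computation can be verified in a few lines, since on the family (19) only the weights $\cos^{2} θ$ and $\sin^{2} θ$ of the all-up and all-down configurations enter. The sketch below (with a hypothetical helper `cat_family`) returns $m (ψ)$ and $D (ψ)$ from definition (16) and compares D with $\frac{1}{4} \sin^{2} (2 θ)$, the quantity that multiplies $w N^{2}$ in $E_{W F E}$.

```python
import math


def cat_family(theta, n_spins):
    """Return (m, D) for psi = cos(theta) psi_+ + sin(theta) e^{i alpha} psi_-.

    The phase alpha drops out because only squared moduli enter m and D.
    """
    p_up, p_down = math.cos(theta) ** 2, math.sin(theta) ** 2
    half_n = n_spins / 2.0
    mean_total = half_n * (p_up - p_down)         # <sum_i S_i>
    second_total = half_n ** 2 * (p_up + p_down)  # <(sum_i S_i)^2>
    m = mean_total / n_spins
    d = (second_total - mean_total ** 2) / n_spins ** 2
    return m, d


for theta in (0.0, math.pi / 8, math.pi / 4, math.pi / 2):
    m, d = cat_family(theta, 10)
    print(round(theta, 3), round(m, 4), round(d, 4), round(math.sin(2 * theta) ** 2 / 4, 4))
```

On this family $m^{2} + D = \frac{1}{4}$ for every $θ$, so the dispersion vanishes exactly where the squared magnetization is maximal, at $θ = 0$ and $θ = π / 2$.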

Thus, the interesting question arises as to whether this conclusion persists at nonzero temperatures. We observe that the “correlations” suppressed by including $E_{W F E} (ψ)$ in the Gibbs factor concern single wavefunctions and whether they imply a cat, while the correlations important for magnetization are thermal. For a discussion of why these models cannot be regarded as “continuous spin” models, see [25].

In Math Appendix A.3 we show that adding the WFE to distinguishable particles does not change the conclusion about magnetization (namely, none at any finite temperature).

Proposition 1.

For distinguishable particles with Hamiltonian (14), given a magnetic spin energy $E (S)$ of order N, e.g., Curie–Weiss or Ising, in any dimension and at any finite temperature (i.e., except T = 0) there is no magnetization in the thermodynamic limit:

\lim_{N \to \infty} [\{| m (ψ) | > ϵ\}]_{β} = 0,

(22)

for any $ϵ > 0$.

Namely, even under non-linear modifications, it is not possible to produce phase transitions, which we attribute to entropy swamping energy because of the high dimensionality of the models.

4. Models Assuming Exchange Symmetry: The Schrödingerist Curie–Weiss Model

As a starting point for removing the notion of distinguishable particles we consider the Curie–Weiss energy:

E_{C W} (S) = - \frac{1}{N} (\sum_{i = 1}^{N} S_{i})^{2},

(23)

where we are going to replace spin configurations by wavefunctions with exchange symmetry; for such a system one should consider “integer spin”, say $\{- 1, 0, + 1\}$; see [26,27] for examples of integer spin models. Here we adopt two levels in order to learn how to construct some mathematical tools for these wave-mechanical models, reasoning that the lowest possible dimensionality and mean-field interaction define the first case in which to study whether phase transitions appear or not. Models with higher dimensions, meaning higher entropy, or local interactions presumably are even less likely to exhibit such transitions. (Both the concept and the mathematical definition of nearest neighbors for indistinguishable particles appear somewhat problematic in wavefunction models.) In [28], qu-bits are considered with the exchange symmetry. IBM [29] (see references there) reports that titanium atoms can represent qu-bits and plans to build a complex of many of them to observe how the collective behavior changes; this might be a situation described by the SCW model.

We define a wave-mechanical model with Hamiltonian (23) and exchange spin symmetry. Note that with two levels the CW energy depends only on the number, call it “n”, of “down” spins. Thus we are led to introduce wavefunctions that depend only on “n”. On N-dimensional Euclidean space, define, given some $a_{i} > 0$:

\begin{aligned} & \int_{\{\sum x_{i}^{2} a_{i} = 1\}} \prod_{i = 1}^{N} d x_{i} F (x_{1}, \dots, x_{N}) = \\ & \lim_{σ \to 0} A_{ellipsoid}^{- 1} \int \prod d x_{i} \exp \{- (\sum x_{i}^{2} a_{i} - 1)^{2} / σ^{2}\} F (x_{1}, \dots, x_{N}) / W (σ), \end{aligned}

(24)

where “$A_{ellipsoid}$” stands for the surface area of the ellipsoid $\{\sum x_{i}^{2} a_{i} = 1\}$ and

W (σ) = \int \prod d x_{i} \exp \{- (\sum x_{i}^{2} a_{i} - 1)^{2} / σ^{2}\} .

(25)

Thus we utilize an approximate Dirac delta-function that becomes concentrated, in the limit, on the ellipsoid. Now let us make a change of variables in (24) by defining $y_{i} = \sqrt{a_{i}} x_{i}$, which maps the ellipsoid $\{\sum x_{i}^{2} a_{i} = 1\}$ onto the unit sphere $\{\sum y_{i}^{2} = 1\}$. A Jacobian factor of $\prod a_{i}^{- 1 / 2}$ appears in numerator and denominator and so cancels, yielding:

Lemma 1.

Let $A_{sphere}$ be the “surface area” of the N-sphere. Then:

\int_{\{\sum x_{i}^{2} a_{i} = 1\}} \prod_{i = 1}^{N} d x_{i} F (x_{1}, \dots, x_{N}) = (\frac{A_{sphere}}{A_{ellipsoid}}) \int_{\{\sum y_{i}^{2} = 1\}} \prod_{i = 1}^{N} d y_{i} F (\frac{y_{1}}{\sqrt{a_{1}}}, \dots, \frac{y_{N}}{\sqrt{a_{N}}}) .

We want to define an ensemble and corresponding integrals for symmetric wavefunctions. Each such wavefunction is uniquely given by its common value, call it $\hat{ψ} (n)$, on the set of configurations with n “down” spins. We have then:

\begin{aligned} \| ψ \|^{2} &= \sum_{n = 0}^{N} C (N, n) | \hat{ψ} (n) |^{2} = 1, \\ E (ψ) &= \sum_{n} \sum_{\{\# S = n\}} | ψ (S) |^{2} E_{S} = \sum_{n} C (N, n) | \hat{ψ} (n) |^{2} E_{n}, \quad \text{where } E_{n} := \sum_{\{\# S = n\}} E_{S} / C (N, n) . \end{aligned}

Here “S” stands for a spin configuration, “$\# S$” for the number of “down” spins it contains, and $C (N, n)$ is the number of configurations with n spins down. Next we let

ϕ (n) := \sqrt{C (N, n)} \hat{ψ} (n),

(26)

and apply Lemma 1, yielding:

\int_{\{\sum C (N, n) | \hat{ψ} (n) |^{2} = 1\}} \exp \{- \sum C (N, n) | \hat{ψ} (n) |^{2} E_{n}\} = \frac{A_{sphere}}{A_{ellipsoid}} \int_{\{\sum | ϕ (n) |^{2} = 1\}} \exp \{- \sum | ϕ (n) |^{2} E_{n}\} .

(27)

Note that the prefactor in (27) is not in the integrand and is not a function of $ϕ$; it is just a constant depending on N. Thus it plays no role in computing, e.g., the magnetization. With these preparatory remarks we can define our symmetric-wavefunction SCW (Schrödingerist Curie–Weiss) model. The magnetic energy associated with $ϕ$ becomes:

E_{C W} (ϕ) = - \frac{1}{N} \sum_{n = 0}^{N} | ϕ_{n} |^{2} (N - 2 n)^{2} .

(28)

Our SCW model is then defined by:

[F (ϕ)]_{β} = \int_{\{\| ϕ \|^{2} = 1\}} d ϕ \exp \{- β E_{C W} (ϕ)\} \frac{F (ϕ)}{Z}, \quad Z = \int_{\{\| ϕ \|^{2} = 1\}} d ϕ \exp \{- β E_{C W} (ϕ)\} .

The choice of the uniform base ensemble might be questioned; we leave this issue to Section 5.

4.1. The SCW Model with Wavefunction Energy

We observe that

E_{C W} (ϕ) = - N \{m^{2} (ϕ) + D (ϕ)\},

(29)

where

m (ϕ) = \frac{1}{N} \sum_{n} | ϕ_{n} |^{2} (N - 2 n); \quad D (ϕ) = \frac{1}{N^{2}} \{\sum_{n} | ϕ_{n} |^{2} (N - 2 n)^{2} - [\sum_{n} | ϕ_{n} |^{2} (N - 2 n)]^{2}\} .

In passing, we point out the curious fact, revealed by (29), that $D (ϕ)$ plays a role in a wavefunction model with the standard energy. We define:

f = N β \{1 - m^{2} (ϕ) - D (ϕ) + N w D (ϕ)\} .

(30)

Here we have added a constant term ($N β$) to make $f \geq 0$ and incorporated wavefunction energy. The intuition for the latter choice comes from the observation that, lacking that term, f can be small if either $m^{2}$ is large or D is large; incorporating the dispersion term with large enough $w N$, the last possibility should be suppressed.
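The identity (29) and the non-negativity of f in (30) can be spot-checked numerically on random symmetric wavefunctions. In the sketch below, $ϕ$ is stored as a list of $N + 1$ complex amplitudes indexed by the number n of down spins; the function names and the parameter values used in the demonstration are assumptions for illustration only.

```python
import math
import random


def sample_phi(n_spins, rng):
    """Draw phi uniformly on the unit sphere of C^(N+1) via normalized Gaussians."""
    chi = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_spins + 1)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in chi))
    return [c / norm for c in chi]


def magnetization(phi):
    """m(phi) = (1/N) sum_n |phi_n|^2 (N - 2n)."""
    n_spins = len(phi) - 1
    return sum(abs(a) ** 2 * (n_spins - 2 * n) for n, a in enumerate(phi)) / n_spins


def dispersion(phi):
    """D(phi) = (1/N^2) { sum_n |phi_n|^2 (N - 2n)^2 - [sum_n |phi_n|^2 (N - 2n)]^2 }."""
    n_spins = len(phi) - 1
    first = sum(abs(a) ** 2 * (n_spins - 2 * n) for n, a in enumerate(phi))
    second = sum(abs(a) ** 2 * (n_spins - 2 * n) ** 2 for n, a in enumerate(phi))
    return (second - first ** 2) / n_spins ** 2


def energy_cw(phi):
    """E_CW(phi) = -(1/N) sum_n |phi_n|^2 (N - 2n)^2, Eq. (28)."""
    n_spins = len(phi) - 1
    return -sum(abs(a) ** 2 * (n_spins - 2 * n) ** 2 for n, a in enumerate(phi)) / n_spins


def gibbs_exponent(phi, beta, w):
    """f = N beta { 1 - m^2 - D + N w D }, Eq. (30)."""
    n_spins = len(phi) - 1
    m, d = magnetization(phi), dispersion(phi)
    return n_spins * beta * (1.0 - m * m - d + n_spins * w * d)


rng = random.Random(0)
phi = sample_phi(20, rng)
n_spins = len(phi) - 1
print(energy_cw(phi), -n_spins * (magnetization(phi) ** 2 + dispersion(phi)))
print(gibbs_exponent(phi, beta=1.0, w=1.2 / n_spins))
```

The two numbers on the first printed line agree up to rounding, which is the content of (29).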

As always, the model is defined by:

[m^{2}]_{β} = \int_{\{\| ϕ \|^{2} = 1\}} d ϕ \exp \{- f (ϕ)\} m^{2} (ϕ) / Z; \quad Z = \int_{\{\| ϕ \|^{2} = 1\}} d ϕ \exp \{- f (ϕ)\} .

Intuition suggests we should investigate cases where w is at least $1 / N$; hence, we define

ω = N w,

(31)

and we assume below that $ω$ is a constant. This does not indicate a belief that w actually scales with N; if a wavefunction energy exists, then w is a constant of nature and does not scale. The role of the assumption is to avoid the suppression of all superpositions, which would follow with fixed “w” in the mathematical limit of large N because of the factor $N^{2}$. This limit does not exist in reality; therefore our assumption is just a mathematical stratagem to prove theorems.

In Math Appendix A.4 we prove:

Theorem 2.

Let positive numbers ω and ϵ satisfy:

1 < ω < 4 / 3;

(32)

ϵ < \frac{1}{4} (1 + \sqrt{1 - 4 r})^{2};

(33)

r = \frac{ω - 1}{ω} .

(34)

Then there is a positive number $β_{c} = \frac{p^{*} (ϵ)}{(ω - 1) ϵ}$ such that, for $β > β_{c}$,

\lim_{N \to \infty} [m^{2}]_{β} > ϵ .

(35)

The factor $p^{*} (ϵ)$ is a large deviation rate functional arising from the application of the Gärtner–Ellis theorem; we do not write it here because it involves a technical discussion of large deviations theory, which we postpone to Math Appendix A.4. The upper bound $4 / 3$ is required for the application of the Gärtner–Ellis theorem, but we expect it to have no significance. With the specified range for $ω$, (33) always holds if $ϵ < 1 / 4$.
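The hypotheses (32)–(34) are simple to evaluate for a given $ω$. The following sketch computes r and the upper bound on $ϵ$ from (33); it does not compute $β_{c}$, since the rate functional $p^{*} (ϵ)$ is only defined in the appendix. The function name `theorem2_parameters` is hypothetical.

```python
import math


def theorem2_parameters(omega):
    """Return (r, eps_max) with r = (omega - 1)/omega and
    eps_max = (1/4) (1 + sqrt(1 - 4 r))^2, for 1 < omega < 4/3."""
    if not 1.0 < omega < 4.0 / 3.0:
        raise ValueError("Theorem 2 assumes 1 < omega < 4/3")
    r = (omega - 1.0) / omega
    eps_max = 0.25 * (1.0 + math.sqrt(1.0 - 4.0 * r)) ** 2
    return r, eps_max


for omega in (1.05, 1.2, 1.3):
    r, eps_max = theorem2_parameters(omega)
    print(omega, r, eps_max)
```

Because $r < 1 / 4$ on the allowed range, the square root is real and the bound always exceeds $1 / 4$, in line with the remark that (33) holds whenever $ϵ < 1 / 4$.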

The fact that the magnetization is zero at high temperature is shown in Appendix A.2, where we used a system of coordinates called polyspherical coordinates for a computation; see [30].

4.2. The SCW Model without Wavefunction Energy

Without WFE, we state the following theorem:

Theorem 3.

For the SCW model at any finite temperature (i.e., except T = 0) and for any $ϵ > 0$:

\lim_{N \to \infty} [\{| m (ϕ) | > ϵ\}]_{β} = 0 .

(36)

Thus, the SCW model with the standard energy expression has no finite-temperature phase transition. Using large deviation techniques from probability theory, we reduced the proof of Theorem 3 to a study of one real function of two variables, for which we supply exact formulas (see Math Appendix A.5). A property of the rate functionals, needed for the proof, is supported by mathematical arguments and can be displayed, within some computational limits, by computer using statistical sampling from the domain of this function.

5. Ensemble Choice and Discussion

In previous sections, we constructed Gibbs ensembles where each state is weighted with a uniform measure on the sphere. A priori, we could consider other measures conserved by the Hamiltonian flows, e.g., modifications of the uniform one such as $d ϕ \, F (ϕ)$, where $F (ϕ)$ satisfies $\{F (ϕ), E (ϕ)\} = 0$ and the braces $\{\cdot, \cdot\}$ indicate the Poisson bracket. In the quadratic case this is equivalent to $[F, H] = 0$, where now $[\cdot, \cdot]$ denotes the commutator. (This reflects the fact that there are many conserved quantities other than the energy and the norm in linear QM (including the moduli of the wavefunction components in the eigenstate directions), showing that linear QM is far from being ergodic.) As discussed in [19], Schrödinger’s equation and the non-linear modifications proposed there are simply Hamiltonian systems in disguise; hence, Liouville’s theorem applies, implying that the base measure should be $\prod_{S} d ψ (S)$. The non-linear modification of WFE will of course eliminate the conserved quantities along eigenstate directions. Proving ergodicity for some such modification would qualify the uniform measure as the unique choice for a base measure. But at the present time we can only demonstrate the existence of expanding and contracting directions in certain special cases; see [21].

An alternative ensemble, with the same base measure, could be constructed by adopting a Boltzmann factor that weights a state

ψ

taking into account the notion of possible microscopic states compatible with a macroscopic one. Concerning quantum theory, this notion can be tracked back to Einstein [31], in a paper about how to define quantum entropy. We think that, if the introduction of this modification has a meaning, it should be related to having considered indistinguishable particles jointly with a lack of spatial coordinates, which would add a degeneration on energy levels. For example, confining each spin in a spatial potential, we would have many levels for each spin and correspondingly many ways to arrange n spins down. In the SCW case, the proposal would be to replace in

\prod_{n} d ϕ (n)

with

\prod_{n} {[C (N, n)]}^{| ϕ_{n} |^{2}} d ϕ_{n}

, where

C (N, n)

is the number of states with n spins down. In this way, a wavefunction acquires a large combinatorial weight if its amplitudes are concentrated on highly degenerate components.

Calling

H

the Hilbert space of wavefunctions of norm one decomposed as

H = \underset{n}{\oplus} H_{n}

, where

H_{n}

is the subspace of the wavefunctions with quantum number n and dimension

\dim H_{n} : = C (N, n)

, the proposed measure can be written also as

\prod_{n} {[dim H_{n}]}^{| ϕ_{n} |^{2}} d ϕ_{n} = e^{[log dim H] (ϕ)} \prod_{n} d ϕ_{n},

(37)

with

[log dim H] (ϕ) = \sum_{n} {| ϕ_{n} |}^{2} log (\dim H_{n})

and one verifies that

{[log \dim H] (ϕ), H_{S C W M} (ϕ)} = 0

. Whether such a modified measure is still conserved by the Hamiltonian flow, with or without WFE, will depend on the precise details of the dynamics.

An a priori weight associated with a single wavefunction is questionable. However, at the moment we cannot exclude a different measure from the uniform one on the state space. Let us reflect on how this ensemble would introduce an analogous notion to the classical microcanonical entropy of the usual spin models. Introducing the energy and entropy densities, respectively,

e (ϕ) = 〈 ϕ | e | ϕ 〉, where e (n) = - m_{N}^{2} (n) = - {(1 - \frac{2 n}{N})}^{2},

s (ϕ) = \frac{1}{β} 〈 ϕ | s | ϕ 〉, where s (n) = \frac{1}{N} log C (N, n) .

By Stirling’s formula, for large N,

s (n)

is approximated by

s (m_{N} (n)) = - \frac{1 - m_{N} (n)}{2} log \frac{1 - m_{N} (n)}{2} - \frac{1 + m_{N} (n)}{2} log \frac{1 + m_{N} (n)}{2} .
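As a quick numerical sanity check of this Stirling approximation, the following Python sketch compares s(n) = (1/N) log C(N, n) with the limiting entropy function of m = 1 − 2n/N. The function names and the sample values of N and n are our own illustrative choices and are not taken from the paper.

```python
import math

def s_exact(N, n):
    """Exact entropy density s(n) = (1/N) log C(N, n), computed via log-gamma."""
    log_binom = math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
    return log_binom / N

def s_stirling(m):
    """Limiting form: -(1-m)/2 log((1-m)/2) - (1+m)/2 log((1+m)/2)."""
    total = 0.0
    for p in ((1.0 - m) / 2.0, (1.0 + m) / 2.0):
        if p > 0.0:  # convention: 0 log 0 = 0
            total -= p * math.log(p)
    return total

def stirling_gap(N, n):
    """Absolute difference between the exact and the limiting entropy density."""
    m = 1.0 - 2.0 * n / N
    return abs(s_exact(N, n) - s_stirling(m))

if __name__ == "__main__":
    for N in (50, 500, 5000):
        n = N // 4
        print(N, s_exact(N, n), s_stirling(1.0 - 2.0 * n / N), stirling_gap(N, n))
```

The gap should shrink roughly like (log N)/N, which is why the correction can be neglected at the level of the densities used here.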

We also introduce

f^{β} (ϕ) : = e (ϕ) - s (ϕ) .

The partition function becomes

\begin{matrix} \frac{1}{S_{N}} \int_{∥ ϕ ∥ = 1} e^{log dim H_{n} (ϕ)} d ϕ exp (- β E (ϕ)) \approx \frac{1}{S_{N}} \int_{∥ ϕ ∥ = 1} d ϕ e^{- β N f^{β} (ϕ)} . \end{matrix}

(38)

Since we are interested in making N large, we used the approximation

\frac{1}{β N} log dim H_{n} (ϕ) \approx s (ϕ)

and, neglecting the errors, we arrive at

\begin{matrix} Z_{N} = \frac{1}{S_{N}} \int_{∥ ϕ ∥ = 1} d ϕ e^{- β N [- {(m (ϕ))}^{2} - s (ϕ) - D (ϕ)]} . \end{matrix}

(39)

The extra factor

e^{β N s (ϕ)}

can be thought of as an attempt to restore the classical picture of a competition between internal energy and entropy, the latter due to the high degeneracy of disordered macrostates. However, since this factor gives large weights to some states of zero magnetization, the WFE will still be necessary to define magnetic models.

The way to recover the usual ensembles may be to introduce

w N^{2} D

with w constant (i.e., without keeping

ω = w N

constant) in the thermodynamic limit. In this case, one expects that the measure will concentrate on the set

{ϕ : D (ϕ) = 0}

. Conceivably, we might arrive at the classical ensembles even in models without exchange symmetry, as considered in Section 3. However, apart from conflicting with a fundamental fact of quantum mechanics, i.e., that particles are indistinguishable and so wavefunctions must have symmetries, such an ensemble will not magnetize in the thermodynamic limit even with suppression of superpositions, as already discussed after (22). We mention that this is the route followed so far by various authors [32,33,34,35,36], who reduced the ensembles on the sphere to the usual ones by introducing, through various lines of reasoning, delta measures on eigenstates. We expect the concentration to hold for indistinguishable particles.

An interesting computation could be repeating the analysis of Theorem 3 with the modified measure (37), keeping first

ω

constant and in a second step considering the critical temperature with

ω = w M

in the limit of large M.

We guess that the uniform measure is still the right choice. The introduction of the Boltzmann factor appears a bit artificial, but it seems relevant when including

w N^{2} D

with w constant, as otherwise the internal energy would have no competition, assuming the measure concentrates on eigenstates. However, we suggest there could be other meaningful pictures.

6. Conclusions

Once indistinguishable particles are considered, it seems we are forced to use a mean-field type of interaction (plus the non-quadratic WFE term in order to observe magnetization): because of the symmetry conditions on the wavefunctions, at the moment we do not see a way to implement interactions such as nearest-neighbor ones. (The current literature implementing nearest-neighbor interactions on quantum states does not help, since it seems to apply the Hamiltonians without considering symmetry conditions.) To enrich our models, we could study the mean-field Heisenberg model, or the present model with a transverse magnetic field, which might be of interest as a model for what are called quantum phase transitions [37]. Developing these ensembles could also help in dealing with problems for continuous models related to diffusion.

We might interpret the necessity of considering wavefunctions with symmetry conditions if we hope to have a thermodynamics that includes phase transitions as another justification of the usual quantum rules for combining indistinguishable systems.

Assuming the validity of Theorem Three (which claims that SCW with the usual energy does not spontaneously magnetize), our results suggest that blocking magnetic cats is related to phase transitions, presumably because states which are broad superpositions tend to have low magnetizations.

Perhaps the two main tasks for future investigations (which appear already very hard) are: improving the ability to compute spherical integrals, in the hope of deriving the functional dependence of magnetization on temperature; and generalizing the model to the case

{- 1, 0, 1}

for symmetric wavefunctions and to the case

{- 1, 1}

for antisymmetric wavefunctions. We observe that these cases will automatically add degeneracy to the energy levels.

Also it might be worth considering N-levels for each spin, namely a pictorial way for replacing the discrete lattice with a confining potential for each spin and studying the limit for

w N^{2}

; this might help an understanding of Boltzmann entropy from a quantum perspective.

Author Contributions

Conceptualization, L.D.C. and W.D.W.; Methodology, L.D.C. and W.D.W.; Software, W.D.W.; Validation, L.D.C. and W.D.W.; Formal analysis, L.D.C. and W.D.W.; Investigation, L.D.C. and W.D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Department of Economics and Finance LUISS and the Ministry of Education, University and Research under the PRIN 2017 call, DD 3728, 27 December 2017, project “The Time-Space Evolution of Economic Activities: Mathematical Models and Empirical Applications”, CUP I84I20000130008.

Institutional Review Board Statement

No ethical approval is required.

Data Availability Statement

The work did not need data.

Acknowledgments

Leonardo De Carlo thanks Federico De Iure and the ward Chirurgia Vertebrale Ospedale Maggiore Bologna for giving him back a normal life, the DEF of LUISS Guido Carli di Roma and Fausto Gozzi for the allowed time to finish the writing of the present project and Laura with her cats for the after-launch-time during the solitary Winter in Tirrenia(PI). L.D.C. thanks also W. van Ackooij for a private communication about probability functions.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

COM	Center of mass
SCW	Schrödingerist Curie–Weiss
UIF	Useful integral formula
WFE	Wavefunction energy

Appendix A

Here we collect in several appendices the various computational lemmas we used in the proofs that follow and the proofs themselves.

Appendix A.1. Some Useful Lemmas

Let S be a compact manifold without boundary, f a real-valued function on S,

d x

a finite measure on S. Without loss of generality, we can take

\int_{S} d x = | S | = 1,

(A1)

and assume

f \geq 0

. Let, for any bounded g on S:

\begin{matrix} [g] = \int_{S} d x exp {- f (x)} g (x) / Z, where Z = \int_{S} d x exp {- f (x)} . \end{matrix}

Lemma A1

(Concentration Lemma). Let there be two open subsets of S, called U and V, and three positive numbers α, η and μ such that

\begin{matrix} (A) & V \subset U, & (B) & f (x) \leq η, for x \in V; \\ (C) & f (x) \geq α, for x \notin U, & (D) & | V | \geq μ . \end{matrix}

(A2)

Then:

[g] = \frac{R + ξ}{1 + ζ},

(A3)

where

\begin{matrix} R & = \int_{U} d x exp {- f (x)} g (x) / Z_{U}, & Z_{U} = \int_{U} d x exp {- f (x)}; \\ | ξ | & \leq e^{- α} e^{η} μ^{- 1} | | g | |, & | ζ | \leq e^{- α} e^{η} μ^{- 1} . \end{matrix}

(A4)

Here

| | g | |

denotes the supremum norm of g on S. Note that

R \in span {g (x) : x \in U}

. The intuition behind this lemma is the following. Suppose we are expecting the measure with density

ρ = e^{- f} / Z

to be close to a delta function. Then this density must have a spike at the minimum of f, call it

ρ_{*}

, and the ratio of

ρ_{*}

to values of

ρ

far away from the spike should tend to infinity. But we still will not get a delta function if the spike occupies a very tiny part of the manifold (imagine it as an infinitely thin needle) because it will contribute little to the integral. Hence the roles of the sets U and V and the bounds on f which prevent this “needle-like” behavior. The idea is that V is a small neighborhood of the global minimum of f. The minimum may not occur at a single point, but on a subset. The volume

| V |

should not be so small as to put a large factor in

ξ

and

ζ

, while

α > η

. Then

ξ

and

ζ

should tend to zero and the measure concentrates on the set U.

We note that the “balance of energy and entropy” game is contained in the difference

α - η

and in

μ

, measuring how f increases compared with the volume of the manifold required.
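The lemma is stated here without proof. The following short derivation is our own sketch of why the bounds in (A4) hold, under the normalization |S| = 1 and f ≥ 0 assumed above; the symbol Z_{U^c} is introduced only for this sketch and does not appear in the paper.

```latex
\begin{align*}
Z &= Z_U + Z_{U^c}, \qquad Z_{U^c} := \int_{S \setminus U} dx\, e^{-f(x)}, \\
[g] &= \frac{\int_U dx\, e^{-f} g + \int_{S \setminus U} dx\, e^{-f} g}{Z_U + Z_{U^c}}
     = \frac{R + \xi}{1 + \zeta},
\qquad \xi := \frac{1}{Z_U} \int_{S \setminus U} dx\, e^{-f} g,
\quad \zeta := \frac{Z_{U^c}}{Z_U}, \\
Z_U &\geq \int_V dx\, e^{-f} \geq e^{-\eta} |V| \geq e^{-\eta} \mu
  \quad \text{by (A), (B), (D)}, \\
Z_{U^c} &\leq e^{-\alpha} |S \setminus U| \leq e^{-\alpha}
  \quad \text{by (C) and } |S| = 1, \\
0 \leq \zeta &\leq e^{-\alpha} e^{\eta} \mu^{-1}, \qquad
|\xi| \leq \|g\|\, \zeta \leq e^{-\alpha} e^{\eta} \mu^{-1} \|g\| .
\end{align*}
```

In this form the role of each hypothesis is visible: (B) and (D) bound the mass near the minimum from below, while (C) bounds the mass outside U from above.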

Next consider an integral of form:

\int_{B} exp {- f (ϕ)} d ϕ,

(A5)

where

d ϕ

stands for a probability distribution on a compact manifold with continuous density, B is a regular subset (non-empty interior and boundary of

d ϕ

-measure zero) and f is a continuous function on B with values in

[0, c]

and level sets of zero

d ϕ

-measure. Then we have the representation:

Lemma A2

(Useful Integral Formula (UIF)).

\int_{B} exp {- f (ϕ)} d ϕ = e^{- c} | B | + c \int_{0}^{1} d x e^{- c x} F (x),

(A6)

where

F (x) = | \{f (ϕ) \leq c x; ϕ \in B\} | .

(A7)

Here,

| \cdot |

denotes the volume (“area”) of a subset.

Proof.

\begin{matrix} \int_{B} exp {- f (ϕ)} d ϕ & \geq & \sum_{k = 0}^{K} e^{- k c / K} |\{\frac{(k - 1) c}{K} < f (ϕ) \leq \frac{k c}{K}; ϕ \in B\}| \\ & = & \sum_{k = 0}^{K} e^{- k c / K} \{F (\frac{k}{K}) - F (\frac{k - 1}{K})\} \approx \frac{1}{K} \sum_{k = 0}^{K} e^{- k c / K} F^{'} (\frac{k}{K}) . \end{matrix}

(A8)

The last line tends, as

K \to \infty

, to:

\int_{0}^{1} d x e^{- c x} F^{'} (x) .

(A9)

We can get an identical upper bound in the limit

K \to \infty

by substituting

e^{- (k - 1) c / K} = e^{- k c / K} e^{c / K}

(A10)

in the calculation leading to (A8). Now the lemma follows from integration by parts, using

F (0) = 0

and

F (1) = | B |

. □
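The UIF can be checked numerically on a toy example. The sketch below takes B = [0, 1] with Lebesgue measure and f(ϕ) = c ϕ², for which F(x) = √x and |B| = 1; this example, the helper names, and the choice of a midpoint quadrature are our own assumptions for illustration only.

```python
import math

def midpoint(func, a, b, steps=100_000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def uif_sides(c):
    """Return (lhs, rhs) of (A6) for B = [0, 1] and f(phi) = c * phi**2."""
    # Left side: direct integral of exp(-f) over B.
    lhs = midpoint(lambda phi: math.exp(-c * phi * phi), 0.0, 1.0)
    # Right side: F(x) = |{phi in B : c * phi^2 <= c * x}| = sqrt(x), and |B| = 1.
    tail = midpoint(lambda x: math.exp(-c * x) * math.sqrt(x), 0.0, 1.0)
    rhs = math.exp(-c) * 1.0 + c * tail
    return lhs, rhs

if __name__ == "__main__":
    for c in (0.5, 3.0, 10.0):
        lhs, rhs = uif_sides(c)
        print(c, lhs, rhs, abs(lhs - rhs))
```

The two sides should agree up to quadrature error, which illustrates how the formula trades an integral over the manifold for a one-dimensional integral against the volume function F.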

Appendix A.2. A Curious Computation: The Average Magnetization at T = ∞

In this section we ask: what is the magnetization per spin at T = ∞, meaning

β = 0

? It should certainly be zero, which we prove here; the calculation will also provide some tools useful further on.

We adopt symmetric wavefunctions, which we denote by

ϕ_{n}

, for

n = 0, \dots, N

. Here n denotes the number of “down” spins. To avoid factors of 1/2 below we take our spins to have values ±1. In the following the expression:

\int d ϕ

will be understood to be an integral over the normalized measure on the sphere, i.e.,

\int d ϕ = \frac{1}{A_{N}} \int_{| | ϕ | | = 1} \prod d ϕ_{n} .

(A11)

where

A_{N}

is the “area” of the 2N-sphere (the hypersphere in

R^{2 N}

). We can write:

\begin{matrix} {[m^{2} (ϕ)]}_{\infty} & = & {[{(\frac{M (ϕ)}{N})}^{2}]}_{\infty} = \int d ϕ {\{\sum_{n = 0}^{N} {| ϕ_{n} |}^{2} g_{n}\}}^{2}; \end{matrix}

where

g_{n} = 1 - 2 n / N

. Developing:

\begin{matrix} {[m^{2} (ϕ)]}_{\infty} & = & \sum_{n = 0}^{N} \sum_{m = 0}^{N} \int d ϕ | ϕ_{n} |^{2} | ϕ_{m} |^{2} g_{n} g_{m} = c_{1} \sum_{n = 0}^{N} g_{n}^{2} + c_{2} \sum_{n \neq m}^{N} g_{n} g_{m}, \end{matrix}

(A12)

where

c_{1} = \frac{1}{A_{N}} \int d ϕ | ϕ_{1} |^{4}, c_{2} = \frac{1}{A_{N}} \int d ϕ | ϕ_{1} |^{2} {| ϕ_{2} |}^{2}

. By adding and subtracting a term in (A12) and using

\sum_{n = 0}^{N} g_{n} = 0

, we can write:

{[m^{2} (ϕ)]}_{\infty} = (c_{1} - c_{2}) \sum_{n = 0}^{N} g_{n}^{2} .

(A13)

The sum can be evaluated from formulas:

\begin{matrix} \sum_{n = 0}^{N} g_{n}^{2} & = & N + 1 - \frac{4}{N} \sum_{n = 0}^{N} n + \frac{4}{N^{2}} \sum_{n = 0}^{N} n^{2} \\ = & N + 1 - \frac{4}{N} N (N + 1) / 2 + \frac{4}{N^{2}} \frac{N (N + 1) (2 N + 1)}{6} = \frac{1}{3} N + smaller order . \end{matrix}

We conclude that

lim_{N \to \infty} \frac{1}{N} \sum_{n = 0}^{N} g_{n}^{2} = 1 / 3 .

(A14)
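The sums used here can be verified exactly with rational arithmetic. In the sketch below the closed form (N + 1)(N + 2)/(3N) is our own simplification of the expression above (it is not stated in the paper), and the function names are illustrative.

```python
from fractions import Fraction

def g(n, N):
    """g_n = 1 - 2n/N as an exact rational number."""
    return 1 - Fraction(2 * n, N)

def sum_g(N):
    """Sum of g_n over n = 0, ..., N (expected to vanish by symmetry)."""
    return sum(g(n, N) for n in range(N + 1))

def sum_g_squared(N):
    """Sum of g_n^2 over n = 0, ..., N."""
    return sum(g(n, N) ** 2 for n in range(N + 1))

if __name__ == "__main__":
    for N in (10, 100, 1000):
        total = sum_g_squared(N)
        # Compare with the assumed closed form and with the limit 1/3 in (A14).
        print(N, sum_g(N), total == Fraction((N + 1) * (N + 2), 3 * N), float(total / N))
```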

Putting in factors of N and

1 / N

in (A13) we deduce that we must consider the limit:

lim_{N \to \infty} N (c_{1} - c_{2}) .

(A15)

By the same tricks but replacing

g_{n}

by one, we find:

1 = (N + 1) c_{1} + N (N + 1) c_{2}

, from which we conclude that

c_{1}

= O(

1 / N

) and

c_{2}

= O(

1 / N^{2}

), so the issue becomes evaluating:

{lim}_{N \to \infty} N c_{1} .

Informally, this limit should be zero, since

| ϕ_{n} |^{2} \approx 1 / (N + 1)

on average, so

c_{1}

should also be O(

1 / N^{2}

). To prove this rigorously, we resort to the polyspherical coordinate system: let

ξ

be a point on the (n − 1)-sphere (meaning the sphere in

R^{n}

). Write, see [30] Section 9.19,

ξ = η sin (θ) + ζ cos (θ),

(A16)

where

η

lies on the (s − 1)-sphere and

ζ

lies on the (n − s − 1)-sphere. Then:

d ξ = b_{n} {sin}^{s - 1} (θ) {cos}^{n - s - 1} (θ) d η d ζ d θ, where b_{n} = \frac{2 Γ (n / 2)}{Γ (s / 2) Γ ([n - s] / 2)} .

(A17)

We take

s = 2

and

η

as the first component of

ξ

and can therefore write:

\begin{matrix} \int d ξ | ξ_{1} |^{4} & = & b_{n} \int_{0}^{π / 2} d θ sin (θ) {cos}^{n - 3} (θ) \int d η {sin}^{4} (θ) {| η_{1} |}^{4} \int d ζ \\ & = & b_{n} \int d η {| η_{1} |}^{4} \int_{0}^{π / 2} d θ {sin}^{5} (θ) {cos}^{n - 3} (θ) . \end{matrix}

(A18)

Making the substitution

u = cos (θ)

the θ-integral becomes the integral of a polynomial in u, namely \int_{0}^{1} {(1 - u^{2})}^{2} u^{n - 3} d u, which works out to be, substituting

n = 2 N

,

\frac{8}{n (n^{2} - 4)} = \frac{1}{N (N^{2} - 1)}

; also, the prefactor comes out to be:

b_{2 N} = 2 (N - 1),

(A19)

which proves the assertion about the order of

c_{1}

and that the average magnetization at infinite temperature is zero in the thermodynamic limit.

This theorem can also be proved using LD theory but we omit it as it is covered in other Appendices.
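The conclusion of this appendix can also be checked by simulation. The Python sketch below draws ϕ uniformly on the unit sphere by normalizing i.i.d. complex Gaussians and averages m²(ϕ). It assumes N + 1 complex components indexed by n = 0, …, N; the sample sizes and names are our own choices. As a hedged side remark, if the squared moduli are treated as a flat Dirichlet vector (a standard property of the uniform measure on a complex sphere), then c₁ − c₂ = 1/((N + 1)(N + 2)), and (A13) would give the value 1/(3N), which the simulation can be compared against.

```python
import random

def sample_m_squared(N, rng):
    """Draw phi uniformly on the unit sphere of C^(N+1) and return m(phi)^2."""
    # |chi_n|^2 for complex Gaussians with i.i.d. N(0,1) real and imaginary parts.
    weights = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(N + 1)]
    norm_sq = sum(weights)
    # m(phi) = sum_n |phi_n|^2 g_n, with g_n = 1 - 2n/N and |phi_n|^2 = |chi_n|^2 / ||chi||^2.
    m = sum(w * (1.0 - 2.0 * n / N) for n, w in enumerate(weights)) / norm_sq
    return m * m

def mean_m_squared(N, samples=20_000, seed=0):
    """Monte Carlo estimate of the infinite-temperature average of m^2."""
    rng = random.Random(seed)
    return sum(sample_m_squared(N, rng) for _ in range(samples)) / samples

if __name__ == "__main__":
    for N in (10, 20, 40):
        print(N, mean_m_squared(N), 1.0 / (3.0 * N))
```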

Appendix A.3. Models without Exchange Symmetries

For the computations of this section we add to the energy an external field with a constant

λ

, incorporate the factor of

β

, and take the positive version of the interaction energy (which merely multiplies Z by a factor). For example

\sum_{i, j; n . n .}^{N} {(S_{i} - S_{j})}^{2}

and

N + E_{C W} (ϕ)

, respectively, for nearest neighbors and mean field. We write:

\begin{matrix} E (ψ; λ) = β \sum_{S} {| ψ |}^{2} (S) E (S) + λ M (ψ), M = \sum_{S} {| ψ |}^{2} (S) \sum_{i} S_{i} . \end{matrix}

Here and below “

\sum_{S}

”, resp. “

\prod_{S}

”, is shorthand for the sum, resp. product, over all configurations

{S_{1} = \pm 1 / 2, \dots, S_{N} = \pm 1 / 2}

. We will also write

E (ψ; λ) = \sum_{S} {| ψ |}^{2} (S) E_{S} (λ) .

(A20)

The starting point is to replace the spherical integral by a Gaussian integral, preserving the definition of the model by projecting

ψ

onto the sphere. Hence we can write:

\begin{matrix} Z_{N} = c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ / | | ψ | |; λ) / 2\}, {| | ψ | |}^{2} = \sum_{S} {| ψ |}^{2} (S) . \end{matrix}

We have introduced a factor of 1/2 before the energy to simplify some calculations below. The basic idea is to note that, although the integrand is singular at

ψ = 0

, because the components of

ψ

are i.i.d.

N (0, 1)

, that value is improbable; indeed,

\sum_{S} {| ψ |}^{2} (S) \approx a_{N} = 2 2^{N} .

(A21)
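The concentration of ||ψ||² around a_N can be illustrated by simulation. The sketch below estimates E(||ψ||²/a_N − 1)² for i.i.d. N(0, 1) real and imaginary parts; since ||ψ||² is then a chi-squared variable with a_N degrees of freedom, the estimate can be compared with 2/a_N. The function name, sample sizes, and the small values of N are our own assumptions, chosen to keep the run time short.

```python
import random

def relative_norm_fluctuation(num_spins, samples=4000, seed=1):
    """Estimate E[(||psi||^2 / a_N - 1)^2] for Gaussian psi on 2^N configurations."""
    rng = random.Random(seed)
    dim = 2 ** num_spins   # number of spin configurations S
    a_N = 2 * dim          # a_N = 2 * 2^N, the mean of ||psi||^2
    total = 0.0
    for _ in range(samples):
        # Real and imaginary parts together give 2 * dim standard normals.
        norm_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(2 * dim))
        total += (norm_sq / a_N - 1.0) ** 2
    return total / samples

if __name__ == "__main__":
    for num_spins in (2, 4, 6):
        a_N = 2 * 2 ** num_spins
        print(num_spins, relative_norm_fluctuation(num_spins), 2.0 / a_N)
```

Because a_N grows exponentially in N, the relative fluctuation becomes negligible already for moderate N, which is what motivates replacing the norm by a constant.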

Therefore we define a simpler model by replacing the worrisome sum in the integrand by a constant:

{\hat{Z}}_{N} = c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ; λ) / 2 a_{N}\};

(A22)

which we will dub the “exactly solvable model” (ESM), as we are reduced to computing a Gaussian integral. We have:

{\hat{Z}}_{N} = c_{N} \prod_{S} \int d ψ_{S} exp \{- {| ψ_{S} |}^{2} / 2 σ_{S}^{2}\}, where σ_{S}^{2} = {\{1 + E_{S} / a_{N}\}}^{- 1},

(A23)

so from the basic Gaussian integral in two dimensions we find:

{\hat{Z}}_{N} = \prod_{S} \frac{1}{1 + E_{S} / a_{N}} .

(A24)

Next we take two derivatives with respect to

λ

, denoted with primes, divide by

{\hat{Z}}_{N}

and evaluate at

λ = 0

:

\begin{matrix} \frac{{\hat{Z}}_{N}^{″}}{{\hat{Z}}_{N}} |_{λ = 0} = {\{\frac{1}{a_{N}} \sum_{S} \frac{M_{S}}{(1 + E_{S} / a_{N})}\}}^{2} + \frac{1}{a_{N}^{2}} \sum_{S} \frac{M_{S}^{2}}{{(1 + E_{S} / a_{N})}^{2}} . \end{matrix}

(A25)

The first term is zero by the spin-flip symmetry (

M_{S}

changes sign but since

E_{S}

is quadratic in S it does not). Then, observing the prefactors of

1 / a_{N}^{2}

, the whole thing tends to zero as

N \to \infty

even without a factor of

1 / N^{2}

, so there is no phase transition in the ESM. Finally we must estimate the difference in magnetizations in the two models. We can rewrite

Z_{N}

as:

Z_{N} = c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N} + χ_{N}\},

(A26)

where

χ_{N} = \frac{E (ψ)}{{| | ψ | |}^{2}} \{\frac{{| | ψ | |}^{2}}{a_{N}} - 1\} and ξ_{N} = \frac{1}{4} {\{\frac{M (ψ)}{{| | ψ | |}^{2}}\}}^{2} \{1 - {(\frac{{| | ψ | |}^{2}}{a_{N}})}^{2}\} .

(A27)

By expanding the exponential we can write

\begin{matrix} Z_{N} & \approx & {\hat{Z}}_{N} + c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} χ_{N}; \\ Z_{N}^{^{''}} & \approx & {\hat{Z}}_{N}^{^{''}} + c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} ξ_{N} + \\ c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} {(\frac{M (ψ)}{{| | ψ | |}^{2}})}^{2} χ_{N}; \end{matrix}

Hence, expanding to first order in the small quantities

χ_{N}

and

ξ_{N}

,

\begin{matrix} \frac{Z_{N}^{″}}{Z_{N}} & \approx & \frac{{\hat{Z}}_{N}^{″}}{{\hat{Z}}_{N}} + c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} ξ_{N} / {\hat{Z}}_{N} - \\ c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} {(\frac{M (ψ)}{{| | ψ | |}^{2}})}^{2} χ_{N} / {\hat{Z}}_{N} - \\ \frac{{\hat{Z}}_{N}^{″}}{{\hat{Z}}_{N}^{2}} c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / 2 a_{N}\} χ_{N} . \end{matrix}

(A28)

Let us treat the third term first. Note that

\begin{matrix} {(\frac{M (ψ)}{{| | ψ | |}^{2}})}^{2} \leq (1 / 4) N^{2}, \frac{E (ψ)}{{| | ψ | |}^{2}} \leq (1 / 4) N . \end{matrix}

Applying Cauchy–Schwarz we can bound the term by:

\begin{matrix} N^{3} / 16 {\{c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2\} {\{\frac{{| | ψ | |}^{2}}{a_{N}} - 1\}}^{2}\}}^{1 / 2} \times \\ {\{c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2 - E (ψ) / a_{N}\}\}}^{1 / 2} / {\hat{Z}}_{N} . \end{matrix}

(A29)

By standard CLT calculations, the first factor in curly brackets tends to zero, at rate

1 / \sqrt{a_{N}}

. Using (A24) the second factor can be rewritten as

{\{\prod_{S} \frac{1 + 2 E_{S} / a_{N} + E_{S}^{2} / a_{N}^{2}}{1 + 2 E_{S} / a_{N}}\}}^{1 / 2},

which is easily seen to be bounded (and in fact tends to one). In treating the second term we encounter instead the quantity

{\{c_{N} \int \prod_{S} d ψ_{S} exp \{- {| | ψ | |}^{2} / 2\} {\{{(\frac{{| | ψ | |}^{2}}{a_{N}})}^{2} - 1\}}^{2}\}}^{1 / 2} .

This term can also be shown to tend to zero. E.g., writing “

E

” for the Gaussian integral, factoring the difference of squares and making another Cauchy–Schwarz, we have to estimate:

\begin{matrix} E {(\frac{\sum | ψ_{S} |^{2}}{a_{N}} - 1)}^{4} & = & E {(\frac{1}{a_{N}} \sum \{| ψ_{S} |^{2} - 1\})}^{4} \\ = & \frac{1}{a_{N}^{4}} \sum_{S, S^{'}, S^{″}, S^{‴}} E \{| ψ_{S} |^{2} - 1\} \cdot \cdot \cdot \{| ψ_{S^{‴}} |^{2} - 1\} \\ = & \frac{6}{a_{N}^{4}} {\{\sum_{S} E {(| ψ_{S} |^{2} - 1)}^{2}\}}^{2} + \frac{4}{a_{N}^{4}} \sum_{S} E {(| ψ_{S} |^{2} - 1)}^{4} \end{matrix}

(A30)

which is O

(a_{N}^{- 2})

. We also have to estimate

E {\{\frac{{| | ψ | |}^{2}}{a_{N}} + 1\}}^{4}

, which by the same tricks is seen to be bounded. Recalling that

\frac{{\hat{Z}}_{N}^{″}}{{\hat{Z}}_{N}} \to 0,

(A31)

the other term in (A28) is even easier to treat. Since moments of Gaussians increase more slowly than a factorial, the sum of the other terms is dominated by a convergent series and of smaller order. QED.

Now consider adding wavefunction energy:

\begin{matrix} E_{W F E} (ψ) & = & w N^{2} D (ψ); \\ D (ψ) & = & \frac{1}{N^{2}} < ψ | {(\sum S_{i} - < ψ | \sum S_{i} | ψ >)}^{2} | ψ > \end{matrix}

(A32)

\begin{matrix} = & \frac{1}{N^{2}} [\sum_{i, j} < ψ | S_{i} S_{j} | ψ > - {\{\sum_{i} < ψ | S_{i} | ψ >\}}^{2}] \end{matrix}

(A33)

Let us switch to the Gaussian version, divide

ψ

by

| | ψ | |

and replace

{| | ψ | |}^{2}

by

a_{N}

. Treating the second term in the last line above, define

X_{\pm, i} = \frac{1}{2 a_{N}} \sum_{\hat{S}} {| ψ |}^{2} (S_{1}, \dots, \pm \frac{1}{2}, \dots, S_{N}),

(A34)

where “

\sum_{\hat{S}}

” means summation over the spin configurations with

S_{i}

held at the fixed value indicated. Then we can write:

< ψ | S_{i} | ψ > = \frac{1}{a_{N}} \sum_{S} {| ψ (S) |}^{2} S_{i} = X_{+, i} - X_{-, i} .

(A35)

Note that, using

E

for expectation over the Gaussian distribution:

\begin{matrix} E (X_{+, i} - X_{-, i}) = 0, E {(X_{+, i} - X_{-, i})}^{2} = E {(X_{+, i} - E X_{+, i})}^{2} + E {(X_{-, i} - E X_{-, i})}^{2}; \end{matrix}

(A36)

and

E {(X_{+, i} - E X_{+, i})}^{2} = E {\{\frac{1}{2 a_{N}} \sum_{\hat{S}} [{| ψ |}^{2} (S_{1}, \dots, \frac{1}{2}, \dots, S_{N}) - 2]\}}^{2} .

(A37)

Being the second moment of an average of i.i.d. mean-zero random variables, this last quantity is O

(1 / a_{N})

. The last term in (A32) is a sum of N terms which are not independent but all of the order just computed. By a simple Jensen inequality, the sum is at most N times this order. We conclude that the whole term is negligible. The first term is of the form of a mean-field model, still quadratic in

ψ

. Hence it can be added to the usual magnetic energy. The bound on

E_{S}

increases to

N^{2}

, but does not affect the argument. For the comparison with the actual model with WFE, we encounter the expression

\frac{1}{4} \{\frac{M (ψ)}{{| | ψ | |}^{2}}\} \{(\frac{{| | ψ | |}^{2}}{a_{N}}) - 1\},

(A38)

which can be treated as before.

Thus the argument of the quadratic case goes through and yields that with WFE there is no magnetization at finite temperature.

Appendix A.4. SCW with Wavefunction Energy

We prove here that, assuming the hypotheses of Theorem Two, if

β > \frac{{\underset{̲}{p}}^{*} (ϵ)}{(ω - 1) ϵ},

(A39)

where

{\underset{̲}{p}}^{*} (ϵ)

is a certain positive function specified in the proof below, then the conclusion of the theorem follows. We will apply the Concentration Lemma (A3). We can assume by adding a term as usual to render f positive:

f = N β \{1 - m^{2} (ϕ) + (ω - 1) D (ϕ)\} .

(A40)

We next define the quantities involved in the lemma. For the sets U and V we take, given two positive numbers

ϵ

and

η

(

η

may depend on N),

\begin{matrix} U & = & \{ϕ : m^{2} (ϕ) \geq ϵ\}; V = \{f \leq η\} . \end{matrix}

Further, define:

α = β N (1 - ϵ) .

We next check the hypotheses of the Concentration Lemma. A quick calculation gives the equivalent description of set V:

V = \{ϕ : ω m^{2} (ϕ) \geq (ω - 1) \sum {| ϕ_{n} |}^{2} g_{n}^{2} + (1 - η / β N)\} = \{ϕ : m^{2} (ϕ) \geq \sum {| ϕ_{n} |}^{2} a_{n}\},

where

g_{n} = 1 - \frac{2 n}{N}

and we have defined

\begin{matrix} a_{n} = \frac{ω - 1}{ω} g_{n}^{2} + \frac{1 - η^{'} / β}{ω} = r g_{n}^{2} + δ; \end{matrix}

Since

ω > 1

, assuming

η = η^{'} N

and

1 - η^{'} / β > ω ϵ

, the condition defining V implies

V \subset U

. Since

ω > 1

, it follows from (A40) that

f \geq α

on

U^{c}

. For the lower bound on

| V |

, we translate to a model with i.i.d. Gaussians, call them

χ_{n}

, replacing:

ϕ_{n} ⟶ \frac{χ_{n}}{\sqrt{\sum | χ_{n} |^{2}}},

(A41)

and the definition of V becomes:

V = \{{(\sum_{n = 0}^{N} {| χ_{n} |}^{2} g_{n})}^{2} \geq [\sum_{n = 0}^{N} a_{n} {| χ_{n} |}^{2}] \sum_{n = 0}^{N} {| χ_{n} |}^{2}\} .

(A42)

(Since

ψ_{n}

has both a real and an imaginary part, we should double the lengths of these sums, but clearly this has no impact on the final result.) We have to find an exponential lower bound on

P [V]

. Since

a_{n} \geq δ

, we can get a lower bound by writing:

\begin{matrix} P [V] \geq P [{(\frac{1}{N} \sum_{n = 0}^{N} {| χ_{n} |}^{2} g_{n})}^{2} \geq \{\frac{1}{N} \sum_{n = 0}^{N} a_{n} {| χ_{n} |}^{2}\} \frac{1}{N} \sum_{n = 0}^{N} \frac{a_{n}}{δ} {| χ_{n} |}^{2}] = \\ P [\sqrt{δ} \frac{1}{N} \sum_{n = 0}^{N} | χ_{n} |^{2} g_{n} \geq \frac{1}{N} \sum_{n = 0}^{N} a_{n} {| χ_{n} |}^{2}] + P [\sqrt{δ} \frac{1}{N} \sum_{n = 0}^{N} | χ_{n} |^{2} g_{n} \leq - \frac{1}{N} \sum_{n = 0}^{N} a_{n} {| χ_{n} |}^{2}] . \end{matrix}

We will work on the first probability on the last line. (The second one is really the same since, multiplying the inequality through by −1, replacing

g_{n}

by

- g_{n}

just reverses the order of its values.) According to Gärtner–Ellis [38], we have to compute:

p (θ) = lim_{N \to \infty} \frac{1}{N} log E exp {θ \sum χ_{n}^{2} b_{n}},

(A43)

where

E

denotes expectation and

\begin{matrix} b_{n} = \sqrt{δ} g_{n} - a_{n} = \sqrt{δ} (1 - \frac{2 n}{N}) - r {(1 - \frac{2 n}{N})}^{2} - δ, \end{matrix}

Note that

b_{n}

can take negative and positive values; hence,

θ

must be restricted to ensure that the integral is finite. From the standard Gaussian integral we get:

p (θ) = - \frac{1}{2} lim_{N \to \infty} \frac{1}{N} \sum log (1 - 2 θ b_{n}),

(A44)

which, provided the integrand is bounded, we recognize as a Riemann sum converging to:

p (θ) = - \frac{1}{2} \int_{0}^{1} d u log (1 - 2 θ A (1 - 2 u)),

where

A (x) = \sqrt{δ} x - r x^{2} - δ .

By a change of variable this equals:

p (θ) = - \frac{1}{4} \int_{- 1}^{1} d x log (1 - 2 θ A (x)) .

(A45)

The LD approach requires us to compute:

\begin{matrix} p^{*} (y) & = & sup_{θ} \{θ y - p (θ)\}; {\underset{̲}{p}}^{*} = inf_{y > 0} p^{*} (y); \end{matrix}

(A46)

and then the asymptotic lower bound is

exp (- {\underset{̲}{p}}^{*} N)

[38]. By the definition of the domain where

p (θ) < + \infty

in the Gärtner–Ellis theorem, the supremum over

θ

in the definition of

p^{*} (y)

can be limited to:

\frac{1}{2 A_{\min}} \leq θ \leq \frac{1}{2 A_{\max}},

(A47)

where

A_{\min}

is negative but we do not need to know it, while by a simple computation:

A_{\max} = δ \{\frac{4 - 3 ω}{4 (ω - 1)}\} .

(This is the max on the whole line.) We must have positive values of

A (x)

somewhere in the interval [−1, 1], for otherwise

lim p (θ) = - \infty

as

θ \to \infty

, so

p^{*} (x) = + \infty

for

x \geq 0

. This requires

A_{\max} > 0

and that the lower root of

A (x) = 0

lies in the interval [0, 1] (since

A (0) = - δ < 0

and

A^{'} (0) > 0

). The root is easily computed to be:

x_{-} = \frac{\sqrt{δ} (1 - \sqrt{1 - 4 r})}{2 r},

Since

r < 1 / 4

, this root is real; letting

u = \sqrt{1 - 4 r}

the condition

x_{-} < 1

becomes

δ < \frac{1}{4} {(1 + u)}^{2},

Since we will eventually identify

δ

with

ϵ

, the above inequality is identical to (33). Since

0 \leq u \leq 1

, the infimum of the right side is 1/4.

The problem of computing

p^{*} (x)

is then well defined. We note that, if the point at which

A (x) = A_{\max}

lies in the unit interval, then, as

θ \to 1 / 2 A_{\max}

, the integral looks as if it might diverge. However, it remains finite, because logarithmic singularities are integrable; e.g.,

\int_{0}^{1} d x log (x) = {lim}_{ϵ \to 0} \int_{ϵ}^{1} d x log (x) = {lim}_{ϵ \to 0} {[x log (x) - x]}_{ϵ}^{1} = - 1,

not

- \infty

. However, the derivative

p^{'} (θ)

will go to infinity at the boundaries (Ellis calls such a function “steep” and it is an assumption of his theorem).

Assuming that

0 < {\underset{̲}{p}}^{*} < \infty

, it will suffice for Theorem Two to know that for some

η^{'} > 0

:

\begin{matrix} {\underset{̲}{p}}^{*} + η^{'} & \leq & β (1 - ϵ); η^{'} \leq β (1 - ω ϵ); \end{matrix}

We can define

η^{'}

to saturate the second inequality above (note that

ω ϵ < 1

), which also yields

δ = ϵ

. The conclusion of the theorem follows.

Can we compute

p (θ)

and

{\underset{̲}{p}}^{*} (y)

? In fact, the integral defining

p (θ)

is elementary and can be computed as follows. First, integrate by parts:

\begin{matrix} p (θ) & = & - \frac{1}{4} \int_{- 1}^{1} d x log \{1 - 2 θ A (x)\} = - \frac{1}{4} \int_{- 1}^{1} d x x^{'} log \{1 - 2 θ A (x)\} \\ = & - \frac{1}{4} {[x log \{1 - 2 θ A (x)\}]}_{- 1}^{1} - \frac{θ}{2} \int_{- 1}^{1} d x \frac{x A^{'} (x)}{1 - 2 θ A (x)} . \end{matrix}

There are two cases, depending on whether the denominator in the integrand factorizes or not. For

θ

positive, it does not; hence, the denominator is an irreducible quadratic. For some negative

θ

it does factorize. We can proceed by elementary linear and trig substitutions, yielding either log(linear), log(quadratic), or arctan(quadratic) terms. Clearly, given such formulas with nonalgebraic functions, computing the supremum cannot be expected in closed form, so we would resort to the computer, but do not report detailed results here.
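A minimal numerical sketch of the computer evaluation mentioned above is given here. It approximates p(θ) in (A45) by a midpoint rule and the Legendre transform in (A46) by a grid search over the admissible range (A47). The parameter values ω = 1.2 and ϵ = 0.1, the grid sizes, and all function names are our own assumptions for illustration; this is not the authors' code, and no numerical results from the paper are reproduced.

```python
import math

def make_A(omega, eps):
    """A(x) = sqrt(delta) x - r x^2 - delta, with r = (omega - 1)/omega and delta = eps."""
    r = (omega - 1.0) / omega
    delta = eps
    return lambda x: math.sqrt(delta) * x - r * x * x - delta

def theta_range(A, steps=2000):
    """Approximate (1/(2 A_min), 1/(2 A_max)) over [-1, 1]; needs A_min < 0 < A_max."""
    h = 2.0 / steps
    values = [A(-1.0 + (i + 0.5) * h) for i in range(steps)]
    return 1.0 / (2.0 * min(values)), 1.0 / (2.0 * max(values))

def p(theta, A, steps=2000):
    """Midpoint approximation of (A45); returns +inf if 1 - 2 theta A(x) <= 0 somewhere."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        arg = 1.0 - 2.0 * theta * A(-1.0 + (i + 0.5) * h)
        if arg <= 0.0:
            return math.inf
        total += math.log(arg)
    return -0.25 * h * total

def theta_grid(A, points=200):
    """Grid of theta values strictly inside the admissible interval."""
    lo, hi = theta_range(A)
    return [lo + (hi - lo) * (k + 0.5) / points for k in range(points)]

def p_star(y, A, grid):
    """Grid approximation of sup over theta of {theta * y - p(theta)}."""
    best = -math.inf
    for theta in grid:
        value = p(theta, A)
        if math.isfinite(value):
            best = max(best, theta * y - value)
    return best

if __name__ == "__main__":
    A = make_A(omega=1.2, eps=0.1)
    grid = theta_grid(A)
    for y in (0.0, 0.01, 0.05):
        print(y, p_star(y, A, grid))
```

A grid search is crude near the endpoints of (A47), where p′(θ) becomes large; a finer grid or a one-dimensional optimizer would be a natural refinement.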

Appendix A.5. SCW without Wavefunction Energy

The idea is to determine whether

lim_{N \to \infty} \int_{m (ϕ) \geq ϵ} d ϕ exp {- β E_{CW}} / Z = 0,

(A48)

or not, where

Z = \int d ϕ exp {- β E_{CW}} .

(A49)

and similarly with

{m (ϕ) \geq ϵ}

replaced by

{m (ϕ) \leq - ϵ}

. We can rewrite the integrals as:

\begin{matrix} \int_{m (ϕ) \geq ϵ} d ϕ exp {- f} / Z, Z = \int d ϕ exp {- f}, \end{matrix}

(A50)

where

\begin{matrix} f = β N (1 - \sum | ϕ_{n} |^{2} g_{n}^{2}), g_{n} = 1 - 2 n / N . \end{matrix}

and, as usual, we have added a term so that

0 \leq f \leq β N

. To compare these integrals in numerator and denominator, we turn to the UIF, setting

B = {m (ϕ) \geq ϵ}

in the numerator and B equal to the whole sphere in the denominator. Thus, for the numerator we will have to estimate:

| {β N (1 - \sum | ϕ_{n} |^{2} g_{n}^{2}) \leq β N x; \sum | ϕ_{n} |^{2} g_{n} \geq ϵ} |,

(A51)

which, replacing wavefunctions on the sphere by i.i.d. Gaussians, is equivalent to:

P [(1 - x) \sum χ_{n}^{2} \leq \sum χ_{n}^{2} g_{n}^{2}; \sum χ_{n}^{2} g_{n} \geq ϵ \sum χ_{n}^{2}],

(A52)

(since

ϕ_{n}

has two components, the sums are now over 2N + 2 rather than N + 1 indices, with the

g_{n}

repeated; however, as we are taking the limit as

N \to \infty

we do not bother with factors of two everywhere) which is the same as writing (introducing factors of

1 / N

for later purposes):

P [1 / N \sum χ_{n}^{2} (1 - x - g_{n}^{2}) \leq 0; 1 / N \sum χ_{n}^{2} (g_{n} - ϵ) \geq 0] .

(A53)

To motivate the appeal to the Gärtner–Ellis theorem, let

P_{2; N} (x)

stand for the probability of the set appearing above and

P_{1; N} (x)

for the probability with the second restriction dropped. We are interested in the ratio:

p_{N} (x) = \frac{P_{2; N} (x)}{P_{1; N} (x)},

(A54)

which has the interpretation of the conditional probability that the magnetization is greater than $ϵ$, given a bound “x” on the energy. We expect that

lim_{N \to \infty} p_{N} (x) = 0 .

(A55)
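For moderate N, the conditional probability in (A54) can be explored by direct simulation. The following is only an illustrative sketch, not the computation used in this appendix: the function name and its parameters are hypothetical, and it assumes that each index n carries two i.i.d. standard Gaussians (real and imaginary part), so that every $g_{n}$ is used twice.

```python
import random

def conditional_magnetization_prob(N, x, eps, samples=5000, seed=0):
    """Monte Carlo estimate of p_N(x) = P_{2;N}(x) / P_{1;N}(x), cf. (A53)-(A54).

    Assumption: two i.i.d. standard Gaussians per index n = 0..N.
    """
    rng = random.Random(seed)
    g = [1.0 - 2.0 * n / N for n in range(N + 1)]
    hits_energy = 0   # samples satisfying the energy bound (first event)
    hits_both = 0     # samples satisfying both events of (A53)
    for _ in range(samples):
        energy_sum = 0.0
        magnet_sum = 0.0
        for gn in g:
            # sum of two squared standard Gaussians attached to this index
            chi2 = rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
            energy_sum += chi2 * (1.0 - x - gn * gn)
            magnet_sum += chi2 * (gn - eps)
        if energy_sum <= 0.0:
            hits_energy += 1
            if magnet_sum >= 0.0:
                hits_both += 1
    if hits_energy == 0:
        return float("nan")   # the energy event was never observed
    return hits_both / hits_energy
```

Such a simulation is only practical when the events are not too rare; for small x or large N the probabilities are exponentially small, which is why the large-deviation analysis is needed.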

Then, integrating over x (as in the UIF) we shall arrive at the result. Noting that the events considered have the form of averages of random variables whose means lie outside the indicated bounds, we expect large-deviation asymptotics for both, and so it is natural to consider

\lim_{N \to \infty} - \frac{1}{N} \log p_{N} (x) = \lim_{N \to \infty} - \frac{1}{N} \log P_{2; N} (x) + \lim_{N \to \infty} \frac{1}{N} \log P_{1; N} (x)

(A56)

and to treat the two terms separately and then compare. Here we use $\lim_{N} \frac{1}{N} \log P_{N} (S) = - I (S)$, with $I (S) = \inf_{y \in S} I (y)$, for the sets S involved. Strictly speaking, a large deviation principle gives only upper and lower bounds; to have the equality we need S to be I-continuous, i.e., $I (\mathrm{int} (S)) = I (\mathrm{cl} (S))$, see page 30 of [39]. This is the case here.

To implement the Gärtner–Ellis procedure, we introduce the random vector with two components:

Y_{N} = (1 / N \sum χ_{n}^{2} (1 - x - g_{n}^{2}); 1 / N \sum χ_{n}^{2} (g_{n} - ϵ)),

(A57)

so we are interested in

P [Y_{N} \in [Upper left quadrant of R^{2}]] .

(A58)

To apply G-E we must compute:

\begin{aligned}
c (θ_{1}, θ_{2}) &= \lim_{N \to \infty} \frac{1}{N} \log E \exp \{N (Y_{N; 1} θ_{1} + Y_{N; 2} θ_{2})\} \\
&= \lim_{N \to \infty} \frac{- 1}{2 N} \sum_{n} \log \{1 - 2 [θ_{1} (1 - x - g_{n}^{2}) + θ_{2} (g_{n} - ϵ)]\} \\
&= - \frac{1}{2} \int_{- 1}^{1} d y \log \{1 - 2 [θ_{1} (1 - x - y^{2}) + θ_{2} (y - ϵ)]\} \\
&= - \frac{1}{2} \int_{- 1}^{1} d y \log q (y);
\end{aligned}

(A59)

where

\begin{aligned}
q (y) &= 1 - 2 [θ_{1} (1 - x - y^{2}) + θ_{2} (y - ϵ)] \\
&= 2 θ_{1} y^{2} - 2 θ_{2} y + b,
\end{aligned}

(A60)

where $b = 1 - 2 θ_{1} (1 - x) + 2 θ_{2} ϵ$ is a constant depending on the thetas.
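The passage from the sum to the integral in (A59) can be checked numerically. The sketch below is illustrative only (function names are hypothetical); it assumes two Gaussian components per index n, so that each index contributes $- \log q (g_{n})$ to the log moment generating function, and it uses a plain trapezoid rule for the limit.

```python
import math

def q(y, th1, th2, x, eps):
    """q(y) = 1 - 2 h(y), as in (A60)."""
    return 1.0 - 2.0 * (th1 * (1.0 - x - y * y) + th2 * (y - eps))

def c_finite(N, th1, th2, x, eps):
    """(1/N) log E exp{N (theta . Y_N)} for independent Gaussians.

    Each chi^2 contributes -1/2 log q(g_n); with two components per index
    (an assumption), every index n = 0..N contributes -log q(g_n).
    """
    total = 0.0
    for n in range(N + 1):
        g = 1.0 - 2.0 * n / N
        total += math.log(q(g, th1, th2, x, eps))
    return -total / N

def c_limit(th1, th2, x, eps, panels=20000):
    """-1/2 times the integral of log q over [-1, 1], by the trapezoid rule."""
    step = 2.0 / panels
    total = 0.5 * (math.log(q(-1.0, th1, th2, x, eps))
                   + math.log(q(1.0, th1, th2, x, eps)))
    for i in range(1, panels):
        total += math.log(q(-1.0 + i * step, th1, th2, x, eps))
    return -0.5 * total * step
```

For a point with $q > 0$ on $[- 1, 1]$, for instance $(θ_{1}, θ_{2}) = (- 0.5, 0.1)$ with $x = 0.7$ and $ϵ = 0.3$, the difference between `c_finite` and `c_limit` shrinks roughly like $1 / N$.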

We must first determine the region in the $(θ_{1}, θ_{2})$ plane in which $c (θ_{1}, θ_{2}) < + \infty$. Defining

h (y) = h (y; θ_{1}, θ_{2}) = θ_{1} (1 - x - y^{2}) + θ_{2} (y - ϵ);

(A61)

this region, call it D, is defined by:

D = \{(θ_{1}, θ_{2}) : h (y) \leq \frac{1}{2} for all y, - 1 \leq y \leq 1\} .

(A62)

We observe that D is the domain where $c (θ_{1}, θ_{2})$ is finite, which is related to the conditions for the validity of the Gärtner–Ellis theorem; it is not the domain of the logarithm's argument. The geometry of this region is a bit complicated. These are the tests for whether a point lies in D:

Test 1: $h (1) \leq 1 / 2$ ;

Test 2: $h (- 1) \leq 1 / 2$ ;

Test 3: if $y_{c} = θ_{2} / (2 θ_{1})$ , the critical point of $h (y)$ , lies in [−1, 1], then test whether $h (y_{c}) \leq 1 / 2$ .
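The three tests translate directly into a membership check. The following is a minimal sketch (the function names are hypothetical and this is not the program used for the figures); it treats $θ_{1} = 0$ separately, since h is then linear in y and has no interior critical point.

```python
def h(y, th1, th2, x, eps):
    """h(y; theta_1, theta_2) as defined in (A61)."""
    return th1 * (1.0 - x - y * y) + th2 * (y - eps)

def in_domain(th1, th2, x, eps):
    """Return True if (th1, th2) passes Tests 1-3, i.e. h <= 1/2 on [-1, 1]."""
    if h(1.0, th1, th2, x, eps) > 0.5:        # Test 1
        return False
    if h(-1.0, th1, th2, x, eps) > 0.5:       # Test 2
        return False
    if th1 != 0.0:                            # Test 3 (h is linear when th1 == 0)
        yc = th2 / (2.0 * th1)                # critical point of h
        if -1.0 <= yc <= 1.0 and h(yc, th1, th2, x, eps) > 0.5:
            return False
    return True
```

With $x = 0.7$ and $ϵ = 0.3$, for example, the origin lies in D, while the point $(- 1, 0)$, which is to the left of P, fails Test 1.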

Now see Figure A1. By Tests 1 and 2, D lies between the lines A and B, and to the right of point P. Test 3 applies in between lines E and F; to the right of the vertical axis, it is satisfied within the oblique ellipse, curve C; to the left of the axis, $h (y_{c})$ is negative. Hence we obtain a diamond-shaped compact region of the plane.

Figure A1. Regions used in describing the domain D where $c (θ_{1}, θ_{2})$ is finite. The lines are obtained from the Tests above. Lines A and B are, respectively, $θ_{2} = θ_{1} \frac{x}{1 - ϵ} + \frac{1}{2 (1 - ϵ)}$, from $h (1) = 1 / 2$, and $θ_{2} = - θ_{1} \frac{x}{1 + ϵ} - \frac{1}{2 (1 + ϵ)}$, from $h (- 1) = 1 / 2$. To the left of P, Tests 1 and 2 cannot be satisfied simultaneously. Lines E and F are, respectively, $θ_{2} = 2 θ_{1}$, from $y_{c} = 1$, and $θ_{2} = - 2 θ_{1}$, from $y_{c} = - 1$. The ellipse C is $4 θ_{1}^{2} (1 - x) + θ_{2}^{2} - 4 θ_{1} θ_{2} ϵ - 2 θ_{1} = 0$, obtained from $h (y_{c}) = 1 / 2$ in Test 3, to be applied for $θ_{1} \geq 0$ (for $θ_{1} < 0$ Test 3 is satisfied).

We note that $c (θ_{1}, θ_{2})$ is convex (by examination of its Hessian matrix, omitted) and “steep”, meaning its derivatives approach infinity at the boundaries of D. We have an ellipse as in Figure A1 if $(1 - x) - ϵ^{2} > 0$, otherwise a hyperbola. We restrict the study to the ellipse case, since it is the relevant one: we are primarily interested in small $ϵ > 0$ and, from the proof after (A105), one can see that small x are the relevant ones for low temperatures. Therefore, given $ϵ$, we always restrict to x small enough to satisfy $(1 - x) - ϵ^{2} > 0$.

The G-E procedure is to compute the dual function:

f (z_{1}, z_{2}) = sup_{θ_{1}, θ_{2}} [z_{1} θ_{1} + z_{2} θ_{2} - c (θ_{1}, θ_{2})];

(A63)

from which one can compute the I-function as:

I_{2} = I_{2} (x; ϵ) = inf \{f (z_{1}, z_{2}) : z_{1} \leq 0; z_{2} \geq 0\} .

(A64)

By the convexity and steepness of c, the supremum in (A63) is attained at a critical point for which

\begin{matrix} \frac{\partial c}{\partial θ_{1}} = z_{1}; \\ \frac{\partial c}{\partial θ_{2}} = z_{2}; \end{matrix}

(A65)

so, looking ahead to a possible computer search (for which we do not want to solve equations but rather evaluate formulas), we define

k (θ_{1}, θ_{2}) = \frac{\partial c}{\partial θ_{1}} θ_{1} + \frac{\partial c}{\partial θ_{2}} θ_{2} - c (θ_{1}, θ_{2}),

(A66)

which, by (A65), is the value of $f (z_{1}, z_{2})$ at $z = \nabla c (θ_{1}, θ_{2})$. Let G denote the constraint set in the $(θ_{1}, θ_{2})$-plane:

G = \{(θ_{1}, θ_{2}) : \frac{\partial c}{\partial θ_{1}} \leq 0; \frac{\partial c}{\partial θ_{2}} \geq 0\} .

(A67)

We can express the I-function as

I_{2} = I_{2} (x; ϵ) = \inf \{k (θ_{1}, θ_{2}) : (θ_{1}, θ_{2}) \in G \cap D\} .

(A68)

For the proof we have to compare the I-function for the two-variable, two-inequality problem with that of the one-variable, one-inequality problem, given by

I_{1} = I_{1} (x) = inf \{k (θ_{1}, 0) : θ_{1} \in L; \frac{\partial c}{\partial θ_{1}} (θ_{1}, 0) \leq 0\},

(A69)

where L is the part of the horizontal axis that lies in D. A first, critical question is whether necessarily $I_{2} \leq I_{1}$, which would hold if the set over which we compute the infimum for $I_{2}$ contained the set over which we compute the infimum for $I_{1}$. However, this is not the case. We have:

\frac{\partial c}{\partial θ_{2}} = \int_{- 1}^{1} d y \frac{y}{q (y)} - ϵ \int_{- 1}^{1} d y \frac{1}{q (y)},

(A70)

where $q (y) = 1 - 2 h (y) \geq 0$ by the restriction that $(θ_{1}, θ_{2}) \in D$. The first integral can be done easily by elementary calculus:

\begin{aligned}
\int_{- 1}^{1} d y \frac{y}{q (y)} &= \frac{1}{4 θ_{1}} \int_{- 1}^{1} d y \frac{4 θ_{1} y - 2 θ_{2} + 2 θ_{2}}{2 θ_{1} y^{2} - 2 θ_{2} y + b} \\
&= \frac{1}{4 θ_{1}} \log \left\{\frac{2 θ_{1} - 2 θ_{2} + b}{2 θ_{1} + 2 θ_{2} + b}\right\} + \frac{θ_{2}}{2 θ_{1}} \int_{- 1}^{1} d y \frac{1}{q (y)} .
\end{aligned}

(A71)

Now note that, at $θ_{2} = 0$, this integral is zero (q is then even in y), and so is the first term in (A70), while the second term there is negative. Hence $\{\partial c / \partial θ_{2} \geq 0\}$ is disjoint from L.

We need a formula for the first partial of c:

\frac{\partial c}{\partial θ_{1}} = \int_{- 1}^{1} d y \{\frac{1 - x - y^{2}}{2 θ_{1} y^{2} - 2 θ_{2} y + b}\} .

(A72)

By long division we can write:

\frac{1 - x - y^{2}}{2 θ_{1} y^{2} - 2 θ_{2} y + b} = A + \frac{B + C y}{q (y)},

(A73)

where

A = - \frac{1}{2 θ_{1}}; B = 1 - x - A b; C = 2 θ_{2} A .

(A74)

Hence:

\begin{aligned}
\frac{\partial c}{\partial θ_{1}} &= \int_{- 1}^{1} d y A + B \int_{- 1}^{1} d y \left\{\frac{1}{q (y)}\right\} + C \int_{- 1}^{1} d y \left\{\frac{y}{q (y)}\right\} \\
&= 2 A + \left(\frac{C θ_{2}}{2 θ_{1}} + B\right) \int_{- 1}^{1} d y \left\{\frac{1}{q (y)}\right\} + \frac{C}{4 θ_{1}} \log \left\{\frac{q (1)}{q (- 1)}\right\},
\end{aligned}

(A75)

where we have plugged in (A71) for one integral.

Note that we have reduced computing these derivatives to computing one integral, that of $1 / q (y)$ over the interval $[- 1, 1]$. There are two cases, depending on whether $q (y)$ is irreducible or factorizes. With b as fixed by (A60), the discriminant is:

disc. = 4 θ_{2}^{2} - 8 θ_{1} b = 4 θ_{2}^{2} - 8 θ_{1} [1 - 2 θ_{1} (1 - x) + 2 θ_{2} ϵ] .

If disc. $> 0$, q factorizes as

q (y) = 2 θ_{1} (y - r_{+}) (y - r_{-}),

(A76)

where the two roots are outside the interval $[- 1, 1]$. The integrals can then be performed by partial fractions, yielding logarithmic terms:

\int_{- 1}^{1} d y \{\frac{1}{q (y)}\} = \frac{1}{2 θ_{1} (r_{-} - r_{+})} log [\frac{(1 - r_{-}) (1 + r_{+})}{(1 - r_{+}) (1 + r_{-})}] .

(A77)

If disc. $< 0$, in our case q can be written:

q (y) = A^{'} (y - B^{'})^{2} + C^{'},

(A78)

with $A^{'} = 2 θ_{1} > 0$, $B^{'} = θ_{2} / (2 θ_{1})$ and $C^{'} = b - θ_{2}^{2} / (2 θ_{1}) > 0$, and one makes a trig substitution:

tan (u) = \sqrt{\frac{A^{'}}{C^{'}}} (y - B^{'}) .

(A79)

The result contains inverse trig functions:

\int_{- 1}^{1} d y \{\frac{1}{q (y)}\} = \frac{1}{\sqrt{A^{'} C^{'}}} [arctan \{\sqrt{\frac{A^{'}}{C^{'}}} (1 - B^{'})\} - arctan \{\sqrt{\frac{A^{'}}{C^{'}}} (- 1 - B^{'})\}] .

(A80)
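Formulas (A77) and (A80) lend themselves to a direct numerical cross-check against the trapezoid rule. The sketch below is illustrative and is not the authors' program; the function names are hypothetical, and b is taken from the expansion in (A60), i.e., $b = 1 - 2 θ_{1} (1 - x) + 2 θ_{2} ϵ$. The degenerate cases $θ_{1} = 0$ and disc. $= 0$ are not covered by the two formulas and are rejected.

```python
import math

def inv_q_closed_form(th1, th2, x, eps):
    """Integral of 1/q over [-1, 1] via (A77) or (A80); assumes q > 0 on [-1, 1]."""
    b = 1.0 - 2.0 * th1 * (1.0 - x) + 2.0 * th2 * eps   # constant term of q, from (A60)
    disc = 4.0 * th2 ** 2 - 8.0 * th1 * b
    if th1 == 0.0 or disc == 0.0:
        raise ValueError("degenerate case not covered by (A77)/(A80)")
    if disc > 0.0:
        # factorizable case: q = 2 th1 (y - r_plus)(y - r_minus), formula (A77)
        root = math.sqrt(disc)
        r_plus = (2.0 * th2 + root) / (4.0 * th1)
        r_minus = (2.0 * th2 - root) / (4.0 * th1)
        ratio = ((1.0 - r_minus) * (1.0 + r_plus)) / ((1.0 - r_plus) * (1.0 + r_minus))
        return math.log(ratio) / (2.0 * th1 * (r_minus - r_plus))
    # irreducible case: q = A (y - B)^2 + C, formula (A80)
    A = 2.0 * th1
    B = th2 / (2.0 * th1)
    C = b - th2 ** 2 / (2.0 * th1)
    k = math.sqrt(A / C)
    return (math.atan(k * (1.0 - B)) - math.atan(k * (-1.0 - B))) / math.sqrt(A * C)

def inv_q_trapezoid(th1, th2, x, eps, panels=20000):
    """Same integral by the composite trapezoid rule."""
    def inv_q(y):
        return 1.0 / (1.0 - 2.0 * (th1 * (1.0 - x - y * y) + th2 * (y - eps)))
    step = 2.0 / panels
    total = 0.5 * (inv_q(-1.0) + inv_q(1.0))
    for i in range(1, panels):
        total += inv_q(-1.0 + i * step)
    return total * step
```

With $x = 0.7$ and $ϵ = 0.3$, the point $(- 0.5, 0.1)$ exercises the factorizable branch and the point $(0.5, 0.2)$ the irreducible one.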

To check these calculus formulas (and our computer implementation of them) we computed the integral directly (numerically) using the trapezoid rule and compared. We also need a formula for c, which we can obtain by integrating by parts:

\begin{aligned}
c &= - \frac{1}{2} \int_{- 1}^{1} d y \log q (y) = - \frac{1}{2} \int_{- 1}^{1} d y \, (y)^{'} \log q (y) \\
&= - \frac{1}{2} \log [q (1) q (- 1)] + \frac{1}{2} \int_{- 1}^{1} d y \left\{\frac{y q^{'}}{q}\right\} .
\end{aligned}

(A81)

Here

\frac{1}{2} \int_{- 1}^{1} d y \{\frac{y q^{'}}{q}\} = \int_{- 1}^{1} d y \{\frac{2 θ_{1} y^{2} - θ_{2} y}{2 θ_{1} y^{2} - 2 θ_{2} y + b}\},

(A82)

and applying long division again:

\{\frac{2 θ_{1} y^{2} - θ_{2} y}{q}\} = A + \frac{B y + C}{q},

(A83)

with

A = 1; B = θ_{2}; C = - b .

(A84)

Thus we get

\frac{1}{2} \int_{- 1}^{1} d y \{\frac{y q^{'}}{q}\} = 2 + θ_{2} \int_{- 1}^{1} d y \{\frac{y}{q}\} - b \int_{- 1}^{1} d y \{\frac{1}{q}\} .

(A85)

Combining with previous results yields:

c = 2 - \frac{1}{2} log [q (- 1) q (1)] + \frac{θ_{2}}{4 θ_{1}} log [\frac{q (1)}{q (- 1)}] + \{\frac{θ_{2}^{2}}{2 θ_{1}} - b\} \int_{- 1}^{1} d y \{\frac{1}{q}\} .

(A86)

For the function k we can compute from its definition, but also some simplifications accrue:

\begin{aligned}
k (θ_{1}, θ_{2}) &= \frac{\partial c}{\partial θ_{1}} θ_{1} + \frac{\partial c}{\partial θ_{2}} θ_{2} - c (θ_{1}, θ_{2}) \\
&= \int_{- 1}^{1} d y \left\{\frac{θ_{1} (1 - x - y^{2})}{q (y)} + \frac{θ_{2} (y - ϵ)}{q (y)}\right\} - c (θ_{1}, θ_{2}) \\
&= \int_{- 1}^{1} d y \left\{\frac{h (y)}{1 - 2 h (y)}\right\} - c (θ_{1}, θ_{2}) \\
&= - \frac{1}{2} \int_{- 1}^{1} d y \left\{\frac{- 2 h (y) + 1 - 1}{1 - 2 h (y)}\right\} - c (θ_{1}, θ_{2}) \\
&= - 1 + \frac{1}{2} \int_{- 1}^{1} d y \left\{\frac{1}{q (y)}\right\} + \frac{1}{2} \int_{- 1}^{1} d y \log (q (y)) .
\end{aligned}

(A87)
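Identities (A86) and (A87) can be verified numerically at any interior point of D. The following sketch is illustrative (hypothetical function names, Simpson's rule for every integral, and b taken from the expansion in (A60)); it returns c computed directly and via (A86), and k computed from its definition (A66) and via (A87).

```python
import math

def simpson(f, a, b, panels=4000):
    """Composite Simpson rule on [a, b]; panels must be even."""
    step = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0

def check_c_and_k(th1, th2, x, eps):
    """Return (c_direct, c_from_A86, k_from_definition, k_from_A87)."""
    b = 1.0 - 2.0 * th1 * (1.0 - x) + 2.0 * th2 * eps      # from (A60)
    q = lambda y: 2.0 * th1 * y * y - 2.0 * th2 * y + b
    inv_q = simpson(lambda y: 1.0 / q(y), -1.0, 1.0)
    log_q = simpson(lambda y: math.log(q(y)), -1.0, 1.0)
    c_direct = -0.5 * log_q                                 # definition (A59)
    c_a86 = (2.0 - 0.5 * math.log(q(-1.0) * q(1.0))
             + th2 / (4.0 * th1) * math.log(q(1.0) / q(-1.0))
             + (th2 ** 2 / (2.0 * th1) - b) * inv_q)        # formula (A86)
    dc1 = simpson(lambda y: (1.0 - x - y * y) / q(y), -1.0, 1.0)   # (A72)
    dc2 = simpson(lambda y: (y - eps) / q(y), -1.0, 1.0)           # (A70)
    k_def = th1 * dc1 + th2 * dc2 - c_direct                # definition (A66)
    k_a87 = -1.0 + 0.5 * inv_q + 0.5 * log_q                # last line of (A87)
    return c_direct, c_a86, k_def, k_a87
```

Agreement of the two values of c (and of k) to quadrature accuracy is a useful sanity check before running any search over the constraint set.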

For the theorem we have to evaluate the asymptotics of the ratio:

\frac{exp {- N β} | B | + N β \int_{0}^{1} d x exp {- N g_{2} (x)}}{exp {- N β} + N β \int_{0}^{1} d x exp {- N g_{1} (x)}},

(A88)

where B is the set appearing in (A53) and

g_{1, 2} (x) = β x + I_{1, 2} (x) .

(A89)

We next list some properties of the I-functions that will imply the theorem’s conclusion.

Assumptions on $I_{2} (x)$ and $I_{1} (x)$:

(a) $I_{2} (x) > I_{1} (x)$ for $x \in (0, 1]$;

(b) $I_{1, 2}$ are differentiable, monotone decreasing and convex on $(0, 1]$;

(c) $\lim_{x \to 0} I_{1, 2} (x) = + \infty$;

(d) $I_{2} (1) > 0$;

(e) $I_{1} (x) = 0$ for $x \in [2 / 3, 1]$.

(A90)

The rate functionals are defined as

I_{1, 2} (x) = - lim_{N \to \infty} 1 / N log P_{1, 2; N} (x) .

(A91)

Since they are computed from independent Gaussian variables, even if not identically distributed [40], the corrections to the large deviations are $O (N^{- 1 / 2})$:

P_{1, 2; N} (x) = exp [- N (I_{1, 2} (x) + O (\frac{log N}{N} h_{1, 2} (x)))] .

(A92)

Therefore the approximation $P_{1, 2; N} (x) \approx \exp [- N I_{1, 2} (x)]$ is very good for large N. If $I_{2} (x) \neq I_{1} (x)$, then necessarily $I_{2} (x) > I_{1} (x)$. The study of their large-deviation corrections should prove it.

For (a), we know from the inequality $P_{2; N} (x) < P_{1; N} (x)$ (for fixed N) and the Gärtner–Ellis Theorem that $I_{2} (x) \geq I_{1} (x)$, and we showed that these functions result from infima of the same function over different sets; therefore, it is implausible that they are exactly equal for all x smaller than a given $x^{*}$. (Because of property (b) and $I_{2} (x) \geq I_{1} (x)$, if there exists an $x^{*}$ such that $I_{1} (x^{*}) = I_{2} (x^{*})$, then they have to coincide for all x smaller than $x^{*}$. We supply some evidence that this is not so from numerical approximations; see the Computational Appendix A.6 and Figure A3.) For (b), from the approximation $P_{1, 2; N} (x) \approx \exp [- N I_{1, 2} (x)]$ and $P_{i; N} (x^{'}) > P_{i; N} (x)$ for $x^{'} > x$, one expects strictly decreasing monotonicity. This is indeed what we observed in the computer computations. Differentiability and strictly decreasing monotonicity can be deduced from the Envelope Theorem in [41], p. 605, applied to (A63) and (A64).

The function $f (z_{1}, z_{2})$ is strictly convex, and so is the Lagrangian $f (z_{1}, z_{2}) - λ \cdot z$ that has to be optimized when the optimizer lies on the boundary. Applying the Envelope Theorem, the derivative on the right-hand side of (A93) coincides with that of f because the constraint functions do not depend on x. Starting from $\inf_{z} f (z_{1}, z_{2})$, and calling $θ^{*} (z)$ the solution of (A63) and $z^{*}$ that of (A64), we have

\begin{matrix} \frac{d I_{i}}{d x} = \frac{\partial f (θ^{*} (z^{*} (x), x), z^{*} (x), x)}{\partial x} . \end{matrix}

(A93)

The smoothness of $θ^{*}$ and $z^{*}$ comes from the implicit function theorem. The right-hand side gives $θ_{1}^{*} \int_{- 1}^{1} \frac{d y}{q}$; since $q > 0$, it is negative if and only if $θ_{1}^{*} < 0$. We know that $I_{i} (x)$ is decreasing by definition, so only $θ_{1}^{*} < 0$ or $θ_{1}^{*} = 0$ is possible. For $I_{1} (x)$ the second case gives $I_{1} (x) = 0$, which can happen only for $x \geq 2 / 3$ by definition; therefore, $θ_{1}^{*} < 0$ when $I_{1} (x)$ is not trivial. For $I_{2} (x)$ the value $θ_{1}^{*}$ stays in a set disjoint from the axis $θ_{1} = 0$; therefore, only $θ_{1}^{*} < 0$ is possible, since $I_{2} (x)$ is decreasing by definition.

Second-order differentiability requires developing second-order Envelope Theorems; of course, it will not be $\frac{d^{2} I_{i}}{d x^{2}} = \frac{\partial^{2} f}{\partial x^{2}}$. Differentiability of the rate functionals can also be deduced from differentiability of the probability functions $P_{i; N} (x)$; in our case one can apply Corollary 4.1 of [42] and Corollary 32 of [43] because of the quadratic structure in the $χ_{n}$’s and other conditions that are verified. These results should extend to second-order differentiability without problems [44]. We cannot deduce the convexity property analytically (varying x, we restrict in a continuous way the integration region, a regular set, of a Gaussian integral); instead, we observe this property in the numerical computations of the exact formulas for the rate functionals, see Appendix A.6, within the limits of our computer analysis, which did not reach below $x = 0.1$. There we cannot obtain good numerical estimates, since the sampling region shrinks to an infinitesimal volume and we have a vertical asymptote, as proved later for (c). Near this vertical asymptote convexity is natural. In any case, from the proof after (A105), if there is an inflection point for some unexpected reason, the computations can be applied at smaller x (i.e., lower temperature), where the functionals are convex.

From (A53) we have (d), since the first event for $x = 1$ is the sure event, while the second never contains the average for any $ϵ > 0$ (see also the Computational Appendix A.6 for a numerical computation).

Furthermore, (e) follows from a computation in a previous section, see (A14), and the remark that LD results from probabilities of sets that do not contain the mean; otherwise, by the Central Limit Theorem such probabilities go to one.

That leaves (c). It would follow immediately from the observation that $P_{1, 2} (0) = 0$ (since $g_{n} \leq 1$) if we could assume that two limits can be interchanged. As that is not obvious, we supply a proof using previously obtained formulas. Since certainly $I_{2} (x) \geq I_{1} (x)$, it suffices to prove (c) for the latter. Therefore, we set $θ_{2} = 0$ in the following. From (A75) we note that

\frac{\partial c}{\partial θ_{1}} = \frac{1}{θ_{1}} \left\{- 1 + \frac{1}{2} \int_{- 1}^{1} d y \frac{1}{q (y)}\right\} .

(A94)

Let us define for a given x:

H = H (θ_{1}) = - 1 + \frac{1}{2} \int_{- 1}^{1} d y \frac{1}{q (y)} .

(A95)

We note $H (0) = 0$, since $q = 1$ there, and

\begin{aligned}
\frac{\partial H}{\partial θ_{1}} (0) &= - \frac{1}{2} \int_{- 1}^{1} d y \{2 y^{2} - 2 (1 - x)\} \\
&= \frac{4}{3} - 2 x;
\end{aligned}

(A96)

so, if $x < 2 / 3$, $\partial H / \partial θ_{1} > 0$ at $θ_{1} = 0$. Furthermore:

\frac{\partial^{2} H}{\partial θ_{1}^{2}} = \int_{- 1}^{1} d y \frac{{\{2 y^{2} - 2 (1 - x)\}}^{2}}{q {(y)}^{3}} \geq 0,

(A97)

since, by definition of the domain D, $q (y) \geq 0$ for all y in [−1, 1]. Hence, H is convex.

Now consider the left-most point of the domain D on the horizontal axis, labeled “P” in Figure A1, which has $θ_{1}$ coordinate $- 1 / (2 x)$. By definition, as $θ_{1} \to P$, $q (y) \to 0$ for some $y \in [- 1, 1]$ (here, at the endpoints $y = \pm 1$), so $H \to \infty$. Since $H (0) = 0$ with positive slope there, H is negative just to the left of the origin; thus $H (θ_{1})$ must have a second zero on the line, to the right of P; let us label it Q.

The constraint set is $H \geq 0$, so from (A94) we conclude that it lies entirely to the left of Q (and of course to the right of P).

Some computer work indicates that Q is very close to P for small x, which explains why we could not locate the constraint set by sampling for $x < 0.1$. Hence, suppose that $Q (x) = P (x) + δ (x)$ with $δ (x)$ bounded above (or even, as we shall see, tending to zero) as $x \to 0$. From (A87) we have

k (θ_{1}, 0) = H + \frac{1}{2} \int_{- 1}^{1} d y log {q (y)} .

(A98)

Note that, substituting $- 1 / (2 x) + η$ for $θ_{1}$,

q (y) = \frac{1 - y^{2}}{x} + 2 η y^{2} - 2 η (1 - x) .

(A99)

Therefore, in the constraint set, with the above assumption, $η \leq δ (x)$, so the second term in k goes to infinity as $x \to 0$. Hence, since the first term is nonnegative by the constraint,

inf {k (θ_{1}, 0) : θ_{1} \in D} \to \infty,

(A100)

as $x \to 0$, which yields (c).

In order to prove that $Q (x) \to P (x)$ we apply the exact formula for the integral appearing in $H (θ_{1})$, which, with $θ_{2} = 0$, lies in the factorizable case, see (A77). Letting

s = \sqrt{\frac{1}{- 2 θ_{1}} + 1 - x},

(A101)

we find

H = (\frac{s^{2} + x - 1}{4 s}) log \{\frac{{(1 + s)}^{2}}{{(1 - s)}^{2}}\} - 1 .

(A102)

Next, let $t = s - 1$, so that $t = 0$ corresponds to $θ_{1} = - 1 / (2 x)$; i.e., $P (x) = Q (x)$. Rewriting the equation for the root $H = 0$ gives:

\log (t) = \log (2 + t) - \frac{2 (t + 1)}{t^{2} + 2 t + x};

(A103)

so exponentiating both sides:

t = (2 + t) exp \{\frac{- 2 (t + 1)}{t^{2} + 2 t + x}\} .

(A104)

Note that, if x is small, the right-hand side of (A104) is very small at $t = 0$ and remains so up to small values of t, while the left-hand side follows t. For instance, if $x = 0.01$, up to $t = 0.01$ the RHS is no more than about $\exp \{- 66\}$, which is negligible, while the LHS reaches $0.01$. Thus, as $x \to 0$, the solution of (A104) goes to zero, and hence $Q (x) \to P (x)$ (in fact exponentially fast), which implies (c).
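The exponentially fast approach of Q to P can be illustrated by solving (A104) numerically. The sketch below is an illustration under stated assumptions, not part of the proof: the function names are hypothetical, bisection is used for simplicity, and the default upper bracket `t_hi = 10` is only checked to work for small x (roughly $x \leq 0.5$). The gap $δ (x) = Q (x) - P (x)$ is recovered from $s = 1 + t$ and (A101).

```python
import math

def a104_residual(t, x):
    """Left-hand side minus right-hand side of (A104)."""
    return t - (2.0 + t) * math.exp(-2.0 * (t + 1.0) / (t * t + 2.0 * t + x))

def solve_a104(x, t_hi=10.0, iterations=1200):
    """Bisection for the finite root t > 0 of (A104)."""
    lo, hi = 0.0, t_hi
    if a104_residual(hi, x) <= 0.0:
        raise ValueError("t_hi does not bracket the root")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break                      # interval exhausted in floating point
        if a104_residual(mid, x) < 0.0:
            lo = mid                   # root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gap_q_minus_p(x):
    """delta(x) = Q(x) - P(x), with theta_1 recovered from s = 1 + t via (A101)."""
    t = solve_a104(x)
    s2 = (1.0 + t) ** 2
    q_point = -1.0 / (2.0 * (s2 + x - 1.0))
    p_point = -1.0 / (2.0 * x)
    return q_point - p_point
```

For small x the root is very close to $2 \exp \{- 2 / x\}$, so `gap_q_minus_p` decreases extremely quickly as x decreases.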

Granted these assumptions, we now proceed to the proof of Theorem Two. Property (e) implies that we can replace the denominator in (A88) as follows:

\frac{exp {- N β} | B | + N β \int_{0}^{1} d x exp {- N g_{2} (x)}}{exp {- (2 / 3) N β} + N β \int_{0}^{2 / 3} d x exp {- N g_{1} (x)}},

(A105)

Next, consider whether the g-functions have critical points in the intervals (0, 1) or (0, 2/3). Using a hat to denote those points, we are asking whether solutions exist for:

β + I_{1}^{'} ({\hat{x}}_{1}) = 0; β + I_{2}^{'} ({\hat{x}}_{2}) = 0,

(A106)

First, suppose both exist. If so, since g is convex, the Laplace approximation applies yielding:

\int exp {- N g (x)} \approx \frac{\sqrt{2 π} exp {- N \hat{g}}}{\sqrt{N {\hat{g}}^{″}}},

(A107)

where a hat means evaluate at the critical point.
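The quality of the Laplace approximation (A107) can be gauged on a toy example. In the sketch below, the choice $I (x) = - \log x$ is purely an assumption made for illustration (it is convex, decreasing, and diverges as $x \to 0$, but it is not the rate function of this paper); then $g (x) = β x - \log x$ has its critical point at $\hat{x} = 1 / β$, which lies in (0, 1) for $β > 1$.

```python
import math

def laplace_check(N, beta, panels=20000):
    """Return (Simpson value, Laplace value) for the integral of exp(-N g) on [0, 1].

    Toy assumption: g(x) = beta * x - log(x), i.e. I(x) = -log(x).
    """
    def integrand(x):
        if x <= 0.0:
            return 0.0                 # x^N exp(-N beta x) vanishes at x = 0
        return math.exp(-N * (beta * x - math.log(x)))

    step = 1.0 / panels
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    numeric = total * step / 3.0

    x_hat = 1.0 / beta                          # critical point: beta - 1/x = 0
    g_hat = beta * x_hat - math.log(x_hat)      # equals 1 + log(beta)
    g_second = 1.0 / x_hat ** 2                 # g''(x) = 1 / x^2
    laplace = math.sqrt(2.0 * math.pi) * math.exp(-N * g_hat) / math.sqrt(N * g_second)
    return numeric, laplace
```

For this toy g the relative error of (A107) is of order $1 / N$, so already at N of a few hundred the two values agree to better than one percent.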

We are now prepared to argue that the limit of the ratio in (A105) is always zero. There are four cases, defined by whether the “winner” (largest term asymptotically) in numerator and denominator is the first or second term:

Case 1: numerator, second term; denominator, second term. As the $\hat{g}$ ’s are minima and $g_{2} > g_{1}$ by (a), ${\hat{g}}_{2} > {\hat{g}}_{1}$ and the ratio tends to zero.
Case 2: numerator, first term; denominator, second term. Then ${\hat{g}}_{1} < (2 / 3) β$ and we are looking essentially at the limit of

$\frac{\exp \{- N β\} | B |}{\exp \{- N {\hat{g}}_{1}\}}$

(A108)

which clearly goes to zero.
Case 3: numerator, second term; denominator, first term. Now we are looking at

$\frac{\exp \{- N {\hat{g}}_{2}\}}{\exp \{- (2 / 3) N β\}}$

(A109)

which goes to zero if ${\hat{g}}_{2} > (2 / 3) β$ . The reverse is impossible, as the ratio would tend to infinity while we know it is bounded by one from the original definition.
Case 4: numerator, first term; denominator, first term. The ratio tends to zero.

Since we are interested in low temperatures (that is small x) the presented cases are the relevant ones. However, we can still ask: do these critical points exist? With our assumptions we know that

lim_{x \to 0} I_{1, 2}^{'} (x) = - \infty .

(A110)

If in addition we knew that

lim_{x \to 1} I_{2}^{'} (x) = 0 and lim_{x \to 2 / 3} I_{1}^{'} (x) = 0

(A111)

we could conclude that the critical points exist. If not, for sufficiently small $β$ a critical point might not exist. For instance, let us postulate that $I_{2}^{'} (1) = - c < 0$ or $I_{1}^{'} (2 / 3) = - c < 0$. Then in either case

g^{'} (x) \leq β - c < 0

(A112)

for all x in the relevant interval. In this case g is monotonic and we can make the change of variable $u = g (x)$ to obtain

\begin{aligned}
β N \int_{0}^{1} d x \exp \{- N g (x)\} &= β N \int_{g (1)}^{g (0)} d u \left[\frac{1}{- g^{'} \circ g^{- 1} (u)}\right] \exp \{- N u\} \\
&\leq β N \int_{g (1)}^{g (0)} d u \left[\frac{1}{c - β}\right] \exp \{- N u\} .
\end{aligned}

(A113)

Therefore, for the numerator the integral is:

\leq [\frac{1}{c - β}] exp {- N [β + I_{2} (1)]}

(A114)

(where we have used assumption (c) to drop a term) and so, if $I_{2} (1) > 0$, the first term in the numerator wins. The denominator is at least of order $\exp \{- (2 / 3) β N\}$, so the numerator tends to zero faster whatever term predominates in the denominator.

Appendix A.6. Computational Appendix

After implementing the functions appearing in Math Appendix A.5 in a program, we employed a sampling scheme to locate the constraint set. The idea is to choose at random a ray emanating from point “P” of Figure A1, then a point on the ray staying within the domain D. (We also tried a ray emanating from the origin, which gave similar results.)

By experiment, we discovered that for x near zero the region G shrank to nearly a line close to line A in Figure A1, while the constraint set for generating $I_{1}$ lay almost at the left endpoint, “P”. Hence, we needed sampling schemes that could be biased to prefer points near the endpoints of a given interval [a, b] of the real axis. Such schemes are given by the following: to bias near b, choose a point “s” by the scheme

\begin{aligned}
s &= a + \log (z η u + 1) / η, \\
z &= (\exp \{η (b - a)\} - 1) / η,
\end{aligned}

(A115)

and to bias near a:

s = a - log (1 - z η u exp {- η (b - a)}) / η,

(A116)

with the same expression for z. Plugging in a uniform random variable (produced by the system RNG) for u yields the sampling scheme for $s \in [a, b]$. In the limit $η \to 0$ the sampling is uniform on the interval; large positive $η$ yields bias. We will write $s = \text{biased-sample} (a, b)$ for the random sample.
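The two schemes (A115) and (A116) can be sketched in a few lines. This is an illustrative implementation, not the authors' program: the function name, the `toward_b` flag and the use of `expm1`/`log1p` (to keep the formulas accurate for small $η$) are choices made here. Note that $z η = \exp \{η (b - a)\} - 1$, so z itself never has to be formed.

```python
import math
import random

def biased_sample(a, b, eta, toward_b=True, rng=random):
    """Draw s in [a, b]; eta = 0 is uniform, eta > 0 biases toward one endpoint."""
    u = rng.random()                               # uniform on [0, 1)
    if eta == 0.0:
        return a + (b - a) * u                     # uniform limit of both schemes
    z_eta = math.expm1(eta * (b - a))              # z * eta, with z as in (A115)
    if toward_b:
        return a + math.log1p(z_eta * u) / eta     # scheme (A115)
    shrink = math.exp(-eta * (b - a))
    return a - math.log1p(-z_eta * u * shrink) / eta   # scheme (A116)
```

Under (A115) the density of s is proportional to $\exp \{η (s - a)\}$, and under (A116) to $\exp \{- η (s - a)\}$; very large values of $η (b - a)$ would overflow the exponential in this simple form.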

Our sampling scheme in the region D was the following:

(1) Choose $α$ : $α = \text{biased-sample} (- x / (1 + ϵ), x / (1 - ϵ))$ ;

(2) Choose $θ_{1}$ : $θ_{1} = \text{biased-sample} (- 1 / (2 x), 5 / (1 - x))$ ;

(3) Let $θ_{2} = α [θ_{1} + 1 / (2 x)]$ .

If the selected $(θ_{1}, θ_{2})$ passes all the tests to lie in D, accept the values; otherwise, reject them. To locate the constraint set, we sampled many pairs $(θ_{1}, θ_{2})$ as above and evaluated the two partial derivatives of c, keeping and plotting the points that had the required signs. The results (see Figure A2; parameters in the figure were $x = 0.7$ and $ϵ = 0.3$) show that G lies in the upper half of the plane and is disjoint from the horizontal axis. Figure A3 shows curves of the I-functions obtained by sampling for a few values of x (0.1 to 0.7, in increments of 0.1). We used 10 billion samples at each x-value. Unfortunately, we were unable to estimate the I-functions for $x < 0.1$ because few or no sampled points fell in the constraint set (even with various choices of bias), for the reason indicated earlier.
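The accept/reject procedure just described can be sketched as follows. This is a simplified illustration, not the program behind Figures A2 and A3: it replaces the biased sampler by plain uniform sampling, evaluates the partial derivatives (A70) and (A72) by Simpson's rule rather than by the closed forms, and uses strict inequalities in the domain tests so that $q > 0$ at every quadrature node. All function names are hypothetical.

```python
import random

def h(y, th1, th2, x, eps):
    """h(y; theta_1, theta_2) as in (A61)."""
    return th1 * (1.0 - x - y * y) + th2 * (y - eps)

def in_domain(th1, th2, x, eps):
    """Tests 1-3 for membership in D, with strict inequalities."""
    if h(1.0, th1, th2, x, eps) >= 0.5 or h(-1.0, th1, th2, x, eps) >= 0.5:
        return False
    if th1 != 0.0:
        yc = th2 / (2.0 * th1)
        if -1.0 <= yc <= 1.0 and h(yc, th1, th2, x, eps) >= 0.5:
            return False
    return True

def partials_of_c(th1, th2, x, eps, panels=400):
    """Simpson estimates of dc/dtheta_1 (A72) and dc/dtheta_2 (A70)."""
    step = 2.0 / panels
    d1 = d2 = 0.0
    for i in range(panels + 1):
        y = -1.0 + i * step
        w = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        q = 1.0 - 2.0 * h(y, th1, th2, x, eps)
        d1 += w * (1.0 - x - y * y) / q
        d2 += w * (y - eps) / q
    return d1 * step / 3.0, d2 * step / 3.0

def sample_sets(x, eps, draws=3000, seed=0):
    """Return (points accepted in D, points that also lie in G)."""
    rng = random.Random(seed)
    in_D, in_G = [], []
    for _ in range(draws):
        alpha = rng.uniform(-x / (1.0 + eps), x / (1.0 - eps))      # step (1)
        th1 = rng.uniform(-1.0 / (2.0 * x), 5.0 / (1.0 - x))        # step (2)
        th2 = alpha * (th1 + 1.0 / (2.0 * x))                       # step (3)
        if not in_domain(th1, th2, x, eps):
            continue                                                # reject
        in_D.append((th1, th2))
        d1, d2 = partials_of_c(th1, th2, x, eps)
        if d1 <= 0.0 and d2 >= 0.0:                                 # signs defining G
            in_G.append((th1, th2))
    return in_D, in_G
```

With uniform sampling most draws are rejected, which is the practical motivation for biasing the sampler toward the relevant endpoints when x is small.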

Figure A2. The domain D, in blue, defined in (A62) and the constraint set G, in red, defined in (A67) located by sampling.

Figure A3. I-functions obtained by sampling. The upper one is the I-function appearing in the numerator.

References

Hensen, B.; Bernien, H.; Dréau, A.E.; Reiserer, A.; Kalb, N.; Blok, M.S.; Ruitenberg, J.; Vermeulen, R.F.L.; Schouten, R.N.; Abellán, C.; et al. Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometers. Nature 2015, 526, 682–686. [Google Scholar] [CrossRef] [PubMed]
Kotler, S.; Peterson, G.A.; Shojaee, E.; Lecocq, F.; Cicak, K.; Kwiatkowski, A.; Geller, S.; Glancy, S.; Knill, E.; Simmonds, R.W.; et al. Direct observation of deterministic macroscopic entanglement. Science 2021, 372, 622–625. [Google Scholar] [CrossRef]
Ockeloen-Korppi, C.F. Entangled massive mechanical oscillators. Nature 2018, 556, 478–482. [Google Scholar] [CrossRef] [PubMed]
Riedinger, R. Remote quantum entanglement between two micromechanical oscillators. Nature 2018, 556, 473–477. [Google Scholar] [CrossRef] [PubMed]
Klimov, P.V. Quantum entanglement at ambient conditions in a macroscopic solid-state ensemble. Sci. Adv. 2015, 1, 10. [Google Scholar] [CrossRef]
Noordam, L.D.; Jones, R.R. Probing Rydberg electron dynamics. J. Mod. Opt. 1997, 44, 2515–2532. [Google Scholar] [CrossRef]
Stodolna, A.S.; Rouzée, A.; Lépine, F.; Cohen, S.; Robicheaux, F.; Gijsbertsen, A.; Jungmann, J.H.; Bordas, C.; Vrakking, M.J.J. Hydrogen Atoms under Magnification: Direct Observation of the Nodal Structure of Stark States. Phys. Rev. Lett. 2013, 110, 213001. [Google Scholar] [CrossRef]
Video about Observation of Rabi Oscillattions. Available online: https://en.wikipedia.org/wiki/File:Quantum_superposition_of_states_and_decoherence.ogv#filelinks (accessed on 1 March 2023).
Bild, M.; Fadel, M.; Yang, Y.; von Lupke, U.; Martin, P.; Bruno, A.; Chu, Y. Schrödinger cat states of a 16-microgram mechanical oscillator. arXiv 2022, arXiv:2211.00449. [Google Scholar]
Fein, Y.Y.; Geyer, P.; Zwick, P.; Kiałka, F.; Pedalino, S.; Mayor, M.; Gerlich, S.; Arndt, M. Quantum superpositions of molecules beyond 25 kDa. Nature 2019, 15, 1242–1245. [Google Scholar] [CrossRef]
Kovachy, T.; Asenbaum, P.; Overstreet, C.; Donnelly, C.A.; Dickerson, S.M.; Sugarbaker, A.; Hogan, J.M.; Kasevich, M.A. Quantum superpositions at the half-metre scale. Nature 2015, 528, 530–533. [Google Scholar] [CrossRef]
Mairhofer, L.; Passon, O. Reconsidering the Relation Between “Matter Wave Interference” and “Wave–Particle Duality”. Found. Phys. 2022, 52, 32. [Google Scholar] [CrossRef]
Arndt, M.; Hornberger, K. Testing the limits of quantum mechanical superpositions. Nat. Phys. 2014, 10, 271–277. [Google Scholar] [CrossRef]
Schrödinger, E. Quantization as an Eigenvalue Problem (4th Communication). Ann. Phys. 1926, 81, 109. [Google Scholar] [CrossRef]
Schrödinger, E. Die geganwärtige Situation in der Quantenmechanik. Die Nat. 1935, 23, 844–849. [Google Scholar] [CrossRef]
Weinberg, S. The Trouble with Quantum Mechanics. The New York Review, 19 January 2017. [Google Scholar]
Weinberg, S. Testing Quantum Mechanics. Ann. Phys. 1989, 194, 336–386. [Google Scholar] [CrossRef]
Bollinger, J.J.; Heinzen, D.J.; Itano, W.M.; Gilbert, S.L.; Wineland, D.J. Testing the linearity of quantum mechanics by rf spectroscopy of the 9Be⁺ ground state. Phys. Rev. Lett. 1989, 63, 1031. [Google Scholar] [CrossRef] [PubMed]
Wick, D. On the Non-Linear Quantum Mechanics and the Measurement Problem I. Blocking Cats. arXiv 2017, arXiv:1710.03278v1. [Google Scholar]
Bell, J.S. Speakable and Unspeakable in Quantum Mechanics; Cambridge University Press: Cambridge, UK, 1987. [Google Scholar]
Wick, D. On Non-Linear Quantum Mechanics and the Measurement Problem III. Poincaré Probability and … Chaos? arXiv 2018, arXiv:1803.11236v1. [Google Scholar]
Wick, D. On Non-Linear Quantum Mechanics and the Measurement Problem IV. Experimental Tests. arXiv 2019, arXiv:1908.02352v1. [Google Scholar]
Wick, D. On Non-Linear Quantum Mechanics, Space-Time Wavefunctions, and Compatibility with General Relativity. arXiv 2020, arXiv:2008.08663v1. [Google Scholar]
Faris, W.; Wick, D. The Infamous Boundary, Seven Decades of Heresy in Quantum Physics; Copernicus New York Inc.: New York, NY, USA, 1995. [Google Scholar]
De Carlo, L.; Wick, D.W. On Schrödingerist Quantum Thermodynamics. arXiv 2022, arXiv:2208.07688. [Google Scholar]
Yamada, K. Thermal properties of the system of magnetic bosons, Bose-Einstein Ferromagnetism. Prog. Theor. Phys. 1982, 67, 2. [Google Scholar] [CrossRef]
Yang, Z.; Yang, L.; Dai, J.; Xiang, T. Rigorous Solution of the Spin-1 Quantum Ising Model with Single-Ion Anisotropy. Phys. Rev. Lett. 2008, 100, 067203. [Google Scholar] [CrossRef] [PubMed]
Lidar, D.A.; Wu, L.-A. Qubits as parafermions. J. Math. Phys. 2002, 43, 9. [Google Scholar]
IBM. Available online: https://www.ibm.com/blogs/research/2019/10/controlling-individual-atom-qubits/ (accessed on 14 March 2023).
Vilenkin, N.J.; Klimyk, A.U. Representations of Lie Groups and Special Functions; Volume 2: Class I Representations, Special Functions, and Integral Transforms; Groza, V.A., Groza, A.A., Eds.; Springer: New York, NY, USA, 1993. [Google Scholar]
Einstein, A. The Collected Papers of Albert Eistein, Contributions to Quantum Theory; Princeton University Press: Princeton, NJ, USA, 1914–1917; Volume 6, pp. 20–26. [Google Scholar]
Bloch, F. Fundamentals of Statistical Mechanics; World Scientific, Imperial College Press: London, UK, 2000. [Google Scholar]
Jona-Lasinio, G.; Presilla, C. On the Statistics of Quantum Expectations for Systems in Thermal Equilibrium. AIP Conf. Proc. 2006, 844, 200. [Google Scholar]
Lebowitz, J. Microscopic origin of macroscopic behavior. arXiv 2021, arXiv:2105.03470v1. [Google Scholar]
Schrödinger, E. The Exchange of Energy according to Wave-Mechanics. In Collected Papers on Wave Mechanics; Blackie & Son Limited: London, UK; Glasgow, UK, 1928; pp. 137–146. [Google Scholar]
Schrödinger, E. Statistical Thermodynamics; Dover Publications, Inc.: New York, NY, USA, 1952. [Google Scholar]
Sachdev, S. Quantum Phase Transitions; University Press: Cambridge, UK, 1999. [Google Scholar]
Dembo, A.; Zeitouni, O. Large Deviations Techniques and Applications, 2nd ed.; Springer: Berlin/‎Heidelberg‎, Germany, 1998. [Google Scholar]
Hollander, F. Large Deviations; American Mathematical Society: Providence, RI, USA, 2000. [Google Scholar]
Petrov, V.; Robinson, J. Large Deviations for Sums of Independent Non Identically Dstributed Random Variables. Commun. Stat.-Theory Methods 2008, 37, 2984–2990. [Google Scholar] [CrossRef]
Carter, M. Foundations of Mathematical Economics; MIT Press: Cambridge, MA, USA, 2001.
Pérez-Aros, P.; van Ackooij, W. Gradient Formulae for Nonlinear Probabilistic Constraints with Non-convex Quadratic Forms. J. Optim. Theory Appl. 2020, 37, 2984–2990.
Pérez-Aros, P.; van Ackooij, W. Gradient formulae for probability functions depending on a heterogenous family of constraints. Open J. Math. Optim. 2021, 2, 7.
van Ackooij, W. (EDF R&D, Palaiseau, Ile de France, France). Personal communication, January 2023.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

