Here we collect in several appendices the various computational lemmas we used in the proofs that follow and the proofs themselves.
  Appendix A.3. Models without Exchange Symmetries
For the computations of this section we add to the energy an external field with a constant 
, incorporate the factor of 
, and take the positive version of the interaction energy (which merely multiplies 
Z by a factor). For  example 
 and 
, respectively, for nearest neighbors and mean field. We write:
          Here and below “
”, resp. “
”, is shorthand for the sum, resp. product, over all configurations 
. We will also write
          
          The starting point is to replace the spherical integral by a Gaussian integral, preserving the definition of the model by projecting 
 onto the sphere. Hence we can write:
          We have introduced a factor of 1/2 before the energy to simplify some calculations below. The basic idea is to note that, although the integrand is singular at 
, because the components of 
 are i.i.d. 
, that value is improbable; indeed,
          
          Therefore we define a simpler model by replacing the worrisome sum in the integrand by a constant:
          which we will dub the “exactly solvable model” (ESM), as we are reduced to computing a Gaussian integral. We have:
          so from the basic Gaussian integral in two dimensions we find:
          Next we take two derivatives with respect to 
, denoted with primes, divide by 
 and evaluate at 
:
          The first term is zero by the spin-flip symmetry (
 changes sign but since 
 is quadratic in 
S it does not). Then, observing the prefactors of 
, the whole thing tends to zero as 
 even without a factor of 
, so there is no phase transition in the ESM. Finally we must estimate the difference in magnetizations in the two models. We can rewrite 
 as:
          where
          
          By expanding the exponential we can write
          
          Hence, expanding to first order in the small quantities 
 and 
,
          
          Let us treat the third term first. Noting that
          
          Applying Cauchy–Schwarz we can bound the term by:
          By standard CLT calculations, the first factor in curly brackets tends to zero, at rate 
. Using (
A24) the second factor can be rewritten as 
 which is easily seen to be bounded (and in fact tends to one). In treating the second term we encounter instead the quantity 
 This term can also be shown to tend to zero. E.g., writing “
” for the Gaussian integral, factoring the difference of squares and applying Cauchy–Schwarz once more, we have to estimate:
          which is O
. We also have to estimate 
, which by the same tricks is seen to be bounded. Recalling that
          
          the other term in (
A28) is even easier to treat. Since moments of Gaussians increase more slowly than a factorial, the sum of the other terms is dominated by a convergent series and of smaller order. QED.
Now consider adding wavefunction energy: 
          Let us switch to the Gaussian version, divide 
 by 
 and replace 
 by 
. Treating the second term in the last line above, define
          
          where “
” means summation over the spin configurations with 
 held at the fixed value indicated. Then we can write:
          Note that, using 
 for expectation over the Gaussian distribution:
          and
          
          Noting that it is an average of i.i.d. mean-zero random variables, this last quantity is O
. The last term in (
A32) is a sum of 
N terms which are not independent but all of the order just computed. By a simple Jensen inequality, the sum is at most 
N times this order. We conclude that the whole term is negligible. The first term is of the form of a mean-field model, still quadratic in 
. Hence it can be added to the usual magnetic energy. The bound on 
 increases to 
, but this does not affect the argument. For the comparison with the actual model with WFE, we encounter the expression
          
          which can be treated as before.
Thus the argument of the quadratic case goes through and yields that with WFE there is no magnetization at finite temperature.
  Appendix A.4. SCW with Wavefunction Energy
We prove here that, assuming the hypotheses of Theorem Two, if
          
          where 
 is a certain positive function specified in the proof below, then the conclusion of the theorem follows. We will apply the Concentration Lemma (
A3). We can assume by adding a term as usual to render 
f positive:
          We next define the quantities involved in the lemma. For the sets 
U and 
V we take, given two positive numbers 
 and 
 (
 may depend on 
N),
          
          Further, define:
          We next check the hypotheses of the Concentration Lemma. A quick calculation gives the equivalent description of set 
V:
          where 
 and we have defined
          
          Since 
, assuming 
 and 
, the condition defining 
V implies 
. Since 
, it follows from (
A40) that 
 on 
. For the lower bound on 
, we translate to a model with i.i.d. Gaussians, call them 
, replacing:
          and the definition of 
V becomes:
          (Since 
 has both a real and an imaginary part, we should double the lengths of these sums, but clearly this has no impact on the final result.) We have to find an exponential lower bound on 
. Since 
, we can get a lower bound by writing:
We will work on the first probability on the last line. (The second one is really the same since, multiplying the inequality through by −1, replacing 
 by 
 just reverses the order of its values.) According to Gärtner–Ellis [
38], we have to compute:
          where 
 denotes expectation and
          
          Note that 
 can take negative and positive values; hence, 
 must be restricted to ensure that the integral is finite. From the standard Gaussian integral we get:
          which, provided the integrand is bounded, we recognize as the Riemann sum converging to:
          where
          
          By a change of variable this equals:
The LD approach requires us to compute:
          and then the asymptotic lower bound is 
 [
38]. By the definition of the domain where 
 in the Gärtner–Ellis theorem, the supremum over 
 in the definition of 
 can be limited to:
          where 
 is negative but we do not need to know it, while by a simple computation:
          (This is the max on the whole line.) We must have positive values of 
 somewhere in the interval [−1, 1], for otherwise 
 as 
, so 
 for 
. This requires 
 and that the lower root of 
 lies in the interval [0, 1] (since 
 and 
). The root is easily computed to be:
          Since 
, this root is real; letting 
 the condition 
 becomes
          
          Since we will eventually identify 
 with 
, the above inequality is identical to (33). Since 
, the infimum of the right side is 1/4.
The problem of computing  is then well defined. We note that, if the point at which  lies in the unit interval, then, as , the integral looks as if it might diverge. However, it remains finite, because logarithmic singularities are integrable; e.g.,  not . However, the derivative  will go to infinity at the boundaries (Ellis calls such a function “steep”, and steepness is an assumption of his theorem).
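The integrability remark can be checked numerically. The following is a minimal sketch only: the integrands `log x` and `1/x` on (0, 1) are stand-ins chosen for illustration, not the integrand of this appendix, and `midpoint_integral` is a hypothetical helper.

```python
import math

def midpoint_integral(f, lo, hi, n):
    """Midpoint rule with n cells; it never evaluates f at the endpoints."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

GRID_SIZES = (10**2, 10**3, 10**4, 10**5)

# Logarithmic singularity at 0: the approximations settle near -1.
log_integrals = [midpoint_integral(math.log, 0.0, 1.0, n) for n in GRID_SIZES]

# Simple pole at 0: the approximations keep growing as the grid is refined.
pole_integrals = [midpoint_integral(lambda x: 1.0 / x, 0.0, 1.0, n)
                  for n in GRID_SIZES]

if __name__ == "__main__":
    for n, a, b in zip(GRID_SIZES, log_integrals, pole_integrals):
        print(f"n={n:>6}  int log = {a:+.6f}   int 1/x = {b:.3f}")
```

The first column stabilizes while the second grows by roughly log(10) per decade of refinement, which is the numerical signature of a divergent integral.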
Assuming that 
, it will suffice for Theorem Two to know that for some 
:
          We can define 
 to saturate the second inequality above (note that 
), which also yields 
. The conclusion of the theorem follows.
Can we compute 
 and 
? In fact, the integral defining 
 is elementary and can be computed as follows. First, integrate by parts:
There are two cases, depending on whether the denominator in the integrand factorizes or not. For  positive, it does not; hence, the denominator is an irreducible quadratic. For some negative  it does factorize. We can proceed by elementary linear and trig substitutions, yielding either log(linear), log(quadratic), or arctan(quadratic) terms. Clearly, given such formulas with nonalgebraic functions, computing the supremum cannot be expected in closed form, so we would resort to the computer, but do not report detailed results here.
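As an illustration of the two cases, here is a minimal sketch that evaluates the integral of 1/(a y² + b y + c) over [−1, 1] in closed form and compares it with a trapezoid-rule value. The generic coefficients `a`, `b`, `c` and both function names are assumptions made for the example; they are not the specific quadratic of this appendix.

```python
import math

def integral_inverse_quadratic(a, b, c, lo=-1.0, hi=1.0):
    """Closed form of the integral of 1/(a*y**2 + b*y + c) over [lo, hi].

    Assumes a != 0 and that the quadratic has no zero inside [lo, hi].
    """
    disc = b * b - 4.0 * a * c
    if disc < 0:
        # Irreducible quadratic: arctan antiderivative.
        s = math.sqrt(-disc)
        F = lambda y: (2.0 / s) * math.atan((2.0 * a * y + b) / s)
    elif disc > 0:
        # Factorizable quadratic: partial fractions give a logarithm.
        s = math.sqrt(disc)
        F = lambda y: math.log(abs((2.0 * a * y + b - s)
                                   / (2.0 * a * y + b + s))) / s
    else:
        # Double root: the antiderivative is rational.
        F = lambda y: -2.0 / (2.0 * a * y + b)
    return F(hi) - F(lo)

def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoid rule with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

if __name__ == "__main__":
    for a, b, c in [(1.0, 1.0, 1.0),    # disc < 0, irreducible
                    (1.0, -5.0, 6.0)]:  # roots 2 and 3, outside [-1, 1]
        exact = integral_inverse_quadratic(a, b, c)
        approx = trapezoid(lambda y: 1.0 / (a * y * y + b * y + c), -1.0, 1.0)
        print(a, b, c, exact, approx)
```

Evaluating formulas rather than solving equations in this way keeps a later computer search for the supremum cheap, and the quadrature comparison guards against sign or branch errors in the closed forms.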
  Appendix A.5. SCW without Wavefunction Energy
The idea is to determine whether
          
          or not, where
          
          and similarly with 
 replaced by 
. We can rewrite the integrals as:
          where
          
          where as usual we have added a term so that 
. To compare these integrals in numerator and denominator, we turn to the UIF, setting 
 in the numerator and 
B equal to the whole sphere in the denominator. Thus, for the numerator we will have to estimate:
          which, replacing wavefunctions on the sphere by i.i.d. Gaussians, is equivalent to:
          (since 
 has two components, the sums are now over 2N + 1 rather than N + 1 indices, with the 
 repeated; however, as we are taking the limit as 
 we do not bother with factors of two everywhere) which is the same as writing (introducing factors of 
 for later purposes):
To motivate the appeal to the Gärtner–Ellis theorem, let 
 stand for the probability of the set appearing above and 
 for the probability with the second restriction dropped. We are interested in the ratio:
          which has the interpretation of the conditional probability that the magnetization is greater than 
, given a bound “
x” on the energy. We expect that
          
Then, integrating over 
x (as in the UIF) we shall arrive at the result. Noting that the events considered have the form of averages of random variables whose means lie outside the indicated bounds, we expect large-deviation asymptotics for both, and so it is natural to consider
          
          and to treat the two terms separately and then compare. We used 
 with 
 for the involved sets 
S. But what a large deviation principle gives are upper and lower bounds. To have equality 
 we need 
S to be 
I-continuous, i.e.,  
, see page 30 [
39]. This is the case.
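To make the large-deviation asymptotics concrete, here is a toy sketch, not the two-component problem treated in this appendix. It takes the average of squares of i.i.d. standard Gaussians, whose cumulant generating function is −½ log(1 − 2θ) for θ < ½, and computes the rate function as a Legendre transform. The function names and the ternary search are illustrative assumptions.

```python
import math

def cgf(theta):
    """log E[exp(theta * X**2)] for X ~ N(0, 1); finite only for theta < 1/2."""
    if theta >= 0.5:
        return math.inf  # the Gaussian integral diverges here
    return -0.5 * math.log(1.0 - 2.0 * theta)

def rate_numeric(x, lo=-50.0, hi=0.5 - 1e-12, iters=200):
    """sup over theta of theta*x - cgf(theta), by ternary search.

    The objective is concave in theta, so ternary search is valid.
    The default bracket covers maximizers for x >= 0.01.
    """
    objective = lambda t: t * x - cgf(t)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def rate_exact(x):
    """Closed-form rate function of the toy problem, valid for x > 0."""
    return 0.5 * (x - 1.0 - math.log(x))

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        print(x, rate_numeric(x), rate_exact(x))
```

The rate vanishes at the mean x = 1 and is positive elsewhere, so only events that exclude the mean decay exponentially. The derivative of `cgf` blows up as θ approaches ½, which is the steepness property in this toy setting.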
To implement the Gärtner–Ellis procedure, we introduce the random vector with two components:
          so we are interested in
          
To apply G-E we must compute:
          where
          
          where 
b is a constant depending on the thetas.
We must first determine the region in the 
 plane in which 
. Defining
          
          this region, call it 
D, is defined by:
          We observe that 
D is the domain where 
 is finite, related to the conditions for the validity of the Gärtner–Ellis theorem, not the domain of the logarithm argument. The geometry of this region is a bit complicated. These are the tests for whether a point lies in 
D:
Test1: ;
Test2: ;
Test3: if , the critical point of , lies in [−1, 1], then test whether .
Now see 
Figure A1. By tests one and two, 
D lies between the lines A and B, and to the left of point P. Test three applies in between lines E and F; to the right of the axis, it is satisfied within the oblique ellipse, curve C; to the left of the axis, 
h is negative. Hence we obtain a diamond-shaped compact region of the plane.
  
    
  
  
    Figure A1.
      Regions used in describing the domain D where  is finite. These lines are obtained by the Tests above. Lines A and B are respectively  from  and  from . To the left of P, Tests 1 and 2 cannot be satisfied simultaneously. Lines E and F are respectively  from  and  from . The ellipse C is , obtained from  of Test 3 to be applied for  (for  Test 3 is satisfied).
  
 
  
 
We note that 
 is convex (by examination of its Hessian matrix, omitted) and “steep”, meaning its derivatives approach infinity at the boundaries of 
D. We have an ellipse as in 
Figure A1 if 
, otherwise a hyperbola. We restrict the study to the ellipse case, since it is the relevant one: we are primarily interested in small 
 and, from  the proof after (
A105), one can see that the small 
x are the relevant ones for low temperatures. Therefore, given 
, we always restrict to the 
x small enough to satisfy 
.
The G-E procedure is to compute the dual function:
          from which the I-function can be computed as:
By the convexity and steepness of 
c, the supremum in (
A63) is attained at a critical point for which
          
          so looking forward to a possible computer search (for which we do not want to solve equations but rather evaluate formulas), we define
          
Let 
G denote the constraint set in the 
-plane:
          We can express the I-function as
          
For the proof we have to compare the I-function for the two-variable, two-inequality problem with that of the one-variable, one-inequality problem, given by
          
          where 
L is the part of the horizontal axis that lies in 
D. A first, critical question to ask is whether necessarily 
 because the set over which we compute the infimum for 
 contains the set for which we compute the infimum for 
. However, this is not the case. We have:
          where 
 by the restriction that 
. The first integral can be done easily by elementary calculus:
Now note that, at 
, this integral is zero, and so is the first term in (
A70), while the second term there is negative. Hence 
 is disjoint from 
L.
We need a formula for the first partial of 
c:
          By long division we can write:
          where
          
          Hence:
          where we have plugged in (
A71) for one integral.
Note that we have reduced computing these derivatives to computing one integral, of 
 over the interval 
. There are two cases, depending on whether 
 is irreducible or factorizes. The discriminant is:
          If disc. 
, 
q factorizes as
          
          where the two roots are outside the interval 
. The integrals can then be performed by partial fractions, yielding logarithmic terms:
If disc.
, in our case 
q can be written:
          with 
 and 
, and one makes a trig substitution:
          The result contains inverse trig functions:
To check these calculus formulas (and our computer implementation of them) we computed the integral directly (numerically) using the trapezoid rule and compared. We also need a formula for 
c, which we can obtain by integrating by parts:
          Here
          
          and applying long division again:
          with
          
Combining with previous results yields:
For the function 
k we can compute from its definition, but also some simplifications accrue:
For the theorem we have to evaluate the asymptotics of the ratio:
          where 
B is the set appearing in (
A53) and
          
We next list some properties of the I-functions that will imply the theorem’s conclusion:
Assumptions on 
 and 
          The rate functionals are defined as
          
          Since they are computed from independent Gaussian variables, even if not identically distributed [
40], the corrections to the large deviations are 
:
          Therefore the approximation 
 is very good for large 
N. If 
 then it has to be 
. The study of their large deviation corrections should prove it.
For (a) we know from the inequality 
 (for fixed 
N) and the Gärtner–Ellis Theorem 
, and we showed that these functions result from infima of the same function over different sets; therefore, it is implausible that they are exactly equal for all 
x smaller than a given 
x*. (Because of the property (b) and 
, if there exists a 
 such that 
, then they have to coincide for all 
x smaller than 
. We supply some evidence that it is not so, from numerical approximations, see the Computational 
Appendix A.6 and 
Figure A3.) From the approximation 
 and 
 for 
 one expects strictly decreasing monotonicity. This is indeed what we observed in the numerical computations. Differentiability and strictly decreasing monotonicity can be deduced from the Envelope Theorem in [
41], p. 605, applied to (
A63) and (
A64). 
 is strictly convex and also the Lagrangian 
 to optimize if the optimizer is on the boundaries. Applying the Envelope Theorem, on the right hand side of (
A93) the derivative coincides with that of 
f because the constraint functions do not have dependence on 
x. From 
, calling 
 the solution of (
A63) and 
 the one of (
A64), we have
          
          The smoothness of 
 and 
 comes from the implicit function theorem. The right hand side gives 
, since 
 we have that it is negative if and only if 
. We know that 
 is decreasing by definition so it can be only 
 or 
, for 
 this second case gives 
, which it can be only for 
 by definition; therefore, 
 when 
 is not trivial. For 
 the value 
 lies in a set disjoint from the axis 
; therefore, it can be only 
 since 
 is decreasing by definition. Second-order differentiability would require developing second-order Envelope Theorems; of course it will not be 
. Differentiability of the rate functionals can also be deduced from differentiability of the probability functions 
; in our case one can apply corollary 4.1 of [
42] and corollary 32 of [
43] because of the quadratic structure of the 
’s and other conditions that are verified. These results should extend to second-order differentiability without difficulty [
44]. We cannot deduce analytically the convexity property (varying 
x we are restricting in a continuous way the integration on a regular set of a Gaussian integral) and we observe this property from the numerical computations of the exact formulas of the rate functionals, see 
Appendix A.6, within the limits of our computer analysis, that is below 
. Here we cannot obtain good numerical estimates since the sampling region shrinks to an infinitesimal volume and we have a vertical asymptote, as proved later for (c). In this vertical asymptote part convexity is natural. Anyway, from the proof after (
A105), if there is an inflection point for some unexpected reason, the computations can be applied for smaller 
x (i.e., lower temperature) where the functionals are convex. From  (
A53) we have (d) since the first event for 
 is the sure event, while the second never contains the average for any 
 (see also the Computational 
Appendix A.6 for a numerical computation).
Furthermore, (e) follows from a computation in a previous section, see (
A14), and the remark that LD results from probabilities of sets that do not contain the mean; otherwise, by the Central Limit Theorem such probabilities go to one.
That leaves (c). It would follow immediately from the observation that 
 (since 
) if we can assume that two limits can be interchanged. As that is not obvious, we supply a proof using previously obtained formulas. Since certainly 
 it suffices to prove (c) for the latter. Therefore, we set 
 in the following. From (
A75) we note that
          
Let us define for a given 
x:
          We note 
, since 
 there, and
          
          so, if 
, 
Furthermore:
          since by definition of the domain D, 
 for all 
y in [−1, 1]. Hence, 
H is convex.
Now consider the left-most point of the domain D on the horizontal axis, labeled “P” in 
Figure A1, which has 
 coordinate 
. By definition, as 
, 
 for some 
 (here, at the endpoints 
), so 
. Thus 
 must have a second zero on the line, to the right of 
P; let us label it 
Q.
The constraint set is 
, so from (
A94) we conclude that it lies entirely to the left of 
Q (and of course to the right of 
P).
Some computer work indicates that 
Q is very close to 
P for small 
x, which explains why we could not locate the constraint set by sampling for 
. Hence, suppose that 
 with 
 bounded above (or even, as we shall see, tends to zero) as 
. From (
A87) we have
          
Note that, substituting 
 for 
,
          
          Therefore, in the constraint set with the above assumption 
, so the second term in 
k goes to infinity as 
. Hence, since the first term is nonnegative by the constraint,
          
          as 
, which yields (c).
In order to prove that 
 we apply the exact formula for the integral appearing in 
, which, with 
, lies in the factorizable case, see (
A77). Letting
          
          we find
          
Next, let 
, so that 
 corresponds to 
; i.e., 
. Rewriting the equation for the root 
 gives:
          so exponentiating both sides:
Note that, if 
x is small, the right-hand side of (
A104) is very small at 
 and remains so up to small values of 
t, while the left-hand side follows 
t. For instance, if 
, up to 
 the RHS is no more than about 
 which is infinitesimal while the LHS reaches 
. Thus, as 
, the solution of (
A104) goes to zero, and hence 
 (in fact exponentially fast), which implies (c).
Granted these assumptions, we now proceed to the proof of Theorem Two. Property (e) implies that we can replace the denominator in (
A88) as follows:
Next, consider whether the 
g-functions have critical points in the intervals (0, 1) or (0, 2/3). Using a hat to denote those points, we are asking whether solutions exist for:
First, suppose both exist. If so, since 
g is convex, the Laplace approximation applies yielding:
          where a hat means evaluate at the critical point.
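The quality of the Laplace approximation can be checked on a toy exponent. This is a sketch under stated assumptions: the convex function `g` below, with interior minimum at 0.3 and second derivative 2 there, is hypothetical and is not one of the g-functions of this appendix.

```python
import math

X_HAT = 0.3      # location of the minimum of the toy exponent
G_SECOND = 2.0   # second derivative of the toy exponent at X_HAT

def g(x):
    """Hypothetical convex exponent with g(X_HAT) = 0."""
    u = x - X_HAT
    return u * u + u ** 4

def laplace_ratio(N, n=20000):
    """Trapezoid value of int_0^1 exp(-N*g(x)) dx divided by the
    Laplace prediction exp(-N*g(X_HAT)) * sqrt(2*pi / (N * g''(X_HAT)))."""
    h = 1.0 / n
    f = lambda x: math.exp(-N * (g(x) - g(X_HAT)))
    numeric = h * (0.5 * (f(0.0) + f(1.0)) + sum(f(i * h) for i in range(1, n)))
    return numeric / math.sqrt(2.0 * math.pi / (N * G_SECOND))

if __name__ == "__main__":
    for N in (100, 400, 1600):
        print(N, laplace_ratio(N))
```

The ratio approaches one as N grows, with a relative error of order 1/N, so the exponential rate is set entirely by the value of the exponent at the critical point.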
We are now prepared to argue that the limit of the ratio in (
A105) is always zero. There are four cases, defined by whether the “winner” (largest term asymptotically) in numerator and denominator is the first or second term:
- Case: 
              
 numerator, second term; denominator, second term. As the ’s are minima,  and the ratio tends to zero.
- Case: 
              
 numerator, first term; denominator, second term. Then 
 and we are looking essentially at the limit of
              
              which clearly goes to zero.
- Case: 
              
 numerator, second term; denominator, first term. Now we are looking at
              
              which goes to zero if 
. The reverse is impossible, as the ratio would tend to infinity while we know it is bounded by one from the original definition.
- Case: 
              
 numerator, first term; denominator, first term. The ratio tends to zero.
Since we are interested in low temperatures (that is, small 
x), the cases presented are the relevant ones. However, we can still ask: do these critical points exist? With our assumptions we know that
          
          If in addition we knew that
          
          we could conclude that the critical points exist. If not, for sufficiently small 
 a critical point might not exist. For instance, let us postulate that 
 or 
. Then in either case
          
          for all 
x in the relevant interval. In this case 
g is monotonic and we can make a change of variable: 
 to obtain
          
Therefore, for the numerator the integral is:
          (where we have used assumption (c) to drop a term) and so, if 
, the first term in the numerator wins. In the denominator, the order is at most 
, so the numerator tends to zero faster whatever term predominates in the denominator.