Internal Energy, Fundamental Thermodynamic Relation, and Gibbs’ Ensemble Theory as Emergent Laws of Statistical Counting

Qian, Hong

doi:10.3390/e26121091

by

Hong Qian

Department of Applied Mathematics, University of Washington, Seattle, WA 98195-3925, USA

Entropy 2024, 26(12), 1091; https://doi.org/10.3390/e26121091

Submission received: 20 November 2024 / Accepted: 11 December 2024 / Published: 13 December 2024

(This article belongs to the Special Issue The Entropy Production—as Cornerstone in Applied Nonequilibrium Thermodynamics—Dedicated to Professor Signe Kjelstrup on the Occasion of Her 75th Birthday)

Abstract

Statistical counting ad infinitum is the holographic observable to a statistical dynamics with finite states under independent and identically distributed N sampling. Entropy provides the infinitesimal probability for an observed empirical frequency

\hat{ν}

with respect to a probability prior

p

, when

\hat{ν} \neq p

as

N \to \infty

. Following Callen’s postulate and through Legendre–Fenchel transform, without help from mechanics, we show that an internal energy

u

emerges; it provides a linear representation of real-valued observables with full or partial information. Gibbs’ fundamental thermodynamic relation and theory of ensembles follow mathematically.

u

is to

\hat{ν}

what chemical potential

μ

is to particle number N in Gibbs’ chemical thermodynamics, what

β = T^{- 1}

is to internal energy U in classical thermodynamics, and what

ω

is to t in Fourier analysis.

Keywords:

emergent phenomenon; entropy; information; internal energy; probability theory; statistic; thermodynamics

1. Introduction

It is a pleasure to be a part of this celebration for Signe Kjelstrup. She has made significant contributions to nonequilibrium thermodynamics, in both theory and applications that include electrochemistry, transport in heterogeneous media, and T. L. Hill’s small systems [1,2,3]. In this work we extend Gibbs’ and Hill’s approach to equilibrium thermodynamics [4] and show that the new logical path via the “crucial step” advocated in [5] is in fact a consequence of a limit theorem [6] in the mathematical theory of probability [7,8]. The results perfectly fit P. W. Anderson’s notion of emergent phenomenon [9].

Sometimes a mathematical transform provides a fundamental concept beyond being a technique for solving a problem, one through which a new representation of a natural phenomenon emerges. A case in point is the Fourier transform (FT), which leads to the theory of harmonics in musical instruments [10] and the very concept of an optical spectrum. FT represents a function of time

f (t)

in terms of

\tilde{f} (ω)

, where

ω

is introduced as a novel notion, the temporal frequency of a sinusoidal oscillatory component in time [11]. The solutions to a large class of problems in differential calculus involving t can be very efficiently expressed through FT.

We show in the present paper that the fundamental notion of internal energy, which first appeared in the theory of thermodynamics in the 19th century and was collectively developed by J. R. von Mayer, W. Rankine, R. Clausius, and W. Thomson among many others [12], is a concept that can be understood, and generalized, in statistical counting. The transformation in question is the Legendre–Fenchel transform (LFT) [13,14], a more refined mathematical formulation of the traditional Legendre transform [15].

When a simple statistical analysis is carried out on a set of data, correlated or not, it is usually supposed that they are from an identical probability distribution. One of the best understood systems that exhibit an invariant probability is the ergodic dynamical system [16]. The ergodic theory of classical Hamiltonian dynamics has been an intense research area in both physics and mathematics for more than a century [17,18]. Even when the data are from seemingly different “objects”, say different individuals within a biological species, it is understood that an ergodic mating or mutational process is behind the statistical practice; and the conclusions drawn are most meaningful in this regard. Such an ergodic stochastic dynamic perspective has transformed cell biology through the notion of phenotypic switching in recent years [19].

2. Energetic Theory of Statistical Counting

Let us consider the repeated statistical samples ad infinitum of a system with finite state space

S = \{0, 1, \dots, n\}

. In the present work we shall restrict our discussion to independent and identically distributed (i.i.d.) samples. More general sampling of Markov data will be published elsewhere. The number counting

ν = (ν_{0}, \dots, ν_{n})

with

ν_{0} + \dots + ν_{n} = N

and counting frequency

\hat{ν} = ν / N

, not to be confused with the

ω

in FT above, has a homogeneous degree 1 neg-entropy function with respect to a given probability prior

p = (p_{0}, \dots, p_{n})

[8,20]:

Φ (ν) = \sum_{i = 0}^{n} ν_{i} ln (\frac{ν_{i}}{p_{i} \sum_{j = 0}^{n} ν_{j}}) .

(1)
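
As a minimal numerical sketch of Equation (1), the snippet below evaluates the neg-entropy for an illustrative three-state prior and a set of counts, and checks three properties used in the text: non-negativity, degree-1 homogeneity, and a value of zero when the counts are exactly proportional to the prior. The function name `neg_entropy` and all numerical values are assumptions chosen for illustration only.

```python
import math

def neg_entropy(nu, p):
    """Phi(nu) = sum_i nu_i * ln(nu_i / (p_i * sum_j nu_j)), Eq. (1); 0*ln(0) is taken as 0."""
    total = sum(nu)
    return sum(n * math.log(n / (q * total)) for n, q in zip(nu, p) if n > 0)

p = [0.5, 0.3, 0.2]        # illustrative prior (assumption)
nu = [10.0, 25.0, 15.0]    # illustrative counts with N = 50 (assumption)

phi = neg_entropy(nu, p)
# Degree-1 homogeneity: Phi(lambda * nu) = lambda * Phi(nu), here with lambda = 3
phi_scaled = neg_entropy([3.0 * n for n in nu], p)
# Counts exactly proportional to the prior give Phi = 0
phi_at_prior = neg_entropy([50.0 * q for q in p], p)

print(phi, phi_scaled, phi_at_prior)
```

Because the function is homogeneous of degree 1, the same routine applies both to raw counts and to normalized frequencies.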

Appendix A provides the mathematical origin of the non-negative

Φ (ν)

as a result of statistical counting. In information theory, it is interpreted as the “surprise” in observing the

ν

under the assumption

p

[21,22]. It is a double-edged sword which tells the rareness of

ν

(or

\hat{ν}

) with respect to

p

or erroneous model

p

with respect to empirical

ν

. The Kolmogorov probability

p

has two rather different roles in statistical inference and in statistical physics. In the former it has been identified as only one half of the subject: the other “half of probability theory as it is needed in current applications—the principles for assigning probabilities by logical analysis of incomplete information—is not present at all in the Kolmogorov system” [20].

The application of modern probability to statistical physics involves the limit of sample ad infinitum represented by

N \to \infty

[6]. In this case, the probability

p

is for all the systems with the same state space

S

; it is not meant to be realistic for any particular system. It simply provides a “metric” under which each and every particular system has its own representation in terms of its complete information,

ν

. The

Φ (ν)

is introduced to further gauge the differences among systems with different

ν

’s on the same

S

; it becomes a “theory of everything.” Because of the

N \to \infty

limit, there are no uncertainties in

\hat{ν}

; it is a definitive characterization of an i.i.d. statistical distribution with state space

S

.

Therefore, statistical inference is about the mathematical model of a particular system, and statistical physics is about the mathematical representation of all systems with the same

S

under the supposition of i.i.d. data ad infinitum. The entropy in (1) is an emergent characterization in the limit of

N \to \infty

, with the starting point in terms of generative models [9]. It provides the relationship between

ν

and

p

in the sampling process. It is an Eulerian degree 1 homogeneous function of

ν

:

Φ (λ ν) = λ Φ (ν)

. This fits naturally to the fundamental thermodynamic postulate formulated by H. B. Callen [23]. The LFT of

Φ

as a function of the normalized

\hat{ν}

then yields [13,14,24]:

Ψ (u) = inf_{\hat{ν}} \{\sum_{i = 0}^{n} {\hat{ν}}_{i} u_{i} + Φ (\hat{ν})\} = - ln \sum_{i = 0}^{n} p_{i} e^{- u_{i}},

(2a)

with corresponding optimal

{\hat{ν}}^{*} (u)

{({\hat{ν}}^{*})}_{i} = \frac{p_{i} e^{- u_{i}}}{\sum_{ℓ = 0}^{n} p_{ℓ} e^{- u_{ℓ}}}, \quad \text{and} \quad u_{i} = - (\frac{\partial Φ ({\hat{ν}}^{*})}{\partial {\hat{ν}}_{i}}) .

(2b)

Note that the second equation in (2b) is obtained when one uses calculus to solve the infimum in (2a); this recovers the traditional Legendre transform. Normalizing

ν

to

\hat{ν}

induces a gauge freedom in (2), an arbitrary additive constant to

u_{i}

. In statistical thermodynamics, the conjugate variable

u_{k}

introduced in Equation (2) has been interpreted as the internal energy of the state k, in units of

k_{B} T

[25]; then

\hat{ν} \cdot u

is the mean internal energy of “the statistical system”.
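
The variational statement in Equation (2a) can be checked numerically. The sketch below, under an assumed three-state prior and an arbitrary energy vector, evaluates the objective at the closed-form optimum of Equation (2b), compares it with the free energy, probes random points of the simplex to confirm that none gives a smaller value, and illustrates the gauge freedom of an additive constant in u. The helper names (`free_energy`, `optimal_frequency`, `objective`) are hypothetical.

```python
import math
import random

def neg_entropy(nu_hat, p):
    """Phi(nu_hat) = sum_i nu_hat_i * ln(nu_hat_i / p_i) for normalized frequencies."""
    return sum(v * math.log(v / q) for v, q in zip(nu_hat, p) if v > 0)

def free_energy(u, p):
    """Psi(u) = -ln sum_i p_i exp(-u_i), the right-hand side of Eq. (2a)."""
    return -math.log(sum(q * math.exp(-ui) for q, ui in zip(p, u)))

def optimal_frequency(u, p):
    """nu_hat*_i = p_i exp(-u_i) / sum_l p_l exp(-u_l), first relation in Eq. (2b)."""
    w = [q * math.exp(-ui) for q, ui in zip(p, u)]
    z = sum(w)
    return [wi / z for wi in w]

def objective(nu_hat, u, p):
    """The bracket in Eq. (2a): nu_hat . u + Phi(nu_hat)."""
    return sum(v * ui for v, ui in zip(nu_hat, u)) + neg_entropy(nu_hat, p)

p = [0.5, 0.3, 0.2]     # illustrative prior (assumption)
u = [0.2, -1.0, 1.5]    # illustrative energies (assumption)

nu_star = optimal_frequency(u, p)
gap_at_optimum = objective(nu_star, u, p) - free_energy(u, p)

rng = random.Random(0)

def random_simplex_point(k):
    x = [rng.random() for _ in range(k)]
    s = sum(x)
    return [xi / s for xi in x]

# No randomly drawn frequency should fall below the free energy
min_gap_random = min(
    objective(random_simplex_point(3), u, p) - free_energy(u, p)
    for _ in range(1000)
)

# Gauge freedom: shifting every u_i by a constant c leaves nu_hat* unchanged
c = 0.7
nu_shifted = optimal_frequency([ui + c for ui in u], p)

print(gap_at_optimum, min_gap_random, nu_star, nu_shifted)
```

The shift by c changes the free energy by exactly c, which is the additive gauge constant mentioned above.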

In a real-world laboratory working on a particular system, the

ν

tends to infinity as

N \to \infty

but

\hat{ν}

converges to the intrinsic property of the statistical system. In statistical inference, the assumed

p

, as a prior, then is expected to be replaced by the observed, real, posterior

\hat{ν}

according to conditional probability and/or Bayesian statistical logic [24,25]. This concludes the statistical investigation of the particular system with respect to the type of observations. The neg-entropy function in (1) actually provides a meta-statistical theory for all possible observed

\hat{ν}

, assessing their respective infinitesimal probability (rate) with respect to the prior

p

(see Appendix A).

3. Maximization of Entropy $- Φ$ Under Constraint by Empirical Mean Value

However, the complete counting for the entire state space

S

in terms of empirical frequencies

\hat{ν}

is only a gedankenexperiment. The significance of Gibbs’ ensemble theory is in dealing with observations from a small set of real-valued observables

g_{1} (i), g_{2} (i), \dots, g_{J} (i)

, where

i \in S

but

J ≪ n

. These g’s are random variables on the state space

S

. In fact, their empirical mean values are linear combinations of the

\hat{ν}

:

x_{j} = \sum_{i = 0}^{n} {\hat{ν}}_{i} g_{j} (i) .

(3)

To fix mathematical notations, we append

g_{0} (i) = 1

and

x_{0} = 1

, which represent the fact that

\hat{ν}

is always normalized, and denote

(n + 1) \times (J + 1)

matrix

G_{J}

with elements

{(G_{J})}_{i j} = \begin{cases} 1, & j = 0, \\ g_{j} (i), & j = 1, \dots, J . \end{cases}

(4)

Equation (3) shows that if all the g’s are linearly independent and

J = n

, then one can solve the normalized

\hat{ν}

uniquely from each set of x’s:

\hat{ν} = x G_{n}^{- 1}

. We refer to such a set of observables as holographic with full information. In the following discussion, we shall always imagine the

(g_{1}, \dots, g_{J})

as the first J components of a holographic observable

(g_{0}, g_{1}, \dots, g_{n})

. When

J < n

, there is missing information [20,22,25].
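
To illustrate what “full information” means here, the following sketch takes three states with the assumed observables g_0(i) = 1, g_1(i) = i, and g_2(i) = i^2, computes the mean values x from a chosen frequency vector via Equation (3), and then recovers the frequencies exactly by solving the linear system. The solver `solve_left` is a hypothetical helper written for this example; exact rational arithmetic is used so that the recovery can be compared exactly.

```python
from fractions import Fraction

def solve_left(G, x):
    """Solve nu_hat @ G = x for nu_hat by Gauss-Jordan elimination on the transpose of G."""
    n = len(G)
    # Row j of the augmented system is [G[0][j], ..., G[n-1][j] | x[j]]
    A = [[Fraction(G[i][j]) for i in range(n)] + [Fraction(x[j])] for j in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        lead = A[col][col]
        A[col] = [a / lead for a in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Rows are states i = 0, 1, 2; columns are g_0 = 1, g_1 = i, g_2 = i^2 (assumed observables)
G = [[1, 0, 0],
     [1, 1, 1],
     [1, 2, 4]]

nu_hat = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # illustrative frequencies

# Mean values x_j = sum_i nu_hat_i g_j(i), Eq. (3); x_0 = 1 by normalization
x = [sum(nu_hat[i] * G[i][j] for i in range(3)) for j in range(3)]
recovered = solve_left(G, x)

print(x, recovered)
```

If only the first one or two columns of G were available, the system would be underdetermined, which is the missing-information situation discussed in the text.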

With a set of observed values

x^{'} = (x_{1}, \dots, x_{J})

in hand where

J < n

, the maximum entropy principle (MEP) from classical thermodynamics [23] and the contraction principle from the mathematical theory of probability [8] assert that the most probable

{\hat{ν}}^{*}

that is consistent with the set of

x^{'}

corresponds to minimum neg-entropy:

{\hat{ν}}^{*} = arg inf_{\hat{ν}} \{Φ (\hat{ν}) | \hat{ν} G_{J} = x^{'}\} .

(5)

The entire Gibbs’ ensemble theory arises in solving the mathematical problem posed in Equation (5) through LFT. See Appendix A for its origin.

Entropy functions for different observables are different. First, for invertible

G_{n}

, one has the entropy function for the holographic observable

x = (1, x_{1}, \dots, x_{n})

:

Φ_{x} (x) \equiv Φ (x G_{n}^{- 1}) .

(6)

This is simply a change in the independent variables from

ν

to

x

. Then in terms of this entropy function

Φ_{x}

, (5) becomes

\begin{aligned} φ (x^{'}) & = inf_{\hat{ν}} \{Φ (\hat{ν}) | \hat{ν} G_{J} = x^{'}\} \\ & = inf_{x_{J + 1}, \dots, x_{n}} \{Φ_{x} (x) | x_{1} = x_{1}^{'}, \dots, x_{J} = x_{J}^{'}\} . \end{aligned}

(7)

Intimately related to the generating function of a probability distribution, the LFT provides a powerful mathematical transform of the entropy functions

Φ (ν)

,

Φ_{x} (x)

, and

φ (x^{'})

in terms of their conjugates in the energy representation: Parallel to the

Ψ (u)

in (2) are,

\begin{matrix} Ψ_{y} (y) & = & inf_{x} \{\sum_{i = 1}^{n} x_{i} y_{i} + Φ_{x} (x)\}, \end{matrix}

(8)

\begin{matrix} ψ (y^{'}) & = & inf_{x^{'}} \{\sum_{j = 1}^{J} x_{j}^{'} y_{j}^{'} + φ (x^{'})\} . \end{matrix}

(9)

These psi’s are now related through linear transformation:

Ψ_{y} (y) = Ψ (G_{n} y),

(10)

and projection:

\begin{matrix} ψ (y^{'}) & = & Ψ_{y} (y_{1}^{'}, \dots, y_{J}^{'}, 0, \dots, 0) = Ψ (G_{J} y^{'}) \end{matrix}

(11a)

\begin{matrix} = & - ln \sum_{i = 0}^{n} p_{i} exp [- \sum_{j = 1}^{J} g_{j} (i) y_{j}^{'}] . \end{matrix}

(11b)

And finally, since

ψ

is concave (equivalently, φ is convex), the inverse LFT yields

\begin{aligned} - φ (x^{'}) & = inf_{y^{'}} \{\sum_{i = 1}^{J} x_{i}^{'} y_{i}^{'} - ψ (y^{'})\} \\ & \Longrightarrow \begin{cases} - φ = y^{'} \cdot \nabla ψ (y^{'}) - ψ (y^{'}), \\ x^{'} = \nabla ψ (y^{'}) . \end{cases} \end{aligned}

(12)

The optimization in (5) is completely “solved” in closed form, through LFT and its inverse, as a parametric function in terms of

y^{'}

given in (12).
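
The parametric solution can be made concrete with a small sketch. Assuming a four-state system with a single observable g_1(i) = i (so J = 1) and an illustrative prior, the code below finds the conjugate variable y' for a prescribed mean value x' by Newton iteration on x' = ∇ψ(y'), and then checks the relation −φ = y'·∇ψ − ψ from Equation (12) by evaluating φ directly as the neg-entropy of the optimal frequency. The function names and the Newton scheme are choices made for this example, not part of the theory itself.

```python
import math

p = [0.4, 0.3, 0.2, 0.1]     # illustrative prior on S = {0, 1, 2, 3} (assumption)
g = [0.0, 1.0, 2.0, 3.0]     # single observable g_1(i) = i, so J = 1 (assumption)

def gibbs(y):
    """Return the frequencies p_i exp(-g_i y) / Z and the normalization Z."""
    w = [q * math.exp(-gi * y) for q, gi in zip(p, g)]
    z = sum(w)
    return [wi / z for wi in w], z

def psi(y):
    """psi(y') = -ln sum_i p_i exp(-g_1(i) y'), Eq. (11b) with J = 1."""
    return -math.log(gibbs(y)[1])

def mean_and_var(y):
    nu, _ = gibbs(y)
    m = sum(v * gi for v, gi in zip(nu, g))
    var = sum(v * (gi - m) ** 2 for v, gi in zip(nu, g))
    return m, var

def solve_y(x_target, y=0.0, tol=1e-13, max_iter=50):
    """Newton iteration for mean(y) = x_target, using d(mean)/dy = -variance."""
    for _ in range(max_iter):
        m, var = mean_and_var(y)
        step = (m - x_target) / var
        y += step
        if abs(step) < tol:
            break
    return y

x_prime = 1.5                      # observed mean value (assumption); the prior mean is 1.0
y_star = solve_y(x_prime)
nu_star, _ = gibbs(y_star)

# phi(x') evaluated directly as the neg-entropy of the optimal frequency
phi = sum(v * math.log(v / q) for v, q in zip(nu_star, p))
# Right-hand side of Eq. (12): y' . grad(psi) - psi, with grad(psi) = x'
rhs = y_star * x_prime - psi(y_star)

print(y_star, phi, rhs)
```

A mean value above the prior mean requires a negative y' in this sign convention, since the weights are proportional to exp(−g y').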

The equation

- φ = y^{'} \cdot \nabla ψ - ψ

in (12) should be recognized as a generalization of the celebrated “entropy = mean internal energy − free energy”, where

{(\nabla ψ)}_{k} = \frac{\sum_{i = 0}^{n} g_{k} (i) p_{i} exp [- \sum_{j = 1}^{J} g_{j} (i) y_{j}^{'}]}{\sum_{i = 0}^{n} p_{i} exp [- \sum_{j = 1}^{J} g_{j} (i) y_{j}^{'}]}

(13)

is the mean value of

g_{k}

following Equation (11b), whose conjugate variable is

y_{k}^{'}

. The identification of

u = G_{J} y^{'}

in (11a) with the first law of thermodynamics as formulated by Gibbs seems natural.

The

y_{J + 1} = \dots = y_{n} = 0

in (11a) has a very clear thermodynamic interpretation: Since the conjugate variables

y

are the partial derivatives of the entropy function

Φ_{x}

with respect to

x

, finding x’s with maximum entropy in Equation (7) is simply setting the corresponding

y = 0

, i.e., letting the entropic force be zero. For each independent observable

g_{j}

,

y_{j}

is its “custom-designed” conjugate force and

y_{j} \times d x_{j}

contributes a term to the internal energy as the “thermodynamic work” associated with

g_{j}

: The internal energy

u

is a highly flexible, adaptive representation of the

\hat{ν}

. When

J = n

,

u = G_{n} y

and Equation (10) provides a complete “detailing” of the internal energy in terms of a set of holographic observables. MEP is for missing information [20].

4. Gibbs Distribution and Linear Algebraic Representation

There is a geometric picture associated with the above “thermodynamic analysis”. As we have stated, counting frequency ad infinitum

\hat{ν}

is a fundamental, intrinsic property of an ergodic dynamical system. The space of all possible frequency distributions

\hat{ν}

, with

{\hat{ν}}_{0} + \dots + {\hat{ν}}_{n} = 1

, is an n-dimensional hyperplane in the positive orthant of

R^{n + 1}

, known as a probability simplex

M_{n}

. For a given set of observables

(g_{1}, \dots, g_{J})

, the

M_{n}

is foliated by

\hat{ν} G_{J} = x^{'}

with different

x^{'}

. On each leaf of the foliation, there is the most probable

ν^{*} (x^{'})

, which is located at the tangent point between the

(n - J)

-dimensional leaf and a

(n - 1)

-dimensional level set of the

Φ (\hat{ν})

function. At this point,

\nabla_{ν} Φ (\hat{ν}) = - u (\hat{ν})

is the normal vector to the

x^{'}

-leaf in

R^{n + 1}

, and

\nabla φ (x^{'}) = - y^{'}

is its projection onto the J-manifold of

x^{'}

:

ν^{*} (x^{'}) = \begin{cases} ν_{i}^{*} = \frac{1}{Z (y^{'})} p_{i} exp [- \sum_{k = 1}^{J} g_{k} (i) y_{k}^{'}], \\ x_{j}^{'} = \frac{1}{Z (y^{'})} \sum_{i = 0}^{n} g_{j} (i) p_{i} exp [- \sum_{k = 1}^{J} g_{k} (i) y_{k}^{'}], \end{cases}

(14a)

in which

Z (y^{'}) = \sum_{i = 0}^{n} p_{i} exp [- \sum_{j = 1}^{J} g_{j} (i) y_{j}^{'}] .

(14b)

All the other points on the same

x^{'}

-leaf are no longer relevant: they are deemed statistically impossible under the prior

p

and observed

x^{'}

. The foliation therefore represents a partition of the

M_{n}

into macro- and micro-worlds: Traversing between different

x^{'}

-leaves are macroscopic thermodynamic processes that follow the

y^{'} (x^{'})

. According to the logic of Bayesian statistics, one should use the most suitable probability frequency distribution

{\hat{ν}}^{*} (x^{'})

to update the prior

p

for the particular system with observed

x^{'}

. The microscopic world is still random, due to missing information, but its prior is now updated. This is Gibbs’ statistical ensemble.
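
The geometric picture can be probed numerically in the smallest nontrivial case. The sketch below assumes three states, one observable g_1(i) = i, and an illustrative prior, so that each x'-leaf is a one-dimensional segment in the simplex. Moving along the leaf away from the Gibbs distribution of Equation (14a) keeps both the normalization and the mean value fixed, yet increases the neg-entropy, consistent with the tangency described above. The direction vector `d` and the step sizes are assumptions made for this example.

```python
import math

p = [0.5, 0.3, 0.2]     # illustrative prior (assumption)
g = [0.0, 1.0, 2.0]     # single observable g_1(i) = i (assumption)
y = 0.8                 # illustrative conjugate force y' (assumption)

# Gibbs distribution of Eq. (14a) with J = 1
w = [q * math.exp(-gi * y) for q, gi in zip(p, g)]
Z = sum(w)
nu_star = [wi / Z for wi in w]
x_prime = sum(v * gi for v, gi in zip(nu_star, g))

def phi(nu):
    """Neg-entropy of a normalized frequency with respect to the prior p."""
    return sum(v * math.log(v / q) for v, q in zip(nu, p))

# Direction orthogonal to both (1, 1, 1) and g: it stays on the same x'-leaf
d = [1.0, -2.0, 1.0]

def on_leaf(t):
    return [v + t * di for v, di in zip(nu_star, d)]

ts = [-0.04, -0.02, -0.01, 0.01, 0.02, 0.04]
excess = [phi(on_leaf(t)) - phi(nu_star) for t in ts]

print(nu_star, x_prime, excess)
```

Every entry of `excess` is positive, so the Gibbs distribution is the minimum of the neg-entropy on its leaf in this example.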

With a given set of

(g_{1}, \dots, g_{J})

, the

M_{n}

is collapsed into a J-manifold in

R^{n + 1}

, which is parametrized by the

x^{'}

, or equivalently

y^{'}

. There is no uncertainty in this “macroscopic” description. For a different set of g’s and

J^{'}

, there will be a different

J^{'}

-manifold. It will be desirable to treat different g’s through transformations. We note that even though

M_{n}

is a “plane” in

R^{n + 1}

, it is not a linear Euclidean space since for any

c \neq 1

,

c \hat{ν} \notin M_{n}

, and neither are the

x^{'}

-leaves. They are affine manifolds [26]. The locating of

{\hat{ν}}^{*} (x^{'})

is a highly nonlinear procedure in the space of energies.

The LFT, in terms of

Ψ (u)

,

Ψ_{y} (y)

and

ψ (y^{'})

, etc., enters as a powerful algebraic linear representation of the MEP procedure. The “collapse” of a holographic

y

to

y^{'}

with missing information means simply neglecting all the extra dimensions:

y_{J + 1} = \dots = y_{n} = 0

. This is because, owing to the convexity of

Φ (\hat{ν})

, there is a one-to-one relation between

\hat{ν}

and

u = - \nabla Φ

under a proper gauge fixing. And since the constraints to MEP in (5) are all linear due to the nature of observables being random variables, each g determines a 1-dimensional linear subspace in the space of

u

.

5. Generalized Clausius Inequality

A combination of Equations (9) and (12) yields a Clausius-inequality-like relation:

φ (x^{'}) + x^{'} \cdot y^{'} - ψ (y^{'}) \geq 0 .

(15)

The thermodynamic equilibrium is between the observed mean value

x^{'}

and its conjugate “force”

y^{'}

. When the equality holds, there is a relation between

x^{'}

and

y^{'}

which should be identified as “the equation of state”, with

\nabla_{x} φ = - y^{'}

and

\nabla_{y} ψ = x^{'}

. When

x^{'} \neq \nabla_{y} ψ (y^{'})

, the difference

x^{'} \cdot y^{'} - ψ (y^{'})

can be interpreted as the nonequilibrium heat and

φ

again as the entropy; then, the inequality in (15) becomes the Clausius’ inequality.
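
Inequality (15) can be checked numerically for independently chosen x' and y'. In the sketch below, assuming a four-state system with one observable g_1(i) = i and an illustrative prior, φ(x') is evaluated through the equation of state (found by bisection, since the mean value decreases monotonically with y'), and the left-hand side of (15) is computed on a small grid. The names `conjugate_force` and `clausius_gap` are hypothetical.

```python
import math

p = [0.4, 0.3, 0.2, 0.1]     # illustrative prior on S = {0, 1, 2, 3} (assumption)
g = [0.0, 1.0, 2.0, 3.0]     # single observable g_1(i) = i (assumption)

def psi(y):
    """psi(y') = -ln sum_i p_i exp(-g_1(i) y')."""
    return -math.log(sum(q * math.exp(-gi * y) for q, gi in zip(p, g)))

def mean_g(y):
    """Mean value of g_1 under the Gibbs distribution, i.e. d(psi)/dy'."""
    w = [q * math.exp(-gi * y) for q, gi in zip(p, g)]
    return sum(wi * gi for wi, gi in zip(w, g)) / sum(w)

def conjugate_force(x, lo=-50.0, hi=50.0):
    """Solve the equation of state mean_g(y) = x by bisection; mean_g decreases with y."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_g(mid) > x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi(x):
    """phi(x') = psi(y') - x' y' evaluated on the equation of state, from Eq. (12)."""
    y_eq = conjugate_force(x)
    return psi(y_eq) - x * y_eq

def clausius_gap(x, y):
    """Left-hand side of inequality (15)."""
    return phi(x) + x * y - psi(y)

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [-1.0, -0.3, 0.0, 0.4, 1.2]
gaps = [clausius_gap(x, y) for x in xs for y in ys]

# Equality holds when x' and y' are linked by the equation of state
y0 = 0.4
gap_on_state_equation = clausius_gap(mean_g(y0), y0)

print(min(gaps), max(gaps), gap_on_state_equation)
```

The gap vanishes only for pairs on the equation of state and is strictly positive otherwise, up to floating-point rounding.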

6. Generalized Gibbs–Duhem Equation

The celebrated Gibbs–Duhem equation in classical thermodynamics is a consequence of the entropy being an Eulerian degree 1 homogeneous function. Thus, for the

Φ (ν)

in (1), we have

\begin{aligned} & Φ (ν) = \sum_{i = 0}^{n} ν_{i} (\frac{\partial Φ}{\partial ν_{i}}), \qquad \sum_{i = 0}^{n} ν_{i} (\frac{\partial^{2} Φ}{\partial ν_{i} \partial ν_{j}}) = 0, \\ & \sum_{j = 0}^{n} d ν_{j} \sum_{i = 0}^{n} ν_{i} (\frac{\partial^{2} Φ}{\partial ν_{i} \partial ν_{j}}) = - \sum_{i = 0}^{n} ν_{i} \sum_{j = 0}^{n} (\frac{\partial u_{i}}{\partial ν_{j}}) d ν_{j} = 0, \\ & \text{that is,} \quad \sum_{i = 0}^{n} ν_{i} d u_{i} = 0, \end{aligned}

(16)

in which we have used (2b). We identify (16) as a generalized Gibbs–Duhem equation.
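
A finite-difference sketch makes the content of (16) tangible. Taking u_i = −∂Φ/∂ν_i = −ln(ν_i/(p_i N)) from Equations (1) and (2b), the code below perturbs an illustrative count vector by a small dν, computes the resulting change du in each energy, and confirms that the weighted sum Σ ν_i du_i vanishes to first order even though the individual du_i do not. The prior, counts, and perturbation are assumptions chosen for illustration.

```python
import math

p = [0.5, 0.3, 0.2]         # illustrative prior (assumption)
nu = [10.0, 25.0, 15.0]     # illustrative counts (assumption)

def phi(nu):
    """Phi(nu) of Eq. (1)."""
    total = sum(nu)
    return sum(v * math.log(v / (q * total)) for v, q in zip(nu, p))

def energy(nu):
    """u_i = -dPhi/dnu_i = -ln(nu_i / (p_i N)), from Eqs. (1) and (2b)."""
    total = sum(nu)
    return [-math.log(v / (q * total)) for v, q in zip(nu, p)]

u = energy(nu)

# Euler relation for a degree-1 homogeneous function: Phi = sum_i nu_i dPhi/dnu_i = -sum_i nu_i u_i
euler_gap = phi(nu) + sum(v * ui for v, ui in zip(nu, u))

# Small, non-uniform perturbation of the counts
d_nu = [3e-6, -1e-6, 2e-6]
u_new = energy([v + dv for v, dv in zip(nu, d_nu)])
du = [b - a for a, b in zip(u, u_new)]

# Generalized Gibbs-Duhem relation: sum_i nu_i du_i = 0 up to second-order terms
gibbs_duhem = sum(v * dui for v, dui in zip(nu, du))

print(euler_gap, du, gibbs_duhem)
```

The residual in `gibbs_duhem` is of second order in the perturbation, several orders of magnitude below the individual changes du_i.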

7. Conclusions

The mathematical theory of probability deals with a set of elementary events

S

, on which the probability

p

and random variables g’s are introduced. Applying this mathematics to the real world, each ergodic dynamical system with state space

S

has its own unique steady-state probability distribution which can be obtained as the

\hat{ν}

from i.i.d. sampling ad infinitum.

Our present theory is to statistical inference obtaining particular

\hat{ν}

’s what dynamics is to kinematics in classical mechanics [27]. The entropy function in (1) arises in this context as a measure of the quantitative relationship between the assumed, “hypothesis” (

p

) and the observed “data” (

ν

and

\hat{ν}

), as “missing information” or “surprise” [21,22]. Motivated by the analogy to the Fourier analysis, our generalized Gibbs’ theory suggests that the notion of thermo-energetics is a powerful mathematical transformation of the statistical description;

\hat{ν}

and

u

are simply two representations of the same physical reality, the former being statistical while the latter thermo-energetic.

u

is to

\hat{ν}

what chemical potential

μ

is to particle number N in J. W. Gibbs’ chemical thermodynamics. With a fixed

p

, the theory of probability [8] revealed a powerful, dual energetic representation for various different systems, with the same state space

S

, in terms of their respective internal energy functions

u

[25]. This fundamental duality between counting frequency and internal energy was of course recognized by L. Boltzmann already in the 1880s, when he was developing statistical mechanics as a foundation of classical thermodynamics under the principle of equal a priori probability. The present work shows that while probability and statistics are fundamental to the foundation of thermodynamics, mechanics is not necessary. A similar conclusion was reached in the 1925 thesis of L. Szilard [28,29].

For sufficiently large N, the probability of observing a particular

\hat{ν}

is asymptotically zero except when

\hat{ν} = p

. The significance of

Φ (\hat{ν}; p)

is to provide a “high-resolution magnifying glass” for the asymptotically small

exp \{- N Φ (\hat{ν}; p)\} .

(17)

This is known as the large deviations rate function in the modern theory of probability [8]. The entropy

- Φ (\hat{ν}; p)

is a function of both

\hat{ν}

and

p

,

Φ (\hat{ν}; p) \geq 0

and

Φ (p; p) = 0

. For a given

p

, it views each possible

\hat{ν}

from a real system as a part of an entire class of systems under a common

p

, a meta-statistics. If one chooses the true steady-state probability

π

of a particular system to replace

p

, then Equation (17) gives the probability distribution of the uncertainties in the measurement

\hat{ν}

from N samples. The second-order Taylor expansion near

π

,

e^{- N Φ (\hat{ν}; π)} ≃ exp [- \frac{N}{2} \sum_{i, j = 0}^{n} \frac{({\hat{ν}}_{i} - π_{i}) (δ_{i j} - π_{i}) ({\hat{ν}}_{j} - π_{j})}{π_{i}}],

is the central limit theorem for the statistics of counting frequency

\hat{ν}

, with

Var [{\hat{ν}}_{i}] = π_{i} (1 - π_{i}) / N

and

Cov [{\hat{ν}}_{i}, {\hat{ν}}_{j}] = - π_{i} π_{j} / N

. This is not the fluctuations within the

π

of the system itself. Gibbs’ theory of ensemble is about statistical measurements of a whole system; not about the individuals within.
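
The stated variance and covariance can be compared with a direct simulation. The sketch below draws repeated batches of N i.i.d. samples from an assumed three-state distribution π, records the counting frequencies, and estimates N·Var and N·Cov across batches. The sample sizes, the seed, and the variable names are illustrative assumptions; the estimates agree with the formulas only up to sampling error.

```python
import random

pi = [0.5, 0.3, 0.2]    # illustrative steady-state probability (assumption)
N = 200                 # samples per measurement
M = 2000                # number of repeated measurements
rng = random.Random(12345)

# Each measurement yields one counting-frequency vector nu_hat
freqs = []
for _ in range(M):
    sample = rng.choices(range(3), weights=pi, k=N)
    freqs.append([sample.count(s) / N for s in range(3)])

def mean(xs):
    return sum(xs) / len(xs)

f0 = [f[0] for f in freqs]
f1 = [f[1] for f in freqs]
m0, m1 = mean(f0), mean(f1)

# Empirical N * Var[nu_hat_0] and N * Cov[nu_hat_0, nu_hat_1]
var0_scaled = N * mean([(a - m0) ** 2 for a in f0])
cov01_scaled = N * mean([(a - m0) * (b - m1) for a, b in zip(f0, f1)])

print(var0_scaled, pi[0] * (1 - pi[0]))    # compare with pi_0 (1 - pi_0)
print(cov01_scaled, -pi[0] * pi[1])        # compare with -pi_0 pi_1
```

These are fluctuations of the measured frequency across repeated N-sample measurements, which is the sense in which the text distinguishes them from fluctuations within π itself.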

We choose to present our theory with finite state space

S

for mathematical simplicity. Formal generalization to continuous state space is straightforward if mathematical rigor is not required. Beyond the finite state space, it is known that modern probability and the theory of measures encounter challenges, cf. de Finetti’s treatment of infinite sets and the axiom of choice of nonempty subsets [20]. In addition to continuous

R^{n}

[30], there are even larger Hilbert spaces of functions on

R^{n}

and/or von Neumann algebra of operators acting on a Hilbert space.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

I thank Jin Feng, Weishi Liu, Zhang-Ju Liu, Bing Miao, Zhongmin Shen, Xiang Tang, Yong-Shi Wu, and particularly Jun Zhang, for many helpful discussions, and the support from the Olga Jung Wan Endowed Professorship.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LFT	Legendre–Fenchel transform
FT	Fourier transform

Appendix A. Statistical Counting ad Infinitum

In this section, we provide the mathematical reasoning for stating “entropy provides the infinitesimal probability for an observed frequency

\hat{ν}

with respect to a probability prior

p

”, “it characterizes the relationship between

ν

and

p

in a sampling process”, and the origin of Legendre–Fenchel transform in entropy analysis. The counting of independent and identically distributed samples with state space

S = \{0, \dots, n\}

yields

ν = (ν_{0}, \dots, ν_{n})

, a

(n + 1)

-tuple of non-negative integers. We call all the

ν

with

N = ν_{0} + \dots + ν_{n}

a simplex for counting. The simplex for counting grows with N, which we shall identify as “time”. With a given prior probability

p = (p_{0}, \dots, p_{n})

on

S

, statistical counting is a Markov process on a growing simplex, with probability:

P^{(N + 1)} (ν) = \sum_{k = 0}^{n} p_{k} P^{(N)} (ν - δ_{k}),

(A1)

in which

δ_{k} = (0, \dots, 1, \dots, 0)

is the unit vector for the

k^{\text{th}}

component. One can easily verify that

P^{(N)} (ν) = \frac{N!}{ν_{0}! \dots ν_{n}!} p_{0}^{ν_{0}} \dots p_{n}^{ν_{n}}

is a solution to (A1).
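
The verification mentioned above can also be done mechanically. The sketch below, for an assumed three-state prior expressed as exact fractions, evaluates the multinomial formula at level N + 1 and compares it with the right-hand side of (A1) built from level N, over every count vector on the simplex. The helper names `multinomial` and `recursion_rhs` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # illustrative prior (assumption)

def multinomial(nu):
    """P^(N)(nu) = N! / (nu_0! ... nu_n!) * prod_i p_i^nu_i, with N = sum(nu); zero if any count is negative."""
    if any(v < 0 for v in nu):
        return Fraction(0)
    coef = factorial(sum(nu))
    for v in nu:
        coef //= factorial(v)
    prob = Fraction(coef)
    for q, v in zip(p, nu):
        prob *= q ** v
    return prob

def recursion_rhs(nu):
    """Right-hand side of (A1): sum_k p_k P^(N)(nu - delta_k)."""
    total = Fraction(0)
    for k, q in enumerate(p):
        prev = list(nu)
        prev[k] -= 1
        total += q * multinomial(prev)
    return total

N = 5
# All count vectors on the simplex for N + 1 samples
states = [nu for nu in product(range(N + 2), repeat=3) if sum(nu) == N + 1]

all_match = all(multinomial(nu) == recursion_rhs(nu) for nu in states)
total_probability = sum(multinomial(nu) for nu in states)

print(len(states), all_match, total_probability)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.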

One is interested in the limit of counting ad infinitum, when all the

ν_{i}

’s are expected to tend to infinity as

N \to \infty

. On the increasing simplex for

ν

, the probability

P^{(N)} (ν) \to 0

. However, the properly normalized

\hat{ν} = ν / N

converges, and

P^{(N)}

as a function of the

\hat{ν}

becomes sharper and sharper, concentrated around

{\hat{ν}}^{*} = p

. To more precisely characterize this limiting situation, one introduces counting frequency

\hat{ν} = ν / N

. The space of

\hat{ν}

’s is then called a probability simplex

M_{n}

; Equation (A1) then becomes

{\tilde{P}}^{(N + 1)} (\hat{ν}) = \sum_{k = 0}^{n} p_{k} {\tilde{P}}^{(N)} (\{\frac{N + 1}{N} {\hat{ν}}_{i} - \frac{1}{N} δ_{i k}\}) .

(A2)

Its limit is a Dirac-

δ

function:

{\tilde{P}}^{(\infty)} = 0

for all

\hat{ν} \neq p

, and

{\tilde{P}}^{(\infty)} = \infty

at

\hat{ν} = p

. However, “a higher order” infinitesimal analysis shows that [8]

lim_{N \to \infty} \frac{1}{N} ln {\tilde{P}}^{(N)} (\hat{ν}) = - \sum_{i = 0}^{n} {\hat{ν}}_{i} ln (\frac{{\hat{ν}}_{i}}{p_{i}}) = - Φ (\hat{ν}) .

(A3)

It is clear that entropy function

- Φ (ν)

represents the infinitesimal prior probability

e^{- N Φ (ν)}

on

M_{n}

. For two

ν

’s with different entropy values,

Φ (ν)

and

Φ (ν^{'})

, their probabilities

P^{\infty} (ν) / P^{\infty} (ν^{'}) = 0

if

Φ (ν) > Φ (ν^{'})

. This is the origin of the maximum entropy principle (MEP).
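
The limit in (A3) can be observed numerically. The following sketch fixes an illustrative frequency that differs from an assumed prior p, evaluates (1/N) ln P^{(N)} from the multinomial formula for increasing N (using the log-gamma function to avoid overflow), and records the distance to −Φ. The chosen values and the sequence of N are assumptions for illustration.

```python
import math

p = [0.5, 0.3, 0.2]         # illustrative prior (assumption)
nu_hat = [0.2, 0.5, 0.3]    # illustrative frequency, different from p (assumption)

# Phi(nu_hat) = sum_i nu_hat_i ln(nu_hat_i / p_i), right-hand side of (A3) up to sign
phi = sum(v * math.log(v / q) for v, q in zip(nu_hat, p))

def log_multinomial(nu):
    """ln P^(N)(nu) for the multinomial distribution, computed via log-gamma."""
    N = sum(nu)
    log_coef = math.lgamma(N + 1) - sum(math.lgamma(v + 1) for v in nu)
    return log_coef + sum(v * math.log(q) for v, q in zip(nu, p))

errors = []
for N in (10, 100, 1000, 10000, 100000):
    nu = [2 * N // 10, 5 * N // 10, 3 * N // 10]   # integer counts with nu / N = nu_hat
    rate = log_multinomial(nu) / N
    errors.append(abs(rate + phi))                 # distance between (1/N) ln P and -Phi

print(phi, errors)
```

The discrepancy shrinks roughly like (ln N)/N, which reflects the sub-exponential prefactor that the large deviations limit discards.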

To understand the limit

P^{(N)} (ν) \to 0

, one can also introduce the probability generating function [8]:

W^{(N)} (u) = \sum_{ν} P^{(N)} (ν) e^{- u \cdot ν},

(A4)

in which

u \cdot ν = u_{0} ν_{0} + \dots + u_{n} ν_{n}

. Then, Equation (A1) becomes

\begin{aligned} W^{(N + 1)} (u) & = \sum_{ν} P^{(N + 1)} (ν) e^{- u \cdot ν} \\ & = \sum_{k = 0}^{n} p_{k} e^{- u \cdot δ_{k}} \sum_{ν} P^{(N)} (ν - δ_{k}) e^{- u \cdot (ν - δ_{k})} \\ & = W^{(N)} (u) e^{- Ψ (u)}, \\ \text{where} \quad Ψ (u) & = - ln \sum_{k = 0}^{n} p_{k} e^{- u_{k}} . \end{aligned}

(A5)

The free energy function

Ψ

is meaningful for all finite N. This is why the partition function is valid even for small systems in Gibbs’ theory of ensembles [13]. The Legendre–Fenchel transform of

Ψ (u)

is precisely the right-hand side of (A3):

\begin{aligned} inf_{u} \{u \cdot \hat{ν} - Ψ (u)\} & = inf_{u} \{\sum_{i = 0}^{n} {\hat{ν}}_{i} ln e^{u_{i}} + ln \sum_{k = 0}^{n} p_{k} e^{- u_{k}}\} \\ & = inf_{u} \{- \sum_{i = 0}^{n} {\hat{ν}}_{i} ln [\frac{e^{- u_{i}}}{\sum_{k = 0}^{n} p_{k} e^{- u_{k}}}]\} = - \sum_{i = 0}^{n} {\hat{ν}}_{i} ln \frac{{\hat{ν}}_{i}}{p_{i}}, \end{aligned}

(A6)

in which the optimal

e^{- u_{i}} \propto ν_{i} / p_{i}

. The Legendre–Fenchel transform arises in the limit of

N \to \infty

through Laplace’s method of evaluating asymptotic integrals, or the related Darwin–Fowler method of maximum term.

The analysis in this Appendix suggests that a proper interpretation of

p = (p_{0}, \dots, p_{n})

in Equation (A1) is not as an intrinsic property, for example, the generative model of data statistics; rather, it should be interpreted as a choice of a “gauge” in terms of which a set of counting data is represented: each particular set of data ad infinitum is represented by the energy function

u

, not

p

, and the

\hat{ν}

is gauge invariant via the Boltzmann relation

{\hat{ν}}_{i} \propto p_{i} e^{- u_{i}}

—this yields an i.i.d. generative model. Probability is not for generative models, it is for analyzing empirical measurements on random variables.

References

1. Førland, K.S.; Førland, T.; Kjelstrup, S. Irreversible Thermodynamics: Theory and Applications; John Wiley & Sons: Chichester, UK, 1988.
2. Kjelstrup, S.; Bedeaux, D. Non-Equilibrium Thermodynamics of Heterogeneous Systems; Series on Advances in Statistical Mechanics; World Scientific: Singapore, 2008; Volume 16.
3. Bedeaux, D.; Kjelstrup, S.; Schnell, S.K. Nanothermodynamics Theory and Applications; World Scientific: Singapore, 2023.
4. Guggenheim, E.A. Modern Thermodynamics by the Methods of Willard Gibbs; Methuen & Co.: New York, NY, USA, 1933.
5. Hill, T.L. A different approach to nanothermodynamics. Nano Lett. 2001, 1, 273–275.
6. Khinchin, A.Y. Mathematical Foundations of Statistical Mechanics; Dover: New York, NY, USA, 1949.
7. Touchette, H. The large deviation approach to statistical mechanics. Phys. Rep. 2009, 478, 1–69.
8. Dembo, A.; Zeitouni, O. Large Deviations Techniques and Applications, 2nd ed.; Springer: New York, NY, USA, 1998.
9. Anderson, P.W. More is different: Broken symmetry and the nature of the hierarchical structure of science. Science 1972, 177, 393–396.
10. Alm, J.F.; Walker, J.S. Time-frequency analysis of musical instruments. SIAM Rev. 2002, 44, 457–476.
11. Fourier, J.B.J. The Analytic Theory of Heat; Freeman, A., Translator; Cambridge University Press: London, UK, 1878.
12. Truesdell, C. Rational Thermodynamics; Springer: New York, NY, USA, 1984.
13. Lu, Z.; Qian, H. Emergence and breaking of duality symmetry in thermodynamic behavior: Repeated measurements and macroscopic limit. Phys. Rev. Lett. 2022, 128, 150603.
14. Galteland, O.; Bering, E.; Kristiansen, K.; Bedeaux, D.; Kjelstrup, S. Legendre-Fenchel transforms capture layering transitions in porous media. Nanoscale Adv. 2022, 4, 2660–2670.
15. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970.
16. Qian, M.; Xie, J.S.; Zhu, S. Smooth Ergodic Theory for Endomorphisms; Lecture Notes in Mathematics; Springer: Berlin, Germany, 2009; Volume 1978.
17. Dorfman, J.R. An Introduction to Chaos in Nonequilibrium Statistical Mechanics; Cambridge Lect. Notes in Phys.; Cambridge University Press: London, UK, 1999.
18. Mackey, M.C. The dynamic origin of increasing entropy. Rev. Mod. Phys. 1989, 61, 981–1015.
19. Qian, H.; Ge, H. Stochastic Chemical Reaction Systems in Biology; Lect. Notes on Math. Modelling in the Life Sci.; Springer Nature: Cham, Switzerland, 2021.
20. Jaynes, E.T. Probability Theory: The Logic of Science; Cambridge University Press: London, UK, 2003.
21. Levine, R.D. Information theory approach to molecular reaction dynamics. Annu. Rev. Phys. Chem. 1978, 29, 59–92.
22. Ben-Naim, A. A Farewell to Entropy: Statistical Thermodynamics Based on Information; World Scientific: Singapore, 2008.
23. Callen, H.B. Thermodynamics and an Introduction to Thermostatistics, 2nd ed.; Wiley: New York, NY, USA, 1991.
24. Commons, J.; Yang, Y.J.; Qian, H. Duality symmetry, two entropy functions, and an eigenvalue problem in Gibbs’ theory. arXiv 2021.
25. Qian, H. Statistical chemical thermodynamics and energetic behavior of counting: Gibbs’ theory revisited. J. Chem. Theory Comput. 2022, 18, 6421–6436.
26. Hong, L.; Qian, H.; Thompson, L.F. Representations and divergences in the space of probability measures and stochastic thermodynamics. J. Comput. Appl. Math. 2020, 376, 112842.
27. Goldstein, H. Classical Mechanics; Addison-Wesley: New York, NY, USA, 1951.
28. Szilard, L. Über die Ausdehnung der phänomenologischen Thermodynamik auf die Schwankungserscheinungen. Z. Physik 1925, 32, 753–788.
29. Mandelbrot, B. On the derivation of statistical thermodynamics from purely phenomenological principles. J. Math. Phys. 1964, 5, 164–171.
30. Miao, B.; Qian, H.; Wu, Y.S. Emergence of Newtonian deterministic causality from stochastic motions in continuous space and time. arXiv 2024.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
