1. Introduction
There are in fact two distinct paradoxes that go under the heading of the Gibbs paradox. The original one was formulated by Josiah Willard Gibbs in 1875 [
1]. It addresses the mixing of two quantities of ideal gas, and the entropy change that occurs as a result of the mixing process. The paradox arises from the difference between two scenarios: one in which two quantities of the same gas are mixed, and one in which the two gases being mixed are of different kinds. First, in the case of mixing two quantities of the same gas, thermodynamics tells us that there is no change of entropy: we could restore the initial thermodynamical state of the mixing process simply by putting the original partitioning back in place, at no entropy cost. It immediately follows that no change in entropy occurs when two quantities of the same kind of gas are mixed. Secondly, in contrast to this, when the two gases are of different kinds, the final state of the mixing process differs from the initial state not only in a microscopic but also in a macroscopic description, and now an entropy increase does occur. It is well known that this entropy increase is equal to $2kN\ln 2$ (for the mixing of two equal portions of $N$ particles each), a number which interestingly depends on the amounts of gases mixed, but not on their nature.
The paradoxical point, as noted by Gibbs, is that this leads to a discontinuity in the entropy of mixing, in an imagined sequence of mixing processes in which gases are mixed that are made more and more similar. Gibbs writes:
“Now we may without violence to the general laws of gases which are embodied in our equations suppose other gases to exist than such as actually do exist, and there does not appear to be any limit to the resemblance which there might be between two such kinds of gas. However, the increase of entropy due to the mixing of given volumes of the gases at a given temperature and pressure would be independent of the degree of similarity between them.”
Thus, in the imagined limit in which the gases become equal, the entropy of mixing drops to zero discontinuously. This is the original Gibbs paradox.
Both in textbooks on statistical physics and in the more recent literature on the subject another version of the Gibbs paradox figures more prominently, which deals with the mixing of equal gases only. The paradox now is the fact that a straightforward calculation of the entropy of mixing leads to the previously found quantity $2kN\ln 2$. This time, however, this value for the entropy increase holds for the mixing of equal gases, contrary to the thermodynamical result according to which entropy remains constant. Note that a solution of this second paradox would therefore amount to a restoration of the first paradox.
Both paradoxes arise within classical physics. Nevertheless, in both cases, quantum mechanics has been appealed to for a solution. In the case of the original paradox, it is argued that, due to the impossibility of letting the two substances become equal in a continuous process, the paradox disappears, as the following quote from Pauli shows:
“The increase in entropy is always finite, even if the two gases are only infinitesimally different. However, if the two gases are the same, then the change in entropy is zero. Therefore, it is not allowed to let the difference between two gases gradually vanish. (This is important in quantum theory.)”
In a similar vein, it is often argued that the quantized discreteness of nature explains that there cannot exist an arbitrarily small difference between two substances. In a textbook that is still widely used, Reif writes:
“Just how different must molecules be before they should be considered distinguishable [...]? In a classical view of nature, two molecules could, of course, differ by infinitesimal amounts [...] In a quantum description, this troublesome question does not arise because of the quantized discreteness of nature [...] Hence, the distinction between identical and non-identical molecules is completely unambiguous in a quantum-mechanical description. The Gibbs paradox thus foreshadowed already in the last [i.e., nineteenth] century conceptual difficulties that were resolved satisfactorily only by the advent of quantum mechanics.”
In the case of the second paradox, it is often argued that classical physics gives incorrect results, since it mistakenly takes microstates that differ only by a permutation of particles to be different. Thus, when counting the number of microstates that give rise to the same macrostate, too many microstates are counted. For example, in another once popular textbook, Huang writes:
“It is not possible to understand classically why we must divide [the number of states with energy smaller than E] by $N!$ to obtain the correct counting of states. The reason is inherently quantum mechanical.”
Similarly, Schrödinger claims that it is quantum mechanics that solves the Gibbs paradox by pointing out that permutations of like particles should not be counted as different states:
“It was a famous paradox pointed out for the first time by W. Gibbs, that the same increase of entropy must not be taken into account, when the two molecules are of the same gas, although (according to naive gas-theoretical views) diffusion takes place then too, but unnoticeably to us, because all the particles are alike. The modern view (i.e., quantum mechanics) solves this paradox by declaring that, in the second case, there is no real diffusion because exchange between like particles is not a real event—if it were, we should have to take account of it statistically. It has always been believed that Gibbs’s paradox embodied profound thought. That it was intimately linked up with something so important and entirely new (i.e., quantum mechanics) could hardly be foreseen.”
Yet there is something peculiar in these appeals to quantum mechanics. After all, both versions of the paradox appear in a classical setting. It might very well be that quantum physics gives us a better description of physical reality. However, it would be strange indeed if it were needed to solve puzzles that are interior to classical theories. Rather than turning to quantum mechanics for a solution to the Gibbs paradox, I will argue in this paper that we can learn important lessons by reverting to elementary thermodynamics, and especially to the way in which thermodynamical entropy is intimately related to exchanges of heat, energy, work and particles in reversible processes. The second paradox consists in a difference between thermodynamical and statistical mechanical calculations of the entropy increase when equal gases are mixed. I will show how this difference disappears when the statistical mechanical entropy is introduced in a way that does justice to its thermodynamical origin, by paying close attention to the variation of entropy and the other thermodynamical quantities in reversible processes, rather than simply by counting the number of microstates that lead to the same macrostate.
This paper is structured as follows. In
Section 2, I will start by presenting an account of the entropy of mixing that shows how confusions and incorrect results arise when one does not pay close enough attention to the connection between entropy and reversible processes. More specifically, determining the way in which entropy depends on the number of particles by neglecting constants of integration or by fixing these by conventional stipulations could lead to incorrect results. For a correct determination of the entropy difference as a result of a mixing process, one needs to calculate entropy changes during processes in which particle number is actually allowed to vary. Next, I will present three different ways in which one may correctly arrive at the entropy of mixing in thermodynamics. I will briefly indicate why the discontinuous change in the entropy of mixing when gases are considered that are more and more similar should not be seen as paradoxical, whereby the first aspect of the Gibbs paradox is dissolved. In
Section 3, I will first discuss the way in which entropy can be introduced in statistical mechanics while being faithful to thermodynamics. I will argue that the Gibbsian framework is much better suited than the Boltzmannian framework to give us proper counterparts of thermodynamical quantities. Moreover, within the Gibbsian framework, it is only the grandcanonical ensemble that is a suitable device for describing processes in which particle numbers vary, such as mixing processes. I will present two different ways in which one could motivate the appearance of the factor $1/N!$ in the grandcanonical distribution, neither of which makes an appeal to the indistinguishability of particles. With the grandcanonical ensemble and this factor of $1/N!$ in place, one easily recovers the thermodynamical results for the entropy of mixing, both for the mixing of equal and unequal gases, whereby the second aspect of the Gibbs paradox is solved.
2. Formulating the Gibbs Paradox in Thermodynamics
Let us start our discussion of the Gibbs paradox in thermodynamics by going over a standard way of deriving the entropy of mixing. This derivation will, perhaps surprisingly, turn out to be quite problematic, and a source of confusion. The setup is simple and familiar. We consider the mixing of two equal portions of a monatomic ideal gas of equal temperature, volume and number of particles. We want to know the entropy of mixing, that is, the difference between the entropy of the initial state in which the two portions of gas are confined to their own part of the container of volume
V, and the final state in which the partition is removed and the gas is spread out over the whole container of volume
$2V$. One arrives at the expression for the entropy of an ideal monatomic gas by starting from the fundamental equation $\mathrm{d}S = \frac{1}{T}\,\mathrm{d}U + \frac{p}{T}\,\mathrm{d}V$, and filling in the ideal gas law $pV = NkT$ and $U = \frac{3}{2}NkT$. We get

$$S(T, V) = \tfrac{3}{2}\,Nk \ln T + Nk \ln V + C_1 .$$

A straightforward calculation of the entropy of mixing now is

$$\Delta S = 2\left[\tfrac{3}{2}\,Nk \ln T + Nk \ln 2V\right] - 2\left[\tfrac{3}{2}\,Nk \ln T + Nk \ln V\right] = 2Nk \ln 2 , \qquad (3)$$
which gives us the well-known result. So far so good. Or is it? We can see that something is not quite in order when we repeat the derivation, starting from expressions for the entropy in terms of a different set of variables. In terms of the pressure p and temperature T, we have

$$S(T, p) = \tfrac{5}{2}\,Nk \ln T - Nk \ln p + C_2 ,$$

from which the same calculation gives

$$\Delta S = 0 , \qquad (4)$$

and if we take the variables to be pressure and volume, we get

$$S(p, V) = \tfrac{3}{2}\,Nk \ln p + \tfrac{5}{2}\,Nk \ln V + C_3 ,$$

with

$$\Delta S = 5Nk \ln 2 . \qquad (8)$$
Clearly, these results contradict each other. What is going on here? Well, in all three derivations, the additive constants $C_1$, $C_2$ and $C_3$ have been set to zero. Since these are constants of integration, they obviously cannot depend on the variables, but they may still depend on anything else, including the number of particles! The above expressions for the entropy simply do not fully specify the way in which entropy varies with particle number: they treat particle number as a parameter, not as a variable. In fact, one easily checks that

$$C_2(N) = C_1(N) + Nk \ln(Nk) , \qquad C_3(N) = C_1(N) - \tfrac{3}{2}\,Nk \ln(Nk) ,$$

and so it is clear that setting all three constants to zero leads to incompatible results. These constants may still depend on the number of particles, and it turns out that they are three different functions of particle number. Setting any one of the constants $C_1$, $C_2$ and $C_3$ to zero is thus a conventional choice that stands in need of justification.
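The contradiction can be made concrete in a few lines. The sketch below (the values of $T$, $V$ and $N$ are arbitrary illustrative choices) evaluates the three entropy expressions with every integration constant conventionally set to zero, and computes the entropy of mixing in each case:

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)

# Entropy of a monatomic ideal gas in three sets of variables,
# with every integration constant conventionally set to zero.
def S_TV(T, V, N):
    return N * k * (1.5 * math.log(T) + math.log(V))

def S_Tp(T, p, N):
    return N * k * (2.5 * math.log(T) - math.log(p))

def S_pV(p, V, N):
    return N * k * (1.5 * math.log(p) + 2.5 * math.log(V))

# Two equal portions (T, V, N) are mixed into (T, 2V, 2N);
# the pressure p = NkT/V is unchanged by the mixing.
N = 1e22
T = 300.0
V = 1e-3
p = N * k * T / V

dS_TV = S_TV(T, 2 * V, 2 * N) - 2 * S_TV(T, V, N)
dS_Tp = S_Tp(T, p, 2 * N) - 2 * S_Tp(T, p, N)
dS_pV = S_pV(p, 2 * V, 2 * N) - 2 * S_pV(p, V, N)

print(dS_TV / (N * k))  # ≈ 2 ln 2
print(dS_Tp / (N * k))  # ≈ 0
print(dS_pV / (N * k))  # ≈ 5 ln 2
```

The three answers, $2Nk\ln 2$, $0$ and $5Nk\ln 2$, differ precisely because the neglected constants are different functions of $N$.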
There is another peculiar feature of the above derivations. Have we attempted to derive the entropy change when two different ideal gases were mixed, or two portions of the same gas? No assumption as to the nature of the ideal gas went into the derivations. One way of looking at this is to note that the additive constants may not only still depend on the particle number, but also on the kinds of gases involved. Somehow, the derivation with variables $(T, V)$ leads us to the correct result for the mixing of two different gases, and the derivation with $(T, p)$ leads us to the correct result for the mixing of two portions of the same gas. This, however, so far seems to be just a coincidence. We need to have a better look at thermodynamical theory, and redo the derivation of the entropy of mixing.
Let us go back to the basics (for a more detailed account, I refer to [
Entropy is introduced into the orthodox theory of thermodynamics as the state variable one gets when integrating $\delta Q / T$ along a curve in equilibrium space:

$$S(B) - S(A) = \int_A^B \frac{\delta Q}{T} . \qquad (10)$$

Here, $\delta Q$ is the amount of heat that is exchanged in an infinitesimal process. It is an inexact differential, and Q is not a state variable, i.e., it cannot be expressed as a function on equilibrium space. The differential $\mathrm{d}S = \delta Q / T$, on the other hand, is exact, and
S is a state variable. By the definition in Equation (10), only entropy differences are defined; that is, entropy is defined up to a constant of integration. Moreover, entropy differences are only defined between equilibrium states that can be connected by a process that takes place entirely in equilibrium space, i.e., by a quasistatic process. This means, for example, that entropy values for non-equilibrium states have so far not been defined. Neither are entropy differences defined between, say, one mole of oxygen and one mole of argon, since these cannot be connected by a quasistatic process.
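That the integral in the Clausius definition yields a state function can be illustrated numerically. For a quasistatic ideal-gas process, $\delta Q = \mathrm{d}U + p\,\mathrm{d}V$, so $\delta Q / T = \frac{3}{2}Nk\,\mathrm{d}T/T + Nk\,\mathrm{d}V/V$. The sketch below (path shapes, temperatures, volumes and particle number are arbitrary illustrative choices) integrates this form along two different quasistatic paths between the same equilibrium states:

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)
N = 1e22          # number of particles (illustrative value)

def delta_S(path, n=20000):
    """Numerically integrate delta_Q / T = (3/2) N k dT/T + N k dV/V
    along a parametrized quasistatic path t -> (T(t), V(t)), t in [0, 1]."""
    total = 0.0
    for i in range(n):
        T0, V0 = path(i / n)
        T1, V1 = path((i + 1) / n)
        # midpoint rule for the line integral of the differential form
        total += 1.5 * N * k * (T1 - T0) / (0.5 * (T0 + T1))
        total += N * k * (V1 - V0) / (0.5 * (V0 + V1))
    return total

# Two different quasistatic paths between the same equilibrium states,
# from (300 K, 1 L) to (450 K, 2 L):
path_a = lambda t: (300.0 + 150.0 * t, 1e-3 * (1.0 + t))      # straight line
path_b = lambda t: (300.0 * 1.5 ** t, 1e-3 * 2.0 ** (t * t))  # curved path

exact = N * k * (1.5 * math.log(1.5) + math.log(2.0))
print(delta_S(path_a), delta_S(path_b), exact)  # all three agree
```

The heat exchanged along the two paths differs, but the integral of $\delta Q / T$ does not, which is exactly what makes S a state variable.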
There are various ways in which one may extend the definition of entropy. One may, for example, fix the integration constant by reference to a fiducial state. This, however, will not help us out in the case of the Gibbs paradox, since this method does not work for comparing entropy values of different kinds of gas, again since these cannot be connected by a quasistatic process. Another common convention is to appeal to the third law of thermodynamics, which states that all entropy differences (as far as they are defined!) approach zero when the absolute temperature approaches zero. This invites the conventional choice of setting not only all entropy differences but also all entropy values to zero in this limit. Unfortunately, this again will not be a convenient choice for the setting of the Gibbs paradox, since classical ideal gases do not obey the third law. Another conventional choice is to take the entropy to be
extensive, that is, require that the entropy increases by a factor of q when the system as a whole is increased by a factor of q. Note that this is a rough characterisation, since it is not always clear what it means for a system to increase by a certain factor, especially not for inhomogeneous systems. However, it is sufficiently clear in cases where entropy is given as a function of other state variables that themselves are clearly intensive (such as temperature and pressure) or extensive (such as volume and number of particles). We may then require, say, that $S(T, qV, qN) = q\,S(T, V, N)$. Note, incidentally, that this requirement immediately yields $\Delta S = 0$ for the entropy difference in Equation (3). However, another extension of the definition of entropy is to require
additivity, that is, take the entropy of a composite system to be the sum of the entropies of the constituents, also when those constituents are not in equilibrium with one another. (Note that additivity indeed differs from extensivity. The composite system one gets when combining two containers of volume
V still includes the walls of both containers. Extensivity, on the other hand, applies to extending systems without placing walls between subsystems.) This opens up the whole field of non-equilibrium thermodynamics.
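To see what the extensivity requirement does in the ideal-gas case discussed above: it forces the integration constant to take the form $C_1(N) = -Nk\ln N + cN$ (with $c$ an arbitrary constant), which is the general solution of $S(T, qV, qN) = q\,S(T, V, N)$. A minimal numeric sketch, with illustrative values, checking both the requirement and its consequence for the mixing of two equal portions:

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)

def S_extensive(T, V, N, c=0.0):
    """Ideal-gas entropy with the integration constant chosen as
    C1(N) = -N k ln N + c N, the general solution of the extensivity
    requirement S(T, qV, qN) = q S(T, V, N)."""
    return N * k * (1.5 * math.log(T) + math.log(V / N)) + c * N

T, V, N = 300.0, 1e-3, 1e22   # illustrative values

# The extensivity requirement itself:
q = 3.0
assert abs(S_extensive(T, q * V, q * N) - q * S_extensive(T, V, N)) < 1e-9

# Mixing two equal portions: (T, V, N) + (T, V, N) -> (T, 2V, 2N)
dS = S_extensive(T, 2 * V, 2 * N) - 2 * S_extensive(T, V, N)
print(dS)  # 0 up to rounding
```

With the N-dependence fixed in this way, the entropy of mixing of two equal portions vanishes, in agreement with the thermodynamical result.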
However, in extending the definition of entropy, one needs to proceed with care. Some extensions, such as additivity, simply enlarge the applicability of the notion. Others, however, bear the risk of fixing entropy differences or absolute values that are already fixed by the definition in Equation (10). Thus, what about extensivity of entropy? Can this safely be assumed? In many cases, it is simply stated that entropy in thermodynamics clearly
is extensive (see for example [
7,
8]). Based on the definition in Equation (10), this cannot be quite right, since that definition does not determine entropy values, only differences, so that the issue of extensivity is underdetermined. A more careful claim would be that extending the definition of entropy by requiring extensivity does not lead to conflict with the rest of thermodynamics. However, one may wonder whether this is actually the case. One interesting account in this respect is given by Landsberg [9], who turns the claim that it is possible to consider all of the most important thermodynamical quantities as either intensive or extensive into a fourth law of thermodynamics. This, in my view, only highlights the conceptual possibility that entropy is
not extensive. Thus, the most careful thing to do here is to refrain from extending the definition of entropy, and to make use of Equation (
10) in order to calculate entropy differences along quasistatic processes. Fixing the integration constants by conventions would run the risk of introducing conflicts with entropy differences that are
already determined.
Let us return to the entropy of mixing. If we want to improve on the derivations given above, we should not make unwarranted assumptions about the N-dependence of the entropy. A parsimonious derivation is to be preferred over one that makes abundant use of extra assumptions on top of the original definition of entropy differences, and of course conflicting assumptions such as setting all three constants $C_1$, $C_2$ and $C_3$ to zero should be avoided at all cost. A further desideratum is that it should be clear from the derivation whether it applies to the mixing of equal or different gases. Fortunately, several such derivations are available, which however interestingly differ with respect to the exact assumptions that are appealed to. I will discuss three of them in turn.
The most straightforward thing to do is to make use of expressions for the entropy that truly treat particle number as a variable rather than as a parameter. That is, we add to the fundamental equation terms that take into account varying particle numbers:

$$\mathrm{d}S = \frac{1}{T}\,\mathrm{d}U + \frac{p}{T}\,\mathrm{d}V - \sum_i \frac{\mu_i}{T}\,\mathrm{d}N_i ,$$

so that it becomes possible to calculate the way in which entropy varies with the number of particles of kind $i$. We follow a derivation given by Denbigh [
10] (pp. 111–118), which applies to mixtures of perfect gases. A perfect gas is more general than an ideal gas, since its specific heat may be an arbitrary function of the temperature, rather than the constant $\frac{3}{2}Nk$ of the monatomic ideal gas. Denbigh defines a perfect gas mixture as one in which each component satisfies

$$\mu_i = \mu_i^{0}(T) + kT \ln p_i ,$$

where $p_i$ is the partial pressure of component $i$ and $\mu_i^{0}(T)$ depends on temperature alone. Denbigh further assumes both additivity and extensivity of entropy, and calculates the entropy $S(T, p, N_A, N_B)$. It then straightforwardly follows that when two different gases are mixed, the entropy increases by

$$\Delta S = 2Nk \ln 2 , \qquad (14)$$

and when two equal gases are mixed, the entropy remains constant. We thus have the familiar results, on the basis of assumptions of additivity and extensivity of entropy and a definition of the chemical potential of perfect gases. How about the difference between the cases of mixing different or equal gases? One could say that this difference is introduced by definition: for each kind of gas, a term $-\frac{\mu_i}{T}\,\mathrm{d}N_i$ is included in the fundamental equation. Suppose we decided to treat a pure gas as a mixture of two portions of the same kind of gas, and added a term $-\frac{\mu_i}{T}\,\mathrm{d}N_i$ for each of those portions. We would then also find the entropy increase of Equation (14) in the case of mixing the same gas! This leaves us with a somewhat unsatisfactory treatment of the difference between the two cases.
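The bookkeeping behind this result can be sketched numerically. The following is an illustration, not Denbigh's actual derivation: it uses the standard perfect-gas entropy per component, $N_i k\,(\frac{5}{2}\ln T - \ln p_i)$, with integration constants dropped since they cancel in the differences taken here; all numerical values are arbitrary.

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)

def S_mixture(T, components):
    """Entropy of a perfect gas mixture, each component contributing
    N_i k (5/2 ln T - ln p_i); integration constants are dropped,
    since only differences at fixed T and composition are taken."""
    return sum(N_i * k * (2.5 * math.log(T) - math.log(p_i))
               for N_i, p_i in components)

T, p, N = 300.0, 1e5, 1e22   # illustrative values

# Different gases A and B: after mixing, each has partial pressure p/2.
before = S_mixture(T, [(N, p)]) + S_mixture(T, [(N, p)])
after = S_mixture(T, [(N, p / 2), (N, p / 2)])
dS_diff = after - before
print(dS_diff / (N * k))  # ≈ 2 ln 2

# The same gas: one component, whose pressure stays p after mixing.
before = S_mixture(T, [(N, p)]) + S_mixture(T, [(N, p)])
after = S_mixture(T, [(2 * N, p)])
dS_same = after - before
print(dS_same / (N * k))  # ≈ 0
```

The difference between the two cases enters exactly where the text says it does: in the decision to describe the gas with one component term or with two.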
Another derivation of the entropy of mixing has been given by Planck [
11]; it is both sparser in its assumptions, and clearer on the distinction between mixing different or equal gases. Planck does not fully calculate the entropy as a function
of the state parameters. Instead, he directly calculates entropy differences along quasistatic processes only for the mixing processes of interest. For this, he uses a construction with semi-permeable membranes, which allow particles of one kind to go through. Suppose we start with a container of volume
V that contains a mixture of two gases
A and
B, and with two membranes: one on the far left side that lets through only particles of kind
A, and one on the far right side that lets through only particles of kind
B. These containers can now be slowly extended like a telescope, leaving us in the end with two containers of volume
V each, where the left container is filled by gas
A and the right container is filled by gas
B. Suppose further that Dalton’s law holds, that is, the pressure of a gas equals the sum of partial pressures. Then, it follows that, during the extension, no work is done, and moreover, the total energy and amount of particles remain constant. This extension therefore leaves the entropy constant. Next, the volume of each container is compressed in a quasistatic, isothermal process to
$V/2$, resulting in an entropy change of $-Nk \ln 2$ for each of the two containers. Now, considering the whole process in reverse so that we start with the separate gases and end with the mixture, we arrive at an entropy increase of $2Nk \ln 2$. It is immediately clear that this construction does not apply to the mixing of two portions of the same gas. After all, a semi-permeable membrane that is transparent to gas
A but not to gas
A cannot exist. For the mixing of equal gases, we simply appeal to the reasoning given earlier, namely that we can restore the initial thermodynamical state simply by putting back the partition, at no entropy cost. Therefore, there is no entropy of mixing in this case.
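Planck's two-step bookkeeping can be written out explicitly. A minimal sketch of the separation process and its reverse (the values of $T$, $V$ and $N$ are arbitrary illustrative choices):

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)
N = 1e22      # particles of each kind (illustrative value)
T = 300.0
V = 1e-3      # volume of the container holding the A+B mixture

# Step 1: telescoping the semi-permeable membranes apart does no work
# and exchanges no heat (by Dalton's law), so the entropy is unchanged.
dS_extension = 0.0

# Step 2: each separated gas, now alone in volume V, is compressed
# isothermally and quasistatically to V/2; dS = Nk ln(V_final / V_initial).
dS_compression = 2 * (N * k * math.log((V / 2) / V))

# Separation overall, and mixing as the reverse process:
dS_separation = dS_extension + dS_compression   # = -2 Nk ln 2
dS_mixing = -dS_separation
print(dS_mixing / (N * k))  # ≈ 2 ln 2 ≈ 1.386
```

All of the entropy change is produced in the isothermal compressions, where Equation (10) applies directly; the membrane step contributes nothing.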
A third derivation of the entropy of mixing has been given in a wonderful paper by Van Kampen [
12]. Now, however, the entropy of mixing is understood to be the absolute value of the entropy of a mixture, not the entropy difference as the result of a mixing process. Van Kampen is careful and explicit in formulating conventions with respect to the entropy value, and takes great care not to compare entropy values “belonging to different
N, unless one introduces a new kind of process by which
N can be varied in a reversible way” [
12] (p. 305). The first convention he appeals to is to take the integration constants equal for systems that are identical. The second convention is the additivity of entropy. These conventions make it possible to derive an expression for entropy that still contains an integration constant, but for which the dependence on particle number has been fixed. For a single gas, the procedure is simply to remove a partition between containers with different portions of that single gas. Van Kampen finds

$$S(T, V, N) = \tfrac{3}{2}\,Nk \ln T + Nk \ln \frac{V}{N} + N s_0 . \qquad (16)$$

For a mixture, the procedure is to mix or separate the gases by making use of semi-permeable membranes. On this basis, Van Kampen finds

$$S(T, V, N_A, N_B) = \tfrac{3}{2}\,(N_A + N_B)\,k \ln T + N_A k \ln \frac{V}{N_A} + N_B k \ln \frac{V}{N_B} + N_A s_A + N_B s_B , \qquad (17)$$

where also Dalton’s law has been used. One notes that the entropy of a mixture in Equation (
17) does not reduce to the entropy of a pure gas in Equation (
16) when
A and
B are the same, and $N = N_A + N_B$. This, according to Van Kampen, is the Gibbs paradox. There is an interesting parallel with the remark we made above about the entropy of mixing in Equation (14), where the value of the entropy of mixing also seemed to depend on whether we choose to consider a gas as consisting of one single gas
A, or of a mixture of two portions of that very gas. Van Kampen, however, unlike Denbigh, treats the difference between mixtures of different or equal gases with great care. He extends the definition of entropy by an appeal to reversible processes that connect systems of different particle number. The process for doing so in the case of equal gases, namely the removal or addition of a partition, clearly cannot be used for reversibly mixing or separating different gases. Conversely, mixing or separating gases by means of semi-permeable membranes is applicable to the case of unlike gases only. Thus, where Denbigh’s derivation introduces the distinction by definition, Van Kampen gives us an explanation of this distinction. How does Van Kampen’s treatment of the Gibbs paradox fare with respect to our other desideratum, parsimony of assumptions? In this respect, Planck’s treatment is still superior. Both assume Dalton’s law, and appeal to semi-permeable membranes that can be used to separate the two kinds of gas. However, on top of this, Van Kampen makes use of two conventions that fix entropy values, rather than just entropy differences in a mixing process. Planck’s derivation shows that this is not necessary in order to derive the difference between mixing the same or different gases, and thus in order to arrive at the Gibbs paradox. The core of the original, thermodynamical paradox thus lies in the procedures by which the definition of entropy
differences can be extended to cases in which particle number varies. It does not lie in the conventions that can be used in order to fix the
value of entropy.
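Van Kampen's non-reduction claim can be checked directly. The sketch below uses monatomic ideal-gas forms of his two expressions, with all per-particle constants set equal and arbitrary illustrative values; treating a single gas as a "mixture" of two equal portions of itself yields an entropy that differs from the pure-gas value by $Nk\ln 2$:

```python
import math

k = 1.380649e-23  # Boltzmann constant (J/K)

def S_pure(T, V, N, s0=0.0):
    """Pure-gas entropy (Eq. (16)-like form); s0 is the remaining
    per-particle integration constant."""
    return N * k * (1.5 * math.log(T) + math.log(V / N)) + N * s0

def S_mix(T, V, NA, NB, sA=0.0, sB=0.0):
    """Mixture entropy (Eq. (17)-like form): each component sees the
    full volume V but only its own particle number."""
    return (1.5 * (NA + NB) * k * math.log(T)
            + NA * k * math.log(V / NA) + NB * k * math.log(V / NB)
            + NA * sA + NB * sB)

T, V, N = 300.0, 1e-3, 1e22   # illustrative values

# Treat a single gas as a 'mixture' of two equal portions of itself:
gap = S_mix(T, V, N / 2, N / 2) - S_pure(T, V, N)
print(gap / (N * k))  # ≈ ln 2: the two expressions do not coincide
```

The gap of $Nk\ln 2$ is exactly the non-reduction of Equation (17) to Equation (16) that Van Kampen identifies as the Gibbs paradox.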
About solutions to the original, thermodynamical paradox, I want to be brief. The modern standard response is that there is indeed a discontinuity between mixing the same or different gases, but that there is nothing remarkable about that [
12,
13]. The construction with the semi-permeable membranes shows that differences between the two cases should not surprise us. Some authors [
14] further motivate this viewpoint by an appeal to a subjective interpretation of entropy, according to which entropy measures the amount of information that is available to a subject. On such a view, it is up to the experimenter to regard two gases either as equal or as different, and the entropy of mixing depends on this subjective choice. Other authors [
15] further motivate the viewpoint by an appeal to an operationalist approach to thermodynamics, according to which the meaning of thermodynamical notions is given by a set of operations (which may be either physical, or “pencil and paper operations”). However, since the operations that define the entropy of mixing of two portions of the same gas differ from those of mixing different gases, a discontinuous change in the entropy of mixing is not considered to be remarkable. We may, however, abstract away from these two particular motivations. One need not be committed to either subjective approaches to statistical physics or to operationalism to appreciate the general point that there is nothing paradoxical about the discontinuity.
The lessons that we should learn from the thermodynamical paradox do not concern the solution to the original paradox, but rather the way in which entropy differences are tied to reversible processes, and the question of whether the definition of entropy needs to be extended by conventions such as extensivity. Jaynes [
14] makes interesting observations about this in a discussion of Pauli’s account (which is similar to Van Kampen’s account discussed above), in which Pauli requires extensivity in order to fix entropy values. Jaynes writes:
“Note that the Pauli analysis has not demonstrated from the principles of physics that entropy actually should be extensive; it has only indicated the form our equations must take if it is. However, this leaves open two possibilities:
(a) All this is tempest in a teapot; the Clausius definition (i.e., $\mathrm{d}S = \delta Q / T$) indicates that only entropy differences are physically meaningful, so we are free to define the arbitrary additive terms in any way we please. […]
(b) The variation of entropy with N is not arbitrary; it is a substantive matter with experimental consequences. Therefore, the Clausius definition of entropy is logically incomplete, and it needs to be supplemented either by experimental evidence or further theoretical considerations.”
Neither option is exactly right, I would say. We are not free to define the additive terms however we like, since fixing them by convention may easily lead to confusion or even erroneous results, as we have seen. It would lead to two kinds of entropy differences: those to which the Clausius definition applies, and those that are fixed by convention. The incorrect impression may arise that entropy differences introduced by convention also correspond to heat exchanges in quasistatic processes. This, however, is generally incorrect, since in this way the additive constants are also determined for entropy differences between states that cannot be connected by physical processes at all, such as one mole of oxygen and one mole of argon. Errors may result in cases where the additive constants are determined by convention while quasistatic processes by means of which the entropy differences can be determined are also possible. We have seen examples of this in the above derivations of Equations (4) and (8). The second option Jaynes presents suggests that the Clausius definition is insufficient to describe processes in which particle number varies. However, this too is incorrect: by considering quasistatic processes in which particle number is allowed to vary, entropy differences can be determined.
My own conclusion would be that the Clausius definition suffices to describe thermodynamical processes. The additive constants should not be neglected, and can still depend on anything but the state parameters of which entropy is a function. Extending the definition of entropy to absolute entropy values is not necessary, and is moreover undesirable, since fixing the constants may lead to confusion and error.