2.1. Theory
We have previously attempted to relate the dynamic properties of a metabolic network to the inherent thermodynamic properties of such a system.
Specifically, in Srienc and Unrean (2010) [25] we provided a derivation of the Boltzmann distribution of reaction entropies when the Type 2 entropy production rate (defined below) is maximized. In Unrean and Srienc (2011) [26] the entropy balance was derived and the data were interpreted using the previously presented statistical thermodynamic approach. Unfortunately, the derivations provided in these two papers are incomplete and therefore not as transparent and comprehensive as one would expect from a self-consistent theory. The current work eliminates these shortcomings and presents a consistent and complete theory.
This has been accomplished through the following novel aspects of the presented theory: (i) The affinity of reaction is introduced, which provides the commonly accepted link to the rate of entropy production of a reacting system. We define this rate as the Type 1 entropy production rate. (ii) The results are expressed as the entropy and as the Gibbs free energy of the system at steady state, which has not been done before. It is shown that, at steady state, the minimization of the Gibbs free energy of the system corresponds to the maximization of the Type 1 entropy production rate, which corroborates the MEP (Maximum Entropy Production) principle. Thus, the MEP principle is not assumed but derived from basic balances. In contrast, the entropy of the system is maximized when the Type 2 entropy production rate is maximized. The Type 2 entropy production rate lacks the enthalpic contribution present in the Type 1 rate. (iii) The statistical treatment introduces a new parameter of the system: the maximum attainable specific growth rate of the evolving system. This reveals the Boltzmann constant in the Boltzmann factor and unifies the data of the experimentally tested strains, since the data collapse into one general relationship that is valid for all strains.
The entropy balance for a continuous stirred tank reactor (CSTR), representing a non-equilibrium, open system, results in the following expression (see Supplemental File S1 for a detailed derivation):
Here, s_i,in (s_i) are the molar entropies of the individual components [kJ/K/mol], at the corresponding concentrations, transported into (out of) the system at molar flow rates ṅ_i,in (ṅ_i) [mol/h]; the heat term contains the rate of heat transfer through the reactor walls [kJ/h]; and the generation term [kJ/K/h] is the rate of internal entropy generation of the system due to the irreversibility of the process. The first two terms on the right-hand side represent the net entropy transported to the surroundings due to material transport and due to heat transfer, respectively. From this expression one can see that in a steady-state situation with zero entropy accumulation, the internal entropy production term must be balanced by the transport of entropy to the surroundings. According to the Second Law, the internal entropy production term must always be greater than or equal to zero.
Using an energy balance to obtain the rate of heat transfer, and considering the irreversibilities in the system due to reaction and mixing, we can convert this expression at steady state into Equation (2) by substituting the molar flow rates ṅ_i with the product of the volumetric flow rate F [L/h] and the concentrations c_i [mol/L] and by introducing the space time τ = V/F [h]. The rate of entropy generation is given by the product of the entropy of reaction ΔS_R and the extent of reaction [mol/L.h] (see also Equation (11)). Recalling that entropy is an extensive property, we see that the expression provides a statement of the entropy content of the system at steady state.
This expression shows that the system entropy is a function of the concentration of the species in the inlet stream, the entropy of the species evaluated at the system (outlet) conditions (which includes effects from dilution, temperature, pH, and ionic strength), and the rate of entropy production in the system contributed by the reaction entropy. Equation (2) shows that the system entropy is expected to increase as the rate of entropy production by reaction increases. Furthermore, when the system is operated at isothermal conditions and dilution of the incoming stream is negligible, as is commonly the case for low-cell-density chemostat cultures, the reaction term dominates and the system entropy approaches a maximum. We define this entropy production rate as the Type 2 entropy production rate since it omits the enthalpic contribution to the entropy formation.
Using the Gibbs relation (G_sys = H_sys − T S_sys) along with expressions for H_sys and G_sys that are derived similarly to the procedure used for Equation (2), we can derive an expression for the Gibbs free energy of the system:
By combining Equation (2) with different statements of the Gibbs relation (G_sys = H_sys − T S_sys, g_i = h_i − T s_i, and ΔG_R = ΔH_R − T ΔS_R), we can convert this to the following expression for the Gibbs free energy of the system, given that the first three terms on the right-hand side of Equation (3) must sum to zero:
The rate of entropy production due to the irreversibility of the reaction can also be expressed as
where the rate of entropy production per unit volume is given in [J/K.L.h], and A [J/mol] is the affinity of reaction, defined as the negative of the Gibbs free energy of reaction [27]:
With these relations, we can write
The Gibbs free energy of the system at steady state is proportional to the negative rate of entropy production. We define this entropy production rate as the Type 1 entropy production rate since it expresses the commonly known entropy production rate based on the affinity of reaction, which includes contributions from both the enthalpy and the entropy of reaction. Thus, a reacting open system adjusts the component concentrations through reactions such that the Gibbs free energy at steady state is at a minimum, due to the tendency to equilibrate chemical potentials. This is accomplished when the rate of entropy production is maximized, corroborating the Maximum Entropy Production (MEP) principle [28,29]. But the corresponding system entropy (Equation (2)), under the assumed experimental conditions, depends only on the rate of entropy formation due to the entropy of reaction. Of the contributions to the internal entropy generation, only the reaction entropy affects the entropy of the system, because the enthalpic component of the internal entropy generation is exported to the surroundings and does not contribute to the entropy content of the system. The obtained relationships describe the macroscopic behavior of the system and are generally valid for any reacting, non-equilibrium system at steady state. A detailed derivation including material and energy balances is given in Supplemental File S1.
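The difference between the two rates can be illustrated with a short numerical sketch. It assumes the standard form in which the Type 1 rate is the affinity times the reaction rate divided by T, and the Type 2 rate is the entropy of reaction times the reaction rate. The function name and all numbers are hypothetical and serve only as an illustration.

```python
def entropy_production_rates(dG_R, dH_R, T, rate):
    """Return (type1, type2) entropy production rates in J/(K L h).

    dG_R, dH_R: Gibbs free energy and enthalpy of reaction [J/mol]
    T: temperature [K]; rate: reaction rate [mol/(L h)]
    """
    dS_R = (dH_R - dG_R) / T      # Gibbs relation: dG_R = dH_R - T*dS_R
    affinity = -dG_R              # affinity is the negative of dG_R
    type1 = affinity * rate / T   # assumed standard form: A * rate / T
    type2 = dS_R * rate           # reaction-entropy contribution only
    return type1, type2


# Hypothetical values, not taken from the paper
dG_R, dH_R, T, rate = -2.8e6, -2.7e6, 310.0, 0.01
type1, type2 = entropy_production_rates(dG_R, dH_R, T, rate)

# The two rates differ by the enthalpic term -dH_R * rate / T
enthalpic_term = -dH_R * rate / T
```

Under these assumptions the gap between the two rates is exactly the enthalpic term, which matches the statement that the Type 2 rate omits the enthalpic contribution.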
Thermodynamics of Elementary Modes. An Elementary Mode (EM) is formally defined as a minimal set of enzymes that can operate at steady state with all irreversible reactions proceeding in the appropriate direction [30]. An alternate definition that is easier to visualize is that an EM represents a reaction sequence (or pathway) that a glucose molecule follows when it is metabolized. At the dynamic steady state, the overall growth reaction can be formally represented by the general chemical equation
that represents all components (nutrients, biomass and products) in the reactor. Here, one mole of glucose (A) plus nutrients (B) is converted into biomass (C) and products (D). The factors ν_i represent the individual stoichiometry coefficients, i.e., the molar yields of each component per mole of glucose utilized. They are negative for reactants and positive for products. The rate of reaction is described by the extent of reaction [mol/L.h]. The extent of reaction also represents the rate of glucose consumption since the stoichiometry coefficient for glucose is −1. The rate of the chemical growth reaction (Equation (8)) is proportional to the biomass concentration or the number of cells present. Because the reaction equation holds for all biomass concentrations, it is convenient to express it on a per cell mass basis. The rates of reaction then become specific rates of reaction [mol/h.g CDW].
Such a chemical equation can be written for each EM. It involves only the external metabolites that are taken up or that are excreted by the cells. Each EM contributes to the specific glucose uptake at a rate defined by
where p_j represents the fraction of the total specific glucose uptake rate [mol/h.g CDW] that is consumed by elementary mode j. Thus, each elementary mode contributes to the overall metabolism with a usage probability p_j.
For each EM (j), the entropy, Gibbs free energy and enthalpy of reaction can be computed based on the individual molar properties of formation (i) summed over the m reactants participating in the growth equation (see Equation (8))
Here, we use lowercase symbols (Δs_r,j, Δg_r,j, or Δh_r,j) to represent the reaction properties of a single EM. However, the same relation also holds for the overall chemical reaction of the cell. Thus, the macroscopic rates of entropy, Gibbs free energy and enthalpy generation can be defined in terms of the individual EM reaction properties
where the cell-specific glucose consumption rate in the reactor is given in [moles glucose/h.g CDW], ∆S_R [J/K.mole glucose] is the reaction entropy of the overall growth reaction per mole of glucose consumed, and ∆G_R and ∆H_R are the corresponding Gibbs free energy and enthalpy of reaction [J/mole glucose].
The cell-specific rate of entropy production is given in [J/K.h.g CDW], and the cell-specific rates of Gibbs free energy and enthalpy production in [J/h.g CDW]. The specific glucose uptake rate of elementary mode j is given in [mole glucose/h.g CDW], and ∆s_j, ∆g_j, ∆h_j and p_j are the reaction properties and the usage probability of the individual elementary mode j. Note that for simplicity the subscript r is omitted for the reaction properties of individual elementary modes, and the thermodynamic properties of the overall cell reaction are written with uppercase letters (∆S_R, ∆G_R, ∆H_R). After substituting with Equation (9) one obtains
Equation (12) shows that the cell specific rate of production of the thermodynamic reaction properties is a function of the usage probability of individual elementary modes and of the total glucose uptake rate of a cell. In fact, once the usage probabilities of elementary modes are known the specific rates of change relative to the glucose uptake rate are defined by the overall stoichiometry of the growth reaction (Equation (8)) since the stoichiometry coefficients (or yields) can be computed from the sum of contributions of individual elementary modes.
where ν_i is the stoichiometry coefficient of the ith reactant in the overall growth equation and ν_j,i is the corresponding stoichiometry coefficient in the elementary mode j.
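The bookkeeping described above can be sketched in a few lines. The sketch assumes that the reaction entropy of a mode is the stoichiometry-weighted sum of the molar entropies of the external components, and that the overall stoichiometry is the probability-weighted sum of the mode stoichiometries. The two modes, three components, molar entropies and probabilities below are hypothetical values chosen for illustration, not data from the study.

```python
def reaction_property(nu, molar_values):
    """Sum of nu_i * (molar property of component i) for one reaction."""
    return sum(n * v for n, v in zip(nu, molar_values))


def overall_stoichiometry(p, nu_by_mode):
    """Probability-weighted stoichiometry: nu_i = sum_j p_j * nu_j,i."""
    m = len(nu_by_mode[0])
    return [sum(pj * nu[i] for pj, nu in zip(p, nu_by_mode)) for i in range(m)]


# Hypothetical example: components are (glucose, biomass, CO2)
nu_by_mode = [[-1.0, 0.5, 2.0],   # mode 1, per mole of glucose
              [-1.0, 0.2, 4.0]]   # mode 2, per mole of glucose
molar_entropy = [200.0, 100.0, 150.0]   # J/(K mol), illustrative only
p = [0.75, 0.25]                        # usage probabilities, sum to 1

ds_modes = [reaction_property(nu, molar_entropy) for nu in nu_by_mode]
dS_from_modes = sum(pj * ds for pj, ds in zip(p, ds_modes))

overall_nu = overall_stoichiometry(p, nu_by_mode)
dS_from_overall = reaction_property(overall_nu, molar_entropy)
```

Both routes give the same overall reaction entropy, which is why the overall growth stoichiometry together with the usage probabilities fixes the specific rates relative to the glucose uptake rate.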
The specific rate of Gibbs free energy production (Equation (12)) can be combined with Equations (5) and (6) to obtain the specific rate of entropy production of the system due to the irreversibility of the reaction system:
The corresponding entropy production rate due to individual reaction entropies becomes
It is important to distinguish between the two rates of entropy production defined by Equations (14) and (15). In the first case (Equation (14)), designated as type 1, the total entropy production rate reflects the irreversibility of reactions (entropy of reaction and entropy due to heat generation). In the second case (Equation (15)), designated as type 2, the entropy production rate reflects the contribution of the entropy of reaction only. The previously derived macroscopic relations have shown that the type 1 entropy production rate is at a maximum when the Gibbs free energy of the system is at a minimum at steady state (see Equation (7)). In contrast, the type 2 entropy production rate is at a maximum when the entropy of the system is at a maximum (see Equation (2)). Clearly, both types of entropy production rates are defined by the internal rate structure of the metabolism, i.e., by the thermodynamic properties of external substrates, by the usage probability of each elementary mode and by the specific rate of glucose uptake.
Thus, the question arises whether the internal elementary mode structure of a cell evolves to minimize the Gibbs free energy of the system or to maximize the entropy of the system. In the first case the Gibbs free energies of reaction are the characteristic properties of the reaction trajectories defined by individual elementary modes as they determine the type 1 entropy production rate. In the second case, the entropies of reaction define the rate structure of the metabolism based on the type 2 entropy production rates of individual elementary modes.
Having derived the equations that link a cell’s macroscopic rate of entropy production to the EM microstates of its metabolic network, the challenge then becomes to adjust the individual usage probabilities (p_j) so that the rate of entropy production by the system is a maximum. At the same time, the probability distribution must be made to satisfy three constraints: (i) the fair apportionment of outcomes, (ii) a constant macroscopic specific entropy production rate, and (iii) unity of the sum of all probabilities [31]. The solution to this maximization problem, which is obtained by the method of Lagrange multipliers, then represents the constrained maximum specific entropy production rate with respect to the underlying variation in usage probabilities of elementary modes. This approach has been previously described for the case of maximizing the entropy of the system on the basis of maximizing the type 2 entropy production rate [25].
Alternately, one can carry out the thought experiment as performed originally by Boltzmann [32] for the energy distribution in gas particles. But instead of observing the energy content of individual particles, one observes the time trajectories of individual glucose molecules when they are metabolized, and one evaluates the associated rate of specific entropy production. One should recall that individual glucose molecules are always metabolized following a path along an elementary mode. Arranging the same number of glucose molecule trajectories in all possible permutations that yield the fixed macroscopic specific entropy production rate results in the most probable distribution of the usage of individual elementary modes.
Both approaches result in the following expressions for the usage frequency of an elementary mode, which represents a constrained maximum of the overall entropy production rate, depending on which type of entropy production rate (type 1 or 2) determines the distribution:
In linearized form the equations become
This expression relates the usage probability of an elementary mode j to the net glucose uptake rate of the cell [moles glucose/h.g CDW] and to the individual entropies of reaction or Gibbs free energies of reaction of elementary modes. K and c are the Lagrange multipliers arising via the constrained optimization.
To keep dimensional consistency, we can separate from K the constant Q = 1 [1/h.g CDW] to give
with R [J/K.mol] representing the universal gas constant (or molar Boltzmann constant).
The usage probability of elementary modes based on type 2 entropy production rates becomes
To obtain the constant c, Equation (16) can be rewritten as
Then, by summing all probabilities to unity, one obtains the “partition” function Z along with the value of c.
The unique form of Equations (16) and (17) suggests that the evolution of a metabolic network involves an interplay between two mechanisms, each having the ability to advance the fitness of the cell. The first mechanism is due to changing the network structure, as reflected in the distribution of usage probabilities p_j of elementary modes. The second mechanism is due to the selection process reflected by the specific glucose uptake rate. The network structure determines the yield of biomass on glucose (Y) (see Equation (13)), and when this is multiplied by the specific glucose uptake rate, the resulting value gives the specific growth rate,
where μ [1/h] represents the specific growth rate and Y [mol biomass/mol-glucose] is the yield coefficient of biomass on glucose, i.e., the stoichiometry coefficient associated with the biomass in the growth equation (Equation (8)).
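The normalization step that produces the partition function Z and the constant c can be sketched generically. The sketch assumes a Boltzmann-type form p_j = exp(c + beta * Δs_j), where beta is a placeholder for whatever prefactor multiplies the reaction entropy in Equations (16) to (22); its sign and magnitude are not taken from the paper. The function name and the input values are hypothetical.

```python
import math


def boltzmann_probabilities(delta_s, beta):
    """Return (p, c) with p_j = exp(c + beta * delta_s[j]) and sum(p) = 1.

    beta is an assumed placeholder for the prefactor of the reaction
    entropy; c equals -ln(Z), where Z = sum_j exp(beta * delta_s[j]).
    """
    # Subtract the largest exponent before exponentiating to avoid overflow
    shift = max(beta * s for s in delta_s)
    weights = [math.exp(beta * s - shift) for s in delta_s]
    z_shifted = sum(weights)
    p = [w / z_shifted for w in weights]
    log_z = shift + math.log(z_shifted)
    return p, -log_z


# Hypothetical reaction entropies [J/(K mol glucose)] and prefactor
delta_s = [150.0, 420.0, 300.0]
beta = -0.01
p, c = boltzmann_probabilities(delta_s, beta)
```

In this form the linearized relation ln p_j = c + beta * Δs_j holds for every mode, so a plot of ln p_j against Δs_j is a straight line with slope beta and intercept c.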
In the process of evolution, the rate of entropy production is increased. The rate of entropy production (see Equation (12)) can be increased (i) by increasing the specific glucose uptake rate or (ii) by changing the rate structure of the network such that a higher specific entropy of reaction is obtained. The latter case would require that more weight (usage probability) would be given to an elementary mode that has a higher entropy of reaction. The highest rate of entropy production is obtained when the highest specific glucose uptake rate is reached together with the associated rate structure of the metabolism. In that case the specific glucose uptake rate becomes
and the network structure is given by
representing the most probable distribution of elementary modes for the case of a fully evolved metabolic network.
But there could be the case where this state of ultimate network structure has been reached but not yet the state of maximum specific glucose uptake rate. In such a case, the specific glucose uptake rate of a cell can be increased, in principle, by increasing all catalysts (enzymes) in a cell in equal proportions. This would proportionally increase the rate of each individual reaction, including the specific growth rate, without changing the network structure. But one should expect a limit to this increase, since there will likely be a maximum specific glucose uptake rate that a cell can achieve due to physical transport limitations. For instance, glucose can only diffuse to the surface of a cell at a maximum rate dictated by the diffusion coefficient, or glucose uptake could be restricted by a limited number of permeases on the cell surface [33]. It is therefore useful to relate the experimentally measured glucose uptake rate to this maximum possible specific uptake rate
where the maximum specific glucose uptake rate of a cell attainable by evolution under the given environmental conditions is given in [mol glucose/h.g CDW], and b represents the fraction of the maximum specific growth rate that the strain has attained during the ongoing evolution process.
Thus, the usage probability of an elementary mode of a fully evolved metabolic network structure in a cell that has not yet attained the maximum specific glucose uptake rate, can be computed explicitly from
or, in linearized form, from
If we know the maximum possible glucose uptake rate, we can estimate b from the measured actual specific glucose uptake rate using Equation (26). Alternately, if we do not know the maximum possible macroscopic glucose uptake rate, we need to determine both the specific glucose uptake rate at the current point in the evolution process and estimate the fraction b as shown below.
In case the internal rate structure is determined by the type 1 entropy production rate, an analogous expression is obtained in which the reaction entropy is substituted by the affinity of reaction divided by T. However, we will focus in the following on the type 2 entropy production rate as the data suggest that this type determines the usage frequencies of elementary modes, the justification for which we present later in the Discussion.
2.2. Comparison of Theory with Experimental Systems
Recently, a radioisotope labeling method combined with mass spectrometry has been developed that allows estimation of multiple intracellular reaction rates comprising a metabolic reaction network [34]. This method has been applied to evaluate the reaction rates of six strains of E. coli that were evolved over 300 generations from the same ancestor cell line in growth experiments using glucose as the carbon and energy source [35]. Over this time the strains increased their specific growth rate by 28–38%. Surprisingly, in the set of evolved strains, the network structure is very similar to that of the original wildtype. This could indicate that the strains are already close to the fully evolved network state. We have used these data to test the presented theory. We first determined the constant b from the experimentally measured glucose uptake rate and Equation (28), and then predicted the intracellular rate structure and compared it with the experimentally measured data.
The thermodynamic reaction properties for each elementary mode, sorted according to decreasing values of Gibbs free energies of reaction, are shown in Figure 1.
The graph shows that the Gibbs free energy of reaction is negative for all elementary modes, indicating that all elementary modes are thermodynamically spontaneous at the experimental conditions. Furthermore, all elementary modes are exothermic. There is a strong correlation between the Gibbs free energy and the enthalpy of reaction and an inverse correlation with the entropy of reaction. All elementary modes have a positive entropy of reaction and a negative Gibbs free energy of reaction, indicating that all pathways are feasible and spontaneous. The results of this first test are presented to justify our assumption that all identified EMs should be included in the probability distribution.
In Figure 2 the entropy contribution from the Gibbs free energy of reaction (−∆G_j/T) is plotted against the entropy of reaction (∆S_j) for all elementary modes.
The solid diamond represents the arithmetic average of −∆G_j/T and ∆S_j taken over all modes, while the open red circles represent the entropy and Gibbs free energy of reaction computed from the experimentally determined overall growth reaction (Equation (8)) of each strain. Thermodynamic values were calculated from the experimental flux data for the externally occurring metabolites of the seven strains tested by Long et al. [35]. Compared to the mode-average entropy and Gibbs free energy of reaction, which assume a uniform probability distribution over all the modes, the experimentally determined thermodynamic values are clearly biased towards lower values of entropy, as expected from a Boltzmann distribution of elementary modes (see Equation (28)). In addition, probability values for each elementary mode are superimposed onto the figure.
The metabolic network contains n Elementary Modes (n = 7363), which were computed using CellNetAnalyzer, based on the reaction network specified by Long et al. The elementary modes are listed in Supplementary File S2. Based on n elementary modes, Equation (28) results in n independent relationships in which the probabilities p_j and the two distribution constants (one of which is c) account for n + 2 unknowns. Thus, to completely specify the system, two additional relationships are needed. One is given by the requirement that the probabilities must sum to unity, and the second is provided by the experimental data for the stoichiometry coefficients of the overall growth reaction (Equation (8)), which permit computation of the entropy of reaction according to Equation (10). The solution is numerically accessible in MATLAB (R2019a) using the Levenberg–Marquardt least-squares algorithm for solving non-linear equations.
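The closure of this system can be sketched on a reduced problem. The sketch below is not the MATLAB Levenberg–Marquardt solution used in the study: it swaps in a simple bisection, in pure Python, on a one-parameter version of the problem. It assumes a Boltzmann-type distribution p_j proportional to exp(beta * Δs_j), enforces normalization by construction, and finds the single placeholder parameter beta for which the probability-weighted mean of the mode entropies equals the overall reaction entropy obtained from the measured growth stoichiometry. All names and numbers are hypothetical.

```python
import math


def weighted_mean(delta_s, beta):
    """Mean of delta_s under p_j proportional to exp(beta * delta_s[j])."""
    shift = max(beta * s for s in delta_s)   # guards against overflow
    w = [math.exp(beta * s - shift) for s in delta_s]
    return sum(wi * s for wi, s in zip(w, delta_s)) / sum(w)


def fit_beta(delta_s, target, lo=-1.0, hi=1.0, tol=1e-12):
    """Bisection for beta such that weighted_mean(delta_s, beta) == target.

    The weighted mean never decreases with beta (its derivative is the
    variance of delta_s under the distribution), so bisection applies.
    """
    if not weighted_mean(delta_s, lo) <= target <= weighted_mean(delta_s, hi):
        raise ValueError("target is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weighted_mean(delta_s, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical mode entropies and measured overall reaction entropy
delta_s = [150.0, 420.0, 300.0]
target = 250.0                      # below the uniform average of 290
beta_fit = fit_beta(delta_s, target)
```

A measured value below the uniform mode average yields a negative beta in this sketch, i.e., a distribution weighted towards modes with lower reaction entropy. With thousands of modes and two free constants, a general least-squares solver such as the one named in the text is the more practical choice.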
Note that each EM can be represented by a reaction equation as shown in Equation (8) that is based only on external metabolites. Only the stoichiometry coefficients may be different. Then, the Gibbs free energy and entropy of external metabolites can be evaluated using standard physical chemistry approaches, and, from the difference in products and reactants, the Gibbs free energy and entropy of reaction can be evaluated. These values are then used in the system of equations as described above and numerically solved using MATLAB providing the probabilities of each EM and the parameters of the distribution. Once the parameters of the distribution are known, the probabilities can be directly computed from Equation (27) or (28).
Figure 3 shows the usage probabilities of elementary modes as a function of specific entropy production rates (type 2). When the entropy production values for the strains are normalized, each by its own specific value of b, the trends all collapse to a common form, having the same slope, which corresponds to the universal gas constant as defined by Equation (20). The measured macroscopic growth parameters together with the constants b and c for the individual strains are summarized in Table 1.
Using Equations (24) and (26) along with the measured specific growth rate for each strain and the computed constant b, we can make an estimate of the maximum theoretical growth rate (μ_max) that is possible for E. coli under the given environmental conditions. Then, with the maximum specific growth rate, the minimum doubling time can also be calculated (see Table 1).
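The arithmetic behind this estimate can be sketched as follows. The sketch assumes that the biomass yield is unchanged between the current and the fully evolved state, so that the measured growth rate is the fraction b of the maximum growth rate, and it uses the standard relation between exponential growth rate and doubling time. The function names and the input values are hypothetical and are not taken from Table 1.

```python
import math


def max_growth_rate(mu, b):
    """Maximum specific growth rate [1/h], assuming mu = b * mu_max."""
    return mu / b


def doubling_time_minutes(mu):
    """Doubling time [min] for exponential growth at rate mu [1/h]."""
    return math.log(2.0) / mu * 60.0


# Hypothetical measured growth rate and attained fraction b
mu_measured = 0.7    # 1/h
b = 0.3              # dimensionless
mu_max = max_growth_rate(mu_measured, b)
t_d_min = doubling_time_minutes(mu_max)
```

Read in the other direction, a minimum doubling time of 18 min corresponds to a maximum specific growth rate of ln(2)/(0.3 h), or about 2.3 per hour.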
The average minimum doubling time for all strains is 18 ± 1.4 min. This predicted doubling time points to further evolution capacity, as it is shorter than the doubling time of 23 min inferred after evolving E. coli on minimal media over 21 years or 50,000 generations [36]. In making this calculation, it is important to remember that we have assumed that all strains are already at (or very near) the optimum network structure. Consequently, this implies that the additional rounds of adaptive evolution conducted by Long et al. served mainly to reduce the effects of any rate-limiting states that were restricting the overall flux of glucose, rather than effecting any significant changes in the distribution of the underlying elementary modes.
With the maximum possible glucose uptake rate identified we can now explicitly compute the usage probabilities of elementary modes from Equation (28). Since each elementary mode is assumed to be operating at steady state, without accumulating intermediate metabolites, the metabolic flux across all reaction steps of the EM is conserved. The flux contribution for each elementary mode can then be obtained from Equation (9), and the rate of production of each external metabolite can be computed from
where r_i [moles/h] represents the rate of change in the overall chemical equation, ν_j,i is the stoichiometry coefficient of the ith reactant of elementary mode j, and ξ_j is the corresponding flux. The rates of internal reactions can be obtained from
where r_k [mol/h] is the flux through the kth reaction in the metabolic reaction network and ξ_k,j is the contribution to the kth reaction by elementary mode j.
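The flux reconstruction described above can be sketched as follows. The sketch assumes that each mode carries the fraction p_j of the total specific glucose uptake rate, that external rates are the stoichiometry-weighted sums of mode fluxes, and that the contribution of mode j to internal reaction k is the mode flux multiplied by the relative flux of that reaction within the mode (per mole of glucose). The relative fluxes, probabilities and uptake rate are hypothetical illustration values.

```python
def mode_fluxes(p, total_uptake):
    """Flux carried by each mode: xi_j = p_j * total uptake rate."""
    return [pj * total_uptake for pj in p]


def summed_rates(coeff_by_mode, xi):
    """r_i = sum_j coeff[j][i] * xi_j for every component or reaction i."""
    n = len(coeff_by_mode[0])
    return [sum(c[i] * x for c, x in zip(coeff_by_mode, xi)) for i in range(n)]


# Hypothetical example with two modes
p = [0.75, 0.25]
total_uptake = 8.0                       # glucose uptake, arbitrary units
xi = mode_fluxes(p, total_uptake)        # flux per mode

# External metabolites: (glucose, biomass, CO2) per mole of glucose
nu_by_mode = [[-1.0, 0.5, 2.0],
              [-1.0, 0.2, 4.0]]
r_external = summed_rates(nu_by_mode, xi)

# Internal reactions: assumed relative flux of each reaction in each mode
e_by_mode = [[1.0, 0.6],
             [1.0, 0.0]]
r_internal = summed_rates(e_by_mode, xi)
```

Because every mode is itself at steady state, any non-negative combination of modes built this way also balances the intermediate metabolites.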
The experimentally determined reaction rates for strain ALE-3, which is representative of all other strains, are compared with the reaction rates predicted from the model in Figure 4.
With the exclusion of three outlying flux values (v33, v34, and v65), the measured and predicted reaction rates are in remarkable agreement, as expressed by the R² value of 0.97 of the linear regression of the data (see Figure 4). The three inconsistent reaction rates represent (i) the conversion of ATP into the external ATP pool (label v65 in Figure 4), (ii) the anabolic conversion of oxaloacetate into phosphoenolpyruvate consuming ATP (label v34 in Figure 4), and (iii) the anaplerotic conversion of phosphoenolpyruvate to oxaloacetate (label v33 in Figure 4). The discrepancy arises because the measured data set assumes a significant export of ATP and an essentially zero rate for reaction v34, while the elementary mode-based model predicts a significant activity of v34 that consumes ATP. Thus, in the model based on elementary modes, a significant turnover between phosphoenolpyruvate and oxaloacetate that consumes ATP is predicted. If this ATP consumption were assigned to a maintenance reaction (not included in the model), the experimental data would be very well predicted by the model, considering that the total consumption of ATP agrees within 4% between the experimental data and the prediction. At this point it is not clear whether this discrepancy is caused by the extraction of the rate data from the measurements. For instance, a constant ATP production based on a P/O ratio of 2 has been assumed for all strains in the experiment (Long et al., 2017). Alternatively, errors could be introduced when a futile cycle exists in the model of elementary modes, which lacks a reaction consuming energy for cell maintenance.