Article

Transition to Multicellularity and Peto Paradox

by Sergey Vakulenko 1,2
1 Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg 197022, Russia
2 Institute for Problems in Mechanical Engineering of the Russian Academy of Sciences, Saint Petersburg 199178, Russia
Mathematics 2023, 11(24), 5003; https://doi.org/10.3390/math11245003
Submission received: 29 November 2023 / Revised: 12 December 2023 / Accepted: 15 December 2023 / Published: 18 December 2023
(This article belongs to the Section Computational and Applied Mathematics)

Abstract

This paper aims to explain the transition to multicellularity as a consequence of the evolutionary response to stress. The proposed model is composed of three parts. The first part details stochastic biochemical kinetics within a reactor (potentially compartmentalized), where kinetic rates are influenced by random stress parameters, such as temperature, toxins, oxidants, etc. The second part of the model is a feedback mechanism governed by a genetic regulation network (GRN). The third component involves stochastic dynamics that describe the evolution of this network. We assume that the organism remains viable as long as the concentrations of certain key reagents are maintained within a defined range (the homeostasis domain). For this model, we calculate the probability estimate that the system will stay within the homeostasis domain under stress impacts. Under certain assumptions, we show that a GRN expansion increases the viability probability in a very sharp manner. It is shown that multicellular organisms increase their viability due to compartment organization and stem cell activity. By the viability probability estimates, an explanation of the Peto paradox is proposed: why large organisms are stable with respect to cancer attacks.
MSC:
34D20; 92E20; 34H15

1. Introduction

In this paper, we focus on three enigmatic problems of biology: (I) why multicellular organisms emerge; (II) why complex multicellular organisms can be constructed with a relatively small number of genes; and (III) the Peto paradox on cancer and lifespan, i.e., why large organisms are stable with respect to cancer attacks.
The origin of multicellular organisms presents a significant challenge for biological evolution theory. For instance, E. Koonin notes “The emergence and evolution of complexity at the levels of the genotype and the phenotype, and the relationship between the two, is a central (if not the central) problem in biology. Even leaving aside for now the problem of the actual origin of the very substantial complexity associated with the cellular level of organization, one cannot help wondering why the evolution of life did not stop at the stage of the simplest autotrophic prokaryotes, with 1000 to 1500 genes. Why instead did evolution continue, to produce complex prokaryotes possessing more than 10,000 genes and, far more strikingly, eukaryotes, with their huge, elaborately regulated genomes; multiple tissue types; and even ability to develop mathematical theories of evolution?” (see [1], Ch. 8, p. 250). Some approaches to the evolution of multicellularity were proposed in ([2,3]), describing this evolution as a result of an interaction between hosts and parasites. The emergence of multicellular organisms can be considered as a major evolutionary transition [4].
The second enigma concerns the number of protein-coding genes. The human genome comprises approximately 22,000 protein-coding genes, a number that is comparable to the genomes of fruit flies and nematodes. Surprisingly, more complex multicellular organisms do not necessitate a higher number of genes, despite having a greater number of phenotypic traits and the need to adapt to numerous environmental constraints. These facts challenge the classical ideas of modern evolutionary synthesis. According to the celebrated Fisher geometric model, the likelihood of improving fitness through random mutations diminishes as the organism’s complexity increases ([5]). Building upon Fisher’s approach, Orr estimated the adaptation rate $R_e = \frac{d \log F}{dt}$ as a function of the number of environmental constraints $M$, with $F$ representing the average population fitness ([6], 2000, [7], 2005). Orr’s findings reveal that the adaptation rate becomes exponentially small when $M \gg 1$, a phenomenon known as the complexity cost or complexity barrier. Therefore, for large values of $M$, the complexity barrier becomes exponentially large.
Richard Peto was the first to notice that cancer incidence does not correlate with the number of cells in an organism [8]. The cancer incidence in humans is much higher than in whales, despite whales having more cells than humans. For simplicity, suppose that the probabilities of cancer driver mutations are the same for all cells. Then whales should die from cancer more often than people and at an earlier age. In reality, they live for hundreds of years. No statistically significant relationship has been found between body size and cancer incidence, supporting Peto’s observation. Problem III is connected to problems I and II. In fact, the existence of multicellular organisms requires the suppression of cancer [9]. There exists a connection between the origins of multicellularity and cancer [10,11]. In order to build larger bodies, organisms must suppress cancer. According to ([12]), large organisms have more anticancer genes.
Let us outline the proposed model, which aims to address problems I–III. The first part describes stochastic biochemical kinetics within a reactor (potentially compartmentalized), where kinetic rates are influenced by random stress parameters such as temperature, toxins, oxidants, etc. The second part of the model is a feedback mechanism governed by a genetic regulation network (GRN). The third component is stochastic dynamics, which describe the evolution of this network. We assume that an organism remains viable as long as the concentrations of certain key reagents are maintained within a defined range (the homeostasis domain).
To describe GRNs that control responses to stress, radial basis function networks (RBFNs) are used. The inputs to these networks are the stress parameter values and the states of the system. By external stress, we mean any deviation in environmental parameters that could potentially harm the organism (whether multicellular or a single cell). The GRN produces an output dependent on its architecture and, ultimately, on the genetic code $s$, which determines the architecture and interconnections within the GRN. Following [13], we assume that the GRN output and stress impact neutralize each other. Connections between units in these networks depend on coding sequences $s = (s_1, \ldots, s_N)$, $s_i \in \{0, 1\}$, of Boolean genes. Such a stress response model is not new; however, herein, it uses a new key element connected to stem cell activity in the stress response. Multicellular organisms consist of many types of differentiated (somatic) cells, which form tissues, and stem cells. Stem cells can produce differentiated cells. In this paper, we provide mathematical justification for the well-known idea that cell differentiation emergence in evolution is a response to stress, particularly oxidative stress [14]. In different eukaryotic microorganisms, the induction of antioxidant mechanisms is associated with development. Note that stem cells are particularly susceptible to stresses ([15]) and they are regulated by stress. Experimental data show ([15]) that stem cell division provides tissue turnover and response to damage. Different stress factors induce stem cell division and differentiation. Many pathways that regulate stem cell functioning are also stress–response pathways [15]. So, stress is involved in the long-term maintenance of cell populations [15,16]. In this paper, to describe models for gene regulation in stem cells, we take into account experimental observations and ideas [14,15,16,17].
We propose that the biochemical response of multicellular organisms to stress is facilitated by different gene networks activated in various tissues. We imagine the entire organism as divided into compartments, with stem cell activity aiding in regulating each compartment’s response. Contemporary morphogenesis models explain the emergence of such compartments [18,19]. The model proposed can be considered to be a far-reaching extension of L. Valiant’s model of the ideal answer [20]. In Valiant’s model, organisms are viewed as Boolean circuits modeling responses to environmental stress through a GRN. Our proposed model builds upon this approach while also incorporating real biochemistry, the existence of compartments, and stem cell dynamics. We assume that an organism remains viable as long as the concentrations of certain key reagents are maintained within a specific domain (the homeostasis domain).
The results can be sketched as follows. For such a model, we determine the estimate of the probability that the system does not leave the homeostasis domain under the impact of stress. These estimates are given by Theorem 1 for models without compartments and stem cells and Theorem 2 for models with compartments and stem cells. Furthermore, it is shown that the RBF network growth makes biosystems stable for long time periods. We show that a state with a GRN of the maximal possible size is the most probable, with overwhelming probability. This indicates that evolution favors transitions to increasingly larger GRNs. Multicellular organisms exhibit higher viability compared to a simple colony of cells due to compartment organization and stem cell activity. By these results, we propose an explanation for Peto’s paradox. Moreover, we estimate the number of genes needed to encode GRNs supporting anti-stress responses.

1.1. Innovations

This article continues the series of works (see, for example, [21], 2006, [13], 2021), where the replicative stability concept is investigated. This concept is proposed by M. Gromov and A. Carbone [22]. It was shown that some properties of evolution could be explained by the need for biosystems to maintain homeostasis under the influence of stress and fluctuations (both internal and external), and this homeostasis support must be ensured through subsequent replications. Indeed, many aspects of evolution can be interpreted this way, for example, the tendency to increase the number of genes (see a review in ([23], 2014)). The key question, however, is to explain the complexity of genetic networks. An attempt to resolve this problem was made in ([13], 2021), where a model of the biochemical system’s response to stress was introduced. In this study, the entropy of a stressful environment is defined, and it is demonstrated that an effective response to stress is achievable if the binary logarithm of the size of the regulatory genetic network is at least as large as this entropy. Therefore, according to ([13], 2021), the main reason for the growth of regulatory networks involves the complexity and diversity of the stress environment.
In this paper, the main innovation, compared to previous works, is the use of a compartment model to describe a response to stresses. This model accounts for the modular structure of an organism and stem cell activity. The organism’s response to stress is mediated through cell populations in these compartments, with their dynamics governed by the standard Lotka–Volterra model. Such an approach allows us to address one of the key biological problems, the Peto paradox, because it will be shown that the organism size directly affects stress response mechanisms. The second innovation, with respect to ([13], 2021), is that we no longer need the complexity of the environment to explain the complexity of an organism’s genetic network. If the organism’s environment fluctuates, it is enough to initiate an increase in the complexity of the genetic network. The third intriguing innovative element is that when an organism is divided into compartments, there is no longer a need to fine-tune genetic regulatory networks (as in connectionist models [19]). Instead, it is enough to adjust an appropriate interaction between the compartments.

1.2. Organization of the Paper

The paper is organized as follows. Section 2 describes the different models, and Section 3 outlines the methods. In Section 4, we state the concept of viability under stochastic perturbations. The main results on viability (Theorems 1 and 2) can be found in Section 5. The proof of these results then follows from the standard estimates of the Ventsel–Freidlin theory ([24], 1984). Using these viability estimates, in Section 7, we show that evolution has sufficient time to construct complex structures. In Section 8, we outline some contemporary morphogenesis models and explain why multicellular organisms can be encoded by a few genes. An explanation of Peto’s paradox is given in Section 9. Discussions and conclusions can be found in the last section. Technical estimates are relegated to Appendix A.

2. Model

In this section, we describe the model. It unfolds in a few steps.

2.1. Chemical Kinetics with Random Stress Parameters

In this subsection, following ([13]), we consider systems of differential equations where the right-hand sides depend on a random process $\xi(t)$. These systems read
$$\frac{dv}{dt} = f(v, \xi(t)), \quad t \ge 0, \tag{1}$$
where $v(t) = (v_1(t), \ldots, v_n(t)) \in D$, $D$ is a compact domain in $\mathbb{R}^n$ with a smooth boundary $\partial D$, and $v_i(t)$ represent reagent concentrations. We assume that $f(v, \xi) = (f_1(v, \xi), \ldots, f_n(v, \xi))$ are sufficiently smooth functions of $v$ (for example, multivariate polynomials in $v$) and smooth functions of $\xi \in \mathbb{R}^p$. The functions $\xi(t)$ are trajectories of random processes, which are piecewise constant in $t$ and take values in a compact subset $D_E$ of $\mathbb{R}^p$, where $p \in \mathbb{N}$ is the dimension of the stress parameter $\xi$. These Markov processes $\xi(t)$ describe a random stress produced by an environment. For system (1), we set the following initial conditions:
$$v(0) = v^{(0)}. \tag{2}$$
Reaction terms, f, can be defined by the well-known models of chemical kinetics and population dynamics. The simplest choice from the law of mass action is
$$f_i(v, \xi) = \sum_{a \in R_i} C_{i,a}(\xi)\, v_1^{a_1} v_2^{a_2} \cdots v_n^{a_n}, \tag{3}$$
where $a = (a_1, \ldots, a_n)$ is a multi-index with integers $a_i \ge 0$, $R_i$ are finite subsets of $I_n = \{1, \ldots, n\}$, and $C_{i,a}$ denote coefficients that determine kinetic rates. One also uses rational nonlinearities, for example,
$$f_i(v, \xi) = \sum_{a \in R_i} C_{i,a}(\xi)\, \frac{P_{i,a}(v)}{Q_{i,a}(v)}, \tag{4}$$
where $P_{i,a}(v)$ and $Q_{i,a}(v)$ are polynomials. We suppose that $Q_{i,a}(v)$ are separated from zero:
$$|Q_{i,a}(v)| > \delta_0 > 0, \tag{5}$$
for all $i$, $a$ and $v \in D$. Systems (1) with $f_i$ defined by (3) or (4) often occur in biochemistry and population dynamics.
To ensure the existence and uniqueness of solutions to the Cauchy problem (1), (2) for all $t \in [0, +\infty)$, we make the following assumption. Let
$$f(v, \xi) \cdot n(v) \le 0 \quad \forall v \in \partial D, \ \forall \xi \in D_E, \tag{6}$$
where $n(v)$ is the unit normal vector directed outward from $D$ at the point $v \in \partial D$. Then solutions to the Cauchy problem (1), (2) exist and are unique for all $t > 0$.
Polynomial and rational functions $f_i$ may not fulfill conditions (6). Since our aim is to investigate the local stability of equilibria, we consider narrow neighborhoods of those equilibria. Let $\bar v$ be the equilibrium under consideration. For $\delta > 0$, we introduce a $\delta$-neighborhood of that equilibrium by
$$W_\delta = \{ v : |v - \bar v| < \delta \}, \tag{7}$$
where we assume that $W_\delta \subset D_H$, and $D_H$ is a viability (homeostasis) domain. We truncate the nonlinearities $f_i(v)$ by setting $\tilde f_i(v) = f_i(v) \chi(v)$, where $\chi(v)$ is a Heaviside-like function that is equal to 0 outside the open neighborhood $W_\delta$ of the equilibrium and equal to 1 inside this neighborhood. Here, we assume that the neighborhood radius $\delta > 0$ is so small that $W_\delta$ lies inside the homeostasis domain.
Moreover, let us note that an explicit form of $f_i$ is not essential in what follows; however, certain assumptions on the existence of the equilibria of system (1) and their stability should be fulfilled.
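The setting of this subsection can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: a single reagent, the reaction term `f(v, xi) = xi - v`, a stress level redrawn uniformly from `[0.5, 1.5]` at equally spaced shocks, and explicit Euler time stepping. For this choice, the interval `D = [0.5, 1.5]` satisfies the boundary condition $f \cdot n \le 0$, so the trajectory should never leave `D`.

```python
import random


def simulate_stress_ode(t_end=20.0, dt=0.01, shock_interval=2.0, seed=0):
    """Euler integration of dv/dt = f(v, xi(t)) with piecewise-constant random xi.

    Hypothetical example: f(v, xi) = xi - v, with xi redrawn uniformly
    from D_E = [0.5, 1.5] at every environmental shock.
    """
    rng = random.Random(seed)
    v = 1.0                                  # initial condition v(0)
    xi = rng.uniform(0.5, 1.5)               # stress level on the first interval
    n_steps = int(round(t_end / dt))
    steps_per_shock = int(round(shock_interval / dt))
    trajectory = [v]
    for step in range(1, n_steps + 1):
        if step % steps_per_shock == 0:      # a shock: xi jumps to a new value
            xi = rng.uniform(0.5, 1.5)
        v += dt * (xi - v)                   # explicit Euler step
        trajectory.append(v)
    return trajectory


trajectory = simulate_stress_ode()
print(min(trajectory), max(trajectory))
```

Each Euler step is a convex combination of the current `v` and the current `xi`, which is why the discrete trajectory inherits the invariance of `D` in this toy case.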

2.2. Extended Model with Anti-Stress Response Encoded by Genes

Different complicated models of gene regulation have been proposed, for example, in refs. [19,25] for Drosophila morphogenesis. Suppose that the reaction terms $f_i$ involve a feedback $u$:
$$f_i = f_i(v, \xi, u), \quad u = U_{net,s}(v, \eta). \tag{8}$$
Here, $u = (u_1, u_2, \ldots, u_{n_{reg}}) \in U_R$, where
$$U_R = \{ u \in \mathbb{R}^{n_{reg}}, \ \sup |u| < R \},$$
and $n_{reg}$ is a positive integer and $R > 0$ is a constant. Here, the input $\eta$ is the averaged value of $\xi$ on an interval:
$$\eta(t) = \gamma^{-1} \int_0^t \exp\big(-\gamma (t - \tau)\big)\, \xi(\tau)\, d\tau,$$
where $\gamma > 0$ is a parameter. We are seeking regulators defined by a radial basis function network (RBFN):
$$U_{net,s} \in \Phi_{N_{reg},\sigma},$$
where the function class $\Phi_{N_{reg},\sigma}$ is defined in Appendix A.2.
We suppose that all parameters in $U_{net,s}$ are encoded by a Boolean genetic code $s$, where $s(t) = (s_1, \ldots, s_N)$ is a gene expression vector (see Appendix A.3). We have $N$ Boolean genes $s_i \in \{0, 1\}$; therefore, $s$ are elements of the Boolean hypercube $S_N = \{0, 1\}^N$. The $i$-th gene may be switched on ($s_i = 1$) or switched off ($s_i = 0$). The strings $s$ can depend on time: at certain time moments, they change as a result of mutations, gene drift, and other evolution forces.
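To make the idea of a gene-encoded RBFN regulator concrete, here is a minimal sketch. The precise function class and the gene encoding are defined in the appendices, which are not reproduced here, so the form below is an assumption: a scalar output built from Gaussian units, where the hypothetical rule is that Boolean gene `s_j` simply switches the `j`-th unit on or off. The names `rbf_feedback`, `centers`, `weights`, and `sigma` are illustrative only.

```python
import math


def rbf_feedback(x, centers, weights, genes, sigma=0.5):
    """Scalar Gaussian RBF network whose units are gated by Boolean genes.

    Assumed (illustrative) form:
        U(x) = sum_j s_j * c_j * exp(-|x - z_j|^2 / (2 sigma^2)),
    where z_j are centers, c_j are weights and s_j in {0, 1} are genes.
    """
    total = 0.0
    for z, c, s in zip(centers, weights, genes):
        if s == 1:                            # unit j contributes only if gene j is on
            dist2 = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
            total += c * math.exp(-dist2 / (2.0 * sigma ** 2))
    return total


centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [2.0, 3.0]
print(rbf_feedback([0.0, 0.0], centers, weights, genes=[1, 0]))
print(rbf_feedback([0.0, 0.0], centers, weights, genes=[0, 0]))
```

In this toy encoding a mutation that flips one bit of `genes` adds or removes one unit, which is one simple way to picture how the network size can change under evolution.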

2.3. Assumptions

Let us formulate natural assumptions on the stress $\xi$, the chemical kinetics, and the feedback.
Assumption 1.
Assumption on $\xi$.
We suppose that $\xi$ is the sum of a trend $\eta(t)$ and a small multiplicative noise:
$$\xi_i(t) = \eta_i(t) + \kappa \sum_{l=1}^{n_f} h_{il}(v, \eta(t))\, \frac{dw_l(t)}{dt}, \tag{11}$$
where $\kappa$ is a small positive parameter, $h_{il}$ are functions, and $w_l(t)$, $l = 1, \ldots, n_f$, are independent standard Wiener processes (therefore, $dw_l(t)/dt$ are white noises).
To describe the trend $\eta(t)$, we use the model of subsequent environmental shocks. We assume that $\eta(t)$ are piecewise constant functions. Let $\tau_1, \tau_2, \ldots, \tau_k, \ldots$ be an increasing, unbounded sequence of time moments $\tau_j$. Let $\Delta\tau = \min_j (\tau_{j+1} - \tau_j)$. We suppose that $\Delta\tau \gg \Delta t_r$ and $\Delta\tau \gg \Delta t_d$, where $\Delta t_r$ and $\Delta t_d$ are the average times of the GRN reaction to stress and of the biochemical dynamics, respectively. At the moments $\tau_j$, we have environmental shocks, where the interaction between the environment and the biosystem changes. The assumptions on $\Delta\tau$, $\Delta t_r$, and $\Delta t_d$ are natural. They mean that changes in the organism’s environment are not very frequent and, therefore, organisms have time to react to them. Otherwise, it is difficult to expect that evolution could be successful.
Let us assume that, for $t \in [\tau_j, \tau_{j+1})$, the parameter $\eta(t) = \eta^{(j)}$, where at each step $j$ the value $\eta^{(j)}$ is chosen randomly. More precisely, on each interval $[\tau_k, \tau_{k+1}]$, the value $\eta$ is a constant random vector, which is distributed according to a continuous probability density function (pdf) $d\mu(\eta)$ with support in a compact, non-empty domain $D_E \subset \mathbb{R}^p$.
Let us denote by $R_\delta = \{ \lambda \in \mathbb{C} : \operatorname{Re} \lambda < -\delta \}$ the open half-planes of the complex plane $\mathbb{C}$.
Assumption 2.
Equilibrium existence and stability.
We assume that for each $\eta \in D_E$ and each $u \in U$, the system
$$\frac{dv}{dt} = f(v, \eta, u), \quad t \ge 0 \tag{12}$$
has a hyperbolic rest point $\bar v$, smoothly depending on $\eta$ and $u$. This means the following. One has
$$f(\bar v(\eta, u), \eta, u) = 0. \tag{13}$$
Consider the corresponding linearized operator $A(\eta, u)$:
$$A(\eta, u) = \frac{\partial f}{\partial v}(\bar v(\eta, u), \eta, u). \tag{14}$$
The operator $A$ is uniformly hyperbolic and stable: there is a $\delta^* > 0$ such that
$$\mathrm{Spec}\, A(\eta, u) \subset R_{\delta^*} \quad \forall u \in U, \ \forall \eta \in D_E. \tag{15}$$
This means that the biochemical equilibrium is uniformly stable in $(\eta, u)$.
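The spectral condition in Assumption 2 can be checked numerically for a given linearization. The following sketch does this for a 2×2 matrix using the closed-form eigenvalues from the trace and determinant; the matrix `A_example` and the helper names are hypothetical and serve only to illustrate the requirement that every eigenvalue has real part below $-\delta^*$.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)


def is_uniformly_stable(a, delta_star):
    """True if the whole spectrum lies in the half-plane Re(lambda) < -delta_star."""
    return all(lam.real < -delta_star for lam in eigenvalues_2x2(a))


# Hypothetical linearization with eigenvalues -1 and -3.
A_example = [[-1.0, 1.0],
             [0.0, -3.0]]

print(eigenvalues_2x2(A_example))
print(is_uniformly_stable(A_example, 0.5))   # spectral gap 0.5 is satisfied
print(is_uniformly_stable(A_example, 1.5))   # gap 1.5 is violated by the eigenvalue -1
```

In the model, the same check would have to hold for every admissible pair of stress and feedback values, with one common gap.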
To formulate the next assumption, let us introduce important auxiliary functions. Let us define $g_{il}(v, \eta, u)$ by
$$g_{il} = \sum_{j=1}^{p} \frac{\partial f_i}{\partial \xi_j}(v, \eta, u)\, h_{jl}(v, \eta), \tag{16}$$
where $i = 1, 2, \ldots, n$ and $l = 1, 2, \ldots, n_f$, $n_f$ is a positive integer, and $h_{jl}$ are smooth functions.
Assumption 3.
Existence of an ideal feedback.
We suppose that for each $\eta \in D_E$ and $v \in D_H$, there exists a solution $u = U^*(v, \eta)$ of the system
$$g_{il}(v, \eta, u) = 0, \quad i = 1, \ldots, n, \quad l = 1, \ldots, n_f,$$
which is a smooth function defined on the domain $D_H \times D_E$.
This assumption means that there is an ideal response to the stress impact $\eta$. The GRNs should approximate this response to support the organism’s viability, as described in the last assumption.
Assumption 4.
We suppose that for sufficiently large sizes $N$, the ideal feedback $u = U^*(v, \eta)$ (which exists according to Assumption 3) can be approximated with a polynomial accuracy:
$$\sup_{v \in D_H,\ \eta \in D_E} |U^*(v, \eta) - U_{net}(v, \eta)| < \bar C_0 N^{-\gamma}$$
by a network $U_{net} \in \Phi_{N,\sigma}$, where the constants $\bar C_0, \gamma > 0$ are uniform in $N$.
This assumption can be justified by estimates stated in Appendix A.2. Moreover, Assumptions 3 and 4 show that our model can be considered an extension of L. Valiant’s circuit model of the ideal answer ([20], 2006, [26], 2009), where we take into account the biochemical kinetics, compartment existence, and the cell dynamics (which will be described below).

2.4. Stochastic Equations for Perturbations Induced by Stress Fluctuations

Taking into account our assumptions on $\xi$ and the existence of the equilibrium, we represent $v$ as
$$v = \bar v(\eta, u) + \tilde v(t),$$
where $\tilde v$ is a new unknown function, representing a small correction to the equilibrium. By Assumption 2, one has
$$f_i(v, \xi, u) = f_i(v, \xi, u) - f_i(\bar v, \eta, u) = P_i(v, \xi, u) + R_i(v, \xi, u), \tag{19}$$
where
$$P_i = f_i(v, \eta, u) - f_i(\bar v, \eta, u), \quad R_i = f_i(v, \xi, u) - f_i(v, \eta, u).$$
Using the Taylor expansion of $f$ and the definition (14) of the linear operator $A$, we obtain
$$P_i = (A \tilde v)_i + h_i(\tilde v, \xi, u),$$
where
$$|h_i(\tilde v)| < C |\tilde v|^2.$$
One has the following formal asymptotics for $R_i$:
$$R_i = \kappa\, \frac{\partial f_i}{\partial \eta}(\eta, u)\, \xi + O(\kappa^2).$$
Using (11), one finds that
$$R_i = \kappa \sum_{j=1}^{p} \frac{\partial f_i}{\partial \eta_j} \sum_{l=1}^{n_f} h_{jl}(v, \eta)\, \frac{dw_l(t)}{dt} + O(\kappa^2). \tag{21}$$
We remove the terms $O(\kappa^2)$ in $R_i$ and substitute the truncated $R_i$ into (19); then one has
$$f_i(v, \xi, u) = (A \tilde v)_i + h_i(\tilde v, \xi, u) + \kappa \sum_{l=1}^{n_f} g_{il}(v, \eta, u)\, \frac{dw_l(t)}{dt}, \tag{22}$$
where $g_{il}$ is defined by (16).
Thus, for each time interval where $\eta(t)$ does not change, we have the following system of stochastic Ito equations:
$$d\tilde v_i = \big( (A \tilde v)_i + h_i(\tilde v, u) \big)\, dt + \kappa \sum_{l=1}^{n_f} g_{il}(v, \eta, u)\, dw_l, \tag{23}$$
where $w_l$ are independent standard Wiener processes. We suppose that in these equations, $u$ is defined by (8). Note that our derivation of these equations is only formal; for example, in (21), we remove the terms proportional to $\kappa^2$, which involve unbounded quantities $(dw_l/dt)^2$ (the Wiener processes are not smooth). This difficulty can be circumvented ([23], 2014). Nonetheless, we will use Equation (23) because it has a classical form of the Ito equations, which are well-studied and fundamental in applications; see [27].
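A scalar version of such an Ito equation can be simulated with the Euler–Maruyama scheme. The sketch below is a simplification chosen for illustration, not the paper's system: one reagent, a constant decay rate `a` in place of the operator $A$, a constant noise amplitude `g`, and the quadratic remainder $h_i$ dropped. All parameter values are hypothetical.

```python
import math
import random


def euler_maruyama(a=1.0, kappa=0.1, g=1.0, v0=0.0, dt=0.001, n_steps=5000, seed=1):
    """Euler-Maruyama scheme for the scalar linear Ito equation
        d v = -a * v dt + kappa * g dw,
    a simplified one-dimensional stand-in for the linearized stress dynamics.
    """
    rng = random.Random(seed)
    v = v0
    path = [v]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))    # Wiener increment over one step
        v += -a * v * dt + kappa * g * dw
        path.append(v)
    return path


noisy_path = euler_maruyama()
print(max(abs(x) for x in noisy_path))        # largest deviation from equilibrium

# With kappa = 0 the scheme reduces to deterministic exponential relaxation.
quiet_path = euler_maruyama(kappa=0.0, v0=1.0)
print(quiet_path[-1])
```

Comparing the largest deviation with the radius of the homeostasis neighborhood gives a crude, single-run picture of the exit problem that the viability estimates address.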

2.5. Model with Compartments

In biology and medicine, compartment models are popular for biochemical applications, for example, to describe pharmacokinetics (see, for example, [28]). We consider a model consisting of $N_c$ compartments. The $k$-th compartment consists of differentiated cells of the $k$-th type, where $k = 1, \ldots, N_c$, and $N_c$ is the number of cell types. Reaction terms in the $k$-th compartment are denoted by $f_i^{(k)}$. Furthermore, each cell produces its own response to a stress disturbance $\eta$, which depends on the cell type. We denote by $U^{(k)}(v, \eta)$ the corresponding feedback (where we omit the dependence on the gene code $s$ to simplify the notation). Let $X_k(t)$ be the relative abundance of cells of the $k$-th type, which produce an answer to stress. One can assume that the dynamics of these quantities are governed by the multispecies Lotka–Volterra dynamics with parameters depending on the stress parameter $\eta$:
$$\frac{dX_k}{dt} = X_k \Big( r_k(\eta) - \sum_{l=1}^{N_c} K_{kl}(\eta) X_l \Big), \tag{24}$$
where $r_k$ are growth-mortality coefficients, which take into account the apoptosis and production of differentiated cells by stem cells, and $K_{kl}$ denote interaction coefficients. We have a dependence on $\eta$ because the behavior of stem cells depends on the stress factors (see Introduction and [14,15,16,17]). However, real organisms consist of cells; therefore, the variables $X_k$, which are relative concentrations, take discrete values. For each positive integer $M$, let us define the sets
$$K_{+,M} = \{ 0, 1/M, 2/M, \ldots, m/M, \ldots \}.$$
In a discrete model, we suppose that $X_k \in K_{+,N_{cell}}$, where $N_{cell}$ is the total number of cells in the organism. Here and below, $[1, n]$ denotes the set $\{1, 2, \ldots, n\}$. To describe the dynamics where cell abundances take discrete values, we can replace (24) with a stochastic Markov dynamics that simulates system (24) by the Gillespie algorithm [29]. We suppose that the total anti-stress response of the whole organism is the sum of the responses $U^{(k)}$ weighted by the abundances $X_k$:
$$u(X, v, \eta) = \sum_{k=1}^{N_c} X_k U^{(k)}(v, \eta). \tag{25}$$
The stochastic Equation (23) takes the form
$$d\tilde v_i = \big( (A \tilde v)_i + h_i(\tilde v, u(X, \eta)) \big)\, dt + \kappa \sum_{l=1}^{n_f} g_{il}(\eta, u(X, \eta))\, dw_l, \tag{26}$$
where $u$ is defined by (25).
Within the framework of this weakly nonlinear approximation, all information about the interaction (coupling–decoupling) of reactants is contained in the matrix $A$. We can define the coupling graph $(V, E)$, where the set $V$ of vertices is the set of reactants, and an edge $e = \{i, j\}$ belongs to the edge set $E$ if the $i$-th reactant interacts with the $j$-th one. One can consider various cases of the interactions of reactants that are relevant from a biological point of view. For example, we can consider the case of a Michaelis–Menten reaction chain. In this case, the analysis of the spectrum of the operator specified by the matrix $A$ is relatively simple. A more complex option involves several parallel reactions or parallel chains of Michaelis–Menten reactions. For systems with a large number of reactants, the case of random pairing (of a random $E$) is also interesting. A more detailed analysis of the coupling problem has been postponed for future work.
In the coming subsection, we describe how the population dynamics for $X_k(t)$ can produce an effective stress response.
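The discrete cell dynamics mentioned in this subsection can be sketched with a basic Gillespie simulation. The reaction scheme below is one possible stochastic counterpart of the Lotka–Volterra system and is an assumption of this sketch: cells of type `k` divide at rate `r[k] * n[k]` and die at rate `n[k] * sum_l K[k][l] * n[l] / M`, where `n[k]` is a cell count and `M` is a hypothetical system size, so that `X_k = n[k] / M`. It also assumes positive growth coefficients and fixed stress.

```python
import random


def gillespie_lv(r, K, n0, M, t_end, seed=0):
    """Gillespie simulation of a birth-death analog of Lotka-Volterra dynamics.

    Returns the relative abundances X_k = n_k / M at the first event time
    past t_end (or when no further event is possible).
    """
    rng = random.Random(seed)
    n = list(n0)
    types = range(len(n))
    t = 0.0
    while True:
        births = [r[k] * n[k] for k in types]
        deaths = [n[k] * sum(K[k][l] * n[l] for l in types) / M for k in types]
        total = sum(births) + sum(deaths)
        if total == 0.0:                      # all populations extinct
            break
        t += rng.expovariate(total)           # waiting time to the next event
        if t > t_end:
            break
        pick = rng.uniform(0.0, total)        # choose the event proportionally to its rate
        acc = 0.0
        for k in types:
            acc += births[k]
            if pick < acc:
                n[k] += 1
                break
        else:
            for k in types:
                acc += deaths[k]
                if pick < acc:
                    n[k] -= 1
                    break
    return [nk / M for nk in n]


M_cells = 100
X_final = gillespie_lv(r=[1.0], K=[[1.0]], n0=[50], M=M_cells, t_end=5.0)
print(X_final)
```

By construction, every returned abundance is a non-negative multiple of `1 / M`, matching the lattice of discrete values used in the text.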

2.6. Associated Optimization Problems

Let $u(\eta)$ be a smooth function defined on $D_E$. When we try to approximate an ideal answer by GRNs acting independently in each compartment, we obtain the risk function $F_{risk}(X, v, \eta)$, defined by
$$F_{risk}(X, v, \eta) = \Big| u(v, \eta) - \sum_{k=1}^{N_c} X_k U^{(k)}(v, \eta) \Big|^2, \tag{27}$$
where $X = (X_1, \ldots, X_{N_c})$ is a vector with real-valued components and $|u|$ denotes the standard Euclidean norm in $\mathbb{R}^{n_{reg}}$. The approximation problem reduces to
Relaxed optimization problem RP. Find $X$ that minimizes $F_{risk}$ under the conditions
$$X_k \ge 0, \quad k = 1, \ldots, N_c, \tag{28}$$
and
Integer optimization problem IP. Find $X$ that minimizes $F_{risk}$ under the conditions
$$X_k \in K_{+,N_{cell}}. \tag{29}$$
This last problem involves finding the anti-stress response via the correct activation of discrete cell sets.
First, consider the problem RP. If this problem has a solution, then that solution can be found by the population dynamics defined by system (24) under an appropriate choice of the coefficients $K_{kl}$ and $r_l$. To show this, let us define a square matrix $M$ of size $N_c \times N_c$ and the vector $b$:
$$M_{kl} = (U^{(k)}(v, \eta), U^{(l)}(v, \eta)), \quad k, l = 1, \ldots, N_c, \tag{30}$$
$$b_k = (u(v, \eta), U^{(k)}(v, \eta)), \quad k = 1, \ldots, N_c, \tag{31}$$
where $(\cdot, \cdot)$ is the Euclidean scalar product in $\mathbb{R}^{n_{reg}}$. Let us consider the Cauchy problem defined by the system of equations
$$\frac{dX_i}{dt} = -X_i\, \frac{\partial F_{risk}}{\partial X_i} \tag{32}$$
with $i \in [1, N_c]$ and positive initial conditions $X_k(0) = x_k > 0$, where $x_k$ can be chosen at random. It is clear that this system can be rewritten in the form (24). System (32) enjoys remarkable properties. First, the quantities $X_k(t)$ remain positive for all $t$; therefore, condition (28) holds. Furthermore, for fixed $v, \eta$, the function $L(X) = F_{risk}(X, v, \eta)$ is a Lyapunov function for system (32), decreasing along trajectories $X(t)$, $t \ge 0$. In fact, Equation (32) implies
$$-\sum_{l=1}^{N_c} X_l^{-1} \Big( \frac{dX_l}{dt} \Big)^2 = \sum_{l=1}^{N_c} \frac{\partial L}{\partial X_l}\, \frac{dX_l}{dt} = \frac{dL}{dt} \le 0.$$
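The claim that the gradient-like system can be rewritten in Lotka–Volterra form can be spelled out in a few lines. The derivation below assumes that the risk function is read as the squared distance between the ideal answer $u$ and the combined response $\sum_k X_k U^{(k)}$; the identification of the coefficients at the end is a consequence of that reading rather than a formula stated in the text.

```latex
\begin{align*}
F_{risk}(X) &= \Big| u - \sum_{k=1}^{N_c} X_k U^{(k)} \Big|^2
  = (u, u) - 2 \sum_{k=1}^{N_c} b_k X_k + \sum_{k,l=1}^{N_c} M_{kl} X_k X_l, \\
\frac{\partial F_{risk}}{\partial X_i} &= -2 b_i + 2 \sum_{l=1}^{N_c} M_{il} X_l, \\
\frac{dX_i}{dt} &= -X_i \frac{\partial F_{risk}}{\partial X_i}
  = X_i \Big( 2 b_i - 2 \sum_{l=1}^{N_c} M_{il} X_l \Big).
\end{align*}
```

Under this reading, the system has the Lotka–Volterra form with $r_i = 2 b_i$ and $K_{il} = 2 M_{il}$, where $M_{kl} = (U^{(k)}, U^{(l)})$ and $b_k = (u, U^{(k)})$.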
We observe that $F_{risk}(X)$ is a convex function of $X$ defined on the convex domain $D_X$. Therefore, the optimization problem RP has a unique solution. This solution can be found by system (32). Indeed, the Lyapunov function $L(X)$ is convex. Therefore, it has a single local minimum $X^*$. According to classical results on gradient-like dynamical systems, all the trajectories $X(t)$ converge to $X^*$ [30]. We note that the vector $X^*$ gives the solution to the optimization problem RP. Finally, we conclude that population dynamics give the solution to that optimization problem RP. We suppose that the domain $D_X$ is convex, as we would like to deal with a convex optimization problem, where the global minimum is unique. To achieve this uniqueness property, not only must the objective function be convex, but the feasible set must be convex as well. In our biological context, the most natural choice for the domain is the simplex $X_k \ge 0$, $\sum_k X_k < C_x$, where $C_x > 0$ is a constant.
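The convergence of the population dynamics to the minimizer of the risk function can be illustrated numerically. The sketch below assumes the risk function is the squared distance between the target `u` and the combined response, and it uses a hypothetical two-compartment example in which the compartment responses are the unit vectors of the plane; explicit Euler stepping and all numerical values are illustrative choices.

```python
def risk(X, u, U):
    """F_risk = |u - sum_k X_k U^(k)|^2 for a list U of response vectors."""
    resid = [u[j] - sum(X[k] * U[k][j] for k in range(len(X))) for j in range(len(u))]
    return sum(r * r for r in resid)


def grad_risk(X, u, U):
    """Gradient of F_risk with respect to the abundances X_k."""
    resid = [u[j] - sum(X[k] * U[k][j] for k in range(len(X))) for j in range(len(u))]
    return [-2.0 * sum(resid[j] * U[k][j] for j in range(len(u))) for k in range(len(X))]


def population_descent(u, U, X0, dt=0.01, n_steps=5000):
    """Explicit Euler integration of dX_k/dt = -X_k * dF_risk/dX_k."""
    X = list(X0)
    for _ in range(n_steps):
        g = grad_risk(X, u, U)
        X = [x - dt * x * gk for x, gk in zip(X, g)]
    return X


u_target = [0.3, 0.7]                          # hypothetical ideal answer
U_types = [[1.0, 0.0], [0.0, 1.0]]             # hypothetical compartment responses
X_start = [0.5, 0.5]
X_end = population_descent(u_target, U_types, X_start)
print(X_end, risk(X_start, u_target, U_types), risk(X_end, u_target, U_types))
```

The factor `x` in each update is what keeps the abundances positive along the trajectory, in contrast to plain gradient descent.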
A good approximation to the solution of the problem IP can be found for large $N_{cell}$, as follows. First, we solve RP and obtain the solution $X^*$ (as described above). Then we apply the standard rounding step, replacing each $X_i^*$ with the closest element of the set $K_{+,N_{cell}}$. We denote the vector obtained in this way by $X_I^*$. We note that
$$|F_{risk}(X^*) - F_{risk}(X_I^*)| < C N_{cell}^{-1},$$
where $C$ is a constant. If $F_{risk}(X^*) = 0$ for certain feedbacks $U^{(k)}(v, \eta)$, we have
$$F_{risk}(X_I^*) < C N_{cell}^{-1}.$$
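The rounding step and the resulting bound can be checked on a toy case. In this sketch the compartment responses are assumed to be unit vectors, so that the risk reduces to a sum of squared coordinate errors; the target vector and the cell numbers are hypothetical, and the constant in the bound is taken to be 1 for this example only.

```python
def round_to_lattice(X, n_cell):
    """Replace each X_k with the nearest multiple of 1 / n_cell."""
    return [round(x * n_cell) / n_cell for x in X]


def risk_unit_responses(X, u):
    """F_risk when the k-th response is the k-th unit vector: sum_k (u_k - X_k)^2."""
    return sum((uk - xk) ** 2 for uk, xk in zip(u, X))


u_target = [1.0 / 3.0, 2.0 / 3.0]      # hypothetical ideal answer
X_relaxed = list(u_target)              # exact solution of the relaxed problem here

for n_cell in (10, 100, 1000):
    X_integer = round_to_lattice(X_relaxed, n_cell)
    print(n_cell, X_integer, risk_unit_responses(X_integer, u_target))
```

Each coordinate moves by at most `1 / (2 * n_cell)` under rounding, so the printed risk values shrink as the number of cells grows.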
Let us find sufficient conditions under which this estimate is satisfied. Note that the set
$$CH(u^{(1)}, \ldots, u^{(m)}) = \Big\{ u \in \mathbb{R}^{n} : u = \sum_{k=1}^{N_c} X_k u^{(k)}, \ X_k \ge 0, \ \sum_k X_k = 1 \Big\}$$
is the simplex with vertices $u^{(k)}$. For each $v \in W_\delta$, let us consider the set $U_v^*$ of all possible ideal answers
$$U_v^* = \{ u \in \mathbb{R}^{n_{reg}} : \exists\, \eta \in D_E, \ u = U^*(v, \eta) \}.$$
Proposition 1.
Let $m \ge n_{reg} + 1$. Assume that for each $v \in W_\delta$, one has the inclusion
$$U_v^* \subset CH(u^{(1)}, \ldots, u^{(m)}),$$
where $u^{(k)} \in \Phi_{N,\sigma}$. Then for each $v \in W_\delta$, there is an $X^* \in \mathbb{R}^{N_c}$ such that $F_{risk}(X^*) = 0$ and, thus, (34) holds.
Proof. 
It follows at once from the definitions. □
This claim indicates that adapting to stresses does not necessitate fine-tuning genetic networks within the differentiated cells of multicellular organisms. To create an effective response to stress, it is enough to correctly encode interactions between cells in different compartments. Moreover, the Carathéodory Theorem leads to the following conclusion: to satisfy (34), it is sufficient to have more cell types than the regulator dimension: $N_c > n_{reg}$.
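The convex hull condition can be made tangible in the smallest non-trivial case: a two-dimensional regulator and three cell types. The sketch below computes barycentric coordinates by Cramer's rule; a target answer lies in the hull exactly when all three coordinates are non-negative, and the coordinates then play the role of the abundances. The triangle and the test points are hypothetical.

```python
def barycentric(u, a, b, c):
    """Weights (x1, x2, x3) with u = x1*a + x2*b + x3*c and x1 + x2 + x3 = 1.

    Solves u - a = x2*(b - a) + x3*(c - a) by Cramer's rule; the triangle
    a, b, c must be non-degenerate.
    """
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    x2 = ((u[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (u[1] - a[1])) / det
    x3 = ((b[0] - a[0]) * (u[1] - a[1]) - (u[0] - a[0]) * (b[1] - a[1])) / det
    return (1.0 - x2 - x3, x2, x3)


def in_convex_hull(u, a, b, c):
    """True if u is a convex combination of the three vertices."""
    return all(x >= 0.0 for x in barycentric(u, a, b, c))


# Hypothetical responses of three cell types in a two-dimensional regulator space.
v1, v2, v3 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

print(barycentric((0.25, 0.25), v1, v2, v3))
print(in_convex_hull((0.25, 0.25), v1, v2, v3))
print(in_convex_hull((1.0, 1.0), v1, v2, v3))
```

For a target inside the triangle, mixing the three fixed responses with these weights reproduces it exactly, which is the zero-risk situation described in the proposition.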

2.7. Stochastic Equation for Feedback Evolution

In this subsection, we describe a model for the evolution of the GRN. Let $U_{net,s}(\eta)$ be feedback maps, where all parameters are encoded by binary genes $s = (s_1, s_2, \ldots, s_N)$ (see Appendix A.3). At the initial moment, we have a random binary string $s(0)$. Consider the time interval $[0, T]$, where $T \gg 1$ is large. We choose a sequence of random moments $T_1, T_2, \ldots, T_m \in [0, T]$ such that $T_1 < T_2 < \cdots < T_m < T$. At these moments, mutations occur in $s$. To describe this process, we can use an analog of the master equation, which is basic in physics. Let $p(s, t)$ represent the probability that the system is in state $s$ at time $t$. The probability of transitioning from $s$ to $s'$ is denoted by $w(s \to s')$. We also introduce a special state, 0, corresponding to the destruction of the system, and the probability of transitioning from $s$ to 0 is denoted by $q(s)$. We suppose that the likelihood of survival is completely determined by $s$ (the theory of gene-trait maps, pioneered by R. Fisher ([5], 1930); see also [7,31,32]). Then we have
$$\frac{dp(s, t)}{dt} = \sum_{s'} w(s' \to s)\, p(s', t) - \sum_{s'} w(s \to s')\, p(s, t) - q(s)\, p(s, t),$$
$$\frac{dp(0, t)}{dt} = \sum_{s} q(s)\, p(s, t).$$
To make this general model mathematically tractable, we simplify these equations as follows. Instead of the evolution of the genetic code, we consider an evolution in a space of GRN sizes, neglecting details of gene coding. The evolutionary step is conceptualized as an increase in the GRN size as a result of mutations (possibly, a few mutations are necessary).
We simplify the master equation as follows. Instead of the states $s$, we consider states defined by positive integers that give the network sizes, $m = N$, where $m \in \{N_0, \dots, N_{max}\} = [N_0, N_{max}]$ and $N_0 \gg 1$. We consider only the forward transitions $m \to m + 1$ (the networks can extend and cannot shrink). For brevity, the corresponding transition probabilities are denoted by $w_m$. We obtain the following equations:
$$\frac{dp_m}{dt} = -(w_m + q_m)\, p_m + w_{m-1}\, p_{m-1}, \tag{37}$$
where we suppose that $m \in [N_0, N_{max}]$, $w_{N_0 - 1} = 0$, $w_{N_{max}} = 0$, and $p_m$ is the probability of having a network of size $m$. These conditions mean that, in particular, the maximal network size is $N_{max}$. This equation can be written in a short matrix form as $dp/dt = Wp$, where $p = (p_{N_0}, \dots, p_{N_{max}})$ and $W$ is the transition matrix for (37).
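The simplified system (37) is a linear chain of ordinary differential equations and is easy to explore numerically. The following is a minimal sketch using forward Euler integration; the callables `w` and `q`, the step `dt`, and the function name are illustrative assumptions and are not taken from the paper.

```python
def simulate_chain(n0, n_max, w, q, t_end, dt=1e-3):
    """Integrate dp_m/dt = -(w_m + q_m) p_m + w_{m-1} p_{m-1} by forward Euler.

    w(m): growth rate m -> m + 1 (forced to 0 at m = n_max);
    q(m): destruction rate of a network of size m.
    The chain starts at p_{n0}(0) = 1.
    """
    sizes = list(range(n0, n_max + 1))
    p = [0.0] * len(sizes)
    p[0] = 1.0
    for _ in range(int(round(t_end / dt))):
        new_p = []
        for i, m in enumerate(sizes):
            w_out = w(m) if m < n_max else 0.0
            inflow = w(sizes[i - 1]) * p[i - 1] if i > 0 else 0.0
            new_p.append(p[i] + dt * (-(w_out + q(m)) * p[i] + inflow))
        p = new_p
    return dict(zip(sizes, p))
```

With `q` identically zero the total probability is conserved and the mass accumulates in the state `n_max`; a positive `q` removes probability into the destruction state 0, which is not tracked explicitly here.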

3. Materials and Methods

The classical methods of mathematical analysis and theory of differential equations are applied together with the results on approximations of smooth functions by RBF networks [33]. Our approach is based on the following:
(i) The idea of M. Gromov–A. Carbone on replicative stability in systems supporting homeostasis [22];
(ii) The theory of large deviations and stochastic transitions in random dynamical systems (A. Ventsel, M. Freidlin, Yu. Kifer et al.) ([24], 1984);
(iii) Models of morphogenesis proposed by A. Turing ([34]) and others ([19,35,36,37]);
(iv) Universal approximations by networks [33,38,39].

4. Stochastic Stability

Let us formulate the stochastic stability problem.
Following [13], we introduce the probability that, for a given feedback $u = u(\eta)$, the solution $v(t, v(0))$ of the Cauchy problem defined by (1) with initial data (2) lies in $W_\delta$ within the time interval $[0, \tau]$:
$$P_{\delta,\tau,u} = \mathrm{Prob}\{ v(t, v(0)) \in W_\delta \ \ \forall t \in [0, \tau] \}.$$
We suppose that the initial value is the equilibrium: $v(0) = \bar{v}$. The probability $P_{\delta,\tau,u}$ can be considered a measure of stochastic stability. A more relevant measure of stochastic robustness is given by the infimum over all network regulators $u$:
$$P_{\delta,\tau} = \inf_{u \in \Phi_{N,\sigma}} P_{\delta,\tau,u},$$
where the infimum is taken over all feedback functions defined by an RBFN of size $N$.
In the multi-compartment case, we take the infimum over all $X_k$ and $U^{(k)}$:
$$P_{\delta,\tau} = \inf_{X_k \in K_{N_{cell}},\ U^{(k)}(v,\eta) \in \Phi_{N,\sigma},\ k = 1, \dots, N_c} P_{\delta,\tau,U^*(X,v,\eta)},$$
where
$$U^*(X, v, \eta) = \sum_{k=1}^{N_c} X_k\, U^{(k)}(v, \eta).$$

5. Main Results

Let us formulate the main results. First, we choose $\delta > 0$ in a special way. If necessary, we diminish $\delta$ to satisfy the following condition:
$$|(A\tilde{v}, \tilde{v})| \gg \|h(\tilde{v})\|, \quad \|\tilde{v}\| < \delta. \tag{41}$$
Theorem 1. 
Let (41) hold. Consider the stochastic Equation (23) under Assumptions 1–4. Let the feedback $u = U_{net}(v, \eta)$ satisfy estimate (18).
Then one has the following estimate of the stochastic stability via the network size $N$, as $\kappa \to 0$:
$$P_{\delta,\tau} = 1 - P_{out,\kappa,N}, \tag{42}$$
where
$$\kappa^2 \log P_{out,\kappa,N} < -C_{sur}\, \delta^2 N^{2\gamma}, \tag{43}$$
where $C_{sur}$ is a positive constant, depending on $f$, and uniform in $(\kappa, N)$.
The second theorem concerns the multi-compartment case.
Theorem 2. 
Consider the multi-compartment system (26) under Assumptions 1–4. Let the conditions of Proposition 1 hold. Then one has the following estimate of the stochastic stability via the numbers $N$, $N_{cell}$ as $\kappa \to 0$:
$$P_{\delta,A,\tau} = 1 - P_{out,\kappa}, \tag{44}$$
where
$$\kappa^2 \log P_{out,\kappa} < -C_1 N_{cell}^{-1} \delta^2, \tag{45}$$
where $C_1$ is a positive constant that is uniform in $\kappa > 0$ and $N_{cell}$.

6. Proof of Theorems 1 and 2

Proof. 
Let us estimate the probability $P_{out,\kappa,N}$ that the trajectory $v(t)$ exits $W_\delta$ within the time interval $[0, T]$ under the condition that $v(0) = \bar{v}$. We use the following estimate (see Appendix A.4) for the probability $P_{out}$:
$$\kappa^2 \log P_{out,\kappa,N} < -C(f)\, \delta^2 \|g\|^{-2}, \tag{46}$$
where
$$\|g(v, \eta, u)\|^2 = \max_i \sum_{l=1}^{p} |g_{il}(v, \eta, u)|^2.$$
Let us estimate $\|g(\bar{v}, \eta)\|$, considering the size of the network $N$. According to Assumption 3, there exists a $U^*(v, \eta)$ such that
$$g_{il}(v, \eta, U^*) = 0 \quad \forall i, l.$$
Due to (18),
$$\sup_{i, l, \eta \in D_E} |g_{il}(U^*(v, \eta)) - g_{il}(U_{Net}(v, \eta))| < C_1 N^{-\gamma},$$
thus
$$\sup_{i, l, \eta \in D_E} |g_{il}(U_{Net}(\eta))| < C_2 N^{-\gamma},$$
where $C_1, C_2 > 0$ are constants. Hence $\|g\|^{-2}$ grows at least as fast as $N^{2\gamma}$, and this last estimate together with (46) implies the assertion of Theorem 1. □
To prove Theorem 2, we repeat the arguments from the proof of Theorem 1 and use the definition of $F_{risk}$.

7. Transitions to Complex GRN

Using estimates of viability from Theorems 1 and 2 and the simplified Master Equation (37), we can estimate the probability of reaching the maximally complex regulation network state, $m = N_{max}$.
Let us denote by $N$ the size of the gene regulation network. In real genetic networks, the degrees of most nodes are bounded. Therefore, only a bounded number $N_{mut} = O(1)$ of mutations is needed for the network to increase its size from $N$ to $N + 1$ as a result of the gene code modification $s \to s'$. It is natural, therefore, to assume that the transition probabilities $w_m$ admit the estimate:
$$w_m = w(m \to m + 1) > C\, p_{mut}^{N_{mut}}, \quad N_{mut} < C_*, \tag{47}$$
where $p_{mut} > 0$ is the probability of a single mutation and the constant $C_*$ is uniform in the network size $m$. It is shown by Theorem 1 that
$$\log q_m < -C \kappa^{-2} m^{2\gamma}, \quad \gamma > 0, \tag{48}$$
where $C > 0$ is uniform in $m$. Under these assumptions, the following claim holds:
Proposition 2. 
Let $w_m$, $q_m$ satisfy (47) and (48), respectively, and $p_{N_0}(0) = 1$. Then for $N_0 \gg 1$, sufficiently small $\kappa > 0$, and $t$ such that $t \gg \max_{m \in [N_0, N_{max}]} w_m^{-1}$, the solution $p(t)$ lies in the state $N_{max}$ with an overwhelming probability as $N_0 \to \infty$:
$$\max_{m \ne N_{max}} p_m < p_{N_{max}} \exp(-c N_0^{\gamma_0}),$$
where $c, \gamma_0 > 0$ are constants, uniform in $N_0$.
Proof. 
By estimates (47) and (48), system (37) can be investigated by the standard algebraic method. Let $\psi^{(k)}$ denote the eigenvectors of the linear operator $p \to Wp$ and $\phi^{(l)}$ denote those of the conjugate operator $p \to W^* p$. These functions form a biorthogonal system:
$$(\phi^{(k)}, \psi^{(l)}) = \delta_{kl},$$
where $\delta_{kl}$ stands for the Kronecker symbol. The corresponding eigenvalues are $\lambda_k = -w_k - q_k < 0$. Suppose that all $w_k$ are different. Under conditions (47) and (48), the minimum of absolute values is $|\lambda_{N_{max}}|$, where
$$|\lambda_{N_{max}}| \ll \min_{k \ne N_{max}} |\lambda_k|.$$
The solution of (37) with the initial data $p(0)$ has the form
$$p(t) = \sum_{k=N_0}^{N_{max}} (p(0), \phi^{(k)})\, \psi^{(k)} \exp(\lambda_k t).$$
Furthermore, we use Formula (50) and expressions for eigenvectors $\psi^{(k)}$ and $\phi^{(k)}$. One has, in particular,
$$\psi^{(N_{max})} = c_0 (0, 0, \dots, 0, 1),$$
where all components are equal to zero except for the last one, and $\phi^{(N)} = c_1 (a_{N_0}, a_{N_0+1}, \dots, a_N)$, where $N = N_{max}$, and $a_{N_0} > 0$ is not exponentially small. Constants $c_0, c_1 > 0$ are not exponentially small in $N_0$. Now the result follows from (50). □
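The spectral structure used in the proof can be checked directly, because the matrix $W$ of (37) is lower bidiagonal: its eigenvalues are its diagonal entries, and each right eigenvector follows by forward substitution. The sketch below assumes distinct eigenvalues and uses hypothetical rate lists `w` and `q` indexed from the smallest network size; the helper names are illustrative.

```python
def rates_to_matrix(w, q):
    """Lower-bidiagonal matrix W with W[i][i] = -(w_i + q_i) and W[i][i-1] = w_{i-1}."""
    n = len(w)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        mat[i][i] = -(w[i] + q[i])
        if i > 0:
            mat[i][i - 1] = w[i - 1]
    return mat


def right_eigenvector(w, q, k):
    """Eigenpair (lambda_k, psi) of W; assumes all eigenvalues are distinct."""
    n = len(w)
    lam = [-(w[i] + q[i]) for i in range(n)]
    psi = [0.0] * n
    psi[k] = 1.0  # components with index below k vanish
    for i in range(k + 1, n):
        psi[i] = w[i - 1] * psi[i - 1] / (lam[k] - lam[i])
    return lam[k], psi


def residual(w, q, k):
    """Largest component of W psi - lambda psi, as a numerical check."""
    mat = rates_to_matrix(w, q)
    lam, psi = right_eigenvector(w, q, k)
    n = len(w)
    return max(abs(sum(mat[i][j] * psi[j] for j in range(n)) - lam * psi[i])
               for i in range(n))
```

When the last growth rate is zero and the destruction rates are small, the last eigenvalue is the one closest to zero and its eigenvector is concentrated on the largest network, which is the slow mode that dominates at large times.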

8. Morphogenesis Models and the Number of Coding Genes

In this section, we first present a short review of different mathematical models of morphogenesis. Different approaches were developed: reaction–diffusion models [34,36,37], mechanochemical models, and connectionist models [18,25,40].

8.1. Brief Overview of Models

Reaction–diffusion models have received great attention and become popular due to the seminal work [34]. These models successfully describe pattern formation in biology, physics, and chemistry [36,41,42,43,44,45], wave propagation, chaos ([46]), and other important phenomena. Turing’s instability mechanism proposed in [34] can explain the emergence of layered periodic patterns, such as zebra stripes [41]. However, the famous biologist S. Brenner noted in [47] that biological support for Turing’s idea has been marginal. Patterns of Drosophila development do not fit the Turing theory. Moreover, the Turing approach involves no information on gene expression. Nonetheless, reaction–diffusion models describe many observed patterns and effects, for example, somitogenesis [35,43,48,49,50,51]. The model from ([43]) describes waves that generate periodic layered patterns. Moreover, it is shown that reaction–diffusion models are capable of generating any cell pattern, i.e., they potentially have a formidable pattern capacity. In fact, if we combine Turing’s ideas with the celebrated Wolpert concept of positional information (see [52]), then reaction–diffusion models (even with two components) are capable of generating all possible cell patterns (see [53]), not only periodic ones. The mechanism working in the model ([53]) is based on the presence of long-range interactions.
However, it is well-known that real morphogenesis is based not only on biochemical interactions but also on mechanical effects. It is an extremely complicated process; see [54] for a recent review. Biological media can be characterized as viscous–elastic and can be in fluid or solid states; transitions between solid and fluid states are possible. Tissues form media that can change their mechanical properties during the morphogenesis process [54]. Such complicated models can be studied mainly via numerical simulations. However, in [55], a simple mathematically tractable model of an excitable mechanochemical medium is described, where waves resembling moving Turing machines arise and facilitate cell differentiation.
Many reaction–diffusion and mechanochemical models do not explicitly use genetic information, only general ideas about morphogens and cell differentiation. One can suppose that their parameters are encoded by a genetic code. Connectionist models, similar to neural networks, are capable of describing real patterns of gene expression in the Drosophila segmentation process ([18,19,25,40,56]) and their robustness [57]. In neural networks, connections define neuron interactions; similarly, in connectionist morphogenesis models, pair interactions between transcription factors (TFs) are determined by a matrix. Similar to classical neural nets, these models do not take into account triple interactions. In fact, one can show that a practically arbitrary reaction–diffusion model can be simulated by a connectionist system with a sufficiently large interaction matrix and pair interactions.

8.2. Number of Genes

Although we are still far from a comprehensive description of morphogenesis that incorporates genes, biochemistry, and mechanical effects, we can nevertheless draw interesting conclusions, which will be used in the next section to enhance our understanding of problem II. First, how many genes are needed for morphogenesis and stress responses, as described in Section 2? For example, the models proposed in [37,55] can be encoded by a small number of genes $N_{mor}$, and $N_{mor}$ depends only weakly on the organism size and its complexity. The models from [37,55] look like toy ones; nonetheless, one can suppose that this conclusion about the number of coding genes is applicable to more realistic systems. In fact, it is well-known that mammals and insects have many similar genes involved in morphogenesis.
To encode compartment structures and their cell dynamics, as considered in Section 2.5, only a small number of genes is needed. In fact, to define dynamics (24), it is necessary to encode the coefficients $r_k$ and $K_{kl}$. If an organism consists of $N_c$ cell types, the corresponding number of genes has the order $O(N_c^2)$.

9. Multicellularity and Peto’s Paradox

9.1. How Multicellularity Supports Homeostasis

To understand why multicellularity is advantageous for adaptation and how it could have arisen in evolution, we use the results of the previous sections. Let us consider an organism consisting of $M = N_{cell}$ cells, and let cells of that organism replicate $T_{life}$ times. The number $T_{life}$ is an integer, which corresponds to the number of divisions of cells during the organism’s life. Since somatic cells can die and have the Hayflick limit, one can take $T_{life}$ as a theoretical Hayflick limit. Let $q_{out}(M, N_{reg})$ be the probability of homeostasis violations for an average somatic cell, which depends on two main parameters, $M$ and $N_{reg}$ (the size of the gene regulation networks). Then the probability that all cells survive together within $T_{life}$ replications is
$$P_{viable}(M, N_{reg}) \approx \left(1 - q_{out}(M, N_{reg})\right)^{T_{life} M}.$$
The asymptotics of $q_{out}$ are given by Theorem 2. Since $q_{out}$ is small, we obtain
$$\log P_{viable}(M, N_{reg}) \approx -q_{out}(M, N_{reg})\, N_{cell} T_{life} \approx -M\, T_{life} \exp(-C_s M^2),$$
where $C_s$ is a constant. This computation is elementary but it leads to the following interesting conclusions. An increase in the gene regulation network produces an exponential gain in homeostasis stability, and the same effect gives an increase in the organism’s mass. One can suppose that mass growth is limited by ecological resources only. We also conclude that it is useful to diminish $T_{life}$. Multicellular organisms could have emerged because the time required to build a network of size $M$ depends polynomially on that size, and although this time is large, it is much less compared to the exponential gain in adaptability.
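The viability estimate above is easy to evaluate in log space, which avoids underflow when the per-cell failure probability is tiny. The sketch below compares the exact logarithm with its first-order approximation; the function names and the numerical values used to call them are illustrative assumptions, not values from the paper.

```python
import math


def log_viability(q_out, t_life, n_cells):
    """Exact log of (1 - q_out) ** (t_life * n_cells), computed stably."""
    return t_life * n_cells * math.log1p(-q_out)


def log_viability_approx(q_out, t_life, n_cells):
    """First-order approximation -q_out * t_life * n_cells, valid for small q_out."""
    return -q_out * t_life * n_cells


def q_out_model(n_cells, c_s):
    """Hypothetical per-cell failure probability exp(-c_s * n_cells ** 2)."""
    return math.exp(-c_s * n_cells ** 2)
```

For small `q_out` the two logarithms agree closely, and because `q_out_model` decays much faster than the factor `t_life * n_cells` grows, the log-viability approaches zero as the number of cells increases.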

9.2. Peto’s Paradox: When a Large Mass Can Help

Let us now consider the Peto paradox. To this end, we again consider an organism consisting of $M = N_{cell}$ cells making $T_{life}$ replications. Let $p_c(M)$ be the probability of a mutation per generation that can produce cancer (a driver mutation). It is assumed that in order to transform a healthy cell into a cancerous one, $n_{dr}$ driver mutations are necessary (see [58], where different methods of finding $n_{dr}$ are considered). It is thought that cancer typically appears as a result of 1 to 6 driver mutations [58]. Note that most mutations are not driver ones and are not so dangerous.
The probability $P_{c,mod}$ that within $T_d$ divisions a cell acquires $n_{dr}$ mutations and is transformed into a cancer cell is
$$P_{c,mod} = \binom{T_d}{n_{dr}} p_*^{n_{dr}} (1 - p_*)^{T_d - n_{dr}} \approx \frac{(p_* T_d)^{n_{dr}}}{n_{dr}!} \exp(-p_* T_d), \tag{53}$$
where the Bernoulli (binomial) law is replaced with the Poisson distribution because the probability $p_*$ of a driver mutation is very small. We suppose that each driver mutation can appear as a result of a replication error. This probability, as well as the number $n_{dr}$, depends on the cancer type, the organism, and the stress level. A cancer emerges in a tissue. If an organism consists of $N_{cell}$ cells, one can suppose that the number of cells $M_c$ where the cancer can appear is proportional to $N_{cell}$: $M_c \approx c_t N_{cell}$, where the coefficient $c_t \in (0, 1)$ depends on the tissue. The logarithm of the probability $P_{sur}$ that this cancer does not emerge can be roughly approximated by
$$\log P_{sur} \approx c_t N_{cell} \log(1 - P_{c,mod}) \approx -c_t N_{cell} P_{c,mod} \tag{54}$$
(the logarithm is replaced by its asymptotic form because $P_{c,mod}$ is small). It is a very simplified estimate, which does not take into account apoptosis, immune system reactions, etc. The next step is an approximation of $p_*$. It is natural to suppose that, in an ideal environment, $p_*$ has a minimal value $p_0$. Fluctuations of external fields, chemical reagents, toxins, and other stress factors can increase $p_*$. Some experimental data on the dependence of mutation rates on toxin concentrations were obtained in [59], where it was found that the mutation rate in bacteria depends linearly on the toxin concentration. According to Theorem 2, the magnitudes of random fluctuations of reagent concentrations have the order $N_{cell}^{-1}$, and the number $N_{cell}$ of cells, in turn, is proportional to the body mass $M$. Thus, one can expect that $p_*$ decreases with the body mass. We consider a general dependence
$$p_* = p_0 + a_0 N_{cell}^{-\mu}, \tag{55}$$
where $\mu > 0$ and $a_0 > 0$ are coefficients. So, we expect that cells of organisms of larger sizes have lower mutation rates. This conclusion, qualitatively, is consistent with experimental data; see [60]. According to [60], for somatic cells in humans, we have about $3.3 \cdot 10^{-11}$ mutations per base per mitosis, and for a mouse, one has $1.2 \cdot 10^{-10}$; that is, for humans, the mutation frequency is 20 times less. If a typical human weight is ≈70,000 g and a typical mouse is 20 g, then we can assume that $\mu$ lies within the interval $[1/3, 1/2]$.
Relations (53)–(55) show that the probability of acquiring a cancer of a fixed type can increase or decrease with mass; it depends on the parameters $a_0, p_0, n_{dr}$, where $n_{dr}$ is the number of driver mutations, and $a_0, p_0$ are the coefficients in (55). The numerical computations based on these relations are shown in Figure 1. This analysis and the simulations do not take into account the ecological factors connected to resources and prey–predator interaction. On the one hand, a large mass increases the amount of food consumed; on the other hand, it can improve the defense against predators.
The dependence of the survival probability $P_{sur}$ on the stress parameter $\eta$ is complex. The parameter determines the type of stress (such as cooling or heat shock, exposure to toxins, radiation, etc.) and the stress level (for example, concentrations of toxins). The probability of survival depends on $\eta$ through the parameters $a_0, p_0$ (see relation (55)). It is difficult to determine this dependence within the framework of the simple model considered in this section. In the calculations, the parameters were chosen so that the survival probability $P_{sur}$ took on more or less realistic values between 0 and 1. To investigate the dependence $P_{sur}(\eta)$, we need more advanced spatially extended models, taking into account immune reactions and other phenomena.
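Relations (53)–(55) can be combined into a short numerical routine. The sketch below is not the code behind Figure 1; the function names are hypothetical, and every parameter value passed to them is an assumption to be chosen by the reader.

```python
import math


def driver_rate(n_cells, p0, a0, mu):
    """Driver mutation probability p_* = p0 + a0 * n_cells ** (-mu), as in (55)."""
    return p0 + a0 * n_cells ** (-mu)


def cancer_prob_binomial(p_star, t_div, n_dr):
    """Exact binomial probability of n_dr driver mutations in t_div divisions."""
    return math.comb(t_div, n_dr) * p_star ** n_dr * (1 - p_star) ** (t_div - n_dr)


def cancer_prob_poisson(p_star, t_div, n_dr):
    """Poisson approximation of the same probability, as in (53)."""
    lam = p_star * t_div
    return lam ** n_dr / math.factorial(n_dr) * math.exp(-lam)


def log_survival(n_cells, c_t, p0, a0, mu, t_div, n_dr):
    """log P_sur = c_t * n_cells * log(1 - P_c_mod), as in (54)."""
    p_mod = cancer_prob_poisson(driver_rate(n_cells, p0, a0, mu), t_div, n_dr)
    return c_t * n_cells * math.log1p(-p_mod)
```

Scanning `n_cells` over several orders of magnitude for fixed `p0`, `a0`, `mu`, and `n_dr` shows the two competing effects: the prefactor `c_t * n_cells` grows with size, while the per-cell probability falls as `driver_rate` decreases.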

10. Concluding Remarks

The three interconnected biological problems are considered: the origins of multicellular organisms, the Peto paradox on cancer and lifespan, and why complex multicellular organisms use a relatively small number of coding genes. The model, proposed in the paper, describes the response of a complex biochemical system to stress and can be viewed as a generalization of the Valiant model of the ideal answer [20]. The Valiant model considers organisms as circuits generating responses to environmental challenges. In the suggested model, some contemporary ideas on cell differentiation, stress response, and morphogenesis are applied, taking into account, to some extent, the real structures of multicellular organisms: biochemical kinetics, decomposition into compartments (tissues), stem cell activity, and gene regulation. It is found that the number of Boolean genes needed to encode an organism does not correlate with the organism’s size. The most intriguing results are obtained in Section 2.5 and Section 2.6. They show that evolutionary success in adaptation does not require fine-tuning genetic networks in differentiated cells of multicellular organisms. To create an effective response to stress, it is enough to correctly encode the interactions between cells. Evolution does not need to have too many genes for this coding. An analysis of the stress response mechanism leads to the conclusion that species with larger organism masses have larger viability: when the mass increases as a linear function, the viability grows as an exponential function. It is important to note that within the same species, we have an inverse dependence: small dogs live longer than large dogs, an effect that is not yet explained.
Complex traits can evolve adaptively or non-adaptively; the discussion regarding the nature of evolution has been ongoing for many years, starting with Kimura’s seminal work [61]; see ([1], 2011) for an overview. Large multicellular organisms form populations of relatively small effective sizes. This suggests that genetic drift in such populations is stronger than in populations formed by prokaryotes. This fact supports the idea that the evolution of these organisms might have been non-adaptive (constructive neutral evolution). The estimates from this paper do not allow us to draw a definite conclusion about the nature of evolution; however, they show that the emergence of multicellular organisms was not completely improbable.
In fact, they show that even a slight increase in network size can exponentially enhance the probability of maintaining homeostasis, provided the initial GRN was sufficiently large. The likelihood of a series of mutations leading to a more efficient network is small, but it is not exponentially small. Even if almost all mutations were non-adaptive, the process of successive replications leading to an organism with complex genetic regulation is possible.
In conclusion, I would like to express my gratitude to the referees for their useful comments and remarks.

Funding

The author received support from the Ministry of Science and Higher Education of the Russian Federation under agreement no. 075-15-2022-291, dated 15 April 2022. This agreement provided a grant in the form of subsidies from the federal budget for state support in establishing and developing the world-class scientific center, Pavlov Center “Integrative physiology for medicine, high-tech healthcare, and stress-resilience technologies”.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Appendix A.1. Estimates for Approximations by RBF

Let us review some results for approximations by radial basis function networks (RBFNs) [33]. Let us introduce a useful notation.
Let $B^d$ be the unit ball in $\mathbb{R}^d$ centered at 0, and let $|x|$ denote the Euclidean norm of the vector $x \in \mathbb{R}^d$. Let $p \in (1, +\infty]$. Let us denote by $L_p(D)$ the Banach space of all measurable functions on the compact domain $D \subset \mathbb{R}^d$ with the norm
$$\|f\|_p = \left( \int_D |f(x)|^p\, dx \right)^{1/p}.$$
For $p = +\infty$, we use the supremum norm. The Sobolev class $W_p^r(D)$ is defined by
$$W_p^r(D) = \{ f : \max_{k,\ 0 \le |k| \le r} \|D^k f\|_p < \infty \},$$
where $k = (k_1, k_2, \dots, k_d)$ is a multi-index with non-negative integer components $k_i$, $|k| = \sum_{j=1}^{d} k_j$, and
$$D^k f(x) = \frac{\partial^{|k|} f(x)}{\partial x_1^{k_1} \cdots \partial x_d^{k_d}}.$$
We denote by $C^r(D)$ the class of Hölder functions with exponent $r \ge 1$. For two function classes, $U$ and $W$, the distance between those classes is defined by
$$\mathrm{dist}(U, W)_p = \sup_{f \in U} \inf_{g \in W} \|f - g\|_p.$$
The RBFN class $\Phi_{N,\sigma}$ consists of functions of the form
$$F_{net}(x) = \sum_{j=1}^{N} c_j\, \sigma(w_j |x - \bar{x}_j|), \tag{A1}$$
where $c_j, w_j \in \mathbb{R}$ are coefficients, $\bar{x}_j$ are localization centers for the radial basis function $\sigma(z)$, which is a well-localized smooth function (for example, a Gaussian peak), and $N$ is the network size.
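A network of the form (A1) can be evaluated in a few lines. The sketch below takes the Gaussian peak $\sigma(z) = \exp(-z^2)$ as one admissible choice of $\sigma$; the function names and argument layout are illustrative assumptions.

```python
import math


def gaussian(z):
    """A well-localized smooth radial profile, sigma(z) = exp(-z^2)."""
    return math.exp(-z * z)


def rbf_net(x, centers, widths, coeffs, sigma=gaussian):
    """Evaluate F_net(x) = sum_j c_j * sigma(w_j * |x - xbar_j|).

    x: point in R^d; centers: list of N points xbar_j;
    widths: list of N scale factors w_j; coeffs: list of N coefficients c_j.
    """
    total = 0.0
    for c_j, w_j, xbar_j in zip(coeffs, widths, centers):
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, xbar_j)))
        total += c_j * sigma(w_j * dist)
    return total
```

The network size $N$ is simply the common length of `centers`, `widths`, and `coeffs`, so each unit costs $d + 2$ real parameters.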

Appendix A.2. Estimates for RBFN

The following estimate can be obtained (see, for example, [33]):
$$\mathrm{dist}(W_p^r(B^d), \Phi_{N,\sigma}) < C_{rbf} N^{-r/d},$$
where the positive constant $C_{rbf}$ depends on $p$ and $d$ only.

Appendix A.3. Encoding GRN

Consider RBF networks $F_{net}$ (see (A1)). We are going to encode the parameters $\theta_j, c_j, w_j$ of this network by a binary string $s$. According to ([13]), for smooth $\sigma$, this binary coding problem can be resolved as follows. Each $c_l$ is approximated by $M$ bits within a precision $\varepsilon_b \ll \epsilon_k$:
$$|c_l - c_l^b| < \varepsilon_b,$$
where
$$c_l^b = (1/2 - \bar{s}_j) \sum_{l=M_1}^{M_2} s_{j,l}\, 2^{-l}, \quad \bar{s}_j, s_{j,l} \in \{0, 1\}.$$
Similarly, we proceed for $w_j$ and $\theta_j$. We obtain $F_{net,B}$, where all parameters $c_j, w_j, \theta_j$ are replaced by their binary approximations. Therefore,
$$\sup_{v \in D} |F_{net}(v) - F_{net,B}(v)| < C_1 \varepsilon_b$$
for bounded inputs $v$ and some constant $C_1$.
Remark. According to [13], for a network $F_{net}$ with size $N$, we should use $N_g = O(N)\, |\log_2 \varepsilon_b|$ binary genes to construct the network $F_{net,B}$.
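The logarithmic dependence of the gene count on the precision can be seen with a generic fixed-point code. The sketch below is not the coding scheme of [13]; it assumes a coefficient with $|c| < 1$, one sign bit, and a fixed number of fractional bits, and all names are hypothetical.

```python
import math


def encode(c, n_bits):
    """Encode |c| < 1 as a sign bit plus n_bits fractional bits (truncation)."""
    sign = 1 if c < 0 else 0
    frac = abs(c)
    bits = []
    for _ in range(n_bits):
        frac *= 2
        bit = int(frac >= 1.0)
        bits.append(bit)
        frac -= bit
    return sign, bits


def decode(sign, bits):
    """Recover the approximation c_b from the sign bit and the fractional bits."""
    value = sum(b * 2.0 ** -(l + 1) for l, b in enumerate(bits))
    return (1 - 2 * sign) * value


def bits_for_precision(eps):
    """Number of fractional bits that guarantees |c - c_b| < eps."""
    return math.ceil(math.log2(1.0 / eps))
```

Each parameter then costs about $|\log_2 \varepsilon_b|$ bits, so a network with $O(N)$ parameters needs $O(N)\,|\log_2 \varepsilon_b|$ binary genes, in line with the remark above.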

Appendix A.4. Estimate of Ventsel–Freidlin Distance

In this subsection, we derive the estimate (46) from the proof of Theorem 1:
$$\kappa^2 \log P_{out,\kappa,N,T} < -C(f)\, \delta^2 \|g\|^{-2}. \tag{A3}$$
This estimate can be obtained in different ways and is, in fact, almost obvious. Indeed, if we remove the terms $h_i$ in (23), this equation becomes a linear stochastic equation, which defines a vector-valued Ornstein–Uhlenbeck process. This process is Gaussian with zero mean; hence, it is fully characterized by its covariance matrix. This matrix can be derived in a standard manner, linked to a fundamental physical principle, the fluctuation–dissipation theorem (see, for example, [62]). The second variant involves investigating the Fokker–Planck equation corresponding to (23). For small $\kappa > 0$, we can obtain an asymptotic expansion of the solutions of that Fokker–Planck equation at equilibrium. This solution characterizes a probability density function, resembling a slightly perturbed multivariate Gaussian distribution, predominantly localized at the equilibrium.
We use the theory of small random perturbations of dynamical systems developed in ([24], 1984). Let us define the matrix $D$ with entries $D_{ij}$ by
$$D_{ij} = \sum_{l=1}^{p} g_{il}\, g_{jl}.$$
This matrix is symmetric and positive definite, and $D^{-1}$ exists. We note that
$$(D^{-1} w, w) \ge \|g\|^{-2} \|w\|^2, \tag{A4}$$
where $w$ is a vector and $(\cdot, \cdot)$, $\|\cdot\|$ denote the standard scalar product and the norm in Euclidean space $\mathbb{R}^n$. Indeed, let $\tilde{w} = D^{-1/2} w$. Then
$$(D^{-1} w, w) = (D^{-1/2} w, D^{-1/2} w) = \|\tilde{w}\|^2.$$
By this notation, (A4) can be rewritten as
$$\|\tilde{w}\|^2 \ge \|g\|^{-2} \|w\|^2. \tag{A5}$$
We note that
$$\|w\|^2 = \|D^{1/2} \tilde{w}\|^2 = (D \tilde{w}, \tilde{w}) = (g, \tilde{w})^2.$$
By the Cauchy–Schwarz inequality,
$$\|w\|^2 \le \|g\|^2 \|\tilde{w}\|^2,$$
which is equivalent to (A5).
Let us introduce the Ventsel–Freidlin distance $I(w, v)$ between two points, $w$ and $v$, by
$$2 I(w, v) = \inf_{\phi(\cdot)} \int_0^T \left( D^{-1}\left[ A(\bar{v})\phi(t) + h(\phi) + \frac{d\phi}{dt} \right],\ A(\bar{v})\phi(t) + h(\phi) + \frac{d\phi}{dt} \right) dt,$$
where the infimum is taken over all differentiable trajectories $\phi(t)$ that connect $w$ and $v$:
$$\phi(0) = w, \quad \phi(T) = v.$$
According to classical results ([24], 1984), this distance allows one to find an asymptotic estimate of large deviations from the equilibrium for the system (23). We set $w = 0$ and let $v \in \partial W_\delta$. Then
$$\kappa^2 \log P_{out} \to -I(0, v), \quad \kappa \to 0.$$
To estimate $I(0, v)$, we introduce a trajectory $\psi$ by
$$\phi(t) = \exp(-At)\, \psi(t).$$
Estimate (A4) gives
$$I(0, v) \ge \|g\|^{-2} \inf_{\phi(\cdot)} \int_0^T \left\| A(\bar{v})\phi(t) + h(\phi(t)) + \frac{d\phi}{dt} \right\|^2 dt,$$
which is equal to
$$I(0, v) \ge \|g\|^{-2} R, \quad R = \inf_{\psi(\cdot)} \int_0^T \left\| \exp(-At) \frac{d\psi}{dt} + h(\phi(t)) \right\|^2 dt.$$
Using (20) and taking into account $\|\phi(t)\| \le \delta$ for all $t \in [0, T]$, one has
$$\|h(\phi(t))\| \le c \|\phi(t)\|^2 \le c\, \delta\, \|\phi(t)\|,$$
which gives
$$I(0, v) \ge R = R_1 - R_2,$$
where
$$R_1 = \inf_{\psi(\cdot)} \int_0^T \left\| \exp(-At) \frac{d\psi}{dt} \right\|^2 dt$$
and
$$R_2 = \delta^2 \int_0^T \| \exp(-At)\, \psi(t) \|^2\, dt.$$
The relation for $R_1$ can be transformed into
$$R_1 = \inf_{\psi(\cdot)} \int_0^T \left( \exp(-At) \frac{d\psi}{dt},\ \exp(-A^* t) \frac{d\psi}{dt} \right) dt,$$
where $A^*$ is the operator conjugate to $A$. One has
$$R_1 = \inf_{\psi(\cdot)} \int_0^T \left( \exp(-(A + A^*) t) \frac{d\psi}{dt},\ \frac{d\psi}{dt} \right) dt.$$
The operator L = A + A * 2 is self-adjoint and has negative real-valued eigenvalues; thus, we can use the spectral decomposition. Let Ψ n be orthogonal eigenfunctions of L , and let λ n R , δ * be the corresponding eigenvalues. Let us denote the Fourier coefficients by a k ( t ) = ( ψ , Ψ k ) . Then
$$R_1(a) = \sum_{k=1}^{n} \int_0^T \exp(-2\lambda_k t) \left( \frac{da_k}{dt} \right)^2 dt.$$
Similarly,
$$R_2(a) = \delta^2 \sum_{k=1}^{n} \int_0^T \exp(-2\lambda_k t)\, a_k(t)^2\, dt,$$
where $a = (a_1, a_2, \dots, a_n)$.
We are going to find the minimum of $R(a(\cdot)) = R_1 - R_2$ under the condition $\|v(T)\| \ge \delta$. This condition can be rewritten as
$$\sum_{k=1}^{n} a_k(T)^2 \exp(-2\lambda_k T) \ge \delta^2. \tag{A13}$$
This minimization problem can be resolved. The Euler–Lagrange equations have the following form:
$$\frac{d^2 a_k}{dt^2} - 2\lambda_k \frac{da_k}{dt} - \delta^2 a_k = 0.$$
They are independent for each $k$. Solutions of these equations have the following form:
$$a_k(t) = b_k \left( \exp(\gamma_k t) - \exp(\tilde{\gamma}_k t) \right),$$
where $b_k$ are unknown real numbers and
$$\gamma_k = 2\lambda_k + \frac{\delta^2}{\lambda_k} + O(\delta^4),$$
$$\tilde{\gamma}_k = -\frac{\delta^2}{\lambda_k} + O(\delta^4)$$
for small $\delta > 0$. We substitute these relations into (A13) and $R(a(\cdot))$. As a result, we obtain the following minimization problem with respect to the unknown coefficients $b_k$: find the minimum of
$$R(b) = \sum_k b_k^2 \int_0^T \exp(-2\lambda_k t) \left[ \left( \gamma_k \exp(\gamma_k t) - \tilde{\gamma}_k \exp(\tilde{\gamma}_k t) \right)^2 - \delta^2 \left( \exp(\gamma_k t) - \exp(\tilde{\gamma}_k t) \right)^2 \right] dt$$
under the condition
$$\sum_k b_k^2 \left( \exp(\gamma_k T) - \exp(\tilde{\gamma}_k T) \right)^2 \exp(-2\lambda_k T) \ge \delta^2.$$
The minimization problem can be resolved, and we obtain
R δ 2 ( 4 min λ k ) 1 δ 2 / 4 δ * .
This implies (A3).

References

  1. Koonin, E.V. The Logic of Chance: The Nature and Origin of Biological Evolution; FT Press: NewYork, NY, USA, 2011. [Google Scholar]
  2. Iranzo, J.; Lobkovsky, A.E.; Wolf, Y.I.; Koonin, E.V. Virus-host arms race at the joint origin of multicellularity and programmed cell death. Cell Cycle 2014, 13, 3083–3088. [Google Scholar] [CrossRef]
  3. Koonin, E.V. Viruses and mobile elements as drivers of evolutionary transitions. Phil. Trans. R. Soc. B 2016, 371, 20150442. [Google Scholar] [CrossRef] [PubMed]
  4. Maynard Smith, J.; Schatzmary, E. The Major Transitions in Evolution; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
  5. Fisher, R.A. The Genetical Theory of Natural Selection; Clarendon Press: Oxford, UK, 1930. [Google Scholar]
  6. Orr, H.A. Adaptation and cost of complexity. Evolution 2000, 54, 13–20. [Google Scholar] [CrossRef] [PubMed]
  7. Orr, H.A. The genetic theory of adaptation: A brief history. Nat. Rev. Genet. 2005, 6, 119–127. [Google Scholar] [CrossRef] [PubMed]
  8. Peto, R.; Roe, F.J.C.; Lee, P.N.; Levy, L.; Clack, J. Cancer and ageing in mice and men. Br. J. Cancer 1975, 32, 411–426. [Google Scholar] [CrossRef] [PubMed]
  9. Caulin, A.F.; Maley, C.C. Peto’s Paradox: Evolution’s prescription for cancer prevention. Trends Ecol. Evol. 2011, 26, 175–182. [Google Scholar] [CrossRef]
  10. Kobayashi, H.; Man, S. cquired multicellular-mediated resistance to alkylating agents in cancer. Proc. Natl. Acad. Sci. USA 1993, 90, 3294–3298. [Google Scholar] [CrossRef]
  11. Domazet-Loso, T.; Tautz, D. Phylostratigraphic tracking of cancer genes suggests a link to the emergence of multicellularity in metazoa. BMC Biol. 2010, 8, 66. [Google Scholar] [CrossRef]
  12. Dang, C. Links between metabolism and cancer. Genes Dev. 2012, 26, 877–890. [Google Scholar] [CrossRef]
  13. Vakulenko, S.; Grigoriev, D. Deep gene networks and response to stress. Mathematics 2021, 9, 3028. [Google Scholar] [CrossRef]
  14. Hansberg, W.; Aguirre, J.; Rís-Momberg, M.; Rangel, P.; Peraza, L.; Montes de Oca, Y.; Cano-Domínguez, N. Cell differentiation as a response to oxidative stress. Br. Mycol. Soc. Symp. Ser. 2008, 27, 235–257. [Google Scholar]
  15. Tower, J. Stress and stem cells. Wiley Interdiscip. Rev. Dev. Biol. 2012, 1, 789–802.
  16. Bornstein, S.; Steenblock, C.; Chrousos, G.P.; Schally, A.V.; Beuschlein, F.; Kline, G.; Krone, N.P.; Licinio, J.; Wong, M.L.; Ullmann, E.; et al. Stress-inducible-stem cells: A new view on endocrine, metabolic and mental disease? Mol. Psychiatry 2019, 24, 2–9.
  17. Greaves, R.B.; Dietmann, S.; Smith, A.; Stepney, S.; Halley, J.D. A conceptual and computational framework for modelling and understanding the non-equilibrium gene regulatory networks of mouse embryonic stem cells. PLoS Comput. Biol. 2017, 13, e1005713.
  18. Jiang, P.; Ludwig, M.L.; Kreitman, M.; Reinitz, J. Natural variation of the expression pattern of the segmentation gene even-skipped in Drosophila melanogaster. Dev. Biol. 2015, 405, 173–181.
  19. Mjolsness, E.; Sharp, D.H.; Reinitz, J. A connectionist model of development. J. Theor. Biol. 1991, 152, 429–453.
  20. Valiant, L.G. Evolvability. J. ACM 2006, 120, 1–19.
  21. Vakulenko, S.; Grigoriev, D. Instability, complexity, and evolution. Zap. Nauchn. Sem. POMI 2008, 360, 31–69.
  22. Gromov, M.; Carbone, A. Mathematical slices of molecular biology. Gaz. Mathématiciens 2001, 88A.
  23. Vakulenko, S. Complexity and Evolution of Dissipative Systems; de Gruyter: Berlin, Germany, 2014.
  24. Ventsel, D.A.; Freidlin, M.I. Random Perturbations of Dynamic Systems; Springer: New York, NY, USA, 1984.
  25. Reinitz, J.; Sharp, D.H. Mechanism of formation of eve stripes. Mech. Dev. 1995, 49, 133–158.
  26. Valiant, L.G. Evolvability. J. ACM 2009, 56, 1–21.
  27. Horsthemke, W.; Lefever, R. Noise-induced Transitions; Springer: Berlin/Heidelberg, Germany, 1984.
  28. Chen, B.; Abuassba, A.O. Compartmental Models with Application to Pharmacokinetics. Procedia Comput. Sci. 2021, 187, 60–70.
  29. Gillespie, D.T. Exact Stochastic Simulation of Coupled Chemical Reactions. J. Phys. Chem. 1977, 81, 2340–2361.
  30. Henry, D. Geometric Theory of Semilinear Parabolic Equations; Springer: Berlin/Heidelberg, Germany, 1981.
  31. Szendro, I.G.; Schenk, M.F.; Franke, J.; Krug, J.; de Visser, J.A.G. Quantitative analyses of empirical fitness landscapes. J. Stat. Mech. Theory Exp. 2013, 2013, P01005.
  32. Jiang, P.; Reinitz, J.; Kreitman, M. The effect of mutational robustness on the evolvability of multicellular organisms and eukaryotic cells. J. Evol. Biol. 2023, 36, 906–924.
  33. Lin, S.; Liu, X.; Rong, Y.; Xu, Z. Almost optimal estimates for approximation and learning by radial basis function networks. Mach. Learn. 2014, 95, 147–164.
  34. Turing, A. The chemical basis of morphogenesis. Phil. Trans. Roy. Soc. B 1952, 237, 37–72.
  35. Cooke, J.; Zeeman, E. A Clock and Wavefront model for control of the number of repeated structures during animal morphogenesis. J. Theor. Biol. 1976, 58, 455–476.
  36. Meinhardt, H. Models of Biological Pattern Formation; Academic Press: London, UK, 1982; pp. 1–215.
  37. Baker, R.; Schnell, S.; Maini, P. A clock and wavefront mechanism for somite formation. Dev. Biol. 2006, 293, 116–126.
  38. Kishan, K.; Rui, L.; Cui, F.; Yu, Q.; Haake, A.R. GNE: A deep learning framework for gene network inference by aggregating biological information. BMC Syst. Biol. 2019, 13, 38.
  39. Shen, Z.; Yang, H.; Zhang, S. Nonlinear Approximation via Compositions. Neural Netw. 2019, 119, 74–84.
  40. Surkova, S.; Spirov, A.V.; Gursky, V.V.; Janssens, H.; Kim, A.R.; Radulescu, O.; Vanario-Alonso, C.E.; Sharp, D.H.; Samsonova, M.; Reinitz, J.; et al. Canalization of gene expression in the Drosophila blastoderm by gap gene cross regulation. PLoS Biol. 2009, 7, e1000049.
  41. Murray, J.D. Mathematical Biology, 3rd ed.; Springer: Berlin/Heidelberg, Germany, 2002.
  42. Crampin, E.J.; Gaffney, E.A.; Maini, P.K. Reaction and diffusion on growing domains: Scenarios for robust pattern formation. Bull. Math. Biol. 1999, 61, 1093–1120.
  43. Krause, A.L.; Gaffney, E.A.; Maini, P.K.; Klika, V. Modern Perspectives on Near-Equilibrium Analysis of Turing Systems. Philos. Trans. A 2021, 379, 20200268.
  44. Nicolis, G.; Prigogine, I. Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order through Fluctuations; Wiley and Sons: New York, NY, USA; London, UK; Sydney, Australia; Toronto, ON, Canada, 1977.
  45. Haken, H. Synergetics: An Introduction: Nonequilibrium Phase Transitions and Self-Organization in Physics, Chemistry, and Biology; Springer: Berlin/Heidelberg, Germany; New York, NY, USA, 1978.
  46. Vakulenko, S.; Volpert, V. Generalized travelling waves for perturbed monotone reaction-diffusion systems. Nonlinear Anal. 2001, 46, 757–776.
  47. Brenner, S. Life’s code script. Nature 2012, 482, 461.
  48. Maroto, M.; Bone, R.A.; Dale, J. Somitogenesis. Development 2012, 139, 2453–2456.
  49. Resende, T.; Ferreira, M.; Teillet, M.A.; Tavares, A.T.; Andrade, R.P.; Palmeirim, I. Sonic hedgehog in temporal control of somite formation. Proc. Natl. Acad. Sci. USA 2010, 107, 12907–12912.
  50. Pourquié, O. The chick embryo: A leading model in somitogenesis studies. Mech. Dev. 2004, 121, 1069–1079.
  51. Heltberg, M.L.; Krishna, S.; Jensen, M. On chaotic dynamics in transcription factors and the associated effects in differential gene regulation. Nat. Commun. 2019, 10, 71.
  52. Wolpert, L.; Tickle, C.; Jessell, T. Principles of Development; Oxford University Press: Oxford, UK, 2002.
  53. Reinitz, J.; Vakulenko, S.; Sudakow, I.; Grigoriev, D. Robust morphogenesis by chaotic dynamics. Sci. Rep. 2023, 13, 7482.
  54. Maroudas-Sacks, Y.; Keren, K. Mechanical Patterning in Animal Morphogenesis. Annu. Rev. Cell Dev. Biol. 2021, 37, 469–493.
  55. Sudakow, I.; Vakulenko, S.; Grigoriev, D. Excitable media store and transfer complicated information via topological defect motion. Commun. Nonlinear Sci. Numer. Simul. 2023, 116, 106844.
  56. Vakulenko, S.; Grigoriev, D. Complexity of gene circuits, Pfaffian functions and the morphogenesis problem. Comptes Rendus Math. 2003, 337, 721–724.
  57. Vakulenko, S.; Radulescu, O.; Reinitz, J. Size Regulation in the Segmentation of Drosophila: Interacting Interfaces between Localized Domains of Gene Expression Ensure Robust Spatial Patterning. Phys. Rev. Lett. 2009, 103, 168102–168106.
  58. Iranzo, J.; Martincorena, I.; Koonin, E. Cancer-mutation network and the number and specificity of driver mutations. Proc. Natl. Acad. Sci. USA 2018, 115, E6010–E6019.
  59. Long, H.; Miller, S.F.; Strauss, C.; Zhao, C.; Cheng, L.; Ye, Z.; Griffin, K.; Te, R.; Lee, H.; Chen, C.; et al. Antibiotic treatment enhances the genome-wide mutation rate of target cells. Proc. Natl. Acad. Sci. USA 2016, 113.
  60. Milholland, B.; Dong, X.; Zhang, L.; Hao, X.; Suh, Y.; Vijg, J. Differences between germline and somatic mutation rates in humans and mice. Nat. Commun. 2017, 8, 15183.
  61. Kimura, M. The Neutral Theory of Molecular Evolution; Cambridge University Press: Cambridge, UK, 1983.
  62. Keizer, J. Statistical Thermodynamics of Nonequilibrium Processes; Springer: New York, NY, USA, 1987.
Figure 1. Probability $P_{sur}$ that cancer does not emerge in an organism as a function of $N_{cell}$ for different numbers $n_{dr}$ of driver mutations. The probability is computed by relations (53)–(55). The number of cells $N_{cell}$ ranges over the interval $[1, 4M_0]$, where $M_0 = 10^9$ (in a typical mouse, the number of cells is $3 \cdot M_0$). For $n_{dr} = 2, 3, 4$, the parameter $a_0$ takes the values $6.7 \cdot 10^{-4}$, $0.016$, $0.076$, respectively. The number of cell divisions is $T_d = 100$. The parameters are $p_0 = 0.1 a_0$, $\mu = 1/3$, and $c_t = 1$. The effect depends dramatically on the parameter $n_{dr}$: the mass increase is profitable if $n_{dr}$ is not too small.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Vakulenko, S. Transition to Multicellularity and Peto Paradox. Mathematics 2023, 11, 5003. https://doi.org/10.3390/math11245003
