**Fractional Calculus and the Future of Science**

Editor

**Bruce J. West**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Bruce J. West, Center for Nonlinear Science, University of North Texas, Denton, TX, USA

*Editorial Office* MDPI, St. Alban-Anlage 66, 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Entropy* (ISSN 1099-4300) (available at: https://www.mdpi.com/journal/entropy/special_issues/fract_future).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-2826-7 (Hbk) ISBN 978-3-0365-2827-4 (PDF)**

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**



## **About the Editor**

**Bruce J. West** Prof. Dr. Bruce J. West was Chief Scientist, Mathematics, US Army Research Office, 1999–2021 (retired). He has worked on the development of the fractional calculus for the modeling of complex phenomena; his book *Physics of Fractal Operators* (with Bologna and Grigolini, Springer, 2003) received the Army Research Laboratory Award for Publication in 2003, and he subsequently wrote *Fractional Calculus View of Complexity: Tomorrow's Science* (CRC Press, 2016) and *Nature's Patterns and the Fractional Calculus* (Walter de Gruyter, 2017). Among the many awards he received are the Meritorious Senior Professional Presidential Rank Award (2012) and the Distinguished Senior Professional Presidential Rank Award (2018). Before coming to ARO, Dr. West was Professor of Physics, University of North Texas, 1989–1999, and Chair of the Department of Physics, 1989–1993. During his time at the university, he did research into the quantum manifestations of chaos (energy-level repulsion, ionization-rate enhancement, breakdown of the Correspondence Principle); the foundations of statistical mechanics (obtaining random fluctuations without statistics, failure of the Green–Kubo relation, Lévy statistics); and nonlinear processing techniques applied to biomedical phenomena. He was elected a Fellow of the American Physical Society in 1992; he received the Decker Scholar Award in 1993 and the UNT President's Award for research in 1994. Prior to joining the university, Dr. West was Director, Division of Applied Nonlinear Science, La Jolla Institute, 1983–1989. During this period, he worked on the development of nonlinear dynamical models of biomedical phenomena, physical oceanography, and the statistical mechanical foundations of thermodynamics. Specifically, he helped develop ways to use renormalization group concepts to extract pattern information from biomedical time series.
This latter research eventually led to the development of a new discipline, Fractal Physiology (with Bassingthwaighte and Liebovitch, Oxford University Press, 1994), explaining how fractals have revolutionized the way we think about dynamics in the human body. The prequel to this book was his *Fractal Physiology and Chaos in Medicine* (World Scientific, 1990). Dr. West was Associate Director, Center for the Studies of Nonlinear Dynamics, La Jolla Institute, 1979–1983. He applied some of the newly emerging concepts of nonlinear dynamical systems theory to nonlinear water-wave fields and turbulence. He also explained how the branching structure of the lung and other such physiological structures could be described by scaling; this led to a totally new fractal model for the architecture of the mammalian lung.

## *Editorial* **Fractional Calculus and the Future of Science**

**Bruce J. West**

Center for Nonlinear Science, University of North Texas, Denton, TX 76203, USA; brucejwest213@gmail.com

The invitation to contribute to this anthology of articles on the fractional calculus (FC) encouraged submissions in which the authors look behind the mathematics and examine what must be true about the phenomenon to justify the replacement of an integer-order derivative with a non-integer-order (fractional) derivative (FD) before discussing ways to solve the new equations. The desired articles are intended to provide the reader with a window into the future of specific science disciplines by peering through the lens of the FC and suggesting how what is seen entails a difference in thinking about that area of science. Thus, a perfect submission would be more about the implications and utility of the FC than about its formal structure, whether in chemistry, epidemiology, sociology, psychology, physics, or any of the other scientific disciplines. We sought imaginative articles that implement the FC in new and interesting ways that reveal its transformational nature, including, but not limited to, such things as: how a fractional derivative in time incorporates memory into the solution of the dynamic description of an earthquake, a brain quake, or a crash in the stock market; or how a fractional derivative in space incorporates spatial non-locality into the solution of the complex dynamical descriptions of a riot, the collective intelligence of social groups, or the neuronal activity of the brain. Finally, we are interested in how the combined fractional derivatives in time and space of functional measures of uncertainty incorporate both memory and non-locality into the phase space solution to capture the limited uncertainty of an ensemble of fractal trajectories, or the scaling behavior of complex dynamical networks.

As West points out [1], Sir Isaac Newton transformed *Natural Philosophy* into today's *Science* by focusing on the fundamental nature of motion, and he did so by using geometry in a way that resonated with the scientific community of his day. What Newton accomplished was to reveal what was entailed by fluxions (the differential calculus) without explicitly referencing them, and in so doing he convinced generations of scientists of the value of analyzing how physical phenomena change in space and over time. Whether by conscious plan or by serendipity, Newton cleared a path for less talented investigators to follow and contribute to the nascent discipline of mechanics, and thereby determined the direction of quantitative reasoning in physics for the next three centuries.

This was the starting point for a discussion of how science, through its intermittent turning of its investigative tools on itself, has reached another epoch of transition. However, unlike the paradigm shifts within a scientific discipline introduced by Thomas Kuhn in 1962, the present shift addresses the whole of science. West argues that the dominance of Newton's world view is drawing to a close, and he weaves the threads of chaos theory, fractals, non-ergodic behavior of dynamic systems, and fractional kinetic theory into a fractal tapestry of the physical, social, and life sciences. It is the complexity of this tapestry that is shown to be fundamentally incompatible with Newton's notions of space and time.

Angstmann and Henry (AH) [2] address the discipline of chemistry, specifically the equations governing the time evolution of the population densities of chemical species, taking into account their spatial movement and interactions using reaction–diffusion equations. They discuss the failure of classical techniques to properly describe diffusion and review fractional subdiffusion resulting from particles being trapped for arbitrarily long times, which are modeled using the continuous time random walk (CTRW) model. This fractional subdiffusion is characterized by the mean square displacement of a chemical population

**Citation:** West, B.J. Fractional Calculus and the Future of Science. *Entropy* **2021**, *23*, 1566. https:// doi.org/10.3390/e23121566

Received: 15 November 2021 Accepted: 19 November 2021 Published: 25 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

spreading as a sublinear power law in time and is described using a Caputo fractional derivative in time in a reactive–subdiffusion equation.
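The sublinear scaling and the Caputo operator that AH employ can be written schematically as follows (a standard textbook form; the generalized diffusion coefficient $K_\alpha$ and the prefactor are generic notation, not taken from [2]):

```latex
% Subdiffusive mean square displacement, 0 < \alpha < 1:
\langle x^{2}(t)\rangle \sim \frac{2K_{\alpha}}{\Gamma(1+\alpha)}\, t^{\alpha},
\qquad 0 < \alpha < 1,
% with the Caputo fractional derivative in time defined by
{}^{C}D_{t}^{\alpha}f(t)
  = \frac{1}{\Gamma(1-\alpha)}\int_{0}^{t}\frac{f'(\tau)}{(t-\tau)^{\alpha}}\,d\tau .
```

At $\alpha = 1$ the mean square displacement recovers the linear-in-time growth of ordinary diffusion.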

AH emphasize the main lessons learned from their analysis: (i) the governing equations differ depending on whether newborn particles inherit the waiting times of their parents; (ii) birth and death terms must be treated differently; (iii) in the case where particles are removed, but not instantaneously at the start of the waiting times between jumps, the reaction and subdiffusive terms are not additive. They go on to explore the analytic solution of several exemplars, including a mass-conserving tempered time-fractional diffusion equation that is subdiffusive at short times but manifests standard diffusion at long times.

Machado [3] chose to explore the fractal nature of financial time series using the Dow Jones industrial average (DJIA) but avoided using standard time series analysis. Instead, he uses multidimensional scaling (MDS) together with the concepts of distance, entropy, fractal dimension, and the FC. Introducing ten distinct definitions of distance, most of which I had not previously encountered, he was able to generalize MDS as an extension of the traditional metric formulation to construct a smooth non-Euclidean space. In this space, the fractal dimension and entropy measures for analyzing the three-dimensional portraits produced by the generalized MDS are interpreted.

Several known relations between the fractal dimension, using the box-counting definition, and the idea that a random variable's entropy measures the average "information" content of the corresponding PDF, are used to define an entropy of non-integer order α. This parameter α gives an extra degree of freedom with which to adapt the sensitivity of the entropy calculation to each specific dataset. Machado points out that time is viewed as a continuous and linear flow, so that any perturbation is automatically assigned to the variable under analysis. Stated differently, since people are entities immersed in the time flow, we are apparently incapable of distinguishing between perturbations in time and perturbations in the measured variable. Machado's analysis explores an alternative strategy to time series analysis for reading the relationship between the variables.
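As a concrete illustration of an entropy of non-integer order α, the sketch below uses Ubriaco's fractional entropy, $S_\alpha = \sum_i p_i(-\ln p_i)^\alpha$, one common definition that recovers the Shannon entropy at $\alpha = 1$; Machado's specific construction may differ, so treat this as a hedged example rather than his formula:

```python
import math

def fractional_entropy(p, alpha=1.0):
    """Entropy of non-integer order alpha (Ubriaco's definition, one
    common choice): S_alpha = sum_i p_i * (-ln p_i)**alpha.
    Reduces to the Shannon entropy when alpha == 1."""
    return sum(pi * (-math.log(pi)) ** alpha for pi in p if pi > 0)

# The order alpha tunes the sensitivity of the entropy to the dataset:
uniform = [0.25] * 4          # maximally uncertain over 4 states
peaked = [0.85, 0.05, 0.05, 0.05]  # concentrated distribution
```

Raising α re-weights the contribution of rare events (large $-\ln p_i$), which is the extra degree of freedom referred to above.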

In the reaction–diffusion equation discussed by AH, the chemical reaction term was modeled using the Verhulst equation for population growth. In 1798, Malthus argued that the integer-order rate equation produced exponential population growth, and 40 years later Verhulst replaced the constant growth rate of Malthus with one that decreased linearly with the growing population, to mitigate the dire predictions of Malthus. The Verhulst (logistic) equation has the advantage of being one of the few nonlinear equations with an analytic solution, depicting sigmoidal growth to a finite maximum population. Izadi and Srivastava (IS) [4] recount this bit of history and note that the integer-order derivative has been replaced by a fractional derivative in applications in numerous disciplines of science and engineering. They go on to point out that for most fractional differential equations (FDEs) no exact analytic solution is known to date.
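For reference, the Verhulst (logistic) equation and its sigmoidal analytic solution can be written in the standard notation ($r$ the growth rate, $K$ the carrying capacity, $P_0$ the initial population):

```latex
\frac{dP}{dt} = rP\Bigl(1 - \frac{P}{K}\Bigr),
\qquad
P(t) = \frac{K P_{0}\, e^{rt}}{K + P_{0}\bigl(e^{rt} - 1\bigr)},
```

so that $P(0) = P_0$ and $P(t) \to K$ as $t \to \infty$. Replacing $dP/dt$ with a fractional derivative destroys this closed-form solution, which is why numerical schemes such as that of IS are needed.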

IS provide a brief review of the analytic and numerical methods that have been developed and applied to FDEs based upon various loosely related models of real-world problems. Given the popularity of the fractional logistic equation (FLE) in the modeling of such phenomena as the growth of tumors in medicine, IS use it as a prototype on which to apply the local discontinuous Galerkin (LDG) discretization approach for numerically solving the FLE. Given that the FLE has a second-order nonlinearity, IS rewrite it as two linear first-order FDEs to apply the LDG scheme. Consequently, LDG is employed to discretize the resulting system, as well as the fractional operator. The mathematical details of the technique are presented, along with comparisons with alternative numerical and approximation approaches.
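The LDG details are in [4]; purely to illustrate what a numerical treatment of an FLE involves, here is a much simpler explicit Grünwald–Letnikov sketch for $D^\alpha y = r\,y(1-y)$ (an illustrative scheme under generic assumptions, not the authors' LDG method), which reduces to the forward Euler scheme for the ordinary logistic equation when $\alpha = 1$:

```python
def gl_weights(alpha, n):
    """Gruenwald-Letnikov weights w_j = (-1)^j * C(alpha, j), via the
    recursion w_0 = 1, w_j = w_{j-1} * (1 - (alpha + 1)/j)."""
    w = [1.0]
    for j in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))
    return w

def fractional_logistic(alpha, r, y0, h, steps):
    """Explicit scheme for the (Caputo-type) fractional logistic equation
    D^alpha y = r y (1 - y) on a uniform grid of spacing h.
    Illustrative only -- not the LDG discretization used by IS."""
    w = gl_weights(alpha, steps)
    y = [y0]
    h_a = h ** alpha
    for n in range(1, steps + 1):
        # Memory term: the whole history enters every step.
        memory = sum(w[j] * (y[n - j] - y0) for j in range(1, n + 1))
        y.append(y0 + h_a * r * y[-1] * (1.0 - y[-1]) - memory)
    return y
```

For $\alpha = 1$ the weights collapse to $w_1 = -1$, $w_{j\ge2} = 0$, and the update is exactly $y_n = y_{n-1} + h\,r\,y_{n-1}(1-y_{n-1})$; for $\alpha < 1$ every past value contributes, which is how the fractional derivative builds memory into the dynamics.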

The transport of particles through continuous media is described by transport equations often based on general principles, such as conservation of energy and momentum. As pointed out by Masoliver [5], in general these transport equations are unsolvable nonlinear integrodifferential equations. He elected to discuss transport processes using the

telegrapher's equation (TE) and its generalization to the fractional case, emphasizing that random walk (RW) models are fundamental in describing TEs because they try to reproduce the microscopic mechanisms of transport. In this way he accounts for "diffusion with finite velocity".

The integer-order TE (IOTE) behaves at early times like a wave front and at late times like ordinary diffusion; it is generalized to a fractional-order TE (FOTE) for transport through highly disordered media, for instance, random media or fractal structures. Masoliver shows how this is done using fractional RWs, resulting in a three-dimensional FOTE that is fractional in both space and time. The two different dynamics governing transport are fractional wave behavior at early times and fractional diffusion behavior at late times. Masoliver also presents analytic solutions for various combinations of IO-time, FO-space, FO-time, and IO-space derivatives in one, two, and three dimensions.
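In a standard notation (the relaxation time $\tau$ and diffusion coefficient $D$ here are generic symbols, not necessarily Masoliver's), the IOTE and one schematic fractional generalization read:

```latex
% Integer-order telegrapher's equation: wave-like at early times,
% diffusive at late times, with finite signal speed v = \sqrt{D/\tau}:
\tau \frac{\partial^{2} p}{\partial t^{2}} + \frac{\partial p}{\partial t}
  = D \frac{\partial^{2} p}{\partial x^{2}} .
% A schematic FOTE replaces the time derivatives by Caputo derivatives
% of orders 2\alpha and \alpha, and the Laplacian by a Riesz fractional
% derivative of order \beta:
\tau^{\alpha}\, {}^{C}D_{t}^{2\alpha} p + {}^{C}D_{t}^{\alpha} p
  = D_{\alpha,\beta}\, \frac{\partial^{\beta} p}{\partial |x|^{\beta}} .
```

The finite signal speed is what distinguishes the TE from the ordinary diffusion equation, whose solutions propagate instantaneously.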

There are some investigators whose papers I always anticipate reading because I know that, in addition to gaining technical knowledge, the author will put their work into a context that I would have missed left to my own resources. Professor Mainardi [6] falls into that category, and his review paper falls into another category as well, that being "Everything you wanted to know about \_?\_ but were afraid to ask", where here the blank is filled in with the Mittag–Leffler function (MLF). Just as the exponential function is the workhorse of linear IO differential equations, the MLF is the workhorse of linear FO differential equations, and the latter reduces to the former when the FO index goes to unity. His paper is written with the skill and insight that only one who has worked with the leaders in the field, and has himself made lasting contributions to our understanding, could manage.
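A minimal numerical sketch of the one-parameter MLF, $E_\alpha(z) = \sum_{k\ge0} z^k/\Gamma(\alpha k + 1)$, makes the reduction to the exponential concrete (naive truncated summation, adequate for moderate $|z|$, not a production algorithm):

```python
import math

def mittag_leffler(alpha, z, terms=50):
    """Truncated series for the one-parameter Mittag-Leffler function
    E_alpha(z) = sum_k z**k / Gamma(alpha*k + 1).
    Reduces to exp(z) when alpha == 1, and E_2(-x**2) == cos(x)."""
    return sum(z ** k / math.gamma(alpha * k + 1) for k in range(terms))
```

For $0 < \alpha < 1$ and $z = -t^\alpha$, this function interpolates between a stretched exponential at short times and an inverse power law at long times, which is why it appears so often in fractional relaxation problems.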

His survey interleaves and draws connections among stochastic processes, such as the fractional Poisson process and the thinning of renewal processes, using the MLF. The CTRW is used to generalize the classical Kolmogorov–Feller equation (KFE) to the fractional KFE (FKFE), in which the waiting-time PDF is an MLF. The fractional diffusion-wave equation has boundary value problem solutions expressed in terms of Wright functions, which are inverse Laplace transforms of two-parameter MLFs and serve as Green's functions in the solutions of specific boundary value problems.

Big Data (BD) and Machine Learning (ML) are two of the more visible areas of research in which investigators are working to span the gap separating the understanding based on modeling in the social and life sciences from the more quantitative models of physics and engineering. Niu et al. (NCW) [7] maintain that the future success of these research activities is tied to the successful application of the FC and fractional-order thinking (FOT) to the understanding of complex systems, to improving the processing and control of those systems, and even to enabling creativity itself. The heart of the matter is that BD and ML seek to characterize complexity, and of the ten characteristics used to describe BD, NCW select *variability* as the most important.

The complexity observed in most BD is almost invariably manifest through inverse power laws (IPLs) in the processed data. The heavy-tailed nature of multiple PDFs is discussed, along with its connection to the FC through diffusion equations that are fractional in space, in time, or both. A variety of fractional discrete data processing techniques are sketched out to model the variability of BD, along with a discussion of the CTRW. The key to the learning process is the optimization method, and NCW inquire into how to use the FC to improve on existing methods of optimization in ML.

A Skellam distribution is generated by taking the difference between two independent Poisson random variables, which results in an integer-valued Lévy process. Gupta et al. (GKL) [8] discuss a time-fractional Skellam process in which the inter-arrival times between positive and negative jumps follow an MLF distribution rather than an exponential distribution; this formulation has been applied to financial and competitive-games datasets. GKL show how the formalism is extended to a Skellam process of order *k*. A Skellam process is used to model the difference between the number of goals scored by two teams in a soccer match. Similarly, a Skellam process of order *k* can be used to model the difference in the number of points scored by two competing teams in a basketball match, where *k* = 3, meaning there are three distinct ways to score points. Elsewhere, the authors show that a fractional Skellam process is better at modeling a tick-by-tick financial dataset than the Skellam process, or equivalently that the MLF is superior to the exponential in describing the inter-arrival times between successive ticks.
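For concreteness, the classical (non-fractional) Skellam variable is just the difference of two independent Poisson counts. A small self-contained sampler follows (using Knuth's Poisson algorithm; the rates 3.0 and 1.0 are illustrative values, not taken from [8]):

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's Poisson sampler; fine for small rates lam > 0."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def sample_skellam(lam1, lam2, rng):
    """Skellam variate: difference of two independent Poisson variates,
    with mean lam1 - lam2 and variance lam1 + lam2.  The fractional
    Skellam process of GKL further replaces the exponential inter-arrival
    times with Mittag-Leffler distributed ones (not modeled here)."""
    return sample_poisson(lam1, rng) - sample_poisson(lam2, rng)

rng = random.Random(42)
samples = [sample_skellam(3.0, 1.0, rng) for _ in range(20000)]
```

The sample mean should be near $\lambda_1 - \lambda_2 = 2$ and the sample variance near $\lambda_1 + \lambda_2 = 4$.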

The FC has the potential to improve the performance of control systems, as demonstrated by Zheng et al. (ZLCW) [9]. They argue that the improvement is due to the greater flexibility in the modeling of systems and in the controller design methodology using fractional-order proportional-integral-derivative (FOPID) controllers, a generalization of the classical PID controller. ZLCW point out that although the FOPID controller provides better performance, it is also more difficult to implement. Consequently, rather than presenting a general theory, they present a case study to demonstrate the advantages of the proposed method. The classical frequency-domain method is the analytic design method for the FOPID controller used in the case study.
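The FOPID controller (often written PI$^\lambda$D$^\mu$) has the standard transfer function

```latex
C(s) = K_{p} + K_{i}\, s^{-\lambda} + K_{d}\, s^{\mu},
\qquad \lambda,\ \mu > 0,
```

which recovers the classical PID controller when $\lambda = \mu = 1$. The two extra tuning knobs $\lambda$ and $\mu$ are the source of both the added design flexibility and the implementation difficulty that ZLCW describe.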

Digital watermarking is a form of embedding a signal (a watermark) within another signal, known as the cover, which might be digital media such as an image, audio, or video; it has become popular as a copyright enforcement tool in the last few decades. Gonzalez-Lee et al. (GVMNPL) [10] explore the advantages of an FC watermarking system for detecting Gaussian watermarks. They briefly critique the multiple FC strategies that have been adopted to replace the linear additive rule more commonly used to watermark a signal. These include fractional derivatives, since there is a relationship between the order of the derivative and the resulting function; fractional Fourier transforms (random, continuous, and discrete), since there is a strong dependency between the orders and the resulting coefficient set; as well as fractional wavelet transforms.

GVMNPL emphasize that all the FC-based techniques discussed have the same starting point, and that the overall difference among them is the choice of transform coefficient set used for watermarking. In a previous work, the authors investigated the case of Gaussian watermarks, and their results suggested that the FC reduces the false positive percentage (FPP). In that earlier work, however, testing was limited, and a deeper study of the fractional scheme for detecting Gaussian watermarks was called for. The present work accomplishes this task and confirms that the fractional scheme is reliable for the unambiguous (error-free) detection of Gaussian watermarks.

Song and Karniadakis (SK) [11] open their contribution to the anthology with the assertion that the modeling of wall-bounded turbulent flows is presently an unsolved problem in classical physics. They propose a fundamentally new approach for modeling the entire average velocity profile from the wall to the centerline of the pipe based on the FC. They were surprised to find that representing the Reynolds stresses with a non-local variable-order fractional derivative that decays with distance from the wall results in a universal form for all Reynolds numbers for channel flow, pipe flow, and Couette flow.

A remarkable feature of this paper is the exhaustive numerical testing of the new theoretical results against existing datasets from direct numerical simulation of the equations, as well as from experimental measurements. Taken together, these results support the hypothesis that the rate of turbulent diffusion changes continuously with distance from the wall and that the strong non-locality of turbulent interactions intensifies away from the wall.

The final paper in this anthology presents an overview of the rapidly expanding area of Distributed-Order Fractional Calculus (DOFC) by Ding et al. (DPSS) [12]. DOFC generalizes the intrinsic multiscale nature of constant-order and variable-order fractional operators, which provides new ways to think about and model systems whose behavior emerges from the complex interplay and superposition of non-local and memory effects across a multitude of scales. They discuss the various ways the fractional order in space and/or time can be distributed and review the multiple ways these equations can be numerically integrated. The areas of application on which they focus are engineering and the physical sciences, with applications to viscoelasticity, transport processes and control theory taking center stage.

Mechanisms such as multiple relaxation times in viscoelasticity, multiple temporal and spatial effects in transport processes, and mixtures of time delays in control theory have all illustrated the significance of DOFC over more traditional integer-order methods. This review provides a glimpse into the various ways the DOFC has established its utility in the modeling of previously unsolved or partially solved complex problems. Hopefully, the attentive reader will see a way in which the DOFC may provide insight into a problem they have put on the backburner because they could not see a way forward.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


## *Article* **Sir Isaac Newton Stranger in a Strange Land**

#### **Bruce J. West**

Office of the Director Army Research, Research Triangle Park, NC 27709, USA; brucejwest213@gmail.com Received: 27 September 2020; Accepted: 19 October 2020; Published: 25 October 2020

**Abstract:** The theme of this essay is that the time of dominance of Newton's world view in science is drawing to a close. The harbinger of its demise was the work of Poincaré on the three-body problem and its culmination in what is now called chaos theory. The signature of chaos is the sensitive dependence on initial conditions, resulting in the unpredictability of single-particle trajectories. Classical determinism has become increasingly rare with the advent of chaos, being replaced by erratic stochastic processes. However, even the probability calculus could not withstand the non-Newtonian assault from the social and life sciences. The ordinary partial differential equations that traditionally determined the evolution of probability density functions (PDFs) in phase space are replaced with their fractional counterparts. The allometry relation is shown to result from a system's complexity using exact solutions for the PDF of the Fractional Kinetic Theory (FKT). Complexity theory is shown to be incompatible with Newton's unquestioning reliance on an absolute space and time upon which he built his discrete calculus.

**Keywords:** complexity; chaos; fractional calculus; subordination

#### **1. Introduction**

Three centuries ago, Newton transformed Natural Philosophy into today's Science by focusing on change and mathematical quantification, and he did so in a way that resonated with the scientific community of his day. His arguments appeared to be geometric in character, and nowhere in the *Principia* do you find explicit reference to fluxions, or to differentials. What Newton did was reveal the entailments of the calculus and convince generations of scientists of the value of focusing on how physical objects change their location in time. A few mathematicians of his generation recognized what he had done, but their number can be counted on one hand, and their comments are primarily of historical interest.

Fast forward to today, where modern science, from Anatomy to Zoology, is seen to have absorbed the transformational effect of Newton's contribution to how we quantitatively and qualitatively understand the world, the fundamental importance of motion. However, it has occurred to a number of the more philosophically attuned contemporary scientists that we are now at another point of transition, where the implications of complexity, memory, and uncertainty have revealed themselves to be barriers to our future understanding of our technological society. The fractional calculus (FC) has emerged from the shadows as a way of taming these three disrupters with a methodology capable of analytically smoothing their singular natures.

If Sir Isaac Newton were reincarnated into the modern world, would he again achieve scientific greatness using his prodigious intellect? Of course we cannot know the answer to this counterfactual, but what we can determine is whether the fundamental assumptions upon which the physical laws of analytic mechanics are based remain valid in today's world of complexity science. Whether or not Newton would remain a stranger in this strange land of today's science is the question we seek to answer in this essay. Not literally, of course, but more to the point, whether the fundamental assumptions on which his mechanics is based can be sufficiently modified to be compatible with the mathematics found necessary to describe today's complex phenomena, without being distorted to the point of being abandoned. Can Newton's view of the world be made compatible with the FC?

The FC moldered in the mathematical backwaters for over 300 years. Since the time of Newton it was mostly ignored by social, physical, and life scientists, intermittently emerging from the shadows of formalism with an application. Historically, the international community of physical scientists saw no need for a new calculus, or, on occasionally seeing the need, thought it not worthy of acknowledgment. The community agreed that the ordinary differential calculus of Newton and Leibniz, along with the analytic functions entailed by solving the equations resulting from Newton's force law, is all that is required to provide a scientific description of the macroscopic physical world.

In his *Mathematical Principles of Natural Philosophy* [1], Newton introduced mathematics into the study of *Natural Philosophy*. He argued the need for quantification of scientific knowledge through the introduction of mathematics in the form of *fluxions*, and thereby changed the historical goal of natural philosophy from that of wisdom to that of knowledge. The term fluxion does not appear anywhere in the *Principles*, but scholars have found numerous geometric arguments that were in all probability based on limits, with differentials no doubt in the back of Newton's mind. The Marquis de l'Hôpital commented that Newton's *magnum opus* was "a book dense with the theory and application of the infinitesimal calculus", an observation also made in modern times by Whiteside [2].

Along with mathematics, Newton also introduced a number of definitions that determined how scientists were to understand his vision of the physical world for the next few hundred years. We do not quote his definitions of such well-known things as inertia and force here, but instead we record the notions of space and time that he believed were the accepted understanding of their meanings as explained in his first scholium (A scholium is a marginal note or explanatory comment made by a scholar), which are [1] as follows.

> I. Absolute, true, and mathematical time, of itself, and from its own nature, flows equably without relation to anything external, and by another name is called duration.
>
> II. Absolute space, in its own nature, without relation to anything external, remains always similar and immovable.
Newton's understanding of these two notions of the absolute is what enabled him to invent fluxions and introduce motion as the basis for his new physics. Of course, the mathematically awkward discrete notation of fluxions was subsequently elbowed out of history by the user-friendly notation of Leibniz, which became known as the differential calculus. The differential calculus enabled subsequent generations of scientists to describe the motion of particles in terms of continuous single-particle trajectories in space and time. The differential calculus fills literally thousands of mathematics and physics textbooks, all of which assume that I and II codify the real world and are taught to eager students and novice scientists throughout the world. Herein, we argue for a mathematics that provides a logical framework for understanding the more complex world of the Information Age, in which I and II must be applied with extreme caution, if at all.

The increase in sensitivity of diagnostic tools, advances in data processing techniques, and expanding computational capabilities have all contributed to the broadening of science in ways that have brought many phenomena from borderline interest to center stage. These curious complex processes are now cataloged under the heading of non-integer scaling phenomena. An understanding of the fundamental dynamics underlying such scaling requires a new mathematical perspective, such as that obtained using the dynamics described by non-integer (fractional) operators and such descriptions ushered in the sunset for much of what remains of Newton's world view.

Much of what is written in this Introduction will be familiar to those with a background in physics, even if the organization of the material is not. However, the reasons why classical physics fails to explain a given complex phenomenon remain a mystery to those without such a background, as well as to many who have one. Therefore, we express the purpose of this paper in the form of a hypothesis and present arguments in support of the Complexity Hypothesis (CH):

Complex phenomena entailing description by chaos theory, fractional Kinetic Theory, or the fractional calculus in general, are incompatible with the ordinary calculus and consequently are incompatible with Newtonian Physics.

#### *1.1. The Demise of Newton's World View?*

The evidence is all around us that the domain of application of Newton's view of the physical world is contracting dramatically. His view was reluctantly contracted with the introduction of quantum mechanics along with relativity over a century ago. However, physicists took consolation in the fact that the dynamic predictions of the very fast, the very large, and the very small all reduce to those of Newton in the appropriate limits. For special relativity, the dramatic changes in time occur as the speed of light is approached [3]; for general relativity, space curves in the neighborhood of a large mass [4]; and for quantum phenomena, the correspondence principle associated with the size of Planck's constant ensures that the quantized nature of energy is lost at large quantum numbers and energy is continuous on the scale at which we live our lives [5]. However, the more recent constrictions produced by chaotic dynamics are different; so much so that once made, there is no limit in which the view of Newton can reemerge. This requires more explanation, as the inappropriate application of the differential calculus to describe the dynamics of strongly nonlinear phenomena often yields misleading results. In the author's view, one such misinterpretation arose in support of the political interpretation of climate change.

It should be evident that the rubric *climate change* provides an example of such a misapplication of the nonlinear hydrodynamic partial differential equations that purport to describe the internal motion of the earth's atmosphere, involving the multiple interactions with the earth's temperature field, solar radiation, cloud cover, and all the rest. Climate change is not just a problem in Newtonian physics, because if it were we would have the answer to the problem in hand, which some few scientists believe we do. I say this with full appreciation for the criticism such a statement will draw, from both the believers in climate change and the sceptics. Let me be absolutely clear in stating that I believe in climate change, but belief is the wrong word: climate change is a scientific fact, not a matter of faith or belief. What I am skeptical about concerns the quasi-scientific arguments used in the political arena that assign causality of that change to human activity, followed by the assertion that climate change can be significantly influenced by political action.

I came to this conclusion, not through a "eureka" moment, or flash of insight, but through the weight of evidence drawn from my own scientific research. I even coauthored a book about it [6] with a colleague who was then a post-doctoral researcher of mine. Our book addressed climate change as a problem in physics and was greeted with a yawn from the scientific community. It was the last scientific contribution I made to that debate, and the science has not moved significantly since its publication. My epiphany was that those who successfully communicate technically difficult ideas tell a story. Thus, I have decided to populate this essay with a sequence of technically based stories, each one lending additional support to the CH. The first story concerns chaos theory and some of what that entails.

#### *1.2. Chaos Theory*

The chaos story begins in the late nineteenth century with Oscar II, the King of Sweden and Norway, and his concern over how long the Earth would survive. More pointedly, he wondered whether the solar system was stable. Could one expect the moon to spiral out of its orbit and crash into the Earth? Would the Earth break from its timeless trajectory and collide with the Sun? Let me stop here and say this is the beginning of the somewhat romanticized historical account of how chaos came into being that I learned when I was first introduced to the "three-body problem" as a freshly minted physics PhD in 1970. The actual historical account is a bit more banal, but not much.

Oscar II had done well in mathematics while a university student and had grown into an active patron of the subject [7], so his sponsorship of a prize in mathematics, unrelated to any particular institution, was not surprising. Mittag-Leffler, who was then the editor of the Swedish journal *Acta*, made the original announcement of the King's mathematics competition in the science magazine *Nature*. In that announcement Mittag-Leffler listed four categories to which international scientists could submit contributions. The category concerning the stability of the solar system was written in the following arcane way [7]:

(1) A system being given of a number whatever of particles attracting one another mutually according to Newton's law, it is proposed, on the assumption that there never takes place an impact of two particles to expand the coordinates of each particle in a series proceeding according to some known functions of time and converging uniformly for any space of time.

The committee that evaluated the submissions to the competition consisted of, along with Mittag-Leffler, two other giants of nineteenth century mathematics, Hermite and Weierstrass. To avoid any possibility of bias the entrants and their submissions remained anonymous until the winner was selected, at which time the name was to be published in *Acta*. Out of a field of 12 entrants, the committee selected Henri Poincaré, who had responded to question (1). He extended the analysis of the solvable two-body problem by adding one additional body, much less massive than the other two. Poincaré proved that the solution to Newton's dynamic equations for his restricted three-body problem could not have a simple analytic form. His published proof entailed the invention of new mathematics, the implications of which have kept the best mathematicians in the world actively engaged for over a century.

In reviewing the prize-winning memoir for publication in *Acta*, a referee pointed out an error in the manuscript. Part of the drama associated with publishing the final version of the paper concerned the secrecy surrounding that error. Correcting this error entailed a major rewrite, which took Poincaré nearly a year to complete. In composing the revision, he conceived of and implemented in the manuscript the idea of a homoclinic point [7], which is the basis of our understanding of what today goes by the popular name of chaos theory. In short, he introduced the *Three-Body Problem* to the scientific community as being of fundamental importance and proved that the elliptic orbits of the two-body problem were replaced by orbits in the restricted three-body problem that resembled nothing so much as a plate of spaghetti. A single strand of entangled spaghetti was the convoluted trajectory of the third body, and the asymptotic position of the body along that trajectory at any time was unpredictable. Today we call such trajectories fractals [8].

Sir James Lighthill, on the three-hundred-year anniversary of the communication of Newton's *Principia* to the Royal Society, and while he was president of the *International Union of Theoretical and Applied Mechanics*, published the paper *The recently recognized failure of predictability in Newtonian dynamics* [9]. In this paper, Lighthill traces the history of mechanics from Tycho Brahe collecting astronomical data as a court astronomer, through Poincaré's proof of the limited predictability horizon of Newton's law of the dynamics of mechanical systems. To put this in a proper perspective let us use Lighthill's words:

We are all deeply conscious today that the enthusiasm of our forebears for the marvelous achievements of Newtonian mechanics led them to make generalizations in this area of predictability which, indeed, we may have generally tended to believe before 1960, but which we now recognize were false. We collectively wish to apologize for having misled the general educated public by spreading ideas about determinism of systems satisfying Newton's laws of motion that, after 1960, were to be proved incorrect. . .

This reluctant indictment of the Newtonian system of nonlinear partial differential equations that describe how the radiation from the sun is absorbed by the earth's atmosphere and redistributed around the globe has, to the best of my knowledge, never been explicitly refuted. This is not unexpected, as Sir James was the scientific leader in the area of applied mathematics involving those same equations for over thirty years. If the unpredictability of coupled systems of nonlinear differential equations were expressed as a theorem, then one could draw a corollary regarding the nature of the computer simulations based on those same equations. The reader is free to infer from these remarks whether Newton's view is truly dead or whether it is just confined to an ever-decreasing domain of analytic application.

What we can conclude with certainty is that Newton's force law typically breaks down when the system being analyzed is nonlinear and the equations of motion are nonlinear. Such equations typically do not have analytic solutions; their solutions are generically chaotic [10,11]. As scientists, this loss of predictability, which is the foundation of the physical sciences, ought to be our greatest concern; or at least the mathematical foundation of all our physical models, the differential calculus, ought to be the focus of our concern.

It is worth mentioning that in his philosophical writings Poincaré recognized that his mathematical analysis entailed the loss of predictability and the existence of a new kind of chance [12]:

A very slight cause, which escapes us, determines a considerable effect which we can not help seeing, and then we say this effect is due to chance. If we could know exactly the laws of nature and the situation of the universe at the initial instant, we should be able to predict exactly the situation of this same universe at a subsequent instant. But even when the natural laws should have no further secret for us, we could know the initial situation only *approximately.* If that permits us to foresee the subsequent situation *with the same degree of approximation,* this is all we require, we say the phenomenon has been predicted, that it is ruled by laws. But this is not always the case: it may happen that slight differences in the initial conditions produce very great differences in the final phenomena: a slight error in the former would make an enormous error in the latter. Prediction becomes impossible and we have the fortuitous phenomenon.

For over a century, some of the world's leading mathematicians have been working on what might be a proper replacement for, or extension of, Newton's physics. They typically begin with the notion that a conservative nonlinear dynamical system with three or more degrees-of-freedom is chaotic [13], which means that its dependence on initial conditions is so sensitive that an infinitesimal change in the initial state will produce a trajectory that exponentially diverges from the trajectory predicted by the original state. Such an exponential separation of trajectories means that the perturbed state is unstable in the sense that its asymptotic location cannot be predicted from the initial state.

The work that Lighthill was alluding to in his remarks quoted earlier was that of the meteorologist Ed Lorenz, whose groundbreaking paper opened the world of fluid dynamics to the importance of chaos [14], and ended dreams of long-term weather forecasting. Those who have considered chaos as a possible obstacle to climate forecasting as well treat it in much the same way that the nineteenth-century physicists Maxwell and Boltzmann treated many-body effects to produce Kinetic Theory. Only now the modern climate physicist examines large-scale computer simulations of the earth's atmosphere as having random fluctuations around the average dynamical behavior of the atmosphere's velocity field and temperature. The established procedure is to carry out a large number of computer simulations, all starting from the "same state", and from them construct an ensemble of atmospheres with which to calculate the average dynamics of the interesting physical quantities.

The general impression in the meteorology community is that such ensemble averages ought to be sufficient to smooth out the influence of chaotic trajectories and thereby provide the appropriate phase space probability density function in the kinetic theory sense. The problem with the approach is that when one actually attempts to average over an ensemble of chaotic trajectories, the integer moments diverge, leaving the coefficients in the kinetic theory of Maxwell and Boltzmann ill-defined. Here again we find a need for a new kind of mathematics, and the fractional calculus comes to the rescue, providing a fractional Kinetic Theory (FKT).

In Section 2, we generalize the traditional phase space partial differential equations for the probability density function (PDF) to the fractional calculus. This is done by averaging over an ensemble of chaotic trajectories and, following the mathematical arguments of Zaslavsky [15], creating an FKT. The solution to a simple fractional diffusion equation is shown to have a generic analytic form.

#### *1.3. Allometry Relations*

Scientists once believed that phenomena whose dynamic description requires non-integer operators, such as fractional derivatives, were interesting curiosities lying outside the mainstream of science. Even such empirical laws as allometry relations (ARs), in which the functionality of a system is related to a non-integer power of the system's size, were thought to have causal explanations using traditional differential dynamic descriptions [16–18]. Perhaps the most famous allometry relation is that between the average metabolic rates of mammals and their average total body masses (TBMs), as depicted by the "mouse-to-elephant" curve in Figure 1. In this figure, the solid curve is a fit to data by a power-law relation of the form

$$
\langle \mathbf{Y} \rangle = a \left\langle \mathbf{X} \right\rangle^b,\tag{1}
$$

which is a straight line on log-log graph paper with slope *b* :

$$
\log \left< Y \right> = \log a + b \log \left< X \right> . \tag{2}
$$

The functionality of the system *Y*, here the average metabolic rate, is denoted by ⟨*Y*⟩, and the size of the system *X*, here the average TBM, is denoted by ⟨*X*⟩. Note that the brackets here denote the empirical averaging process.

Historically such ARs were explained using biophysical arguments; for example, Sarrus and Rameaux [17,18] used simple geometrical arguments for heat transfer. They assumed the heat generated by a body is proportional to its volume while the heat lost at the body's surface is proportional to its surface area. The balance between the two suggests an allometry parameter given by the ratio of dimensions, 2/3, which does not fit the data very well. The empirical value of the allometry parameter is *b* ≈ 0.74, which was subsequently accounted for by using fractal scaling arguments [19]. A statistical technique based on the fractional calculus was developed in [20] to explain the averaging brackets in Equation (1), which in due course we use herein as an exemplar of complexity in the fractal statistics of physiological phenomena.
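Estimating the allometry parameters in Equations (1) and (2) amounts to ordinary linear regression on log-transformed data. A minimal sketch, using synthetic data generated under the assumed empirical exponent *b* ≈ 0.74 (the prefactor, noise level, and mass range are illustrative assumptions, not values taken from Figure 1):

```python
import math
import random

def fit_allometry(x_vals, y_vals):
    """Least-squares fit of log<Y> = log a + b log<X>, Eq. (2)."""
    lx = [math.log(x) for x in x_vals]
    ly = [math.log(y) for y in y_vals]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / \
        sum((u - mx) ** 2 for u in lx)
    log_a = my - b * mx
    return math.exp(log_a), b

# Synthetic "mouse-to-elephant" data spanning five orders of magnitude
# in mass (arbitrary units); exponent 0.74 and prefactor 3.0 are assumed.
random.seed(1)
masses = [10 ** (i / 2) for i in range(11)]
rates = [3.0 * m ** 0.74 * math.exp(random.gauss(0, 0.05)) for m in masses]
a_fit, b_fit = fit_allometry(masses, rates)
```

On log-log graph paper the fitted line has slope `b_fit`, recovering the assumed exponent to within the noise.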

In Section 3, selected applications of the FC are presented with the intent of persuading the reader that as systems become more complex the value of the ordinary differential calculus to describe their behavior increasingly diminishes, until it is eventually nearly lost altogether. The analytic PDF that solves the simple FKT problem is shown to explain the empirical AR using complexity-based arguments.

**Figure 1.** The mouse-to-elephant curve. The average metabolic rates of mammals and birds are plotted versus the average body weight (TBM) on log-log graph paper covering five orders of magnitude in size. The solid line segment is the best linear regression to the data from Schmidt-Nielsen [18] with permission.

#### *1.4. Another Time*

The willingness of his contemporaries to accept Newton's view of time flowing as an uninterrupted featureless stream is understandable. However, the reluctance of physicists to directly challenge Newton's view of time outside extreme conditions in the physical sciences is unclear. This reluctance is not evident in psychology, where everything we see, smell, taste and otherwise experience is in a continuous state of change. Consequently, the changes in the physical world are not experienced uniformly, which is another way of saying that there is an objective time associated with the physical world and a subjective time associated with the psychological world. The physical scientists dismissed subjective time out of hand prior to Einstein, but even after relativity the experiential time they accepted was considered to be a local physical time.

Here, we follow the discussion of Turalska and West [21]. The idea of different clocks telling different times arises naturally in physics; the linear transformation of Lorentz in relativistic physics being a familiar example. However, we are interested in the notion of multiple clocks in the biological and social sciences wherein they have begun distinguishing between cell-specific and organ-specific clocks in biology and person-specific and group-specific clocks in sociology [22]. Of course, the distinction between subjective and objective time dates back to the empirical Weber–Fechner Law [23] in the latter half of the nineteenth century.

While the global behavior of an organ, say the heart, might be characterized by apparently periodic cycles, the activity of single neurons demonstrates burstiness and noise. In a similar way, people in a social group operate according to their individual schedules, not always performing particular actions in the same global time frame. Consequently, because of the stochastic behavior of one or both clocks, a probabilistic transformation between times is often necessary. An example of such a transformation is given by the subordination procedure.

Insight into the subordination procedure is provided if we begin by defining two clocks that operationalize time in two distinct ways. The ticking of the first clock records a subjective or operational discrete time *n*, which measures an individual's time *T*(*n*). The ticking of the second clock records the objective or chronological time *t*, which measures the social time *T*(*t*) upon which a society of individuals agree. If each tick of the discrete clock *n* is considered to be an event, the relation between operational and chronological time is given by the waiting time PDF of those events in chronological time *ψ*(*t*). Assuming a renewal property for events, as given by a chain condition (convolution) from renewal theory in Section 2.1, one can relate operational to chronological time [21]:

$$\langle T(t)\rangle = \sum_{n=0}^{\infty} \int_{0}^{t} \Psi\left(t - t'\right) \psi_{n}\left(t'\right) T(n)\, dt'. \tag{3}$$

Every tick of the operational clock is an event, which in the chronological time occurs at time intervals drawn from the renewal waiting-time PDF. This randomness entails the sum over all events, and the result is an average over many realizations of the transformation. The last of the *n* events occurs at time *t*′ and the survival probability Ψ(*t* − *t*′) ensures that no further event occurs before the time *t*.

For example, consider the behavior of a two-state operational clock, whose evolution is depicted in Figure 2, where the clock switches back and forth (tick tock) between its two states at equal time intervals. However, in chronological time this regular behavior is significantly distorted, as seen in the figure. The time transformation was taken to be an inverse power law (IPL) waiting-time PDF *ψ*(*t*). Thus, a single time step in the operational time corresponds to a random time interval being drawn from *ψ*(*t*) in chronological time. The tail of the IPL PDF leads to especially strong distortions of the operational time trajectory, as there exists a non-zero probability of drawing very large time intervals between events. However, as the transformation between the operational and chronological time scales involves a random process, one needs to consider infinitely many trajectories in the chronological time, which leads to the average behavior of the clock in the chronological time denoted in Equation (3) by brackets.

**Figure 2.** The upper curve is the regular transition between the two states of the individual in operational time. The lower curve is the subordination of the transition times to an IPL PDF to obtain chronological time.
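The subordination pictured in Figure 2 is easy to simulate: tick a regular two-state clock in operational time and draw each chronological inter-event interval from an IPL waiting-time PDF via inverse-transform sampling. A minimal sketch; the IPL index *μ* = 1.5 and cutoff *t*₀ = 1 are illustrative assumptions:

```python
import random

def ipl_waiting_time(mu: float = 1.5, t0: float = 1.0) -> float:
    """Draw a waiting time from an IPL (Pareto) PDF psi(t) ~ t**(-mu)
    for t > t0, by inverse-transform sampling of the survival function."""
    u = random.random()
    return t0 * u ** (-1.0 / (mu - 1.0))

def subordinate_clock(n_ticks: int, mu: float = 1.5):
    """Map the regular operational clock (one tick per unit of n) into
    chronological time: each tick lands after a random IPL interval."""
    state, t = +1, 0.0
    trajectory = [(t, state)]
    for _ in range(n_ticks):
        t += ipl_waiting_time(mu)   # random chronological interval
        state = -state              # tick-tock between the two states
        trajectory.append((t, state))
    return trajectory

random.seed(7)
traj = subordinate_clock(10)
```

Averaging many such realizations implements the ensemble brackets in Equation (3); the occasional very long interval from the IPL tail produces the distortion seen in the lower curve of Figure 2.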

Newton's view of homogeneous isotropic time is shown to be incompatible with multiple phenomena in the social and life sciences in Section 3.2 using subordination theory. In that section the disciplines of biophysics, psychophysics, and sociophysics, to the degree they have adopted the Newtonian viewpoint, are shown to be misleading. The complexity of these disciplines requires a new calculus to describe their dynamics.

In Section 3.2, we establish a direct link between subordination theory and the FC. This has been done in the literature in a number of different ways. In Section 2, we show how the probability calculus can be generalized to the FC in order to include temporal memory and spatial heterogeneity with probability theory.

What is entailed by the results presented herein is discussed in Section 5 and some conclusions are drawn.

#### **2. Fractional Kinetic Theory**

Zaslavsky [24] considered chaotic dynamics, as a physical phenomenon, to be a bridge spanning the gap between deterministic and stochastic dynamic systems. The dynamic states in the first case are described by regular functions and in the second by kinetic or other probabilistic equations. He developed the mathematics for the fractional kinetics corresponding to chaotic dynamics that is intermediate between completely regular (integrable) and completely random cases. The kinetics become "strange" because some moments of the PDF are infinite and the Onsager Principle is violated in that it takes infinitely long for fluctuations to relax back to the equilibrium state. An alternative to the derivation of the fractional kinetic equation (FKE) given by Zaslavsky [24] is presented by West and Grigolini [25]. In this section we present the overlapping highlights of these two derivations in schematic form, emphasizing the physical interpretation.

#### *2.1. Generalizing Kinetic Theory*

We sketch Zaslavsky's arguments leading to the FKT resulting from the underlying dynamics being chaotic and consequently the dynamic trajectories being fractal. We begin with the chain condition of Bachelier, Smoluchowsky, Chapman, and Kolmogorov (BSCK) [26]:

$$P(\mathbf{x},t \mid \mathbf{x}_0, t_0) = \int P(\mathbf{x}, t \mid \mathbf{x}', t')\, P(\mathbf{x}', t' \mid \mathbf{x}_0, t_0)\, d\mathbf{x}', \tag{4}$$

where *P*(*x*, *t* | *x*′, *t*′) is the probability density of having a particle at position *x* at time *t* if at the earlier time *t*′ ≤ *t* the particle was at the point *x*′. We make the assumption that the PDF is stationary such that

$$P(\mathbf{x}, t \mid \mathbf{x}_0, t_0) = P(\mathbf{x}, \mathbf{x}_0; t - t_0), \tag{5}$$

corresponding to the regular scheme for the kinetic derivation [26], and with Δ*t* ≡ *t* − *t*<sub>0</sub> we have for the initial condition

$$\lim_{\Delta t \to 0} P(\mathbf{x}, \mathbf{x}_0; \Delta t) = \delta\left(\mathbf{x} - \mathbf{x}_0\right). \tag{6}$$

The first generalization of the historical kinetic theory argument is made by taking into account the fractal nature of the set generated by the ensemble of chaotic trajectories initiated by an underlying non-integrable Hamiltonian. Inserting the time limit for a fractional time differential into the BSCK chain condition enables us to write

$$\partial_t^{\alpha}\left[P(\mathbf{x},t)\right] = \lim_{\Delta t \to 0} \frac{1}{\Delta t^{\alpha}} \int d\mathbf{y}\left[P(\mathbf{x}, \mathbf{y}; \Delta t) - \delta\left(\mathbf{x} - \mathbf{y}\right)\right] P(\mathbf{y}; t). \tag{7}$$

This expression can be simplified using a second generalization, that being introducing the generalized Taylor expansion

$$P(\mathbf{x}, \mathbf{y}; \Delta t) = \delta\left(\mathbf{x} - \mathbf{y}\right) + A_1(\mathbf{y}; \Delta t)\,\delta^{(\beta)}\left(\mathbf{x} - \mathbf{y}\right) + A_2(\mathbf{y}; \Delta t)\,\delta^{(\beta+1)}\left(\mathbf{x} - \mathbf{y}\right), \tag{8}$$

for a set characterized by the fractal dimension 0 < *β* ≤ 1. Inserting this expansion into Equation (7) simplifies the generalized BSCK chain condition by introducing the quantities

$$\mathcal{A}(\mathbf{x}) \equiv \lim_{\Delta t \to 0} \frac{A_1(\mathbf{x}; \Delta t)}{\Delta t^{\alpha}} = \lim_{\Delta t \to 0} \int d\mathbf{y}\, \frac{|\mathbf{x} - \mathbf{y}|^{\beta}}{\Delta t^{\alpha}}\, P(\mathbf{x}, \mathbf{y}; \Delta t), \tag{9}$$

$$\mathcal{B}(\mathbf{x}) \equiv \lim_{\Delta t \to 0} \frac{A_2(\mathbf{x}; \Delta t)}{\Delta t^{\alpha}} = \lim_{\Delta t \to 0} \int d\mathbf{y}\, \frac{|\mathbf{x} - \mathbf{y}|^{\beta+1}}{\Delta t^{\alpha}}\, P(\mathbf{x}, \mathbf{y}; \Delta t). \tag{10}$$

Zaslavsky [15] explained that the limits in these two expressions are the result of the fractal dimensionality of the space-time set along which the state of the system meanders in the Δ*t* → 0 limit.

We do not reproduce the mathematical details from the open literature and instead jump to the result for the one-dimensional Fractional Kinetic equation (FKE) [15,25] and write the fractional Fokker–Planck equation (FFPE):

$$
\partial_t^{\alpha}\left[P(\mathbf{x},t)\right] = \partial_{|\mathbf{x}|}^{\beta}\left[\mathcal{A}(\mathbf{x})\,P(\mathbf{x},t)\right] + \partial_{|\mathbf{x}|}^{\beta+1}\left[\mathcal{B}(\mathbf{x})\,P(\mathbf{x},t)\right]. \tag{11}
$$

The FFPE has fractional indices in the domain 0 < *α*, *β* ≤ 1, the fractional time derivative is of the Caputo form, and the fractional spatial derivative is of the symmetric Riesz–Feller form.

So how different are the solutions to the above FFPE from those of the ordinary FPE even when *β* = 1?

#### *2.2. Solution to a Simple FKE*

One of the simplest dynamical processes described by the FFPE having far-reaching implications has a constant fractional diffusion coefficient and a vanishing fractional velocity:

$$\mathcal{A}(\mathbf{x}) = 0 \quad \text{and} \quad \mathcal{B}(\mathbf{x}) = K_{\beta}, \tag{12}$$

thereby reducing Equation (11) to

$$
\partial_t^{\alpha}\left[P(\mathbf{x},t)\right] = K_{\beta}\,\partial_{|\mathbf{x}|}^{\beta+1}\left[P(\mathbf{x},t)\right]. \tag{13}
$$

This is one of the simplest forms of anomalous diffusion, first discussed in terms of the continuous time random walk (CTRW) by Montroll and Scher [27].

The solution to this fractional diffusion equation is readily obtained by taking its combined Fourier–Laplace transform and introducing the notation

$$\mathcal{F}\left\{\partial_{|x|}^{\beta+1}\left[f(x)\right]; k\right\} = -\left|k\right|^{\beta+1}\,\tilde{f}(k), \tag{14}$$

where *f̃*(*k*) is the Fourier transform of *f*(*x*), and correspondingly

$$\mathcal{L}\left\{\partial_t^{\alpha}\left[g(t)\right]; u\right\} = u^{\alpha}\,\hat{g}(u) - u^{\alpha-1} g(0), \tag{15}$$

where *ĝ*(*u*) is the Laplace transform of *g*(*t*). Note that in Equation (14) we used the Fourier transform of the Riesz–Feller derivative in space and in Equation (15) we used the Laplace transform of the Caputo derivative in time. Consequently, we obtain from the Fourier–Laplace transform of the FFPE:

$$
u^{\alpha} P^{*}(k, u) - u^{\alpha-1}\,\widetilde{P}(k, t=0) = -K_{\beta}\,|k|^{\beta+1}\, P^{*}(k, u), \tag{16}
$$

where the asterisk denotes the double transform of the PDF and the indices lie in the interval 0 < *α*, *β* ≤ 1. This equation is simplified for the initial value problem:

$$P(\mathbf{x}, t=0) = \delta\left(\mathbf{x}\right) \implies \widetilde{P}(k, t=0) = 1, \tag{17}$$

to the form

$$P^{*}(k, u) = \frac{u^{\alpha-1}}{u^{\alpha} + K_{\beta}\left|k\right|^{\beta+1}}. \tag{18}$$

The inverse Fourier–Laplace transform of this expression yields the solution to the initial value problem for the PDF.

Metzler and Klafter [28] derived the FFPE using the CTRW formalism of Montroll and Weiss [29] and reviewed the potential functions for various combinations of indices. It has also been derived using subordination theory by West [30]. The inverse Laplace transform of *P*∗(*k*, *u*) yields the characteristic function

$$\widetilde{P}(k,t) = E_{\alpha}\left(-K_{\beta}\,\left|k\right|^{\beta+1}\, t^{\alpha}\right) \tag{19}$$

expressed in terms of the Mittag–Leffler function (MLF):

$$E_{\alpha}\left(z\right) = \sum_{n=0}^{\infty} \frac{z^{n}}{\Gamma\left(n\alpha + 1\right)}. \tag{20}$$

The inverse Fourier transform of the characteristic function yields the PDF solution

$$P(\mathbf{x},t) = \mathcal{F}^{-1}\left[E_{\alpha}\left(-K_{\beta}\left|k\right|^{\beta+1} t^{\alpha}\right); \mathbf{x}\right]. \tag{21}$$

The simple substitution *k*′ = *kt*<sup>*δ*</sup> into Equation (21), with *δ* = *α*/(*β* + 1), after some algebra reduces the formal solution to

$$P(\mathbf{x},t) = \frac{1}{t^{\delta}}\,\mathcal{F}^{-1}\left[E_{\alpha}\left(-K_{\beta}\left|k'\right|^{\beta+1}\right); \frac{\mathbf{x}}{t^{\delta}}\right], \tag{22}$$

or in a more familiar scaling form:

$$P(\mathbf{x},t) = \frac{1}{t^{\delta}} F\left(\frac{\mathbf{x}}{t^{\delta}}\right),\tag{23}$$

where the new function is defined:

$$F\left(\frac{\mathbf{x}}{t^{\delta}}\right) \equiv P\left(\frac{\mathbf{x}}{t^{\delta}}, 1\right). \tag{24}$$

The function *F*(·) is analytic in the scaled variable *x*/*t*<sup>*δ*</sup>, is properly normalized, and can therefore be treated as a PDF. For a standard diffusion process *α* = 1, and the MLF becomes an exponential, so that for *β* = 1 the Fourier transform can be carried out and this function becomes a Gaussian with *δ* = 1/2. When *α* = 1 and 0 < *β* < 1 the result is a stable Lévy process [26,31] with the Lévy index given by 1/*δ* = *β* + 1, lying in the interval 0 < 1/*δ* ≤ 2. However, for general chaotic systems there is a broad class of distributions for which the functional form is neither Gaussian nor Lévy.

Mainardi et al. [32] obtained a variety of other solutions to the FKE in terms of the properties of the MLF for 0 < *α* < 1. The inverse Fourier transform of the scaled PDF solution for *β* = 1 asymptotically relaxes as the IPL *t*<sup>−*α*/2</sup>.
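The one-parameter MLF of Equation (20) is straightforward to evaluate numerically by truncating the standard series Σ<sub>*n*≥0</sub> *z*<sup>*n*</sup>/Γ(*nα* + 1); a minimal sketch, with the truncation length an arbitrary choice:

```python
import math

def mittag_leffler(z: float, alpha: float, n_terms: int = 100) -> float:
    """Truncated one-parameter Mittag-Leffler series,
    E_alpha(z) = sum_{n=0}^inf z**n / Gamma(n*alpha + 1)."""
    return sum(z**n / math.gamma(n * alpha + 1) for n in range(n_terms))

# For alpha = 1 the series collapses to the exponential: E_1(z) = exp(z),
# recovering ordinary diffusive relaxation of the characteristic function.
e_check = mittag_leffler(1.0, 1.0)

# For 0 < alpha < 1 and z < 0 the relaxation is slower than exponential;
# the known identity E_{1/2}(-x) = exp(x**2) * erfc(x) provides a check.
ml_half = mittag_leffler(-1.0, 0.5)
```

Truncation at 100 terms is far more than enough here, since the gamma function in the denominator makes the tail of the series negligible for arguments of order one.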

#### *2.3. Self-Similar Random Walks*

Zaslavsky et al. [33] worked to visualize the underlying landscape produced by averaging over chaotic trajectories and to describe the formal structure uncovered by extensive numerical calculations. They discuss the notion of a "stochastic web" to characterize the chaotic dynamics generated by Hamiltonian systems in which "weak" chaotic orbits are concentrated on small-measure domains of phase space, thereby constituting a "web". They note that transport through stochastic webs could produce non-Gaussian, i.e., intrinsically anomalous, diffusion.

The nexus points of the web constitute traps where homoclinic points have dissolved into a spray of points that locally entrap trajectories for IPL-distributed lengths of time. Exiting a trap, the orbit undergoes a long-range flight having self-similar properties. The process can be realized as passing through the turnstiles of "cantori" [34]. This argument is realized by replacing the complete simulation of the Hamiltonian dynamics with a random walk (RW) containing the appropriate qualitative features. They do this by way of example, constructing a RW determined by a Weierstrass (W) function [35]. Consider the discrete probability described by the stepping PDF for the Weierstrass random walk (WRW) on a one-dimensional lattice with sites indexed by *x* [35]:

$$p(x) = \frac{a-1}{2a} \sum_{n=0}^{\infty} \frac{1}{a^{n}} \left[\delta_{x, b^{n}} + \delta_{x, -b^{n}}\right], \tag{25}$$

where *a* and *b* are dimensionless constants greater than one, and *δij* is the Kronecker delta function: *δij* = 1 for *i* = *j* and *δij* = 0 for *i* ≠ *j*. We follow the analysis of this discrete process given by West and Grigolini [6]. The first notable property of the PDF generated by the WRW is that the second moment of this RW process diverges:

$$
\left\langle x^{2} \right\rangle = \frac{a-1}{a} \sum_{n=0}^{\infty} \left(\frac{b^{2}}{a}\right)^{n}, \tag{26}
$$

for *b*<sup>2</sup> > *a*, since the geometric series then diverges. The discrete Fourier transform of the PDF given by Equation (25) yields the discrete characteristic function

$$
\hat{p}\left(k\right) = \frac{a-1}{a} \sum_{n=0}^{\infty} \frac{1}{a^{n}} \cos\left(b^{n} k\right). \tag{27}
$$

This series was introduced by Weierstrass in 1872 in response to Cantor, a former student and subsequent colleague, who challenged him to construct an analytic function that is continuous everywhere but is nowhere differentiable. Thanks to Mandelbrot [8] we now know that this was the first consciously constructed fractal function and the divergence of the second moment is a consequence of its non-analytic properties.

As the WRW process unfolds the set of sites visited mimics the influence of localized chaotic islands, interspersed by gaps, nested within clusters of clumps over ever-larger spatial scales. The WRW generates a hierarchy of traps that are statistically self-similar, as suggested by Figure 3. The parameter *a* determines the number of subclusters within a cluster and the parameter *b* determines the scale size between clusters.

**Figure 3.** The landing sites for the WRW are depicted and the islands of clusters discussed in the text are readily seen.
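The clustered landing sites of Figure 3 can be reproduced with a minimal simulation of the stepping PDF of Equation (25). The sketch below is ours and uses illustrative parameter values *a* = 2 and *b* = 3; a step of length *b*<sup>*n*</sup> is drawn with probability ((*a* − 1)/*a*) *a*<sup>−*n*</sup>, with either sign equally likely.

```python
import random

def wrw_step(a=2.0, b=3, rng=random):
    """Draw one step of the Weierstrass random walk, Equation (25):
    length b**n with probability (a - 1)/a * a**-n, sign +/- equally likely."""
    u = rng.random()
    n = 0
    cum = (a - 1.0) / a
    while u > cum:
        n += 1
        cum += (a - 1.0) / a * a**-n
    sign = 1 if rng.random() < 0.5 else -1
    return sign * b**n

random.seed(1)
sites, x = [], 0
for _ in range(1000):
    x += wrw_step()
    sites.append(x)
# Roughly (a - 1)/a of the steps have unit length and fill in a local
# cluster; the rare long steps of length b**n relocate the walker and
# initiate the next level of the hierarchy suggested by Figure 3.
```

Because the step lengths are powers of *b* with geometrically decreasing weights, the set of visited sites is statistically self-similar, mirroring the nested island structure described in the text.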

The Weierstrass form of the characteristic function allows for a renormalization group (RG) solution [36] from which we can determine the scaling properties of the WRW. Scaling the argument of the characteristic function by *b* and reordering terms in the series allows us to write [33,36]

$$
\hat{p}\left(bk\right) = a\hat{p}\left(k\right) - \left(a-1\right)\cos k.\tag{28}
$$
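As a quick numerical sanity check (ours, with illustrative values *a* = 2 and *b* = 3), a truncated version of the series in Equation (27) satisfies the reordering argument to within truncation error:

```python
import math

def p_hat(k, a=2.0, b=3.0, nmax=50):
    """Truncated Weierstrass characteristic function, Equation (27)."""
    return (a - 1) / a * sum(math.cos(b**n * k) / a**n for n in range(nmax))

a, b = 2.0, 3.0
for k in (0.05, 0.1, 0.7):
    lhs = p_hat(b * k, a, b)
    rhs = a * p_hat(k, a, b) - (a - 1) * math.cos(k)
    # Scaling k by b and shifting the summation index leaves only the
    # n = 0 cosine term uncancelled, up to the truncation of the series.
    assert abs(lhs - rhs) < 1e-9
```

The discrepancy comes only from the discarded tail of the series, which is suppressed by *a*<sup>−*n*</sup>.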

The RG solution to Equation (28) can be separated into a homogeneous part and a singular part:

$$
\hat{p}\left(k\right) = \hat{p}\_s\left(k\right) + \hat{p}\_h\left(k\right),
\tag{29}
$$

where $\hat{p}_h(k)$ is analytic in the neighborhood of $k = 0$ and $\hat{p}_s(k)$ is singular in this neighborhood. The singular part $\hat{p}_s(k)$ is obtained by solving the scaling equation:

$$
\hat{p}\_s\left(bk\right) = a\hat{p}\_s\left(k\right),
\tag{30}
$$

where we assume the formal solution:

$$
\hat{p}\_s\left(k\right) = A\left(k\right)k^{\delta}.\tag{31}
$$

Inserting this form of the singular solution into Equation (30) yields

$$A(bk)b^{\delta}k^{\delta} = aA(k)k^{\delta},\tag{32}$$

providing the distinct equalities

$$b^{\delta} = a,\tag{33}$$

$$A(bk) = A(k). \tag{34}$$

The first equality yields the power-law index in terms of the series parameters: *δ* = ln *a*/ ln *b*. The second equality implies that *A*(*k*) is periodic in the logarithm of *k* with period ln *b*. Consequently, the singular part of the RG solution is written

$$\hat{p}\_s(k) = \sum\_{n = -\infty}^{\infty} A\_n \left| k \right|^{H\_n},\tag{35}$$

with the complex power–law index:

$$H\_n = \delta + in\frac{2\pi}{\ln b} = \frac{\ln a}{\ln b} + in\frac{2\pi}{\ln b}.\tag{36}$$

The analytic forms of the Fourier coefficients in Equation (35) are given in [35].

Hughes et al. [35] prove that the dominant behavior of the WRW is determined by the lowest-order term in the singular part of the solution for the discrete characteristic function, but we do not show that here. Instead we assume that the dominant behavior is given by the *n* = 0 term in the series:

$$
\widehat{p}\_s\left(k\right) \approx A\_0 \left|k\right|^\delta \,, \tag{37}
$$

whose inverse Fourier transform is determined by a Tauberian theorem to be the IPL:

$$p(\mathbf{x}) = \frac{K(\delta)}{|\mathbf{x}|^{\delta+1}},\tag{38}$$

and *K*(*δ*) is a known function of *δ*. Thus, the singular part of the WRW has an IPL stepping PDF and this dominant behavior intuitively justifies ignoring all the other terms in the series.

We now write for the asymptotic time-dependent form of the discrete PDF resulting from the WRW:

$$\begin{aligned} P(\mathbf{x}, n+1) &= \sum\_{\mathbf{x}'} p(\mathbf{x} - \mathbf{x}') P(\mathbf{x}', n) \\ &= \sum\_{\mathbf{x}'} \frac{K(\delta)}{|\mathbf{x} - \mathbf{x}'|^{\delta+1}} P(\mathbf{x}', n), \end{aligned} \tag{39}$$

where we assume that each step *n* in the WRW process occurs at equal time intervals. Equation (39) was analyzed in 1970 by Gillis and Weiss [37], who determined that its solution is a Lévy PDF, thereby connecting the RG solution of the WRW to our earlier discussion of the fractional diffusion equation.

Stable Lévy processes can therefore arise from the "weak" chaotic nature of the phase-space trajectories. This is, in part, a consequence of the asymptotic behavior *k* → 0 corresponding to the asymptotic limit *x* → ∞, which is of significance in determining the transport behavior of the anomalous diffusion process.

#### **3. Patterns and Complexity**

In the Introduction we identified one of those patterns that is not restricted to a particular discipline, but pops up in every discipline from anatomy to zoology, and that pattern is an allometry relation (AR). However, what distinguishes such patterns from, for example, simple periodic motion? Of course, the existence of such regularity, the pattern of reproducibility in space and time, is what motivated the first investigators to seek common causes to associate with those patterns. Periodic motions, such as vibrations, motivated Hooke to introduce his law using Newton's mechanical force for its explanation. The amazing success of such laws reinforced the idea that other phenomena including the beating of the heart, walking, and the propagation of light could all be described by adopting a similar modeling strategy. However, the *luminiferous aether* is now a quaint historical myth concerning the assumed need for a medium with remarkable properties to support the propagation of electromagnetic waves. In addition, the *normal sinus rhythm* of the heart is a medical myth, as heartbeats are not sinusoidal. The more complex the phenomenon being considered, the less well its patterns are reproduced using Newton's view of science.

Much of the present discussion stems from the need to replace Newton's atavistic characterization of space and time, because they fail to capture the rich structure of the complexity of the modern world. The failure to systematically reexamine these fundamental assumptions has restricted the utility of the modeling techniques of modern physics in the study of psychology, sociology, and the life sciences. The experience of space and time differs for the claustrophobic and the agoraphobic, for the performer on the stage and the surgeon operating on the brain, for the warrior on the battlefield and the physician on the critical care ward. We require a mathematics that can capture all of this and so much more. The conclusions drawn herein were anticipated a couple of years ago [38]:

What is becoming increasingly clear ... is that the technical intensity of the world has become so dense that the mathematical language initiated by Newton is no longer adequate for its understanding. In fact we now find that we have been looking at the world through a lens that often suppresses the most important aspects of phenomena, most of which are not "simple". These are characteristics of the phenomena that cannot be described using differential equations and we refer to them as complex.

#### *3.1. Allometry through Complexity*

We have argued elsewhere [20,39] that the empirical AR given by Equation (1) is a consequence of the imbalance between the complexity associated with the system functionality and the complexity associated with the system size, both being measured by Shannon information. We refer to this as the allometry/information hypothesis (A/I–H) [40] and postulate that in a complex network, composed of two or more interacting subnetworks, the flow of information is driven by the complexity gradient between the subnetworks, transported from that with the greater to that with the lesser complexity.

Implicit in the A/I–H is the assumed existence of dependencies of both system size and system functionality on complexity. Such dependencies have been observed in the positive feedback between social complexity and the size of human social groups [41,42], as well as in ant colony size [43], and the increase in biological complexity with ecosystem size [44]. Other relations have been observed in multiple disciplines, including the increase of prey refuge from predators with habitat complexity [45], computational complexity increasing with program size [46], and gene functionality depending on

system complexity [47]. We abstract from these observations that the complexity of a phenomenon increases with system size and that the system functionality increases with system complexity.

The argument presented in this section follows that given recently by West et al. [48] in their discussion of the evolution of military technology over the past millennium. It is intuitively understood, but not often explicitly stated, that size and complexity grow together and are inextricably intertwined through criticality. Moreover, although tied together, their changes are not in direct proportion to one another. A similar connection exists between complexity and system functionality [38]. These interconnections are represented through homogeneous scaling relations, as shown below. West argued that as a system increases in size it provides increasing opportunity for variability, which is necessary in order to maintain stability. Scaling provides a measure of complexity in dynamic systems, indicating that the system's observables can simultaneously fluctuate over many time and/or space scales. An observable *Z*(*t*) scales if for a constant *λ* it satisfies the homogeneous relation

$$Z(\lambda t) = \lambda^{\mu\_z} Z(t) \tag{40}$$

with the scaling index given by *μz*. Note that if we consider the AR given by Equation (1), but without the averaging brackets, and the size and functionality depend on a parameter *t* and scale in the manner indicated by Equation (40), each with a distinct power-law index, then *b* = *μY*/*μ<sup>X</sup>* in order for the AR to be satisfied.

The hallmarks of fractal statistics are spatial (*z*) inhomogeneity and temporal (*t*) intermittency and the phase space trajectory (*z*; *t*) replaces the dynamic variable *Z*(*t*). In phase space, the scaling of the dynamic variable is replaced by a scaling of the PDF *P*(*z*; *t*):

$$P(z;t) = \frac{1}{t^{\mu\_z}} F\_z \left( \frac{z}{t^{\mu\_z}} \right) \tag{41}$$

as given by Equation (23) for general complex phenomena. There is a broad class of PDFs for which the functional form of *Fz*(·) is left unspecified.

It is straightforward to calculate the average value of *Z*(*t*) using the PDF given by Equation (41):

$$
\langle Z(t) \rangle = \int zP(z,t)dz = \overline{q}\_z t^{\mu\_z},\tag{42}
$$

and the overall constant is determined by the scaling variable $q = z/t^{\mu_z}$ averaged over the PDF $F_z(q)$:

$$
\overline{q}\_z \equiv \int q F\_z \left( q \right) dq. \tag{43}
$$

Interpreting *Z*(*t*) as the system's TBM *X*(*t*), Equation (42) describes the growth in the overall average size of a complex system with the time *t*, due to the intrinsic dynamics generating increasing complexity. A similar observation can be made interpreting the dynamic variable as a functionality of the system *Y*(*t*). Consequently, the same functional form results for both *Y*(*t*) and *X*(*t*), each with its own index. This is not entirely unexpected, since both the functionality and size of the system grow with complexity, but at different rates.

Notice that, using the scaling PDF, the average of the dynamic variable now has the scaling property:

$$
\langle Z(\lambda t) \rangle = \lambda^{\mu\_z} \langle Z(t) \rangle \,. \tag{44}
$$

If both the size and functionality of the system can be characterized in terms of the system's complexity by the same form of scaling PDF we obtain two equations in *t* for the averages. Setting the scaling parameter to *λ* = 1/*t*, after some algebra we obtain the equalities

$$t = \left(\frac{\langle Y(t) \rangle}{\langle Y(1) \rangle} \right)^{\frac{1}{\mu\_Y}} = \left(\frac{\langle X(t) \rangle}{\langle X(1) \rangle} \right)^{\frac{1}{\mu\_X}},\tag{45}$$

which can be rewritten in the form of the empirical AR given by Equation (1):

$$\left< Y \right> = a \left< X \right>^{b},\tag{46}$$

with the allometry parameters:

$$a = \frac{\langle Y(1) \rangle}{\langle X(1) \rangle^b} = \frac{\overline{q}\_Y}{\overline{q}\_X^b} \text{ and } b = \frac{\mu\_Y}{\mu\_X}. \tag{47}$$

Here, we have used Equation (42) to obtain the second equality for the allometry coefficient, thus demonstrating that the empirical AR is the result of the self-similar behavior of the PDF.
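The chain of reasoning from Equation (42) to Equation (47) can be verified directly. In the sketch below the scaling indices and scaled averages are hypothetical illustrative values, not fitted quantities; eliminating *t* reproduces the AR exactly.

```python
# Hypothetical scaling indices and scaled averages (illustrative values only).
mu_X, mu_Y = 0.8, 0.6
qbar_X, qbar_Y = 2.0, 3.0

b = mu_Y / mu_X            # allometry index, Equation (47)
a = qbar_Y / qbar_X**b     # allometry coefficient, Equation (47)

def X_avg(t):
    return qbar_X * t**mu_X    # <X(t)>, Equation (42)

def Y_avg(t):
    return qbar_Y * t**mu_Y    # <Y(t)>, Equation (42)

# Eliminating t as in Equation (45) leaves <Y> = a <X>**b for every t,
# which is the empirical AR of Equation (46).
for t in (1.0, 5.0, 100.0):
    assert abs(Y_avg(t) - a * X_avg(t)**b) < 1e-9 * Y_avg(t)
```

The identity holds for any positive indices; the AR carries no information beyond the ratio of the two scaling exponents and the ratio of the scaled averages.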

Note that the allometry index *b* is expressed as the ratio of the scaling index *δ* = *α*/(*β* + 1) for the system functionality to that for the system size. In general, this scaling index is less than one for both the system size and the functionality. It is also the case that for physiological systems *b* < 1. The more the index for the fractional time derivative deviates downward from one, the greater influence the complexity history has on the present behavior of the independent variable, whether functionality or size. The more the index of the fractional variate derivative deviates downward from two, the greater is the nonlocal coupling of the independent variables (functionality or size) across scales. However, these two mechanisms do not independently determine the scaled PDF. It is their ratio that determines the balance of effects in the functionality and size separately, and then their ratio yields *b*.

It is this coupling across scales in size, as well as in physiologic time, that entails the temporal AR with *b* < 1, along with the positive growth of entropy in approaching the steady state asymptotically. The results of these brief arguments are encapsulated in the Principle of Complexity Management (PCM), which establishes that in the interaction between two complex networks, information flows from the more complex to the less complex network. Information transfer is maximally efficient when the complexities of the two networks are matched [38]. In the time-size application of this section, the PCM takes the form: *The origin of natural patterns manifested by temporal ARs is the imbalance between the complexity associated with a system's measure of time and the complexity associated with a system's size. In both networks the complexity is measured by the Wiener/Shannon entropy*.

#### *3.2. It's about Time*

The fundamental question addressed in this section is whether time outside the physical sciences, say, for the scurrying mouse at the lower left of Figure 1, is the same as for the lumbering elephant at the upper right of the metabolic AR curve. Newton would assert that they are identical, and we would agree that the time shared by the two animals is the same when referenced to an external mechanical clock. However, are the two times the same when referenced to their individual physiological clocks? This question arises because the lifespans of the two creatures are essentially the same when their lifetimes are measured by the product of the number of heartbeats and the average time interval between beats. This is very different from the comparison of their separate lifespans when referenced to an external clock, in which case the two differ by years. This change of reference of time measures, from the ticking of a clock to the beating of a heart, suggests that physiological time may be a monotonically decreasing function of physical time [49].

This difference in the meaning of time has led to such concepts as biological time [50], physiologic time [51], and metabolic time [52], all in an effort to highlight the distinction between time in living and in inanimate systems. The intrinsic time in a living process was first called biological time by Hill [53], who reasoned that since so many properties of an organism change with size, time itself ought to scale with TBM. Natural scientists have subsequently hypothesized that physiologic time differs from the time measured by the ticking of a mechanical clock, or Newtonian time, in that the former changes with the size of the animal [17,18], whereas the latter does not [54].

Lindstedt and Calder [55] developed the concept of biological time further and determined experimentally that biological time, such as species longevity, satisfies a temporal AR with the functionality of the system being the physiologic time *Y* = *τ* and *X* the TBM *M* [56]:

$$
\langle \tau \rangle = a \left\langle M \right\rangle^b \tag{48}
$$

which describes the average duration of biological events. In Figure 4, we record the average heart rate *R* = 1/*τ* for sixteen animals [57] covering six orders of magnitude in average TBM. The solid line segment is the fit to the data, with empirical values of the allometry parameters given by *a* = 205 and *b* = 0.248 and a quality of fit measured by *r*<sup>2</sup> = 0.96. Other, more exhaustive fits to larger data sets, made by other investigators, support the notion that physiologic time is extensive; such fits may be found in many other places [17,18], and the results are equivalent.

**Figure 4.** The average heart rate in beats per minute for 16 animals from the fastest, hamsters, to the slowest, large whales, with humans being in the middle of a fitting curve. The data were obtained from [57] and the solid line segment is fitted to the temporal AR. From the work in [49] with permission.
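To illustrate how such a fit is performed, the sketch below regresses log rate against log mass. The masses and rates are synthetic, generated from the quoted parameter values *a* = 205 and *b* = 0.248 (with the negative sign appropriate to a rate, since heart rate falls with mass); they are not the data of [57].

```python
import math

# Hypothetical body masses (kg) and heart rates (beats/min), generated
# from the quoted fit R = 205 * M**-0.248; illustrative only.
masses = [0.03, 0.3, 3.0, 30.0, 300.0, 3000.0]
rates = [205.0 * m**-0.248 for m in masses]

# Least-squares slope in log-log coordinates recovers the allometry index.
lx = [math.log(m) for m in masses]
ly = [math.log(r) for r in rates]
n = len(lx)
mx, my = sum(lx) / n, sum(ly) / n
slope = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) \
        / sum((x - mx)**2 for x in lx)
intercept = my - slope * mx
# slope recovers -0.248 and exp(intercept) recovers the coefficient 205.
```

On noiseless power-law data the regression is exact; on real data the scatter about the line is what *r*<sup>2</sup> quantifies.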

In an allometry context, one version of the FKE, would be given by Equation (13) where the phase space variables (*z*, *t*) are here given by (*m*, *τ*) [30] and *P*(*m*, *τ*)*dm* is the probability that the dynamic mass variable *M*(*τ*) lies in the interval (*m*, *m* + *dm*) at time *τ*. *M*(*τ*) represents the TBM of a mature individual species member, within an ensemble of realizations, at the physiological time *τ*. The exact solution to the FKE has been obtained as the inverse Fourier transform of the characteristic function, expressed in terms of the Mittag–Leffler function given by Equation (21) with the variables properly defined. The allometry coefficient in this temporal AR has a theoretical value expressed in terms of the average of the scaled variable *q* = *m*/*τδ*. Consequently, the complexity of the underlying physiology of an animal entails the physiologic time through the scaling statistics.

The dependence of the empirical AR on the overall state of the system is captured by the entropy. The Wiener/Shannon information entropy associated with the system manifesting temporal allometry has the value

$$S(\tau) = -\int P(m,\tau)\log\_2 P(m,\tau)dm\tag{49}$$

which when the scaled PDF given by Equation (41) is inserted into the integral yields

$$S(\tau) = S\_0 + \delta \log\_2 \tau \tag{50}$$

where *S*<sup>0</sup> is the entropy referenced to the PDF *F*(·).
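Equation (50) can be checked by direct numerical integration of Equation (49). In the sketch below the scaling function *F* is taken, purely for illustration, to be a standard Gaussian, for which the reference entropy is S0 = ½ log2(2πe); the linear growth of the entropy in log2 *τ* with slope *δ* then emerges from the quadrature.

```python
import math

def entropy_bits(delta, tau, grid=20001, span=12.0):
    """Numerically evaluate Equation (49) for the scaled PDF
    P(m, tau) = F(m / tau**delta) / tau**delta with F a standard normal."""
    sigma = tau**delta
    h = 2 * span * sigma / (grid - 1)
    s = 0.0
    for i in range(grid):
        m = -span * sigma + i * h
        p = math.exp(-0.5 * (m / sigma)**2) / (sigma * math.sqrt(2 * math.pi))
        if p > 0.0:
            s -= p * math.log2(p) * h
    return s

delta = 0.3
S0 = 0.5 * math.log2(2 * math.pi * math.e)   # entropy of F itself
for tau in (1.0, 10.0, 100.0):
    # Equation (50): S(tau) = S0 + delta * log2(tau)
    assert abs(entropy_bits(delta, tau) - (S0 + delta * math.log2(tau))) < 1e-3
```

Any normalizable *F* would do; changing *F* shifts only *S*<sub>0</sub>, not the *δ* log2 *τ* growth.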

Consequently, as we mentioned earlier, given a monotonic function relating physical and physiologic time *t* = *g*(*τ*), such that

$$\frac{dg(\tau)}{d\tau} \equiv \dot{g} \ge 0 \tag{51}$$

we have for the physical time derivative of the entropy Equation (50):

$$\frac{dS(\tau)}{dt} = \frac{\delta}{\tau} \frac{1}{\dot{g}} \ge 0 \tag{52}$$

Consequently, the entropy generation in physical time for the physiologic process entailing the temporal AR is positive semidefinite. Thus, the rate of entropy generation in Newtonian time is consistent with the dynamics of living systems having their own physiological time.

It is worth pointing out that empirical ARs are not necessarily restricted to living systems, but arise in social systems as well. This is not entirely unexpected, as the average mass in an empirical AR is actually a surrogate for the living system's complexity. Proceeding by analogy, one might anticipate that such an AR should appear in a social context, where the average TBM is replaced with the average population or population density. This does, in fact, occur in the form of ARs where the functionality is expressed in terms of the rate at which an event occurs. An exemplar is Farr's Law, which dates back to the nineteenth century and quantifies the "evil effects of crowding", relating a population's mortality rate to an institution's patient population density in the form of a rate AR [38,58]. Other examples of social ARs include increasing urban crime rates, the more rapid spread of infectious diseases, and a speedup in pedestrian walking, all with increasing city size, as quantitatively confirmed by Bettencourt et al. [59]. Unlike the biological case, in the social rate ARs the allometry index has a value greater than one, *b* > 1, confirming that cities have, at all times and in all places, throughout history, entailed increased rates in human activity, for good or ill.

#### **4. Subordination**

The Montroll–Weiss (MW) perspective of CTRW [29] has been used to support the assumption that there are at least two distinct, but related, interpretations of time associated with a system's dynamics. As noted in the Introduction, the first is the external time associated with an objective observer who records the behavior of the system. This is Newton's assumption of what constitutes time: it is experimental or clock time. The second kind of time is the local time associated with the internal dynamics of the system, called subjective or operational time. In a psychological experiment the latter time is what is experienced by the participant. The experimental observation, carried out in the clock time *t*, is subordinated to a process occurring in the operational time *n*. For simplicity, we assume the operational time *n* to be an integer number so large as to become indistinguishable from a continuous variable. In the operational time *n* the evolution of the PDF describing the process is described by the ordinary diffusion equation

$$\frac{\partial P(\mathbf{x},n)}{\partial n} = D \frac{\partial^2 P(\mathbf{x},n)}{\partial \mathbf{x}^2} = \mathcal{L}P(\mathbf{x},n),\tag{53}$$

where $\mathcal{L} \equiv D\,\partial^2/\partial x^2$ is the diffusion operator.
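In operational time nothing exotic is happening: Equation (53) is the ordinary diffusion equation. A minimal explicit finite-difference sketch (ours; unit lattice spacing, unit step in *n*, and illustrative parameter values) shows the familiar linear growth of the variance with the operational time.

```python
def diffuse(P, D=0.1, steps=200):
    """Explicit Euler update of Equation (53) on a periodic lattice
    (unit spacing, unit operational-time step); stable for D <= 0.5."""
    P = list(P)
    for _ in range(steps):
        P = [P[i] + D * (P[(i + 1) % len(P)] - 2 * P[i] + P[i - 1])
             for i in range(len(P))]
    return P

N = 201
P0 = [0.0] * N
P0[N // 2] = 1.0               # walker starts at the origin
P = diffuse(P0)
var = sum((i - N // 2)**2 * p for i, p in enumerate(P))
# For ordinary diffusion the variance grows linearly in n: var = 2*D*n,
# i.e., 2 * 0.1 * 200 = 40 lattice units squared here.
```

It is only the subordination of this well-behaved process to the clock time, through the waiting-time PDF introduced below, that produces the fractional dynamics.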

The dynamics generating the diffusion process is the simple Langevin equation

$$\frac{dX(n)}{dn} = \eta\left(n\right),\tag{54}$$

where *X*(*n*) is the space coordinate at time *n* and *η*(*n*) is the fluctuating velocity. If the velocity is a stochastic process with delta-correlated fluctuations, this equation yields a diffusion process with scaling index *δ* = 1/2. If *δ* ≠ 1/2 the diffusion is anomalous and is the result of memory influencing the fluctuations. In the present representation *η*(*n*) of Equation (54) is totally random, i.e., it has no memory. However, in the clock time, the event *η*(*n*) occurs at time *t*(*n*) and the independent event *η*(*n* + 1) at time *t*(*n* + 1), with the time distance *τ*(*n*) = *t*(*n* + 1) − *t*(*n*) drawn from a waiting time PDF *ψ*(*τ*). We are interested in the case where the waiting time PDF has the hyperbolic form:

$$\psi\left(\tau\right) = \left(\mu - 1\right) \frac{T^{\mu - 1}}{\left(T + \tau\right)^{\mu}}\tag{55}$$

We use this hyperbolic form to define the concept of crucial events.

Crucial events are defined by the time interval separating the occurrence of consecutive events. The time intervals between crucial events are determined by a waiting time PDF given by Equation (55), with the condition 1 < *μ* < 3. In clock time we use the theoretical MW prescription [29] to obtain

$$P(\mathbf{x},t) = \sum\_{n=0}^{\infty} \int\_{0}^{t} dt' \psi\_{n}\left(t'\right) \Psi\left(t - t'\right) e^{n\mathcal{L}} P(\mathbf{x},0). \tag{56}$$

Note that *ψn*(*t'*) is the PDF that *n* events have occurred, the last taking place at time *t'*, and Ψ(*t* − *t'*) is the probability that no further event occurs between *t'* and *t*.

For the formula given by Equation (56) to hold with *n* going to ∞, we must assume that for the random walker to travel the distance *x* in a time *t* a virtually infinite number of events may occur, thereby implying the diffusion coefficient *D* is extremely small. In the case *μ* < 2, the mean waiting time *τ* diverges, providing an additional reason for the experimental observation time *t* to be large.
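The waiting-time PDF of Equation (55) is easy to sample by inverting its survival probability Ψ(τ) = (T/(T + τ))^(μ−1). The sketch below is ours, with illustrative parameter values, and makes the dominance of rare long inter-event times for 1 < μ < 2 tangible.

```python
import random

def waiting_time(mu=1.8, T=1.0, rng=random):
    """Inverse-CDF sample from the hyperbolic waiting-time PDF, Eq. (55).
    The survival probability is Psi(tau) = (T / (T + tau))**(mu - 1),
    so setting it equal to a uniform deviate u and solving for tau gives
    tau = T * (u**(-1/(mu - 1)) - 1)."""
    u = rng.random()
    return T * (u**(-1.0 / (mu - 1.0)) - 1.0)

random.seed(7)
taus = [waiting_time() for _ in range(10000)]
# With 1 < mu < 2 the mean waiting time diverges: the sample mean is
# dominated by a few enormous inter-event times, the crucial events,
# while the median stays near T * (2**(1/(mu - 1)) - 1).
```

Feeding these waiting times to the walker of Equation (54) subordinates the operational-time diffusion to the clock time, which is the content of Equation (56).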

It is possible to prove, using the arguments developed by Allegrini et al. [60] with a minor notational change, that Equation (56) is equivalent to the integro-differential phase space equation:

$$\frac{\partial P(\mathbf{x},t)}{\partial t} = \int\_0^t dt' \Phi\left(t - t'\right) \mathcal{L}P(\mathbf{x}, t'),\tag{57}$$

where Φ(*t*) is the MW memory kernel related to the waiting–time PDF and *ψ*(*t*) = *ψn*=1(*t*). In the Laplace transform representation, where $\hat{f}(u)$ denotes the Laplace transform of *f*(*t*), this latter relation is

$$
\hat{\Phi}\left(u\right) = \frac{u\hat{\psi}\left(u\right)}{1 - \hat{\psi}\left(u\right)}.\tag{58}
$$

In the case where the index for the hyperbolic PDF, which asymptotically is the IPL index, is in the interval 1 < *μ* < 2, using Equation (58) it is shown [61] that asymptotically *u* → 0:

$$
\hat{\Phi}\left(u\right) \approx u^{1-\alpha}.\tag{59}
$$

Inserting this asymptotic expression into the Laplace transform of Equation (57) and taking the inverse Laplace transform yields the fractional diffusion equation (FDE):

$$\frac{\partial^{\alpha} P(\mathbf{x}, t)}{\partial t^{\alpha}} = \mathcal{L}P(\mathbf{x}, t) \tag{60}$$

Here, the fractional time derivative is of the Caputo form with *α* = *μ* − 1 < 1. We note here that the analytic solution to Equation (60) is given by the scaling PDF Equation (23) when *β* = 1 and *δ* = *α*/2.

Culbreth et al. [62] stress certain subtleties of these formal results to provide a context with which to appreciate their contribution to the field of cognition and to the fractional calculus. First, they notice that we can use psychological arguments to interpret the connection between operational time and clock time, as done in [63]. The operational time is subjective in this psychological context, with a logarithmic connection to the clock time *t*, which changes an exponential waiting time PDF into the hyperbolic structure of Equation (55). This property provides the rationale for why they [62] consider the CTRW formalism to be closely connected to the issue of cognition. As they point out, earlier work [60] analyzed a series of events drawn from the hyperbolic waiting time PDF using the Kolmogorov–Sinai definition of complexity and determined that the signal becomes computationally compressible for 2 < *μ* < 3. This is equivalent to assessing that the time series hosts messages that can be decoded.

On the other hand, the Kolmogorov–Sinai entropy vanishes for *μ* < 2 and has been recently generalized to take into account the rare crucial events [64] of this region. These crucial events are conjectured to be the signal of swarm intelligence [65], while the observation of the dynamics of the brain leads to the conclusion that *μ* = 2 is a proper signature of the brain of an awake subject [66]. In summary, the events characterized by the inter-event or hyperbolic waiting time PDF are considered to be a signature of cognition and are known to be responsible for the transport of information from one intelligent system to another [67,68]. The term crucial events is a proper nomenclature to acknowledge the importance of these rare events.

#### **5. Discussion and Conclusions**

We began this essay with the stated intent of supporting the Complexity Hypothesis by demonstrating to the reader why Newton's dynamic view of physical objects is not just inappropriate for living and social systems but its domain of application within the physical sciences is shrinking dramatically as well. The unexamined assumptions regarding the nature of space and time, with which Newton opened his *Principles,* make his force law invalid for the study of complex phenomena. Yet, these are the phenomena of interest to scientists in the 21st century, whether such phenomena reside in the physical, social, or life sciences.

As mentioned, Newton's equations have been shown to require changes when particles are moving very fast (approaching the speed of light), when the spatial scales are very large (cosmological) and when they are very small (quantum mechanical). In each of these domains the dynamic laws follow a correspondence principle in that they converge on Newton's laws by changing a parameter value to replicate the world of our five senses. Herein we have shown that in this world of experience we continually encounter deviations from Newton's laws at normal speeds and spatial scales, due to chaos. Chaotic dynamics led to replacement of the probability calculus of Kinetic Theory with that of FKT, as well as to operational time. One way to measure the degree of complexity generated by chaotic attractors is by using the entropy of the behavior.

Crutchfield et al. [69] interpreted the entropy of a dynamic process as the average rate of information generation by a chaotic process in that the more precisely an initial state of a system is specified, the more information one has available. The amount of information contained in the initial state is inversely proportional to the state space volume *Vi* localized by measurement. Trajectories initiated in a local volume of a regular attractor remain close to one another as the system evolves, and therefore no new information is generated, while the initial information is preserved in time. Consequently, the initial information can be used to predict the system's final state.

On the other hand, on a chaotic attractor the initial volume gets smeared out, consequently, as the system evolves the initial information is destroyed and replaced by newly created information. Thus, the volume in the specification of the initial system is eventually spread over the entire attractor and all predictive power is lost since the probability of being anywhere on the attractor is the same. All causal connection between the present state of the system and its future or final state is lost. This is referred to as the sensitive dependence on initial conditions.

Let us denote the final region of phase space the system occupies by *Vf* so that the change in the observable information Δ*I* is determined by the volume change from the initial to final state [70,71]:

$$
\Delta I = \log\_2\left(\frac{V\_f}{V\_i}\right). \tag{61}
$$

The time rate of information change (creation or dissipation) is therefore

$$\frac{dI}{dt} = \frac{1}{V} \frac{dV}{dt},\tag{62}$$

where the time-dependent volume *V* over which the initial conditions are spread determines the ultimate fate of the initial information. In regular, which is to say non-chaotic, systems the sensitivity of the flow to the initial conditions grows with time no more rapidly than a polynomial. Let Ω(*t*) be the number of states at time *t* that can be distinguished, and let *n* be the greatest polynomial index, so that Ω(*t*) ∝ *t<sup>n</sup>*. The ratio of the final to initial volume in such a system is equal to the relative number of states, independently of the time, $V_f/V_i = \Omega_f/\Omega_i$, so that the rate at which information changes is [71]:

$$\frac{dI}{dt} \sim \frac{n}{t}.\tag{63}$$

Thus, the rate of generation of new information decreases with time and converges to zero as *t* → ∞. As in Poincaré's quote in the Introduction, the final state is approximately predictable from the approximate initial information.

On the other hand, in chaotic systems two trajectories separate exponentially and therefore the number of distinguishable states grows exponentially with time Ω (*t*) ∝ exp (*λt*), where *λ* is the Liapunov coefficient. In this case, the rate at which information is generated is constant:

$$\frac{dI}{dt} \sim \lambda.\tag{64}$$

In this latter system, information is continuously generated by the attractor independently of the initial state. Nicolis and Tsuda [70] used this property of chaotic dynamic systems in the early modeling of cognitive systems using nonlinear dynamics and subsequently for information processing in neurophysiology, cognitive psychology, and perception [72].
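The constant information-generation rate of Equation (64) can be illustrated with a concrete chaotic map. The sketch below is ours: it estimates the Lyapunov exponent of the logistic map at full chaos, *x* → 4*x*(1 − *x*), as the orbit average of ln |*f*′(*x*)|; for this map the exponent is known analytically to be ln 2.

```python
import math

def lyapunov_logistic(r=4.0, x0=0.2, n=100000, transient=1000):
    """Estimate the Lyapunov exponent of the logistic map f(x) = r*x*(1-x)
    as the orbit average of ln|f'(x)| = ln|r - 2*r*x|."""
    x = x0
    for _ in range(transient):       # discard the transient
        x = r * x * (1 - x)
    acc = 0.0
    for _ in range(n):
        acc += math.log(abs(r - 2 * r * x))
        x = r * x * (1 - x)
    return acc / n

lam = lyapunov_logistic()
# Per Equation (64), the attractor generates information at the constant
# rate lam (nats per iteration); for r = 4 this converges to ln 2.
```

A regular orbit (for example, the stable fixed point at small *r*) would instead give a negative average, consistent with the vanishing asymptotic rate of Equation (63).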

Thus, Newton's statements about the absolute nature of space are contradicted by the chaotic trajectories entailed by his own force law when applied to complex systems. Subsequently, even Kinetic Theory and the introduction of stochastic differential equations, which were early attempts to make the differential calculus and complex phenomena compatible, could only be salvaged by means of the FC. In a similar way, Newton's statements regarding the absolute nature of time have been shown to have little place, if any, outside restricted domains of the physical sciences.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The author declares no conflicts of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Time Fractional Fisher–KPP and Fitzhugh–Nagumo Equations**

#### **Christopher N. Angstmann \* and Bruce I. Henry \***

School of Mathematics and Statistics, UNSW, Sydney 2052 NSW, Australia

**\*** Correspondence: c.angstmann@unsw.edu.au (C.N.A.); b.henry@unsw.edu.au (B.I.H.)

Received: 28 August 2020; Accepted: 12 September 2020; Published: 16 September 2020

**Abstract:** A standard reaction–diffusion equation consists of two additive terms, a diffusion term and a reaction rate term. The latter term is obtained directly from a reaction rate equation which is itself derived from known reaction kinetics, together with modelling assumptions such as the law of mass action for well-mixed systems. In formulating a reaction–subdiffusion equation, it is not sufficient to know the reaction rate equation. It is also necessary to know details of the reaction kinetics, even in well-mixed systems where reactions are not diffusion limited. This is because, at a fundamental level, birth and death processes need to be dealt with differently in subdiffusive environments. While there has been some discussion of this in the published literature, few examples have been provided, and there are still very many papers being published with Caputo fractional time derivatives simply replacing first order time derivatives in reaction–diffusion equations. In this paper, we formulate clear examples of reaction–subdiffusion systems, based on: equal birth and death rate dynamics; Fisher–Kolmogorov, Petrovsky and Piskunov (Fisher–KPP) equation dynamics; and Fitzhugh–Nagumo equation dynamics. These examples illustrate how to incorporate considerations of reaction kinetics into fractional reaction–diffusion equations. We also show how the dynamics of a system with birth rates and death rates cancelling, in an otherwise subdiffusive environment, are governed by a mass-conserving tempered time fractional diffusion equation that is subdiffusive for short times but standard diffusion for long times.

**Keywords:** fractional diffusion; continuous time random walks; reaction–diffusion equations; reaction kinetics

#### **1. Introduction**

Reaction–diffusion partial differential equations are among the most widely used equations in applied mathematics modelling. These equations govern the time evolution of concentrations, or population densities, of species, at different spatial locations, that are diffusing and reacting. Applications include the spatio-temporal spread of epidemics, the spatial spread of invasive species and the development of animal coat patterns [1–3]. In these modelling equations, diffusion is represented by a spatial Laplacian operating on the population densities, and reactions are included as additive terms representing changes per unit time in population densities through reaction rates. In well-mixed systems the reaction rate equations can often be derived from the law of mass-action [4]. A famous example of a reaction–diffusion equation is the Fisher–KPP equation named after Fisher [5] and Kolmogorov, Petrovsky and Piskunov [6]. The standard reaction–diffusion representation of this equation is

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} + r u(\mathbf{x},t) (1 - u(\mathbf{x},t)), \quad D > 0, r > 0. \tag{1}$$

Here, $u(\mathbf{x},t)$ represents the population density of a species, $D\frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2}$ represents the diffusion of the species and $ru(\mathbf{x},t)(1 - u(\mathbf{x},t))$ represents the reactions of the species. In the absence of diffusion, the time rate of change in the population density is the same at all points in space and is given by

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = r u(\mathbf{x},t)(1 - u(\mathbf{x},t)). \tag{2}$$

In this example and in the following, for simplicity, we have considered systems in one spatial dimension. Extensions to higher spatial dimensions are possible.
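For reference, Equation (2) has the closed-form logistic solution $u(t) = u_0 e^{rt}/(1 + u_0(e^{rt}-1))$. The following sketch (an illustration of ours, with arbitrary parameter values) checks a classical fourth-order Runge–Kutta integration of the rate equation against this closed form:

```python
import math

def logistic_exact(u0, r, t):
    """Closed-form solution of du/dt = r*u*(1-u)."""
    e = math.exp(r * t)
    return u0 * e / (1.0 + u0 * (e - 1.0))

def logistic_rk4(u0, r, t_end, dt=0.01):
    """Integrate du/dt = r*u*(1-u) with classical RK4."""
    f = lambda u: r * u * (1.0 - u)
    u = u0
    for _ in range(int(round(t_end / dt))):
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        u += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return u

u_num = logistic_rk4(0.1, 1.0, 5.0)
u_ref = logistic_exact(0.1, 1.0, 5.0)
```

With the step size shown, the integrator agrees with the exact solution to well below single-precision accuracy.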

Over the past two decades, there has been a growing awareness of fractional diffusion, where diffusion cannot be modelled using a standard Laplacian and the mean square displacement of diffusing species does not grow linearly in time, as anticipated by Einstein's famous modelling of Brownian motion [7]. In particular, following widespread observations in biological systems, there has been a great deal of attention focussed on fractional subdiffusion, characterized by the mean square displacement of a population spreading as a sublinear power law in time. It is now generally accepted that if subdiffusion arises from particles being trapped for arbitrarily long periods of time, the appropriate equation to model subdiffusion is the time fractional diffusion equation [8]

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = \, \_0D\_t^{1-\gamma} \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2}, \quad 0 < \gamma < 1,\tag{3}$$

which can be derived [9,10] from a continuous time random walk (CTRW) [11] with a power law waiting time density. In this equation,

$$\,\_0D\_t^{1-\gamma}y(\mathbf{x},t) = \frac{1}{\Gamma(\gamma)}\frac{\partial}{\partial t}\int\_0^t \frac{y(\mathbf{x},t')}{(t-t')^{1-\gamma}}dt' \tag{4}$$

is the Riemann–Liouville fractional derivative of order 1 − *γ*, see, for example, reference [12]. It might be anticipated that the appropriate evolution equation to model subdiffusion, with reactions governed by the reaction rate equation,

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = f(u(\mathbf{x},t)),\tag{5}$$

would be

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = \,\_0D\_t^{1-\gamma} \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} + f(u(\mathbf{x},t)). \tag{6}$$

Indeed, such an equation had been derived from an underlying CTRW model, under certain assumptions [13]; however, it is not valid in general. For example, the simple model equation

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = {}\_0D\_t^{1-\gamma} \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} - u(\mathbf{x},t),\tag{7}$$

can have unphysical negative solutions [14].

The time fractional subdiffusion equation is also often written as [15]

$$\frac{\partial^{\gamma}u(\mathbf{x},t)}{\partial t^{\gamma}} = \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2}, \quad 0 < \gamma < 1,\tag{8}$$

where

$$\frac{\partial^{\gamma}}{\partial t^{\gamma}}y(\mathbf{x},t) = \frac{1}{\Gamma(1-\gamma)} \int\_{0}^{t} \frac{\frac{\partial}{\partial t'} y(\mathbf{x},t')}{(t-t')^{\gamma}} \,dt' \tag{9}$$

denotes a Caputo fractional derivative, see, for example, reference [12]. There has been quite a bit written in the published literature on the greater physical practicality of the Caputo derivative over the Riemann–Liouville derivative, but this is largely unfounded [12]. Note, however, that if one takes Equation (8) as the starting evolution equation for subdiffusion then this is suggestive of the following reaction–subdiffusion equation,

$$\frac{\partial^{\gamma}u(\mathbf{x},t)}{\partial t^{\gamma}} = \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} + f(u(\mathbf{x},t)). \tag{10}$$

Equations along the lines of Equation (10) are particularly widespread in the literature with the motivation that fractional derivatives incorporate a history dependence, and solutions of Equation (10) remain positive. Equation (10) can be derived from a CTRW where particles are being removed or added instantaneously at the start of the waiting times between jumps, but only under the contrived constraint that $\frac{\partial^{1-\gamma} f(u(\mathbf{x},t))}{\partial t^{1-\gamma}}$ represents the cumulative total of additions and removals to the arrival density of particles at position $\mathbf{x}$ and time $t$ [14].
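The Caputo derivative of Equation (9) is straightforward to approximate numerically with the standard L1 discretization. The sketch below (our own illustration, not from the article) validates the scheme on $y(t) = t$, whose Caputo derivative of order $\gamma$ is $t^{1-\gamma}/\Gamma(2-\gamma)$:

```python
import math

def caputo_l1(y, dt, gamma):
    """L1 approximation of the Caputo derivative of order gamma (0 < gamma < 1)
    for samples y[0..n] on a uniform grid, evaluated at the final time."""
    n = len(y) - 1
    coeff = dt**(-gamma) / math.gamma(2.0 - gamma)
    total = 0.0
    for k in range(n):
        b_k = (k + 1)**(1.0 - gamma) - k**(1.0 - gamma)
        total += b_k * (y[n - k] - y[n - k - 1])
    return coeff * total

gamma, t_end, n = 0.6, 1.0, 100
dt = t_end / n
samples = [j * dt for j in range(n + 1)]          # y(t) = t
num = caputo_l1(samples, dt, gamma)
exact = t_end**(1.0 - gamma) / math.gamma(2.0 - gamma)
# The L1 scheme integrates piecewise-linear data exactly, so num = exact
# up to rounding for this test function.
```

The L1 weights $b_k$ telescope for a linear test function, which is why the agreement here is exact rather than merely first order.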

The derivation of reaction–subdiffusion equations from physically consistent CTRWs has been carried out in a series of papers [14,16–26]. The main lessons from this body of work are: (i) The governing equations are different depending on whether or not new born particles inherit the waiting times of their parents. (ii) Birth terms and death terms must be treated differently. (iii) In the case where particles are removed, but not instantaneously at the start of the waiting time between jumps, the reaction and subdiffusion terms are not additive. The following equation [21,24],

$$\begin{split} \frac{\partial u(\mathbf{x},t)}{\partial t} &= D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-\int\_0^t a(u(\mathbf{x},t'), \mathbf{x}, t') \, dt'} \, \_0 \mathcal{D}\_t^{1-\gamma} \left( e^{\int\_0^t a(u(\mathbf{x},t'), \mathbf{x}, t') \, dt'} u(\mathbf{x}, t) \right) \right] \\ &\quad + c(u(\mathbf{x},t), \mathbf{x}, t) - a(u(\mathbf{x},t), \mathbf{x}, t)\, u(\mathbf{x}, t), \end{split} \tag{11}$$

which was derived from a continuous time random walk model, provides the evolution equation for particles undergoing subdiffusion with particles annihilated at a per capita rate, *a*(*u*(*x*, *t*), *x*, *t*) and created at a rate *c*(*u*(*x*, *t*), *x*, *t*). In the derivation of this equation it was assumed that newborn particles do not inherit the waiting times of their parents.

In the remainder of this paper we explore examples related to Equation (11). These examples have been selected to emphasize the importance of considering the details of the reaction kinetics when dealing with reaction–subdiffusion problems. Whilst there have been many papers published on various methods of solution for variants of Equation (10) (see, for example, [27–31]), there have been very few papers published considering algebraic or numerical solution methods for variants of Equation (11). We hope that the examples below will stimulate further activity in this area, where the physical motivation for the modelling equation is stronger.

#### **2. Examples**

#### *2.1. Birth and Death Balance*

As a first example, we consider a population density of *u*(*x*, *t*) particles per unit volume that are diffusing with a per capita death rate *α* and a birth rate *αu*(*x*, *t*). The reaction rate equation reflecting this balance between births and deaths, in a well-mixed population, at a location *x* is

$$\frac{du(\mathbf{x},t)}{dt} = 0,\tag{12}$$

and thus the standard reaction–diffusion equation describing this system is

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2}. \tag{13}$$

The simple generalization of this equation for subdiffusive transport is

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D\_{\gamma}\, \_0\mathcal{D}\_t^{1-\gamma} \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} = D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ \,\_0 \mathcal{D}\_t^{1-\gamma} u(\mathbf{x}, t) \right]. \tag{14}$$

Indeed, if there were no births or deaths then the reaction rate equation would still be given by Equation (12); and Equation (14) is the appropriate equation to describe subdiffusion without births or deaths. However, the reaction–subdiffusion equation, following Equation (11), and using the reaction rate kinetics *a*(*u*(*x*, *t*), *x*, *t*) = *α* and *c*(*u*(*x*, *t*), *x*, *t*) = *αu*(*x*, *t*), which are also consistent with the rate equation, Equation (12), is remarkably different:

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-\alpha t} \,\_0 \mathcal{D}\_t^{1-\gamma} \left( e^{\alpha t} u(\mathbf{x},t) \right) \right], \quad \alpha > 0. \tag{15}$$

The fundamental difference between Equations (14) and (15) is that in the former equation the Laplacian operates on a time fractional derivative and in the latter the Laplacian operates on a tempered time fractional derivative [32,33]. In the more general time fractional reaction–diffusion equation, Equation (11), the term in brackets following the Laplacian defines a generalized tempered time fractional derivative. The physical interpretation of the tempering is that if particles are being annihilated at a given rate while they wait then they cannot wait an arbitrarily long time at a given location. Note that both Equations (14) and (15) are mass conserving and thus Equation (15) then defines a mass conserving, tempered, time fractional diffusion equation.
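The mass-conserving property of Equation (14) can be verified directly with a simple explicit scheme. The sketch below is a generic Grünwald–Letnikov discretization on a periodic grid (our own illustration, with arbitrary parameters, and not a solution method proposed in this article); because the Laplacian acts on the fractional term, the spatial sum of *u* is conserved to rounding error:

```python
def gl_weights(alpha, n):
    """Grunwald-Letnikov weights (-1)^k * binom(alpha, k) for k = 0..n."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (k - 1.0 - alpha) / k)
    return w

def step_fractional_diffusion(history, w, d_gamma, dt, dx, gamma):
    """One explicit step of du/dt = D_gamma d^2/dx^2 [ 0D_t^{1-gamma} u ]
    on a periodic grid; history[m] holds the grid values at time m*dt."""
    n_t = len(history) - 1
    n_x = len(history[0])
    # Riemann-Liouville derivative of order 1-gamma at each site (GL form)
    frac = [dt**(gamma - 1.0) * sum(w[k] * history[n_t - k][i] for k in range(n_t + 1))
            for i in range(n_x)]
    lap = [(frac[(i + 1) % n_x] - 2.0 * frac[i] + frac[(i - 1) % n_x]) / dx**2
           for i in range(n_x)]
    return [history[-1][i] + dt * d_gamma * lap[i] for i in range(n_x)]

gamma, d_gamma, dx, dt, steps, n_x = 0.5, 1.0, 0.1, 1e-6, 100, 50
u0 = [0.0] * n_x
u0[n_x // 2] = 1.0 / dx                       # discrete delta initial condition
history = [u0]
w = gl_weights(1.0 - gamma, steps)
for _ in range(steps):
    history.append(step_fractional_diffusion(history, w, d_gamma, dt, dx, gamma))
mass0 = sum(u0) * dx
mass1 = sum(history[-1]) * dx
```

On a periodic grid the discrete Laplacian sums to zero exactly, so the conservation check is independent of the (small) time step chosen here for stability.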

The mean square displacement of the diffusing particles, $\langle \mathbf{x}^2(t)\rangle$, provides a clear measurable difference between particles following Equation (14) or Equation (15). In the former case, identified as $\langle \mathbf{x}\_I^2(t)\rangle$, we have [8],

$$
\langle \mathbf{x}\_I^2(t) \rangle = \frac{2D\_\gamma}{\Gamma(1+\gamma)} t^\gamma,\tag{16}
$$

and in the latter case, identified as $\langle \mathbf{x}\_{II}^2(t)\rangle$, we have (Appendix A)

$$
\langle \mathbf{x}\_{II}^2(t) \rangle = 2D\_{\gamma} e^{-at} t^{\gamma} E\_{1,\gamma}^{(1)}(at), \tag{17}
$$

where

$$E\_{1,\gamma}^{(1)}(z) = \frac{d}{dz} \sum\_{k=0}^{\infty} \frac{z^k}{\Gamma(\gamma + k)}\tag{18}$$

is the derivative of a generalized Mittag–Leffler function [34]. Note that at short times,

$$
\langle \mathbf{x}\_{II}^{2}(t) \rangle \sim \frac{2D\_{\gamma}}{\Gamma(1+\gamma)} t^{\gamma}, \tag{19}
$$

but at large times, using the asymptotic expansion of the generalized Mittag–Leffler function (Equation (6) in [35]),

$$
\langle \mathbf{x}\_{II}^2(t) \rangle \sim 2D\_{\gamma} a^{1-\gamma} t. \tag{20}
$$

Thus, mass conserving tempered time fractional diffusion is not anomalous at long times.
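The crossover between the short-time behaviour of Equation (19) and the long-time behaviour of Equation (20) can be checked by summing the series of Equation (18) directly. The sketch below (our own illustration, with arbitrary parameter values) evaluates Equation (17) and compares it with both asymptotic forms:

```python
import math

def ml_one_gamma_deriv(z, gamma, kmax=170):
    """E^{(1)}_{1,gamma}(z) = d/dz sum_k z^k / Gamma(gamma + k)
                            = sum_{k>=1} k z^{k-1} / Gamma(gamma + k)."""
    return sum(k * z**(k - 1) / math.gamma(gamma + k) for k in range(1, kmax))

def msd(t, a=1.0, d_gamma=1.0, gamma=0.5):
    """Mean square displacement of Equation (17)."""
    return 2.0 * d_gamma * math.exp(-a * t) * t**gamma * ml_one_gamma_deriv(a * t, gamma)

gamma, a, d_gamma = 0.5, 1.0, 1.0
# Ratio against the short-time subdiffusive form, Equation (19)
short = msd(1e-4) / (2.0 * d_gamma * (1e-4)**gamma / math.gamma(1.0 + gamma))
# Ratio against the long-time standard-diffusion form, Equation (20)
long_ = msd(30.0) / (2.0 * d_gamma * a**(1.0 - gamma) * 30.0)
# Both ratios approach 1 in their respective limits.
```

The series is truncated at 170 terms to stay within the overflow range of `math.gamma`, which is ample for the arguments used here.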

We can also write down explicit expressions for solutions to Equations (14) and (15), labelled as $u\_I(\mathbf{x},t)$ and $u\_{II}(\mathbf{x},t)$, respectively. For simplicity we consider the infinite domain Greens function solutions with initial condition $u(\mathbf{x},0) = \delta(\mathbf{x})$.

The Greens function solution of the fractional diffusion equation Equation (14) can be written as [8]

$$u\_I(\mathbf{x}, t) = \frac{1}{\sqrt{4\pi D\_\gamma t^\gamma}} H\_{1,2}^{2,0} \left[ \frac{\mathbf{x}^2}{4D\_\gamma t^\gamma} \,\middle|\, \begin{array}{c} \left(1 - \frac{\gamma}{2}, \gamma\right) \\ \left(0, 1\right)\ \left(\frac{1}{2}, 1\right) \end{array} \right] \tag{21}$$

where *H* denotes a Fox H-function [36], see Equation (A11).

To find the Greens function solution $u\_{II}(\mathbf{x},t)$ we first note that Equation (15) can be re-written as

$$\frac{\partial w(\mathbf{x},t)}{\partial t} = D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \, \_0 \mathcal{D}\_t^{1-\gamma} w(\mathbf{x},t) + a\, w(\mathbf{x},t),\tag{22}$$

where

$$w(\mathbf{x},t) = e^{at} u\_{II}(\mathbf{x},t). \tag{23}$$

The Greens function solution of Equation (22) can be obtained as a special case of the more general results in Appendix B of [14], yielding

$$w(\mathbf{x},t) = \frac{1}{\sqrt{4\pi D\_{\gamma}t^{\gamma}}} \sum\_{j=0}^{\infty} \frac{(at)^{j}}{j!} H\_{1,2}^{2,0} \left[ \frac{\mathbf{x}^{2}}{4D\_{\gamma}t^{\gamma}} \,\middle|\, \begin{array}{c} \left(1 - \frac{\gamma}{2} + j, \gamma\right) \\ \left(0,1\right)\ \left(\frac{1}{2} + j, 1\right) \end{array} \right] \tag{24}$$

and then using Equation (23) we have

$$u\_{II}(\mathbf{x},t) = e^{-at} \frac{1}{\sqrt{4\pi D\_{\gamma}t^{\gamma}}} \sum\_{j=0}^{\infty} \frac{(at)^{j}}{j!} H\_{1,2}^{2,0} \left[ \frac{\mathbf{x}^{2}}{4D\_{\gamma}t^{\gamma}} \,\middle|\, \begin{array}{c} \left(1 - \frac{\gamma}{2} + j, \gamma \right) \\ \left(0, 1 \right)\ \left(\frac{1}{2} + j, 1 \right) \end{array} \right]. \tag{25}$$

In Appendix B, we show that the Fox functions in Equations (21) and (25) can be simplified for $\gamma = \frac{1}{2}$ in terms of Meijer G-functions [37], see Equation (A12), which have the advantage that they can readily be evaluated using computer algebra packages such as MATHEMATICA and MAPLE. Using the result of Equation (A19) from Appendix B, we can write (see also [8] in the case of $u\_I(\mathbf{x},t)$)

$$u\_I(\mathbf{x}, t) = \frac{1}{\sqrt{8\pi^3 D t^{\frac{1}{2}}}} G\_{0,3}^{3,0} \left[ \left( \frac{\mathbf{x}^2}{16D t^{\frac{1}{2}}} \right)^2 \bigg| \begin{array}{c} - \\ 0, \frac{1}{4}, \frac{1}{2} \end{array} \right],\tag{26}$$

and

$$u\_{II}(\mathbf{x},t) = e^{-at} \frac{1}{\sqrt{8\pi^3 Dt^{\frac{1}{2}}}} \sum\_{j=0}^{\infty} \frac{(2at)^j}{j!} G\_{1,4}^{4,0} \left[ \left(\frac{\mathbf{x}^2}{16Dt^{\frac{1}{2}}}\right)^2 \Bigg| \begin{array}{c} \frac{3}{4} + j \\ 0, \frac{1}{2}, \frac{1}{4} + \frac{j}{2}, \frac{3}{4} + \frac{j}{2} \end{array} \right]. \tag{27}$$

Note that the expression for $u\_{II}(\mathbf{x},t)$ simplifies to the expression for $u\_I(\mathbf{x},t)$ if $\alpha = 0$. If $|\mathbf{x}| \gg 4\sqrt{Dt^{\frac{1}{2}}}$ then we can use asymptotic expansions for $G\_{0,3}^{3,0}(z)$ and $G\_{1,4}^{4,0}(z)$ with $z \gg 1$ (see Appendix B) to write

$$u\_I(\mathbf{x}, t) \sim \frac{1}{\sqrt{8\pi^3 Dt^{\frac{1}{2}}}} \exp\left(-3\left(\frac{\mathbf{x}^2}{16Dt^{\frac{1}{2}}}\right)^{\frac{2}{3}}\right) \left(\frac{\mathbf{x}^2}{16Dt^{\frac{1}{2}}}\right)^{\frac{1}{2}} \frac{M\_0}{\left(\frac{\mathbf{x}^2}{16Dt^{\frac{1}{2}}}\right)^{\frac{2}{3}} - 1},\tag{28}$$

and

$$
u\_{II}(\mathbf{x}, t) \sim M e^{at} u\_I(\mathbf{x}, t), \tag{29}
$$

where $M\_0$ and $M$ are constant terms. The solutions $u\_I(\mathbf{x},t)$, Equation (26), and $u\_{II}(\mathbf{x},t)$, Equation (27), are plotted in Figure 1, with $\alpha = 1$ and $D = 1$, at times $t = 0.1$, $t = 1.0$ and $t = 10.0$. The solutions are very similar at early times but the corner at the origin, which is characteristic of subdiffusion, is less sharp at longer times in the solution of Equation (27).

**Figure 1.** Plots of Equation (26), the algebraic solution to Equation (14), (**left**), and Equation (27), the algebraic solution to Equation (15), (**right**), at times *t* = 0.1 (solid line), *t* = 1.0 (dashed line) and *t* = 10.0 (bold solid line). The reaction parameter *α* = 1, and the fractional order derivative is taken to be *γ* = 0.5 in each of these plots.

The lesson from this simple example is that reaction dynamics equations do not contain sufficient information on their own to provide model equations for reaction–subdiffusion systems even in well-mixed systems. In the case of standard diffusion, the evolution of the population density is only affected by the overall reaction rates, in a well-mixed system, but not the details of the reaction kinetics. In a standard reaction–diffusion system, the dynamics with no births and no deaths is the same as if there were births and deaths but the rates cancelled out. The reaction–diffusion equation with these reaction kinetics has no memory of the birth and death processes. This is very different in the case of subdiffusion where the details of the reaction kinetics are important to the overall dynamics of the system. The subdiffusive system retains a memory that there were particles that were created and annihilated. Moreover, the particle deaths temper the fractional diffusion. The example in the next section further highlights the significance of the reaction kinetics in reaction–subdiffusion systems.
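Equation (26) can also be evaluated outside of commercial computer algebra systems; for instance, the Python library mpmath provides the Meijer G-function. The sketch below (our own illustration, using mpmath rather than MATHEMATICA or MAPLE, with $D = t = 1$) evaluates $u\_I(\mathbf{x},t)$ for $\gamma = \frac{1}{2}$ and checks that the Greens function carries unit mass:

```python
from mpmath import mp, meijerg, quad, sqrt, pi, mpf

mp.dps = 15
D, t = mpf(1), mpf(1)

def u_I(x):
    """Equation (26): the gamma = 1/2 Greens function of Equation (14)."""
    z = (x**2 / (16 * D * sqrt(t)))**2
    pref = 1 / sqrt(8 * pi**3 * D * sqrt(t))
    return pref * meijerg([[], []], [[0, mpf(1)/4, mpf(1)/2], []], z)

# Mass conservation: the Greens function should integrate to 1 over the line.
# The integrand decays like exp(-3 z^(2/3)), so a finite cutoff suffices.
total_mass = 2 * quad(u_I, [0, 1, 5, 30])
```

The interval subdivision passed to `quad` is an arbitrary choice that helps the quadrature resolve the sharp corner at the origin.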

#### *2.2. Fractional Fisher–KPP Equation*

The reaction rate equation for the Fisher–KPP Equation (1) is given in Equation (2). There are many different reaction kinetics that could be considered that are consistent with Equation (2). For example, the term (1 − *u*(*x*, *t*)) in its entirety could represent a per capita birth rate if it is strictly positive, or a per capita death rate if it is strictly negative. This term could also be regarded as being composed of two terms, a constant per capita birth term and a linear per capita death term. These three possibilities are highlighted for illustrative purposes below to show how different subdiffusion–reaction equations apply depending on the reaction kinetics.

(i) Constant per capita birth rate, *c*(*u*(*x*, *t*), *x*, *t*) = *ru*(*x*, *t*), linear per capita death rate, *a*(*u*(*x*, *t*), *x*, *t*) = *ru*(*x*, *t*),

$$\begin{array}{rcl}\frac{\partial u(\mathbf{x},t)}{\partial t} &=& D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-\int\_0^t ru(\mathbf{x},t') \, dt'} \, \_0 \mathcal{D}\_t^{1-\gamma} \left( e^{\int\_0^t ru(\mathbf{x},t') \, dt'} u(\mathbf{x},t) \right) \right] \\ &+ r u(\mathbf{x},t) (1 - u(\mathbf{x},t)), \quad u(\mathbf{x},t) \ge 0 \end{array} \tag{30}$$

*Entropy* **2020**, *22*, 1035

(ii) No births, *c*(*u*(*x*, *t*), *x*, *t*) = 0, linear per capita death rate, *a*(*u*(*x*, *t*), *x*, *t*) = *r*(*u*(*x*, *t*) − 1),

$$\begin{array}{rcl}\frac{\partial u(\mathbf{x},t)}{\partial t} &=& D\_{\gamma} \frac{\partial^{2}}{\partial \mathbf{x}^{2}} \left[ e^{-\int\_{0}^{t} r(u(\mathbf{x},t')-1) \, dt'} \, \_{0} \mathcal{D}\_{t}^{1-\gamma} \left( e^{\int\_{0}^{t} r(u(\mathbf{x},t')-1) \, dt'} u(\mathbf{x},t) \right) \right] \\ && + \; r(1-u(\mathbf{x},t)) u(\mathbf{x},t), \quad u(\mathbf{x},t) \ge 1. \end{array} \tag{31}$$

(iii) Linear per capita birth rate, *c*(*u*(*x*, *t*), *x*, *t*) = *ru*(*x*, *t*)(1 − *u*(*x*, *t*)), no deaths, *a*(*u*(*x*, *t*), *x*, *t*) = 0,

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ \,\_0 \mathcal{D}\_t^{1-\gamma} u(\mathbf{x},t) \right] + r u(\mathbf{x},t) (1 - u(\mathbf{x},t)), \quad 0 \le u(\mathbf{x},t) \le 1 \,. \tag{32}$$

Note that none of the fractional Fisher–KPP reaction–diffusion equations can be expressed in the form

$$\frac{\partial^{\gamma}u(\mathbf{x},t)}{\partial t^{\gamma}} = D\_{\gamma} \frac{\partial^2 u(\mathbf{x},t)}{\partial \mathbf{x}^2} + ru(\mathbf{x},t)(1 - u(\mathbf{x},t)),\tag{33}$$

which results from simply replacing the integer order time derivative with a fractional order Caputo derivative. As noted above, an equation of this form could only be obtained from a CTRW if $\frac{\partial^{1-\gamma}}{\partial t^{1-\gamma}}\left(ru(\mathbf{x},t)(1-u(\mathbf{x},t))\right)$ is contrived as the cumulative instantaneous creation and annihilation of particles at the start of the waiting time between particle jumps at position $\mathbf{x}$ and time $t$ [14].

The Greens function solutions for the nonlinear fractional reaction–diffusion equations, Equations (30)–(32), cannot be obtained simply using Fourier–Laplace transform methods. However, it is possible to find numerical solutions using the discrete time random walk methods described in [38].

The Fisher–KPP reaction rate equation, Equation (2), can be motivated by different chemical reactions consistent with the law of mass action [4]. One possibility is that of a single species $A$ which undergoes coalescence reactions $A + A \xrightarrow{r} A$, and degradation reactions $A \xrightarrow{r} A + A$; also referred to as reversible coagulation dynamics [39]. In this scenario the creation term, $ru(\mathbf{x},t)$, arises from degradation and the annihilation term, $-ru^2(\mathbf{x},t)$, arises from coalescence. Another possibility is a branching–coalescence scheme [17], $B + X \rightleftharpoons X + X$, with the concentration of $B$ maintained at a constant level. Equation (30) is a fractional Fisher–KPP reaction–diffusion equation consistent with each of the reaction schemes described here and it was obtained earlier for the branching–coalescence reaction scheme in [17].
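The mass-action correspondence can be checked with a stochastic simulation. The sketch below (our own illustration; the finite-volume parameter mapping is an assumption of the sketch, not part of the article) runs a Gillespie simulation of the degradation and coalescence channels and confirms that the scaled population $n/V$ settles near the carrying capacity $u = 1$ of Equation (2):

```python
import random

def gillespie_logistic(r=1.0, volume=2000, n0=100, t_end=10.0, seed=7):
    """Exact stochastic simulation of A -> A + A (propensity r*n) and
    A + A -> A (propensity r*n*(n-1)/volume); the large-volume mean-field
    limit of n/volume obeys du/dt = r*u*(1-u), i.e. Equation (2)."""
    rng = random.Random(seed)
    n, t = n0, 0.0
    while t < t_end and n > 0:
        a_birth = r * n                         # degradation channel
        a_death = r * n * (n - 1) / volume      # coalescence channel
        a_total = a_birth + a_death
        t += rng.expovariate(a_total)           # time to next reaction
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
    return n / volume

u_final = gillespie_logistic()
# The mean-field fixed point is u = 1; fluctuations are O(volume^{-1/2}).
```

With a volume of 2000 particles the relative fluctuations about the fixed point are of order a few percent.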

#### *2.3. Fractional Fitzhugh–Nagumo Equation*

A widely studied reaction–diffusion system used to model wave propagation and pattern formation in excitable media is the Fitzhugh–Nagumo system of equations [40,41]

$$\frac{\partial v(\mathbf{x},t)}{\partial t} = D\_v \frac{\partial^2 v(\mathbf{x},t)}{\partial \mathbf{x}^2} + v(\mathbf{x},t)(v(\mathbf{x},t) - a)(1 - v(\mathbf{x},t)) - w(\mathbf{x},t), \quad D\_v \ge 0, a \ge 0 \tag{34}$$

$$\frac{\partial w(\mathbf{x},t)}{\partial t} = D\_w \frac{\partial^2 w(\mathbf{x},t)}{\partial \mathbf{x}^2} + \epsilon \left( v(\mathbf{x},t) - bw(\mathbf{x},t) \right), \quad D\_w \ge 0, \epsilon \ge 0, b \ge 0,\tag{35}$$

named after Fitzhugh [42] and Nagumo [43]. In recent years, the single component fractional equation

$$\frac{\partial^{\mu}u(\mathbf{x},t)}{\partial t^{\mu}} = D\_{\mu} \frac{\partial^{2}u(\mathbf{x},t)}{\partial \mathbf{x}^{2}} + u(\mathbf{x},t)(u(\mathbf{x},t) - a)(1 - u(\mathbf{x},t)) \tag{36}$$

has been studied as a test equation for various methods of solution of time fractional reaction–diffusion equations (see, for example, [27,30] and references therein).

A time fractional Fitzhugh–Nagumo system of equations consistent with Equation (11), derived from a CTRW formalism, can be obtained by identifying per capita annihilation rates, *av* and *aw*, and creation rates, *cv* and *cw*, as follows:

$$a\_v(v(\mathbf{x},t), w(\mathbf{x},t)) = a + v^2(\mathbf{x},t) + \frac{w(\mathbf{x},t)}{v(\mathbf{x},t)}, \tag{37}$$

$$c\_v(v(\mathbf{x},t), w(\mathbf{x},t)) = (1+a)v^2(\mathbf{x},t),\tag{38}$$

$$a\_w(v(\mathbf{x}, t), w(\mathbf{x}, t)) = \epsilon b, \tag{39}$$

$$c\_w(v(\mathbf{x},t), w(\mathbf{x},t)) = \epsilon v(\mathbf{x},t). \tag{40}$$

The corresponding time fractional Fitzhugh–Nagumo system is given by

$$\begin{split} \frac{\partial v(\mathbf{x},t)}{\partial t} &= D\_{v,\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[e^{-\int\_0^t \left(v^2(\mathbf{x},t') + a + \frac{w(\mathbf{x},t')}{v(\mathbf{x},t')}\right) dt'} \, \_0\mathcal{D}\_t^{1-\gamma} \left(e^{\int\_0^t \left(v^2(\mathbf{x},t') + a + \frac{w(\mathbf{x},t')}{v(\mathbf{x},t')}\right) dt'} v(\mathbf{x},t)\right)\right] \\ &\quad + v(\mathbf{x},t)(v(\mathbf{x},t)-a)(1-v(\mathbf{x},t)) - w(\mathbf{x},t), \end{split}\tag{41}$$

$$\frac{\partial w(\mathbf{x},t)}{\partial t} = D\_{w,\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-\epsilon bt} \,\_0\mathcal{D}\_t^{1-\gamma} \left( e^{\epsilon bt} w(\mathbf{x},t) \right) \right] + \epsilon v(\mathbf{x},t) - \epsilon b w(\mathbf{x},t). \tag{42}$$

If *w*(*x*, *t*) = 0 this identifies a single component time fractional equation

$$\begin{split} \frac{\partial u(\mathbf{x},t)}{\partial t} &= D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-\int\_0^t (u^2(\mathbf{x},t') + a) \, dt'} \, \_0 \mathcal{D}\_t^{1-\gamma} \left( e^{\int\_0^t (u^2(\mathbf{x},t') + a) \, dt'} u(\mathbf{x},t) \right) \right] \\ &\quad + u(\mathbf{x},t) (u(\mathbf{x},t) - a) (1 - u(\mathbf{x},t)) \end{split} \tag{43}$$

which could be called a time fractional Fitzhugh–Nagumo equation, although the nomenclature could be misleading because a single component equation, without external sources or sinks, could not display Fitzhugh–Nagumo dynamics. Equation (43) is, however, well posed as a nonlinear time fractional reaction–diffusion equation that can be derived from a physically consistent CTRW, and thus it should be preferred for testing numerical methods of solution over the single component model equation, Equation (36), obtained by replacing an integer order time derivative with a Caputo fractional order derivative.

#### **3. Discussion**

Over the past two decades there have been large numbers of papers published on numerical methods for nonlinear fractional reaction–diffusion equations. The original motivation for including time fractional derivatives in reaction–diffusion equations was based on a CTRW description of diffusion with traps and reactions [13]. This description was refined and improved in a series of papers [14,16–24], leading to the formulation of time fractional reaction–diffusion equations along the lines of Equation (11). However, many investigations of time fractional reaction–diffusion equations have been carried out on systems obtained by simply replacing integer order time derivatives with Caputo fractional order derivatives. These studies may be interesting from a mathematical analysis point of view but they may not be directly relevant to mathematical modelling applications.

In this paper we have illustrated, through examples, how different time fractional reaction–diffusion equations can be formulated, consistent with an underlying CTRW formalism, taking into account the reaction kinetics. There are three points worth noting in this context: (i) The fractional reaction–diffusion systems considered in this approach are relevant to well-mixed reactions that are not diffusion limited. The reaction dynamics can often be formulated using the law of mass action in these systems. (ii) Different time fractional reaction–diffusion systems can be formulated that are consistent with the same equation for the reaction dynamics. It is important to know the reaction kinetics. (iii) Reaction–subdiffusion equations typically involve a spatial Laplacian operating on a generalized tempered time fractional derivative. The solution of these types of equations would typically require

very different numerical approaches than those proposed for reaction–diffusion systems with a fractional order Caputo time derivative replacing the integer order time derivative.

It is hoped that the physically motivated time fractional reaction–diffusion equations, such as Equations (30) and (43), will become more widely used, replacing the simpler ad-hoc equations, such as Equations (33) and (36), as a test for different methods of solution of nonlinear fractional reaction–diffusion systems. Beyond this, there is a real need for physical experiments to be devised and carried out to validate and calibrate time fractional reaction–diffusion models.

**Author Contributions:** The authors have contributed equally to all aspects of this work, including conceptualization, methodology, formal analysis, writing, project administration and funding acquisition. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Australian Commonwealth Government ARC DP200100345.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Mean Square Displacements**

The mean square displacement of particles evolving according to the fractional diffusion equation

$$\frac{\partial u(\mathbf{x},t)}{\partial t} = D\_{\gamma} \frac{\partial^2}{\partial \mathbf{x}^2} \left[ e^{-at} \,\_0 \mathcal{D}\_t^{1-\gamma} \left( e^{at} u(\mathbf{x},t) \right) \right], \quad a > 0, \tag{A1}$$

can simply be obtained from the infinite domain Greens function solution *G*(*x*, *t*) with initial condition *G*(*x*, 0) = *δ*(*x*), via

$$
\langle \mathbf{x}^2(t) \rangle = \lim\_{q \to 0} -\frac{d^2}{dq^2} \hat{G}(q, t), \tag{A2}
$$

where $\hat{G}(q,t)$ denotes the Fourier transform of $G(\mathbf{x},t)$ w.r.t. $\mathbf{x}$. We begin by taking the Fourier transform of Equation (A1) and re-arranging terms to write

$$\frac{\partial}{\partial t} \left( e^{at} \hat{G}(q, t) \right) = -q^2 D_{\gamma} \, {}_0\mathcal{D}_t^{1-\gamma} \left( e^{at} \hat{G}(q, t) \right) + a e^{at} \hat{G}(q, t). \tag{A3}$$

We now introduce

$$\hat{F}(q,t) = e^{at} \hat{G}(q,t), \tag{A4}$$

noting that *F*ˆ(*q*, 0) = *G*ˆ(*q*, 0) = 1, and then

$$\langle x^2(t) \rangle = e^{-at} \lim_{q \to 0} -\frac{d^2}{dq^2} \hat{F}(q, t). \tag{A5}$$

Starting with the differential equation for *F*ˆ(*q*, *t*),

$$\frac{\partial}{\partial t} \hat{F}(q, t) = -q^2 D_{\gamma} \, {}_0\mathcal{D}_t^{1-\gamma} \hat{F}(q, t) + a \hat{F}(q, t), \tag{A6}$$

we take the Laplace transform w.r.t. time and rearrange terms to write

$$\hat{F}(q, s) = \frac{1}{s + D_{\gamma} q^2 s^{1-\gamma} - a}. \tag{A7}$$

From this we have

$$\begin{split} \lim_{q \to 0} -\frac{d^2}{dq^2} \hat{F}(q,s) &= \frac{2D_{\gamma}}{s^{\gamma-1}a^2 - 2s^{\gamma}a + s^{\gamma+1}},\\ &= 2D_{\gamma} \frac{s^{1-\gamma}}{(s-a)^2}. \end{split} \tag{A8}$$

We now take the inverse Laplace transform using Equation (2.3.26) of [34] to write

$$\lim_{q \to 0} -\frac{d^2}{dq^2} \hat{F}(q, t) = 2D_{\gamma} t^{\gamma} E_{1, \gamma}^{(1)}(at), \tag{A9}$$

and then

$$\langle x^2(t) \rangle = 2D_{\gamma} e^{-at} t^{\gamma} E_{1,\gamma}^{(1)}(at). \tag{A10}$$
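As a numerical illustration, the closed form in Equation (A10) can be evaluated by truncating the differentiated Mittag-Leffler series. The sketch below is ours, not part of the original analysis, and the function names are invented; it exercises two limits of the formula: for *γ* = 1 it reduces to normal diffusion, 2*Dγt*, for any tempering parameter *a*, and for *a* → 0 it reduces to subdiffusion, 2*Dγt<sup>γ</sup>*/Γ(1 + *γ*).

```python
import math

def ml_deriv(z, gamma, terms=80):
    """First derivative of the two-parameter Mittag-Leffler function,
    E_{1,gamma}(z) = sum_{k>=0} z^k / Gamma(k + gamma), truncated after
    `terms` terms of the differentiated series."""
    return sum(k * z ** (k - 1) / math.gamma(k + gamma) for k in range(1, terms))

def msd(t, D_gamma, gamma, a):
    """Mean square displacement of Equation (A10):
    <x^2(t)> = 2 D_gamma e^{-a t} t^gamma E'_{1,gamma}(a t)."""
    return 2.0 * D_gamma * math.exp(-a * t) * t ** gamma * ml_deriv(a * t, gamma)
```

For *γ* = 1 the series sums to *e<sup>at</sup>* and the exponential factors cancel, so `msd(t, D, 1.0, a)` returns 2*Dt* for any *a*, as expected for normal diffusion.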

#### **Appendix B. Fox H-Function and Meijer G-Function Solutions**

The Fox H-function and the Meijer G-function are defined by contour integrals of the Mellin–Barnes type [36]

$$H_{p,q}^{m,n} \left[ z \left| \begin{array}{c} (a_1, A_1)(a_2, A_2) \dots (a_p, A_p) \\ (b_1, B_1)(b_2, B_2) \dots (b_q, B_q) \end{array} \right. \right] = \frac{1}{2\pi i} \int_L \frac{\prod_{j=1}^m \Gamma(b_j + B_j s) \prod_{j=1}^n \Gamma(1 - a_j - A_j s)}{\prod_{j=m+1}^q \Gamma(1 - b_j - B_j s) \prod_{j=n+1}^p \Gamma(a_j + A_j s)} z^{-s} \, ds, \tag{A11}$$

and

$$G_{p,q}^{m,n}\left[z\left|\begin{array}{c}a_1,a_2,\ldots,a_p\\b_1,b_2,\ldots,b_q\end{array}\right.\right]=\frac{1}{2\pi i}\int_{L}\frac{\prod_{j=1}^{m}\Gamma(b_j-s)\prod_{j=1}^{n}\Gamma(1-a_j+s)}{\prod_{j=m+1}^{q}\Gamma(1-b_j+s)\prod_{j=n+1}^{p}\Gamma(a_j-s)}z^s \, ds, \tag{A12}$$

respectively, where $0 \leq n \leq p$, $1 \leq m \leq q$, $\{a_j, b_j\} \in \mathbb{C}$, $\{A_j, B_j\} \in \mathbb{R}^+$, and $L$ is a suitably chosen contour. With a simple change of variables it follows that if $A_j = C$, $j = 1, \ldots, p$, and $B_j = C$, $j = 1, \ldots, q$, then

$$H_{p,q}^{m,n}\left[z\left|\begin{array}{c}(a_1,C)(a_2,C)\dots(a_p,C)\\(b_1,C)(b_2,C)\dots(b_q,C)\end{array}\right.\right] = \frac{1}{C}\,G_{p,q}^{m,n}\left[z^{\frac{1}{C}}\left|\begin{array}{c}a_1,a_2,\dots,a_p\\b_1,b_2,\dots,b_q\end{array}\right.\right]. \tag{A13}$$

The Legendre duplication formula

$$
\Gamma(2z) = \frac{2^{2z-1}}{\sqrt{\pi}} \Gamma(z) \Gamma(z+\frac{1}{2}) \tag{A14}
$$

is useful for reducing Fox H-functions to Meijer G-functions in the expressions below.
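The duplication formula is easy to sanity-check numerically. The short snippet below is ours, added purely for illustration; it verifies Equation (A14) at a few non-integer arguments.

```python
import math

def gamma_2z(z):
    """Left-hand side of the Legendre duplication formula (A14)."""
    return math.gamma(2 * z)

def duplication_rhs(z):
    """Right-hand side of (A14): 2^(2z-1)/sqrt(pi) * Gamma(z) * Gamma(z + 1/2)."""
    return 2 ** (2 * z - 1) / math.sqrt(math.pi) * math.gamma(z) * math.gamma(z + 0.5)

# check agreement at a few non-integer points
for z in (0.3, 0.75, 1.25, 2.7):
    assert abs(gamma_2z(z) - duplication_rhs(z)) < 1e-9 * gamma_2z(z)
```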

The Fox-H function

$$H_{1,2}^{2,0}\left[z\left|\begin{array}{c} \left(1-\frac{\gamma}{2}+j,\gamma\right) \\ (0,1)\left(\frac{1}{2}+j,1\right) \end{array}\right.\right] = \frac{1}{2\pi i} \int_{L} \frac{\Gamma(s)\Gamma\left(\frac{1}{2}+j+s\right)}{\Gamma\left(1-\frac{\gamma}{2}+j+\gamma s\right)} z^{-s} \, ds \tag{A15}$$

appears in the solutions, Equations (21) and (25). Here we show how, in the case $\gamma = \frac{1}{2}$, this can be represented as a Meijer G-function, leading to the solutions in Equations (26) and (27). With $\gamma = \frac{1}{2}$ in Equation (A15) we have

$$H_{1,2}^{2,0}\left[z\left|\begin{array}{c} \left(\frac{3}{4} + j, \frac{1}{2}\right) \\ (0,1)\left(\frac{1}{2} + j,1\right) \end{array}\right.\right] = \frac{1}{2\pi i} \int_{L} \frac{\Gamma(s)\Gamma\left(\frac{1}{2} + j + s\right)}{\Gamma\left(\frac{3}{4} + j + \frac{s}{2}\right)} z^{-s} \, ds. \tag{A16}$$

We now use the duplication formula, Equation (A14), to replace

$$\Gamma(s) = \frac{2^{s-1}}{\sqrt{\pi}} \Gamma\left(\frac{s}{2}\right) \Gamma\left(\frac{s}{2} + \frac{1}{2}\right), \tag{A17}$$

and

$$\Gamma\left(\frac{1}{2} + j + s\right) = \frac{2^{j + s - \frac{1}{2}}}{\sqrt{\pi}} \Gamma\left(\frac{1}{4} + \frac{j}{2} + \frac{s}{2}\right) \Gamma\left(\frac{3}{4} + \frac{j}{2} + \frac{s}{2}\right), \tag{A18}$$

so that

$$\begin{split} H_{1,2}^{2,0} \left[ z \left| \begin{array}{c} \left( \frac{3}{4} + j, \frac{1}{2} \right) \\ (0, 1)\left(\frac{1}{2} + j, 1\right) \end{array} \right. \right] &= \frac{1}{2\pi i}\int_{L} \frac{2^{2s + j - \frac{3}{2}}}{\pi} \frac{\Gamma\left(\frac{s}{2}\right)\Gamma\left(\frac{1}{2} + \frac{s}{2}\right)\Gamma\left(\frac{1}{4} + \frac{j}{2} + \frac{s}{2}\right)\Gamma\left(\frac{3}{4} + \frac{j}{2} + \frac{s}{2}\right)}{\Gamma\left(\frac{3}{4} + j + \frac{s}{2}\right)} z^{-s} \, ds, \\ &= \frac{2^{j - \frac{3}{2}}}{\pi} H_{1,4}^{4,0} \left[ \frac{z}{4} \left| \begin{array}{c} \left(\frac{3}{4} + j, \frac{1}{2}\right) \\ \left(0, \frac{1}{2}\right)\left(\frac{1}{2}, \frac{1}{2}\right)\left(\frac{1}{4} + \frac{j}{2}, \frac{1}{2}\right)\left(\frac{3}{4} + \frac{j}{2}, \frac{1}{2}\right) \end{array} \right. \right] \\ &= \frac{2^{j - \frac{1}{2}}}{\pi} G_{1,4}^{4,0} \left[ \left(\frac{z}{4}\right)^{2} \left| \begin{array}{c} \frac{3}{4} + j \\ 0, \frac{1}{2}, \frac{1}{4} + \frac{j}{2}, \frac{3}{4} + \frac{j}{2} \end{array} \right. \right]. \end{split} \tag{A19}$$

Note that if *j* = 0 this simplifies further to

$$\frac{1}{\sqrt{2}\,\pi}G_{1,4}^{4,0}\left[ \left(\frac{z}{4}\right)^2 \left| \begin{array}{c} \frac{3}{4} \\ 0, \frac{1}{2}, \frac{1}{4}, \frac{3}{4} \end{array} \right. \right] = \frac{1}{\sqrt{2}\,\pi}G_{0,3}^{3,0}\left[ \left(\frac{z}{4}\right)^2 \left| \begin{array}{c} - \\ 0, \frac{1}{2}, \frac{1}{4} \end{array} \right. \right]. \tag{A20}$$

The Meijer G-functions above are of the general form $G^{q,0}_{p,q}(z)$, for which asymptotic expansions are known for $z \gg 1$ [37]. In particular, using Equation (22) in [37], we have

$$G_{1,4}^{4,0} \left[ z \left| \begin{array}{c} \frac{3}{4} + j \\ 0, \frac{1}{2}, \frac{1}{4} + \frac{j}{2}, \frac{3}{4} + \frac{j}{2} \end{array} \right. \right] \sim \exp\left(-3z^{\frac{1}{3}}\right) z^{-\frac{1}{12}} \sum_{k=0}^{\infty} \frac{M_k(j)}{z^{\frac{k}{3}}} \tag{A21}$$

for all $j \in \mathbb{N}$ and $z \gg 1$, where the $M_k(j)$ are functions of the parameters, including $j$, but not of the variable $z$. If we let $M$ denote the largest of the $M_k(j)$, then we can bound the sum by a geometric series to write

$$G_{1,4}^{4,0} \left[ z \left| \begin{array}{c} \frac{3}{4} + j \\ 0, \frac{1}{2}, \frac{1}{4} + \frac{j}{2}, \frac{3}{4} + \frac{j}{2} \end{array} \right. \right] \sim M \exp\left(-3z^{\frac{1}{3}}\right) z^{-\frac{1}{12}} \frac{z^{\frac{1}{3}}}{z^{\frac{1}{3}} - 1} \tag{A22}$$

and similarly for $G^{3,0}_{0,3}$.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Fractal and Entropy Analysis of the Dow Jones Index Using Multidimensional Scaling**

#### **José A. Tenreiro Machado**

Department of Electrical Engineering, Institute of Engineering, Polytechnic Institute of Porto, 4249-015 Porto, Portugal; jtm@isep.ipp.pt; Tel.: +351-228340500

Received: 4 September 2020; Accepted: 30 September 2020; Published: 8 October 2020

**Abstract:** Financial time series have a fractal nature that poses challenges for their dynamical characterization. The Dow Jones Industrial Average (DJIA) is one of the most influential financial indices, and due to its importance, it is adopted as a test bed for this study. The paper explores an alternative strategy to the standard time analysis, by joining the multidimensional scaling (MDS) computational tool and the concepts of distance, entropy, fractal dimension, and fractional calculus. First, several distances are considered to measure the similarities between objects under study and to yield proper input information to the MDS. Then, the MDS constructs a representation based on the similarity of the objects, where time can be viewed as a parametric variable. The resulting plots show a complex structure that is further analyzed with the Shannon entropy and fractal dimension. In a final step, a deeper and more detailed assessment is achieved by associating the concepts of fractional calculus and entropy. Indeed, the fractional-order entropy highlights the results obtained by the other tools, namely that the DJIA fractal nature is visible at different time scales with a fractional order memory that permeates the time series.

**Keywords:** multidimensional scaling; fractals; fractional calculus; financial indices; entropy; Dow Jones; complex systems

#### **1. Introduction**

The Dow Jones Industrial Average (DJIA), or Dow Jones, is a stock market index that reflects the stock performance of 30 relevant companies included in the U.S. stock exchanges. The DJIA is the second-oldest among the U.S. market indices and started on 26 May 1896. The DJIA is the best-known index in finance and is considered a key benchmark for assessing the global business trend in the world.

The financial time series reflect intricate effects among a variety of agents coming from economic and social processes, geophysical phenomena, health crises, and political strategies [1–4]. At present, we find all sorts of financial indices for capturing the dynamics of markets and stock exchange institutions. In general, all have a fractal nature with variations that are difficult to predict [5–13]. A number of techniques have been proposed to investigate the financial indices and to unravel the embedded complex dynamics [14–18]. Such studies adopt the underlying concept of linear time flow and consider that the fractal nature of the index is intrinsic to its own artificial nature.

This paper studies the interplay between the DJIA values and the time flow. The present-day standard assumption is that time is a continuous linear succession of events, often called the "arrow of time". We must clarify that (i) the nature of the time variable, either continuous or discrete, with a constant rhythm of variation or not, is considered simply in the light of the financial index, so that we are independent of the classical laws of physics, (ii) only the DJIA is adopted, since other financial indices reveal the same type of behavior but are limited to much shorter time series, and (iii) no financial forecasting is intended. Therefore, the Gedankenexperiment in the follow-up addresses the controversy about the texture of time [19–22], but just in the limited scope of financial indices.

For this purpose, the concepts of multidimensional scaling (MDS), fractional dimension, entropy, and fractional calculus are brought up as useful tools to tackle complex systems. MDS is a computational tool for visualizing the level of similarity between items of a dataset. The MDS translates information regarding the pairwise distances among a set of items into a configuration of representative points of an abstract Cartesian space [23–29]. Mandelbrot coined the word "fractal" [30,31] for complex objects that are self-similar across different scales. Fractals can be characterized by the so-called fractal dimension, which may be seen as quantifying complexity [32–34]. Information theory was introduced by Claude Shannon [35] and has as the primary concept the information content of a given event, which is a decreasing function of its probability [36–39]. The entropy of a random variable is the average value of information and has been proven to be a valuable tool for assessing complex phenomena [40–42]. Fractional calculus (FC) is the branch of mathematical analysis that generalizes differentiation and integration to real or complex orders [43–48]. The topic was raised by Gottfried Leibniz in 1695 and remained an exotic field until the Twentieth Century. In the last few decades, FC became a popular tool for analyzing phenomena with long-range memory and non-locality [49–57].

The association of these mathematical and computational tools yields relevant viewpoints when analyzing financial indices [7–9,11,58–61].

Bearing these ideas in mind, this paper is organized as follows. Section 2 introduces the dataset and methods and develops some initial experiments using MDS. Section 3 explores the use of fractal and entropy analysis of the MDS loci. Finally, Section 4 draws the main conclusions.

#### **2. Dataset and Methods**

#### *2.1. The DJIA Dataset*

The dataset consists of the daily close values of the DJIA from 28 December 1959, up to 1 September 2020, corresponding to a time series of *T* = 15,832 days, covering approximately six decades. Each week consists of 5 working days, and some missing data due to special events were estimated by means of linear interpolation between adjacent values.

We assess the dynamics of the DJIA by comparing its values *x*(*t*) over a given time window of *tw* days. Therefore, the *i*th vector of DJIA values consists of *ξ<sup>i</sup>* = [*x*(1), ... , *x*(*tw*)], where days "1" and "*tw*" denote the start and end time instants in the time window. Hereafter, for simplicity, we consider consecutive disjoint time windows and a number of experiments with *tw* taking values that are multiples of 5 days. Therefore, the total number of time windows (and vectors) is $N_w = \lfloor \frac{T}{t_w} \rfloor$, where $\lfloor \cdot \rfloor$ denotes the floor function, which gives as output the greatest integer less than or equal to the input value.

The evolution of the DJIA in time reveals a fractal nature, as represented in Figure 1. If we calculate the histogram of the logarithm of the returns, that is of $l_r = \ln \frac{x(t+1)}{x(t)}$, we verify a sustained noisy behavior and fat tails in the statistical distribution, as depicted in Figure 2 for time windows of *tw* = 60 days.
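The windowing and log-return computations just described can be sketched as follows. Since the DJIA dataset itself is not reproduced here, a synthetic geometric random walk stands in for the close values; all variable names are ours.

```python
import math
import random

T, t_w = 15832, 60                 # series length (days) and window size
N_w = T // t_w                     # floor(T / t_w) disjoint windows

# synthetic stand-in for the DJIA close values (the real dataset is not
# bundled here); a geometric random walk mimics the exponential-like growth
random.seed(0)
x = [100.0]
for _ in range(T - 1):
    x.append(x[-1] * math.exp(random.gauss(0.0003, 0.01)))

# disjoint windows xi_i = [x(1), ..., x(t_w)], i = 1, ..., N_w
windows = [x[i * t_w:(i + 1) * t_w] for i in range(N_w)]

# logarithm of the returns, l_r = ln(x(t+1)/x(t))
log_returns = [math.log(x[t + 1] / x[t]) for t in range(T - 1)]
```

With *T* = 15,832 and *tw* = 60 this yields *Nw* = 263 windows, matching the figure captions below.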

**Figure 1.** Daily close values of the DJIA from 28 December 1959, up to 1 September 2020.

**Figure 2.** Histogram of the logarithm of the returns of the DJIA from 28 December 1959, up to 1 September 2020, for time windows of *tw* = 60 days.

#### *2.2. Distances*

The DJIA dynamics is studied indirectly through the MDS by comparing the vectors (*ξ<sup>i</sup>*(1), ... , *ξ<sup>i</sup>*(*tw*)), *i* = 1, ... , *Nw*, *t* = 1, ... , *tw*, and analyzing the properties of the resulting plot from the perspective of entropy and fractal dimension. This approach requires the definition of an appropriate distance [62]. A function $d : \mathcal{A} \times \mathcal{A} \to \mathbb{R}$ on a set $\mathcal{A}$ is a "distance" when, for the items *ξi*, *ξj*, *ξ<sup>k</sup>* ∈ A, it satisfies the conditions (i) *d*(*ξi*, *ξj*) ≥ 0 (non-negativity), (ii) *d*(*ξi*, *ξj*) = 0 if and only if *ξ<sup>i</sup>* = *ξj* (identity of indiscernibles), (iii) *d*(*ξi*, *ξj*) = *d*(*ξj*, *ξi*) (symmetry), and (iv) *d*(*ξi*, *ξk*) ≤ *d*(*ξi*, *ξj*) + *d*(*ξj*, *ξk*) (triangle inequality). If all four conditions are satisfied, then the function is a "metric" and, together with A, yields a "metric space". Obviously, these conditions still allow considerable freedom, and we find in the literature a plethora of possible metrics, each with its own pros and cons. In practice, users adopt one or more distances if they capture adequately the characteristics of the items under assessment. Therefore, we start by considering a test bench of 10 distinct indices, namely the Manhattan, Euclidean, Tchebychev, Lorentzian, Sørensen, Canberra, Clark, divergence, angular, and Jaccard distances (denoted as {Ma, Eu, Tc, Lo, So, Ca, Cl, Dv, Ac, Ja}), given by [63]:

$$d_{i,j}^{Ma} = \sum_{t=1}^{t_w} \left| \xi_i(t) - \xi_j(t) \right|, \tag{1a}$$

$$d_{i,j}^{Eu} = \sqrt{\sum_{t=1}^{t_w} \left(\xi_i(t) - \xi_j(t)\right)^2}, \tag{1b}$$

$$d_{i,j}^{Tc} = \max_{t} \left| \xi_i(t) - \xi_j(t) \right|, \tag{1c}$$

$$d_{i,j}^{Lo} = \sum_{t=1}^{t_w} \ln\left(1 + \left|\xi_i(t) - \xi_j(t)\right|\right), \tag{1d}$$

$$d_{i,j}^{So} = \frac{\sum_{t=1}^{t_w} \left| \xi_i(t) - \xi_j(t) \right|}{\sum_{t=1}^{t_w} \left( \left| \xi_i(t) \right| + \left| \xi_j(t) \right| \right)}, \tag{1e}$$

$$d_{i,j}^{Ca} = \sum_{t=1}^{t_w} \frac{\left| \xi_i(t) - \xi_j(t) \right|}{\left| \xi_i(t) \right| + \left| \xi_j(t) \right|}, \tag{1f}$$

$$d_{i,j}^{Cl} = \sqrt{\sum_{t=1}^{t_w} \left( \frac{|\xi_i(t) - \xi_j(t)|}{|\xi_i(t)| + |\xi_j(t)|} \right)^2}, \tag{1g}$$

$$d_{i,j}^{Dv} = \sum_{t=1}^{t_w} \frac{\left(\xi_i(t) - \xi_j(t)\right)^2}{\left(|\xi_i(t)| + |\xi_j(t)|\right)^2}, \tag{1h}$$

$$d_{i,j}^{Ac} = \arccos\left(r_{ij}\right), \quad r_{ij} = \frac{\sum_{t=1}^{t_w} \xi_i(t)\xi_j(t)}{\sqrt{\sum_{t=1}^{t_w} \xi_i^2(t) \sum_{t=1}^{t_w} \xi_j^2(t)}}, \tag{1i}$$

$$d_{i,j}^{Ja} = \frac{\sum_{t=1}^{t_w} \left(\xi_i(t) - \xi_j(t)\right)^2}{\sum_{t=1}^{t_w} \xi_i^2(t) + \sum_{t=1}^{t_w} \xi_j^2(t) - \sum_{t=1}^{t_w} \xi_i(t)\xi_j(t)}, \tag{1j}$$

where *ξ<sup>i</sup>* and *ξj*, *i*, *j* = 1, ... , *Nw*, are the *i*th and *j*th vectors of the DJIA time series, each of dimension *tw*. The Manhattan, Euclidean, and Tchebychev distances are particular cases of the Minkowski distance $d_{i,j}^{Mi} = \left( \sum_{t=1}^{t_w} \left| \xi_i(t) - \xi_j(t) \right|^q \right)^{\frac{1}{q}}$, namely for *q* = 1, *q* = 2, and *q* → ∞, respectively. The Lorentzian distance applies the natural logarithm to the absolute difference, with 1 added to guarantee the non-negativity property and to eschew the log of zero. We find in the literature several distinct versions of the Sørensen distance, eventually with other names, representing a statistic used for comparing the similarity between two samples. The Canberra and Clark distances are weighted versions of the Manhattan and Euclidean distances. These expressions replace $\left| \xi_i(t) - \xi_j(t) \right|$ by $|\xi_i(t)-\xi_j(t)|/(|\xi_i(t)|+|\xi_j(t)|)$ and are sensitive to small changes near zero. The angular cosine distance follows the cosine similarity *rij* that comes from the inner product of the two vectors, *ξ<sup>i</sup>* · *ξj*; the distance $d^{Ac}_{i,j}$ gives the angle between the vectors *ξ<sup>i</sup>* and *ξj*. The Jaccard distance is the ratio of the size of the symmetric difference to the union of two sets.
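Several of the distances in Equations (1a)–(1j) can be written directly from their definitions. The sketch below is ours, for illustration only, and implements the Manhattan, Euclidean, Tchebychev, Lorentzian, Canberra, and angular cosine cases.

```python
import math

def d_Ma(u, v):   # Manhattan distance, Equation (1a)
    return sum(abs(a - b) for a, b in zip(u, v))

def d_Eu(u, v):   # Euclidean distance, Equation (1b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def d_Tc(u, v):   # Tchebychev distance, Equation (1c)
    return max(abs(a - b) for a, b in zip(u, v))

def d_Lo(u, v):   # Lorentzian distance, Equation (1d)
    return sum(math.log(1.0 + abs(a - b)) for a, b in zip(u, v))

def d_Ca(u, v):   # Canberra distance, Equation (1f); skip 0/0 terms
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(u, v) if abs(a) + abs(b) > 0)

def d_Ac(u, v):   # angular cosine distance, Equation (1i)
    r = (sum(a * b for a, b in zip(u, v))
         / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v)))
    return math.acos(max(-1.0, min(1.0, r)))
```

The clamp inside `d_Ac` guards against floating-point values of *rij* marginally outside [−1, 1].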

#### *2.3. The MDS Loci*

Once having defined the metric for comparing the vectors, the MDS requires the construction of a matrix **D** = [*di*,*j*] of item-to-item distances. In our case, an "item" corresponds to a *tw*-dim vector. Therefore, the square matrix **D** is symmetric, with a main diagonal of zeros and dimension *Nw* × *Nw*, equal to the number of items. The MDS computational algorithm tries to plot the items in a low-dimensional space so that users can easily analyze possible relationships that are difficult to unravel in a high number of dimensions. In other words, the MDS performs a dimension reduction and plots the items in a *p* < *Nw* dimensional space, by estimating a matrix **D**ˆ = [ˆ*di*,*j*], corresponding to the *p*-dim items *x*ˆ*i*, so that the distances ˆ*di*,*j* mimic the original ones, *di*,*j*.

The classical MDS can perform the optimization based on a variety of loss functions, often called "strain", that are a form of minimizing the residual sum of squares. The metric MDS generalizes the procedure to a loss function called "stress", *SD*, such as:

$$S_D\left(\xi_1, \ldots, \xi_{N_w}\right) = \left[\sum_{i,j} \left(\hat{d}_{i,j} - d_{i,j}\right)^2\right]^{\frac{1}{2}}, \tag{2}$$

or:

$$S_D\left(\xi_1, \ldots, \xi_{N_w}\right) = \left[\frac{\sum_{i,j} \left(\hat{d}_{i,j} - d_{i,j}\right)^2}{\sum_{i,j} d_{i,j}^2}\right]^{\frac{1}{2}}, \tag{3}$$

where $d_{i,j} = \left\| \xi_i - \xi_j \right\|$, *i*, *j* = 1, . . . , *Nw*.

The generalized MDS is an extension of metric formulation, so that the target space is an arbitrary smooth non-Euclidean space.

Once having obtained the MDS estimated coordinates of the objects, *x*ˆ*i*, the user can decide the dimension *p* for visualization. Usually, the values *p* = 2 and *p* = 3 are selected since they allow a direct representation. Moreover, the quality of the MDS approximation can be assessed by means of the Sheppard and stress charts. The Sheppard diagram plots ˆ*di*,*j* vs. *di*,*j*. If the points follow a straight/curved line, this means a linear/non-linear relationship, but in both cases, the smaller the scatter, the better the approximation. A second assessment tool consists of the plot of *SD* vs. *p*. Usually, the curve is monotonically decreasing, with a large diminishing at first and a slow variation afterwards.
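The paper relies on MATLAB's `mdscale` with the Sammon criterion for its figures; as a minimal stand-in, the classical (Torgerson) MDS below illustrates only the core dimension-reduction step of embedding a distance matrix **D** into *p* dimensions. The code and names are ours, not the implementation actually used.

```python
import numpy as np

def classical_mds(D, p=3):
    """Classical (Torgerson) MDS: embed n items with pairwise distance
    matrix D into p dimensions via the double-centered Gram matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of the embedding
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:p]            # keep the p largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# three collinear points with unit spacing: a 1-d embedding is exact
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
X = classical_mds(D, p=1)
```

For Euclidean distance matrices this reconstruction is exact up to rotation and reflection, which is why the coordinates carry no intrinsic physical meaning, as noted below.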

Since the MDS locus results from relative information (i.e., the distances), the coordinates usually do not have a physical meaning, and the user can rotate, shift, or magnify the representation to obtain a better view. Moreover, distinct distances lead to different plots that are correct from the mathematical and computational viewpoints, but that reflect distinct characteristics of the dataset. Therefore, it is up to the user to choose one or more distances that better highlight the aspects of the dataset under study.

Often, it is recommended to pre-process the data before calculating the distances in order to reduce the sensitivity to some details such as different units or a high variation of numerical values. In the case of the DJIA, two data pre-processing schemes (also called normalizing, or data transformation), P<sup>1</sup> and P<sup>2</sup>, are considered: (i) subtracting the arithmetic average and dividing by the standard deviation, that is, calculating $\mathcal{P}_1 : x(t) \leftarrow \frac{x(t)-\mu}{\sigma}$, where $\mu = \frac{1}{T} \sum_{t=1}^{T} x(t)$ and $\sigma = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} \left(x(t) - \mu\right)^2}$, and (ii) applying a logarithm, so that $\mathcal{P}_2 : x(t) \leftarrow \lg(x(t))$. The linear transformation P<sup>1</sup> is often adopted in statistics and signal processing [64–68], while the non-linear transformation P<sup>2</sup> can be adopted with signals revealing an exponential-like evolution [69–73]. Of course, other data transformations could be envisaged, but these two are commonly adopted. Therefore, the main question concerning this issue is to understand to what extent the pre-processing influences the final results.
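The two pre-processing schemes can be sketched as follows; the function names are ours, and the base-10 logarithm is assumed for lg.

```python
import math

def pre_P1(x):
    """P1: subtract the mean and divide by the sample standard deviation."""
    T = len(x)
    mu = sum(x) / T
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / (T - 1))
    return [(v - mu) / sigma for v in x]

def pre_P2(x):
    """P2: take the (base-10) logarithm, suited to exponential-like signals."""
    return [math.log10(v) for v in x]
```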

#### 2.3.1. Data Pre-Processing Using P<sup>1</sup>

Figure 3 shows the MDS locus for *p* = 3 and *tw* = 60 days, with pre-processing P<sup>1</sup> and using the Lorentzian and Canberra distances, $d^{Lo}_{i,j}$ and $d^{Ca}_{i,j}$. The larger circle represents the first vector, and the lines connect two consecutive dots (representing the vectors from two consecutive time windows). The lines are included simply for auxiliary purposes and for highlighting the discontinuities. The MATLAB non-classical multidimensional scaling algorithm mdscale, with Sammon's nonlinear mapping criterion sammon, was used. Figure 4 illustrates the corresponding Sheppard and stress diagrams for the Canberra distance (1f). For the sake of parsimony, the other charts are not represented.

**Figure 3.** The multidimensional scaling (MDS) locus, *x*ˆ*i*, of the DJIA dataset for *p* = 3 and *tw* = 60 days (*Nw* = 263), with pre-processing P<sup>1</sup> and using the Lorentzian (1d) and Canberra (1f) distances.

**Figure 4.** The Sheppard diagram, ˆ*di*,*<sup>j</sup>* vs. *di*,*j*, for *p* = 3, and stress plot, *SD* vs. *p*, of the DJIA dataset with *tw* = 60 days, with pre-processing P<sup>1</sup> and using the Canberra distance (1f).

We verify that the MDS loci exhibit segments with an almost continuous evolution and others with strong discontinuities. The first portray relatively smooth dynamics, while the second represent dramatic variations, in the perspective of the adopted distance and visualization technique. These dynamical effects are not read in the same way as with the standard time representations. Moreover, their visualization varies according to the type of distance adopted to construct the matrix **D**. This should be expected, since it is well known that each distance highlights a specific set of properties embedded in the original time series and that the selection of one or more distances has to be performed on a case-by-case basis, before deciding which are more adapted to the dataset.

Another relevant topic is the effect of the time window *tw* on the results. In other words, we can ask how the dimension of the vectors *ξi*, *i* = 1, ... , *Nw*, capturing the DJIA time dynamics, influences the MDS representation. For example, Figure 5 shows the MDS locus for *p* = 3, *tw* = 10 days (*Nw* = 1583), and the Canberra distance (1f).

**Figure 5.** The MDS locus, *x*ˆ*i*, of the DJIA dataset for *p* = 3 and *tw* = 10 days (*Nw* = 1583), with pre-processing P<sup>1</sup> and using the Canberra distance (1f).

#### 2.3.2. Data Pre-Processing Using P<sup>2</sup>

Figure 6 shows the MDS locus for *p* = 3 and *tw* = 60 days, with pre-processing P<sup>2</sup> and using the Lorentzian and Canberra distances, $d^{Lo}_{i,j}$ and $d^{Ca}_{i,j}$. Figure 7 depicts the Sheppard and stress diagrams for the Canberra distance (1f).

We can also check the effect of the time window *tw*. Figure 8 shows the MDS locus for *p* = 3, *tw* = 10 days (*Nw* = 1583), and the Canberra distance (1f), revealing, again, a slight diminishing of the volatility.

**Figure 6.** The MDS locus, *x*ˆ*i*, of the DJIA dataset for *p* = 3 and *tw* = 60 days (*Nw* = 263), with pre-processing P<sup>2</sup> and using the Lorentzian (1d) and Canberra (1f) distances.

**Figure 7.** The Sheppard diagram, ˆ*di*,*<sup>j</sup>* vs. *di*,*j*, for *p* = 3, and the stress plot, *SD* vs. *p*, of the DJIA dataset with *tw* = 60 days, with pre-processing P<sup>2</sup> and using the Canberra distance (1f).

**Figure 8.** The MDS locus, *x*ˆ*i*, of the DJIA dataset for *p* = 3 and *tw* = 10 days (*Nw* = 1583), with pre-processing P<sup>2</sup> and using the Canberra distance (1f).

As in the previous sub-section, we observe that the MDS plots reveal some segments with an almost continuous evolution and some with discontinuities. Furthermore, as before, increasing *tw* reduces the volatility in the MDS representations. These results, with regions of smooth variation interspersed with abrupt changes, were already noticed, since they reflect relativistic time effects [74,75]. Such dynamics was interpreted as a portrait of the fundamentally non-smooth nature of the flow of the time variable underlying the DJIA evolution. Nonetheless, we are still far from a comprehensive understanding of the MDS loci, and we need additional tools to extract further conclusions.

#### **3. Fractal, Entropy, and Fractional Analysis**

We consider the fractal dimension and entropy measures for analyzing the 3-dim portraits produced by the MDS.

The fractal dimension, *fd*, characterizes the fractal pattern of a given object by quantifying the ratio of the change in detail to the change in scale. Several types of fractal dimension can be found in the literature. In our case, *fd* is calculated by means of the box-counting method as the exponent of a power law $N(\epsilon) = a\,\epsilon^{-f_d}$, where *a* is a parameter that depends on the shape and size of the object, and $N(\epsilon)$ and $\epsilon$ stand for the number of boxes required to capture the object and the size (or scale) of the box, respectively. Therefore, *fd* can be estimated as:

$$f\_d = -\lim\_{\epsilon \to 0} \frac{\ln \left( N \left( \epsilon \right) \right)}{\ln \left( \epsilon \right)}.\tag{4}$$
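A box-counting estimate along the lines of Equation (4) can be sketched as follows: count the occupied boxes at several scales and fit the log–log slope by least squares. The code and names are ours, added for illustration.

```python
import math

def box_count(points, eps):
    """N(eps): number of boxes of side eps occupied by the 3-d points."""
    return len({tuple(math.floor(c / eps) for c in p) for p in points})

def fractal_dimension(points, sizes):
    """Least-squares slope of ln N(eps) against -ln eps, cf. Equation (4)."""
    xs = [-math.log(e) for e in sizes]
    ys = [math.log(box_count(points, e)) for e in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((u - mx) * (v - my) for u, v in zip(xs, ys))
            / sum((u - mx) ** 2 for u in xs))
```

As a sanity check, a dense set of points on a straight segment yields *fd* ≈ 1.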

The entropy of a random variable is the average level of "information" of the corresponding probability distribution. The key cornerstone of the Shannon theory consists of the information content, which for an event having probability of occurrence *pi*, is given by:

$$I\left(p\_i\right) = -\ln p\_i.\tag{5}$$

For a 3-dim random variable (*X*, *Y*, *Z*) with probability distribution *pXYZ*, the Shannon entropy, *HXYZ*, is given by:

$$H_{XYZ} = -\sum_{X} \sum_{Y} \sum_{Z} p_{XYZ} \ln \left( p_{XYZ} \right), \tag{6}$$

where − ln (*pXYZ*) is the information for the event with probability *pXYZ*.
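For a cloud of MDS points, Equation (6) can be evaluated by histogramming the points on a 3-dim grid spanning the range of each axis, as done in the follow-up with 20 bins per axis. A minimal sketch (ours, for illustration):

```python
import math

def shannon_entropy_3d(points, bins=20):
    """Shannon entropy, Equation (6), of a 3-d point cloud histogrammed on
    a bins x bins x bins grid spanning the min/max of each axis."""
    lo = [min(p[k] for p in points) for k in range(3)]
    hi = [max(p[k] for p in points) for k in range(3)]
    counts = {}
    for p in points:
        # map each coordinate to a bin index, clamping the upper edge
        key = tuple(
            min(bins - 1, int((p[k] - lo[k]) / ((hi[k] - lo[k]) or 1.0) * bins))
            for k in range(3))
        counts[key] = counts.get(key, 0) + 1
    n = len(points)
    return -sum(c / n * math.log(c / n) for c in counts.values())
```

Two points in separate boxes give entropy ln 2, while a degenerate cloud (all points in one box) gives 0, as expected.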

The concept of entropy can be generalized in the scope of fractional calculus [76–86]. This approach gives more freedom to adapt the entropy measure to the phenomenon under study by adjusting the fractional order. The information and entropy of order *<sup>α</sup>* <sup>∈</sup> <sup>R</sup> are given by [77,87]:

$$I_{\alpha}\left(p_i\right) = D^{\alpha} I\left(p_i\right) = -\frac{p_i^{-\alpha}}{\Gamma\left(\alpha+1\right)} \left[\ln p_i + \psi\left(1\right) - \psi\left(1-\alpha\right)\right] \tag{7}$$

$$H_{XYZ}^{\alpha} = \sum_{i} \left\{ -\frac{p_i^{-\alpha}}{\Gamma(\alpha+1)} \left[ \ln p_i + \psi\left(1\right) - \psi\left(1-\alpha\right) \right] \right\} p_i \tag{8}$$

where Γ (·) and *ψ* (·) represent the gamma and digamma functions.

The parameter *α* gives an extra degree of freedom to adapt the sensitivity of the entropy calculation of each specific data series.
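Equations (7) and (8) can be evaluated numerically as sketched below. The code is ours: the digamma function *ψ* is approximated by a central difference of log-gamma rather than a library routine, and the *α* = 0 case recovers the ordinary Shannon entropy.

```python
import math

def digamma(x, h=1e-5):
    """Numerical digamma via a central difference of log-gamma; a stand-in
    for a library routine such as scipy.special.psi."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def frac_information(p, alpha):
    """Fractional information of order alpha, Equation (7)."""
    return (-p ** (-alpha) / math.gamma(alpha + 1.0)
            * (math.log(p) + digamma(1.0) - digamma(1.0 - alpha)))

def frac_entropy(probs, alpha):
    """Fractional entropy of order alpha, Equation (8)."""
    return sum(frac_information(p, alpha) * p for p in probs if p > 0)
```

For *α* = 0 the prefactor and the digamma terms drop out, so `frac_entropy(probs, 0.0)` equals the Shannon entropy of `probs`.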

In an algorithmic perspective, these measures require the adoption of some grid (or box) for capturing and counting the objects, the main difference being that the fractal dimension just considers a Boolean perspective of "1" and "0", that is the box is either full or empty, while the entropy considers the number of counts in each box.

In the follow-up, a 3-dim grid defined between the minimum and maximum values obtained for each axis of the MDS locus is considered. For the fractal dimension, we obtain *fd* from the slope of $\ln N(\epsilon)$ versus $\ln \epsilon$ for 10 decreasing values of the box size $\epsilon$. In the case of the entropy, we calculate *HXYZ* when adopting 20 bins for each MDS axis. The auxiliary lines connecting the points are not considered in the calculations.

Figures 9 and 10 show the variation of *fd* and *HXYZ* with *tw*, with pre-processing P<sup>1</sup> and P<sup>2</sup>, respectively, when using the distances (1a)–(1j). For *tw* ∈ {5, . . . , 240}, the corresponding MDS loci have *Nw* ∈ {3166, . . . , 65} points.

**Figure 9.** Plot of fractal dimension, *fd*, and Shannon entropy, *HXYZ*, versus *Nw* (*tw* ∈ {5, . . . , 240}), with pre-processing P<sup>1</sup> and using the distances (1a)–(1j).

**Figure 10.** Plot of fractal dimension, *fd*, and Shannon entropy, *HXYZ*, versus *Nw* (*tw* ∈ {5, . . . , 240}), with pre-processing P<sup>2</sup> and using the distances (1a)–(1j).

We note some "noise", but that should be expected due to the numerical nature of the experiments. In general, the two indices decrease with *tw*, revealing, again, the "low pass filtering" effect of the dimension of the time window. We note a considerable difference of the values of *fd* and *HXYZ* for small values of *tw*, but a stabilization and some convergence to closer values when *tw* increases.

In the case of the fractional entropy, *H<sup>α</sup><sub>XYZ</sub>*, we can tune the value of *α* to achieve maximum sensitivity. In other words, we can select the value *αmax*(*H*) that yields max *H<sup>α</sup><sub>XYZ</sub>*. Figures 11 and 12 depict max *H<sup>α</sup><sub>XYZ</sub>* vs. *αmax*(*H*) with *tw* ∈ {5, 10, . . . , 240}, with pre-processing P1 and P2, respectively, and using the distances (1a)–(1j).

**Figure 11.** Plot of *αmax*(*H*) versus max *H<sup>α</sup><sub>XYZ</sub>*, with *tw* ∈ {5, 10, . . . , 240}, with pre-processing P1 and using the distances (1a)–(1j).

**Figure 12.** Plot of *αmax*(*H*) versus max *H<sup>α</sup><sub>XYZ</sub>*, with *tw* ∈ {5, 10, . . . , 240}, with pre-processing P2 and using the distances (1a)–(1j).

We verify a strong correlation between the entropy and the value of the fractional order. Furthermore, we note that 0.55 ≤ *αmax*(*H*) ≤ 0.75 and 0.57 ≤ *αmax*(*H*) ≤ 0.77 for P1 and P2, respectively, far from integer values and clearly representative of fractional dynamics. For small time windows, each distance has a distinct behavior, but as the time window increases, all distances converge to nearly the same values of *αmax*(*H*), both for P1 and P2. Obviously, larger time windows leave a smaller number of points in the MDS locus, and that influences the result. The convergence towards a common behavior for all distances is observed after the first values of *tw*. This means that we are unraveling the fractional dynamics, that is, a characteristic of the long-range memory effects embedded in the time series.

For the pre-processing P1, the divergence distance produces a slightly separated plot to the left, while for P2, that position is occupied by the divergence and Jaccard distances, but with a fuzzier behavior. As before, we note that the type of pre-processing does not yield any significant modification of the global conclusions.

#### **4. Conclusions**

Commonly, time is viewed as a continuous and linear flow, so that any perturbation, such as noise and volatility, is automatically assigned to the variable under analysis. In other words, since we are entities immersed in the time flow, we are apparently incapable of distinguishing between perturbations in time and in the measured variable. This paper explored an alternative strategy for reading the relationship between the variables. For that purpose, the DJIA, from 28 December 1959 up to 1 September 2020, was adopted as the vehicle for the numerical experiments. This dataset corresponds to a human-made phenomenon, and therefore, any conjecture about the nature of time is independent of the presently accepted conceptions about its flux. In the proposed approach, the time series was organized into vectors corresponding to specified time windows. Those vectors were then compared by means of a panoply of distances, and the resulting information was plotted in a three-dimensional space by means of MDS. Indeed, the MDS representation corresponds to a "customized projection" of high-dimensional data into a low-dimensional space. Loosely speaking, we can say "customized projection" since we do not pose any a priori requirements, the algorithm merely being based on the idea of minimizing the difference between the original measurements and the replicated (approximated) values. Therefore, the MDS does not automatically guarantee the success of such a "projection", but the quality of the results was assessed by the stress and Shepard diagrams. In the case of the DJIA and the adopted distances, the good quality of the MDS technique was confirmed.

The MDS loci have distinct shapes, according to the type of distance adopted to compare vectors. Therefore, additional tools were necessary to highlight the main characteristics of these representations, where time is no longer the explicit variable. For that purpose, several mathematical tools were considered, namely the Shannon entropy and the fractal dimension. In all cases, we observed some variability with the time window, which occurs naturally due to the numerical treatment of this type of data. The Shannon entropy and fractal dimension exhibited the same type of behavior, with a progressive variation with the time window and a stabilization toward a common value for large *tw*. While these results can be read merely as the effect of the low-pass filtering provided by a large time window, we can also conjecture that some other property inherent to the DJIA is at their origin.

The fractional entropy was then brought in to further analyze the MDS locus. This tool allows a better sensitivity to the dataset than the Shannon entropy, since users can tune the calculation by means of the fractional order. In the case of the DJIA, tuning *α* to achieve the maximum entropy revealed not only that such values are independent of the distance, but also that the orders are clearly far from integer values, characteristic of fractional dynamics with non-local effects.

Some concepts are debatable and do not follow the standard orthodoxy, but the set of experiments with a human-made time series allows thinking outside the box and provides a strategy for exploring the texture of time from the perspective of entropy and fractional calculus.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A Discretization Approach for the Nonlinear Fractional Logistic Equation**

**Mohammad Izadi 1,\* and Hari M. Srivastava 2,3,4**


Received: 14 October 2020; Accepted: 17 November 2020; Published: 21 November 2020

**Abstract:** The present study aimed to develop and investigate the local discontinuous Galerkin method for the numerical solution of the fractional logistic differential equation, occurring in many biological and social science phenomena. The fractional derivative is described in the sense of Liouville-Caputo. Using the upwind numerical fluxes, the numerical stability of the method is proved in the *L*<sup>∞</sup> norm. With the aid of the shifted Legendre polynomials, the weak form is reduced to a system of algebraic equations to be solved in each subinterval. Furthermore, to handle the nonlinear term, the technique of product approximation is utilized. The utility of the present discretization technique, compared with some well-known standard schemes, is checked through numerical calculations on a range of linear and nonlinear problems with analytical solutions.

**Keywords:** logistic differential equation; Liouville-Caputo fractional derivative; local discontinuous Galerkin methods; stability estimate

#### **1. Introduction**

In studies of elementary population dynamics, the simplest model for the growth of a population is the rate equation, formulated by Malthus in 1798 [1]:

$$\begin{cases} \frac{dM(t)}{dt} &= r \, M(t), \quad t > 0, \\ M(0) &= M\_0. \end{cases} \tag{1}$$

where *M*(*t*) denotes the population at time *t* and the non-zero parameter *r* equals *r* = *β* − *α*, where *β* and *α* are the per capita birth and death rates, respectively. Here, *M*<sub>0</sub> is the population at time *t* = 0. The exact analytical solution of the Malthus population model (1) describes growth at a constant rate, *M*(*t*) = *M*<sub>0</sub>*e*<sup>*rt*</sup>. The Malthusian growth model is unrealistic over long times because the solution of the rate equation does not account for two main factors, the spread of diseases and the limitation of the food supply. To model the effects of these factors in a population model, the logistic equation was considered by P. F. Verhulst in 1838 [2]

$$\frac{dN(t)}{dt} = rN(t)\left(1 - \frac{N(t)}{K}\right),$$

where the variable *N*(*t*) = *M*(*t*)/*M<sub>max</sub>* is the whole population normalized to its maximum attainable value *M<sub>max</sub>*, *r* denotes the intrinsic growth rate, and the constant *K* > 0 is known as the carrying capacity of the environment. By defining *X*(*t*) := *N*(*t*)/*K* and *σ* := *rK*, the standard logistic equation can be rewritten as

$$\begin{cases} \frac{dX(t)}{dt} &= \sigma \, X(t) \left(1 - X(t)\right), \quad t > 0, \\ X(0) &= X\_0. \end{cases} \tag{2}$$

where *X*<sub>0</sub> = *M*(0)/*M<sub>max</sub>*. The exact solution of this equation can be easily obtained as

$$X(t) = \frac{X\_0}{X\_0 + (1 - X\_0)e^{-\sigma t}}$$
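As a quick numerical sanity check (not part of the original paper), one can march Equation (2) with a classical Runge-Kutta scheme and compare against the closed-form solution above; the values of *σ*, *X*<sub>0</sub>, and the time horizon are arbitrary demo choices.

```python
from math import exp

sigma, X0, T, n = 1.0, 0.2, 5.0, 500     # arbitrary demo parameters
h = T / n

def f(x):                                # right-hand side of Eq. (2)
    return sigma * x * (1.0 - x)

x = X0
for _ in range(n):                       # classical fourth-order Runge-Kutta
    k1 = f(x)
    k2 = f(x + h / 2 * k1)
    k3 = f(x + h / 2 * k2)
    k4 = f(x + h * k3)
    x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# closed-form logistic solution at t = T
exact = X0 / (X0 + (1 - X0) * exp(-sigma * T))
assert abs(x - exact) < 1e-8
```

The agreement to roughly machine precision confirms the closed form; no such formula is available once the derivative becomes fractional, which motivates the numerical scheme developed below.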

In the last decades, many efforts have been devoted to extending integer-order models to the corresponding fractional-order models, which are more descriptive and provide a powerful and valuable instrument for the description of hereditary and memory properties of several materials and processes [3,4]. Replacing the classical derivative operator in (2) by a fractional one, the fractional logistic equation is obtained. This model of population growth has found applications in numerous disciplines of science and engineering. For instance, the growth of tumors in medicine [5] can be modelled by the fractional logistic equation (FLE). In addition, the FLE is the cornerstone of various important mathematical models, such as two models in radar signals [6] and electroanalytical chemistry [7]. Several variations of the population growth model have been studied in the literature [8]. In the present study, we investigate the following logistic population model of fractional order in the form

$$\begin{cases} {}^{\mathrm{LC}}\mathcal{D}_t^{\nu} X(t) = \sigma\, X(t)\left(1 - X(t)\right) =: \sigma\, X(t)\, g(X(t)), \quad t > 0, \\ X(0) = X_0, \end{cases} \tag{3}$$

where the symbol <sup>LC</sup><sub>*a*</sub>D<sup>*ν*</sup><sub>*t*</sub> denotes the fractional derivative operator of Liouville-Caputo type and *ν* ∈ (0, 1]. It should be emphasized that in (3) we have used the function *g*(*s*) = 1 − *s*, which corresponds to the nonlinear logistic equation. However, to address the linear counterpart of this equation, we also consider *g*(*s*) = 1. The existence and uniqueness of the solution of (3) are discussed in detail in Reference [9].

It is known that, for most fractional differential equations, there is no possibility of finding the exact solutions analytically. Consequently, exploring approximate or numerical techniques is of primary interest for such fractional equations. Many efforts have been made toward the exact analytical solution of the problem (3). The first one was proposed by West [10] and is based on the Carleman embedding technique. Later, it was shown in Reference [11] that this analytical function is only very close to the numerical solutions of the FLE. The other analytical methods for the FLE include the fractional Taylor expansion method [12], a method based on Euler's numbers [13], and the variational iterative method [14]. Besides the analytical investigations, numerous computational approaches have been proposed for the nonlinear FLE. Let us mention the predictor-corrector approaches [9,15], the finite difference schemes [14,16], the spectral methods [17,18], the Bessel collocation method [19], the Chebyshev wavelet method [20], the Laguerre collocation method [21], and the fractional spline collocation method [22].

Many other numerical and approximation methods, as well as computational approaches, have been developed and applied to FDEs based upon various closely related models of real-world problems. For example, Baleanu et al. [23] made use of a Chebyshev spectral method based on operational matrices, a remarkable survey of numerical methods can be found in [24], a study of the fractional-order Bessel, Chelyshkov, and Legendre collocation schemes for the fractional Riccati equation was presented in [25], an operational matrix of fractional-order derivatives of Fibonacci polynomials was developed in [26], an introductory overview and recent developments involving FDEs was presented in [27], the efficiency of the spectral collocation method in the dynamic simulation of the fractional-order epidemiological model of the Ebola virus was investigated in [28], the Jacobi collocation method and a spectral tau method based on shifted second-kind Chebyshev polynomials for the approximate solution of some families of the fractional-order Riccati differential equations were discussed in [29,30], computational approaches to FDEs for the biological population model were discussed in [31], the generalized Chebyshev and Bessel collocation approaches for fractional BVPs and multi-order FDEs were considered in [32,33], and a general wavelet quasi-linearization method for solving the fractional-order population growth model was developed and applied in [34].

In this work, we take a further step towards proposing a numerical method for solving the FLE. We utilize a discontinuous finite element approach, i.e., the local discontinuous Galerkin (LDG) discretization approach for the FLE (3). To apply the LDG scheme, we must rewrite a given FDE as a system of first-order ordinary differential equations (ODEs) together with a fractional integral. Hence, the discontinuous Galerkin (DG) method is employed to discretize the resulting system as well as the fractional integral. The first DG method was introduced by Reed and Hill [35] in 1973 for numerically solving neutron transport, that is, a time-independent linear hyperbolic equation. Since then, DG schemes have been widely implemented; their application to classical ODEs was initiated in [36]. DG schemes, as a subclass of finite element methods (FEMs), allow us to exploit discontinuous discrete basis functions. These local basis functions are usually selected as piecewise polynomials. Exploiting completely discontinuous basis functions offers great opportunities compared to traditional FEMs when used to discretize differential equations. In summary, the main gains of the DG methods are in terms of flexibility, accuracy, and parallelizability; see Reference [37].

To the best of our knowledge, the LDG approaches for ODEs of fractional order, including one-term and multi-term equations, were first discussed in Reference [38] and have since been applied to many model problems [39–41]. It is worth mentioning that the success of LDG methods rests on the design of appropriate numerical fluxes at the element interfaces. In this work, we utilize the upwind numerical flux as the natural choice for the FLE. By choosing the upwind fluxes, we are able to prove the numerical stability of the LDG scheme.

The rest of this paper is organized as follows. In the next section, we review some fractional calculus preliminaries and state some of their properties that will be used later on. The formulation of the LDG scheme for the logistic equation is established in Section 3, where the algebraic form of the LDG scheme is obtained with the aid of shifted Legendre basis functions. The technique of product approximation is also applied to deal with the nonlinear term in the weak formulation. In Section 4, we establish the numerical stability of the scheme in the linear case and discuss the error estimation. In Section 5, the applicability and utility of the present numerical schemes are verified by performing several simulations on two linear and nonlinear population growth and logistic model problems. Finally, a conclusion is drawn in Section 6.

#### **2. Fractional Calculus**

Now, we present some fundamental and mathematical preliminaries of the fractional calculus theory to be utilized in our subsequent sections, see References [3,4,27].

**Definition 1.** *Let ν* ≥ 0 *be given. The (left) Riemann-Liouville fractional integral operator of order ν is given by*

$$\mathcal{I}^{\nu}f(t) \equiv {}_{a}\mathcal{I}_t^{\nu}f(t) = \begin{cases} \frac{1}{\Gamma(\nu)} \int_a^t f(p)\,(t-p)^{\nu-1}\,dp, & \nu > 0,\ t > 0, \\ f(t), & \nu = 0. \end{cases}$$

The integral operator I<sup>*ν*</sup> has many properties. Among others, we make use of the following

*Entropy* **2020**, *22*, 1328


(3) $$\mathcal{I}^{\nu} t^{\gamma} = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+\nu+1)}\, t^{\nu+\gamma}, \quad \gamma > -1.$$
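Property (3) is easy to check numerically. The sketch below (with arbitrary demo values *ν* = 3/2, *γ* = 2, *t* = 2, *a* = 0, not taken from the paper) compares a midpoint-rule evaluation of the integral against the closed form; *ν* > 1 keeps the kernel (*t* − *p*)<sup>*ν*−1</sup> bounded, so a plain quadrature suffices.

```python
import numpy as np
from math import gamma

nu, g, t, a = 1.5, 2.0, 2.0, 0.0   # demo values; nu > 1 gives a smooth kernel

# Midpoint-rule evaluation of I^nu t^g = (1/Gamma(nu)) int_a^t p^g (t-p)^(nu-1) dp
p = np.linspace(a, t, 200001)
mid = 0.5 * (p[:-1] + p[1:])
dp = np.diff(p)
numerical = np.sum(mid**g * (t - mid)**(nu - 1) * dp) / gamma(nu)

# Closed form from property (3)
closed = gamma(g + 1) / gamma(g + nu + 1) * t**(nu + g)

assert abs(numerical - closed) < 1e-6 * closed
```

For 0 < *ν* < 1 the kernel is weakly singular at *p* = *t* and a graded or singularity-aware quadrature would be needed instead; the exact power rule used in Section 3 avoids that issue entirely.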

The corresponding definition of the right Riemann-Liouville fractional integral, on the interval [*t*, *b*] instead of [*a*, *t*], is given by

$$ {}_{t}\mathcal{I}_b^{\nu} f(t) = \frac{1}{\Gamma(\nu)} \int_t^b f(p)\,(p-t)^{\nu-1}\,dp, \quad \nu > 0,\ t < b.$$

**Definition 2.** *The fractional derivative* D<sup>*ν*</sup> *of f*(*t*) *in the Liouville-Caputo sense is defined as*

$$\mathcal{D}^{\nu}f(t) \equiv {}^{\mathrm{LC}}_{\ a}\mathcal{D}_t^{\nu}f(t) = \begin{cases} \frac{1}{\Gamma(m-\nu)} \int_a^t \frac{f^{(m)}(p)}{(t-p)^{\nu-m+1}}\,dp, & m-1 < \nu < m,\ t > 0, \\ f^{(m)}(t), & \nu = m \in \mathbb{N}. \end{cases}$$

We make use of the following [4]:

$$\mathcal{D}^{\nu}(C) = 0 \quad (C \text{ is a constant}), \tag{4}$$

$$\mathcal{D}^{\nu} t^{\gamma} = \begin{cases} \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1-\nu)}\, t^{\gamma-\nu}, & \text{for } \gamma \in \mathbb{N}_0 \text{ and } \gamma \ge \lceil \nu \rceil, \text{ or } \gamma \notin \mathbb{N}_0 \text{ and } \gamma > \lfloor \nu \rfloor, \\ 0, & \text{for } \gamma \in \mathbb{N}_0 \text{ and } \gamma < \lceil \nu \rceil. \end{cases} \tag{5}$$

Here, ⌈*ν*⌉ and ⌊*ν*⌋ denote the ceiling and floor functions, respectively. It should be noted that the two operators I<sup>*ν*</sup> and D<sup>*ν*</sup> are related through the following expression

$$
\mathcal{D}^{\nu} f(t) = \mathcal{I}^{m-\nu} D^{m} f(t), \quad D = \frac{d}{dt}. \tag{6}
$$

#### **3. Discretized LDG Formulation**

In order to formulate the LDG method for the logistic equation in (3), some basic notations will first be introduced.

Let us consider (3) on *L* = (0, *T*) for some given *T* > 0. To rewrite (3) as a first-order system, we introduce two new variables *z*<sub>0</sub>(*t*) = *X*(*t*) and *z*<sub>1</sub>(*t*) = *dX*(*t*)/*dt* and use the relation (6) to get

$$\begin{cases} z_1(t) - \frac{dz_0(t)}{dt} = 0, \\ {}_0\mathcal{I}_t^{(1-\nu)} z_1(t) - \sigma\, z_0(t)\left(1 - z_0(t)\right) = 0, \\ z_0(0) - X_0 = 0, \end{cases} \tag{7}$$

with *ν* ∈ (0, 1] and *t* ∈ *L*. By Δ we denote a partitioning of the interval *L* into *J* subintervals *L<sub>l</sub>* = (*t*<sub>*l*−1</sub>, *t<sub>l</sub>*) for *l* = 1, . . . , *J*. The grid points of Δ will be denoted as

$$0 = \colon t\_0 < t\_1 < \ldots < t\_{J-1} < t\_J := T \,.$$

By *h<sub>l</sub>* we mean the length of each *L<sub>l</sub>*, that is, *h<sub>l</sub>* = *t<sub>l</sub>* − *t*<sub>*l*−1</sub> for *l* = 1, 2, . . . , *J*. The maximum length of these elements is taken as *h* := max<sub>1≤*l*≤*J*</sub> *h<sub>l</sub>*. We associate the mesh Δ with the broken Sobolev spaces

$$H^1_\Delta = \{ w : L \to \mathbb{R} \mid w|_{L_l} \in H^1(L_l),\ l = 1, 2, \dots, J \},$$

and

$$S_{\Delta} = \{ w : L \to \mathbb{R} \mid w|_{L_l} \in L_2(L_l),\ l = 1, 2, \dots, J \}.$$

By using these function spaces, let us assume that the solutions of system (7) belong to the corresponding spaces

$$\left(z\_0(t), z\_1(t)\right) \in H^1\_\Delta \times \mathcal{S}\_\Delta.$$

It should be noted that the elements of the space *H*<sup>1</sup><sub>Δ</sub> may be discontinuous in *t* at the discrete time levels *t<sub>l</sub>*. In this respect, at the mesh grid points, it is necessary to define the left-sided and right-sided limits of a piecewise continuous function *w* : *L* → ℝ. By *w*<sup>−</sup><sub>*l*</sub> and *w*<sup>+</sup><sub>*l*</sub>, we denote the left- and right-sided limits of *w* at *t<sub>l</sub>*

$$w_l^+ = w^+(t_l) = w(t_l^+) := \lim_{s \to 0^+} w(t_l + s), \quad w_l^- = w^-(t_l) = w(t_l^-) := \lim_{s \to 0^-} w(t_l + s).$$

For any positive integer *r*, we denote by *P<sub>r</sub>*(*L<sub>l</sub>*) the space of polynomials of degree less than or equal to *r* on the element *L<sub>l</sub>* ∈ Δ. We then let the approximate solutions of *z*<sub>0</sub>(*t*), *z*<sub>1</sub>(*t*) belong to a finite-dimensional subspace V<sup>(*r*)</sup> ⊂ *H*<sup>1</sup><sub>Δ</sub>, defined as the space of discontinuous, piecewise polynomial functions

$$\mathcal{V}^{(r)} = \{ w : L \to \mathbb{R} \mid w|_{L_l} \in P_r(L_l),\ l = 1, 2, \dots, J \}.$$

We further define Z<sub>0</sub>(*t*) and Z<sub>1</sub>(*t*) as the DG approximations to the exact solutions *z*<sub>0</sub>(*t*) and *z*<sub>1</sub>(*t*) of the system (7), respectively, on the element *L<sub>l</sub>*. Below, we make use of the following notations

$$(w, v)_l := \int_{L_l} w\, v\, dt, \quad \langle w, v \rangle := \int_0^T w\, v\, dt, \quad \|w\| := \sqrt{\langle w, w \rangle}.$$

For obtaining the weak DG formulation, we first multiply the first equation in (7) by a test function *w*<sub>0</sub> ∈ V<sup>(*r*)</sup> and integrate over *L<sub>l</sub>*. By applying integration by parts, we get

$$
\left(\mathcal{Z}_1(t), w_0\right)_l + \left(\mathcal{Z}_0(t), \frac{dw_0}{dt}\right)_l - \mathcal{Z}_0(t_l^-)\, w_0(t_l^-) + \mathcal{Z}_0(t_{l-1}^+)\, w_0(t_{l-1}^+) = 0. \tag{8}
$$

Hence, the second integral equation in (7) is multiplied by a test function *w*<sub>1</sub> ∈ V<sup>(*r*)</sup> and integrated over *L<sub>l</sub>*. To advance the solution in time, we replace Z<sub>0</sub>(*t*<sup>+</sup><sub>*l*−1</sub>) by the upwind flux Z<sub>0</sub>(*t*<sup>−</sup><sub>*l*−1</sub>) in (8). Thus, the discrete formulation for finding Z<sub>0</sub>, Z<sub>1</sub> ∈ V<sup>(*r*)</sup> takes the following form for all *w*<sub>0</sub>, *w*<sub>1</sub> ∈ V<sup>(*r*)</sup> and *l* = 1, 2, . . . , *J*

$$\begin{cases} \left(\mathcal{Z}_1(t),\, w_0(t)\right)_l + \left(\mathcal{Z}_0(t),\, w_0'(t)\right)_l - \mathcal{Z}_0(t_l^-)\, w_0(t_l^-) + \mathcal{Z}_0(t_{l-1}^-)\, w_0(t_{l-1}^+) = 0, \\ \left({}_0\mathcal{I}_t^{(1-\nu)}\mathcal{Z}_1(t),\, w_1(t)\right)_l - \sigma \left(\mathcal{Z}_0(t),\, w_1(t)\right)_l + \sigma \left(\mathcal{Z}_0^2(t),\, w_1(t)\right)_l = 0, \\ \mathcal{Z}_0(t_0^-) - X_0 = 0. \end{cases} \tag{9}$$

It should be noted that, to start the computations on the first element *L*<sub>1</sub> = (*t*<sub>0</sub>, *t*<sub>1</sub>), we use the given initial condition Z<sub>0</sub>(*t*<sup>−</sup><sub>0</sub>) = *X*<sub>0</sub>. Hence, by utilizing the upwind flux as the natural choice, we are able to solve the resultant equations element by element on each subinterval *L<sub>l</sub>* for *l* = 1, 2, . . . , *J*. In each element, we just need to invert a local matrix of size (*r* + 1) × (*r* + 1) in place of a global matrix of size *J*(*r* + 1) × *J*(*r* + 1).

#### *Algebraic Formulation*

Since the functions in V<sup>(*r*)</sup> may be discontinuous across element interfaces, various local bases can be selected for the finite element approximation in (9). Let us choose a basis in the space *P<sub>r</sub>*(*L<sub>l</sub>*) formed by the functions *φ*<sup>*l*</sup><sub>0</sub>, *φ*<sup>*l*</sup><sub>1</sub>, . . . , *φ*<sup>*l*</sup><sub>*r*</sub>. Thus, the numerical approximations Z<sub>0</sub> of *z*<sub>0</sub> and Z<sub>1</sub> of *z*<sub>1</sub> in every element *L<sub>l</sub>* can be expressed as

$$\mathcal{Z}_0(t) = \sum_{i=0}^{r} \alpha_i^l\, \phi_i^l(t), \quad \mathcal{Z}_1(t) = \sum_{i=0}^{r} \beta_i^l\, \phi_i^l(t), \quad t \in L_l. \tag{10}$$

Here, the coefficients *α*<sup>*l*</sup><sub>*i*</sub>, *β*<sup>*l*</sup><sub>*i*</sub>, *i* = 0, . . . , *r*, denote the degrees of freedom to be sought in each *L<sub>l</sub>*, *l* = 1, . . . , *J*. To proceed, we take the test functions in each element *L<sub>l</sub>* in the form *w<sub>j</sub>* = *φ*<sup>*l*</sup><sub>*j*</sub>(*t*) for *j* = 0, 1, . . . , *r* and *l* = 1, . . . , *J*. Now, by specifying the basis functions as done below, the discrete LDG formulation (9) is reduced to an algebraic system of equations.

For the practical implementation of the LDG scheme (9) for the FLE (3), we use the set of orthogonal Legendre polynomials for the space V<sup>(*r*)</sup>. Let us recall that the *i*th-degree Legendre polynomial *P<sub>i</sub>*(*s*) can be generated by the well-known Rodrigues formula

$$P_i(s) = \frac{1}{2^i\, i!} \frac{d^i}{ds^i}\left(s^2 - 1\right)^i.$$

The Legendre polynomials satisfy the following relations [17]

$$\int_{-1}^{1} P_i(s) P_j(s)\, ds = \frac{2\delta_{ij}}{2i+1}, \quad P_i(1) = 1, \quad P_i(s) = (-1)^i P_i(-s), \quad i, j \ge 0, \tag{11a}$$

$$(2i+1)P\_i(s) = \frac{dP\_{i+1}(s)}{ds} - \frac{dP\_{i-1}(s)}{ds},\tag{11b}$$

where *δ<sub>ij</sub>* denotes the Kronecker delta. The first property shows that this set of polynomials is orthogonal with respect to the weight function *w*(*s*) = 1 on (−1, 1). The Legendre polynomial *P<sub>i</sub>*(*s*) of degree *i* can be explicitly expressed as follows

$$P_i(s) = \sum_{k=0}^{M_i} c_{ik}\, s^{i-2k}, \quad c_{ik} := \frac{(-1)^k}{2^i} \binom{i}{k} \binom{2i-2k}{i},$$

where *Mi* " *i*{2 or p*i* ´ 1q{2, whichever is an integer. Due to the fact that these polynomials are orthogonal on r´1, 1s, we map them onto the element *Ll* by using the following change of variable

$$\mathbf{s} := \frac{2t - t\_{l-1} - t\_l}{h\_l}, \quad t \in L\_l.$$

Let the resultant shifted Legendre polynomials be denoted by L<sub>*i*</sub>(*t*). Thus, the explicit form of L<sub>*i*</sub>(*t*) of degree *i* takes the form

$$\mathbb{L}\_i(t) = \sum\_{k=0}^{M\_i} c\_{ik} \left( \frac{2t - t\_{l-1} - t\_l}{h\_l} \right)^{i-2k}.$$

By means of the binomial formula, one can further simplify the last expression as follows

$$\mathbb{L}_i(t) = \sum_{k=0}^{M_i} \sum_{m=0}^{i-2k} C_{ikm}\, t^m, \tag{12}$$

where the coefficients *Cikm* are defined as

$$C_{ikm} := \frac{(-1)^{i+k+m}\,(2i-2k)!}{2^i\,(i-k)!\,k!\,m!\,(i-2k-m)!} \left(\frac{t_l + t_{l-1}}{t_l - t_{l-1}}\right)^{i-2k} \left(\frac{2}{t_l + t_{l-1}}\right)^m.$$
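As a quick sanity check (not part of the paper), the double sum (12) can be verified against the standard three-term Legendre recurrence evaluated at the mapped argument *s*; the element endpoints below are arbitrary, and the coefficient is implemented with *m*! in the denominator.

```python
import numpy as np
from math import factorial

def legendre(n, s):
    """P_n(s) via the three-term recurrence (n+1)P_{n+1} = (2n+1)s P_n - n P_{n-1}."""
    p0, p1 = np.ones_like(s), s
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * s * p1 - k * p0) / (k + 1)
    return p1

def shifted_legendre(i, t, t0, t1):
    """L_i(t) on (t0, t1) from the double sum (12) with coefficients C_ikm."""
    out = np.zeros_like(t)
    for k in range(i // 2 + 1):                      # k = 0, ..., M_i
        for m in range(i - 2 * k + 1):
            C = ((-1) ** (i + k + m) * factorial(2 * i - 2 * k)
                 / (2 ** i * factorial(i - k) * factorial(k)
                    * factorial(m) * factorial(i - 2 * k - m))
                 * ((t1 + t0) / (t1 - t0)) ** (i - 2 * k)
                 * (2 / (t1 + t0)) ** m)
            out = out + C * t ** m
    return out

t0, t1 = 0.5, 1.0                                    # an arbitrary element L_l
t = np.linspace(t0, t1, 9)
s = (2 * t - t0 - t1) / (t1 - t0)                    # map L_l onto (-1, 1)
for i in range(6):
    assert np.allclose(shifted_legendre(i, t, t0, t1), legendre(i, s))
```

The monomial form (12) is what makes the exact fractional integration in (14) possible, since the power rule (3) applies term by term.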

Now, we choose *φ*<sup>*l*</sup><sub>*i*</sub>(*t*) = L<sub>*i*</sub>(*t*) in (10) for *l* = 1, 2, . . . , *J*, where L<sub>*i*</sub> is the shifted Legendre polynomial of degree *i* in *t* defined in (12). With this transformation, the unknown values *α*<sup>*l*</sup><sub>*i*</sub>, *β*<sup>*l*</sup><sub>*i*</sub> in (10) can be interpreted as the Legendre coefficients of the expansions of Z<sub>0</sub>, Z<sub>1</sub>. Hence, by virtue of the Legendre properties (11) and inserting (10) into the discrete formulation (9), we have, for *l* = 1, . . . , *J*,

$$\begin{aligned} &\sum_{i=0}^{r} \beta_i^l \left(\mathbb{L}_i(t), \mathbb{L}_j(t)\right)_l + \sum_{i=0}^{r} \alpha_i^l \left(\mathbb{L}_i(t), \mathbb{L}_j'(t)\right)_l - \sum_{i=0}^{r} \alpha_i^l + \sum_{i=0}^{r} \alpha_i^{l-1} (-1)^j = 0, \\ &\sum_{i=0}^{r} \beta_i^l \left({}_0\mathcal{I}_t^{(1-\nu)}\mathbb{L}_i(t), \mathbb{L}_j(t)\right)_l - \sigma \sum_{i=0}^{r} \alpha_i^l \left(\mathbb{L}_i(t), \mathbb{L}_j(t)\right)_l + \sigma \left(\Big(\sum_{i=0}^{r} \alpha_i^l\, \mathbb{L}_i(t)\Big)^2, \mathbb{L}_j(t)\right)_l = 0, \end{aligned} \tag{13}$$

for *j* " 0, ... ,*r*. To proceed, we need to deal with two main difficulties involving the integral and nonlinear terms that appear in (13). To tackle the integral term, the properties (1)–(3) of fractional integration in Section 2 is used to obtain

$$ {}_0\mathcal{I}_t^{(1-\nu)}\mathbb{L}_i(t) = \sum_{k=0}^{M_i} \sum_{m=0}^{i-2k} C_{ikm}\, {}_0\mathcal{I}_t^{(1-\nu)} t^m = \sum_{k=0}^{M_i} \sum_{m=0}^{i-2k} C'_{ikm}\, t^{m+1-\nu}, \quad C'_{ikm} := C_{ikm} \frac{\Gamma(m+1)}{\Gamma(m+2-\nu)}.$$

Next, the explicit form (12) is utilized for L<sub>*j*</sub>(*t*), and then <sub>0</sub>I<sup>(1−*ν*)</sup><sub>*t*</sub> L<sub>*i*</sub>(*t*) is inserted into the inner product. Now, by integration over *L<sub>l</sub>*, we obtain

$$d_{i,j} := \left({}_0\mathcal{I}_t^{(1-\nu)}\mathbb{L}_i(t),\, \mathbb{L}_j(t)\right)_l = \sum_{k=0}^{M_i} \sum_{m=0}^{i-2k} \sum_{k'=0}^{M_j} \sum_{m'=0}^{j-2k'} C^{\nu}_{ikmjk'm'} \left(t_l^{m+m'+2-\nu} - t_{l-1}^{m+m'+2-\nu}\right), \tag{14}$$

with the coefficients

$$
C^{\nu}_{ikmjk'm'} := C'_{ikm}\, C_{jk'm'} / (m + m' + 2 - \nu) .
$$

The nonlinear term in (13) can be computed using the Legendre polynomials. For instance, if *r* = 1, we may write it as a product of two vectors

$$n_j^{DC} := \left(\mathcal{Z}_0^2(t), \mathbb{L}_j(t)\right)_l = \left[[\alpha_0^l]^2,\ 2\alpha_0^l \alpha_1^l,\ [\alpha_1^l]^2\right] \cdot \int_{L_l} \left[\mathbb{L}_0^2(t),\ \mathbb{L}_0(t)\mathbb{L}_1(t),\ \mathbb{L}_1^2(t)\right]^T \mathbb{L}_j(t)\, dt,$$

for *j* " 0, 1. Therefore, it is not a difficult task to calculate *nDC <sup>j</sup>* by direct computation (D.C.) using the shifted Legendre polynomials on each *Ll* for different *j*. Of course one may exploit the symbolic toolbox in Matlab to facilitate the process of integration of these polynomials. Alternatively, to handle the nonlinear term in (13), the product approximation (P.A.) technique [42] is used in the following manner

$$\mathcal{Z}_0^2(t) = \left[\sum_{i=0}^{r} \alpha_i^l\, \mathbb{L}_i(t)\right]^2 \approx \sum_{i=0}^{r} [\alpha_i^l]^2\, \mathbb{L}_i(t).$$

This technique enables us to write the nonlinear part as

$$n_j^{PA} := \left(\mathcal{Z}_0^2(t), \mathbb{L}_j(t)\right)_l = \sum_{i=0}^{r} [\alpha_i^l]^2 \left(\mathbb{L}_i(t), \mathbb{L}_j(t)\right)_l. \tag{15}$$

Now, it suffices to calculate the first two terms in (13). To this end, we compute the elements of the mass matrix as

$$m\_{i,j} := \left(\mathbb{L}\_i(t), \mathbb{L}\_j(t)\right)\_l = \int\_{L\_l} \mathbb{L}\_i(t) \mathbb{L}\_j(t) dt = \begin{cases} \frac{h\_l}{2i+1}, & i=j, \\ 0, & i \neq j. \end{cases} \tag{16}$$
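Relation (16) can be verified with Gauss-Legendre quadrature, which is exact here because the integrands are polynomials; the element length *h<sub>l</sub>* and degree *r* below are arbitrary demo values, not taken from the paper.

```python
import numpy as np

def legendre(n, s):
    """P_n(s) via the three-term recurrence."""
    p0, p1 = np.ones_like(s), s
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * s * p1 - k * p0) / (k + 1)
    return p1

hl, r = 0.7, 4                                   # arbitrary element length, degree
s, w = np.polynomial.legendre.leggauss(r + 1)    # exact for degree <= 2r + 1

# (L_i, L_j)_l = (h_l / 2) * int_{-1}^{1} P_i(s) P_j(s) ds after the change of variable
M = np.array([[hl / 2 * np.dot(w, legendre(i, s) * legendre(j, s))
               for j in range(r + 1)] for i in range(r + 1)])

assert np.allclose(M, np.diag([hl / (2 * i + 1) for i in range(r + 1)]))
```

The diagonal structure is why the local systems decouple so cheaply: inverting the mass matrix on each element is a trivial scaling.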

Finally, the entries of the stiffness matrix

$$s_{i,j} = \left(\mathbb{L}_i(t),\, \mathbb{L}_j'(t)\right)_l = \int_{L_l} \mathbb{L}_i(t)\, \mathbb{L}_j'(t)\, dt,$$

need to be calculated. In the new coordinate, we recursively employ the Legendre property (11b) to derive

$$\frac{h_l}{2}\mathbb{L}_{i+1}'(t) = (2i+1)\mathbb{L}_i(t) + (2(i-2)+1)\mathbb{L}_{i-2}(t) + (2(i-4)+1)\mathbb{L}_{i-4}(t) + \cdots.$$

By applying the orthogonality relation (11a) to the preceding equation and then simplifying the involved integral in *si*,*j*, we finally get

$$s_{i,j} = \begin{cases} \ 2, & \text{if } i < j \text{ and } (i+j) \text{ is odd,} \\\ 0, & \text{otherwise.} \end{cases} \tag{17}$$
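The stiffness entries can likewise be checked by exact quadrature: with *s<sub>i,j</sub>* = ∫ L<sub>*i*</sub> L′<sub>*j*</sub> *dt*, the factor *h<sub>l</sub>*/2 from *dt* cancels the 2/*h<sub>l</sub>* from *d*/*dt*, so the computation reduces to the reference interval. The sketch below (arbitrary degree *r*, derivative via the standard recurrence *P′<sub>n</sub>* = *P′*<sub>*n*−2</sub> + (2*n* − 1)*P*<sub>*n*−1</sub>) confirms which (*i*, *j*) pairs give the value 2.

```python
import numpy as np

def legendre(n, s):
    p0, p1 = np.ones_like(s), s
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * s * p1 - k * p0) / (k + 1)
    return p1

def dlegendre(n, s):
    """P_n'(s) from the recurrence P'_n = P'_{n-2} + (2n - 1) P_{n-1}."""
    if n == 0:
        return np.zeros_like(s)
    tail = dlegendre(n - 2, s) if n >= 2 else np.zeros_like(s)
    return tail + (2 * n - 1) * legendre(n - 1, s)

r = 5
s, w = np.polynomial.legendre.leggauss(r + 2)    # exact for these polynomial products

# s_{i,j} = int_{L_l} L_i L_j' dt = int_{-1}^{1} P_i(s) P_j'(s) ds  (h_l cancels)
S = np.array([[np.dot(w, legendre(i, s) * dlegendre(j, s))
               for j in range(r + 1)] for i in range(r + 1)])

for i in range(r + 1):
    for j in range(r + 1):
        expected = 2.0 if (i < j and (i + j) % 2 == 1) else 0.0
        assert abs(S[i, j] - expected) < 1e-10
```

The strictly triangular pattern of **S** reflects that differentiation lowers the polynomial degree and flips parity, as in the recurrence above.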

Using (14)–(17), one may write (13) in matrix-vector form, for *l* = 1, . . . , *J*, as follows

$$\begin{cases} \mathbf{M}\,\boldsymbol{\beta}^{l} + (\mathbf{S} - \mathbf{E})\,\boldsymbol{\alpha}^{l} = \mathbf{b}^{l}, \\ \mathbf{D}\,\boldsymbol{\beta}^{l} - \sigma\, \mathbf{M}\left(\boldsymbol{\alpha}^{l} - \boldsymbol{\alpha}^{2,l}\right) = \mathbf{0}, \end{cases} \tag{18}$$

where the unknown vectors $\boldsymbol{\alpha}^l$, $\boldsymbol{\beta}^l$, and $\boldsymbol{\alpha}^{2,l}$ are defined as

$$\boldsymbol{\alpha}^{l} = \left(\alpha^{l}_{0}, \dots, \alpha^{l}_{r}\right)^{T}, \qquad \boldsymbol{\beta}^{l} = \left(\beta^{l}_{0}, \dots, \beta^{l}_{r}\right)^{T}, \qquad \boldsymbol{\alpha}^{2,l} = \left([\alpha^{l}_{0}]^{2}, \dots, [\alpha^{l}_{r}]^{2}\right)^{T}.$$

Note in (18) that the components of the matrix $\mathbf{E}$ are $e_{i,j} := 1$, while those of $\mathbf{M}$, $\mathbf{S}$, $\mathbf{N}$, and $\mathbf{D}$ are $m_{i,j}$, $s_{i,j}$, $n_{i,j}$, and $d_{i,j}$, respectively, for $i, j = 0, \dots, r$, as defined above. Moreover, the components of the known vector $\mathbf{b}^l$ are

$$b\_i := (-1)^{i+1} \mathcal{Z}\_0(t\_{l-1}^-), \quad i = 0, 1, \dots, r.$$

Clearly, the value of $\mathcal{Z}_0(t^{-}_{l-1})$ is already known from the preceding time interval $L_{l-1}$; on the first time interval it is given by the initial condition $X_0$. The resulting system (18) is a nonlinear algebraic system of equations that has to be solved in each $L_l$ for $l = 1, \dots, J$, for example via Newton-type schemes. Such a scheme converges quadratically whenever the approximation is close to the actual solution of the given nonlinear system. Using the D.C. approach, we likewise arrive at a nonlinear system of equations of the general form $F(\boldsymbol{\alpha}^l, \boldsymbol{\beta}^l) = 0$ to be solved in each interval $L_l$. As we show in the numerical experiments, this approach is more accurate than the corresponding P.A. approach.
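
A minimal sketch of one such Newton solve is given below. The matrices, $\sigma$, and the right-hand side are small illustrative stand-ins, not the actual LDG matrices of the paper, and the Jacobian is assembled analytically since the only nonlinearity is the componentwise square $\boldsymbol{\alpha}^{2,l}$:

```python
import numpy as np

# Newton iteration for a system with the block structure of (18):
#   M b + (S - E) a = rhs,   D b - sigma * M (a - a^2) = 0,
# where a^2 is the componentwise square.  All data below are small
# illustrative stand-ins, NOT the actual LDG matrices.
n = 3
M = np.eye(n)
D = np.eye(n)
S = np.array([[0., 2., 0.],
              [0., 0., 2.],
              [0., 0., 0.]])
E = np.ones((n, n))
sigma = 0.5
rhs = np.array([0.1, 0.2, 0.3])

def F(x):
    """Residual of the coupled system for x = (alpha, beta)."""
    a, bvec = x[:n], x[n:]
    return np.concatenate([M @ bvec + (S - E) @ a - rhs,
                           D @ bvec - sigma * M @ (a - a**2)])

def J(x):
    """Analytic Jacobian; only the a**2 term is nonlinear."""
    a = x[:n]
    return np.block([[S - E, M],
                     [-sigma * M @ (np.eye(n) - 2.0 * np.diag(a)), D]])

x = np.zeros(2 * n)
for _ in range(8):                      # quadratic convergence near the root
    x -= np.linalg.solve(J(x), F(x))

print(f"residual norm after Newton: {np.linalg.norm(F(x)):.2e}")
```

In practice one would repeat such a solve on each interval $L_l$, seeding the iteration with the solution from $L_{l-1}$, which keeps the initial guess close to the root and preserves the quadratic convergence.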

#### **4. Numerical Stability and Error Estimates**

We now establish the stability of the proposed LDG scheme when applied to the logistic equation in the linear case by considering $g(t) = 1$ in (3). In this case we have
