1. Introduction
Even when they occur very infrequently, rare or extreme events can mediate large transport with significant impact. Examples include the sudden outbreak of devastating infectious diseases, solar flares, extreme weather conditions, floods, forest fires, sudden stock market crashes, flow sensor failures, and bursty gene expression and protein production. The resulting large transport can be either beneficial (e.g., promoting mixing and air circulation by atmospheric jets, or removing toxins) or harmful. For instance, tornadoes cause a lot of damage; in magnetic fusion, plasma confinement is hampered by the intermittent transport of particles and energy from the hot plasma core to the colder plasma boundaries.
Given the damage that these events can cause, finding good statistical methods to predict their sudden onset, or abrupt changes in the system dynamics, is a critical issue. For instance, there are different types of plasma disruptions in fusion plasmas [1], and the current guidance for the minimum required warning time for successful disruption mitigation on ITER is about 30 ms [2]. Increasing the warning time through the early detection of a sudden event will greatly help to ensure sufficient time for a control strategy to minimise harmful effects.
Obviously, the hallmark of the onset of a sudden event is an abrupt dynamical change in the system or data over time (time-variability or large fluctuations), whose proper description requires non-stationary statistical measures such as time-dependent probability density functions (PDFs). By using time-dependent PDFs, we can quantify how the “information” unfolds in time through information geometry. The latter refers to the application of the techniques of differential geometry to probability and statistics in order to define a metric (a notion of length) [3,4,5,6]. The main purpose of this paper is to examine the capability of the information-geometric theory proposed in a series of recent works [7,8,9,10,11,12] in predicting the onset of a sudden event and to compare it with one of the entropy-based information-theoretical measures [13,14,15].
In a nutshell, the information length [7,8] measures the evolution of a system in terms of a dimensionless distance which represents the total number of different statistical states that are accessed by the system (see Section 2.2). The larger the time-variability, the more abrupt the change in the information length; in a statistically stationary state, the information length does not change in time. In fact, recent work [6] has demonstrated the capability of the information length in the early prediction of transitions in fusion plasmas.
In this paper, we mimic the onset of a sudden event by including a sudden perturbation to the system and calculate time-dependent PDFs and various statistical quantities, including the information length and one of the entropy-based information-theoretical measures (information flow) [16,17]. The latter measures the directional information flow between two variables and is more sensitive than mutual information, which measures the correlation between the variables. The point we want to make is that information flow, like any other entropy-based measure, depends solely on entropy, and thus cannot pick up the onset of a sudden event that does not affect entropy, such as a change in the mean value (recall that the entropy is independent of the local arrangement of the probability [3] as well as of the mean value).
We should note that there are many other information-theoretical measures [3,13,14,15,17,18,19,20,21,22,23,24,25,26] that have been used to understand different aspects of complexity, emergent behaviours, etc. in non-equilibrium systems. However, the main purpose of this paper is not to provide an exhaustive exploration of these methods, but to point out a possible limitation of entropy-based information measures in predicting sudden events. Additionally, our intention is not to model the appearance of rare, extreme events (which are nonlinear and non-Gaussian) themselves, but to test the ability of information-theoretical measures to predict the onset of such sudden events.
Specifically, to gain a key insight, we utilise an analytically solvable model—a non-autonomous Kramers equation (for the two variables,
and
)—which enables us to derive exact PDFs and analytical expressions for various statistical measures including entropy, information length and information flow, which are then simulated for a wide range of different parameters. This model is the generalisation of the Kramers equation in [
27] where non-autonomy is introduced by an impulse. The latter is included either in the strength of stochastic noise or by an external impulse input which models a sudden perturbation to the system. Examples are shown in
Figure 1; panel (a) shows the phase portrait of
and
without any impulse, where blue dots are generated by sample stochastic simulations using the Cholesky decomposition [
28]. Panel (b) shows the case where an impulse causes the perturbation in the covariance matrix
while panel (c) is the case where the sudden perturbations affect both covariance matrix
and the mean value
.
The paper is organised as follows: Section 2 introduces a non-autonomous linear system of equations and provides key statistical properties including the information length and information flow. In Section 3, we present the analysis of the non-autonomous Kramers equation and our main theoretical results, referring readers to Appendix A and Appendix B for the detailed steps involved in the derivations. In Section 4 (and also Appendix C), we present simulation results; Section 5 contains our concluding remarks.
To help readers, we summarise our notation in the following. $\mathbb{R}$ is the set of real numbers. $\mathbf{x} \in \mathbb{R}^{n}$ represents a column vector of real numbers of dimension $n$, $\mathbf{A} \in \mathbb{R}^{n \times n}$ represents a real matrix of dimension $n \times n$ (bold-face letters are used to represent vectors and matrices), and $\mathrm{Tr}(\mathbf{A})$ corresponds to the trace of the matrix $\mathbf{A}$. $|\mathbf{A}|$, $\mathbf{A}^{T}$ and $\mathbf{A}^{-1}$ are the determinant, transpose and inverse of the matrix $\mathbf{A}$, respectively. $\partial_t$ is used for the partial derivative with respect to the variable $t$. Finally, the average of a random vector $\mathbf{x}$ is denoted by $\langle \mathbf{x} \rangle$, the angular brackets representing the average.
3. Non-Autonomous Kramers Equation
To demonstrate how IF and IL can be used in the prediction of abrupt changes in system dynamics, we focus on the non-autonomous Kramers equation, as noted in
Section 1. Recall that the original (autonomous) Kramers equation describes the Brownian motion in a potential, for instance, as a model for reaction kinetics [
33]. By including a time-dependent external input
, we generalise this to the following non-autonomous model for the two stochastic variables
Here,
is a short correlated Gaussian noise with a zero mean
and the strength
D with the following property
In this paper, we consider a time-dependent
to incorporate a sudden perturbation in
D as follows
Here, the second term on RHS is an impulse function which takes a non-zero value for a short time interval
a around
;
is used to cover the two cases without and with the impulse.
Furthermore, we are interested in the case where
is also an impulse-like function given by
Here, the impulse is localised around
with the width
c; again
is used to cover the two cases without and with the impulse. To find IL and IF for system (
17) and (
18), we use Proposition 1 and calculate the expressions for
using Equations (
19) and (
20), as shown in
Appendix A.
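As a rough illustration of the impulse-like inputs described above, the following Python sketch builds a constant baseline with an optional localised pulse. The Gaussian pulse shape, its normalisation, and the names `gaussian_impulse`, `noise_strength`, `amplitude`, `center` and `width` are assumptions made for illustration only; they are not taken from Equations (19) and (20).

```python
import numpy as np

def gaussian_impulse(t, amplitude, center, width):
    """Assumed pulse shape: a Gaussian with total area `amplitude`, localised around `center`."""
    return amplitude / (abs(width) * np.sqrt(np.pi)) * np.exp(-((t - center) / width) ** 2)

def noise_strength(t, baseline, amplitude, center, width):
    """Constant baseline plus an optional impulse; amplitude = 0 switches the impulse off."""
    return baseline + gaussian_impulse(t, amplitude, center, width)

# Hypothetical parameter values, chosen only to show the two cases.
t = np.linspace(0.0, 10.0, 2001)
d_const = noise_strength(t, baseline=0.01, amplitude=0.0, center=4.0, width=0.1)
d_pulse = noise_strength(t, baseline=0.01, amplitude=0.5, center=4.0, width=0.1)
```

Setting the amplitude to zero recovers the unperturbed case, so a single function covers the scenarios with and without the impulse.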
Equation (
21) then determines the form of the joint PDF
in Equation (
3) for the two variables
. On the other hand, the marginal PDFs of
and
for Equations (
17) and (
18) are given by
From these PDFs, we can easily obtain the entropy based on the joint and marginal PDFs, respectively, as follows
3.1. Information Length for Equation (17)
We now use Proposition 1 (Equations (
3) for (
17)) and Theorem 1. Since the covariance matrix
as well as the mean values
(see
Appendix A) for the joint PDF involve many terms including special (error) functions, lengthy algebra and numerical simulations (integrations) are required to calculate Equations (8) and (9), respectively. The following thus summarises the main steps only. First, we can show that
for the linear non-autonomous stochastic process (
1) can be rewritten as
We can then show that for Equation (
17), Equation (
26) becomes
By using
and
given in
Appendix A, we calculate (
28). Finally, to calculate IL in Equation (
8), we perform the numerical integration of
over time for the chosen parameters and initial conditions. Results are presented in
Section 4.
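The numerical integration step can be sketched as follows. This is a minimal Python example, assuming the standard Fisher-metric expression for a Gaussian PDF, in which the squared information velocity is $\partial_t\langle\mathbf{x}\rangle^{T}\boldsymbol{\Sigma}^{-1}\partial_t\langle\mathbf{x}\rangle + \tfrac{1}{2}\mathrm{Tr}[(\boldsymbol{\Sigma}^{-1}\partial_t\boldsymbol{\Sigma})^{2}]$; the function names, the finite-difference derivatives and the toy trajectory are illustrative assumptions rather than the procedure used for the results in this paper.

```python
import numpy as np

def information_velocity_sq(mean_dot, cov, cov_dot):
    """Squared information velocity of a Gaussian PDF: mean term plus covariance term."""
    cov_inv = np.linalg.inv(cov)
    mean_term = mean_dot @ cov_inv @ mean_dot
    scaled = cov_inv @ cov_dot
    cov_term = 0.5 * np.trace(scaled @ scaled)
    return mean_term + cov_term

def information_length(times, means, covs):
    """Return the information velocity and its cumulative time integral (trapezoidal rule)."""
    mean_dots = np.gradient(means, times, axis=0)  # finite-difference time derivative of the mean
    cov_dots = np.gradient(covs, times, axis=0)    # finite-difference time derivative of the covariance
    gamma_sq = np.array([
        information_velocity_sq(md, c, cd)
        for md, c, cd in zip(mean_dots, covs, cov_dots)
    ])
    gamma = np.sqrt(np.maximum(gamma_sq, 0.0))
    steps = 0.5 * (gamma[1:] + gamma[:-1]) * np.diff(times)
    return gamma, np.concatenate(([0.0], np.cumsum(steps)))

# Toy trajectory (hypothetical): the mean drifts at speed 0.5 while the standard
# deviation stays at 0.5, so the velocity is 1 and the length grows linearly in time.
times = np.linspace(0.0, 2.0, 201)
means = np.stack([0.5 * times, np.zeros_like(times)], axis=1)
covs = np.tile(0.25 * np.eye(2), (times.size, 1, 1))
gamma, length = information_length(times, means, covs)
```

In practice the analytical mean and covariance would be supplied on a time grid fine enough to resolve the impulse width.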
3.2. Information Flow for Equation (17)
To find the information flow for Equation (
17), we compare it with Equation (
13)
After some algebra using Equation (
28) in Equations (
14) and (
15), we can show (see
Appendix B for derivation)
It is important to note that unlike (
28), Equations (
29) and (
30) depend only on the covariance matrix
, being independent of the mean values, as noted in
Section 1.
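The independence of entropy from the mean can be checked numerically. The sketch below is a minimal Python illustration for a one-dimensional Gaussian PDF (a simplification of the two-variable case considered here); the function name, grid limits and parameter values are hypothetical.

```python
import numpy as np

def gaussian_entropy_numeric(mean, sigma, x):
    """Differential entropy -∫ p ln p dx of a Gaussian, evaluated on a fixed uniform grid x."""
    log_p = -0.5 * ((x - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    p = np.exp(log_p)
    dx = x[1] - x[0]
    return -np.sum(p * log_p) * dx

x = np.linspace(-30.0, 30.0, 60001)                   # same grid for every PDF
s_centered = gaussian_entropy_numeric(0.0, 1.0, x)    # mean 0, width 1
s_shifted = gaussian_entropy_numeric(3.0, 1.0, x)     # mean shifted, same width
s_wide = gaussian_entropy_numeric(0.0, 2.0, x)        # same mean, larger width
s_closed_form = 0.5 * np.log(2.0 * np.pi * np.e)      # closed form for unit variance
```

Shifting the mean leaves the entropy unchanged, whereas widening the PDF raises it, which is why a purely entropy-based measure responds to the covariance but not to the mean.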
4. Simulations
In this section, we present simulation results that show how IF and IL capture abrupt changes in the system dynamics of the Kramers equation. To this end, we designed four simulation scenarios, which are summarised in
Figure 2. The different scenarios were chosen depending on whether
and
(defined in Equations (
19) and/or (
20), respectively) include(s) an impulse function (that is, whether
or 1 and
or 1), which caused the abrupt changes in the values of
and
, respectively. Specifically, Case 1 was without any impulse (
); Cases 2 and 3 were when the impulse was included in
D and
(
and
), respectively; Case 4 was with both impulses (
). As noted at the end of
Section 3, IL and IF in Equation (
28) and Equations (
29) and (
30) clearly reveal that IF was not affected by the change in the mean values. This means that IF took the same value in Cases 1 and 3, and likewise in Cases 2 and 4. This is highlighted in
Figure 2 by the purple colour.
For Cases 1–4 in
Figure 2, we fixed the value of
to be
and varied
to explore different scenarios of no damping
, underdamping
, critical damping
and overdamping
. Furthermore, we fixed the values of the initial covariance matrix as follows
The initial mean values were fixed as
for all Cases.
In addition, we performed the stochastic simulations for Cases 1–4 by using a Cholesky decomposition to generate random numbers [
28] according to the Gaussian statistics
, specified by the values of
and
(
) given in
Appendix A. Simulated random trajectories are shown in blue dots in the phase portrait of
and
in
Figure 3,
Figure 4,
Figure 5,
Figure 6,
Figure 7 and
Figure 8 of the following subsections.
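The Cholesky-based sampling mentioned above can be sketched as follows. This is a minimal Python illustration of drawing Gaussian random vectors with a prescribed mean and covariance; the function name and the numerical values of the mean and covariance are hypothetical and are not the values used in the simulations.

```python
import numpy as np

def sample_gaussian(mean, cov, n_samples, rng):
    """Draw samples from N(mean, cov) by colouring white noise with a Cholesky factor."""
    lower = np.linalg.cholesky(cov)                      # cov = lower @ lower.T
    white = rng.standard_normal((n_samples, mean.size))  # independent unit-variance noise
    return mean + white @ lower.T

rng = np.random.default_rng(0)
mean = np.array([0.5, 0.7])                       # hypothetical mean values
cov = np.array([[0.01, 0.002], [0.002, 0.02]])    # hypothetical covariance matrix
samples = sample_gaussian(mean, cov, 200_000, rng)
```

Repeating this at each time step with the time-dependent mean and covariance yields a cloud of points like the one overplotted on the phase portraits.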
4.1. Information Flow Simulation Results
As noted in
Section 2.3, IF measures a directional information flow in terms of entropy and, unlike transfer entropy, can be either positive or negative. In our simulation experiments, we were interested in how sensitive IF was to abrupt changes. The time-evolutions of IF
,
, joint
and marginal
,
entropies in Equations (
23)–(
25), and the phase portrait of
vs.
are shown in
Figure 3 and
Figure 4. We used the same initial condition
given by Equation (
31) and
while varying the value of
. As noted above, random trajectories from stochastic simulations (using a Cholesky decomposition to generate the random numbers [
28]) were overplotted in blue dots in the phase portraits. Specifically,
Figure 3 and
Figure 4 are for Case 1 and Case 2, respectively (with
and
in (
19), respectively). The exact value of
is shown in
Figure 2 and as a blue dotted line in all panels of
Figure 3 and
Figure 4 (using the y-axis on the right of each panel).
4.1.1. Case 1—Constant D(t) and u(t) = 0
We started with Case 1 which had no perturbation (constant
and
) and examined the effects of the system parameters
on IF. First, with no damping
(
Figure 3a),
and
S all increased monotonically in time from a negative value (a less disordered state) to a positive value (more disordered state) due to the stochastic noise. On the other hand,
and
showed similar behaviours but with opposite sign, making
. The opposite sign of
and
suggests that
acted to increase the marginal entropy of
(by transferring the stochasticity fed into
by
) while
decreased the marginal entropy of
(by providing a restoring/inertial force causing the harmonic oscillations). The fact that
can be corroborated by the similarity between the marginal entropies
and
.
Second, in the underdamped case with
shown in
Figure 3b, the phase portrait exhibited the behaviour of an underdamped harmonic oscillator. The role of the damping
was to bring the system to an equilibrium in the long time limit where PDFs were stationary and
and
S took constant values
as can be shown by using (
A7) in (
23)–(
25). Specifically, in Equation (
5), the first term in RHS (which depended on
) vanished as
while the second term in RHS (which depended on
) determined the value of
which for
was as follows (see Equation (
A7))
The reason why
and
S decreased overall in time is that the equilibrium had a narrower PDF (
) (see Equation (
32)) than the initial PDF (
). Consequently,
Third, in the critical/overdamped case
in
Figure 3c,d, we observed a much faster decrease in
than
as
damps
quickly (recall that
and see (
17)). Consequently, there was a faster and higher transient in
compared with
for larger
, fluctuations in
having a greater effect on the rate of change in the marginal entropy
. It is worth emphasising that our results for
above (e.g., the decrease in entropies) involved the narrowing of a PDF over time. In particular,
and
for a constant
were caused by the change in
from its initial value
to the equilibrium value in Equation (
32) due to
. For a much larger
, Equation (
32) took a larger value than
, and the PDFs broadened over time, with entropies increasing in time, for instance. As a result,
while
.
Appendix C explores how different values of the constant
affect IF. Finally, we note that in the phase portrait plots, the stochastic trajectories shown in blue dots generated by
remained near the trajectories of the mean values.
4.1.2. Case 2—Perturbation in D(t) and u(t) = 0
To study how sensitive IF was to a sudden perturbation in
(therefore in
), we included an impulse function localised around
(see
Figure 2) in
, which is shown as a blue dotted line using the right y axis in Figure 4. As before, Figure 4 shows results for the undamped, underdamped, critically damped and overdamped cases, respectively.
First, in
Figure 4a for
, we observed that, in sharp contrast to
Figure 3a, the impulse rendered large fluctuations in the simulated trajectory
, with significant deviation from the mean trajectory
. On the other hand, such an abrupt change in
led to a rapid increase in
,
and
followed by oscillations. The amplitude of these oscillations slowly decreased in time, the oscillation frequency set by
(as expected for no-damping).
Second, in the underdamped case
shown in
Figure 4b,
and
exhibited some oscillations before reaching the equilibrium, as can also be seen from the phase portrait behaviour. Since the damping was still small, there was a rather long transient. It is interesting to note that
and
flipped their signs (e.g.,
to
around
as
t increased) due to a sudden increase in
D (
). This can be understood since the perturbation applied to
increased marginal entropy
while
decreased the marginal entropy
. As a result, around the time
where
D was maximum, the sign of IF became opposite to that without the perturbation shown in
Figure 3b. Third, for the case
shown in
Figure 4c,d, the sign of
and
behaved similarly to the underdamped case (
Figure 4b). Overall,
Figure 4 shows that
and
exhibited their peaks around
. However, a close examination of the cases with
revealed that the peak of
and
appeared after the peak of the impulse (in blue dotted line). That is, the peaks of
and
followed (rather than preceded) the actual impulse peak. This will be compared with the case of IL in the next section, where the peak of the information length diagnostics
tended to precede the impulse peak, predicting the abrupt changes earlier than IF. Furthermore, IF was independent of external perturbations in
.
4.2. Information Length Diagnostics Simulation Results
In this subsection, we investigate how sensitive the information length diagnostics (
,
) were to the abrupt changes in the system dynamics. In contrast to IF, IL was capable of detecting changes in both mean values (
) and
(
), as can be inferred from Equation (
9). We considered the four Cases 1–4 in
Figure 2 in
Figure 5,
Figure 6,
Figure 7 and
Figure 8, respectively. In each case, we present the results of
,
,
,
,
and the phase portrait of
vs.
(where the stochastic simulations are shown in blue dots). As before, we used the same initial conditions
in Equation (
31) and the same parameter values (
) while varying
for undamped, underdamped, critically damped and overdamped cases. The initial mean values are fixed as
for all Cases.
It is worth noting that (the unperturbed) Case 1 in
Figure 2 corresponded to the usual Kramers equation, previously studied in [
27]. We nevertheless show results for Case 1 below to be able to compare with Cases 2–4 as well as show new results such as
,
, and
that might be useful for understanding the correlation between variables. Note that in the following,
plots are not discussed in each Case, but instead discussed separately in
Section 4.2.5.
4.2.1. Case 1—Constant D(t) and u(t) = 0
In this unperturbed case, our main focus here was on the effects of on , and the marginal information velocities and .
First, for the undamped case
shown in
Figure 5a, harmonic oscillations (e.g., seen in the phase portrait) appeared in
and
, their oscillation frequency determined by
. We recall that
and
are calculated from the marginal PDF of
and
, respectively. Because of the absence of damping,
decreased but never reached 0. The finite value of
is due to
and
as the PDF
evolved according to (
3).
When
in
Figure 5b, a non-zero damping led to
as the PDF reached its equilibrium value while
converged to a finite value. It is worth highlighting that non-zero
and
signified transient behaviour far from equilibrium. Finally, in
Figure 5c,d for
, we observed that a higher value of
led to the shorter duration of transients and larger fluctuations in
.
4.2.2. Case 2—Perturbation in D(t) and u(t) = 0
Figure 6 shows the effect of an impulse like function in
(see (
19)), which then led to an abrupt change in the covariance of the system PDF
given by (
3). Since IL depended on the value of
(see Equation (
9)), this abrupt change in
had a considerable impact on
.
For the case
shown in
Figure 6a, the amplitude of
and
was seen to be increased around the time of the impulse peak. The phase portrait clearly shows the increase in the uncertainty (more scattered data). The values of
and
were also seen to increase due to the perturbation.
For
, the oscillations in
and
were much less pronounced due to damping (see
Figure 6b). This behaviour prevailed also for
shown in
Figure 6c,d. Interestingly, a close examination revealed that the maxima in
and
preceded the peaks of the impulse (shown as a blue dotted line), as alluded to at the end of
Section 4.1.2. This was seen more clearly for larger
in
Figure 6c,d where the maxima in
,
and
all preceded the impulse peaks. These results demonstrate that the information length diagnostics predicted the onset of a sudden event earlier than the information flow.
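The comparison of peak times described above can be made quantitative with a simple lead-time measure. The Python sketch below is illustrative only: the function name `peak_lead_time` and the synthetic Gaussian-shaped signals are assumptions and do not reproduce the simulated diagnostics.

```python
import numpy as np

def peak_lead_time(times, diagnostic, impulse):
    """Time of the impulse peak minus time of the diagnostic peak.

    A positive value means the diagnostic peaks before the impulse (it precedes it);
    a negative value means the diagnostic peaks after the impulse (it follows it).
    """
    t_diagnostic = times[np.argmax(diagnostic)]
    t_impulse = times[np.argmax(impulse)]
    return t_impulse - t_diagnostic

# Synthetic signals (hypothetical): an impulse peaking at t = 4, an early
# diagnostic peaking at t = 3.8 and a late diagnostic peaking at t = 4.3.
times = np.linspace(0.0, 10.0, 1001)
impulse = np.exp(-((times - 4.0) / 0.1) ** 2)
early = np.exp(-((times - 3.8) / 0.1) ** 2)
late = np.exp(-((times - 4.3) / 0.1) ** 2)

lead_early = peak_lead_time(times, early, impulse)
lead_late = peak_lead_time(times, late, impulse)
```

With this sign convention, a diagnostic that anticipates the perturbation gives a positive lead time, and one that only reacts to it gives a negative lead time.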
4.2.3. Case 3—Constant D(t) and Perturbation in u(t)
Figure 7 shows results for a constant
and an impulse-like external input
(see (
20)) which caused an abrupt change in
.
is shown in a red dotted line using the right
y axis.
When
,
Figure 7a shows how the perturbation changed the dynamics of
while
remained unchanged in the phase portrait plot. When a non-zero damping was included in
Figure 7b–d,
,
and
approached zero as
. The phase portrait in
Figure 7b–d shows how the perturbation changed the trajectory temporarily.
Overall, we observed a very large increase in
,
and
(larger increase in
than in
), their peaks forming a little before or around the impulse peak (shown in red dotted line). Besides, the value of
was higher when we had a perturbation on
and a constant
than when
was perturbed and
for
(see it by comparing
Figure 6 to
Figure 7). Furthermore,
was the most affected by the changes in
since
directly depends on
.
Finally, it is important to highlight that the high sensitivity of IL to abrupt changes in u(t) was not shared by IF, which was insensitive to u(t).
4.2.4. Case 4—Perturbations in Both D(t) and u(t)
Case 4 in
Figure 2 is when we added impulse like functions to both
and
(
and
in Equations (
19) and (
20), respectively.). Again, note that
is shown in a red dotted line using the right
y axis. Overall, the phase portraits in
Figure 8 for the undamped, underdamped, critically damped and overdamped scenarios show that the perturbations momentarily broadened the width of PDF (
3) while causing a large deviation of the trajectory of
.
Figure 8a for the undamped case
shows that the perturbations increased the value of
in comparison to Case 3 with
(see
Figure 7a). This is due to the increase in
in Case 4 by the impulse in
, which increased the uncertainty against which the information was measured.
For non-zero damping in
Figure 8b–d, we saw a substantial increment in the amplitude of
(similar to Case 2 but smaller than in Case 3). In fact, in the underdamped, critically damped and overdamped scenarios, the overall behaviour was closer to that observed in Case 2 (see
Figure 6) than to that in Case 3. This is because the increase in mean values due to the impulse
was somewhat compensated by the uncertainty increase due to the impulse in
. This is a consequence of both impulses having the same form, e.g., taking their maximum values at the same time
(see
Figure 2). For instance, if Case 4 were considered with two impulses timed differently, much larger values of
would be expected for Case 4 compared with Case 2. There were obviously differences between Case 2 and Case 4; for instance, in the long time limit
,
in Case 4 was always bigger than that in Case 3. Finally, similar comments as before can be made with regard to the prediction capabilities of the information length diagnostics
.
4.2.5. Interpretation of the Plots
We now discuss the plot of for all Cases 1–4 collectively to point out its usefulness.
First, according to (
9), it is clear that
considered the contribution from the non-independent random variables
,
, and its covariance matrix
to the information changes in time, while
was based on the sum of
from a marginal PDF of
(see Definition 1). Thus plotting
gave an approximation of the contribution from the cross-correlation
to
.
As an example,
Figure 9 shows the simulation of a non-perturbed scenario (
and
) using
,
,
,
and
(underdamped). This example permitted us to compare the evolution/deformation of the width of
(given by Equation (
3)) in the
-
plane with the value of
over time shown in the right panel of
Figure 9.
Figure 9 when
(at
, for instance), shows that the shape of
was a perfect circle (this is because
). For
, the shape of
was deformed according to the value of
. The simulations suggest that the bigger the value of
the higher the correlation between the random variables
and
(
was highly deformed).
In summary, with regard to Cases 1–4, we note two characteristics of the behaviour of
in
Figure 5,
Figure 6,
Figure 7 and
Figure 8. First, the value showed more variation when there was a perturbation on
, for instance when
there were large oscillations that were not present when there was a perturbation on
but not on
. Second, the higher the value of
the smaller the deformation through time of
’s width since
showed fewer changes through time.
5. Concluding Remarks
We have investigated the prediction capability of information theory by focusing on how sensitive information-geometric theory (information length diagnostics) [
7,
8,
9,
10,
11,
12] and one of the entropy-based information theoretical methods (information flow) [
16,
17] are to abrupt changes. Specifically, we proposed a non-autonomous Kramers equation by including sudden perturbations to the system as impulses to mimic the onset of a sudden event, and calculated time-dependent probability density functions (PDFs) and various statistical quantities with the help of numerical simulations. It was explicitly shown that the information flow, like any other entropy-based measure, is insensitive to perturbations which do not affect entropy (such as changes in the mean values). Specifically, the information length diagnostics are very sensitive to perturbations in both the covariance
and mean
of the process while the information flow only detects perturbations in its covariance. Furthermore, we demonstrated that information length diagnostics predict the onset of a sudden event earlier than the information flow; the peaks of
(or
) tend to follow the impulse peak while the peak of the information length diagnostics
tends to precede the impulse peak.
We expect that some of the results presented in this work would be useful in different engineering applications [
34,
35] since linear approximations are often useful [
36] for control engineering applications. For instance, one can develop an information-geometric cost function for control design to achieve a guided self-organisation [
37,
38], instead of using entropy as a cost function [
39]. Given high variabilities involved in complexity and emergent behaviour [
13,
14,
15], it will be interesting to extend this work to investigate the interconnection of the components in a complex system, or causality, and also to consider non-linear, non-Gaussian models or real data.