1. Introduction
The Earth’s ionosphere is a very rich open system [1] that interacts with the Solar Wind, the Earth’s magnetosphere, the neutral atmosphere including the troposphere, the cosmic radiation and, very likely, with the Earth’s lithosphere [2].
Due to its relevance for human activities such as navigation and positioning, and for the safety of power plants and pipelines, great efforts have been made to model it and so enable predictions of its behaviour (see, for instance, the book [3], in which various authors review first-principles and empirical ionospheric models). The prediction of ionospheric behaviour, over a great variety of space and time scales, has made great progress in the history of aeronomy and space science since the “discovery of the ionosphere” by Marconi, Appleton, and Barnett in the 1920s. Still, it makes sense to raise the general question of to what extent this behaviour can really be predicted. Since the pioneering studies of Lorenz [4], physicists have realised that even perfectly deterministic systems, the dynamics of which may be written in closed form, show a certain degree of unpredictability, due to the phenomenon of chaos, whenever non-linearity comes into play.
This is also the case for the components of the Earth’s atmosphere [3,5]: in every possible model of a local or global portion of the ionosphere, any predicted quantity is always expected to show some fluctuations, irregular and apparently out of reach of prediction. These space and time irregularities are an effect of the non-linear components of the ionospheric dynamics, which work as a magnifying lens on the effects of the granularity of matter, as happens in fluid turbulence [6]. When building physical, empirical, or machine-learning-based models of the ionosphere, one would like to know whether the error in predicting some quantity can be made arbitrarily small, or whether there is a limit to this predictability. Such issues make perfect sense in the field of tropospheric weather prediction [4,5], so it should be no surprise that they make sense for ionospheric weather and climate as well [7,8,9].
This paper presents some data analysis tools, commonly used to assess the predictability of complex systems, that can be used to study the same aspect of the ionosphere. We suggest applying them to the vertical total electron content (vTEC), a physical quantity widely used to describe the local ionosphere, for which a huge, continuously updated, worldwide data set exists. The vTEC above a ground location of geographical coordinates $(\lambda, \phi)$ is defined as:

$$\mathrm{vTEC}(\lambda, \phi, t) = \int_{0}^{+\infty} N_e(\lambda, \phi, z, t)\, \mathrm{d}z, \qquad (1)$$

where $N_e$ is the number density of free electrons (in (1), $z$ is the altitude and $t$ is the time). The dynamics of the number density $N_e$ of free electrons is very rich, and it should be expected to show all the characteristics defining “complexity” [5], precisely as happens in meteorology. The complex dynamics of $N_e$ is necessarily reflected in a complex evolution of the vTEC. The choice of the vTEC as a proxy of the local ionospheric state makes practical sense: indeed, the total electron content along a general path γ is very useful in the field of ionospheric radio propagation, being proportional to the optical path contribution due to the ionospheric medium along γ [10]. The vTEC above a certain ground location of coordinates $(\lambda, \phi)$ is meant to give an idea of the effect of the local ionospheric medium on radio propagation.
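As a purely illustrative numerical sketch of the vTEC definition as a vertical integral of the free electron density, the Python snippet below integrates an idealised Chapman-layer profile. The profile, its parameter values, the integration limits, and the names `chapman_density` and `vtec_tecu` are assumptions made here for illustration only; they are not taken from the data or models used in this paper.

```python
import math

def chapman_density(z_km, n_max=1.0e12, z_peak_km=300.0, scale_km=50.0):
    """Electron number density [m^-3] of an idealised Chapman layer (illustrative values)."""
    u = (z_km - z_peak_km) / scale_km
    return n_max * math.exp(0.5 * (1.0 - u - math.exp(-u)))

def vtec_tecu(density, z_bottom_km=80.0, z_top_km=2000.0, n_steps=20000):
    """Trapezoidal estimate of the vertical integral of `density`, in TECU."""
    dz_km = (z_top_km - z_bottom_km) / n_steps
    total = 0.5 * (density(z_bottom_km) + density(z_top_km))
    for i in range(1, n_steps):
        total += density(z_bottom_km + i * dz_km)
    electrons_per_m2 = total * dz_km * 1.0e3  # convert km to m
    return electrons_per_m2 / 1.0e16          # 1 TECU = 1e16 electrons/m^2

print(f"vTEC of the toy profile: {vtec_tecu(chapman_density):.2f} TECU")
```

In real applications the vTEC is not obtained this way but is derived from GNSS measurements; the sketch only makes the meaning of the integral concrete.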
Ionospheric complexity may be stated in precise mathematical terms once a representation of the ionospheric medium is chosen. Consider, for instance, the fluid representation of the ionospheric medium (FRIM): the evolution of the free electron number density is described by a system of coupled, time-dependent partial differential equations (PDEs), in which the number density, bulk velocity, and temperature of each chemical component of the ionosphere are involved as classical fields. Moreover, the geomagnetic and geoelectric field equations couple with those fluid PDEs. This would be enough by itself to expect complex dynamics to develop [3], namely high-dimensional chaos [11]. Moreover, the FRIM is not even the most “detailed” representation possible: it is a complex, but still deterministic, picture [3]; representations including fluctuations may be stochastic variations of the FRIM, such as the representation of the sporadic spread F layer in Ref. [12] or the kinetic pictures in Ref. [13].
The complexity of the dynamics just mentioned is expected to be reflected in the vTEC time series, as indeed it is. The evolution of the vTEC with time appears quasi-periodic: the main component of this evolution is the diurnal variability, driven by insolation. Besides this periodicity, however, a huge variety of shapes appears in the vTEC time series, all encoding the complexity of the ionospheric dynamics: this renders the evolution scarcely predictable. By assessing the limit of predictability of the vTEC, if any, one makes a statement on the extent to which the ionospheric medium may be predictably modelled, i.e., represented deterministically [3].
In the present work, the data analysis techniques applied make use of concepts well known in the literature on complex systems, which became popular in the early 1990s in the field of magnetospheric physics (see, e.g., Refs. [14,15] and the later work of Ref. [16] and references therein), but less so in the field of ionospheric dynamics, although important attempts have been made in the past [17]. In particular, three concepts are used in our vTEC analysis: the concept of embedding phase space, that of correlation dimension $D_2$, and that of Kolmogorov entropy $K_2$ [18] (the symbols $D_2$ and $K_2$ refer directly to the way the correlation dimension and Kolmogorov entropy are calculated: see below).
The aforementioned quantities are well defined when one deals with an autonomous finite-dimensional dynamical system Σ, described via its trajectories $\mathbf{x}(t)$ through some phase space $\mathcal{S}$ of finite dimension $m$. The dynamics

$$\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x}) \qquad (2)$$

of Σ determines the local properties of the trajectories: in particular, how “irregular” they are, “filling” a region of $\mathcal{S}$ as chaotic curves, which is described by the correlation dimension; and how fast the information shared by the present state $\mathbf{x}(t)$ with the past ones $\mathbf{x}(t')$, with $t' < t$, is lost, which is given by the Kolmogorov entropy rate.
The physical system in our case is the local ionospheric medium (LIM), whose vTEC is defined in (1), where “local” means “above the given ground coordinates”. As sketched before, the mathematical representations of the ionospheric medium are more complicated than a “simple” m-dimensional Σ: both the FRIM and all the possible kinetic representations of the “dirty plasma” [1,3] have an infinite-dimensional phase space, as they are in practice field theories. In fact, one should think of the ionosphere as a fluid that may be in different conditions, ranging from “laminar” to “turbulent” flows: hence, it may show different behaviours, described via phase spaces of different finite dimension, depending on how many physical modes are “switched on” in the flow conditions at hand. The use of finite-dimensional dynamical systems in fluid dynamics has been well known in the literature since Lorenz defined his paradigmatic 3-dimensional chaotic system as a simplified model of atmospheric convection [4]: of course, the continuum mechanics of the atmosphere is an infinite-dimensional system, as would be the case for a kinetic representation of it. Yet, some selected modes of it, coupled among themselves but decoupled from, e.g., smaller-scale ones, may well be described via a finite-dimensional Σ.
In order to obtain information about the vTEC predictability, namely the predictability of the local ionospheric state, we consider the vTEC time series $y(t)$ as the only physical information available, and look for a “suitable Σ” that can mimic the LIM physics. In particular, we apply the embedding phase space analysis stemming from the important works by Takens and others; see, for instance, Ref. [18] and the many references therein (in particular, Refs. [19,20]). The procedure, well known in the literature and already applied by two of the authors (T.A. and G.C.) to Space Weather research [16], is worth discussing briefly in terms of how well the assumptions behind it fit the LIM dynamics (for technical details, the reader is referred to the quoted references [18,19,20]).
The paper is organized as follows. In Section 2, the dynamical system tools to be applied to the vTEC time series are introduced, and in Section 3 the outcome of their application to two vTEC time series is presented; one series pertains to a year of Solar Minimum and the other to a year of Solar Maximum: this choice is meant to make the analysis explore different helio-geophysical conditions, as solar activity is the main trigger of the ionospheric response, called Space Weather [3]. Finally, Section 4 is devoted to the conclusions and to physical considerations regarding the presented results, as well as to some developments that are under way.
2. Embedding Phase Space, $D_2$ and $K_2$
As stated in Section 1, we are trying to infer some dynamical information about the physical system “LIM around the given location”, being able to work only with the vTEC time series $y(t)$. This is done by constructing a finite-dimensional dynamical system Σ, assumed to be governed by some dynamics as in Equation (2), solely out of the collection of values of $y(t)$. The tools presented here give us the dimension $m$ of the phase space of Σ, and two other quantities, the correlation dimension $D_2$ and the Kolmogorov entropy $K_2$, which characterise the topological structure of the system trajectories and their predictability. Nothing more will be inferred, instead, about the form of the function giving the dynamics of the state as in (2).
These analysis tools are described in the following subsections.
2.1. The Embedding Phase Space
Let us have a certain time series $y(t)$ as the only proxy collected for a given physical system. Being a record of real data, $y$ is known only at discrete times, separated by the sampling time $\Delta t$, within a finite interval of observation. The assumption is that the values taken by the time series are due to the dynamics of a system Σ. In particular, if the system Σ is described by the finite-dimensional state $\mathbf{x}(t)$, at each time the quantity $y$ depends smoothly on all the components of $\mathbf{x}$, i.e., $y(t) = y(\mathbf{x}(t))$. The aim of the data analysis tool described here is “to obtain Σ out of $y$”, i.e., to reconstruct the trajectory $\mathbf{x}(t)$ out of the time series $y(t)$, for $t$ in the interval of observation of $y$. Let us underline again that this system Σ is completely unknown, but some prior hypotheses on it are necessary:
Along the interval of observation, its physics is “stationary”, i.e., no sudden changes in the parameters or in the external forces take place, so that the dynamical structure in the phase space remains the same. In practice, one assumes the state $\mathbf{x}$ to be governed by an autonomous dynamics as in (2), where the function giving the dynamics does not depend explicitly on time;
The dimension $m$ is unknown, and it will be an output of the embedding procedure (needless to say, it must be constant along the interval of observation. In our practical case, since the ionospheric medium may range from hydrostatic equilibrium to bursty turbulence, this should not be taken for granted. Reasonably, one should apply this technique to vTEC data sets in which the degree of turbulence of the LIM is constant, or accept results that are an average over all the different conditions of the LIM met throughout the time series. The latter is precisely the situation of the 1-year-long vTEC series analysed here).
The result regarding whether it is possible to find a reasonable Σ out of $y(t)$ dates back to Takens’ Theorem [18], stating that an m-vector

$$\mathbf{X}(t) = \left( y(t),\, \chi_1[y](t),\, \ldots,\, \chi_{m-1}[y](t) \right), \qquad (3)$$

formed by $y(t)$ and $m-1$ other quantities $\chi_k[y](t)$, functionally depending on $y$ and functionally independent of each other, may work as a good state of Σ, since a smooth 1-to-1 relationship exists between $\mathbf{X}(t)$ and the “true state” $\mathbf{x}(t)$. In the definition (3), the square brackets of $\chi_k[y]$ underline that, even if those quantities depend locally on $t$, they may depend non-locally on $y$, as will be clear from the practical choice of our $\chi_k$s below. Note also that the system state is reconstructed in a possibly proper subinterval of the interval of observation of $y$.
Once the vector $\mathbf{X}(t)$ is reconstructed, i.e., once we know the number $m$ and all the values of $\mathbf{X}(t)$ for $t$ in the interval of reconstruction, the topological properties of the true trajectory $\mathbf{x}(t)$, characterizing the degree of chaos and predictability of Σ, are known, because one holds the curve $\mathbf{X}(t)$, which is in 1-to-1 infinitely differentiable correspondence with $\mathbf{x}(t)$ (the two curves are said to be diffeomorphic to each other).
The chosen form for the reconstructed vector in (3) is

$$\mathbf{X}(t) = \left( y(t),\, y(t+\tau),\, \ldots,\, y(t+(m-1)\tau) \right): \qquad (4)$$

the time $\tau$ is a time lag suitably chosen, so that each component $y(t+k\tau)$ is functionally independent of its neighbours $y(t+(k-1)\tau)$ and $y(t+(k+1)\tau)$. In practice, on the one hand, $\tau$ must be so large that the neighbouring components are indeed independent; on the other hand, $\tau$ cannot be too large because, after all, the components of $\mathbf{X}(t)$ in (4) must all refer “to the same state” of Σ; namely, the non-linear dynamics moving Σ must not have had the time to change its state appreciably during the time spanned by the delays. In our analysis, $\tau$ is chosen as the delay after which the mutual information between $y(t)$ and $y(t+\tau)$ drops below a fixed threshold (the resulting value for the vTEC data at hand is discussed below). Clearly, other choices for $\tau$ might be made: the one adopted here guarantees that the $m$ functions in (3), realised as the delayed values in (4), are functionally independent of each other, and not only linearly independent; moreover, the mutual information is not reduced too much, so that $y(t)$ and $y(t+k\tau)$, with $k = 1, \ldots, m-1$, are still suitable to describe an instantaneous state $\mathbf{X}(t)$.
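The lag selection and the construction of delay vectors described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors’ implementation: the histogram estimator of the mutual information, the number of bins, the use of forward delays, and the names `mutual_information`, `choose_lag`, `delay_embed`, and `threshold_fraction` are all illustrative choices, and the threshold is left as a free parameter because the value used in the analysis is not reproduced here.

```python
import math
import random

def mutual_information(x, y, n_bins=16):
    """Histogram estimate (in nats) of the mutual information of two equal-length sequences."""
    lo_x, hi_x, lo_y, hi_y = min(x), max(x), min(y), max(y)

    def bin_of(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)

    n = len(x)
    joint, px, py = {}, {}, {}
    for a, b in zip(x, y):
        i, j = bin_of(a, lo_x, hi_x), bin_of(b, lo_y, hi_y)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    return sum(c / n * math.log(c * n / (px[i] * py[j])) for (i, j), c in joint.items())

def choose_lag(series, threshold_fraction, max_lag=200):
    """Smallest lag at which the mutual information falls below a fraction of its zero-lag value."""
    reference = mutual_information(series, series)
    for lag in range(1, max_lag + 1):
        if mutual_information(series[:-lag], series[lag:]) < threshold_fraction * reference:
            return lag
    return max_lag  # hypothetical fallback: no lag below threshold was found

def delay_embed(series, dim, lag):
    """Delay vectors (y_i, y_{i+lag}, ..., y_{i+(dim-1)*lag}) built from a scalar series."""
    n_vectors = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n_vectors)]

# Toy usage on a synthetic, strongly autocorrelated signal (not vTEC data).
random.seed(1)
toy = [0.0]
for _ in range(4999):
    toy.append(0.95 * toy[-1] + random.gauss(0.0, 1.0))
lag = choose_lag(toy, threshold_fraction=0.2)
vectors = delay_embed(toy, dim=3, lag=lag)
print(lag, len(vectors))
```

The histogram estimator is the simplest possible choice and is biased for short series; more refined estimators may be preferable on real data.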
Once the lag $\tau$ is established, one needs to fix m to construct the vector $\mathbf{X}(t)$. In the present work, the right embedding dimension is, in a sense, established by looking at what would be the result of guessing it. This guess-and-refine procedure is carried out through the calculation of the correlation dimension $D_2$: various candidate dimensions $d$ are tried, for each of them a candidate $D_2(d)$ is calculated, and the right $m$ is chosen as the one giving the correct $D_2$.
2.2. How Chaotic? Defining $D_2$ and Fixing $m$
The dimension $D_2$ is intended (and calculated) as the Hausdorff dimension of the set of points of the phase space lying along the curve $\mathbf{X}(t)$, and it is reasonable to state that the right m is the smallest one for which $D_2$ reaches a saturation value, as explained below. The fact that one calculates this $D_2$ means that the trajectory $\mathbf{X}(t)$ is expected to develop on an attractor that has a real-valued dimension $D_2$; there is an implicit assumption that the dynamics has a certain amount of chaos, otherwise the attractor should have dimension 1. So, the quantity $D_2$ rather makes sense for irregular evolutions $\mathbf{X}(t)$, giving irregular time series $y(t)$. This is indeed the case of the vTEC time series analysed here, as one can see by looking at the plots in Figure 1 below. Denoting by $n(r)$ the number of points of the trajectory around a given point $\mathbf{X}$ within a (small) neighbourhood of size r, the Hausdorff dimension $D_2$ is defined so that:

$$n(r) \propto r^{D_2}. \qquad (5)$$
In (5) the dimension $D_2$ may depend on the point around which it is calculated (multi-fractal, or locally fractal, attractors), but in this work we are looking for a unique $D_2$ throughout the whole evolution studied. In general, $1 \le D_2 \le m$: the larger $D_2$, the thicker the attractor through which the system trajectory evolves in the phase space, i.e., the more chaotic its dynamics turns out to be. The value $D_2 = 1$ is that of an infinitely regular (smooth) curve, a value of $D_2$ strictly between 1 and $m$ is chaos, while $D_2 = m$ represents a fully stochastic evolution, i.e., pure noise (the idea of smoothness for the necessarily discrete map of any real data is rather loose: on the one hand, it does not make sense to speak about any space of smooth functions for discrete-time evolutions; on the other hand, the solution of a system such as the one in (2), thought to govern Σ, is infinitely differentiable, as the function giving its dynamics is. In practice, the calculation of $D_2$, giving rise to some number between 1 and $m$, with some uncertainty of course, provides us with an idea of how chaotic the dynamics of Σ is. Moreover, “noise” does not mean “white noise” or “Gaussian noise”, but rather fully probabilistic evolution: turbulence has accustomed us to probability distributions that represent anything but “trivial” noise). A regular evolution with $D_2 = 1$ may be fully predicted, while the more chaotic it is, the less predictable it turns out to be. Fully stochastic evolutions with $D_2 = m$ should be treated only via probability.
Coming back to the evaluation of m: once a candidate embedding dimension d is given, it is possible to calculate the first quantity we are interested in, i.e., the correlation dimension $D_2(d)$, out of the curve $\mathbf{X}(t)$ reconstructed in $d$ dimensions. Due to its meaning, one has $D_2(d) \le d$, and it can only grow with d, that is, $D_2(d+1) \ge D_2(d)$. Hence, the procedure of trying larger and larger d stops when the correlation dimension stops growing as the embedding dimension is increased, that is, when $D_2(d) \simeq D_2(d^{*})$ for all $d \ge d^{*}$. Then, the embedding dimension m is chosen as $m = d^{*}$.
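The saturation criterion just described can be sketched as a small Python helper. This is only an illustration under assumptions: the function name `saturation_dimension`, the tolerance used to decide that the correlation dimension has “stopped growing”, and the numbers in the usage example are hypothetical and are not results of this paper.

```python
def saturation_dimension(d2_by_dim, tolerance=0.05):
    """Return (d, D2) for the smallest candidate dimension d after which D2 stops growing.

    `d2_by_dim` maps each candidate embedding dimension to the correlation
    dimension estimated in that embedding; "stops growing" is taken to mean
    an increase of at most `tolerance` when moving to the next candidate.
    """
    dims = sorted(d2_by_dim)
    for d, d_next in zip(dims, dims[1:]):
        if d2_by_dim[d_next] - d2_by_dim[d] <= tolerance:
            return d, d2_by_dim[d]
    return None  # no saturation within the candidates tried

# Made-up values, for illustration only.
candidates = {1: 0.98, 2: 1.90, 3: 2.40, 4: 2.42, 5: 2.43}
print(saturation_dimension(candidates))
```

In practice the tolerance should be compared with the uncertainty of the estimated correlation dimensions, so that saturation is not declared on the basis of noise.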
2.3. How Unpredictable? Defining $K_2$
The other quantity that will be calculated for our system is the Kolmogorov entropy $K_2$, representing, in practice, the amount of precision on the trajectory location that is lost in a single time step of the evolution (this $K_2$, calculated from the reconstructed $\mathbf{X}(t)$, depends on the candidate embedding dimension d, i.e., one expects to have $K_2 = K_2(d)$). This quantity is calculated by considering an $\epsilon$-coarse graining of the reconstructed phase space, so that all the trajectory points are collected in finite-size neighbourhoods, as in Figure 2. By simple point counting, one may calculate the joint probability measure $p(i_1, \ldots, i_n)$ that the system state is in the neighbourhood $i_1$ at time $\Delta t$, in $i_2$ at time $2\Delta t$, and so on: the total Shannon uncertainty about the trajectory location after n time steps is defined as:

$$H_n = -\sum_{i_1, \ldots, i_n} p(i_1, \ldots, i_n)\, \log_2 p(i_1, \ldots, i_n). \qquad (6)$$

With those $H_n$s, one may define the limit

$$K_2 = \lim_{\Delta t \to 0}\, \lim_{\epsilon \to 0}\, \lim_{N \to \infty} \frac{1}{N \Delta t} \sum_{n=0}^{N-1} \left( H_{n+1} - H_n \right), \qquad (7)$$

that is our Kolmogorov entropy (rate), where N is the number of time steps considered. In (7), the limits $\Delta t \to 0$, $\epsilon \to 0$, and $N \to \infty$ indicate, respectively, that the sampling time should be much smaller than the timescales of the dynamics, that the neighbourhood size should be much smaller than the gross variability, and that enough data must be collected.
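The point-counting idea behind the Shannon uncertainty and the entropy rate can be illustrated with a short Python sketch. It is a toy under stated assumptions: the coarse graining is applied to a scalar signal rather than to a reconstructed multi-dimensional trajectory, the rate is estimated at finite word length as a difference of two block entropies (in bits), and the names `coarse_grain`, `block_entropy`, and `entropy_rate` are introduced here for illustration only.

```python
import math
import random
from collections import Counter

def coarse_grain(series, eps):
    """Label each sample by the index of the eps-wide cell that contains it."""
    return [math.floor(v / eps) for v in series]

def block_entropy(symbols, n):
    """Shannon entropy (bits) of the empirical distribution of length-n words."""
    words = Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    total = sum(words.values())
    return -sum(c / total * math.log2(c / total) for c in words.values())

def entropy_rate(symbols, n, dt=1.0):
    """Finite-n estimate of the entropy rate: uncertainty added by one more time step."""
    return (block_entropy(symbols, n + 1) - block_entropy(symbols, n)) / dt

random.seed(0)
coin = [random.randint(0, 1) for _ in range(20000)]  # unpredictable toy signal
cycle = [0, 1, 2, 3] * 250                           # perfectly periodic toy signal
print(entropy_rate(coin, 2), entropy_rate(cycle, 2))
```

The periodic signal gains essentially no uncertainty per step, while the coin-flip signal gains about one bit per step, which is the qualitative distinction the entropy rate is meant to capture.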
Some remarks are necessary on the definition (6) and its use in (7). First of all, the trajectory $\mathbf{X}(t)$, reconstructed via the embedding procedure discussed above, is in principle fully known, so that, strictly speaking, it would make no sense to speak of an ignorance entropy. However, by introducing the $\epsilon$-graining, the resolution of our observation becomes finite, and some uncertainty must be admitted. This uncertainty is not a mere artefact of some entropy fanatic: the quantity $K_2$ defined in (7) is in fact different for different systems, and is larger for more chaotic systems, i.e., systems with a higher degree of chaos as diagnosed via other proxies, for instance Lyapunov exponents. This stated, one accepts that $K_2$, which is measured in bits per unit time and is indeed an information entropy rate, is in practice the inverse of the time after which the ignorance about the system position increases by 1 bit: $K_2^{-1}$ is the timescale within which the behaviour of the system can be accurately predicted.
Regarding the practical interpretation of $K_2$: a fully predictable evolution would show $K_2 = 0$, because it would be predictable forever, as $K_2^{-1} \to \infty$; a chaotic system has a finite $K_2 > 0$, that is, predictability within some finite time $K_2^{-1}$; last but not least, a fully stochastic system shows $K_2 \to \infty$, so that the predictability horizon would be a zero time, $K_2^{-1} \to 0$, and the evolution is somehow a continuous dice rolling.
2.4. Practical Calculation of $D_2$ and $K_2$
In the data analysis performed here, the correlation dimension $D_2$ and the Kolmogorov entropy $K_2$ are calculated through the numerical recipes by Grassberger and Procaccia [21], whose work rendered those abstract quantities more easily calculable in practice [16]. First of all, one defines a correlation integral as

$$C_m(r) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i,j=1}^{N} \Theta\left( r - \lVert \mathbf{X}_i - \mathbf{X}_j \rVert \right). \qquad (8)$$

In (8), $\Theta$ is the Heaviside step function, while the symbol $\lVert \mathbf{X}_i - \mathbf{X}_j \rVert$ is the distance between the two points $\mathbf{X}_i$ and $\mathbf{X}_j$ along the trajectory embedded in a space of dimension m: usually this is calculated as a Euclidean norm, i.e., via the m-dimensional Pythagorean theorem; instead, $r$ is simply a real positive number. Then, one assumes this integral $C_m(r)$ to be a power law in the limit of small r:

$$C_m(r) \propto r^{D_2}.$$

Starting from this correlation integral, the correlation dimension $D_2$ is finally calculated as:

$$D_2 = \lim_{r \to 0} \frac{\log C_m(r)}{\log r}. \qquad (9)$$

Once the quantity in (9) is calculated for the reconstructed trajectory $\mathbf{X}(t)$ of interest, i.e., for the original time series $y(t)$ of real data, the degree of “chaoticity” of the dynamics giving rise to $y(t)$ may be assessed as stated before.
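A brute-force Python sketch of the correlation integral and of the resulting correlation dimension is given here for illustration. It rests on assumptions that are not part of the paper: pairs of coincident indices are excluded and the sum is normalised by the number of distinct pairs, the small-r limit is replaced by a least-squares slope over a few user-chosen radii, and the names `correlation_integral` and `correlation_dimension` are hypothetical. The test sets are synthetic points, not vTEC data.

```python
import math
import random

def correlation_integral(points, r):
    """Fraction of distinct pairs of points whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r over the given radii.

    The radii must be large enough for C(r) to be non-zero, otherwise the
    logarithm is undefined.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_integral(points, r)) for r in radii]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Synthetic check: points on a smooth closed curve versus points filling a square.
random.seed(0)
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(600)]
circle = [(math.cos(a), math.sin(a)) for a in angles]
square = [(random.random(), random.random()) for _ in range(600)]
radii = [0.05, 0.1, 0.2]
print(correlation_dimension(circle, radii), correlation_dimension(square, radii))
```

The double loop scales with the square of the number of points, so for year-long series a neighbour-search structure or subsampling would be needed; the choice of the scaling range of radii is the delicate step on real data.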
The Grassberger–Procaccia method described in Ref. [21] allows us to calculate $K_2$ in practice: a satisfactory approximation of the quantity defined in (7) is indeed obtained by using the correlation integrals in the m and $m+1$ embedding dimensions as

$$K_2 \simeq \frac{1}{\Delta t} \log_2 \frac{C_m(r)}{C_{m+1}(r)}, \qquad (10)$$

where $\Delta t$ is still the sampling time. Equation (10) is a powerful estimate of $K_2$, and it comes directly from inserting Equation (8) into Equation (7), since, as shown by Grassberger and Procaccia [21], for small $r$ and large $m$ the correlation integral scales as

$$C_m(r) \propto r^{D_2}\, 2^{-m\, \Delta t\, K_2}.$$

With the two operative formulas (9) and (10), we are now in the position to apply these tools to the evolution of the vTEC.
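The estimate of the Kolmogorov entropy from the ratio of correlation integrals in two consecutive embedding dimensions can be sketched in Python as follows. This is a minimal sketch under assumptions: a base-2 logarithm is used so that the result is in bits per unit time, a single fixed radius replaces the small-r limit, delays are taken forward with a unit lag by default, and the names `delay_embed`, `correlation_integral`, and `k2_estimate` are illustrative. The signals are synthetic, not vTEC data.

```python
import math
import random

def delay_embed(series, dim, lag=1):
    """Delay vectors (y_i, y_{i+lag}, ..., y_{i+(dim-1)*lag})."""
    n_vectors = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n_vectors)]

def correlation_integral(points, r):
    """Fraction of distinct pairs of points whose Euclidean distance is below r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))

def k2_estimate(series, dim, r, dt, lag=1):
    """Entropy-rate estimate (bits per unit time) from C_dim(r) / C_{dim+1}(r)."""
    c_m = correlation_integral(delay_embed(series, dim, lag), r)
    c_m1 = correlation_integral(delay_embed(series, dim + 1, lag), r)
    return math.log2(c_m / c_m1) / dt

# Synthetic comparison: uniform noise versus a slowly varying sine wave.
random.seed(0)
noise = [random.random() for _ in range(400)]
sine = [math.sin(2.0 * math.pi * i / 50.0) for i in range(400)]
k_noise = k2_estimate(noise, dim=2, r=0.2, dt=1.0)
k_sine = k2_estimate(sine, dim=2, r=0.2, dt=1.0)
print(k_noise, k_sine)  # the inverse of each value is a rough predictability horizon
```

At a fixed radius and a small embedding dimension the estimate does not vanish even for a periodic signal; only the trend for decreasing radius and increasing dimension is meaningful, which is why the regular signal is expected to give a clearly smaller value than the noise rather than exactly zero.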
4. Conclusions, Issues, and Future Research
The ionospheric medium admits different mathematical representations as a dynamical system, ranging from the fluid-mechanical FRIM to kinetic theories. In this paper, a data-driven approach is used to sketch a geometrical representation of the local ionospheric evolution as a finite-dimensional dynamical system. In particular, we have constructed a finite-dimensional phase space through which the state of the local ionosphere moves: this was done by reconstructing an equivalent state evolution via the embedding procedure, starting from a 1-year-long vTEC time series sampled at fixed time resolution. To give some more physical flavour to this construction, and to compare it with what we expect from a gross knowledge of geospace physics, two years of vTEC data were analysed, both above the same mid-latitude GNSS station of Matera: the Solar Maximum year 2001 and the Solar Minimum year 2008, each providing its own vTEC series.
The embedding dimensions of the two phase spaces, for 2001 and 2008, are rather small, and equal: $m = 3$ in both cases, yet enough to host “chaotic” trajectories. As commented before, it is already significant that the local ionospheric medium shows the same dimension, for its phase space, in the Solar Maximum and Solar Minimum years: in principle, this suggests that the vTEC dynamics is produced by a physical autonomous system (2) of three dynamical variables. Of course, the practical construction of those three quantities, e.g., as functions of the quantities describing the FRIM or kinetic theories, is an altogether different problem, and is definitely not dealt with here. Yet, our analysis indicates that the helio-geophysical conditions of the years 2001 and 2008 reduce the infinite-dimensional functional spaces of the FRIM, or of the kinetic theories, to some effective phase space that is de facto three-dimensional.
In order to measure how chaotic and how predictable the two evolutions are, their correlation dimension $D_2$ and Kolmogorov entropy rate $K_2$ have been computed: $D_2$ is a measure of the Hausdorff dimension of the attractor containing the trajectory (treated as a probabilistic evolution with ergodic motion within the attractor); $K_2^{-1}$ is the time after which the uncertainty on the state grows by at least 1 bit, the evolution becoming strictly unpredictable.
The remarkable result is that there are no appreciable differences between the Solar Minimum and Solar Maximum yearly vTEC evolutions, either in the embedding dimension $m$ or in the Hausdorff dimension $D_2$ of the attractor. As announced in Section 3, this is a little surprising, more for $D_2$ than for $m$. During a year of Solar Maximum, many more magnetic storms take place, and we expect the three state variables to describe topologically different trajectories in their phase space during geomagnetically quiet or disturbed periods. This latter fact must be true, as we see that the degree of turbulence and irregularity in the ionospheric medium is different in a quiet period and during a storm.
The question is: why is the variability of $D_2$ with geomagnetic activity not evident in our results?
The fact is that geomagnetic storms took place in both the Solar Maximum year 2001 and the Solar Minimum year 2008, even though in different numbers, but here the two series are simply one-year-long time series, mixing together quiet and stormy periods. We should then interpret the computed $D_2$ as a one-year average of the time-local dimension of the attractor; most likely, the total amount of stormy time in 2001 is negligible or, to this extent, almost the same as in 2008. To check whether and how much the attractor characteristics change from geomagnetically stormy to quiet dynamics, one should analyse stormy and quiet times separately, including shorter time series that contain periods of severe geomagnetic activity. More time-local analyses will be performed in our future work, focusing on the possible differences due to day-time and night-time conditions.
The calculation of $K_2$, with its inverse values (
15), points towards a slightly better predictability of vTEC during the Solar Minimum than during the Solar Maximum (the predictability time horizon is 1 min longer, namely
versus
): this indicates that the Solar Wind impacts on the geospace are able, during the Solar Maximum, to render a more irregular, and hence slightly less predictable, local ionospheric medium.
Regarding the forecast time horizons (
15) applied to the vTEC evolution, an observation arose at the end of Section 3 about the interplay between the forecast horizon of a few minutes and the apparent presence, in the vTEC time series, of a quasi-periodicity of about 24 h (not to mention the seasonal quasi-periodicities). Looking at the vTEC dependence on local time, as shown in some detail in Figure 7, clear 24 h recurrent patterns show up. Recurrence (quasi-periodicity) introduces some predictability: one can expect the vTEC to grow the next morning until local noon, while in the local afternoon an approximate decrease will take place, and this may be stated with great confidence. Yet, the results (
15) state that, analysing what is precisely the behaviour of
y around a certain time
t, the dynamics allows one to infer what
will be at most for
for the year of Solar Minimum, and
for the year of Solar Maximum, which is much less than the 24 h period of the repetitive patterns. The questions are, then, why this kind of recurrent pattern does not show up in our calculated $K_2$, and how one can recognise the amount of predictability that the recurrent patterns introduce, if any. The point is that the whole construction of $D_2$ and $K_2$ involves the full time series, with all its different time-local conditions, and one uses these data to give some very local information in the phase space, such as the Hausdorff dimension $D_2$. The predictability horizon is calculated in the same way, i.e., by making all the values $y(t)$ participate in the calculation of a $K_2$ value that takes into account the whole history of the vTEC analysed. The inputs of such an analysis are all the values of y at every t along the year at hand, and hence all the possible single time-scale components composing the vTEC variability (as a practical example of vTEC scale decomposition, see the empirical orthogonal functions used in Ref. [24]): looking at the plot in Figure 7, one may indeed see that the local time dependence of the vTEC shows a rather smooth “background” 24 h quasi-periodic behaviour, on which smaller-amplitude fluctuations, taking place on shorter timescales, are superimposed (these fluctuations are not shown in Figure 7, in which only the 24 h quasi-periodic “trend” is reported). The quasi-periodicity, induced by the insolation driver, clearly influences the daily-scale component of $y(t)$. Let us call it $y_{\ell}(t)$: one could guess that this $y_{\ell}(t)$ is a much more predictable evolution than the whole $y(t)$, with a much smaller $K_2$ and a longer forecast time horizon. The guess at this point is that the predictability due to the quasi-periodic patterns in $y(t)$ appears only when one analyses the components relative to a time scale ℓ comparable with the period of those patterns. This guess leads the authors to plan to repeat the analysis after performing some time-scale decomposition (e.g., via an empirical mode decomposition) of the vTEC series $y(t)$, to see how the $D_2$ of the single component $y_{\ell}(t)$, and its $K_2$ as well, vary with the scale ℓ. Such an analysis will be performed in future works.
As a final point, we want to stress that all the work in this paper pertains to the mid-latitude location of Matera, during one year of Solar Maximum and one of Solar Minimum. In order to understand which physical agents determine the complexity and (un)predictability of the local ionosphere, it will be necessary to extend this investigation to different locations (different magnetic latitudes and longitudes) and different magnetic activity periods, always taking care to distinguish the different time scales. Such future studies would be worth comparing with the results obtained for example by Ref. [
8] and Ref. [
9], who investigated at middle and low latitudes, respectively, the deterministic chaos present in shorter periods of TEC using Lyapunov exponents and surrogate tests, finding higher values for periods of geomagnetically disturbed conditions than for the quiet ones.
Finally, a very ambitious theoretical goal is to write explicitly the m-dimensional system (2) for the local ionosphere: much more physical and mathematical effort, and a detailed study of the very large series of data available, will be necessary for this purpose.