Article

Identifying Effective Connectivity between Stochastic Neurons with Variable-Length Memory Using a Transfer Entropy Rate Estimator

by João V. R. Izzi 1, Ricardo F. Ferreira 1,*, Victor A. Girardi 1 and Rodrigo F. O. Pena 2,3,*

1 Department of Statistics, Federal University of São Carlos, São Carlos 13565-905, SP, Brazil
2 Department of Biological Sciences, Florida Atlantic University, Jupiter, FL 33458, USA
3 Stiles-Nicholson Brain Institute, Florida Atlantic University, Jupiter, FL 33458, USA
* Authors to whom correspondence should be addressed.
Brain Sci. 2024, 14(5), 442; https://doi.org/10.3390/brainsci14050442
Submission received: 23 March 2024 / Revised: 22 April 2024 / Accepted: 26 April 2024 / Published: 29 April 2024
(This article belongs to the Section Computational Neuroscience and Neuroinformatics)

Abstract: Information theory explains how systems encode and transmit information. This article examines the neuronal system, which processes information via neurons that react to stimuli and transmit electrical signals. Specifically, we focus on transfer entropy to measure the flow of information between sequences and explore its use in determining effective neuronal connectivity. We analyze the causal relationships between two discrete time series, $\mathbf{X} := \{X_t : t \in \mathbb{Z}\}$ and $\mathbf{Y} := \{Y_t : t \in \mathbb{Z}\}$, which take values in binary alphabets. When the bivariate process $(\mathbf{X}, \mathbf{Y})$ is a jointly stationary ergodic variable-length Markov chain with memory no larger than $k$, we show that the null hypothesis of the test (the absence of causal influence) is equivalent to a zero transfer entropy rate. The plug-in estimator for this rate is identified with the log-likelihood ratio test statistic. Since this estimator follows an asymptotic chi-squared distribution under the null hypothesis, it facilitates the calculation of p-values when applied to empirical data. The efficacy of the hypothesis test is illustrated with data simulated from a neuronal network model characterized by stochastic neurons with variable-length memory. The test results identify biologically relevant information, validating the underlying theory and highlighting the applicability of the method in understanding effective connectivity between neurons.

1. Introduction

Estimating the effective connectivity between neurons in the brain is not an easy task [1,2,3,4,5]. There are many ways to unveil causal relationships at their multiple scales, from single neurons to whole brain regions. Experiments with external stimuli are commonly used for this inference process: the spiking activity of one neuron is related to that of a second, connected neuron whenever the perturbation makes this relationship visible [6]. These procedures focus on an improvement in the prediction of the future activity of the second neuron (the receiver) obtained by incorporating information produced by the past activity of the first neuron (the sender of the perturbation), which is interpreted as a causal interaction between these neurons [7].
Admittedly, connectivity estimation is not straightforward due to the noisy nature of neuronal signals. Recordings of electrophysiological patterns in vitro and in vivo reveal that the neuronal activity is highly irregular and difficult to predict [8,9,10]. Intrinsic variability is apparent in the response of neurons, even to frozen stimulation [11,12]. Experimental data suggest that neurons, synapses, and the network system operate in an inherently stochastic framework [13,14,15]. Accordingly, the mathematical description of neuronal phenomena can be treated in probabilistic terms, i.e., describing the process of spiking as a stochastic process.
Determining which stochastic process is most suitable is a matter of debate. It is, however, reasonable to assume that the probability of a neuron spiking is conditioned on its past temporal response; in particular, this probability grows the further in the past the last spike of the neuron in question lies. This suggests that the stochastic process modeling the activity of this neuron is not a Markov chain with full, fixed-order memory, as shown by several works in the literature [16,17,18]. The activity of a neuron can, therefore, be reasonably modeled by a stochastic process whose dependence on the past has variable scope.
The class of Markov chains with variable-length memory became popular in the statistics and probability community with the work of [19]. The processes in this class are still Markovian of finite order, but their transition probabilities do not depend on a fixed number of past states; instead, they take into account the dependency structure actually present in the data. The relevant sequences of past states are called contexts, and the set of contexts can be represented as a rooted tree, namely a context tree. When considering variable-length memory, we obtain more informative models that are flexible and parsimonious compared to Markov chains with full memory. A toy example of such a context tree is sketched below.
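To make the notion of a context tree concrete, here is a minimal Python sketch (our illustration; the tree and its probabilities are hypothetical, not taken from [19]). The tree is stored as a dictionary from contexts to spiking probabilities, and prediction looks up the shortest suffix of the past that is a context:

```python
# A context tree for a binary chain, stored as a dict from context (most
# recent symbol last) to spiking probability.  This chain never needs more
# than 3 past symbols, and after a recent spike ("1") a single symbol suffices.
context_tree = {
    (1,): 0.2,        # last bin contained a spike: short context suffices
    (0, 0, 1): 0.5,   # contexts ending in 0 need deeper history
    (0, 1, 0): 0.6,
    (1, 1, 0): 0.7,
    (0, 0, 0): 0.8,
    (1, 0, 0): 0.8,
}

def spike_probability(past):
    """Return P(X_t = 1 | past) by matching the shortest suffix of `past`
    that is a context in the tree (a sketch of lookup, not of estimation)."""
    for length in range(1, len(past) + 1):
        suffix = tuple(past[-length:])
        if suffix in context_tree:
            return context_tree[suffix]
    raise KeyError("no context matches this past")

print(spike_probability([0, 1, 1]))     # matches context (1,)     -> 0.2
print(spike_probability([0, 0, 1, 0]))  # matches context (0,1,0)  -> 0.6
```

Only six parameters are needed here, whereas a full-memory chain of order 3 would require one parameter per each of the eight length-3 pasts.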
Given a trajectory of a Markov chain with variable-length memory, we can estimate its transition probabilities using, for example, a plug-in estimator. One way of estimating connectivity and disambiguating spurious correlations from actual connections is to infer the information that flows from one neuronal spike train to another. For this, we can use information-theoretic measures [20,21,22], which are functions of these transition probabilities; the estimation of the transition probabilities is therefore essential. In this work, we model neuronal spike trains as Markov chains with variable-length memory and use transfer entropy to understand the transmission of information between neurons over a finite time interval.
Transfer entropy (TE), an information-theoretic measure for quantifying time-directed information transfer between joint processes, was proposed by [20] and independently by [23] as an effective measure of causality. A closely related concept that measures information transport is the transfer entropy rate (TER). These measures can quantify the strength and direction of coupling between simultaneously observed systems [24,25]. Consequently, TE and TER are widely used in neuroscience today to assess connectivity from neuronal datasets [3,26,27,28,29,30,31,32,33]. These measures allow us to study both linear and nonlinear causality relations between neuronal spike trains, described as discrete random processes. In this article, we are interested in the application of these measures to the detection of effective connectivity between neurons with variable-length memory. In other words, we aim to test for the absence of causal influence between neurons. Under suitable conditions, the null hypothesis, which corresponds to the absence of causal influence, is equivalent to the requirement that the transfer entropy rate equals zero [34].
To test the statistical significance of a connectivity value and determine whether a connection is detected, we use the plug-in estimator of the transfer entropy rate, which is identified with the log-likelihood ratio test statistic for the desired test. According to [34,35], this statistic is asymptotically $\chi^2$-distributed under the null hypothesis, facilitating the computation of p-values. In this work, the test is employed in the analysis of spike trains simulated from a space-time framework inspired by the Galves and Löcherbach (GL) model [36], which is built on the simple and biologically plausible assumption that the membrane potential of each neuron is reset every time it spikes. The authors construct a stationary version of the process using probabilistic tools and obtain an explicit upper bound for the correlation between successive inter-spike intervals. This enables the application of the proposed statistical test to samples generated from this model. The effectiveness of the resulting hypothesis test is illustrated on these simulated data, where it identifies interesting and biologically relevant information.
The problem of testing effective connectivity between neurons based on transfer entropy has been considered in the literature using surrogate data [3,37]. In general, generating surrogate data with the same properties as the original data but without dependencies between signals is difficult. When sufficiently large samples can be collected and the test statistic has a known asymptotic distribution, parametric tests are therefore a good alternative. Recently, parametric tests have been used to detect connectivity between neurons using a test statistic based on the plug-in estimator of directed information, assuming that the bivariate process is Markovian with full memory [38,39]. To the best of our knowledge, this is the first time that testing for causality via transfer entropy has been performed in the more general scenario of Markov chains with variable-length memory, based on a transfer entropy rate plug-in estimator. Thus, this work complements existing studies on transfer entropy estimation and effective connectivity detection between neurons.
The remainder of this article is organized as follows. In the next section, we establish our notations and review preliminary definitions and concepts, particularly those concerning the neuronal network model, transfer entropy, and the estimation of the transfer entropy rate. Section 3 introduces the hypothesis test we use to detect causal influence between stochastic neurons. In Section 4, we apply transfer entropy to the identification of effective connectivity between a pair of stochastic neurons using synthetic data generated from the random network model described in Section 2. Lastly, we end this article with our conclusions in Section 5.

2. Notations, Definitions, and Preliminary Notions

In this paper, we denote random variables by uppercase letters, stochastic chains by uppercase bold letters, and the specific values assumed by them by lowercase letters. Calligraphic letters denote the alphabets where random variables take values. Subscripts denote the outcome's position in a sequence; for example, $X_t$ generally indicates the $t$-th outcome of the process $\mathbf{X}$. For any integers $j$ and $k$ such that $j \leq k$, we write $x_j^k$ for finite sequences $x_j, \ldots, x_k$, $x_{-\infty}^{k}$ for left-infinite sequences $\ldots, x_{k-1}, x_k$, and $x_k^{+\infty}$ for right-infinite sequences $x_k, x_{k+1}, \ldots$. We use the convention that if $j > k$, then $x_j^k$ is the empty sequence. We use analogous notations for sequences of random variables.

2.1. Neuronal Spike Trains as Stochastic Processes

Throughout this paper, we assume that we record the neuronal activity over a finite time horizon. The sequence of times at which an individual neuron in the nervous system generates an action potential is termed a spike train. It is useful to consider the times of spike occurrence with a certain degree of accuracy, which is called the bin size [40]. In this sense, the bin size refers to the duration of time over which neural activity is aggregated, or binned, for analysis. For a small enough bin size (10 ms is a typical choice), the spike train may be represented as a binary sequence $x_1^n \in \{0,1\}^n$, where
$$x_t = \begin{cases} 1, & \text{if the neuron spikes at the } t\text{-th bin}, \\ 0, & \text{otherwise}, \end{cases}$$
for every t = 1 , 2 , , n . The appropriate bin size to use depends on the specific experimental design and the characteristics of the data being analyzed. In general, the bin size is chosen to strike a balance between capturing relevant details of the neuronal activity and having sufficient statistical power. This typically involves selecting a bin size that is small enough to capture important features of the data but not so small that the resulting spike counts are noisy or unreliable.
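As a concrete illustration of this discretization step, the following Python sketch bins a list of spike times into a binary sequence (the function name and interface are our own, not from the paper):

```python
import numpy as np

def bin_spike_train(spike_times_ms, duration_ms, bin_size_ms=10.0):
    """Discretize spike times (in ms) into a binary sequence: x[t] = 1 if at
    least one spike falls in the t-th bin, 0 otherwise.  A minimal sketch;
    a real pipeline would also check that the chosen bin size is small
    enough that bins rarely contain more than one spike."""
    n_bins = int(np.ceil(duration_ms / bin_size_ms))
    x = np.zeros(n_bins, dtype=np.uint8)
    idx = (np.asarray(spike_times_ms) // bin_size_ms).astype(int)
    x[idx[idx < n_bins]] = 1
    return x

# Spikes at 3, 12, and 95 ms over a 100 ms recording land in bins 0, 1, and 9.
print(bin_spike_train([3.0, 12.0, 95.0], duration_ms=100.0))
```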
Recordings of neuronal activity reveal irregular spontaneous activity of neurons and variability in their response to the same stimulus [41,42,43,44,45]. Thus, the experimental data suggest that spike trains should be modeled from a probabilistic point of view. In this context, and to give a probability measure describing the process of spiking as a sequential process, we assume that the activity of a neuron is described by a discrete-time homogeneous stochastic chain $\mathbf{X} := \{X_t : t \in \mathbb{Z}\}$ defined on a suitable probability space $(\Omega, \mathcal{F}, P)$, where
$$X_t = \begin{cases} 1, & \text{if the neuron spikes at the } t\text{-th bin}, \\ 0, & \text{otherwise}, \end{cases}$$
for every $t \in \mathbb{Z}$.
In this paper, we assume that the sample spike train is generated by a stochastic source. This means that at each bin, conditional on the whole past, there is a fixed probability of obtaining a spike. Neurons exhibiting this characteristic are arranged in such a way that they share similar biophysical properties and are collectively referred to as stochastic neurons.
The randomness introduced by stochastic neurons can be useful in training neural network models because it can help prevent overfitting and improve the network’s ability to generalize to new data. In this work, we are interested in detecting the effective connectivity between a pair of stochastic neurons using synthetic data generated from such random network models.

2.2. Neuronal Network Model

Let $I$ be a finite set of neurons, and assume that the bins are indexed by the set $\mathbb{Z}$. In this context, the network of neurons is described by a discrete-time homogeneous stochastic chain $\mathbf{X} := \{X_t(i) : i \in I, t \in \mathbb{Z}\}$. For each neuron $i \in I$ at each bin $t \in \mathbb{Z}$,
$$X_t(i) = \begin{cases} 1, & \text{if neuron } i \text{ spikes at the } t\text{-th bin}, \\ 0, & \text{otherwise}. \end{cases}$$
Moreover, whenever we say time $t \in \mathbb{Z}$, it should be interpreted as time bin $t$. For notational convenience, we write the configuration of $\mathbf{X}$ at time $t \in \mathbb{Z}$ as $X_t := \{X_t(i) : i \in I\}$ and the path of $\mathbf{X}$ associated with neuron $i \in I$ as $\mathbf{X}(i) := \{X_t(i) : t \in \mathbb{Z}\}$. We use analogous notation for the observed configuration of $\mathbf{X}$ at time $t \in \mathbb{Z}$ and the observed path of $\mathbf{X}$ associated with a neuron $i \in I$.
In what follows, $P$ denotes the law of the neuronal network $\mathbf{X}$. In this network, the stochastic chain $\mathbf{X}$ has the following dynamics. At each time step, conditional on the whole past, neurons update independently of each other, i.e., for any $t \in \mathbb{Z}$ and any choice $x_t(i) \in \{0,1\}$, $i \in I$, we have
$$P\left(\bigcap_{i \in I} \{X_t(i) = x_t(i)\} \,\Big|\, X_{-\infty}^{t-1} = x_{-\infty}^{t-1}\right) = \prod_{i \in I} P\left(X_t(i) = x_t(i) \,\Big|\, X_{-\infty}^{t-1} = x_{-\infty}^{t-1}\right), \tag{1}$$
where $x_{-\infty}^{t-1}$ is a left-infinite configuration of $\mathbf{X}$.
Moreover, the probability that neuron $i \in I$ spikes at bin $t \in \mathbb{Z}$, conditional on the whole past, is an increasing function of its membrane potential. In other words, for each neuron $i \in I$ at any $t \in \mathbb{Z}$,
$$P\left(X_t(i) = 1 \,\Big|\, X_{-\infty}^{t-1} = x_{-\infty}^{t-1}\right) = \phi\big(v_{t-1}(i)\big), \tag{2}$$
where $v_t(i) \in \mathbb{R}$ denotes the membrane potential of neuron $i \in I$ at time $t \in \mathbb{Z}$ and $\phi : \mathbb{R} \to [0,1]$ is an increasing function called the spiking rate function.
The membrane potential of a given neuron i I is affected by the actions of all other neurons interacting with it. More precisely, the membrane potential of a given neuron i I depends on the influence received from its presynaptic neurons since its last spiking time. In this sense, the probability of neuron i I spiking increases monotonically with its membrane potential. Whenever neuron i I fires, its membrane potential is reset to a resting value, and at the same time, postsynaptic current pulses are generated, modifying the membrane potential of all its postsynaptic neurons. When a presynaptic neuron j I { i } fires, the membrane potential of neuron i I changes. The contribution of neuron j I to the membrane potential of neuron i I is either excitatory or inhibitory, depending on the sign of the synaptic weight of neuron j on neuron i. Moreover, the membrane potential of each neuron in the network is affected by the presence of leakage channels in its membrane, which tends to push its membrane potential toward the resting potential. This spontaneous activity of neurons is observed in biological neuronal networks.
Assuming the above description, we may consider stochastic neurons with several kinds of short-term memory. In this article, we explore a stochastic neuron model inspired by the GL model [36], where neuronal spike trains are prescribed by interacting chains with variable-length memory.
For each neuron $i \in I$ at any bin $t \in \mathbb{Z}$, we can write
$$v_{t-1}(i) = \begin{cases} 0, & \text{if } x_{t-1}(i) = 1, \\[4pt] \beta_i + \dfrac{\sum_{j \in I} \omega_{j \to i} \sum_{s = L_t(i)+1}^{t-1} x_s(j)}{2^{\,t - L_t(i) - 1}}, & \text{otherwise}, \end{cases}$$
where $\omega_{j \to i} \in \mathbb{R}$ is the synaptic weight of neuron $j$ on neuron $i$, $\beta_i \in \mathbb{R}$ is the spontaneous activity of neuron $i$, and $L_t(i)$ is the last spike time of neuron $i \in I$ before time $t \in \mathbb{Z}$, i.e.,
$$L_t(i) := \sup\{s < t : x_s(i) = 1\}, \quad i \in I.$$
Therefore, for each neuron $i \in I$ at any $t \in \mathbb{Z}$, we may rewrite (2) in the following way:
$$P\left(X_t(i) = 1 \,\Big|\, X_{-\infty}^{t-1} = x_{-\infty}^{t-1}\right) = \phi\left(\big(1 - x_{t-1}(i)\big)\left[\beta_i + \dfrac{\sum_{j \in I} \omega_{j \to i} \sum_{s = L_t(i)+1}^{t-1} x_s(j)}{2^{\,t - L_t(i) - 1}}\right]\right). \tag{3}$$
Observe that the spiking probability of a given neuron depends on the accumulated activity of the system after its last spike time. Here, we adopt the convention that $L_t(i) \geq t - K$, where $K$ is a positive integer representing the largest memory length of all stochastic neurons considered in the network. This implies that the time evolution of each single neuron looks like a Markov chain with variable-length memory. This variable-length memory structure is more appropriate from the estimation point of view because it implies that some transition probabilities of the Markov chain of order $K$ are lumped together.
One can show the existence and uniqueness of a stationary stochastic chain X satisfying (1) whose dynamics are given by (3). We refer the interested reader to [36] for a rigorous proof of this result in the GL neuron model.
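For intuition, here is a minimal Python sketch of a two-neuron microcircuit with these dynamics. It is an assumption-laden illustration, not the authors' code: the logistic choice of $\phi$, the random initial past, and all names are ours; the article does not prescribe them.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(v):
    # An increasing spiking-rate function phi: R -> (0, 1).  The article does
    # not fix a particular phi, so this logistic choice is an assumption.
    return 1.0 / (1.0 + np.exp(-v))

def simulate_pair(n, w_xy=10.0, w_yx=0.0, beta=(0.0, 0.0), K=3, phi=logistic):
    """Sketch of the two-neuron GL-style microcircuit of Section 2.2.

    The membrane potential is reset by a spike; otherwise it accumulates the
    weighted presynaptic activity since the last spike L_t(i), truncated at
    t - K (the variable-length memory bound), damped by 2**(t - L - 1).
    """
    x = np.zeros((2, n + K), dtype=np.uint8)
    x[:, :K] = rng.integers(0, 2, size=(2, K))   # arbitrary initial past
    w = np.array([[0.0, w_xy],                   # w[j, i] = weight of j on i
                  [w_yx, 0.0]])
    for t in range(K, n + K):
        for i in range(2):
            if x[i, t - 1] == 1:
                v = 0.0                          # reset right after a spike
            else:
                last = np.flatnonzero(x[i, t - K:t])
                L = t - K + last[-1] if last.size else t - K
                drive = sum(w[j, i] * x[j, L + 1:t].sum() for j in range(2))
                v = beta[i] + drive / 2.0 ** (t - L - 1)
            x[i, t] = rng.random() < phi(v)
    return x[0, K:], x[1, K:]

x, y = simulate_pair(40_000)
print(f"firing proportions: X {x.mean():.2f}, Y {y.mean():.2f}")
```

With these choices, neuron $\mathbf{X}$ receives no input and fires with probability $\phi(0) = 0.5$ at every bin, matching the 50% baseline used in the simulation study of Section 4.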

2.3. Transfer Entropy

In this work, we use transfer entropy to assess connectivity from neuronal datasets. This measure allows us to study causality relations between neuronal spike trains described as discrete random processes. Transfer entropy is a statistical tool used to quantify the directed flow of information between different neurons. Specifically, it measures how much information from one signal helps predict the future of another signal, after accounting for the past of both signals.
Let $(\mathbf{X}, \mathbf{Y}) := \{(X_t, Y_t) : t \in \mathbb{Z}\}$ be a discrete-time jointly homogeneous stochastic chain taking values in the alphabet $\{0,1\}^2$ with distribution $P \in \mathcal{M}$, where $\mathcal{M}$ is the set of Borelian probability measures defined on the usual sigma-algebra generated by the cylinders of $\{0,1\}^{\mathbb{Z}} \times \{0,1\}^{\mathbb{Z}}$. For any positive integer $k$, the $k$-block transfer entropy from $\mathbf{X} := \{X_t : t \in \mathbb{Z}\}$ to $\mathbf{Y} := \{Y_t : t \in \mathbb{Z}\}$ is defined as
$$T(X_1^k \to Y_1^k) := H(Y_k \mid Y_1^{k-1}) - H(Y_k \mid X_1^{k-1}, Y_1^{k-1}),$$
where $H(Y_k \mid Y_1^{k-1})$ is the conditional $k$-block entropy of $\mathbf{Y}$, given by
$$H(Y_k \mid Y_1^{k-1}) := -\sum_{b_1^k} P(Y_1^k = b_1^k) \log P(Y_k = b_k \mid Y_1^{k-1} = b_1^{k-1}),$$
and $H(Y_k \mid X_1^{k-1}, Y_1^{k-1})$ is the causally conditional $k$-block entropy of $\mathbf{X}$ on $\mathbf{Y}$, defined as
$$\begin{aligned} H(Y_k \mid X_1^{k-1}, Y_1^{k-1}) &:= -\sum_{b_1^k} \sum_{a_1^{k-1}} P(X_1^{k-1} = a_1^{k-1}, Y_1^k = b_1^k) \log \frac{P(X_1^{k-1} = a_1^{k-1}, Y_1^k = b_1^k)}{P(X_1^{k-1} = a_1^{k-1}, Y_1^{k-1} = b_1^{k-1})} \\ &= -\sum_{b_1^k} \sum_{a_1^{k-1}} P(X_1^{k-1} = a_1^{k-1}, Y_1^k = b_1^k) \log P(Y_k = b_k \mid X_1^{k-1} = a_1^{k-1}, Y_1^{k-1} = b_1^{k-1}). \end{aligned}$$
Throughout this paper, "log" denotes the natural logarithm, and, by convention, we take $T(X_1 \to Y_1) := H(Y_1) - H(Y_1) = 0$.
Unlike mutual information, transfer entropy is, in general, asymmetric, i.e., $T(X_1^k \to Y_1^k) \neq T(Y_1^k \to X_1^k)$. The asymmetry of transfer entropy stems from the causally conditional entropy, which quantifies the entropy of $\mathbf{Y}$ conditioned on the causal part of $\mathbf{X}$ in addition to the history of $\mathbf{Y}$. We say that $\mathbf{X}$ has no causal influence on $\mathbf{Y}$ when the causally conditional entropy equals the conditional entropy of $\mathbf{Y}$; in this case, the transfer entropy is zero. Therefore, with this measure, we can quantify the strength and direction of the information flow between simultaneously observed systems.
Although transfer entropy is a measure widely used in neuroscience to quantify the amount of information that flows from one spike train to another, it only considers a finite block of states. In this sense, transfer entropy estimation is sensitive to faulty observations, which may lead to the identification of false causality. For a more comprehensive understanding of the system’s behavior, we may consider the estimation of an information flow rate. This idea leads to the following definition of the transfer entropy rate.
Since $(\mathbf{X}, \mathbf{Y})$ is a jointly stationary ergodic finite-alphabet process, we can define the transfer entropy rate from $\mathbf{X}$ to $\mathbf{Y}$ as
$$T(\mathbf{X} \to \mathbf{Y}) = \lim_{k \to \infty} T(X_1^k \to Y_1^k).$$
The existence of the limit can be checked as follows:
$$\begin{aligned} T(\mathbf{X} \to \mathbf{Y}) &= \lim_{k \to \infty} T(X_1^k \to Y_1^k) \\ &= \lim_{k \to \infty} \left[ H(Y_k \mid Y_1^{k-1}) - H(Y_k \mid X_1^{k-1}, Y_1^{k-1}) \right] \\ &= \lim_{k \to \infty} H(Y_k \mid Y_1^{k-1}) - \lim_{k \to \infty} H(Y_k \mid X_1^{k-1}, Y_1^{k-1}) \\ &= H(Y_0 \mid Y_{-\infty}^{-1}) - H(Y_0 \mid X_{-\infty}^{-1}, Y_{-\infty}^{-1}), \end{aligned}$$
where $H(Y_0 \mid Y_{-\infty}^{-1})$ is the entropy rate $H(\mathbf{Y})$ of the process $\mathbf{Y}$ and $H(Y_0 \mid X_{-\infty}^{-1}, Y_{-\infty}^{-1})$ is the causally conditional entropy rate $H(\mathbf{Y} \,\|\, \mathbf{X})$. Thus,
$$T(\mathbf{X} \to \mathbf{Y}) = H(\mathbf{Y}) - H(\mathbf{Y} \,\|\, \mathbf{X}).$$
The following proposition shows that, under appropriate conditions, the transfer entropy rate can be expressed in a simpler form.
Proposition 1. 
Suppose $(\mathbf{X}, \mathbf{Y})$ is a jointly stationary ergodic finite-alphabet variable-length Markov chain with memory no larger than $k$ and with an arbitrary initial distribution. If, in addition, $\mathbf{Y}$ is also a variable-length Markov chain with memory no larger than $k$, then the transfer entropy rate $T(\mathbf{X} \to \mathbf{Y})$ exists and equals
$$T(\mathbf{X} \to \mathbf{Y}) = I(Y_0; X_{-k}^{-1} \mid Y_{-k}^{-1}) := H(Y_0 \mid Y_{-k}^{-1}) - H(Y_0 \mid X_{-k}^{-1}, Y_{-k}^{-1}).$$
Proof. 
Since $(\mathbf{X}, \mathbf{Y})$ is a jointly stationary ergodic finite-alphabet process, the existence of $T(\mathbf{X} \to \mathbf{Y})$ is guaranteed. If $(\mathbf{X}, \mathbf{Y})$ is a variable-length Markov chain with memory no larger than $k$ and with all positive transitions, then
$$\begin{aligned} H(Y_0 \mid X_{-\infty}^{-1}, Y_{-\infty}^{-1}) &= -\sum_{a_{-\infty}^{-1}} \sum_{b_{-\infty}^{0}} P(X_{-\infty}^{-1} = a_{-\infty}^{-1}, Y_{-\infty}^{0} = b_{-\infty}^{0}) \log P(Y_0 = b_0 \mid X_{-\infty}^{-1} = a_{-\infty}^{-1}, Y_{-\infty}^{-1} = b_{-\infty}^{-1}) \\ &= -\sum_{a_{-k}^{-1}} \sum_{b_{-k}^{0}} P(X_{-k}^{-1} = a_{-k}^{-1}, Y_{-k}^{0} = b_{-k}^{0}) \log P(Y_0 = b_0 \mid X_{-k}^{-1} = a_{-k}^{-1}, Y_{-k}^{-1} = b_{-k}^{-1}) \\ &= H(Y_0 \mid X_{-k}^{-1}, Y_{-k}^{-1}). \end{aligned}$$
If, in addition, the process $\mathbf{Y}$ is itself a variable-length Markov chain with memory no larger than $k$, then, in a very similar way, we can show that the entropy rate $H(\mathbf{Y})$ is simply $H(Y_0 \mid Y_{-k}^{-1})$. Therefore,
$$T(\mathbf{X} \to \mathbf{Y}) = H(Y_0 \mid Y_{-k}^{-1}) - H(Y_0 \mid X_{-k}^{-1}, Y_{-k}^{-1}) = I(Y_0; X_{-k}^{-1} \mid Y_{-k}^{-1}).$$
   □
Note that $T(\mathbf{X} \to \mathbf{Y}) = 0$ if and only if each $Y_i$, given its own past $Y_{-\infty}^{i-1}$, is conditionally independent of $X_{-\infty}^{i-1}$. In other words, the transfer entropy rate is zero exactly in the absence of causal influence.

2.4. Transfer Entropy Rate Estimation

Since we generally do not have access to the probability distributions of the stationary processes whose possible causal relations are investigated, many methods exist to estimate the transfer entropy rate. This is particularly relevant when recording neuronal and network signals (with no external stimulation, or with the same stimulation applied to all neurons, so that their activity is stationary) and inferring causal relationships from them, especially when dealing with different data formats. For a thorough review, we refer the reader to [20,46,47]. In this paper, we consider a plug-in estimator for the transfer entropy rate $T(\mathbf{X} \to \mathbf{Y})$ between the jointly stationary ergodic chains $\mathbf{X}$ and $\mathbf{Y}$ (see Section 2.2).
Consider positive integers $k$ and $n$ such that $k \leq n$, and a given finite sample $(x_{-k+1}^n, y_{-k+1}^n) \in \{0,1\}^{n+k} \times \{0,1\}^{n+k}$ from the jointly stationary ergodic chain $(\mathbf{X}, \mathbf{Y})$ with joint distribution $P \in \mathcal{M}$. In this context, for any sequences $a_0^k \in \{0,1\}^{k+1}$ and $b_0^k \in \{0,1\}^{k+1}$, we define the plug-in estimate of $P$ as
$$\hat{P}_n^{(k)}(a_0^k, b_0^k) := \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}\left\{ x_{i-k}^{i} = a_0^k,\; y_{i-k}^{i} = b_0^k \right\},$$
where $\mathbb{I}$ denotes the indicator function.
Note that $\hat{P}_n^{(k)}$ defines a probability measure on $\{0,1\}^{k+1} \times \{0,1\}^{k+1}$ induced by the sample $(X_{-k+1}^n, Y_{-k+1}^n)$ from $(\mathbf{X}, \mathbf{Y})$. In this context, if $(\hat{X}_{-k}^0, \hat{Y}_{-k}^0) \sim \hat{P}_n^{(k)}$, we may define the plug-in estimate of the transfer entropy rate $T(\mathbf{X} \to \mathbf{Y})$ as
$$\hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y}) := T(\hat{X}_{-k}^0 \to \hat{Y}_{-k}^0).$$
Since $(\mathbf{X}, \mathbf{Y})$ is a jointly stationary ergodic chain with distribution $P \in \mathcal{M}$, we have, by the ergodic theorem,
$$\lim_{n \to \infty} \hat{P}_n^{(k)}(a_0^k, b_0^k) = P(X_{-k}^0 = a_0^k, Y_{-k}^0 = b_0^k), \quad P\text{-a.s.},$$
for every positive integer $k$ and $(a_0^k, b_0^k) \in \{0,1\}^{k+1} \times \{0,1\}^{k+1}$. Thus, $P$-almost surely,
$$\lim_{k \to \infty} \lim_{n \to \infty} \hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y}) = \lim_{k \to \infty} \lim_{n \to \infty} T(\hat{X}_{-k}^0 \to \hat{Y}_{-k}^0) = \lim_{k \to \infty} T(X_{-k}^0 \to Y_{-k}^0) = T(\mathbf{X} \to \mathbf{Y}).$$
As discussed in Section III.2 of [48], p. 174, we can take a single limit by considering $k$ as a function of $n$. If $k(n) \to +\infty$ as $n \to +\infty$ and $k(n) \leq \frac{\log n}{2}$, then the sequence $\{k(n) : n \geq 1\}$ is admissible for $(\mathbf{X}, \mathbf{Y})$ in the sense that
$$\lim_{n \to \infty} \hat{P}_n^{(k(n))}\left(a_0^{k(n)}, b_0^{k(n)}\right) = P\left(X_{-k(n)}^0 = a_0^{k(n)}, Y_{-k(n)}^0 = b_0^{k(n)}\right), \quad P\text{-a.s.}$$
Therefore, $P$-almost surely,
$$\lim_{n \to \infty} \hat{T}_n^{(k(n))}(\mathbf{X} \to \mathbf{Y}) = \lim_{n \to \infty} T\left(\hat{X}_{-k(n)}^0 \to \hat{Y}_{-k(n)}^0\right) = \lim_{n \to \infty} T\left(X_{-k(n)}^0 \to Y_{-k(n)}^0\right) = T(\mathbf{X} \to \mathbf{Y}).$$
The asymptotic behavior of $\hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y})$ can also be described in terms of its probability distribution. According to [34], if $\mathbf{X}$ has no causal influence on $\mathbf{Y}$, equivalently, if $T(\mathbf{X} \to \mathbf{Y}) = 0$, then $2n\hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y})$ has an asymptotic $\chi^2(d)$ distribution, where the number of degrees of freedom is $d = 2^k(2^k - 1)$.
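A minimal Python sketch of this plug-in estimator follows. It is our own illustration under the stated assumptions (the function name is hypothetical, and in practice $k$ should respect the admissibility condition $k \leq \frac{\log n}{2}$ above):

```python
import numpy as np
from collections import Counter

def plugin_ter(x, y, k):
    """Plug-in estimate of the transfer entropy rate T(X -> Y), natural log.

    Empirical (k+1)-block frequencies replace the true distribution, and the
    rate is computed as H(Y_0 | Y_past) - H(Y_0 | X_past, Y_past) with pasts
    of length k.  A minimal sketch for binary sequences of equal length.
    """
    x, y = np.asarray(x), np.asarray(y)
    n = len(x) - k                       # number of (k+1)-blocks in the sample
    blocks = Counter()                   # keys: (x-past, y-past, current y)
    for i in range(k, len(x)):
        blocks[(tuple(x[i - k:i]), tuple(y[i - k:i]), int(y[i]))] += 1

    def cond_entropy(context_of):
        # H(Y_0 | context) = -sum over (context, y0) of P(context, y0)
        # times log P(y0 | context), with all probabilities empirical.
        ctx, pair = Counter(), Counter()
        for (xp, yp, y0), c in blocks.items():
            g = context_of(xp, yp)
            ctx[g] += c
            pair[(g, y0)] += c
        return -sum((c / n) * np.log(c / ctx[g]) for (g, y0), c in pair.items())

    h_y = cond_entropy(lambda xp, yp: yp)          # condition on Y-past only
    h_yx = cond_entropy(lambda xp, yp: (xp, yp))   # condition on both pasts
    return h_y - h_yx

# Sanity check: Y is a one-bin-delayed copy of an i.i.d. X, so information
# flows strongly X -> Y (about log 2) and essentially not at all Y -> X.
rng = np.random.default_rng(1)
x = rng.integers(0, 2, 10_000)
y = np.roll(x, 1)
print(plugin_ter(x, y, k=3), plugin_ter(y, x, k=3))
```

The printed pair also illustrates the asymmetry of transfer entropy discussed in Section 2.3.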

3. Hypothesis Test

Consider the problem of testing whether the binary time series generated by the process $\mathbf{X}$ has a causal influence on $\mathbf{Y}$. In the present context, this corresponds to testing the null hypothesis that each random variable $Y_i$ is conditionally independent of $X_{i-k}^{i-1}$ given $Y_{i-k}^{i-1}$, within the larger hypothesis that the jointly stationary and ergodic process $(\mathbf{X}, \mathbf{Y})$ is a variable-length Markov chain of order no larger than $k$ and with all positive transitions.
Formally, each positive transition matrix $Q = Q_\theta$ for the process $(\mathbf{X}, \mathbf{Y})$ can be indexed by a parameter vector $\theta$ taking values in a $3 \times 2^{2k}$-dimensional open set $\Theta$. The null hypothesis, corresponding to each $Y_i$ being conditionally independent of $X_{i-k}^{i-1}$ given $Y_{i-k}^{i-1}$, is described by the transition matrices $Q_\theta$ that can be expressed as
$$Q_\theta(a_0, b_0 \mid a_{-k}^{-1}, b_{-k}^{-1}) = Q_\theta^x(a_0 \mid a_{-k}^{-1}, b_{-k}^{0})\, Q_\theta^y(b_0 \mid b_{-k}^{-1}), \quad (a_{-k}^0, b_{-k}^0) \in \{0,1\}^{k+1} \times \{0,1\}^{k+1}. \tag{4}$$
This collection of transition matrices can be indexed by parameters in a lower-dimensional parameter set $\Phi$, which is an open subset of $\mathbb{R}^{2^k(2^{k+1}+1)}$ and can be naturally embedded within $\Theta$ via a map $h : \Phi \to \Theta$, with the property that all induced transition matrices $Q_{h(\phi)}$ satisfy the conditional independence property in (4).
To test the null hypothesis $\Phi$ within the general model $\Theta$, we employ a likelihood ratio test. The log-likelihood function $L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n)$ of $\theta$ given a sample $(x_{-k+1}^n, y_{-k+1}^n)$ from the joint process $(\mathbf{X}, \mathbf{Y})$ can be expressed as
$$L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n) = \log P_\theta\left(X_1^n = x_1^n, Y_1^n = y_1^n \,\Big|\, X_{-k+1}^0 = x_{-k+1}^0, Y_{-k+1}^0 = y_{-k+1}^0\right) = \log \prod_{i=1}^{n} Q_\theta\left(x_i, y_i \mid x_{i-k}^{i-1}, y_{i-k}^{i-1}\right),$$
where $P_\theta$ denotes the law of $(\mathbf{X}, \mathbf{Y})$ with transition matrix $Q_\theta$. The likelihood ratio test statistic is then the difference
$$\Delta_n = 2\left[\max_{\theta \in \Theta} L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n) - \max_{\phi \in \Phi} L_n(h(\phi) \mid x_{-k+1}^n, y_{-k+1}^n)\right].$$
For our purposes, a key observation is that the statistic $\Delta_n$ is exactly equal to $2n$ times the plug-in estimator $\hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y})$.
Proposition 2. 
If $(\mathbf{X}, \mathbf{Y})$ is a variable-length Markov chain of memory no larger than $k$, with an all-positive transition matrix $Q$ on the finite alphabet $\{0,1\} \times \{0,1\}$ and with an arbitrary initial distribution, then
$$\Delta_n = 2n\, \hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y}).$$
Proof. 
The first maximum in the definition of $\Delta_n$ can be expressed as
$$\max_{\theta \in \Theta} L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n) = \max_{\theta \in \Theta} \log \prod_{i=1}^{n} Q_\theta\left(x_i, y_i \mid x_{i-k}^{i-1}, y_{i-k}^{i-1}\right) = \max_{Q} \sum_{i=1}^{n} \log Q\left(x_i, y_i \mid x_{i-k}^{i-1}, y_{i-k}^{i-1}\right),$$
where the last maximization is over all transition matrices $Q$ with all positive entries. Thus,
$$\begin{aligned} \max_{\theta \in \Theta} L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n) &= \max_{Q} \sum_{a_0^k} \sum_{b_0^k} n \hat{P}_n^{(k)}(a_0^k, b_0^k) \log Q(a_k, b_k \mid a_0^{k-1}, b_0^{k-1}) \\ &= -n \left[ \min_{Q} \sum_{a_0^k} \sum_{b_0^k} \hat{P}_n^{(k)}(a_0^k, b_0^k) \log \frac{\hat{P}_n^{(k)}(a_k, b_k \mid a_0^{k-1}, b_0^{k-1})}{Q(a_k, b_k \mid a_0^{k-1}, b_0^{k-1})} - \sum_{a_0^k} \sum_{b_0^k} \hat{P}_n^{(k)}(a_0^k, b_0^k) \log \hat{P}_n^{(k)}(a_k, b_k \mid a_0^{k-1}, b_0^{k-1}) \right]. \end{aligned}$$
The above minimum is achieved by making
$$\sum_{a_0^k} \sum_{b_0^k} \hat{P}_n^{(k)}(a_0^k, b_0^k) \log \frac{\hat{P}_n^{(k)}(a_k, b_k \mid a_0^{k-1}, b_0^{k-1})}{Q(a_k, b_k \mid a_0^{k-1}, b_0^{k-1})} = 0,$$
namely, when
$$\hat{P}_n^{(k)}(a_k, b_k \mid a_0^{k-1}, b_0^{k-1}) = Q(a_k, b_k \mid a_0^{k-1}, b_0^{k-1}).$$
Therefore,
$$\max_{\theta \in \Theta} L_n(\theta \mid x_{-k+1}^n, y_{-k+1}^n) = n\left[H(\hat{X}_{-k}^{-1}, \hat{Y}_{-k}^{-1}) - H(\hat{X}_{-k}^{0}, \hat{Y}_{-k}^{0})\right],$$
where $(\hat{X}_{-k}^0, \hat{Y}_{-k}^0) \sim \hat{P}_n^{(k)}$.
The computation of the second maximum in the definition of $\Delta_n$ reduces to two separate maximizations. Under the null hypothesis, $Q$ admits the decomposition in (4), so that
$$\begin{aligned} \max_{\phi \in \Phi} L_n(h(\phi) \mid x_{-k+1}^n, y_{-k+1}^n) &= \max_{Q_x} \max_{Q_y} \sum_{i=1}^{n} \log\left[ Q_x(x_i \mid x_{i-k}^{i-1}, y_{i-k}^{i})\, Q_y(y_i \mid y_{i-k}^{i-1}) \right] \\ &= \max_{Q_x} \sum_{i=1}^{n} \log Q_x(x_i \mid x_{i-k}^{i-1}, y_{i-k}^{i}) + \max_{Q_y} \sum_{i=1}^{n} \log Q_y(y_i \mid y_{i-k}^{i-1}) \\ &= \max_{Q_x} \sum_{a_0^k} \sum_{b_0^k} n \hat{P}_n^{(k)}(a_0^k, b_0^k) \log Q_x(a_k \mid a_0^{k-1}, b_0^{k}) + \max_{Q_y} \sum_{b_0^k} n \hat{P}_n^{(k)}(b_0^k) \log Q_y(b_k \mid b_0^{k-1}) \\ &= n\left[ -H(\hat{X}_{-k}^{0}, \hat{Y}_{-k}^{0}) + H(\hat{X}_{-k}^{-1}, \hat{Y}_{-k}^{0}) - H(\hat{Y}_{-k}^{0}) + H(\hat{Y}_{-k}^{-1}) \right]. \end{aligned}$$
Therefore, by the chain rule,
$$\Delta_n = 2n\left[ H(\hat{Y}_0 \mid \hat{Y}_{-k}^{-1}) - H(\hat{Y}_0 \mid \hat{X}_{-k}^{-1}, \hat{Y}_{-k}^{-1}) \right],$$
which, recalling the definition of $\hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y})$, is precisely the claimed result.    □
As noted before, under the null hypothesis, that is, when $T(\mathbf{X} \to \mathbf{Y}) = 0$, the distribution of $2n \hat{T}_n^{(k)}(\mathbf{X} \to \mathbf{Y})$ is approximately $\chi^2$ with $2^k(2^k - 1)$ degrees of freedom. Therefore, by Proposition 2, the likelihood ratio test statistic $\Delta_n$ is approximately $\chi^2$-distributed with $2^k(2^k - 1)$ degrees of freedom. Note that this limiting distribution does not depend on the distribution of the underlying process $(\mathbf{X}, \mathbf{Y})$, except through the memory length $k$. Therefore, we can decide whether the data offer strong enough evidence to reject the null hypothesis by examining the value of $\Delta_n$. Given a threshold $\alpha \in (0,1)$, if $\delta_n$ is the observed value of $\Delta_n$ and $P(\Delta_n > \delta_n) \leq \alpha$, then the null hypothesis of no causal influence can be rejected at the significance level $\alpha$. The algorithm for conducting this hypothesis test is described below in Algorithm 1.
Algorithm 1 Causal influence test.
Input: Data $(x_{-k+1}^n, y_{-k+1}^n) \in \{0,1\}^{n+k} \times \{0,1\}^{n+k}$;
      significance level $\alpha \in (0,1)$;
      test statistic $\Delta_n$.
Output: Decision: reject or do not reject the null hypothesis $H_0$ of no causal influence.
  $\delta_n \leftarrow \Delta_n(x_{-k+1}^n, y_{-k+1}^n)$.
  $p \leftarrow P(Q > \delta_n)$, where $Q \sim \chi^2_{2^k(2^k - 1)}$.
  if $p \leq \alpha$ then
     reject $H_0$;
  else
     do not reject $H_0$.
  end if
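In Python, Algorithm 1 can be sketched as follows. This is a minimal sketch under the assumptions above, not the authors' implementation: the function name and signature are ours, and `ter_estimator` stands for any plug-in TER routine, e.g. the `plugin_ter` sketch from Section 2.4.

```python
from scipy.stats import chi2

def causal_influence_test(x, y, k, alpha=0.05, ter_estimator=None):
    """Sketch of Algorithm 1: likelihood-ratio test for causal influence.

    Relies on Delta_n = 2 n T_hat_n^(k)(X -> Y) being asymptotically
    chi-squared with 2**k * (2**k - 1) degrees of freedom under H0.
    """
    n = len(x) - k                       # number of (k+1)-blocks in the sample
    delta_n = 2.0 * n * ter_estimator(x, y, k)
    df = 2**k * (2**k - 1)
    p_value = chi2.sf(delta_n, df)       # P(Q > delta_n) with Q ~ chi2(df)
    return delta_n, p_value, p_value <= alpha
```

Running the test in both directions (swapping the roles of `x` and `y`) then yields the two decisions illustrated in Example 1 below.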
Example 1. 
Consider a sample $(x_{-k+1}^n, y_{-k+1}^n) \in \{0,1\}^{n+k} \times \{0,1\}^{n+k}$ of length $n = 40{,}000$ generated from a microcircuit composed of two neurons whose activities are modeled as in the neuronal network model described in Section 2. In this case, we consider the jointly stationary ergodic variable-length Markov chain $(\mathbf{X}, \mathbf{Y})$ with memory no larger than $k = 3$. In this microcircuit, there is a strong excitatory connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ but no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$, i.e., $\omega_{x \to y} = 10$ and $\omega_{y \to x} = 0$. In Figure 1, we illustrate the signals from the neurons generated by the neuronal model for this parameter specification.
We conduct two hypothesis tests. In the first one, we test $H_0: \omega_{x \to y} = 0$ vs. $H_1: \omega_{x \to y} \neq 0$. In this case, the observed value of the test statistic is $\delta_n = 5881.70$; setting a significance level of $\alpha = 5\%$, we obtain $p \approx 0$. Hence, since $p < \alpha$, we reject the null hypothesis. In the second test, the hypotheses are $H_0: \omega_{y \to x} = 0$ vs. $H_1: \omega_{y \to x} \neq 0$. In this case, the observed value of the test statistic is $\delta_n = 51.08$; setting a significance level of $\alpha = 5\%$, we obtain $p = 0.6612$. Therefore, since $p > \alpha$, we do not reject the null hypothesis.
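These p-values can be reproduced from the reported statistics alone, using the chi-squared survival function (with $k = 3$, the test has $2^3(2^3 - 1) = 56$ degrees of freedom):

```python
from scipy.stats import chi2

df = 2**3 * (2**3 - 1)        # k = 3  ->  56 degrees of freedom
print(chi2.sf(5881.70, df))   # ~0     -> reject H0 for the X -> Y direction
print(chi2.sf(51.08, df))     # ~0.66  -> do not reject for Y -> X
```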
A more comprehensive simulation study is given in the next section.

4. Results on Simulated Data

A natural interest of this work is the application of transfer entropy in the study of effective connectivity between a pair of stochastic neurons. For this, we use synthetic data generated from the neuronal network model described in Section 2. To test the statistical significance of a connectivity value, we use the hypothesis test described in Section 3.
For the experiment conducted in this section, we consider two stochastic neurons with variable-length memory whose dynamics are given by the model introduced in Section 2. In this case, the neuronal network is a microcircuit composed of two neurons whose activities are modeled by the jointly stationary ergodic variable-length Markov chain ( X , Y ) with memory no larger than k = 3 . We select scenarios where the synaptic weights are either strong or weak. Based on different choices of these synaptic weights, we define the following four distinct cases:
  • Scenario 1: There is a strong excitatory connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ but no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$, i.e., $\omega_{x \to y} = 10$ and $\omega_{y \to x} = 0$. In this case, when $\mathbf{Y}$ is the postsynaptic neuron, we observe, on average, a firing proportion of 80%.
  • Scenario 2: There is a weak excitatory connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ but no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$, i.e., $\omega_{x \to y} = 0.375$ and $\omega_{y \to x} = 0$. In this case, when $\mathbf{Y}$ is the postsynaptic neuron, we observe, on average, a firing proportion of 52%.
  • Scenario 3: There is a weak inhibitory connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ but no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$, i.e., $\omega_{x \to y} = -0.375$ and $\omega_{y \to x} = 0$. In this case, when $\mathbf{Y}$ is the postsynaptic neuron, we observe, on average, a firing proportion of 48%.
  • Scenario 4: There is a strong inhibitory connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ but no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$, i.e., $\omega_{x \to y} = -10$ and $\omega_{y \to x} = 0$. In this case, when $\mathbf{Y}$ is the postsynaptic neuron, we observe, on average, a firing proportion of 30%.
In all scenarios, when $\mathbf{X}$ is the postsynaptic neuron, we observe, on average, a firing proportion of 50%. In addition, for each scenario, we consider four different sample sizes: $n = 5000$, $n = 10{,}000$, $n = 20{,}000$, and $n = 40{,}000$, representing $n$ bins of 10 ms. These are typical recording times in electrophysiological experiments. Note that with these choices of sample sizes, we have $k < \frac{\log n}{2}$ for all $n$, which ensures the convergence results for the transfer entropy rate estimator discussed in Section 2.4. For each scenario and sample size, 100 replicates are generated, and the test is conducted on each of them; a sketch of such a driver is given below.
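For concreteness, a hypothetical driver with the structure of this experiment might look as follows. It assumes the `simulate_pair`, `plugin_ter`, and `causal_influence_test` sketches given earlier in this article; it is an illustration of the experimental loop, not the exact code used to produce Table 1.

```python
# Rejection fractions for the X -> Y direction, one line per (scenario, n).
scenarios = {"1 (strong excitatory)": 10.0, "2 (weak excitatory)": 0.375,
             "3 (weak inhibitory)": -0.375, "4 (strong inhibitory)": -10.0}
for name, w_xy in scenarios.items():
    for n in (5_000, 10_000, 20_000, 40_000):
        rejected = 0
        for _ in range(100):                       # 100 replicates
            x, y = simulate_pair(n, w_xy=w_xy, w_yx=0.0)
            _, _, reject = causal_influence_test(
                x, y, k=3, alpha=0.05, ter_estimator=plugin_ter)
            rejected += reject
        print(f"scenario {name}, n={n}: rejected in {rejected}/100 runs")
```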
Using 100 repetitions of the test on samples of length $n = 40{,}000$, the empirical distribution of the statistic $\Delta_n$ is estimated, as shown in Figure 2, to be in close agreement with the theoretically predicted $\chi^2(56)$ limiting distribution in all scenarios.
In Table 1, we show the fraction of times, out of 100 simulations of the neuronal network model, that the test rejects the null hypothesis of the absence of causal influence for four different significance levels: $\alpha = 0.1\%$, $\alpha = 1\%$, $\alpha = 5\%$, and $\alpha = 10\%$. We can observe that, as expected (and desired), in scenarios 1 and 4, where there is a strong connection from neuron $\mathbf{X}$ to neuron $\mathbf{Y}$ (excitatory with $\omega_{x \to y} = 10$ and inhibitory with $\omega_{x \to y} = -10$, respectively), the test detects the connection in 100% of the cases regardless of the sample size and significance level. However, in scenarios 2 and 3, where there is a weak connection (excitatory with $\omega_{x \to y} = 0.375$ and inhibitory with $\omega_{x \to y} = -0.375$, respectively), the test struggles to detect the connection with small sample sizes and low significance levels, and its performance improves as we increase the sample size and the significance level. Furthermore, in all scenarios, the test does not reject the null conditional independence hypothesis when $\mathbf{Y}$ is the presynaptic neuron and $\mathbf{X}$ is the postsynaptic neuron; in fact, there is no connection from neuron $\mathbf{Y}$ to neuron $\mathbf{X}$ ($\omega_{y \to x} = 0$). Therefore, the results are in agreement with the nature of the data.
In Figure 3, we display the distributions of the estimated transfer entropy rates for each of the 100 samples of length n = 40,000 generated by the neuronal model using box plots. We can observe that in scenarios 1 and 4, where there is a strong connection from neuron X to neuron Y , the estimated values are similar and greater than those estimated in scenarios 2 and 3, where there is a weak connection from neuron X to neuron Y . On the other hand, in all scenarios, there is no connection from neuron Y to neuron X , and the estimated values tend to be lower than those obtained in the aforementioned situations.

5. Conclusions

In this work, we studied the effective connectivity between neurons through a hypothesis test whose test statistic is based on the plug-in estimator of the transfer entropy rate. Effective connectivity refers to the causal interactions among distinct neural units, whereas anatomical connectivity and functional connectivity refer to anatomical links or loosely defined statistical dependencies between units, respectively [49]. Our work demonstrates how to properly use transfer entropy to measure the flow of information between sequences and explore its use in determining effective neuronal connectivity.
Understanding effective connectivity has long been a central challenge in neuroscience [1,50,51,52]. The identification of connectivity has garnered significant interest in recent years, primarily due to advancements enabling the simultaneous recording of a vast number of neurons [53,54,55,56]. Essentially, we now exploit the understanding that synaptic connections induce voltage fluctuations capable of triggering postsynaptic action potentials. These subtle effects modulate spike timing within a spike train, discernible under specific conditions. By iteratively applying inference techniques to extensive datasets, crucial connectivity maps for understanding the brain are generated. A successful recent example is the inference of the small central pattern-generating circuit in the stomatogastric ganglion of the crab Cancer borealis [57]. This circuit is known and so it is amenable to this type of analysis. Yet, it is a challenging circuit because pharmacological manipulations alter the neuronal intrinsic dynamics and synaptic communication, as clearly shown by the authors. However, for the majority of other living systems, challenges persist in this process, including the stochastic nature of neurons originating from a highly random environment, leading to confounding factors that are challenging to disambiguate, as well as the selection of an appropriate and refined metric capable of addressing such issues.
Our analysis indicates that the hypothesis testing framework described in this paper can be a useful exploratory tool for providing conclusive biologically relevant findings. Here, we showed that this test reliably detected effective connectivity when two signals are generated from a neuronal network model in which neurons are stochastic with variable-length memory.
A second contribution of this work concerns the relationship between the synaptic weight values set for the neuronal network model described in Section 2 and the estimates of the transfer entropy rate (see Figure 3). We observed that synaptic weight values close to zero lead to estimates that are also close to zero. On the other hand, the farther from zero the synaptic weight values, the larger the estimates of the transfer entropy rate. Therefore, empirical transfer entropy effectively translates the types of connections existing between the neurons in the network.
One avenue for future research stemming from this work is the testing and validation of our studies with experimental data. Difficulties may arise, given that not many circuits are completely known to act as ground-truth data, with some exceptions such as Cancer borealis [57] or C. elegans [58]. An intermediate step would, therefore, involve applying simulations with biophysically grounded neurons based on the Hodgkin–Huxley model [59]. The inclusion of a stochastic background at different levels of the model (ion channels, synapses, network) would highlight the advantages of our approach and allow for further extensions.
The findings of this study suggest that the method is both robust and versatile, accurately deducing effective connectivity among neurons possessing various synaptic characteristics. We utilize synaptic values aligned with both strong and weak neuronal connections in the brain. Regarding its versatility, this implies that the technique can be enhanced and extended to more intricate systems, considering the influence of confounding factors such as stimuli or additional spike trains from third parties.

Author Contributions

R.F.F. and R.F.O.P. conceived the work; J.V.R.I. and V.A.G. developed the codes and performed the computations; R.F.F. proofed the theoretical results; J.V.R.I., R.F.F., R.F.O.P. and V.A.G. analyzed the results; J.V.R.I., R.F.F. and R.F.O.P. wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by grant #2022/10524-6, São Paulo Research Foundation (FAPESP). R.F.O.P. was supported by the Palm Health-Sponsored Program in Computational Brain Science and Health and the FAU Stiles-Nicholson Brain Institute.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Numerical simulations and data are freely available at https://github.com/Joao-Izzi/IC_codes (accessed on 1 April 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Aertsen, A.; Gerstein, G.; Habib, M.; Palm, G. Dynamics of neuronal firing correlation: Modulation of “effective connectivity”. J. Neurophysiol. 1989, 61, 900–917. [Google Scholar] [CrossRef] [PubMed]
  2. Liao, W.; Mantini, D.; Zhang, Z.; Pan, Z.; Ding, J.; Gong, Q.; Yang, Y.; Chen, H. Evaluating the effective connectivity of resting state networks using conditional Granger causality. Biol. Cybern. 2010, 102, 57–69. [Google Scholar] [CrossRef]
  3. Vicente, R.; Wibral, M.; Lindner, M.; Pipa, G. Transfer entropy—A model-free measure of effective connectivity for the neurosciences. J. Comput. Neurosci. 2011, 30, 45–67. [Google Scholar] [CrossRef] [PubMed]
  4. Ito, S.; Hansen, M.E.; Heiland, R.; Lumsdaine, A.; Litke, A.M.; Beggs, J.M. Extending transfer entropy improves identification of effective connectivity in a spiking cortical network model. PLoS ONE 2011, 6, e27431. [Google Scholar] [CrossRef] [PubMed]
  5. Nigam, S.; Shimono, M.; Ito, S.; Yeh, F.C.; Timme, N.; Myroshnychenko, M.; Lapish, C.C.; Tosi, Z.; Hottowy, P.; Smith, W.C.; et al. Rich-club organization in effective connectivity among cortical neurons. J. Neurosci. 2016, 36, 670–684. [Google Scholar] [CrossRef]
  6. Platkiewicz, J.; Saccomano, Z.; McKenzie, S.; English, D.; Amarasingham, A. Monosynaptic inference via finely-timed spikes. J. Comput. Neurosci. 2021, 49, 131–157. [Google Scholar] [CrossRef]
  7. Wiener, N. The theory of prediction. Mod. Math. Eng. 1956, 165, 6. [Google Scholar]
  8. Stein, R.B.; Gossen, E.R.; Jones, K.E. Neuronal variability: Noise or part of the signal? Nature Rev. Neurosci. 2005, 6, 389. [Google Scholar] [CrossRef] [PubMed]
  9. Crochet, S.; Poulet, J.F.; Kremer, Y.; Petersen, C.C. Synaptic mechanisms underlying sparse coding of active touch. Neuron 2011, 69, 1160–1175. [Google Scholar] [CrossRef]
  10. Naud, R.; Gerstner, W. The performance (and limits) of simple neuron models: Generalizations of the leaky integrate-and-fire model. In Computational Systems Neurobiology; Springer: Berlin/Heidelberg, Germany, 2012; pp. 163–192. [Google Scholar]
  11. Bair, W.; Koch, C. Temporal precision of spike trains in extrastriate cortex of the behaving macaque monkey. Neural Comput. 1996, 8, 1185–1202. [Google Scholar] [CrossRef]
  12. Nawrot, M.P.; Boucsein, C.; Molina, V.R.; Riehle, A.; Aertsen, A.; Rotter, S. Measurement of variability dynamics in cortical spike trains. J. Neurosci. Meth. 2008, 169, 374–390. [Google Scholar] [CrossRef] [PubMed]
  13. Schneidman, E.; Freedman, B.; Segev, I. Ion channel stochasticity may be critical in determining the reliability and precision of spike timing. Neural Comput. 1998, 10, 1679. [Google Scholar] [CrossRef]
  14. Oram, M.W.; Wiener, M.C.; Lestienne, R.; Richmond, B.J. Stochastic nature of precisely timed spike patterns in visual system neuronal responses. J. Neurophysiol. 1999, 81, 3021. [Google Scholar] [CrossRef] [PubMed]
  15. Buesing, L.; Bill, J.; Nessler, B.; Maass, W. Neural dynamics as sampling: A model for stochastic computation in recurrent networks of spiking neurons. PLoS Comput. Biol. 2011, 7, e1002211. [Google Scholar] [CrossRef] [PubMed]
  16. Friston, K. The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 2010, 11, 127–138. [Google Scholar] [CrossRef] [PubMed]
  17. Truccolo, W.; Hochberg, L.R.; Donoghue, J.P. Collective dynamics in human and monkey sensorimotor cortex: Predicting single neuron spikes. Nat. Neurosci. 2010, 13, 105–111. [Google Scholar] [CrossRef]
  18. Cessac, B. A view of neural networks as dynamical systems. Int. J. Bifurc. Chaos 2010, 20, 1585–1629. [Google Scholar] [CrossRef]
  19. Bühlmann, P.; Wyner, A.J. Variable length Markov chains. Ann. Stat. 1999, 27, 480–513. [Google Scholar] [CrossRef]
  20. Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 2000, 85, 461. [Google Scholar] [CrossRef]
  21. Kugiumtzis, D. Partial transfer entropy on rank vectors. Eur. Phys. J. Spec. Top. 2013, 222, 401–420. [Google Scholar] [CrossRef]
  22. Frenzel, S.; Pompe, B. Partial mutual information for coupling analysis of multivariate time series. Phys. Rev. Lett. 2007, 99, 204101. [Google Scholar] [CrossRef]
  23. Paluš, M.; Komárek, V.; Hrnčíř, Z.; Štěrbová, K. Synchronization as adjustment of information rates: Detection from bivariate time series. Phys. Rev. E 2001, 63, 046211. [Google Scholar] [CrossRef] [PubMed]
  24. Kaiser, A.; Schreiber, T. Information transfer in continuous processes. Physica D 2002, 166, 43–62. [Google Scholar] [CrossRef]
  25. Restrepo, J.F.; Mateos, D.M.; Schlotthauer, G. Transfer entropy rate through Lempel-Ziv complexity. Phys. Rev. E 2020, 101, 052117. [Google Scholar] [CrossRef]
  26. Hinrichs, H.; Heinze, H.J.; Schoenfeld, M.A. Causal visual interactions as revealed by an information theoretic measure and fMRI. NeuroImage 2006, 31, 1051–1060. [Google Scholar] [CrossRef] [PubMed]
  27. Gourévitch, B.; Eggermont, J.J. Evaluating information transfer between auditory cortical neurons. J. Neurophysiol. 2007, 97, 2533–2543. [Google Scholar] [CrossRef]
  28. Vakorin, V.A.; Kovacevic, N.; McIntosh, A.R. Exploring transient transfer entropy based on a group-wise ICA decomposition of EEG data. NeuroImage 2010, 49, 1593–1600. [Google Scholar] [CrossRef] [PubMed]
  29. Lizier, J.T.; Heinzle, J.; Horstmann, A.; Haynes, J.D.; Prokopenko, M. Multivariate information-theoretic measures reveal directed information structure and task relevant changes in fMRI connectivity. J. Comput. Neurosci. 2011, 30, 85–107. [Google Scholar] [CrossRef] [PubMed]
  30. Stetter, O.; Battaglia, D.; Soriano, J.; Geisel, T. Model-Free Reconstruction of Excitatory Neuronal Connectivity from Calcium Imaging Signals; Public Library of Science: San Francisco, CA, USA, 2012. [Google Scholar]
  31. Wibral, M.; Pampu, N.; Priesemann, V.; Siebenhühner, F.; Seiwert, H.; Lindner, M.; Lizier, J.T.; Vicente, R. Measuring information-transfer delays. PLoS ONE 2013, 8, e55809. [Google Scholar] [CrossRef]
  32. Lima, V.; Pena, R.F.O.; Ceballos, C.C.; Shimoura, R.O.; Roque, A.C. Information theory applications in neuroscience. Rev. Bras. Ensino Fis. 2019, 41, e20180197. [Google Scholar]
  33. Pena, R.F.; Lima, V.; Shimoura, R.O.; Paulo Novato, J.; Roque, A.C. Optimal interplay between synaptic strengths and network structure enhances activity fluctuations and information propagation in hierarchical modular networks. Brain Sci. 2020, 10, 228. [Google Scholar] [CrossRef]
  34. Barnett, L.; Bossomaier, T. Transfer entropy as a log-likelihood ratio. Phys. Rev. Lett. 2012, 109, 138105. [Google Scholar] [CrossRef]
  35. Anderson, R.P.; Porfiri, M. Assessing significance of information flow in high dimensional dynamical systems with few data. In Proceedings of the Dynamic Systems and Control Conference. American Society of Mechanical Engineers, San Antonio, TX, USA, 22–24 October 2014; Volume 46193, p. V002T24A002. [Google Scholar]
  36. Galves, A.; Löcherbach, E. Infinite systems of interacting chains with memory of variable length: A stochastic model for biological neural nets. J. Stat. Phys. 2013, 151, 896–921. [Google Scholar] [CrossRef]
  37. Bakhshayesh, H.; Fitzgibbon, S.P.; Janani, A.S.; Grummett, T.S.; Pope, K.J. Detecting connectivity in EEG: A comparative study of data-driven effective connectivity measures. Comput. Biol. Med. 2019, 111, 103329. [Google Scholar] [CrossRef]
  38. Kontoyiannis, I.; Skoularidou, M. Estimating the directed information and testing for causality. IEEE Trans. Inf. Theory 2016, 62, 6053–6067. [Google Scholar] [CrossRef]
  39. Theocharous, A.; Gregoriou, G.; Sapountzis, P.; Kontoyiannis, I. Temporally Causal Discovery Tests for Discrete Time Series and Neural Spike Trains. IEEE Trans. Signal Process. 2024, 72, 1333–1347. [Google Scholar] [CrossRef]
  40. MacKay, D.M.; McCulloch, W.S. The limiting information capacity of a neuronal link. Bull. Math. Biophys. 1952, 14, 127–135. [Google Scholar] [CrossRef]
  41. Hill, A. The Basis of Sensation: The Action of the Sense Organs. Nature 1929, 123, 9–11. [Google Scholar] [CrossRef]
  42. Adrian, E.D.; Bronk, D.W. The discharge of impulses in motor nerve fibres: Part II. The frequency of discharge in reflex and voluntary contractions. J. Physiol. 1929, 67, i3. [Google Scholar] [CrossRef] [PubMed]
  43. Gerstner, W.; van Hemmen, J.L. Associative memory in a network of spiking neurons. Network Comput. Neural Syst. 1992, 3, 139–164. [Google Scholar] [CrossRef]
  44. Gerstner, W. Time structure of the activity in neural network models. Phys. Rev. E 1995, 51, 738. [Google Scholar] [CrossRef]
  45. Lindner, B. A brief introduction to some simple stochastic processes. In Stochastic Methods in Neuroscience; Laing, C., Lord, G.J., Eds.; Oxford University Press: Oxford, UK, 2009; p. 1. [Google Scholar]
  46. Diks, C.; Fang, H. Transfer entropy for nonparametric granger causality detection: An evaluation of different resampling methods. Entropy 2017, 19, 372. [Google Scholar] [CrossRef]
  47. Shorten, D.P.; Spinney, R.E.; Lizier, J.T. Estimating transfer entropy in continuous time between neural spike trains or other event-based data. PLoS Comput. Biol. 2021, 17, e1008054. [Google Scholar] [CrossRef] [PubMed]
  48. Shields, P.C. The Ergodic Theory of Discrete Sample Paths; American Mathematical Society: Providence, RI, USA, 1996; Volume 13. [Google Scholar]
  49. Sporns, O. Brain connectivity. Scholarpedia 2007, 2, 4695. [Google Scholar] [CrossRef]
  50. Friston, K.J. Functional and effective connectivity in neuroimaging: A synthesis. Hum. Brain Mapp. 1994, 2, 56–78. [Google Scholar] [CrossRef]
  51. Stephan, K.E.; Friston, K.J. Analyzing effective connectivity with functional magnetic resonance imaging. Wiley Interdiscip. Rev. Cogn. Sci. 2010, 1, 446–459. [Google Scholar] [CrossRef] [PubMed]
  52. Battaglia, D.; Witt, A.; Wolf, F.; Geisel, T. Dynamic effective connectivity of inter-areal brain circuits. PLoS Comput. Biol. 2012, 8, e1002438. [Google Scholar] [CrossRef]
  53. Buzsáki, G. Large-scale recording of neuronal ensembles. Nat. Neurosci. 2004, 7, 446–451. [Google Scholar] [CrossRef]
  54. Maccione, A.; Gandolfo, M.; Tedesco, M.; Nieus, T.; Imfeld, K.; Martinoia, S.; Luca, B. Experimental investigation on spontaneously active hippocampal cultures recorded by means of high-density MEAs: Analysis of the spatial resolution effects. Front. Neuroeng. 2010, 3, 4. [Google Scholar] [CrossRef]
  55. de Abril, I.M.; Yoshimoto, J.; Doya, K. Connectivity inference from neural recording data: Challenges, mathematical bases and research directions. Neural Netw. 2018, 102, 120–137. [Google Scholar]
  56. Ambrogioni, L.; Ebel, P.; Hinne, M.; Güçlü, U.; van Gerven, M.; Maris, E. SpikeCaKe: Semi-Analytic Nonparametric Bayesian Inference for Spike-Spike Neuronal Connectivity. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, Naha, Japan, 16–18 April 2019; Volume 89, pp. 787–795. [Google Scholar]
  57. Gerhard, F.; Kispersky, T.; Gutierrez, G.J.; Marder, E.; Kramer, M.; Eden, U. Successful reconstruction of a physiological circuit with known connectivity from spiking activity alone. PLoS Comput. Biol. 2013, 9, e1003138. [Google Scholar] [CrossRef] [PubMed]
  58. White, J.G.; Southgate, E.; Thomson, J.N.; Brenner, S. The structure of the nervous system of the nematode Caenorhabditis elegans. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1986, 314, 1–340. [Google Scholar] [PubMed]
  59. Hodgkin, A.L.; Huxley, A.F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 1952, 117, 500–544. [Google Scholar] [CrossRef] [PubMed]
Figure 1. First five hundred observations of the time series $\mathbf{X}$ and $\mathbf{Y}$ generated from the neuronal network model described in Section 2, with memory no larger than $k = 3$ and synaptic weights $\omega_{x \to y} = 10$ and $\omega_{y \to x} = 0$.
Figure 2. Histogram approximation of the distribution of the statistic $\Delta_n$, based on 100 repetitions of the test on samples of length $n = 40{,}000$. The red curve shows the density of the theoretically predicted limiting $\chi^2(56)$ distribution.
Figure 3. Box plots of the estimated transfer entropy rates for each of the 100 samples of length $n = 40{,}000$ generated by the neuronal network model.
Table 1. Fraction of times, out of 100 simulations of the neuronal network model, that the test rejects the null hypothesis of the absence of causal influence, for four different significance levels and sample sizes.
Left columns: $H_0: T(\mathbf{X} \to \mathbf{Y}) = 0$ vs. $H_1: T(\mathbf{X} \to \mathbf{Y}) > 0$. Right columns: $H_0: T(\mathbf{Y} \to \mathbf{X}) = 0$ vs. $H_1: T(\mathbf{Y} \to \mathbf{X}) > 0$.

Scenario 1: $\omega_{x \to y} = 10$, $\omega_{y \to x} = 0$

| Sample size | $\mathbf{X} \to \mathbf{Y}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ | $\mathbf{Y} \to \mathbf{X}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ |
|---|---|---|---|---|---|---|---|---|
| n = 5000 | 100 | 100 | 100 | 100 | 0 | 0 | 1 | 4 |
| n = 10,000 | 100 | 100 | 100 | 100 | 0 | 0 | 0 | 2 |
| n = 20,000 | 100 | 100 | 100 | 100 | 0 | 1 | 3 | 4 |
| n = 40,000 | 100 | 100 | 100 | 100 | 0 | 1 | 1 | 3 |

Scenario 2: $\omega_{x \to y} = 0.375$, $\omega_{y \to x} = 0$

| Sample size | $\mathbf{X} \to \mathbf{Y}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ | $\mathbf{Y} \to \mathbf{X}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ |
|---|---|---|---|---|---|---|---|---|
| n = 5000 | 5 | 19 | 36 | 45 | 0 | 0 | 4 | 8 |
| n = 10,000 | 18 | 44 | 66 | 76 | 0 | 1 | 4 | 10 |
| n = 20,000 | 71 | 88 | 95 | 99 | 0 | 0 | 2 | 8 |
| n = 40,000 | 100 | 100 | 100 | 100 | 0 | 1 | 8 | 10 |

Scenario 3: $\omega_{x \to y} = -0.375$, $\omega_{y \to x} = 0$

| Sample size | $\mathbf{X} \to \mathbf{Y}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ | $\mathbf{Y} \to \mathbf{X}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ |
|---|---|---|---|---|---|---|---|---|
| n = 5000 | 2 | 12 | 23 | 42 | 0 | 1 | 3 | 6 |
| n = 10,000 | 13 | 27 | 58 | 72 | 0 | 1 | 9 | 13 |
| n = 20,000 | 65 | 85 | 95 | 95 | 0 | 1 | 5 | 8 |
| n = 40,000 | 100 | 100 | 100 | 100 | 0 | 1 | 7 | 10 |

Scenario 4: $\omega_{x \to y} = -10$, $\omega_{y \to x} = 0$

| Sample size | $\mathbf{X} \to \mathbf{Y}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ | $\mathbf{Y} \to \mathbf{X}$: $\alpha = 0.1\%$ | $1\%$ | $5\%$ | $10\%$ |
|---|---|---|---|---|---|---|---|---|
| n = 5000 | 100 | 100 | 100 | 100 | 0 | 1 | 4 | 10 |
| n = 10,000 | 100 | 100 | 100 | 100 | 0 | 1 | 3 | 4 |
| n = 20,000 | 100 | 100 | 100 | 100 | 0 | 0 | 5 | 10 |
| n = 40,000 | 100 | 100 | 100 | 100 | 0 | 0 | 4 | 8 |

