*Article* **Scalable Distributed State Estimation in UTM Context**

#### **Marco Cicala <sup>1,</sup>\*, Egidio D'Amato <sup>2</sup>, Immacolata Notaro <sup>3</sup> and Massimiliano Mattei <sup>3</sup>**


Received: 30 March 2020; Accepted: 6 May 2020; Published: 8 May 2020

**Abstract:** This article proposes a novel approach to the Distributed State Estimation (DSE) problem for a set of co-operating UAVs equipped with heterogeneous on-board sensors, capable of exploiting certain characteristics typical of the UAS Traffic Management (UTM) context, such as high traffic density and the presence of limited-range Vehicle-to-Vehicle communication devices. The proposed algorithm is based on a scalable decentralized Kalman Filter derived from the Internodal Transformation Theory, enhanced on the basis of the Consensus Theory. The general benefit of the proposed algorithm consists, on the one hand, of reducing the estimation problem to smaller local sub-problems, through a self-organization process of the local estimating nodes in response to the time-varying communication topology; and, on the other hand, of exploiting measurements carried out nearby in order to improve the accuracy of the local estimates. In the UTM context, this enables each vehicle to estimate both its own position and velocity, as well as those of the neighboring vehicles, using both on-board measurements and information transmitted by neighboring vehicles. A numerical simulation in a simplified UTM scenario is presented in order to illustrate the salient aspects of the proposed algorithm.

**Keywords:** UAS traffic management; multiple UAV navigation; navigation in GPS/GNSS-denied environments; distributed state estimation; consensus theory

#### **1. Introduction**

Over the last few years, Small Unmanned Aircraft Systems (sUAS) have experienced a widespread diffusion both in military and civilian applications. Their diffusion is destined to grow even further, since they are capable of operating close to the ground and overcoming obstacles in all sorts of hazardous conditions forbidden to traditionally manned vehicles. sUAS offer new opportunities in different operational scenarios including public safety, search and rescue, disaster relief, infrastructure monitoring, precision farming and delivery of goods [1]. Most sUAV operations would take place in low-altitude, densely occupied airspace, over densely populated areas typical of urban scenarios.

The foreseen large-scale sUAV operations are not currently possible without drastically reducing safety levels of low-altitude airspace and a global need for new concepts and enabling technologies is clearly identified in the aeronautical community. These needs find influential formulations in NASA's UAS traffic management [2] (UTM) and the European Commission's U-space visions [3]. All the paradigms proposed so far assume a range of capabilities at an increasing level of autonomy, including Beyond Visual Line of Sight (BVLOS) operations, interactive planning, de-conflict operations with geo-fencing, collision and obstacle avoidance. A common element to these capabilities is the necessity to estimate the position and velocity of each vehicle.

sUAV navigation is typically based on the integration of low-cost Global Navigation Satellite System (GNSS) receivers and commercial-grade Micro-Electro-Mechanical Systems (MEMS)-based inertial and magnetic sensors [4,5]. In nominal conditions, these navigation systems can provide an accuracy of approximately 5–10 m [6], good enough to implement many autonomous functionalities. In urban environments, this accuracy can be hindered by the presence of obstacles, or greater accuracy may be required in order to perform special operations.

This article deals with sUAV position and velocity estimation within the UTM context. If, on the one hand, the peculiar characteristics of a UTM scenario can be seen as a source of open issues to be faced, on the other hand, if appropriately interpreted, they make it possible to derive benefits compared to traditional estimation methods in terms of accuracy, availability and continuity. In fact, the high density of traffic, the presence of numerous and heterogeneous on-board sensors (in addition to GNSS and inertial and magnetic sensors, low-cost vision-based systems [7,8] and micro radars [9] are becoming increasingly widespread), and the presence of vehicle-to-vehicle communication channels are all opportunities to be exploited.

The basic idea is simple and intuitive in nature. The presence of numerous on-board sensors provides a great deal of information on the state of each vehicle. The measurements of certain sensors, such as vision-based systems or radars, contain information not only regarding the vehicle hosting the sensors, but also related to other vehicles (e.g., relative distance). The exchange of information between neighboring vehicles can thus allow them to improve the estimates of their position and velocity. A typical condition that can benefit from this situation is the navigation of vehicles in a GNSS-denied zone, where position and velocity can be estimated thanks to relative position sensors with respect to other vehicles flying nearby.

The idea is currently being widely investigated. In [10–12], a multiple-vehicle configuration is proposed to improve the navigation (and attitude estimation) performance of a chief vehicle exploiting differential GPS, using information deriving from a formation of flying deputy vehicles. In [13–15], a GPS-denied condition is specifically addressed in similar multi-vehicle configurations. These works consider a fixed number of vehicles flying in formation. Peculiar features of the UTM scenario are, instead, the non-preordained motion of the vehicles, free to fly in the airspace, and the absence of hierarchical relationships between the vehicles, which must all have access to the same minimum navigation performance.

The objective of the article is to describe a novel methodology applicable to the navigation of a sUAS fleet, which exploits the typical features of the UTM system in order to improve performance with respect to traditional methods. Therefore, the fundamental assumptions characterizing the considered scenario include the absence of a Central Processing Unit (CPU) for information elaboration or distribution (decentralization), and the absence of a vehicle that is hierarchically distinct from the others. Moreover, only locally relevant computation is required to take place in each local processing unit, allowing the number of nodes to grow arbitrarily without exceeding local computational resources (scalability). Thus, the starting point is to translate the basic idea of multi-sensor, multi-vehicle exploitation into formal terms, with an algorithm preserving the optimality characteristics typical of many common State Estimators.

The Kalman Filter (KF) represents the cornerstone for optimal state estimation. In its classical implementation [16], it has an intrinsically-centralized structure, in which the CPU samples the measurements and performs the estimation process. Although possibly optimal, when applied to Large-Scale Multi-Sensor Systems, it does not provide a solution compliant with the previously discussed assumptions. The main Centralized Kalman Filter limitations are the high computational load overcharging the CPU when the size of the system increases and the high communication complexity when the spatial distribution of the system expands.

In order to guarantee the Kalman Filter adequate scalability and decentralization characteristics, many decentralized algorithms [17–24] have been proposed, based on multiple local Kalman Filters, one in each local processing unit. In order to take processing and communication limits into account, the local Kalman Filters must involve the computation and communication only of local quantities of dimension *n*<sup>l</sup> ≪ *n*, where *n* is the dimension of the global system.

Many of the existing studies [25–28] focus on large sensor networks monitoring low-dimensional systems, and they mainly address the problem of how efficiently the available information is distributed. These solutions address scalability mainly with respect to the dimension of the measurement signals, not of the state of the system itself.

In other works [29,30], reduced-order Kalman Filter models have been proposed to specifically address the computational burden that arises from increasing the order of the global system. This research and other similar works address the issue of scalability in particular for fully connected, or almost fully connected, topologies. The algorithms based on this type of topology require long-distance communication, which reduces some of the benefits of decentralized architectures.

In order to address the problem over arbitrary communication networks, data fusion algorithms based on the Consensus Theory [31] are widely discussed in the literature. At the basis of consensus-based methodologies there is the concept of covariance intersection [32], which represents a preliminary solution to the problem of merging the local estimates so as to obtain a more accurate global estimate. All consensus-based methodologies can be interpreted as a generalization of the covariance intersection fusion rule. Consensus-based methodologies are iterative in nature: at the first iteration step, they conceptually correspond to the covariance intersection rule; when the number of iteration steps goes to infinity, under certain conditions, they tend towards the centralized solution. This is a highly desirable characteristic.
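As a concrete illustration, the covariance intersection rule fuses two estimates whose cross-correlation is unknown through a convex combination of their information matrices. The sketch below uses toy numbers and a fixed weight ω (these values are illustrative assumptions; in practice ω is often chosen to minimize, e.g., the trace of the fused covariance):

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, omega=0.5):
    """Covariance intersection fusion of two estimates (x1, P1), (x2, P2).

    The fused estimate is consistent for ANY unknown cross-covariance,
    at the price of some conservatism. omega must lie in [0, 1].
    """
    # Convex combination of the information matrices
    Q = omega * np.linalg.inv(P1) + (1.0 - omega) * np.linalg.inv(P2)
    P = np.linalg.inv(Q)
    # Matching convex combination of the information state vectors
    x = P @ (omega * np.linalg.inv(P1) @ x1
             + (1.0 - omega) * np.linalg.inv(P2) @ x2)
    return x, P

# Toy estimates: each one is accurate in a different direction
x1, P1 = np.array([1.0, 0.0]), np.diag([1.0, 4.0])
x2, P2 = np.array([1.2, 0.4]), np.diag([4.0, 1.0])
x_f, P_f = covariance_intersection(x1, P1, x2, P2)
```

With ω = 0.5 and these diagonal covariances, the fused covariance is diag(1.6, 1.6): smaller in trace than either input, without ever claiming more information than is actually available.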

A first form of consensus-based algorithm for linear systems is the Consensus on Information (CI), discussed in [33]. The methodology derives from a decentralized estimation algorithm with stability properties guaranteed under collective observability and strong network connectivity (thus ensuring a relaxation of the condition of full connectivity). Although these stability characteristics are guaranteed even for a single consensus step (in this case, the rule corresponds to the covariance intersection), the results obtained applying algorithms of this family do not tend towards the centralized solution as the number of consensus steps tends towards infinity.

A different approach to consensus-based estimation for linear systems is named Consensus on Measurements (CM). This method is discussed, among others, in [34,35], and differs from CI in the quantities on which the consensus procedure is carried out. Unlike CI, Consensus on Measurements tends towards the equivalent centralized solution as the number of consensus steps goes to infinity, but does not guarantee stability unless the number of consensus steps is sufficiently high.

In [36], a hybrid consensus approach is described, defined by the author as the Hybrid Consensus on Measurement and on Information (HCMCI), based on both CM and CI. The aim of the proposed approach is to combine their complementary benefits while avoiding their main drawbacks. The HCMCI algorithm, which, among other things, extends the consensus-based solution to the non-linear case using an Extended Kalman Filter approach, appears to be a promising methodology for dealing with the problem of the distribution of information in systems of a more general topology.

Consensus-based methods address the issue of decentralization of the estimate by reducing the complexity of the communication system, even in the case of systems that are not fully connected. However, these algorithms do not address the problem of scalability: in all the aforementioned methods, each local model has a cardinality equal to that of the global system, so the local computational load still grows with the size of the system.

Furthermore, a common element of consensus-based approaches is that the stability of the solution is related to the system topology: strong connectivity is a required condition. Without going into formalisms, this translates into the assumption that each vehicle is connected to the others at least through an indirect route passing through the other vehicles. If the stability conditions, specifically those relating to the system topology, continue to be verified in subsets, the proposed algorithms can be used locally. The topology of a system such as a fleet of sUAVs free to fly in space is highly variable over time. Therefore, it would be advantageous to have a clustering mechanism that forms locally stable estimation systems, each involving only a properly selected subset of the global state.

The methodology proposed in this article combines the results achieved for scalable decentralized systems by the Internodal Transformation Theory [29] with the advantages guaranteed by the use of consensus-based techniques [31]. The goal is to inherit the scalability properties from the former and the ability to distribute information within strongly connected sub-graphs from the latter, and to obtain, from the combined use of the two approaches, a self-clustering property, through which the local elements that form the global system self-aggregate in order to form sub-systems in which the solution is stable.

#### **2. Materials and Methods**

This section introduces the fundamental notions for the definition of a decentralized estimation algorithm. Subsequently, it introduces the basic concepts of both the Internodal Transformation Theory and the Consensus Theory necessary for the proposed algorithm formulation. Finally, it defines the algorithm in its generic formulation identifying a possible application to the problem of estimating position and velocity for a sUAV fleet.

#### *2.1. Problem Formulation*

This article addresses Scalable Distributed State Estimation (SDSE) over a network consisting of nodes representing free-flying vehicles. Each vehicle has its own on-board sensors, can locally process sensor data, and can exchange data with other vehicles. The problem can be expressed as follows.

Let us consider a set of *N* flying vehicles. The communication topology between vehicles at any time *k* can be defined in terms of a directed graph G = {V, A}, where V = {V<sub>1</sub>, . . . , V<sub>N</sub>} is the set of vehicles (the nodes of the graph) and A is the set of pairs describing a communication link from V<sub>i</sub> to V<sub>j</sub> (the arcs): vehicle *i* can receive data from vehicle *j* if (V<sub>i</sub>, V<sub>j</sub>) ∈ A. For each vehicle *i*, let A<sub>i</sub> = {V<sub>j</sub> ∈ V : (V<sub>i</sub>, V<sub>j</sub>) ∈ A} be the set of its neighbors.

Let *x* ∈ R<sup>n</sup> be the global system state. State *x* is global in that it includes the information to describe the behavior of each vehicle.

Let us consider a non-linear dynamic model of the system in the state space form and discrete time domain:

$$\mathbf{x}\_{k} = f(\mathbf{x}\_{k-1}) + \mathbf{w}\_{k-1} \tag{1}$$

where *f* is the non-linear state transition function and *w*<sub>k−1</sub> ∈ R<sup>n</sup> is the process noise. For each vehicle *i*, let us consider a set of non-linear measurements *z*<sup>i</sup><sub>k</sub> ∈ R<sup>m</sup> given by:

$$\mathbf{z}\_k^i = \mathbf{h}^i(\mathbf{x}\_k) + \boldsymbol{\nu}\_k^i \quad \mathcal{V}\_i \in \mathcal{V} \tag{2}$$

where *h*<sup>i</sup> is the local observation model and ν<sup>i</sup><sub>k</sub> ∈ R<sup>m</sup> is the measurement noise. Let us assume that the process and measurement noises *w* and ν<sup>i</sup> are mutually uncorrelated zero-mean noises, with covariances **Ω**<sub>k−1</sub> = E[*w*<sub>k−1</sub>*w*<sup>T</sup><sub>k−1</sub>] > 0 and *R*<sup>i</sup><sub>k</sub> = E[ν<sup>i</sup><sub>k</sub>ν<sup>iT</sup><sub>k</sub>] > 0.

The objective of a state estimation problem is to obtain, at each time *k*, an estimate *x̂*<sub>k</sub> of the global system state based on the measurements *z*<sub>k</sub> = [*z*<sup>1T</sup><sub>k</sub>, . . . , *z*<sup>NT</sup><sub>k</sub>]<sup>T</sup>.

An SDSE problem introduces the concepts of Distribution and Scalability:

- *Distribution*: there is no central processing unit collecting all the measurements; each node computes its estimate locally, exchanging data only with its neighbors;
- *Scalability*: each node computes and communicates only locally relevant quantities, so that the local computational and communication load does not grow with the global size of the system.

#### *2.2. The Basis of Decentralization*

To address the problem, a fundamental result relating to the centralized problem can be briefly recalled. First, let us consider the overall system given the state dynamics (1) and the overall measurement model:

$$\mathbf{z}\_k = \mathbf{h}(\mathbf{x}\_k) + \mathbf{v}\_k \tag{3}$$

where ν<sub>k</sub> = [ν<sup>1T</sup><sub>k</sub>, . . . , ν<sup>NT</sup><sub>k</sub>]<sup>T</sup>.

The estimation of the state *x* at time *k*, given information up to and including time (*k* − *m*) and the corresponding variance *P* are given by:

$$\hat{\mathbf{x}}\_{k|k-m} = \mathbb{E}\left[\,\mathbf{x}\_k \mid \mathbf{z}\_1, \dots, \mathbf{z}\_{k-m}\,\right] \tag{4}$$

$$\mathbf{P}\_{k|k-m} = \mathbb{E}\left[\left(\mathbf{x}\_k - \hat{\mathbf{x}}\_{k|k-m}\right) \left(\mathbf{x}\_k - \hat{\mathbf{x}}\_{k|k-m}\right)^T \,\middle|\, \mathbf{z}\_1, \dots, \mathbf{z}\_{k-m}\right] \tag{5}$$

the most relevant cases in a recursive formulation being those for *m* = 1 (prediction, *x̂*<sub>k|k−1</sub>) and *m* = 0 (correction, *x̂*<sub>k|k</sub>).

The information filter, equivalent to the traditional Covariance Kalman Filter, provides a recursive estimate *x̂*<sub>k|k</sub> for the state *x* at time *k*, given the information *z*<sub>1</sub>, . . . , *z*<sub>k</sub> up to time *k*. The information matrix *Q* and the information state vector *q* can be defined [37] as the inverse of the covariance matrix, and as the product of the inverse of the covariance matrix and the state estimate, respectively:

$$\mathbf{Q}\_{k|k-m} = \mathbf{P}\_{k|k-m}^{-1} \tag{6}$$

$$\mathfrak{q}\_{k|k-m} = \mathbf{Q}\_{k|k-m} \mathbf{\hat{x}}\_{k|k-m} \tag{7}$$

In terms of information space variables, the Extended Information Kalman Filter can be written in the following form [29], given without derivation:

#### *prediction*

$$\hat{\mathbf{x}}\_{k|k-1} = f(\hat{\mathbf{x}}\_{k-1|k-1}) \tag{8}$$

$$\mathbf{Q}\_{k|k-1} = \left[ \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1})}{\partial \mathbf{x}\_k} \, \mathbf{Q}\_{k-1|k-1}^{-1} \, \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1})^T}{\partial \mathbf{x}\_k} + \mathbf{\Omega}\_{k-1} \right]^{-1} \tag{9}$$

$$\mathfrak{q}\_{k|k-1} = \mathbf{Q}\_{k|k-1} \mathbf{\hat{x}}\_{k|k-1} \tag{10}$$

#### *correction*

$$
\mathfrak{q}\_{k|k} = \mathfrak{q}\_{k|k-1} + \mathfrak{i}\_{k} \tag{11}
$$

$$\mathbf{Q}\_{k|k} = \mathbf{Q}\_{k|k-1} + \mathbf{I}\_k \tag{12}$$

where

$$\mathbf{I}\_k = \frac{\partial h\left(\hat{\mathbf{x}}\_{k|k-1}\right)^T}{\partial \mathbf{x}\_k} \left(\mathbf{R}\_k\right)^{-1} \frac{\partial h\left(\hat{\mathbf{x}}\_{k|k-1}\right)}{\partial \mathbf{x}\_k} \tag{13}$$

$$\mathbf{i}\_k = \frac{\partial h\left(\hat{\mathbf{x}}\_{k|k-1}\right)^T}{\partial \mathbf{x}\_k} \left(\mathbf{R}\_k\right)^{-1} \left[\mathbf{c}\_k + \frac{\partial h\left(\hat{\mathbf{x}}\_{k|k-1}\right)}{\partial \mathbf{x}\_k} \hat{\mathbf{x}}\_{k|k-1}\right] \tag{14}$$

$$\mathbf{c}\_k = \mathbf{z}\_k - h(\hat{\mathbf{x}}\_{k|k-1}) \tag{15}$$
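As a sketch of Equations (8)–(15), the following Python fragment runs one prediction/correction cycle of the Extended Information Filter on a hypothetical two-state model (the dynamics, noise levels and measurement value are illustrative assumptions, not the paper's UTM model):

```python
import numpy as np

dt = 0.1  # sample time (assumed)

def f(x):
    # Hypothetical transition: constant velocity with a mild velocity decay
    return np.array([x[0] + dt * x[1], 0.99 * x[1]])

def F_jac(x):
    # Jacobian df/dx of the transition function
    return np.array([[1.0, dt], [0.0, 0.99]])

def h(x):
    # Hypothetical observation model: position only
    return np.array([x[0]])

def H_jac(x):
    # Jacobian dh/dx
    return np.array([[1.0, 0.0]])

Omega = 0.01 * np.eye(2)  # process noise covariance
R = np.array([[0.25]])    # measurement noise covariance

def eif_step(x_hat, Q, z):
    # --- prediction, Eqs. (8)-(10) ---
    F = F_jac(x_hat)
    x_pred = f(x_hat)
    Q_pred = np.linalg.inv(F @ np.linalg.inv(Q) @ F.T + Omega)
    # --- correction, Eqs. (11)-(15) ---
    H = H_jac(x_pred)
    c = z - h(x_pred)                                # innovation, Eq. (15)
    I_k = H.T @ np.linalg.inv(R) @ H                 # Eq. (13)
    i_k = H.T @ np.linalg.inv(R) @ (c + H @ x_pred)  # Eq. (14)
    Q_post = Q_pred + I_k                            # Eq. (12)
    q_post = Q_pred @ x_pred + i_k                   # Eqs. (10), (11)
    return np.linalg.solve(Q_post, q_post), Q_post

x0, Q0 = np.array([0.0, 1.0]), np.eye(2)
z0 = np.array([0.12])
x1, Q1 = eif_step(x0, Q0, z0)
```

The information-form update is algebraically equivalent to the covariance-form Extended Kalman Filter update, which provides a convenient sanity check for an implementation.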

The Global Information Filter applies to a fully centralized fusion architecture composed of a central processing unit directly connected to all sensing devices. One of the major advantages of the information filter formulation is its capability to be easily decentralized into a network of communicating nodes. In a decentralized fusion architecture, the system consists of different processing nodes able to perform the estimation of the global state on the basis of local observations and possibly shared observations coming from other nodes. Let us first consider the case of a network of *N* fully connected nodes, i.e., each vehicle is connected to all the other vehicles. Let us assume that each local node has a state model identical to the centralized model (1), i.e., each node performs a global state estimation *x̂*<sup>i</sup> ∈ R<sup>n</sup>. A local information matrix and information state vector can be defined in each node *i*:

$$\mathbf{Q}\_{k|k-m}^{i} = \left(\mathbf{P}\_{k|k-m}^{i}\right)^{-1} \tag{16}$$

$$\boldsymbol{\mathfrak{q}}\_{k|k-m}^{i} = \mathbf{Q}\_{k|k-m}^{i} \mathbf{\hat{x}}\_{k|k-m}^{i} \tag{17}$$

The estimation algorithm for node *i*, with information being communicated to it from the other (*N*−1) nodes, can be expressed in the following steps, analogous to the centralized case:

#### *prediction*

$$\hat{\mathbf{x}}\_{k|k-1}^{i} = f(\hat{\mathbf{x}}\_{k-1|k-1}^{i}) \tag{18}$$

$$\mathbf{Q}\_{k|k-1}^{i} = \left[ \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1}^{i})}{\partial \mathbf{x}\_{k}} \left( \mathbf{Q}\_{k-1|k-1}^{i} \right)^{-1} \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1}^{i})^{T}}{\partial \mathbf{x}\_{k}} + \mathbf{\Omega}\_{k-1} \right]^{-1} \tag{19}$$

$$\mathfrak{q}\_{k|k-1}^{i} = \mathbf{Q}\_{k|k-1}^{i} \hat{\mathbf{x}}\_{k|k-1}^{i} \tag{20}$$

*correction*

$$q\_{k|k}^i = q\_{k|k-1}^i + \sum\_{j=1}^N i\_k^j \tag{21}$$

$$\mathbf{Q}\_{k|k}^{i} = \mathbf{Q}\_{k|k-1}^{i} + \sum\_{j=1}^{N} I\_{k}^{j} \tag{22}$$

where

$$\mathbf{I}\_{k}^{j} = \frac{\partial h^{j}\left(\hat{\mathbf{x}}\_{k|k-1}^{j}\right)^{T}}{\partial \mathbf{x}\_{k}} \left(\mathbf{R}\_{k}^{j}\right)^{-1} \frac{\partial h^{j}\left(\hat{\mathbf{x}}\_{k|k-1}^{j}\right)}{\partial \mathbf{x}\_{k}} \tag{23}$$

$$\mathbf{i}\_{k}^{j} = \frac{\partial h^{j}\left(\hat{\mathbf{x}}\_{k|k-1}^{j}\right)^{T}}{\partial \mathbf{x}\_{k}} \left(\mathbf{R}\_{k}^{j}\right)^{-1} \left[\mathbf{c}\_{k}^{j} + \frac{\partial h^{j}\left(\hat{\mathbf{x}}\_{k|k-1}^{j}\right)}{\partial \mathbf{x}\_{k}} \hat{\mathbf{x}}\_{k|k-1}^{j}\right] \tag{24}$$

$$\mathbf{c}\_{k}^{j} = \mathbf{z}\_{k}^{j} - h^{j}\left(\hat{\mathbf{x}}\_{k|k-1}^{j}\right) \tag{25}$$

In the correction step, it can be assumed that each node begins with a common initial information state (e.g., *q*<sup>i</sup><sub>0</sub> = 0, *Q*<sup>i</sup><sub>0</sub> = 0 ∀*i*). The summations in Equations (21) and (22) are feasible because of the full connectivity of the system. Under these conditions, each local estimate is identical to that of the centralized system defined by Equations (8)–(15).
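In the linear case, the summations (21) and (22) make each node's corrected information pair identical to the one a centralized filter would obtain by stacking all the measurements. The following sketch illustrates this equivalence for the correction step alone (random linear observation models and noise-free measurements, all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 4, 3  # global state dimension and number of nodes (toy values)

# Hypothetical linear observation models z^j = H^j x + nu^j
H = [rng.standard_normal((2, n)) for _ in range(N)]
R = [np.diag(rng.uniform(0.1, 1.0, 2)) for _ in range(N)]
x_true = rng.standard_normal(n)
z = [H[j] @ x_true for j in range(N)]  # noise-free, for clarity

# Local information contributions, Eqs. (23)-(24), linear case
I_loc = [H[j].T @ np.linalg.inv(R[j]) @ H[j] for j in range(N)]
i_loc = [H[j].T @ np.linalg.inv(R[j]) @ z[j] for j in range(N)]

# Corrected information pair at any node i, Eqs. (21)-(22),
# starting from a common prior (q0, Q0)
Q0, q0 = np.eye(n), np.zeros(n)
Q_post = Q0 + sum(I_loc)
q_post = q0 + sum(i_loc)
x_hat = np.linalg.solve(Q_post, q_post)
```

Because information contributions simply add, the sum over nodes equals the single contribution of the stacked measurement vector, which is why every fully connected node reproduces the centralized estimate.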

The case of a fully connected system does not provide any real advantage with respect to the centralized case, either in terms of computational burden or of communication requirements. Nevertheless, it represents a starting point for the definition of any decentralized estimation algorithm based on a Kalman Filter.

#### *2.3. Scalability*

Let us first deal with the problem of scalability. In order to obtain the desired scalable solution, let us introduce the model distribution concepts as defined in [29]. Let us consider a local state for the node *i* related to the global state at time instant *k* by:

$$\mathbf{x}\_k^i = T\_k^i \mathbf{x}\_k \tag{26}$$

where *T*<sup>i</sup><sub>k</sub> is a linear nodal transformation matrix that selects states, or linear combinations of states, from the global state vector. Using the Internodal Transformation Theory, it is possible to obtain a formulation in which each node performs the same estimation as the centralized formulation for a subset of the global state, while minimizing communication between nodes.

In order to derive a scalable solution, the inverse operation to model reduction is required. Generally speaking, *T*<sup>i</sup><sub>k</sub> is rectangular and its ordinary inverse is not defined. Hence, the use of the generalized pseudo-inverse (*T*<sup>i</sup><sub>k</sub>)<sup>+</sup> is required. The pseudo-inverse provides the solution to the problem of reconstructing the global state *x*<sub>k</sub> starting from the local state *x*<sup>i</sup><sub>k</sub> in node *i*, minimizing ‖*x*<sup>i</sup><sub>k</sub> − *T*<sup>i</sup><sub>k</sub>*x*<sub>k</sub>‖:

$$\mathbf{x}\_{k} = \left(\mathbf{T}\_{k}^{i}\right)^{+}\mathbf{x}\_{k}^{i} \tag{27}$$

The geometrical interpretation of the reconstructed global state is a vector containing an unscaled relevant state and a zero in place of any irrelevant state to node *i*.
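For the common case in which *T*<sup>i</sup><sub>k</sub> is a pure selection matrix, Equations (26) and (27) reduce to picking out the relevant states and, on reconstruction, zero-filling the irrelevant ones. A minimal sketch (the state indices are chosen arbitrarily for illustration):

```python
import numpy as np

n = 6  # global state dimension (toy value)

# Hypothetical selection matrix: node i keeps global states 0, 1 and 4
T_i = np.zeros((3, n))
T_i[0, 0] = T_i[1, 1] = T_i[2, 4] = 1.0

x_global = np.arange(1.0, 7.0)      # [1, 2, 3, 4, 5, 6]
x_local = T_i @ x_global            # Eq. (26): local state [1, 2, 5]

# Eq. (27): pseudo-inverse reconstruction; irrelevant states become 0
T_i_pinv = np.linalg.pinv(T_i)
x_reconstructed = T_i_pinv @ x_local
```

For a selection matrix the pseudo-inverse is simply the transpose, and the reconstructed vector is [1, 2, 0, 0, 5, 0]: the relevant states unscaled, zeros in place of the irrelevant ones, exactly the geometrical interpretation given above.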

Let us introduce the concepts of information contribution at node *i* due to the current observation from node *j*, defined as *i*<sup>i|j</sup><sub>k</sub>, and the associated local information matrix *I*<sup>i|j</sup><sub>k</sub>.

Error covariance at node *i* based on local observation from node *j* can be defined as:

$$\mathbf{P}\_{k|k}^{i|j} = \left(\mathbf{I}\_k^{i|j}\right)^+ \tag{28}$$

and the corresponding local estimate at node *i* based only on local observation from node *j* can be obtained from:

$$\hat{\mathbf{x}}\_{k|k}^{i|j} = \mathbf{P}\_{k|k}^{i|j} \mathbf{i}\_k^{i|j} \tag{29}$$

It is possible to rewrite [29] the distributed formulation of Equations (18)–(25) in an equivalent scalable form in which each node propagates only locally relevant states and exchanges only relevant information with any other node:

#### *Prediction*

$$\hat{\mathbf{x}}\_{k|k-1}^{i} = \mathbf{T}\_k^i \, f\left( \left( \mathbf{T}\_{k-1}^i \right)^+ \hat{\mathbf{x}}\_{k-1|k-1}^{i} \right) \tag{30}$$

$$\overline{\mathbf{Q}}\_{k-1|k-1}^{i} = \mathbf{T}\_k^i \left[ \left( \mathbf{T}\_{k-1}^i \right)^+ \mathbf{Q}\_{k-1|k-1}^i \, \mathbf{T}\_{k-1}^i \right] \left( \mathbf{T}\_k^i \right)^+ \tag{31}$$

$$\mathbf{Q}\_{k|k-1}^{i} = \left[ \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1}^{i})}{\partial \mathbf{x}\_{k}} \left( \overline{\mathbf{Q}}\_{k-1|k-1}^{i} \right)^{-1} \frac{\partial f(\hat{\mathbf{x}}\_{k-1|k-1}^{i})^{T}}{\partial \mathbf{x}\_{k}} + \mathbf{\Omega}\_{k-1}^{i} \right]^{-1} \tag{32}$$

$$\mathfrak{q}\_{k|k-1}^{i} = \mathbf{Q}\_{k|k-1}^{i} \hat{\mathbf{x}}\_{k|k-1}^{i} \tag{33}$$

*Correction*

$$q\_{k|k}^i = q\_{k|k-1}^i + \sum\_{j=1}^N \mathbf{i}\_k^{i|j} \tag{34}$$

$$\mathbf{Q}\_{k|k}^{i} = \mathbf{Q}\_{k|k-1}^{i} + \sum\_{j=1}^{N} \mathbf{I}\_{k}^{i|j} \tag{35}$$

*Sensors* **2020**, *20*, 2682


In the correction phase (34), (35), it is assumed that each node *i* receives from the other nodes the local information *i*<sup>i|j</sup><sub>k</sub>, *I*<sup>i|j</sup><sub>k</sub>. Each node is able to calculate its own information contributions locally, starting from local measurements and without relying on information communicated by the other nodes, in a completely analogous way to Equations (23)–(25):


$$\mathbf{i}\_{k}^{i|i} = \frac{\partial h^{i}\left(\hat{\mathbf{x}}\_{k|k-1}^{i}\right)^{T}}{\partial \mathbf{x}\_{k}} \left(\mathbf{R}\_{k}^{i}\right)^{-1} \left[\mathbf{c}\_{k}^{i} + \frac{\partial h^{i}\left(\hat{\mathbf{x}}\_{k|k-1}^{i}\right)}{\partial \mathbf{x}\_{k}} \hat{\mathbf{x}}\_{k|k-1}^{i}\right] \tag{36}$$

$$\mathbf{I}\_{k}^{i|i} = \frac{\partial h^{i}\left(\hat{\mathbf{x}}\_{k|k-1}^{i}\right)^{T}}{\partial \mathbf{x}\_{k}} \left(\mathbf{R}\_{k}^{i}\right)^{-1} \frac{\partial h^{i}\left(\hat{\mathbf{x}}\_{k|k-1}^{i}\right)}{\partial \mathbf{x}\_{k}} \tag{37}$$
 

It is therefore necessary to look for transformations that locally carry out the following mappings in each node:

$$\mathbf{I}\_k^{j|j} \to \mathbf{I}\_k^{i|j} \qquad \forall j \neq i \tag{38}$$

$$\mathbf{i}\_k^{j|j} \to \mathbf{i}\_k^{i|j} \qquad \forall j \neq i \tag{39}$$

It is possible to show [29] that the Information Space Internodal Transformation map can be schematized as in Figure 1, where:

$$\mathbf{V}\_k^{ji} = \mathbf{T}\_k^i \left(\mathbf{T}\_k^j\right)^+ \tag{40}$$

$$\mathbf{T}\_k^{ji} = \mathbf{I}\_k^{i|j} \mathbf{V}\_k^{ji} \left(\mathbf{I}\_k^{j|j}\right)^+ \tag{41}$$

**Figure 1.** Information Space Internodal Transformation map.

The information pair in node V<sub>i</sub>, given only the observation *z*<sup>j</sup><sub>k</sub> of node V<sub>j</sub>, can thus be derived in each node starting from locally calculated quantities, as follows:

$$\mathbf{I}\_k^{i|j} = \left[ \mathbf{T}\_k^i \left[ \mathbf{T}\_k^{jT} \mathbf{I}\_k^{j|j} \mathbf{T}\_k^j \right]^+ \mathbf{T}\_k^{iT} \right]^+ = \mathcal{F}\_I^{j \to i}\left( \mathbf{I}\_k^{j|j} \right) \tag{42}$$

$$\mathbf{i}\_k^{i|j} = \mathbf{T}\_k^{ji} \mathbf{i}\_k^{j|j} = \mathcal{F}\_i^{j \to i}\left( \mathbf{i}\_k^{j|j} \right) \tag{43}$$

In order to carry out the transformations (42) and (43), each node *j* must therefore communicate to node *i* the information: *T*<sup>j</sup><sub>k</sub>, *I*<sup>j|j</sup><sub>k</sub>, *i*<sup>j|j</sup><sub>k</sub>.
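The internodal mapping of Equation (42) can be sketched numerically for selection-type nodal transformations: node *j*'s local information is lifted to the global coordinates, converted to covariance form via the pseudo-inverse, projected onto node *i*'s local states, and inverted back. The setup below (state indices and unit information values) is purely illustrative:

```python
import numpy as np

n = 4  # global state dimension (toy value)

def sel(rows, n):
    """Selection-type nodal transformation matrix T picking `rows`."""
    T = np.zeros((len(rows), n))
    for r, c in enumerate(rows):
        T[r, c] = 1.0
    return T

T_j = sel([0, 1, 2], n)  # node j keeps global states 0, 1, 2
T_i = sel([1, 2, 3], n)  # node i keeps global states 1, 2, 3

# Node j's local information matrix: unit information on each local state
I_jj = np.eye(3)

# Eq. (42): map node j's information into node i's local coordinates
I_ij = np.linalg.pinv(
    T_i @ np.linalg.pinv(T_j.T @ I_jj @ T_j) @ T_i.T
)
```

Node *i* ends up with information only about the states it shares with node *j* (here, global states 1 and 2) and none about its private state, which is exactly the clustering behavior discussed next.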

The solution identified, as highlighted by Equations (34) and (35), formally still applies to a fully connected system. The process of minimizing communication between nodes is highly dependent on the choice of the matrices *T*<sup>i</sup><sub>k</sub> of the model distributions. On the other hand, no hypothesis has been made so far about the criteria to be used to select these matrices. The effect of minimizing communications is evident by observing that *T*<sup>ji</sup><sub>k</sub> = *T*<sup>ij</sup><sub>k</sub> = 0 if two nodes do not share any common state. Therefore, the exchange of any information is not necessary between two nodes not sharing any common state. It is possible to extend this consideration to two or more sub-graphs that are individually strongly connected yet not connected to each other. By choosing a local state for each node that includes only the states of the nodes belonging to its strongly connected subgraph, the need for communication between unconnected sub-graphs is avoided. The selected algorithm then performs a sort of clustering of the estimation process, selecting groups of nodes that require exchanging data (see Figure 2).


**Figure 2.** Internodal Transformation and graph connectivity.

#### *2.4. Consensus Based Information Distribution*

The algorithm discussed in the previous paragraph does not completely solve the problem of the information distribution: two nodes that share part of the local state must exchange information directly. In this way, the algorithm autonomously manages the formation and disintegration of connected sub-graphs, but full connectivity is required within each sub-graph. To overcome this problem, it is possible to use the Consensus Theory.

problem, it is possible to use the Consensus Theory. Let us consider a strongly connected subgraph ሚ = ൛෨, ሚൟ⊆ included in the global graph. The summation of generic terms distributed between its nodes: Let us consider a strongly connected subgraph Ge = n Ve, Ae o ⊆ G included in the global graph. The summation of generic terms χ*<sup>i</sup>* distributed between its nodes:

$$X = \sum\_{i \in \widetilde{\mathcal{V}}} \chi\_i \tag{44}$$

can be obtained with a consensus averaging iterative process [38]:

$$\forall \, \mathcal{V}\_i \in \widetilde{\mathcal{V}} \quad \chi\_i^{(0)} = \chi\_i$$

$$\text{for } \ell = 0, \ldots, L-1 \qquad \chi\_i^{(\ell+1)} = \sum\_{j \in \mathcal{A}\_i} \mu\_{ij} \chi\_j^{(\ell)} \tag{45}$$

$$X = \widetilde{N} \chi\_i^{(L)} \tag{46}$$

where $\widetilde{N}$ is the number of nodes of the strongly connected sub-graph.

With a proper choice of the µ*ij*, the solution in fact converges to the average vector:

$$\lim\_{\ell \to \infty} \chi\_i^{(\ell)} = \frac{1}{\widetilde{N}} \sum\_{j \in \widetilde{\mathcal{V}}} \chi\_j \quad \forall \, \mathcal{V}\_i \in \widetilde{\mathcal{V}} \tag{47}$$

A possible choice of terms µ*ij* is to select them as local-degree weights:

$$\mu\_{ij} = \frac{1}{\max\{d(i), d(j)\}}, \quad \left(\mathcal{V}\_i, \mathcal{V}\_j\right) \in \mathcal{A} \tag{48}$$

$$\mu\_{ij} = 0, \quad \left(\mathcal{V}\_i, \mathcal{V}\_j\right) \notin \mathcal{A} \tag{49}$$

$$\mu\_{ii} = 1 - \sum\_{j \in \mathcal{A}\_i} \mu\_{ij} \tag{50}$$

where *d*(*i*) is the degree of the node $\mathcal{V}\_i$.
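The consensus iteration of Equation (45) with the local-degree weights of Equations (48)–(50) can be sketched as follows; the path graph and the values χ*<sup>i</sup>* are an illustrative example, not taken from the paper:

```python
import numpy as np

def local_degree_weights(adj):
    """Consensus weights of Equations (48)-(50): mu_ij = 1/max(d(i), d(j))."""
    n = len(adj)
    d = adj.sum(axis=1)                      # node degrees d(i)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / max(d[i], d[j])
        W[i, i] = 1.0 - W[i].sum()           # Equation (50)
    return W

# Path graph 0-1-2-3 and scalar values chi_i held by each node
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W = local_degree_weights(adj)
chi = np.array([4.0, 0.0, 8.0, 0.0])
for _ in range(200):                         # iteration of Equation (45)
    chi = W @ chi

# Every local value approaches the average 3.0; N * chi_i recovers the
# sum X = 12.0 of Equation (46)
print(np.round(chi, 3))
print(round(4 * chi[0], 3))
```

Because the local-degree weight matrix is doubly stochastic, each node converges to the average of the initial values (Equation (47)) using only exchanges with its direct neighbors.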

#### *2.5. Algorithm Description*

By applying to the Scalable Distributed algorithm defined by Equations (30)–(35) a consensus procedure to asymptotically obtain the summations in Equations (34) and (35), and by adding a consensus procedure on the a priori information pair to reproduce a Hybrid Consensus on Measurements and Consensus on Information (HCMCI) formulation [36], the following algorithm can be obtained:

#### *Update T<sup>i</sup><sub>k</sub>*

#### *Local prediction*

$$\hat{x}\_{k|k-1}^{i} = T\_k^i \, f\left(\left(T\_{k-1}^i\right)^+ \hat{x}\_{k-1|k-1}^i\right) \tag{51}$$

$$\widetilde{Q}\_{k-1|k-1}^{i} = T\_k^i \left\{\left(T\_{k-1}^i\right)^+ Q\_{k-1|k-1}^i \, T\_{k-1}^i\right\} \left(T\_k^i\right)^+ \tag{52}$$

$$Q\_{k|k-1}^{i} = \left[\frac{\partial f\left(\hat{x}\_{k-1|k-1}^i\right)}{\partial x\_k} \left(\widetilde{Q}\_{k-1|k-1}^i\right)^{-1} \frac{\partial f\left(\hat{x}\_{k-1|k-1}^i\right)^T}{\partial x\_k} + \Omega\_{k-1}^i\right]^{-1} \tag{53}$$

$$q\_{k|k-1}^{i} = Q\_{k|k-1}^{i} \, \hat{x}\_{k|k-1}^{i} \tag{54}$$
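The local prediction step of Equations (51)–(54) can be sketched numerically. A linear constant-velocity model is used here so that *f* and its Jacobian are explicit; the model, noise values, and the identity choice of *T<sup>i</sup><sub>k</sub>* are illustrative assumptions, not from the paper:

```python
import numpy as np

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])      # Jacobian of f (linear case)
f = lambda x: F @ x                        # process model
Omega = 0.01 * np.eye(2)                   # process noise covariance
T_prev = T_curr = np.eye(2)                # local state unchanged at this step

x_prev = np.array([1.0, 2.0])              # \hat{x}^i_{k-1|k-1}
Q_prev = np.linalg.inv(0.1 * np.eye(2))    # a posteriori information matrix

# (51) predicted local estimate
x_pred = T_curr @ f(np.linalg.pinv(T_prev) @ x_prev)
# (52) information matrix carried into the current local state space
Q_tilde = T_curr @ (np.linalg.pinv(T_prev) @ Q_prev @ T_prev) @ np.linalg.pinv(T_curr)
# (53) predicted information matrix (EKF prediction in information form)
Q_pred = np.linalg.inv(F @ np.linalg.inv(Q_tilde) @ F.T + Omega)
# (54) predicted information vector
q_pred = Q_pred @ x_pred

print(np.round(x_pred, 3))                 # → [1.2 2. ]
```

With *T<sup>i</sup><sub>k</sub>* equal to the identity, Equations (51)–(53) reduce to the standard extended information filter prediction; a time-varying *T<sup>i</sup><sub>k</sub>* additionally re-projects the pair onto the current local state space.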

*Consensus* (*on information*) ∀ *i*

*Initialization*

$$Q\_{k|k-1}^{i\,(0)} = Q\_{k|k-1}^{i} \qquad q\_{k|k-1}^{i\,(0)} = q\_{k|k-1}^{i} \tag{55}$$
 
$$\text{for } \ell = 0, 1, \ldots, L-1$$

$$\forall \, \mathcal{V}\_j \in \mathcal{A}\_i \ \text{ receive } Q\_{k|k-1}^{j\,(\ell)}, \ q\_{k|k-1}^{j\,(\ell)}$$

$$Q\_{k|k-1}^{i\,(\ell+1)} = \sum\_{j \in \mathcal{A}\_i} \mu\_{ij} \, T\_k^i \left(\left(T\_k^j\right)^+ Q\_{k|k-1}^{j\,(\ell)} \, T\_k^j\right) \left(T\_k^i\right)^+ \tag{56}$$

$$q\_{k|k-1}^{i\,(\ell+1)} = \sum\_{j \in \mathcal{A}\_i} \mu\_{ij} \, T\_k^i \left(T\_k^j\right)^+ q\_{k|k-1}^{j\,(\ell)} \tag{57}$$
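A single consensus iteration on the a priori information pair, Equations (56) and (57), can be sketched for two nodes whose local states overlap on one component of a three-state global vector; the *T* matrices, weights, and numeric values are illustrative assumptions:

```python
import numpy as np

# Node 0 holds global states {0, 1}; node 1 holds global states {1, 2}.
T = {0: np.array([[1., 0., 0.], [0., 1., 0.]]),
     1: np.array([[0., 1., 0.], [0., 0., 1.]])}
Q = {0: np.diag([2.0, 4.0]), 1: np.diag([4.0, 2.0])}   # local info matrices
q = {0: np.array([2.0, 4.0]), 1: np.array([4.0, 2.0])} # local info vectors
mu = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
neighbors = {0: [0, 1], 1: [0, 1]}          # each node includes itself (mu_ii)

Q_new, q_new = {}, {}
for i in [0, 1]:
    Ti, Ti_pinv = T[i], np.linalg.pinv(T[i])
    # Equation (56): lift each neighbor's info matrix to the global space
    # via (T_k^j)^+ ... T_k^j, then project onto node i's local space.
    Q_new[i] = sum(mu[i, j] * Ti @ (np.linalg.pinv(T[j]) @ Q[j] @ T[j]) @ Ti_pinv
                   for j in neighbors[i])
    # Equation (57): same transformation applied to the info vectors.
    q_new[i] = sum(mu[i, j] * Ti @ np.linalg.pinv(T[j]) @ q[j]
                   for j in neighbors[i])

print(np.round(q_new[0], 3))                # → [1. 4.]
```

Note that only the shared state (global component 1) actually mixes information from both nodes; components outside a neighbor's local state contribute zero through the transformation, consistent with the clustering behavior discussed in Section 2.3.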
