Article

In-Network Computation of the Optimal Weighting Matrix for Distributed Consensus on Wireless Sensor Networks

by Xabier Insausti *, Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez and Pedro M. Crespo
Department of Biomedical Engineering and Sciences, Tecnun, University of Navarra, Manuel Lardizábal 13, 20018 San Sebastián, Spain
* Author to whom correspondence should be addressed.
Sensors 2017, 17(8), 1702; https://doi.org/10.3390/s17081702
Submission received: 17 May 2017 / Revised: 3 July 2017 / Accepted: 21 July 2017 / Published: 25 July 2017
(This article belongs to the Special Issue Cognitive Radio Sensing and Sensor Networks)

Abstract: In a network, a distributed consensus algorithm is fully characterized by its weighting matrix. Although there exist numerical methods for obtaining the optimal weighting matrix, we have not found an in-network implementation of any of these methods that works for all network topologies. In this paper, we propose an in-network algorithm for finding such an optimal weighting matrix.

1. Introduction

A sensor is a device capable of measuring a certain physical property. Normally, in a wireless sensor network (WSN), each sensor or node can transmit and receive data wirelessly and can perform multiple tasks, which are usually based on simple mathematical operations such as additions and multiplications. Moreover, the sensors within a WSN are usually battery-powered, so their energy resources are very limited.
For most tasks, each sensor must compute a target value that depends on the values measured by other sensors of the WSN. Commonly, a WSN has a central entity, known as the central node, which collects the values measured by all the sensors, computes the target values, and sends each target value to the corresponding sensor. This strategy is known as centralized computation.
The main disadvantage of the centralized computation strategy is that it is extremely energy-inefficient from the transmission point of view: a sensor far from the central node must spend a disproportionate amount of its battery energy to deliver its measured value to the central node. An alternative strategy that overcomes this inefficiency is distributed, or in-network, computation. In distributed computation, which is a cooperative strategy, each sensor computes its target value by exchanging information with its neighbouring sensors.
In many recent signal processing applications of distributed computations (e.g., [1,2,3,4]), the average needs to be computed (i.e., each sensor seeks the arithmetic mean of the values measured by all the sensors of the WSN). The problem of obtaining that average in all the sensors of the WSN by using the distributed computation strategy is known as the distributed averaging problem, or the distributed average consensus problem. Moreover, the problem of obtaining the same value in all the sensors of the WSN by using the distributed computation strategy is known as the distributed consensus problem (see, for example, [5] for a review on this subject).
A common approach for solving the distributed averaging problem is to use a synchronous linear iterative algorithm that is characterized by a matrix, which is called the weighting matrix. A well-known problem related to this topic is that of finding a symmetric weighting matrix that achieves consensus as fast as possible. This is the problem of finding the fastest symmetric distributed linear averaging (FSDLA) algorithm.
The FSDLA problem was solved in [6]. Specifically, in [6], the authors proved that solving the FSDLA problem is equivalent to solving a semidefinite program, and they used the subgradient method for efficiently solving such a problem to obtain the corresponding weighting matrix. Unfortunately, solving the FSDLA problem this way requires a central entity with full knowledge of the entire network. This central entity has to solve the FSDLA problem and then communicate the solution to each node of the network. This process has to be repeated each time the network topology changes due to, for example, a node failing, a node being added or removed (plug-and-play networks), or a node changing its location.
Moreover, WSNs may not have a central entity to compute the optimal weighting matrix. This paper proposes, for those networks without a central entity, an in-network algorithm for finding the optimal weighting matrix.
It is worth mentioning that other in-network algorithms that address the FSDLA problem in a distributed way can be found in the literature. In particular, in [7], the authors present an in-network algorithm that computes the fastest symmetric weighting matrix with positive weights only. As will be made more explicit in the next section, this matrix is not, in general, a solution of the FSDLA problem, since the optimal weighting matrix might contain negative weights.
In [8], the FSDLA problem is solved in a centralized way when the communication among nodes is noisy. Closed-form expressions for the optimal weights for certain network topologies (paths, cycles, grids, stars, and hypercubes) are also provided. However, unless the considered network topology is one of these five, an in-network solution to the FSDLA is not provided.
Finally, in [9], an in-network algorithm for solving the FSDLA problem is provided. However, as the authors claim, the algorithm breaks down when the second- and third-largest eigenvalues of the weighting matrix become similar or equal.
Unlike the approaches found in the literature, the in-network algorithm presented in this paper is proved to always converge to the solution of the FSDLA problem, irrespective of the considered network topology.

2. Preliminaries

2.1. The Distributed Average Consensus Problem

We consider a network composed of $n$ nodes. The network can be viewed as an undirected graph $G = (V, E)$, where $V = \{1, 2, \ldots, n\}$ is the set of nodes and $E$ is the set of edges. An edge $e = \{i, j\} \in E$ means that nodes $i, j \in V$ are connected and can therefore exchange information. Conversely, $\{i, j\} \notin E$ means that nodes $i, j \in V$ are not connected and cannot exchange information. We let $K$ be the cardinality of $E$, i.e., $K$ is the number of edges in the graph $G$. For simplicity, we enumerate the edges of $G$ as $E = \{e_1, e_2, \ldots, e_K\}$, where $e_k = \{i_k, j_k\}$ for all $k \in \{1, 2, \ldots, K\}$.
We assume that each node $i \in V$ has an initial value $x_i(0) \in \mathbb{R}$, where $\mathbb{R}$ denotes the set of (finite) real numbers. Accordingly, in this paper, $\mathbb{R}^{m \times n}$ denotes the set of $m \times n$ real matrices. We consider that all the nodes are interested in obtaining the arithmetic mean (average) $x_{\mathrm{ave}}$ of the initial values of the nodes, that is,
$$x_{\mathrm{ave}} := \frac{1}{n} \sum_{i=1}^{n} x_i(0),$$
using a distributed algorithm. This problem is commonly known as the distributed averaging problem, or the distributed average consensus problem.
The approach that will be considered here for solving the distributed averaging problem is to use a linear iterative algorithm of the form
$$x_i(t+1) = w_{i,i}\, x_i(t) + \sum_{j \in V :\, \{i,j\} \in E} w_{i,j}\, x_j(t), \qquad i \in V \tag{1}$$
where time $t$ is assumed to be discrete (namely, $t \in \{0, 1, 2, \ldots\}$) and $w_{i,j} \in \mathbb{R}$ are the weights that need to be set so that
$$\lim_{t \to \infty} x_i(t) = x_{\mathrm{ave}} \tag{2}$$
for all $i \in V$ and for all $x_1(0), x_2(0), \ldots, x_n(0) \in \mathbb{R}$. From the point of view of communication protocols, there exist efficient ways of implementing synchronous consensus algorithms of the form of Equation (1) (e.g., [10]).
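To make Equation (1) concrete, the following Python/NumPy sketch simulates the synchronous per-node update on a small example; the 4-node ring topology and the uniform edge weight of 1/3 are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

# Illustrative 4-node ring: V = {0, 1, 2, 3}, E = {{0,1}, {1,2}, {2,3}, {3,0}}
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
neighbors = {i: [] for i in range(n)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

w_edge = 1.0 / 3.0                      # w_{i,j} on every edge (illustrative)
x = np.array([1.0, 5.0, 3.0, 7.0])      # initial values x_i(0); x_ave = 4

for t in range(100):                    # Equation (1), run synchronously
    x_next = np.empty(n)
    for i in range(n):
        w_ii = 1.0 - w_edge * len(neighbors[i])   # each row of W sums to 1
        x_next[i] = w_ii * x[i] + w_edge * sum(x[j] for j in neighbors[i])
    x = x_next

print(x)                                # every entry tends to x_ave = 4.0
```

Note that each node only ever uses its own value and those of its neighbours, which is exactly the communication pattern that Equation (1) prescribes.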
We observe that Equation (1) can be written in matrix form as
$$\mathbf{x}(t+1) = W\, \mathbf{x}(t) \tag{3}$$
where $\mathbf{x}(t) = (x_1(t), x_2(t), \ldots, x_n(t))^\top \in \mathbb{R}^{n \times 1}$, and $W \in \mathbb{R}^{n \times n}$ is called the weighting matrix, whose entry at the $i$th row and $j$th column, $[W]_{i,j}$, is given by
$$[W]_{i,j} = \begin{cases} 0 & \text{if } i \neq j \text{ and } \{i, j\} \notin E, \\ w_{i,j} & \text{otherwise}, \end{cases} \qquad i, j \in \{1, 2, \ldots, n\}. \tag{4}$$
Therefore, Equation (2) can be rewritten as
$$\lim_{t \to \infty} W^t = P_n \tag{5}$$
where $P_n := \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$, and $\mathbf{1}_n$ is the $n \times 1$ matrix of ones.
We only consider algorithms of the form of Equation (3) for which the weighting matrix $W$ is symmetric. If $W$ is symmetric, it is shown in [6] (Theorem 1) that Equation (5) holds if and only if $W \mathbf{1}_n = \mathbf{1}_n$ and $\|W - P_n\|_2 < 1$, where $\|\cdot\|_2$ denotes the spectral norm. For the reader's convenience, we recall here that if $A \in \mathbb{R}^{n \times n}$ is symmetric, then $\|A\|_2 = |\lambda_1(A)|$, where $\lambda_l(A)$, $l \in \{1, 2, \ldots, n\}$, denote the eigenvalues of $A$, which, in this paper, are arranged such that $|\lambda_1(A)| \geq |\lambda_2(A)| \geq \cdots \geq |\lambda_n(A)|$ (e.g., [11] (pp. 350, 603)).
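These two conditions are easy to verify numerically. A minimal sketch, reusing the ring example above (the example matrix is ours, for illustration):

```python
import numpy as np

n = 4
W = np.array([[1/3, 1/3, 0.0, 1/3],     # weighting matrix of the ring example
              [1/3, 1/3, 1/3, 0.0],
              [0.0, 1/3, 1/3, 1/3],
              [1/3, 0.0, 1/3, 1/3]])

ones = np.ones((n, 1))
P = (ones @ ones.T) / n                 # P_n = (1/n) 1_n 1_n^T

assert np.allclose(W, W.T)              # W is symmetric
assert np.allclose(W @ ones, ones)      # W 1_n = 1_n
rho = np.linalg.norm(W - P, 2)          # spectral norm ||W - P_n||_2
print(rho < 1)                          # True, hence lim_t W^t = P_n

# For symmetric A, ||A||_2 = |lambda_1(A)|, the largest eigenvalue magnitude:
print(np.isclose(rho, np.max(np.abs(np.linalg.eigvalsh(W - P)))))
```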

2.2. Considered Minimization Problem: FSDLA Problem

We denote by $\mathcal{W}(G)$ the set of all $n \times n$ real symmetric matrices that simultaneously satisfy Equation (4) and $W \mathbf{1}_n = \mathbf{1}_n$, that is,
$$\mathcal{W}(G) := \left\{ W \in \mathbb{R}^{n \times n} :\ [W]_{i,j} = 0 \ \text{if}\ i \neq j \ \text{and}\ \{i,j\} \notin E,\quad W = W^\top,\quad W \mathbf{1}_n = \mathbf{1}_n \right\}.$$
In [6], the convergence time of an algorithm of the form of Equation (3) with symmetric weighting matrix $W$ is defined as
$$\tau(W) := \frac{1}{\log\left( 1 / \|W - P_n\|_2 \right)}. \tag{6}$$
This convergence time is a mathematical measure of the convergence speed of the algorithm.
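In code, $\tau(W)$ is a one-liner on top of the spectral norm; a small helper, assuming the conventions above (the function name is ours):

```python
import numpy as np

def tau(W):
    """Convergence time tau(W) = 1 / log(1 / ||W - P_n||_2), Equation (6)."""
    n = W.shape[0]
    P = np.ones((n, n)) / n
    return 1.0 / np.log(1.0 / np.linalg.norm(W - P, 2))

# For the ring example above, ||W - P_n||_2 = 1/3, so tau(W) = 1/log(3) ~ 0.91.
```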
Accordingly, the FSDLA problem is the problem of finding a weighting matrix $W_{\mathrm{opt}} \in \mathcal{W}(G)$ such that
$$\|W_{\mathrm{opt}} - P_n\|_2 \leq \|W - P_n\|_2 \qquad \forall\, W \in \mathcal{W}(G). \tag{7}$$
We observe that, in this definition, "fastest" is meant in terms of convergence time.
It is shown in [6] that the FSDLA problem of Equation (7) is a constrained convex minimization problem that can be solved efficiently. In fact, it is shown in [6] that the FSDLA problem of Equation (7) can be expressed as a semidefinite program, and semidefinite programs can be solved efficiently [12]. However, to the best of our knowledge, there is as yet no approach for solving the FSDLA problem in a distributed (in-network) way. The contribution of this paper is to solve the FSDLA problem of Equation (7) in a distributed way; to do so, we develop a distributed subgradient method.
Finally, it should be mentioned that in [7], the authors solved, in a distributed way, a related problem: finding the fastest mixing Markov chain (FMMC). The FMMC problem consists of finding a matrix $W_{\mathrm{opt}}^{+} \in \mathcal{W}(G) \cap \{W \in \mathbb{R}^{n \times n} : [W]_{i,j} \geq 0,\ i, j \in \{1, \ldots, n\}\}$ such that $\|W_{\mathrm{opt}}^{+} - P_n\|_2 \leq \|W - P_n\|_2$ for all $W \in \mathcal{W}(G) \cap \{W \in \mathbb{R}^{n \times n} : [W]_{i,j} \geq 0,\ i, j \in \{1, \ldots, n\}\}$. We observe that $\|W_{\mathrm{opt}} - P_n\|_2 \leq \|W_{\mathrm{opt}}^{+} - P_n\|_2$, i.e., the solution of the FSDLA problem is at least as fast as the solution of the FMMC problem.

2.3. FSDLA as an Unconstrained Convex Minimization Problem

In order to use a distributed subgradient method (the classical reference on subgradient methods is [13]), we first need to convert the FSDLA problem into an unconstrained convex minimization problem. We observe that if $W \in \mathcal{W}(G)$, it is clear that $W$ depends on $w_{e_k} := w_{i_k, j_k}$ for all $k \in \{1, 2, \ldots, K\}$; we notice that $w_{e_k}$ is well defined because $W$ is symmetric. In fact, as stated in [6], given a vector $\mathbf{w} = (w_{e_1}, w_{e_2}, \ldots, w_{e_K})^\top \in \mathbb{R}^{K \times 1}$, there exists a unique $W \in \mathcal{W}(G)$ such that $[W]_{i_k, j_k} = w_{e_k}$ for all $k \in \{1, 2, \ldots, K\}$, namely
$$W(\mathbf{w}) = I_n + \sum_{k=1}^{K} w_{e_k} A_k \tag{8}$$
where $I_n$ is the $n \times n$ identity matrix and $A_k \in \mathbb{R}^{n \times n}$ is defined as
$$[A_k]_{i,j} := \begin{cases} 1 & \text{if } \{i, j\} = \{i_k, j_k\}, \\ -1 & \text{if } i = j = i_k \ \text{or}\ i = j = j_k, \\ 0 & \text{otherwise}, \end{cases} \qquad k \in \{1, 2, \ldots, K\}.$$
In other words, the function $W : \mathbb{R}^{K \times 1} \to \mathcal{W}(G)$ defined in Equation (8) is a bijection. We define the function $f : \mathbb{R}^{K \times 1} \to [0, \infty)$ as $f(\mathbf{w}) := \|W(\mathbf{w}) - P_n\|_2$. We observe that the FSDLA problem of Equation (7) can now be expressed as the unconstrained minimization of the function $f$.
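The parametrization of Equation (8) translates directly into code. A centralized NumPy sketch (the function names are ours) that builds $W(\mathbf{w})$ from the vector of edge weights and evaluates the cost $f$:

```python
import numpy as np

def weighting_matrix(w, edges, n):
    """W(w) = I_n + sum_k w_{e_k} A_k, Equation (8)."""
    W = np.eye(n)
    for w_k, (i, j) in zip(w, edges):
        W[i, j] += w_k                  # [A_k]_{i,j} = [A_k]_{j,i} = 1
        W[j, i] += w_k
        W[i, i] -= w_k                  # [A_k]_{i,i} = [A_k]_{j,j} = -1
        W[j, j] -= w_k
    return W

def f(w, edges, n):
    """Cost function f(w) = ||W(w) - P_n||_2."""
    P = np.ones((n, n)) / n
    return np.linalg.norm(weighting_matrix(w, edges, n) - P, 2)
```

By construction, every $W(\mathbf{w})$ is symmetric and satisfies $W(\mathbf{w}) \mathbf{1}_n = \mathbf{1}_n$, which is why the minimization over $\mathbf{w} \in \mathbb{R}^{K \times 1}$ is indeed unconstrained.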
In the sequel, we denote by $\hat{\mathbf{w}}$ a solution of the FSDLA problem, that is,
$$f(\hat{\mathbf{w}}) \leq f(\mathbf{w}) \qquad \forall\, \mathbf{w} \in \mathbb{R}^{K \times 1}.$$
It is easy to show that $f$ has a bounded set of minimum points $\hat{\mathbf{w}}$. In the sequel, we will refer to the function $f$ as the cost function of the FSDLA problem. We finish the section with Lemma 1, which will be useful in the derivation of the algorithm.
Lemma 1.
If $\mathbf{w} \in \mathbb{R}^{K \times 1}$, then
$$f(\mathbf{w}) = \begin{cases} |\lambda_1(W(\mathbf{w}))| & \text{if } |\lambda_2(W(\mathbf{w}))| \geq 1, \\ |\lambda_2(W(\mathbf{w}))| & \text{if } |\lambda_2(W(\mathbf{w}))| < 1. \end{cases}$$
Proof. 
Observe that, as $W = W(\mathbf{w})$ is symmetric and $W \mathbf{1}_n = \mathbf{1}_n$, there exists an eigenvalue decomposition of $W$, $W = U\, \mathrm{diag}_n(1, \lambda_2(W), \ldots, \lambda_n(W))\, U^\top$, where $U$ is a real $n \times n$ orthogonal matrix such that $[U]_{i,1} = \frac{1}{\sqrt{n}}$ for all $i \in \{1, 2, \ldots, n\}$ and $|\lambda_2(W)| \geq |\lambda_3(W)| \geq \cdots \geq |\lambda_n(W)|$. As $P_n = U\, \mathrm{diag}_n(1, 0, \ldots, 0)\, U^\top$, we have
$$f(\mathbf{w}) = \|W - P_n\|_2 = \left\| U\, \mathrm{diag}_n(0, \lambda_2(W), \ldots, \lambda_n(W))\, U^\top \right\|_2 = \left\| \mathrm{diag}_n(0, \lambda_2(W), \ldots, \lambda_n(W)) \right\|_2 = |\lambda_2(W)|. \qquad ☐$$

3. Algorithm for the In-Network Solution of the FSDLA Problem

We here derive the algorithm that solves the FSDLA problem in a distributed way (Algorithm 1). To this end, we assume that $n$ is known by all the nodes of the network; the task of counting nodes can be performed in a distributed way (see [14]). The algorithm is a distributed implementation of a subgradient method. More specifically, each pair of nodes $\{i_k, j_k\}$ will update their weight $w_{i_k, j_k}$ according to the following iterative formula:
$$\mathbf{w}_{p+1} = \mathbf{w}_p - \eta_{p+1}\, \tilde{\nabla} f(\mathbf{w}_p) \tag{9}$$
where $\mathbf{w}_p \in \mathbb{R}^{K \times 1}$ is the vector of weights at the $p$th step, $\eta_p \in \mathbb{R}$ is the stepsize, and $\tilde{\nabla} f(\mathbf{w})$ is a subgradient of $f$ at $\mathbf{w}$. We recall here that a vector $\tilde{\nabla} f(\mathbf{w}) \in \mathbb{R}^{K \times 1}$ is a subgradient of $f : \mathbb{R}^{K \times 1} \to \mathbb{R}$ at $\mathbf{w} \in \mathbb{R}^{K \times 1}$ if $f(\mathbf{v}) \geq f(\mathbf{w}) + (\tilde{\nabla} f(\mathbf{w}))^\top (\mathbf{v} - \mathbf{w})$ for all $\mathbf{v} \in \mathbb{R}^{K \times 1}$.
The rest of the section is devoted to proving that Equation (9) can be computed in a distributed way (Theorems 1–3), and to proving that Equation (9) actually converges to $\hat{\mathbf{w}}$ (Theorem 4).
In order to compute Equation (9) in a distributed way, we need to compute a subgradient of $f$ in a distributed way. With this in mind, we review a result given in [6].
Theorem 1.
If $\mathbf{w} \in \mathbb{R}^{K \times 1}$ is such that $0 < f(\mathbf{w}) < 1$, and $\mathbf{y} = (y_1, y_2, \ldots, y_n)^\top \in \mathbb{R}^{n \times 1}$ is such that $\|\mathbf{y}\| = 1$ and $W(\mathbf{w})\, \mathbf{y} = (-1)^s\, |\lambda_2(W(\mathbf{w}))|\, \mathbf{y}$ for some $s \in \{1, 2\}$, then a subgradient of $f$ at $\mathbf{w}$ is
$$\tilde{\nabla} f(\mathbf{w}) = (-1)^{s+1} \left( (y_{i_1} - y_{j_1})^2,\ \ldots,\ (y_{i_K} - y_{j_K})^2 \right)^\top. \tag{10}$$
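Equation (10) admits a quick centralized check via an eigendecomposition of $W(\mathbf{w}) - P_n$ (the in-network computation of $\mathbf{y}$ is the subject of the results below). A NumPy sketch, with the function name ours:

```python
import numpy as np

def subgradient(W, edges):
    """Subgradient of f at w (Equation (10)), computed centrally from W = W(w)."""
    n = W.shape[0]
    P = np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(W - P)       # W - P_n is symmetric
    idx = np.argmax(np.abs(vals))            # eigenvalue achieving ||W - P_n||_2
    s = 2 if vals[idx] > 0 else 1            # W y = (-1)^s |lambda_2(W)| y
    y = vecs[:, idx]                         # unit eigenvector, ||y|| = 1
    return np.array([(-1) ** (s + 1) * (y[i] - y[j]) ** 2 for i, j in edges])
```

Since the top eigenvector of $W - P_n$ is orthogonal to $\mathbf{1}_n$ whenever $f(\mathbf{w}) > 0$, it is also an eigenvector of $W(\mathbf{w})$ itself, matching the hypothesis of Theorem 1.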
We observe that Equation (10) can be computed in a distributed way if each node $i \in V$ is able to know $y_i$. The following result provides a means of computing such a unit eigenvector $\mathbf{y}$ of $W(\mathbf{w})$ in a distributed way.
Theorem 2.
If $\mathbf{w} \in \mathbb{R}^{K \times 1}$ is such that $0 < f(\mathbf{w}) < 1$, then for all $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$,
$$W(\mathbf{w})\, \mathbf{y}_s = (-1)^s\, |\lambda_2(W(\mathbf{w}))|\, \mathbf{y}_s, \qquad s \in \{1, 2\} \tag{11}$$
where
$$\mathbf{y}_s := \lim_{t \to \infty} \left( \frac{\mathbf{x}(t) - \mathbf{x}(t-2)}{\left( (-1)^s f(\mathbf{w}) \right)^{t}} + \frac{\mathbf{x}(t-1) - \mathbf{x}(t-3)}{\left( (-1)^s f(\mathbf{w}) \right)^{t-1}} \right) \tag{12}$$
and $\mathbf{x}(t) = (W(\mathbf{w}))^t\, \mathbf{x}(0)$ for all $t \in \{0, 1, 2, \ldots\}$. Furthermore, given $s \in \{1, 2\}$, for almost every $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$, the following assertions are equivalent:
(a) $\mathbf{y}_s \neq \mathbf{0}_{n \times 1}$, where $\mathbf{0}_{n \times 1}$ is the $n \times 1$ zero matrix.
(b) $(-1)^s\, |\lambda_2(W(\mathbf{w}))|$ is an eigenvalue of $W(\mathbf{w})$.
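The estimator of Equation (12) can be simulated centrally as follows (in the actual network, each node would hold only its own entry of $\mathbf{x}(t)$, and $f(\mathbf{w})$ would be obtained via Theorem 3 below); the function name is ours:

```python
import numpy as np

def eigvec_estimate(W, s, t, x0):
    """Approximate y_s of Equation (12) from the iterates x(t) = W^t x(0)."""
    n = W.shape[0]
    P = np.ones((n, n)) / n
    f_w = np.linalg.norm(W - P, 2)           # f(w) = |lambda_2(W)| (Lemma 1)
    xs = [x0]
    for _ in range(t):
        xs.append(W @ xs[-1])                # x(t) = W^t x(0)
    sgn = (-1.0) ** s
    return (xs[t] - xs[t - 2]) / (sgn * f_w) ** t \
         + (xs[t - 1] - xs[t - 3]) / (sgn * f_w) ** (t - 1)
```

If $(-1)^s |\lambda_2(W)|$ is an eigenvalue of $W$, the returned vector approaches a nonzero multiple of an eigenvector (assertion (a)); otherwise, it tends to the zero vector, which is exactly how Algorithm 1 below decides between $s = 1$ and $s = 2$.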
Algorithm 1 In-network solution of the FSDLA problem.
1: $p \leftarrow 0$
2: for all pairs of nodes $e_k = \{i_k, j_k\}$ do
3:   $[\mathbf{w}_0]_k \leftarrow 1/\max(d_{i_k}, d_{j_k})$
4: end for
5: $p \leftarrow p + 1$
6: for all nodes $i \in V$ do
7:   $[\mathbf{x}]_i \leftarrow \mathrm{rand}()$   ▹ An arbitrary value
8:   $[\boldsymbol{\gamma}_1]_i \leftarrow [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 1)]_i$
9:   $[\boldsymbol{\gamma}_2]_i \leftarrow [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 1)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 2)]_i$
10:  $f(\mathbf{w}_p) = \|W(\mathbf{w}_p) - P_n\|_2 \leftarrow \sqrt{\dfrac{[\mathrm{ave}_{\mathbf{w}_p}(([\boldsymbol{\gamma}_1]_1^2, \ldots, [\boldsymbol{\gamma}_1]_n^2)^\top,\, t_0)]_i}{[\mathrm{ave}_{\mathbf{w}_p}(([\boldsymbol{\gamma}_2]_1^2, \ldots, [\boldsymbol{\gamma}_2]_n^2)^\top,\, t_0)]_i}}$
11:  if $f(\mathbf{w}_p) \geq 1$ then
12:    $\mathbf{w}_p \leftarrow \mathbf{w}_{p-1}$
13:  end if
14:  $[\mathbf{y}_1]_i \leftarrow \dfrac{[\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 2)]_i}{(-f(\mathbf{w}_p))^{t_0}} + \dfrac{[\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 1)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 3)]_i}{(-f(\mathbf{w}_p))^{t_0 - 1}}$
15:  $[\mathbf{y}_2]_i \leftarrow \dfrac{[\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 2)]_i}{(f(\mathbf{w}_p))^{t_0}} + \dfrac{[\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 1)]_i - [\mathrm{ave}_{\mathbf{w}_p}(\mathbf{x}, t_0 - 3)]_i}{(f(\mathbf{w}_p))^{t_0 - 1}}$
16:  $\|\mathbf{y}_1\| \leftarrow \sqrt{n\,[\mathrm{ave}_{\mathbf{w}_p}(([\mathbf{y}_1]_1^2, \ldots, [\mathbf{y}_1]_n^2)^\top,\, t_0)]_i}$
17:  if $\|\mathbf{y}_1\| \neq 0$ then
18:    $[\mathbf{y}]_i \leftarrow [\mathbf{y}_1]_i / \|\mathbf{y}_1\|$
19:    $s \leftarrow 1$
20:  else
21:    $\|\mathbf{y}_2\| \leftarrow \sqrt{n\,[\mathrm{ave}_{\mathbf{w}_p}(([\mathbf{y}_2]_1^2, \ldots, [\mathbf{y}_2]_n^2)^\top,\, t_0)]_i}$
22:    $[\mathbf{y}]_i \leftarrow [\mathbf{y}_2]_i / \|\mathbf{y}_2\|$
23:    $s \leftarrow 2$
24:  end if
25: end for
26: for all pairs of nodes $e_k = \{i_k, j_k\}$ do
27:  $[\tilde{\nabla} f(\mathbf{w}_p)]_k \leftarrow (-1)^{s+1}\,([\mathbf{y}]_{i_k} - [\mathbf{y}]_{j_k})^2$
28:  $[\mathbf{w}_p]_k \leftarrow [\mathbf{w}_{p-1}]_k - \beta_p\,[\tilde{\nabla} f(\mathbf{w}_p)]_k$
29: end for
30: if $p < p_{\max}$ go to 5
Proof of Theorem 2.
Let $W = W(\mathbf{w}) = U\, \mathrm{diag}_n(1, \lambda_2(W), \ldots, \lambda_n(W))\, U^\top$ be as in the proof of Lemma 1, with $U = (\mathbf{u}_1 \,|\, \mathbf{u}_2 \,|\, \cdots \,|\, \mathbf{u}_n)$. Observe that $\lambda_2(W) \neq 0$, as $|\lambda_2(W)| = f(\mathbf{w}) \neq 0$.
If $(-1)^{s-1} |\lambda_2(W)|$ is an eigenvalue of $W$ for some $s \in \{1, 2\}$, then we denote by $L_s$ its algebraic multiplicity; otherwise, we set $L_s = 0$. From Lemma 1, $f(\mathbf{w}) = |\lambda_2(W)|$, and consequently $L_1$ and $L_2$ cannot be simultaneously zero. Moreover, without loss of generality, we can assume that $\lambda_2(W) \geq \cdots \geq \lambda_{L_1 + L_2 + 1}(W)$.
Then, we have that
$$\begin{aligned} \mathbf{x}(t) &= W^t \mathbf{x}(0) = W^t \sum_{l=1}^{n} \alpha_l \mathbf{u}_l = \sum_{l=1}^{n} \alpha_l W^t \mathbf{u}_l \\ &= \alpha_1 \mathbf{u}_1 + \sum_{l=2}^{L_1+1} \alpha_l\, |\lambda_2(W)|^t\, \mathbf{u}_l + \sum_{l=L_1+2}^{L_1+L_2+1} \alpha_l\, (-1)^t |\lambda_2(W)|^t\, \mathbf{u}_l + \sum_{l=L_1+L_2+2}^{n} \alpha_l\, \lambda_l(W)^t\, \mathbf{u}_l \\ &= \alpha_1 \mathbf{u}_1 + |\lambda_2(W)|^t \left( (-1)^t \mathbf{a}_1 + \mathbf{a}_2 \right) + \mathbf{r}(t), \qquad t \in \{0, 1, 2, \ldots\} \end{aligned} \tag{13}$$
where $\alpha_l = \mathbf{x}(0)^\top \mathbf{u}_l$ for all $l \in \{1, 2, \ldots, n\}$,
$$\mathbf{a}_1 := \sum_{l=L_1+2}^{L_1+L_2+1} \alpha_l \mathbf{u}_l, \qquad \mathbf{a}_2 := \sum_{l=2}^{L_1+1} \alpha_l \mathbf{u}_l,$$
and
$$\mathbf{r}(t) := \sum_{l=L_1+L_2+2}^{n} \alpha_l\, \lambda_l(W)^t\, \mathbf{u}_l.$$
Observe that
$$W \mathbf{a}_s = (-1)^s\, |\lambda_2(W)|\, \mathbf{a}_s, \qquad s \in \{1, 2\}. \tag{14}$$
On the one hand, from Equation (13), we obtain
$$\frac{\mathbf{x}(t) - \mathbf{x}(t-2)}{|\lambda_2(W)|^t} = \frac{|\lambda_2(W)|^t \left( (-1)^t \mathbf{a}_1 + \mathbf{a}_2 \right) + \mathbf{r}(t) - |\lambda_2(W)|^{t-2} \left( (-1)^{t-2} \mathbf{a}_1 + \mathbf{a}_2 \right) - \mathbf{r}(t-2)}{|\lambda_2(W)|^t} = \left( 1 - \frac{1}{|\lambda_2(W)|^2} \right) \left( (-1)^t \mathbf{a}_1 + \mathbf{a}_2 \right) + \frac{\mathbf{r}(t) - \mathbf{r}(t-2)}{|\lambda_2(W)|^t}$$
for all $t \in \{2, 3, \ldots\}$. On the other hand, as $|\lambda_l(W)| < |\lambda_2(W)|$ for all $l \in \{L_1 + L_2 + 2, \ldots, n\}$, we have that
$$\lim_{t \to \infty} \frac{\mathbf{r}(t)}{|\lambda_2(W)|^t} = \lim_{t \to \infty} \sum_{l=L_1+L_2+2}^{n} \alpha_l \left( \frac{\lambda_l(W)}{|\lambda_2(W)|} \right)^t \mathbf{u}_l = \mathbf{0}_{n \times 1}.$$
Consequently,
$$\begin{aligned} \mathbf{y}_s &= \lim_{t \to \infty} \left( \frac{\mathbf{x}(t) - \mathbf{x}(t-2)}{\left( (-1)^s |\lambda_2(W)| \right)^t} + \frac{\mathbf{x}(t-1) - \mathbf{x}(t-3)}{\left( (-1)^s |\lambda_2(W)| \right)^{t-1}} \right) \\ &= \lim_{t \to \infty} \Bigg( \left( 1 - \frac{1}{|\lambda_2(W)|^2} \right) (-1)^{st} \Big( \left( (-1)^t \mathbf{a}_1 + \mathbf{a}_2 \right) + (-1)^s \left( (-1)^{t-1} \mathbf{a}_1 + \mathbf{a}_2 \right) \Big) + \frac{\mathbf{r}(t) - \mathbf{r}(t-2)}{\left( (-1)^s |\lambda_2(W)| \right)^t} + \frac{\mathbf{r}(t-1) - \mathbf{r}(t-3)}{\left( (-1)^s |\lambda_2(W)| \right)^{t-1}} \Bigg) \\ &= \lim_{t \to \infty} \left( 1 - \frac{1}{|\lambda_2(W)|^2} \right) (-1)^{st} \Big( \left( (-1)^t \mathbf{a}_1 + \mathbf{a}_2 \right) + (-1)^s \left( (-1)^{t-1} \mathbf{a}_1 + \mathbf{a}_2 \right) \Big) = 2 \left( 1 - \frac{1}{|\lambda_2(W)|^2} \right) \mathbf{a}_s, \qquad s \in \{1, 2\}. \end{aligned} \tag{15}$$
Combining Equations (14) and (15), we obtain Equation (11).
From Equation (11), (a) implies (b) for all $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$.
As $f(\mathbf{w}) < 1$, from Lemma 1 and Equation (15), we have $\mathbf{y}_s \neq \mathbf{0}_{n \times 1}$ if and only if $\mathbf{a}_s \neq \mathbf{0}_{n \times 1}$. Consequently, if (b) holds, the set of $\mathbf{x}(0)$ such that $\mathbf{a}_s = \mathbf{0}_{n \times 1}$ is a vector space whose dimension is less than $n$; thus, it has Lebesgue measure 0. Therefore, (a) and (b) are equivalent for almost every $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$. ☐
Theorem 2 implies that $\mathbf{y}_1$ and $\mathbf{y}_2$ cannot be zero simultaneously. Therefore, either $\mathbf{y}_1 / \|\mathbf{y}_1\|$ or $\mathbf{y}_2 / \|\mathbf{y}_2\|$ is the unit eigenvector required for computing Equation (10). We notice that the norm of a vector can be computed in a distributed way because it is the square root of $n$ times the average of the squares of its entries. Consequently, we only need to know how to compute Equation (12) in a distributed way or, equivalently, how to compute the cost function $f$ in a distributed way:
Theorem 3.
If $\mathbf{w} \in \mathbb{R}^{K \times 1}$ is such that $f(\mathbf{w}) \neq 0$, then
$$f(\mathbf{w}) = \lim_{t \to \infty} \frac{\|\mathbf{x}(t) - \mathbf{x}(t-1)\|}{\|\mathbf{x}(t-1) - \mathbf{x}(t-2)\|} \tag{16}$$
for almost every $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$, where $\mathbf{x}(t) = (W(\mathbf{w}))^t\, \mathbf{x}(0)$ for all $t \in \{0, 1, 2, \ldots\}$.
Proof. 
Let $W = W(\mathbf{w}) = U\, \mathrm{diag}_n(1, \lambda_2(W), \ldots, \lambda_n(W))\, U^\top$ be as in the proof of Lemma 1, with $U = (\mathbf{u}_1 \,|\, \mathbf{u}_2 \,|\, \cdots \,|\, \mathbf{u}_n)$. Then,
$$\mathbf{x}(t) = W^t \mathbf{x}(0) = W^t \sum_{l=1}^{n} \alpha_l \mathbf{u}_l = \sum_{l=1}^{n} \alpha_l W^t \mathbf{u}_l = \alpha_1 \mathbf{u}_1 + \sum_{l=2}^{n} \alpha_l\, \lambda_l(W)^t\, \mathbf{u}_l$$
for all $t \in \{0, 1, 2, \ldots\}$, where $\alpha_l = \mathbf{x}(0)^\top \mathbf{u}_l$ for all $l \in \{1, 2, \ldots, n\}$. Consider $L \in \{0, 1, \ldots, n-2\}$ such that $|\lambda_2(W)| = |\lambda_3(W)| = \cdots = |\lambda_{2+L}(W)|$. Observe that $\lambda_2(W) \neq 0$, as $|\lambda_2(W)| = f(\mathbf{w}) \neq 0$. Consequently, from the Pythagorean theorem,
$$\begin{aligned} \|\mathbf{x}(t) - \mathbf{x}(t-1)\|^2 &= \left\| \sum_{l=2}^{n} \alpha_l \left( \lambda_l(W)^t - \lambda_l(W)^{t-1} \right) \mathbf{u}_l \right\|^2 = \sum_{l=2}^{n} \alpha_l^2 \left( \lambda_l(W)^t - \lambda_l(W)^{t-1} \right)^2 = \sum_{l=2}^{n} \alpha_l^2 \left( \lambda_l(W) - 1 \right)^2 \lambda_l(W)^{2t-2} \\ &= \lambda_2(W)^{2t-2} \left( \sum_{l=2}^{L+2} \alpha_l^2 \left( \lambda_l(W) - 1 \right)^2 + \sum_{l=L+3}^{n} \alpha_l^2 \left( \lambda_l(W) - 1 \right)^2 \left( \frac{\lambda_l(W)}{\lambda_2(W)} \right)^{2t-2} \right) \end{aligned}$$
for all $t \in \{1, 2, \ldots\}$. Assume that $\sum_{l=2}^{L+2} \alpha_l^2 \neq 0$, which holds for almost every $\mathbf{x}(0) \in \mathbb{R}^{n \times 1}$. As $|\lambda_2(W)| > |\lambda_l(W)|$ for all $l \in \{L+3, \ldots, n\}$, we conclude that
$$\lim_{t \to \infty} \frac{\|\mathbf{x}(t) - \mathbf{x}(t-1)\|}{\|\mathbf{x}(t-1) - \mathbf{x}(t-2)\|} = |\lambda_2(W)|\, \frac{\sqrt{\sum_{l=2}^{L+2} \alpha_l^2 \left( \lambda_l(W) - 1 \right)^2}}{\sqrt{\sum_{l=2}^{L+2} \alpha_l^2 \left( \lambda_l(W) - 1 \right)^2}} = |\lambda_2(W)| = f(\mathbf{w}). \qquad ☐$$
We observe that Equation (16) can be computed in a distributed way because a norm can be computed in a distributed way. Moreover, the condition $f(\mathbf{w}) = 0$ holds if and only if $W = P_n$; this is possible only if every pair of nodes of the network is connected (i.e., the network is fully connected), in which case $W_{\mathrm{opt}} = P_n$ and the FSDLA problem is trivial. Therefore, for any non-fully-connected network, $f(\mathbf{w}) \neq 0$.
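Equation (16) is simple to simulate centrally; a sketch (the function name is ours) that tracks only the last three iterates:

```python
import numpy as np

def estimate_f(W, x0, t):
    """Estimate f(w) = ||W - P_n||_2 via the ratio in Equation (16)."""
    x2, x1, x = x0, W @ x0, W @ (W @ x0)     # x(0), x(1), x(2)
    for _ in range(t - 2):                   # advance to x(t-2), x(t-1), x(t)
        x2, x1, x = x1, x, W @ x
    return np.linalg.norm(x - x1) / np.linalg.norm(x1 - x2)
```

In the network itself, each node runs the same recursion on its own entry, and the two norms are obtained by consensus, as noted above.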
At this point, we have shown that the iterative Equation (9) can be computed in a distributed way. It only remains to be shown that Equation (9) actually converges to $\hat{\mathbf{w}}$:
Theorem 4.
Consider $\mathbf{w}_0 \in \mathbb{R}^{K \times 1}$ such that $0 < f(\mathbf{w}_0) < 1$. Let $\{\eta_p\}$ be a sequence of real numbers satisfying $\lim_{p \to \infty} \eta_p = 0$ and $\sum_{p=0}^{\infty} \eta_p = \infty$. We also assume that
$$0 < f(\mathbf{w}_p) < 1 \qquad \forall\, p \in \{1, 2, \ldots\} \tag{17}$$
where $\mathbf{w}_p$ is defined in Equation (9). Then, $f(\hat{\mathbf{w}}) = \|W_{\mathrm{opt}} - P_n\|_2 = \lim_{p \to \infty} f(\mathbf{w}_p)$.
Proof. 
Theorem 1 yields
$$\left\| \eta_p\, \tilde{\nabla} f(\mathbf{w}_p) \right\| = \eta_p \sqrt{\sum_{k=1}^{K} \left( y_{i_k} - y_{j_k} \right)^4} \leq \sqrt{4K}\, \max_{p \in \{0, 1, 2, \ldots\}} \eta_p.$$
Consequently, as $f$ has a bounded set of minimum points, the result now follows from [13] (Theorem 2.4). ☐
We observe that the initial point $\mathbf{w}_0$ in Theorem 4 can be taken, for instance, to be the one given by the Metropolis-Hastings algorithm (e.g., [8]). That is, if $\mathbf{w}_0$ is given by the Metropolis-Hastings algorithm, then $[\mathbf{w}_0]_k = \frac{1}{\max(d_{i_k}, d_{j_k})}$ for all $k \in \{1, 2, \ldots, K\}$, where $d_i$ is the degree of node $i \in V$ (i.e., the number of nodes to which node $i$ is connected). Therefore, $\mathbf{w}_0$ can be computed in a distributed way.
Table 1 relates Algorithm 1 with the theoretical aspects shown in this section.
Remark 1.
As $f$ is continuous, from every initial sequence of real numbers $\{\beta_p\}$ with $\lim_{p \to \infty} \beta_p = 0$ and $\sum_{p=0}^{\infty} \beta_p = \infty$ (e.g., $\{\beta_p\} = \{1/p\}$), a subsequence of stepsizes $\{\eta_p\} = \{\beta_{\sigma(p)}\}$ satisfying Equation (17) can be constructed.
We finish the section by describing Algorithm 1. For ease of notation, we define
$$\mathrm{ave}_{\mathbf{w}}(\mathbf{x}, t) := (W(\mathbf{w}))^t\, \mathbf{x}, \qquad t \in \{0, 1, 2, \ldots\},\ \mathbf{x} \in \mathbb{R}^{n \times 1},$$
which is the $t$th iterate of Equation (1) and can clearly be computed in a distributed way. As for Algorithm 1, we fix $t_0$ to be the number of iterations of Equation (1) required for a desired precision. We observe that, because the worst possible network topology is a path, if we set $t_0 \geq \frac{\log \epsilon}{\log \cos(\pi/n)}$, then $\|\mathrm{ave}_{\mathbf{w}}(\mathbf{x}, t_0) - x_{\mathrm{ave}} \mathbf{1}_n\|_2 \leq \epsilon \|\mathbf{x}\|_2$ (see [15]), and therefore $t_0$ can also be obtained in a distributed way.
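Putting the pieces together, the following sketch simulates Algorithm 1 centrally, with eigendecompositions standing in for the consensus loops of lines 7–24; it assumes weighting_matrix(), f() and subgradient() from the earlier sketches are in scope, and uses $\beta_p = 1/p$ as in Section 4. It is a sketch of the update logic under these assumptions, not the in-network protocol itself:

```python
import numpy as np

def fsdla_descent(edges, n, p_max=150):
    """Centralized simulation of Algorithm 1 (subgradient descent on f)."""
    deg = np.zeros(n, dtype=int)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    # Lines 2-4: Metropolis-Hastings initialization, [w_0]_k = 1/max(d_i, d_j)
    w = np.array([1.0 / max(deg[i], deg[j]) for i, j in edges])
    w_prev, best = w.copy(), np.inf
    for p in range(1, p_max + 1):
        if f(w, edges, n) >= 1:              # lines 11-13: keep 0 < f < 1
            w = w_prev.copy()
        g = subgradient(weighting_matrix(w, edges, n), edges)   # line 27
        w_prev = w.copy()
        w = w - (1.0 / p) * g                # line 28 with beta_p = 1/p
        best = min(best, f(w, edges, n))
    return w, best
```

For example, fsdla_descent([(0, 1), (1, 2), (2, 3), (3, 0)], 4) run on the ring example above should steadily reduce the best observed value of $f$.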

4. Numerical Results

We here present the numerical results obtained using Algorithm 1 for two networks with $n = 16$ nodes. The chosen starting point $\mathbf{w}_0$ was the one given by the Metropolis-Hastings algorithm [8], and the chosen initial sequence of stepsizes was $\{\beta_p\} = \{1/p\}$, $p \in \{1, 2, \ldots\}$. Moreover, we took $t_0 = 250 \geq \frac{\log 10^{-2}}{\log \cos(\pi/16)}$.
Figure 1 shows the convergence time $\tau(W(\mathbf{w}_p))$ for the network presented in Figure 2 (solid line). Figure 1 also shows $\tau(W_{\mathrm{opt}}) = 10.03$, which was obtained by using CVX, a package for specifying and solving convex programs in a centralized way [16,17] (dashed line). Finally, Figure 1 also shows the minimum value of $\tau(W(\mathbf{w}_p))$ obtained up to step $p$ (dotted line). For comparison purposes, we observe that the convergence time yielded by the Metropolis-Hastings algorithm was $\tau(W(\mathbf{w}_0)) = 20.81$, while the minimum convergence time obtained after 150 iterations of our algorithm was $10.31$.
Figure 3 is of the same type as Figure 1, but the considered network was a $4 \times 4$ grid. Solving the problem optimally in a centralized way yields $\tau(W_{\mathrm{opt}}) = 2.89$; the convergence time yielded by the Metropolis-Hastings algorithm was $\tau(W(\mathbf{w}_0)) = 4.91$, while the minimum convergence time obtained after 150 iterations of our algorithm was $2.99$.
We finish the section with a note on the number of exchanged messages (number of transmissions). For every iteration $p$ of Algorithm 1, the number of exchanged messages per node was at most $5 t_0$, divided as follows: $t_0$ message exchanges were required for lines 8 and 9, another $2 t_0$ message exchanges were needed in line 10 (lines 14 and 15 did not require new message exchanges), and line 16 required another $t_0$ message exchanges. Finally, depending on the if-clause, another $t_0$ message exchanges were required in line 21. Therefore, the overall number of required transmissions per node was between $4 p_{\max} t_0$ and $5 p_{\max} t_0$.

5. Conclusions

In this paper, we have provided an algorithm for the in-network computation of the optimal weighting matrix for distributed consensus. The algorithm can be viewed as an iterative repetition of, at most, five distributed consensus operations. Our algorithm is especially useful for networks that do not have a central entity and that change with time. In fact, if a network never changes with time (and its topology is known a priori), it is easier to solve the FSDLA problem offline (in a centralized rather than a distributed way) using [6] and then pre-configure the nodes with the obtained weights. However, if the network topology changes randomly with time (e.g., if sensors are added or removed) and there is no central entity, our algorithm is, to the best of our knowledge, currently the only way of obtaining the optimal solution to the FSDLA problem.

Acknowledgments

This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the RACHEL project (TEC2013-47141-C4-2-R), the CARMEN project (TEC2016-75067-C4-3-R) and the COMONSENS network (TEC2015-69648-REDC).

Author Contributions

Xabier Insausti conceived the research question and performed the simulations. Xabier Insausti, Jesús Gutiérrez-Gutiérrez, and Marta Zárraga-Rodríguez proved the main results (Theorems 1 to 4). Xabier Insausti, Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez, and Pedro M. Crespo wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bertrand, A.; Moonen, M. Distributed computation of the Fiedler vector with application to topology inference in ad hoc networks. Signal Process. 2013, 93, 1106–1117.
2. Weng, Y.; Xiao, W.; Xie, L. Diffusion-Based EM Algorithm for Distributed Estimation of Gaussian Mixtures in Wireless Sensor Networks. Sensors 2011, 11, 6297–6316.
3. Mohammadi, A.; Asif, A. Consensus-based distributed dynamic sensor selection in decentralised sensor networks using the posterior Cramér-Rao lower bound. Signal Process. 2015, 108, 558–575.
4. Zeng, Y.; Hendriks, R.C. Distributed estimation of the inverse of the correlation matrix for privacy preserving beamforming. Signal Process. 2015, 107, 109–122.
5. Olshevsky, A.; Tsitsiklis, J. Convergence speed in distributed consensus and averaging. SIAM Rev. 2011, 53, 747–772.
6. Xiao, L.; Boyd, S. Fast linear iterations for distributed averaging. Syst. Control Lett. 2004, 53, 65–78.
7. Boyd, S.; Ghosh, A.; Prabhakar, B.; Shah, D. Randomized gossip algorithms. IEEE Trans. Inf. Theory 2006, 52, 2508–2530.
8. Xiao, L.; Boyd, S.; Kimb, S.J. Distributed average consensus with least-mean-square deviation. J. Parallel Distrib. Comput. 2007, 67, 33–46.
9. Bertrand, A.; Moonen, M. Topology-aware distributed adaptation of Laplacian weights for in-network averaging. In Proceedings of the 21st European Signal Processing Conference, Marrakech, Morocco, 9–13 September 2013; pp. 1–5.
10. Insausti, X.; Camaró, F.; Crespo, P.M.; Beferull-Lozano, B.; Gutiérrez-Gutiérrez, J. Distributed Pseudo-Gossip Algorithm and Finite-Length Computational Codes for Efficient In-Network Subspace Projection. IEEE J. Sel. Top. Signal Process. 2013, 7, 163–174.
11. Bernstein, D.S. Matrix Mathematics; Princeton University Press: Princeton, NJ, USA, 2009.
12. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
13. Shor, N.Z. Minimization Methods for Non-Differentiable Functions; Springer: Berlin/Heidelberg, Germany, 1985.
14. Zhang, S.; Tepedelenlioğlu, C.; Banavar, M.K.; Spanias, A. Distributed Node Counting in Wireless Sensor Networks in the Presence of Communication Noise. IEEE Sens. J. 2017, 17, 1175–1186.
15. Boyd, S.; Diaconis, P.; Sun, J.; Xiao, L. Fastest mixing Markov chain on a path. Am. Math. Mon. 2006, 113, 70–74.
16. Grant, M.; Boyd, S. CVX: Matlab Software for Disciplined Convex Programming, Version 2.1. Available online: http://cvxr.com/cvx (accessed on 1 March 2017).
17. Grant, M.; Boyd, S. Graph implementations for nonsmooth convex programs. In Recent Advances in Learning and Control; Lecture Notes in Control and Information Sciences; Blondel, V., Boyd, S., Kimura, H., Eds.; Springer: London, UK, 2008; pp. 95–110.
Figure 1. Numerical results for the graph of 16 nodes shown in Figure 2.
Figure 2. Graph with n = 16 nodes considered in Figure 1.
Figure 3. Numerical results for the grid of 16 nodes (4 rows and 4 columns).
Table 1. Explanation of Algorithm 1.

Lines | Description
2–4   | Initialize with the Metropolis-Hastings algorithm (Theorem 4)
7–10  | Computation of the cost function f according to Theorem 3
11–13 | Choose the correct subsequence according to Remark 1
14–15 | Compute y_1 and y_2 as in Theorem 2
17–24 | Obtain a unit eigenvector y from y_1 and y_2
27    | Compute the subgradient as in Theorem 1
28    | Update as in Equation (9)
