Forman-Ricci Flow for Change Detection in Large Dynamic Data Sets

Weber, Melanie; Jost, Jürgen; Saucan, Emil

doi:10.3390/axioms5040026

by Melanie Weber ^1,2,†, Jürgen Jost ^1,3 and Emil Saucan ^1,4,*

¹ Max-Planck-Institute for Mathematics in the Sciences, Inselstrasse 22, 04103 Leipzig, Germany
² Department of Mathematics, University of Leipzig, Augustusplatz 10, 04109 Leipzig, Germany
³ Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
⁴ Departments of Mathematics and Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel
* Author to whom correspondence should be addressed.
† Current address: Program in Applied and Computational Mathematics, Princeton University, Fine Hall 221, Washington Road, Princeton, NJ 08544, USA.

Axioms 2016, 5(4), 26; https://doi.org/10.3390/axioms5040026

Submission received: 10 September 2016 / Revised: 26 October 2016 / Accepted: 7 November 2016 / Published: 10 November 2016

(This article belongs to the Special Issue Discrete Geometry and its Applications)

Abstract:

We present a viable geometric solution for the detection of dynamic effects in complex networks. Building on Forman’s discretization of the classical notion of Ricci curvature, we introduce a novel geometric method to characterize different types of real-world networks with an emphasis on peer-to-peer networks. We study the classical Ricci-flow in a network-theoretic setting and introduce an analytic tool for characterizing dynamic effects. The formalism suggests a computational method for change detection and the identification of fast evolving network regions and yields insights into topological properties and the structure of the underlying data.

Keywords:

Ricci flow; Forman curvature; complex systems; dynamic networks; change detection; peer-to-peer network

MSC:

05C82; 05C21; 05C10

1. Introduction

Complex networks are by now ubiquitous, both in everyday life and as mathematical models for a wide range of phenomena [1,2,3,4], with applications in such diverse fields as biology [5,6], transportation and urban planning [4,7,8], social networks like Facebook and Twitter [9], and, with a relevance dating back to early work on networks, communication and computer systems [2]. The latter belong to the class of peer-to-peer networks, whose structure is characterized by information transfer between “peers”. With novel geometric methods we attempt to analyze structural properties and dynamics of real-world networks, focusing on peer-to-peer networks as an exemplary use case.

While the body of theoretical research on the analysis of networks and related structures has been focused on the properties of the various (discrete) Laplacians (see [10] for an overview on the state of the art), a more recent and related direction concerns the geometrical characterization of real-world and model type networks with discrete curvature, see [11,12,13,14].

Given that in geometry, curvature (intuitively, a measure quantifying the deviation of a geometrical object from being flat) plays a central role, the drive for finding various discrete, expressive notions of curvature capable of describing the structure of networks is most natural [1,2,15]. More recent “geometric” approaches consider discrete definitions of curvature as in [13,16,17,18].

While somewhat esoteric for the “layman”, Ricci curvature plays a central role in Riemannian geometry and has recently proven to be a remarkably powerful tool, mainly given its role in G. Perelman’s celebrated proof of the Poincaré Conjecture and Thurston’s Geometrization Conjecture [19,20], following a path laid down earlier by R. Hamilton [21]. Moreover, Ricci curvature has proved unexpectedly flexible and adaptable to various discrete and more general settings, mainly in the version introduced by Y. Ollivier [11,22,23]. Various theoretical and practical applications have been explored in the literature; see, for instance, [12,14,24,25,26].

Despite its usefulness in theoretical settings, the Ollivier-Ricci curvature is limited in its applicability to large, real-world networks. A major constraint is the high computational cost of its calculation. The Ollivier-Ricci curvature necessitates the calculation of the so-called earth mover’s distance (also known as the Wasserstein 1-metric), which in turn requires solving a non-trivial linear programming problem, thus rendering the computation infeasible for truly large-scale networks. A further limitation comes as a direct consequence of the very definition of Ollivier’s concept: Given that the curvature’s definition is intrinsically based on Optimal Transportation Theory, it is excellently suited for the investigation of information transfer in communication networks. Yet it is less expressive as a model for structurally different types of interaction networks, such as those occurring in biology or the social sciences.

However, there is an alternative notion of discrete Ricci curvature that is computationally far simpler and equally suited to characterizing complex systems. The definition, introduced by Forman in [27], applies to a very general class of cellular structures that includes triangular and polygonal meshes, as well as graphs (and therefore networks). The Forman-Ricci curvature is thus a natural candidate for a discrete Ricci curvature that can be universally adapted to any type of network. Since it has already proved well suited to image processing [28,29,30] and complex network analysis [31,32], we explore here its applicability to characterizing dynamic effects in complex systems.

In the present article, we continue and expand the program initiated partly by [31], namely examining the relation of Forman-Ricci curvature with other geometric network properties, such as the node degree distribution and the connectivity structure (For a systematic comparison of various other network characteristics see [31]). Based on this analysis, we suggest characterization schemes that yield insights into the dynamic structure of the underlying data as described in the following section. The emphasis of the article lies on analyzing the class of peer-to-peer networks, of which we provide two examples: Email communications [33] and information exchange with the file-exchange system Gnutella [34,35].

The main part of the paper introduces a novel change detection method for complex dynamic networks that exploits the Ricci flow on the edges with respect to the Forman curvature. Efficient implementations of the formalism enable structural analysis of big data, as we demonstrate with the Gnutella example. For this, we walk the reader through the analysis of sets of peer-to-peer networks with respect to change detection, providing an overview of the workflow of the method.

Future applications include the characterization of dynamic effects in various classes of real-world networks and the analysis of the underlying data, as well as curating related data bases.

2. Methods

The change detection method introduced in this article is based on a network-analytic formulation of the Ricci-flow using a discrete Ricci-curvature on networks introduced by R. Forman [27]. In this section we define both the Forman-Ricci curvature and the Ricci flow on networks and demonstrate their ability to characterize complex dynamic systems and their underlying data on the example of a peer-to-peer network.

2.1. Forman-Ricci Curvature on Networks

In the classical context of smooth Riemannian manifolds (e.g., surfaces), Ricci curvature represents an important geometric invariant that measures the deviation of the manifold from being locally Euclidean by quantifying its volume growth rate. An essential property of this curvature is that it operates directionally along vectors. For our discrete setting it follows directly that Forman’s curvature is associated with the discrete analog of those vectors, namely the edges of the network.

While Forman’s approach defines Ricci curvature on the very general setting of n-dimensional cellular structures, we will concentrate on the simpler case of 1-dimensional weighted cellular spaces that can be represented as a weighted network graph. We will not consider higher dimensional cases here, since their technicalities would carry us well beyond the scope of the present paper and our intended applications. For a theoretical introduction see [27].

In the 1-dimensional case, Forman’s Ricci curvature for a network edge is defined by the following combinatorial formula

{Ric}_{F} (e) = ω (e) \left( \frac{ω (v_{1})}{ω (e)} + \frac{ω (v_{2})}{ω (e)} - \sum_{e_{v_{1}} \sim e, \; e_{v_{2}} \sim e} \left[ \frac{ω (v_{1})}{\sqrt{ω (e) ω (e_{v_{1}})}} + \frac{ω (v_{2})}{\sqrt{ω (e) ω (e_{v_{2}})}} \right] \right)

(1)

where

e denotes the edge under consideration that connects the nodes $v_{1}$ and $v_{2}$ ;
$ω (e)$ denotes the (positive) weight on the edge e;
$ω (v_{1}), ω (v_{2})$ denote the (positive) weights associated with the nodes $v_{1}$ and $v_{2}$ , respectively;
$e_{v_{1}}, e_{v_{2}}$ denote the edges other than e that are connected to the nodes $v_{1}$ and $v_{2}$ , respectively.

Note that in Equation (1) only edges parallel to a given edge e are taken into account, i.e., only edges that share a node with e. We highlight that, by its very definition, Forman’s discrete curvature is associated to edges and therefore ideally suited for the edge-based study of networks (connectivity, directionality). Particularly, it does not require any technical artifice in extending a node curvature measure to edges, as some other approaches do. In particular, there is no need to artificially generate and incorporate two- or higher dimensional faces: Such an approach would impose severe constraints on computability.
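As an illustration of Equation (1), the following minimal Python sketch computes the curvature of every edge of a small undirected graph. The function name `forman_ricci`, the dictionary-based graph representation and the toy graph are assumptions made for this example and are not taken from the article; the graph is assumed to have no self-loops or multiple edges.

```python
import math


def forman_ricci(edges, node_w, edge_w):
    """Return {edge: Ric_F(edge)} following Equation (1).

    edges  : list of (u, v) tuples (undirected, no self-loops)
    node_w : dict node -> positive node weight
    edge_w : dict frozenset({u, v}) -> positive edge weight
    """
    # Collect, for every node, the edges that touch it.
    incident = {}
    for u, v in edges:
        key = frozenset((u, v))
        incident.setdefault(u, set()).add(key)
        incident.setdefault(v, set()).add(key)

    curvature = {}
    for u, v in edges:
        e = frozenset((u, v))
        w_e = edge_w[e]
        # First two terms of Equation (1).
        total = node_w[u] / w_e + node_w[v] / w_e
        # Sum over the edges sharing a node with e (e itself excluded).
        for end in (u, v):
            for f in incident[end]:
                if f != e:
                    total -= node_w[end] / math.sqrt(w_e * edge_w[f])
        curvature[e] = w_e * total
    return curvature


# Toy example (illustrative only): a path a - b - c with unit weights.
edges = [("a", "b"), ("b", "c")]
node_w = {n: 1.0 for n in "abc"}
edge_w = {frozenset(e): 1.0 for e in edges}
ric = forman_ricci(edges, node_w, edge_w)
```

When all node and edge weights equal 1, Equation (1) reduces to 4 − deg(v₁) − deg(v₂), which offers a quick sanity check for an implementation of this kind.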

Additionally, a “good” discretization of Ricci curvature, such as Forman’s proves to be [27], will capture the Ricci curvature’s essential characteristic of measuring the growth rate, a property that is of special interest in the context of dynamic networks. Therefore, the Forman-Ricci curvature represents a way of determining whether a network has the potential of infinite growth (negative curvature) or can only attain a maximal, and therefore computable, size (positive curvature). In particular, a network will be flat, i.e., it will have Forman-Ricci curvature equal to zero, if its growth and geodesic dispersion rates are similar to those of the Euclidean plane. This aspect represents a further motivation for studying the Ricci curvature of networks, since it allows one to distinguish numerically between expander type networks of negative curvature, such as information networks, and small world networks that are, on average, of strictly positive curvature (see also [25]). We will further explore this in a forthcoming article [36].

2.2. Characterizing Large Data Sets with Ricci Curvature

We now want to explore the Forman curvature as a tool for characterizing real-world networks. Since this paper centers around peer-to-peer networks, we choose an example of email communication from [33]. In such network graphs (denoted G), nodes describe correspondents (peers) and edges the exchange of messages among the peers.

To characterize the network’s structure with curvature, we have to impose normalized weighting schemes

ω : V (G) \mapsto [0, 1] \quad \text{(nodes)}

(2)

γ : E (G) \mapsto [0, 1] \quad \text{(edges)}

(3)

on both nodes and edges. Naturally, the busiest communicators should have the highest weights. Therefore, we choose a combinatorial weighting scheme based on node degrees, i.e., the number of connections for each node v:

ω (v) = \frac{1}{\deg (v)} \sum_{e_{i} \sim v} γ (e_{i}) .

(4)
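Equation (4) assigns to each node the mean weight of its incident edges. A minimal Python sketch of this rule is given below; the function name `node_weights`, the tuple-keyed dictionary of edge weights and the example values are hypothetical and serve only as an illustration.

```python
def node_weights(edge_w):
    """Equation (4): weight of a node = mean weight of its incident edges.

    edge_w : dict (u, v) -> edge weight gamma(e) in [0, 1]
    Nodes without any edge do not appear in the result.
    """
    sums, degrees = {}, {}
    for (u, v), w in edge_w.items():
        for node in (u, v):
            sums[node] = sums.get(node, 0.0) + w
            degrees[node] = degrees.get(node, 0) + 1
    return {node: sums[node] / degrees[node] for node in sums}


# Illustrative values only: a path a - b - c.
example = node_weights({("a", "b"): 0.5, ("b", "c"): 1.0})
```

Because each node weight is an average of values in [0, 1], the result stays in [0, 1], in line with the normalization required by Equation (2).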

Analogously, we want to weigh extensively used communication channels (edges) higher than rarely used ones. For this, we calculate the minimum path length l between each pair of connected nodes $(v_{i}, v_{j})$ and impose

γ (v_{i}, v_{j}) = \begin{cases} \frac{1}{l}, & l \leq 6, \; v_{i} \sim v_{j} \\ 0, & \text{else} . \end{cases}

(5)

The motivation behind this choice of weights lies in the “small world” property (i.e., a maximum degree of separation of six [37]) that has reportedly been found in real-world networks. To account for this, we only check for indirect connections up to a path length of six and scale the weights according to the distribution. The resulting structure of the network is shown in Figure 1A; the size of the nodes reflects their weights.
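The weighting rule of Equation (5) can be sketched with a breadth-first search that is cut off after six hops. The sketch below rests on assumptions that the text does not settle: l is taken to be the hop count in the unweighted graph, "connected" is read as "joined by a path of at most six hops", and the names `hop_distance` and `path_weight` are hypothetical.

```python
from collections import deque


def hop_distance(adj, source, target, cutoff=6):
    """Breadth-first search; hop count from source to target, or None
    if the target is not reachable within `cutoff` hops."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == cutoff:
            continue  # do not explore beyond the cutoff
        for nxt in adj.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_weight(adj, u, v, cutoff=6):
    """Equation (5): 1/l if the minimum path length l is at most `cutoff`,
    and 0 otherwise (also 0 for u == v in this sketch)."""
    l = hop_distance(adj, u, v, cutoff)
    return 1.0 / l if l else 0.0


# Illustrative input: a path 0 - 1 - ... - 7 given as adjacency lists.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
weights_from_0 = {v: path_weight(adj, 0, v) for v in range(1, 8)}
```

In this toy path, node 6 is six hops from node 0 and still receives the weight 1/6, whereas node 7 lies beyond the cutoff and receives weight 0.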

Using edge and node weights, we can now determine the Forman curvature distribution in the network. A curvature map (Figure 1C) provides a planar representation that highlights clusters and distinguished regions. The histogram (Figure 1D) shows the distribution of the curvature values allowing for comparison with the node degree distribution (Figure 1B). By indicating a correlation between the two distributions, the results highlight the strong influence of node degree weights on the network’s topology. This is consistent with observations in email communications: Densely interconnected communities form around busy communication channels and active correspondents.

2.3. Ricci-Flow with Forman Curvature

As mentioned earlier, the Ricci flow as a powerful geometric tool was devised by Hamilton [21] and further developed by Perelman [19,20] in the course of his celebrated proof of the Poincaré conjecture. Since then, it has continued to be an active and productive field of study, both in terms of theoretical questions and for diverse practical applications, including work by Gu et al. (see, e.g., [39]). Those mainly build on a combinatorial version introduced by Chow and Luo [40]. However, other discretizations of the flow, with reported applications in network and imaging sciences, are explored in the literature [25,41].

The classical Ricci flow is defined by

\frac{\partial g_{i j}}{\partial t} = - {Ric}_{F} (g_{i j}) \times g_{i j}

(6)

where $g_{i j}$ denotes the metric of the underlying manifold, here represented by the earlier introduced weighting scheme of the network’s edges. Note that Equation (6) above shows that the Ricci flow evolves a manifold proportionally to its Ricci curvature, by “pushing” faster the regions of higher curvature. This is a fact that we exploit in our application to determine changes in dynamic (peer-to-peer) networks.

The reader might note the resemblance with the classical Laplace (or more precisely Laplace–Beltrami) flow that has become, by now, standard in Imaging and Graphics (see, e.g., [37,42] and the references therein), defined as

\frac{\partial I}{\partial t} = Δ I

(7)

where I denotes the image, viewed as a parametrized surface in $\mathbb{R}^{3}$, and Δ denotes, as customary, the Laplacian. (The difference in sign is due to the two different conventions in defining the Laplacian).

The resemblance is neither accidental nor superficial. Indeed, the Ricci curvature can be viewed as a Laplacian of the metric. We address the practical implications of this observation in the sequel; a follow-up article by the authors [32] addresses the more theoretical questions of this matter.

In our discrete setting, lengths are replaced by the (positive) edge weights. Time is assumed to evolve in discrete steps and each “clock” (i.e., time step) has a length of 1. With these constraints the Ricci flow takes the form

\tilde{γ} (e) - γ (e) = - {Ric}_{F} (e) \times γ (e),

(8)

where $\tilde{γ} (e)$ denotes the new (updated) value of $γ (e)$, with $γ (e)$ being the original, i.e., given, one. In this context, we want to discuss a few issues and observations regarding this last equation:

At each iteration step (i.e., in the process of updating $γ (e)$ to $\tilde{γ} (e)$ ), the Forman curvature has to be recomputed for each edge e, since it depends on the respective weight $γ (e)$ . This increases the computational effort considerably; however, the computational task is less formidable than it might appear at first.
As already stressed, we consider a discrete time model. Since for smoothing (denoising) a short time flow has to be applied (because, by the theory for the smooth case, a long time flow will produce a limiting state of the network), only a small number of iterations needs be considered. The precise number of necessary iterations is to be determined experimentally. Even though a typical number can be found easily, best results may be obtained for slightly different numbers—depending on the network, and the type and level of the noise, of course.
Ollivier also devised a continuous flow [11,22]. In the context of the present article, a continuous setting is not required, but for other types of networks, where the evolution is continuous in time, it might be preferable to implement the continuous variant, suitably adapted to the Forman curvature, rather than to Ollivier’s one.
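To make the sign convention of Equation (8) concrete, one step of the flow can be worked out by hand in the simplest possible setting. The example below assumes, purely for illustration and not as a choice made in the article, that all node and edge weights equal 1, in which case Equation (1) reduces to a degree count.

```latex
% Assumption for illustration: \omega(v) = 1 for all nodes, \gamma(e) = 1 for all edges.
% Each of the \deg(v_k) - 1 edges adjacent to e at v_k then contributes 1 to the sum in Eq. (1):
\mathrm{Ric}_F(e) = 2 - (\deg(v_1) - 1) - (\deg(v_2) - 1) = 4 - \deg(v_1) - \deg(v_2)

% Edge between nodes of degree 3 and 2 (negative curvature): the weight grows.
\mathrm{Ric}_F(e) = 4 - 3 - 2 = -1, \qquad
\tilde{\gamma}(e) = \gamma(e) - \mathrm{Ric}_F(e)\,\gamma(e) = 1 - (-1)(1) = 2

% Edge between two nodes of degree 2 (zero curvature): the weight is unchanged.
\mathrm{Ric}_F(e) = 4 - 2 - 2 = 0, \qquad
\tilde{\gamma}(e) = 1 - 0 \cdot 1 = 1
```

Under Equation (8), weights on negatively curved edges therefore increase, while weights on positively curved edges decrease. No renormalization of the weights is applied in this hand calculation.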

In addition to the Ricci flow above, one can consider the scalar curvature flow that in our case will have the form:

\frac{d}{d t} γ (e_{i}) = - \frac{1}{2} ({scal}_{F} (v) - {scal}_{F} (v_{i})) γ (e_{i}),

(9)

where $e_{i} = e_{i} (v, v_{i})$ denote the edges through the node v, and ${scal}_{F} (v)$ the (Forman) scalar curvature at a node v, which we define by

{scal}_{F} (v) = \sum_{e_{i} \sim v} {Ric}_{F} (γ (e_{i})) .

(10)
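Equation (10) is a plain sum of edge curvatures over the edges meeting at a node. A minimal Python sketch follows; the function name `scalar_curvature` and the input format are assumptions, and the curvature values in the toy input correspond to a three-node path with unit weights and are illustrative only.

```python
def scalar_curvature(ric):
    """Equation (10): scal_F(v) = sum of Ric_F(e) over the edges e through v.

    ric : dict (u, v) -> Forman-Ricci curvature of the edge {u, v}
    """
    scal = {}
    for (u, v), value in ric.items():
        scal[u] = scal.get(u, 0.0) + value
        scal[v] = scal.get(v, 0.0) + value
    return scal


# Path a - b - c with unit weights: both edges have curvature 4 - 1 - 2 = 1.
scal = scalar_curvature({("a", "b"): 1.0, ("b", "c"): 1.0})
```

The differences scal_F(v) − scal_F(v_i) that drive the scalar curvature flow of Equation (9) can then be read off directly from the returned dictionary.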

2.4. Characterizing Dynamic Data with Ricci Flow

The introduced Ricci-flow can be utilized to characterize dynamic data. Given snapshots of a system at various (discrete) times $\{t_{i}\}$, we analyze the Ricci flow on the corresponding network representations. The flow yields insights into structural changes providing a tool to identify “interesting” network regions. Applications include efficient change detection in large dynamic data representing complex systems, as described in the following section.

Let ${(t_{i})}_{i \in I}$ be a discrete time series with step size $Δ t$. Consider a complex dynamic system of whose behavior we have snapshots at times $(t_{i})$, represented as weighted network graphs ${(G_{i})}_{i \in I}$. Let $t_{i}$ and $t_{i + 1}$ be consecutive time points with corresponding graphs $G_{i}$ and $G_{i + 1}$. The weighting schemes of the nodes

ω_{i, i + 1}^{0} : V (G) \mapsto [0, 1]

(11)

and edges

γ_{i, i + 1}^{0} : E (G) \mapsto [0, 1]

(12)

characterize the topological structure of the system at times $t_{i}$ and $t_{i + 1}$. We can estimate the Ricci-flow for the time step $Δ t$ by iterating $k = 1, \dots, K$ times over

γ_{i, i + 1}^{k + 1} (e) = γ_{i, i + 1}^{k} (e) - Δ t \times {Ric}_{F} (γ_{i, i + 1}^{k} (e)) \times γ_{i, i + 1}^{k} (e)

(13)

resulting in modified weighting schemes $γ_{i}^{K}$ and $γ_{i + 1}^{K}$, respectively. We conjecture that the correlation between these weighting schemes characterizes the flow. Regions that were subject to significant change during the time $Δ t$ can be identified by thresholding the resulting correlation matrix.
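The iteration of Equation (13) on a single snapshot can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `frozenset`-keyed edge dictionary and the default values are hypothetical, node weights are held fixed during the flow, and the sketch stops with an error if an edge weight ceases to be positive, a case the text does not discuss.

```python
import math


def forman_ricci(edge_w, node_w):
    """Equation (1) for every edge; edge_w maps frozenset({u, v}) -> weight."""
    incident = {}
    for e in edge_w:
        for n in e:
            incident.setdefault(n, []).append(e)
    ric = {}
    for e, w_e in edge_w.items():
        total = 0.0
        for n in e:
            total += node_w[n] / w_e
            for f in incident[n]:
                if f != e:  # edges sharing node n with e
                    total -= node_w[n] / math.sqrt(w_e * edge_w[f])
        ric[e] = w_e * total
    return ric


def ricci_flow(edge_w, node_w, K=10, dt=0.1):
    """Iterate Equation (13) K times with step dt.

    Assumption: node weights stay fixed; the curvature is recomputed
    from the current edge weights in every iteration.
    """
    gamma = dict(edge_w)  # leave the input untouched
    for _ in range(K):
        ric = forman_ricci(gamma, node_w)
        gamma = {e: w - dt * ric[e] * w for e, w in gamma.items()}
        if gamma and min(gamma.values()) <= 0.0:
            raise ValueError("edge weight became non-positive; reduce dt or K")
    return gamma


# Illustrative snapshot: path a - b - c with unit weights.
edge_w = {frozenset(("a", "b")): 1.0, frozenset(("b", "c")): 1.0}
node_w = {"a": 1.0, "b": 1.0, "c": 1.0}
flowed = ricci_flow(edge_w, node_w, K=3, dt=0.1)
```

Applying `ricci_flow` to the weighting schemes of two consecutive snapshots would yield the two flowed schemes that are then compared; how they are compared is a separate modelling choice.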

3. Application

3.1. Change Detection with Ricci Flow

A major challenge of modern data science lies in characterizing dynamic effects in large data sets, such as structural changes in discrete time series of “snapshots” of a system’s state or frequently updated large data bases. Commonly, network graphs are inferred from data representing interactions and associations in the underlying data. In the case of peer-to-peer networks, the network describes the information flow (edges) between the peers (nodes).

We want to use the Forman-Ricci curvature to analyze dynamic changes in the structure of such interaction networks obtained from large data sets. Specifically, we want to characterize the information flow between the network’s nodes using the previously introduced Ricci flow. Our method follows the formalism described in Section 2.4 and is schematically displayed in Figure 2.

The analysis of the information flow can be used to detect changes or distinguished regions of activity in the data. More precisely, we take advantage of the property of the classic Ricci flow inherited by the suggested discretization, namely the faster evolution of regions with higher curvature. Thus, changes occurring in these parts of the network will be emphasized and can be detected by characterizing the corresponding Ricci flow. In contrast to a mere comparison of the changes in curvature, this allows for an analysis of the underlying dynamic effects and possibly predicting the network’s future evolution. Applications include the curation of large open-access data bases and the detection of rare events in experimental data, such as spiking neurons in measurements of neuronal activity.

3.2. Analysis of Gnutella Peer-To-Peer Network

We consider a series of discrete time snapshots of a complex peer-to-peer system, represented as network graphs ${(G_{i})}_{i \in I}$. To characterize the Ricci flow, we apply the earlier described formalism pairwise, i.e., we iterate ($K = 10$) over (13) for snapshots $(G_{i}, G_{i + 1})$ at consecutive time points $t_{i}$ and $t_{i + 1}$. Here we use the file-exchange service Gnutella as an example, analyzing the peer-to-peer networks resulting from exchange activities on two consecutive days (8 and 9 August 2002, [34,35]). Figure 3 shows the networks inferred from the data sets and the corresponding curvature maps with the distribution of Forman-Ricci curvature. We applied our change detection method with a correlation threshold of $\mathit{thresh} = 0.8$ to detect regions of significant changes, i.e., groups of peers with significant activity. The results are shown in Figure 4, represented as a heatmap of the thresholded correlation matrix.

Light spots in the heatmap correspond to network regions with large flow and structural changes, allowing for detection and localization of dynamic effects. The method provides a clear representation of dynamic changes in networks, especially in terms of the community structure of the network. Given the intrinsic community structure of a network (clusters), which can also be inferred from evaluating the curvature, one can examine the influence of the flow on this specific structural property.
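The text does not spell out how the correlation matrix is constructed. One possible reading, sketched below purely as an assumption, describes each node by the vector of its flowed edge weights to all other nodes and correlates these vectors between the two snapshots. The function names, the use of the Pearson coefficient and the choice to keep entries at or above the threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0  # undefined for a constant vector; set to 0 in this sketch
    return cov / math.sqrt(vx * vy)


def weight_rows(nodes, gamma):
    """One row per node: its edge weights to every node (0 where no edge)."""
    return [[gamma.get(frozenset((u, v)), 0.0) for v in nodes] for u in nodes]


def thresholded_correlation(nodes, gamma_a, gamma_b, thresh=0.8):
    """Correlate node rows of snapshot A with node rows of snapshot B and
    set every entry below `thresh` to zero."""
    rows_a = weight_rows(nodes, gamma_a)
    rows_b = weight_rows(nodes, gamma_b)
    corr = [[pearson(ra, rb) for rb in rows_b] for ra in rows_a]
    return [[c if c >= thresh else 0.0 for c in row] for row in corr]


# Illustrative input: two identical flowed weighting schemes on a path a - b - c.
nodes = ["a", "b", "c"]
gamma = {frozenset(("a", "b")): 0.9, frozenset(("b", "c")): 0.9}
heat = thresholded_correlation(nodes, gamma, gamma, thresh=0.8)
```

The resulting nested list can be passed to any heatmap routine. Whether the retained or the suppressed entries mark the regions of interest depends on the convention adopted, which the text leaves open.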

4. Discussion and Future Work

The dawning age of big data opens up great numbers of possibilities and perspectives to gain insights into the principles of nature, humanity and technology through the collection and (statistical) analysis of data. However, the vast amount of available data challenges our analysis methods leading to an increased need of automated tools that can perform rapid and efficient data evaluations.

Networks are an efficient and commonly used data representation, emphasizing interactions and associations. This representation is ideal for analyzing structures within the data under geometric aspects. We have used networks to characterize peer-to-peer systems and analyze dynamic effects in the information transfer between its peers. In particular, we introduced a method for detecting changes in these dynamics. Possible applications of this theoretical framework include denoising of experimental data and identification of “interesting” groups of data points and activities of network regions.

The geometric methods used in the present article build on R. Forman’s work on Ricci curvature in networks and the corresponding Ricci flow. Future work, especially two follow-up articles by the authors [32,36], will expand the range of geometric tools used in the methods and develop a deeper understanding of theoretical aspects. In what follows, we want to name and discuss some extensions and theoretical considerations that we shall address in future work.

One important fact, whose implications we shall discuss later on, is the so-called Bochner-Weitzenböck formula (see, e.g., [43]), which relates the graph Laplacian and the Ricci curvature through an algebraic–geometric approach. More precisely, it prescribes a correction term for the standard Laplacian (or Laplace–Beltrami) operator, in terms of the curvature of the underlying manifold. Given that the Laplacian plays a key role in the heat equation (see, e.g., [43]), it is easy to gain some basic physical intuition behind the phenomenon of curvature and flow: The heat evolution on a curved metal plate differs from that on a planar one in a manner that is evidently dependent on the shape (i.e., curvature) of the plate.

This suggests a number of possible future directions:

An almost self-evident task is to experiment further with very large data sets (numbers of data points on the order of ten thousand and more);
Another natural target is the use of our method on different types of networks, with special emphasis on biological networks;
A statistical analysis regarding the Ricci flow, similar to the one presented here and in [31], should also be performed on various standard types of networks in order to confirm and calibrate the characterization and classifying capabilities of the Ricci curvature and flow.

Slightly more demanding are future experiments and comparisons with the related flows, namely

The Forman curvature versions of the scalar and Laplace–Beltrami flows. Especially the last one seems to be promising for network denoising, as applications of the analogous flow in image processing showed [44]. Moreover, the Forman-Ricci curvature comes naturally coupled with a fitting version of the so-called Bochner Laplacian (and with yet another, intrinsically connected, rough Laplacian). This aspect is subject to ongoing work and will be covered in a forthcoming paper by the authors.
As for the short time Ricci flow, a statistical analysis should be undertaken to validate the classification potential of the long term Ricci flow. A more ambitious, yet still feasible, future direction would be to explore network stability by considering the long time Forman-Ricci flow (as opposed to the short time one employed for denoising). This approach would exploit, in analogy with the smooth case [19,20], the propensity of the Ricci flow to preserve and quantify the overall, global geometry (i.e., curvature) and essential topology of the network. This would allow us to study the evolution of a network “under its own pressure” and to detect and examine such catastrophic events as virus attacks and denial of service attempts. Given the basic numerical simplicity of our method, this approach might prove to be an effective alternative to the Persistent Homology method (see, e.g., [6]) for the 1-dimensional case of networks. Moreover, the Ricci flow does not need to appeal to the higher dimensional structures (namely simplicial complexes) that are necessary for Persistent Homology based applications, which brings clear computational advantages (see, e.g., the code described in [45]) as well as theoretical rigor. Furthermore, the Ricci flow defined here can be applied to weighted networks, whereas Persistent Homology requires unweighted complexes.

Acknowledgments

Emil Saucan thanks the Max-Planck-Institute for Mathematics in the Sciences, Leipzig, for its warm support and hospitality during the writing of this paper. He would also like to thank Areejit Samal for many interesting discussions. Melanie Weber was supported by a scholarship of the Konrad Adenauer Foundation.

Author Contributions

This article represents an offshoot of Melanie Weber’s master’s thesis, written under the guidance of Jürgen Jost and Emil Saucan.

Conflicts of Interest

The authors declare no conflict of interest.

References

Watts, D.J.; Strogatz, S.H. Collective dynamics of small-world networks. Nature 1998, 393, 440–442. [Google Scholar] [CrossRef] [PubMed]
Barabási, A.L.; Albert, R. Emergence of scaling in random networks. Science 1999, 286, 509–512. [Google Scholar] [PubMed]
Albert, R.; Barabási, A.L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47. [Google Scholar] [CrossRef]
Barabasi, A.L. Network Science; Cambridge University Press: Cambridge, UK, 2016. [Google Scholar]
Barabási, A.L.; Oltvai, Z.N. Network biology: Understanding the cell’s functional organization. Nat. Rev. Genet. 2004, 5, 101–113. [Google Scholar] [CrossRef] [PubMed]
Petri, G.; Expert, P.; Turkheimer, F.; Carhart-Harris, R.; Nutt, D.; Hellyer, P.J.; Vaccarino, F. Homological scaffolds of brain functional networks. J. R. Soc. Interface 2014, 11, 20140873. [Google Scholar] [CrossRef] [PubMed]
Šubelj, L.; Bajec, M. Robust network community detection using balanced propagation. Eur. Phys. J. B 2011, 81, 353–362. [Google Scholar] [CrossRef]
Barthélemy, M. Spatial networks. Phys. Rep. 2011, 499, 1–101. [Google Scholar] [CrossRef]
Ellison, N.B.; Steinfield, C.; Lampe, C. The benefits of Facebook friends: Social capital and college students’ use of online social network sites. J. Comput.-Mediat. Commun. 2007, 12, 1143–1168. [Google Scholar] [CrossRef]
Banerjee, A.; Jost, J. Spectral plot properties: Towards a qualitative classification of networks. Netw. Heterog. Media 2008, 3, 395–411. [Google Scholar] [CrossRef]
Ollivier, Y. Ricci curvature of Markov chains on metric spaces. J. Funct. Anal. 2009, 256, 810–864. [Google Scholar] [CrossRef]
Jost, J.; Liu, S. Ollivierís Ricci curvature, local clustering and curvature-dimension inequalities on graphs. Discret. Comput. Geom. 2014, 51, 300–322. [Google Scholar] [CrossRef]
Wu, Z.; Menichetti, G.; Rahmede, C.; Bianconi, G. Emergent complex network geometry. Sci. Rep. 2015, 5, 10073. [Google Scholar] [CrossRef] [PubMed]
Sandhu, R.; Georgiou, T.; Reznik, E.; Zhu, L.; Kolesov, I.; Senbabaoglu, Y.; Tannenbaum, A. Graph curvature for differentiating cancer networks. Sci. Rep. 2015, 5, 12323. [Google Scholar] [CrossRef] [PubMed]
Eckmann, J.; Moses, E. Curvature of co-links uncovers hidden thematic layers in the world wide web. Proc. Natl. Acad. Sci. USA 2002, 99, 5825–5829. [Google Scholar] [CrossRef] [PubMed]
Shavitt, Y.; Tankel, T. On the curvature of the Internet and its usage for overlay construction and distance estimation. In Proceedings of the NFOCOM 2004 Twenty-Third AnnualJoint Conference of the IEEE Computer and Communications Societies, Hong Kong, China, 7–11 March 2004; Volume 1.
Saucan, E.; Appleboim, E. Curvature based clustering for DNA microarray data analysis. In Pattern Recognition and Image Analysis; Springer: Berlin/Heidelberg, Germany, 2005; pp. 405–412. [Google Scholar]
Narayan, O.; Saniee, I. Large-scale curvature of networks. Phys. Rev. E 2011, 84, 066108. [Google Scholar] [CrossRef] [PubMed]
Perelman, G.J. The entropy formula for the Ricci flow and its geometric applications. arXiv, 2002; arXiv:math/0211159. [Google Scholar]
Perelman, G.J. Ricci flow with surgery on three-manifolds. arXiv, 2003; arXiv:math/0303109. [Google Scholar]
Hamilton, R.S. The Ricci flow on surfaces. In Mathematics and General Relativity; Contemporary Mathematics; American Mathematical Society: Providence, RI, USA, 1988; Volume 71, pp. 237–262. [Google Scholar]
Ollivier, Y. A survey of Ricci curvature for metric spaces and Markov chains. Adv. Stud. Pure Math. 2010, 57, 343–381. [Google Scholar]
Ollivier, Y. A Visual introduction to Riemannian Curvatures and Some Discrete Generalizations. Avaliable online: http://www.yann-ollivier.org/rech/publs/visualcurvature.pdf (accessed on 8 November 2016).
Bauer, F.; Jost, J.; Liu, S. Ollivier-Ricci curvature and the spectrum of the normalized graph Laplace operator. Math. Res. Lett. 2012, 19, 1185–1205. [Google Scholar] [CrossRef]
Ni, C.-C.; Lin, Y.-Y.; Gao, J. Ricci curvature of the Internet topology. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China, 26 April–1 May 2015; pp. 2758–2766.
Sandhu, R.; Georgiou, T.; Tannenbaum, A. Market Fragility, Systemic Risk, and Ricci Curvature. arXiv, 2015; arXiv:1505.05182. [Google Scholar]
Forman, R. Bochner’s method for cell complexes and combinatorial Ricci curvature. Discret. Comput. Geom. 2003, 29, 323–374.
Saucan, E.; Wolansky, G.; Appleboim, E. Combinatorial Ricci Curvature and Laplacians for Image Processing. In Proceedings of the 2nd International Congress on Image and Signal Processing (CISP’09), Tianjin, China, 17–19 October 2009; Volume 2, pp. 992–997.
Appleboim, E.; Saucan, E.; Zeevi, Y.Y. Ricci Curvature and Flow for Image Denoising and Superresolution. In Proceedings of the 20th European Signal Processing Conference (EUSIPCO), Bucharest, Romania, 27–31 August 2012; pp. 2743–2747.
Sonn, E.; Saucan, E.; Appleboim, E. Ricci Flow for Image Processing. In Proceedings of the 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), Eilat, Israel, 3–5 December 2014.
Sreejith, R.P.; Mohanraj, K.; Jost, J.; Saucan, E.; Samal, A. Forman curvature for complex networks. J. Stat. Mech. 2016, 063206.
Weber, M.; Saucan, E.; Jost, J. Characterizing Complex Networks with Forman-Ricci curvature and associated geometric flows. arXiv, 2016; arXiv:1607.08654.
Kunegis, J. KONECT: The Koblenz Network Collection. In Proceedings of the International Conference on World Wide Web Companion, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 1343–1350.
Leskovec, J.; Kleinberg, J.; Faloutsos, C. Graph Evolution: Densification and Shrinking Diameters. In ACM Transactions on Knowledge Discovery from Data (ACM TKDD); ACM: New York, NY, USA, 2007; Volume 1.
Ripeanu, M.; Foster, I.; Iamnitchi, A. Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design. IEEE Internet Comput. 2002, 6, 50–57.
Weber, M.; Saucan, E.; Jost, J. Can one see the shape of a network? arXiv, 2016; arXiv:1608.07838v2.
De Sola Pool, I.; Kochen, M. Contacts and influence. Soc. Net. 1978, 1, 5–51.
Michalski, R.; Palus, S.P. Matching Organizational Structure and Social Network Extracted from Email Communication. In Lecture Notes in Business Information Processing; Springer: Berlin/Heidelberg, Germany, 2011; Volume 87, pp. 197–206.
Sarkar, R.; Yin, X.; Gao, J. Greedy Routing with Guaranteed Delivery Using Ricci Flows. In Proceedings of the 8th International Symposium on Information Processing in Sensor Networks (IPSN’09), San Francisco, CA, USA, 13–16 April 2009; pp. 121–132.
Chow, B.; Luo, F. Combinatorial Ricci flows on surfaces. J. Differ. Geom. 2003, 63, 97–129.
Saucan, E. A Metric Ricci Flow for Surfaces and its Applications. Geom. Imag. Comput. 2014, 1, 259–301.
Xu, G. Discrete Laplace–Beltrami operators and their convergence. Comput. Aided Geom. Des. 2004, 21, 767–784.
Jost, J. Riemannian Geometry and Geometric Analysis; Springer: Berlin/Heidelberg, Germany, 2011.
Saucan, E.; Appleboim, E.; Wolansky, G.; Zeevi, Y.Y. Combinatorial Ricci curvature for image processing. MICCAI 2008 Workshop Manifolds in Medical Imaging: Metrics, Learning and Beyond. arXiv, 2008; arXiv:0903.3676.
Mischaikow, K.; Nanda, V. Morse Theory for Filtrations and Efficient Computation of Persistent Homology. Discret. Comput. Geom. 2013, 50, 330–353.

Figure 1. Characterization of a weighted and undirected peer-to-peer network (email correspondence [33,38]) with Forman-Ricci curvature; (A) Network plot. Node sizes are scaled with respect to their node degrees; (B) (Weighted) node degree distribution; (C) Curvature map, representing the Forman-Ricci curvature along the network’s edges. The curvature map is a heat map of a matrix in which each entry $(i, j)$ represents the Ricci curvature $\mathrm{Ric}(e = e(i, j))$ of the corresponding edge. Light yellow entries represent edges with low curvature, red entries those with high curvature; (D) Histogram showing the distribution of the Forman-Ricci curvature.

Figure 2. Schematic overview of the change detection workflow; (A) We consider a set of “snapshots” of a system or database at discrete times $(t_i)_{i \in I}$ with step size $\Delta t$; (B) We infer unweighted (binary) networks from each snapshot that represent the structure of the underlying data. To simplify comparison, we normalize all networks; (C) By superimposing weighting schemes based on “indirect connections”, we extend the unweighted networks to weighted ones (see Section 1); (D) We calculate the Ricci flow for each time step $\Delta t$ by iterating the given scheme; (E) To detect changes, we compare the final (smoothed) weighting schemes after $K$ iterations; (F) To identify regions that were subject to significant change, we threshold the correlation matrix obtained in (E). Light regions in the resulting map indicate such regions.
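
Steps (D) to (F) of this workflow can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the caption: all node weights are set to 1, the edge curvature follows the Forman formula for weighted networks, the flow is discretized as an explicit update of each edge weight by its own curvature, and a simple per-edge weight ratio stands in for the correlation matrix of step (E). The function names and the step size `dt` are hypothetical.

```python
from math import sqrt


def forman_curvature(weights):
    """Forman-Ricci curvature per edge, assuming all node weights equal 1.

    weights: dict mapping frozenset({u, v}) -> positive edge weight.
    """
    # Collect the edges incident to every node.
    incident = {}
    for edge in weights:
        for node in edge:
            incident.setdefault(node, []).append(edge)

    curvature = {}
    for edge, w_e in weights.items():
        total = 0.0
        for node in edge:
            total += 1.0 / w_e  # contribution of the end node itself
            for other in incident[node]:
                if other != edge:  # neighbouring edges lower the curvature
                    total -= 1.0 / sqrt(w_e * weights[other])
        curvature[edge] = w_e * total
    return curvature


def ricci_flow(weights, iterations=10, dt=0.1):
    """Apply `iterations` explicit steps w <- w - dt * Ric(e) * w.

    With unit node weights Ric(e) <= 2, so dt < 0.5 keeps weights positive.
    """
    current = dict(weights)
    for _ in range(iterations):
        ric = forman_curvature(current)
        current = {e: w - dt * ric[e] * w for e, w in current.items()}
    return current


def changed_edges(weights_a, weights_b, iterations=10, dt=0.1, thresh=0.8):
    """Flag edges whose smoothed weights differ between two snapshots.

    Stand-in for steps (E) and (F): the similarity of an edge is the ratio
    of its smaller to its larger smoothed weight (0 if the edge is missing
    in one snapshot); edges with similarity below `thresh` are flagged.
    """
    flow_a = ricci_flow(weights_a, iterations, dt)
    flow_b = ricci_flow(weights_b, iterations, dt)
    flagged = set()
    for edge in set(flow_a) | set(flow_b):
        a = flow_a.get(edge, 0.0)
        b = flow_b.get(edge, 0.0)
        if min(a, b) / max(a, b) < thresh:
            flagged.add(edge)
    return flagged
```

For example, calling `changed_edges` on a triangle and on the same graph with one edge deleted flags the deleted edge, because its smoothed weight is zero in the second snapshot. The defaults `iterations=10` and `thresh=0.8` mirror the parameter choice reported for Figure 4; the ratio-based similarity is only an illustration and not the correlation measure used in the article.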

Figure 3. We analyze an Internet peer-to-peer network representing Gnutella file exchanges on two consecutive days (8 and 9 August 2002, from [34,35]); (A) Network plot (left) and curvature map (right) displaying the distribution of Forman curvature for a Gnutella snapshot on 8 August; (B) Analogous plots for a Gnutella snapshot from the following day, 9 August.

Figure 4. Detection of changes and regions of distinguished activity in Gnutella snapshots of two consecutive days [34,35] with Ricci flow and a parameter choice of $K = 10$ and $\mathrm{thresh} = 0.8$ (the relatively small number of iteration steps was chosen for the sake of computation time). The figure shows a heat map of the correlation matrix of the edges’ Ricci curvature. Highlighted spots correspond to single edges and assortments of edges and nodes (clusters) with high activity (flow).

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Weber, M.; Jost, J.; Saucan, E. Forman-Ricci Flow for Change Detection in Large Dynamic Data Sets. Axioms 2016, 5, 26. https://doi.org/10.3390/axioms5040026

