**6. Conclusions**

Over the past 20 years, state-of-the-art information filtering methods such as the MST and the PMFG have revolutionized the field of econophysics, and also made contributions to other closely related disciplines. In this paper, we suggested two related directions to extend this information-filtering paradigm. The first is through topological data analysis (TDA), and the second is through the calculation of Ollivier-Ricci curvature. The former improves our understanding of the topological backbones of financial networks, whereas the latter puts the geometrical information back onto the topological backbones.

In the TDA, we explored four toy models of fusion, namely (1) the merging of two ellipsoidal surfaces, (2) the merging of two biconvex surfaces, (3) the merging of two anisotropic ellipsoidal surfaces through a sequence of higher-dimensional connections, and finally (4) the merging of two random irregular surfaces. By applying the insights extracted from this exploration to a recent crash in the TWSE, we found the number of simplices increasing slowly with increasing filtration parameter half a year before the market crash, and rapidly with increasing filtration parameter close to the crash. This suggests that a non-trivial topological transition accompanied the market crash. However, we found that the four fusion/fission models proposed were not able to fully explain the topological changes, and additional processes (cavitation, annihilation, rupture, healing, and puncture) that do not involve fusion or fission were needed to explain the changes in Betti numbers.

Moving beyond TDA, we used the Ollivier-Ricci curvature to quantify the distribution of curvatures in PMFGs constructed from the correlation matrices of the TWSE. We explained that positive ORCs correspond to stock components deep within a cluster, whereas negative ORCs pinpoint the neck (bridge) regions that connect distinct clusters. When we examined the PMFGs for nine periods between August 2019 and March 2020, we found dramatic topological changes between successive periods. This prevented us from systematically identifying all topological changes that were specifically associated with neck regions in the PMFGs. Instead, we looked only at two neck regions, associated with the links (176, 193) and (176, 393), that featured prominently during this period. These three nodes are: (176) Tung Thih Electronic Co., Ltd, (193) C-Tech United Corp., and (393) Taiwan Semiconductor Co., Ltd. During the last time period, (176, 193) was no longer found in the PMFG, while the curvature of (176, 393) became nearly zero. Using ego-network visualizations of these three nodes and selective visualization of links between them, we saw that all three nodes lie on the peripheries of the clusters they belonged to. In the seventh time period, all three clusters were distinct. In the eighth time period, the cluster containing 176 merged with the cluster containing 193. Finally, in the ninth time period, this cluster broke up, leaving a small cluster containing 193, while the larger cluster containing 176 proceeded to merge with the cluster containing 393.

**Supplementary Materials:** MATLAB and Python scripts for TDA are available at https://doi.org/10.21979/N9/8XMZGF, accessed on 21 February 2021, whereas MATLAB and Python scripts and data files for Ricci curvature analysis are available at https://doi.org/10.21979/N9/EO5QON, accessed on 19 July 2021. The barcodes for the four toy models in Appendix B and the barcodes for the nine time periods of the TWSE in Section 4.3 are also available online at https://www.mdpi.com/1099-4300/23/9/1211/s1.

**Author Contributions:** Conceptualization, S.A.C. and K.X.; methodology, S.A.C. and P.T.-W.Y.; software, K.X.; writing—original draft preparation, S.A.C. and P.T.-W.Y.; writing—review and editing, S.A.C., K.X. and P.T.-W.Y.; visualization, S.A.C. and P.T.-W.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All Python and MATLAB scripts are provided, along with instructions on how to use them. Running these scripts will download the raw data from Yahoo! Finance and perform the necessary computations to produce the final results. See the links listed in the Supplementary Materials.

**Acknowledgments:** P.T.-W.Y. thanks the National Center for High-Performance Computing, Hsinchu, Taiwan for technical support and for providing high-performance computing resources on the Taiwania 1 server.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Construction of MST and PMFG**

For readers interested in how to put the algorithms in Figure 3 into practice, let us use the Pearson cross correlations between 10 Asian stock market indices as an example. These 10 stock market indices are shown in Table A1, and their ordered cross correlations are shown in Table A2.

To construct the MST for these 10 Asian indices, we start from the largest cross correlation *C*2,7 = 0.705 in Table A2, to draw a link between node 2 (Hong Kong) and node 7 (South Korea). We then proceed to go through the cross correlations within Table A2 in decreasing order, to add a link between node 2 and node 10 (Singapore), node 7 and node 8 (Taiwan), node 1 (Japan) and node 7, ..., until we reach *C*2,3 = 0.472 between node 2 and node 3. To arrive at the tree graph shown in Figure A1a, we have rejected eight links, because they would result in cycles if we accepted them.


**Table A1.** List of ten Asian stock market indices.

According to Table A2, *C*2,4 = 0.466 is the next cross correlation to be considered. Indeed, if we add a link between node 2 and the new node 4 (Thailand), no cycles are created. Therefore, we accept this new link, to end up with the tree graph shown in Figure A1b.

After adding the link between node 2 and node 4, the next link we should consider is *C*4,5 = 0.447, according to Table A2. However, as we can see from the red link in Figure A1c, accepting this link between node 4 and node 5 (India) would lead to a cycle in the network. Therefore, we reject the link between node 4 and node 5. Thereafter, we reject six more links, before adding a link between node 6 (Indonesia) and node 8 to complete the MST in Figure A1d.
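The acceptance-and-rejection procedure above is exactly Kruskal's algorithm, with a union-find structure detecting cycles. Below is a minimal Python sketch of this procedure; the 4 × 4 correlation matrix is a small hypothetical example, not the data of Table A2.

```python
def mst_from_correlations(corr):
    """Build an MST by accepting links in decreasing correlation order,
    rejecting any link that would close a cycle (Kruskal + union-find)."""
    n = len(corr)
    parent = list(range(n))

    def find(x):                      # union-find root with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted(((corr[i][j], i, j)
                    for i in range(n) for j in range(i + 1, n)), reverse=True)
    edges = []
    for c, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:                  # accept: no cycle is created
            parent[ri] = rj
            edges.append((i, j))
        if len(edges) == n - 1:       # the tree is complete
            break
    return edges

# Hypothetical 4-asset correlation matrix.
corr = [[1.0, 0.9, 0.7, 0.2],
        [0.9, 1.0, 0.8, 0.3],
        [0.7, 0.8, 1.0, 0.6],
        [0.2, 0.3, 0.6, 1.0]]
print(mst_from_correlations(corr))   # the link (0, 2) is rejected as a cycle
```

Here the strongest pairs (0, 1) and (1, 2) are accepted, (0, 2) is rejected because it would close a cycle, and (2, 3) completes the tree.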


**Table A2.** Ordered list of Pearson cross correlations between the ten Asian indices shown in Table A1.

**Figure A1.** Intermediate and final stages in constructing the MST of 10 Asian indices: (**a**) after adding a link between node 2 and node 3; (**b**) after adding a link between node 2 and node 4; (**c**) after rejecting the red link between node 4 and node 5, due to the appearance of the cycle (5, 2, 4) in the network. Finally, (**d**) all 10 nodes are linked in the MST, after adding the link between node 6 and node 8.

Next, we construct the PMFG for the 10 Asian indices. Again, we start from the largest cross correlation *C*2,7 = 0.705 in Table A2, to draw a link between node 2 and node 7. Then, we go through the cross correlations within Table A2 in decreasing order, and arrive at the network shown in Figure A2a after adding a link between node 8 and node 10, without rejecting any stronger links.

According to Table A2, the next link that we should add is between node 7 and node 10. When we first draw this in Figure A2b, the link between node 7 and node 10 overlaps the link between node 1 and node 2. However, this does not necessarily imply that the link must be rejected. If we move node 1 to the other side of 5–2–7–8, while keeping it within the loop formed by the 2–8 link, as shown in Figure A2c, we find no overlaps between links.

Similarly, when we add the next link between node 5 and node 7 in Figure A2d, the new link overlaps with the link between node 2 and node 10. Just like when we add the link between node 7 and node 10, we do not immediately reject this new link, but check if the network obtained thus far can be redrawn to accommodate the new link. Indeed, by moving node 5 into the triangle formed by nodes 2, 7, and 10, we show that the network with the new link continues to be planar, as shown in Figure A2e.

Finally, when we attempt to add the next link between node 5 and node 10, as shown in Figure A2f, we see that there is no rearrangement of the existing nodes and links that would allow the new link to avoid overlapping existing links (i.e., that would keep the network planar). Therefore, this link has to be rejected.
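The PMFG construction is the same greedy procedure as the MST, with the cycle test replaced by a planarity test. Below is a hedged Python sketch using the `networkx` library's `check_planarity`; the 6-node correlation matrix is randomly generated for illustration and is not the data of Table A2.

```python
import itertools
import random
import networkx as nx

def pmfg(corr):
    """Build the PMFG: add links in decreasing correlation order,
    rejecting any link that destroys planarity."""
    n = len(corr)
    pairs = sorted(((corr[i][j], i, j)
                    for i, j in itertools.combinations(range(n), 2)), reverse=True)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for c, i, j in pairs:
        G.add_edge(i, j)
        if not nx.check_planarity(G)[0]:   # reject: the new link breaks planarity
            G.remove_edge(i, j)
        if G.number_of_edges() == 3 * (n - 2):
            break                          # a maximal planar graph has 3(n - 2) links
    return G

# Hypothetical 6-node correlation matrix with distinct off-diagonal entries.
random.seed(0)
n = 6
corr = [[1.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        corr[i][j] = corr[j][i] = round(random.uniform(0.1, 0.9), 3)

G = pmfg(corr)
print(G.number_of_edges())   # 12 = 3(6 - 2) of the 15 possible links survive
```

Because every maximal planar graph on *n* ≥ 3 vertices has exactly 3(*n* − 2) edges, the loop is guaranteed to terminate with that count.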

**Figure A2.** Intermediate stages in constructing the PMFG of 10 Asian indices: (**a**) after adding the link between node 8 and node 10; (**b**) after adding the link between node 7 and node 10, before checking for planarity; (**c**) after adding the link between node 7 and node 10, and after checking for planarity; (**d**) after adding the link between node 5 and node 7, before checking for planarity; (**e**) after adding the link between node 5 and node 7, and after checking for planarity; (**f**) after adding the link between node 5 and node 10, before checking for planarity. After checking for planarity, this link between node 5 and node 10 is rejected. In this figure, MST links are colored blue, PMFG links are colored red, while links that are rejected are shown as red dashed curves.


#### **Appendix B. Topological Data Analysis of Toy Models of Fusions**

Before we attempt to understand the complex events underlying a market crash, we should first understand the simplest topological changes associated with the fusion between two manifolds, or the breaking up of a manifold into multiple pieces. In this paper, we worked out the TDA signatures of four toy models (see Figure A3): (1) fusion through a single point of contact, (2) fusion through a one-dimensional set, (3) fusion through a set with increasing dimensionality, and (4) fusion of random irregular surfaces. For these toy models, we first normalize the distance matrices by dividing them by the largest distance, so that when we do filtrations, the filtration parameter always terminates at *ε* = 1. According to some researchers in this field, TDA may detect artifacts or noise due to specific data sampling methods. To eliminate these artifacts due to data sampling, we randomized and reshuffled the data points. This helps us focus on the more meaningful topological features in the data set. For each toy model, we used the Javaplex software to compute the persistent Betti numbers at each stage of the fusion. The barcodes for all stages of all toy models are also shown in the Supplementary Material, and the persistent Betti numbers are read off from the barcodes at the largest filtration parameter used.
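The *H*0 part of such a filtration has a simple interpretation: at a given filtration parameter, *β*0 is the number of connected components of the graph linking points closer than that parameter. The following Python sketch (a minimal stand-in for the Javaplex computation, on a hypothetical point cloud) illustrates how *β*0 changes with the filtration parameter.

```python
import math

def betti0(points, eps):
    """beta_0 of the epsilon-neighborhood graph on a point cloud:
    count connected components via union-find."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                parent[find(i)] = find(j)   # merge the two components
    return len({find(i) for i in range(n)})

# Two well-separated blobs: two components at a small threshold, one at a large one.
cloud = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4),
         (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
print(betti0(cloud, 1.0), betti0(cloud, 10.0))   # 2 1
```

Tracking how long each component survives as `eps` sweeps from 0 to 1 is exactly what the *H*0 barcode records.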

**Figure A3.** The four fusion models investigated: (**a**) two elliptical manifolds merging by first touching at a point, before developing a hyperbolic neck, and eventually having positive curvature everywhere; (**b**) two biconvex manifolds merging through their rims touching to form a void that shrinks in size; (**c**) a variation of (**a**), where the two anisotropic elliptical manifolds merge with their principal axes not aligned. After touching at a point, the neck between the two merging manifolds first becomes quasi-one-dimensional, before becoming fully two-dimensional; (**d**) the merging between two rough surfaces with random irregularities, which can be thought of as combinations of all of the above, plus emergent features like a hole that eventually becomes a void.

We show the results in Figure A4. For model (A), which is the reverse of the one studied by Santos et al. [181], we found in Figure A4(A1) that while the two ellipsoids are well separated, *β*0 = 2 (two distinct objects), *β*1 = 0 (no irreducible loops on either ellipsoid), and *β*2 = 2 (one void enclosed by each ellipsoid). Then, in Figure A4(A2), the two deformed ellipsoids are just touching, and the point of contact has the form of a Dirac cone, which we expect to have non-trivial topological signatures of its own. As expected, the fusing of the two ellipsoids into a single manifold is reflected in *β*0 = 1. We also found *β*1 = 0, which tells us that there are no irreducible loops. Finally, the Dirac point kept the two voids separated, and hence *β*2 = 2. As shown in Figure A4(A3), the fusion is complete, but the neck region joining the original two ellipsoids has negative curvature. In this situation, we found that *β*0 = 1 (one distinct object), *β*1 = 0, and *β*2 = 1 (the two voids have merged). This is indeed what we expected, and it is indistinguishable from Figure A4(A4), which is the last stage of the whole coalescence process. Here, the neck region has completely disappeared, and the curvature on the fused manifold is everywhere positive. Indeed, we found that *β*0 = 1, *β*1 = 0, and *β*2 = 1. These Betti numbers are identical to those of a sphere.

For model (B), we found in Figure A4(B1) that when the two biconvex shells are separated, we have *β*0 = 2 (two distinct objects), *β*1 = 0 (no irreducible loops), and *β*2 = 0. We were surprised that *β*2 = 0, since each shell should still enclose a void. We increased the number of data points for this case, but *β*2 remained zero, presumably because the typical gap in the voids is small, and the filtration procedure connects points across the voids. To test this hypothesis, we sliced the point cloud of (B3) into the top and bottom halves, moved them apart, and covered the gap in the equatorial plane between the inner and outer shells (see Figure A5a). For the top half, we found that *β*0 = 1, *β*1 = 0, and *β*2 = 1. By symmetry, this tells us that for the two parts together, we will have *β*0 = 2, *β*1 = 0, and *β*2 = 2. This means that a void can indeed be identified if its narrowest part is distinctly larger than the typical distance between data points. In the next stage of the fusion, the two biconvex shells just touched in Figure A4(B2), to form an inner spherical shell and an outer ellipsoidal shell. As with (A2), the two two-dimensional shells also touched on a set of measure zero (even though the set is a one-dimensional ring, instead of a zero-dimensional point). This probably explains why *β*0 = 2 (i.e., the shells remain topologically distinct). The other Betti numbers also remained the same as in (B1). Thereafter, with regard to (B3) and (B4), we expected them to have identical topological features, and found indeed that *β*0 = 1 (one distinct object), *β*1 = 0 (no irreducible loops), and *β*2 = 2 (two voids enclosed) for the two cases. An observation we feel compelled to share is that for (B4), the density of red points (outer ellipsoid) and the density of blue points (inner sphere) are initially not the same, because we kept roughly the same number of blue and red points over the whole series of point clouds from (B1) to (B4).
We then found that the TDA calculations for (B4) were not complete even after four days (in contrast to (B1) to (B3), which took two days each on average). We suspected that this was because, in the filtration process, the dense blue points led at larger *ε* to the emergence of very high-dimensional *k*-simplices. With every increase in dimension, the total number of simplices grew exponentially, and Javaplex slowed down. Indeed, when we performed an alternate TDA calculation in which we reduced the number of blue points, so that their density is comparable to the density of red points, the calculation became much faster: from the more than four days expected for the former, to roughly four hours for the latter.

Next, for model (C), we emulated the change in dimensionality during the fusion of two initially separated parabolic shells (Figure A4(C1)) approaching each other, by first connecting them with a one-dimensional line (Figure A4(C2)). As the fusion progressed, we connected the two parabolic shells with a two-dimensional rectangular sheet (Figure A4(C3)), and finally connected them with a two-dimensional cylindrical shell (Figure A4(C4)). In (C4), we removed points from the original parabolic shells that are within the neck region formed by the cylindrical shell, to form the single surface with a channel shown in Figure A5b. We created this toy model as an alternative to model (A), in which the neck formation is locally isotropic. In physics, we know of materials and processes which are anisotropic, i.e., there are easy as well as hard axes. In these materials or processes, the neck formation is expected to go through a sequence of dimensionality changes. For (C1) in this model, we found that *β*0 = 2, *β*1 = 0, and *β*2 = 0. It is understandable that *β*0 = 2, because there are two distinct parabolic shells. We also understand why *β*1 = 0 (because there are no irreducible closed loops) and *β*2 = 0 (because the shells do not enclose any void). When we reached (C2), the two shells became connected. Therefore, we were not surprised to find that *β*0 = 1 (one distinct object). We were also not surprised to find *β*1 = 0 and *β*2 = 0 remaining the same as for (C1). As we moved to (C3) and (C4), we found *β*0 = 1, *β*1 = 0, and *β*2 = 0, the same as in (C2). We had expected differences between (C2), (C3), and (C4), but it seems that these differences do not manifest themselves in the persistent Betti numbers, but require us to look more closely at the Betti curves.

**Figure A4.** Point clouds generated using MATLAB for different stages (1) to (4) of the four fusion models (**A**–**D**), and their associated Betti numbers. The number of data points differs from case to case. For example, (**A2**) has more data points than the other stages of model (**A**), because we implemented it as a combination of two spheroids, plus a pair of Dirac cones.

**Figure A5.** (**a**) To study how the typical gaps between data points affect the value of *β*2 TDA produces, we split the point cloud in (B3) into top and bottom halves (blue points), moved them apart, and covered the gap in the equatorial plane between the inner and outer shells with green points. Such a structure would have a well-defined void whose average length scale is much larger than the distance between data points. (**b**) To show the channel formed in the neck region of (C4), we zoomed in to the green data points around the neck, and also rotated the view so that the channel formed by the two connected shells can be seen clearly, especially after the data points are meshed.

Finally, for model (D), we first generated two rugged surfaces on a 20 × 20 square grid. In Figure A4(D1), the two rugged surfaces were initially far apart, and then gradually moved closer, until they overlapped in some parts, as shown in Figure A4(D2). We did not remove data points from the top rugged surface that penetrated the bottom rugged surface, or data points from the bottom rugged surface that penetrated the top rugged surface. In (D1), we have *β*0 = 1 (as opposed to the *β*0 = 2 that we expected), *β*1 = 1 (as opposed to the *β*1 = 0 that we expected), and *β*2 = 0 (which is what we expected). In contrast, for (D2), we have *β*0 = 1, *β*1 = 0, and *β*2 = 0, which are all as we expected. As the two rugged surfaces were brought closer and started to have more overlaps, as in Figure A4(D3), their Betti numbers became *β*0 = 1 and *β*1 = 0, which were as expected, but we now had *β*2 = 1 (a void has been formed between the two surfaces). As we made the two surfaces merge further, we witnessed the change of *β*2 from 1 to 0 (see Figure A4(D4)). This implies that the wrapped void formed by the two rugged surfaces in (D3) had disappeared in (D4). We believe this observation is the result of the scale-dependence of *β*2, discussed at the end of Appendix C.

#### **Appendix C. Multiscale Analysis of Persistent Betti Numbers**

We cannot directly apply the results of Appendix B and Figure 5 to analyze the TDA results of the last three time windows, because we cannot be sure the Betti numbers are persistent. To better understand under what conditions we can have persistent Betti numbers, we visualize two contrasting mechanisms for fusion and fission in Figure A6. In Figure A6a–e, fusion occurs when thresholds decrease. Since $D_{ij} = \sqrt{2(1 - C_{ij})}$, these events correspond to increases in cross correlations. To put it more simply, we started out having two clusters with strong intra-cluster correlations and weak inter-cluster correlations. When the inter-cluster correlations also become strong, the two clusters merge into one larger cluster. In contrast, for Figure A6f–i, fusion occurs when thresholds increase. These events are the result of the intra-cluster correlations of the two original clusters weakening, until they become comparable to the inter-cluster correlations. Without strong intra-cluster correlations to distinguish between their members, the two clusters effectively merged into a large but weakly correlated cluster. In the first sequence (Figure A6a–e), the persistent Betti number *β*0 is the number of 0-simplices whose lifetimes change most rapidly in the barcode. As we can see from Figure A6a–e, it is not difficult to identify the persistent *β*0 for this sequence.

In the second sequence, the situation is more interesting. Instead of just one set of persistent *β*0, we find two sets of persistent *β*0 emerging at different scales. This means that we have one picture at the lower scale, but an equally meaningful picture at the higher scale. This also means that we need to be cautious when interpreting changes in *β*0 as the result of fusions and fissions.
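The two equally meaningful pictures at different scales can be reproduced with single-linkage hierarchical clustering: cutting the same dendrogram at different distance thresholds yields different cluster counts. Below is a minimal Python sketch, assuming SciPy is available, on a hypothetical two-block correlation matrix (not the TWSE data).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical correlation matrix: two tight clusters {0, 1, 2} and {3, 4, 5}.
C = np.full((6, 6), 0.2)
C[:3, :3] = 0.9
C[3:, 3:] = 0.9
np.fill_diagonal(C, 1.0)

D = np.sqrt(2.0 * (1.0 - C))          # correlation-to-distance map
Z = linkage(squareform(D, checks=False), method='single')

# Cut the dendrogram below and above the inter-cluster merging threshold.
tight = fcluster(Z, t=0.7, criterion='distance')   # the two clusters survive
loose = fcluster(Z, t=1.5, criterion='distance')   # the two clusters have fused
print(len(set(tight)), len(set(loose)))            # 2 1
```

Here the intra-cluster distance is √0.2 ≈ 0.45 and the inter-cluster distance is √1.6 ≈ 1.26, so the two thresholds sit on opposite sides of the merging scale.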

Similar scale-dependence can also occur for *β*2. Consider the situation shown in Figure A7, where we can visually discern three voids in a (three-dimensional) point cloud. The three voids have different sizes, and are such that the smallest void occurs in the densest part of the point cloud, while the biggest void occurs in the sparsest part of the point cloud. When we perform TDA on this point cloud, the smallest void would be the first to emerge, when the filtration parameter *ε*1 is just large enough to create 2-simplices that encapsulate this smallest void. At this value *ε*1, the points around the medium void and the biggest void are too far apart for the simplices formed around them to fully encapsulate these voids. Therefore, over a fairly large range of filtration parameters, this is the only persistent void we will find, and thus *β*2 = 1.

When we continue to increase the filtration parameter, we will eventually reach the value *ε*2, which is just large enough to create 2-simplices encapsulating the medium void. At *ε*2, the smallest void remains intact, while the sparse data points around the biggest void are still too far apart for that void to be encapsulated. If we continue to increase the filtration parameter, then at some point *ε*1', the filtration parameter becomes comparable to the size of the smallest void, and this void disappears. Therefore, over the range *ε*2 < *ε* < *ε*1', there are two persistent voids, and hence *β*2 = 2.

Finally, at some other point *ε*3, with *ε*1' < *ε*3 < *ε*2', the biggest void becomes fully encapsulated and emerges as a topological feature. Later on, when the filtration parameter becomes large enough, we will find the medium void disappearing at *ε*2', and the biggest void disappearing at *ε*3'. Therefore, depending on what scale we are at, the persistent *β*2 may be 1 or 2, before dropping down to 1, and then eventually becoming 0.

This phenomenon of *β*2 being scale-dependent explains the small values of *β*2 in (D3) and (D4), where we expect the rugged surfaces approaching each other to produce many small voids. In other words, by the time these small voids are produced, the filtration parameter has become so large that points on opposite sides of each void are linked, and thus the small voids disappear.


**Figure A6.** Using hierarchical clustering dendrograms and their corresponding *H*0 barcodes to understand two fusion-and-fission sequences. In the first sequence, we first have the merging threshold between clusters C1 and C2 decreased going from (**a**) to (**b**), and thereafter the merging threshold between clusters C1+2 and C3 decreased going from (**b**) to (**c**). There is then a rearrangement C1+2+3 → C1+2 in (**d**), before the merging threshold between clusters C1 and C2 increased going from (**d**) to (**e**). The fusion (1, 2, 3) → (1 + 2, 3) → (1 + 2 + 3) thus occurs with decreasing threshold, while the fission (1 + 2) → (1, 2) occurs with increasing threshold. In the second sequence, the thresholds of C1, C2, and C3 first increased going from (**f**) to (**g**), before a rearrangement C1+2+3 → C1+2 occurs in (**h**) with comparable thresholds. Finally, the thresholds of C1 and C2 decreased in (**i**) to give two distinct clusters. In this figure, a large distance *Dij* is associated with a small cross correlation *Cij*, and vice versa. For each dendrogram, we also show the persistent Betti number *β*0 in the corresponding barcode.

**Figure A7.** In this stylized depiction of a three-dimensional point cloud, we can visually discern three (persistent) voids. The smallest of these occurs in the densest part of the point cloud, whereas the biggest of these occurs in the sparsest part of the point cloud. (bottom) The barcodes associated with the emergences and disappearances of the three voids.

#### **Appendix D. Ricci Flow and the Poincaré Conjecture**

In addition to its role in helping us understand dynamical reorganizations within complex systems, Ricci flow was also at the center of a major recent breakthrough in pure mathematics. In 1904, the French mathematician Poincaré conjectured that "any closed simply connected 3-manifold is diffeomorphic to the standard 3-dimensional sphere *S*3" [219]; it was later believed that the same is true in any dimension. Between 1960 and 1980, the Poincaré conjectures for all dimensions were proven [220–223], except in three dimensions. This problem was recognized as mathematically so hard, and of such fundamental importance, that it was included as one of the seven Millennium Prize problems identified by the Clay Mathematics Institute in 2000. Later, the Poincaré conjecture for three dimensions was stated in a more general form by Thurston, who conjectured that "any closed orientable 3-manifold can be canonically cut along embedded 2-spheres and 2-tori so as to decompose into eight different geometrical pieces" [224]. In response to this challenge, Hamilton crafted the Ricci flow model [225], and initiated a program to apply Ricci flow to solve Thurston's *geometrization conjecture* [224]. The original problem and Thurston's generalized geometrization conjecture then went unsolved for 20 more years, until Perelman cracked the final puzzle. With his original and ingenious "Ricci flow with surgery" approach, he excised the singular regions as a workaround for solving the geometrization conjecture, and ultimately resolved the Poincaré conjecture in three dimensions [226,227]. This achievement not only won him the 2006 Fields Medal, but also made him the recipient of the first Millennium Prize. Since then, many mathematicians have moved into Ricci flow and related fields, creating something of a "gold rush" over the past 15 years.

#### **Appendix E. Computing the First Wasserstein Distance Using Linear Programming**

In Equation (6), we defined the first Wasserstein distance (also known as the earth mover's distance) conceptually, so that it can be used in the definition of the Ollivier-Ricci curvature. In this appendix, let us describe how the first Wasserstein distance on a network can be computed using linear programming.

First, consider a network *G* with five nodes. To compute the first Wasserstein distance between the lazy random walk probability distribution *μ*2 (Figure A8a) and the lazy random walk probability distribution *μ*5 (Figure A8b), let us consider all possible redistributions *ξij* from *μ*2(*i*) to *μ*5(*j*), such that

$$
\mu\_2(i) = \sum\_{j=1}^{5} \xi\_{ij}, \quad \mu\_5(j) = \sum\_{i=1}^{5} \xi\_{ij}. \tag{A1}
$$

These are our constraints.

Next, let *dij* be the geodesic distance between node *i* and node *j*. On an unweighted network such as the one shown in Figure A8, the geodesic distance *dij* is the number of hops needed to go from node *i* to node *j*, and can therefore only take on integer values from 0 to 3. If there is more than one way to get from node *i* to node *j*, *dij* is the smallest number of hops. With this, we can write down the matrix of geodesic distances as
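This matrix can be sketched in Python: breadth-first search from every node gives the hop-count geodesic distances. The edge list below is taken from the description of Figure A8 (node 2 adjacent to nodes 1, 3, 4; node 5 adjacent to nodes 3, 4), with nodes relabeled 0–4.

```python
from collections import deque

def hop_distances(n, edges):
    """All-pairs geodesic (hop-count) distances on an unweighted graph,
    via breadth-first search from every node."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    D = [[0] * n for _ in range(n)]
    for s in range(n):
        dist = [-1] * n                # -1 marks unvisited nodes
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        D[s] = dist
    return D

# The five-node network of Figure A8, nodes relabeled 0-4.
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4)]
D = hop_distances(5, edges)
print(D)   # reproduces the matrix of Equation (A2)
```

The output matches Equation (A2) row by row.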

$$D = \begin{bmatrix} 0 & 1 & 2 & 2 & 3 \\ 1 & 0 & 1 & 1 & 2 \\ 2 & 1 & 0 & 2 & 1 \\ 2 & 1 & 2 & 0 & 1 \\ 3 & 2 & 1 & 1 & 0 \end{bmatrix} . \tag{A2}$$

Following this, let us define the cost of moving *ξij* from node *i* to node *j* to be *dijξij*. The total cost to make *μ*2 into *μ*5 is thus

$$C(\mu\_2, \mu\_5) = \sum\_{i=1}^{5} \sum\_{j=1}^{5} d\_{ij} \xi\_{ij}. \tag{A3}$$

Our goal is to minimize *C*(*μ*2, *μ*5), subject to the constraints in Equation (A1).

This constrained minimization problem can be recast as a linear programming problem, if we change the indices from *i* and *j* to a fused index *k* = (*i*, *j*). In terms of this fused index, the cost function can be rewritten as

$$C(\mu\_2, \mu\_5) = \sum\_{k=1}^{25} d\_k \xi\_k, \tag{A4}$$

while the constraints can be written as

$$\sum\_{k=1}^{25} a\_{ik} \xi\_k = \mu\_2(i), \quad \sum\_{k=1}^{25} b\_{jk} \xi\_k = \mu\_5(j). \tag{A5}$$

In this formulation, *aik* = 1 if the first index in *k* is *i*, and *aik* = 0 otherwise, whereas *bjk* = 1 if the second index in *k* is *j*, and *bjk* = 0 otherwise.
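For readers who prefer an off-the-shelf solver to working through the simplex method by hand, the full 25-variable formulation above can be handed directly to SciPy's `linprog` (which uses its own solvers rather than the artificial-variable construction worked through in this appendix). This is a sketch, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

# Geodesic distance matrix of Equation (A2).
D = np.array([[0, 1, 2, 2, 3],
              [1, 0, 1, 1, 2],
              [2, 1, 0, 2, 1],
              [2, 1, 2, 0, 1],
              [3, 2, 1, 1, 0]], dtype=float)

mu2 = np.array([1/6, 1/2, 1/6, 1/6, 0.0])   # lazy walk from node 2 (alpha = 1/2)
mu5 = np.array([0.0, 0.0, 1/4, 1/4, 1/2])   # lazy walk from node 5

n = 5
c = D.flatten()                    # cost d_k of each xi_k, fused index k = (i, j)
A_eq = np.zeros((2 * n, n * n))
for i in range(n):                 # row sums:    sum_j xi_ij = mu2(i)
    A_eq[i, i * n:(i + 1) * n] = 1.0
for j in range(n):                 # column sums: sum_i xi_ij = mu5(j)
    A_eq[n + j, j::n] = 1.0
b_eq = np.concatenate([mu2, mu5])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.fun)                     # W1(mu2, mu5) = 4/3
```

The optimal cost agrees with the Kantorovich dual bound: the 1-Lipschitz potential *f* = (3, 2, 1, 1, 0) (distance to node 5) gives Σ *f* (*μ*2 − *μ*5) = 4/3.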

The standard method for solving a high-dimensional linear programming problem is the *simplex method*, which can be found in numerous textbooks [228–230]. There are many variants of the simplex method, depending on whether the optimization is a maximization or a minimization, and whether we are dealing with equality or inequality constraints. In general, the simplex method and its variants introduce auxiliary variables called *slack variables*, *excess variables*, or *artificial variables*. To illustrate how we can compute the first Wasserstein distance described above using the simplex method, we will also have to introduce artificial variables. Therefore, we should first eliminate redundant variables from the list of 25 variables *ξk* = *ξij*.

**Figure A8.** (**a**) The probability distribution *μ*2 for a lazy random walk starting from node 2, and (**b**) the probability distribution *μ*5 for a lazy random walk starting from node 5, on a network with five nodes. In (**a**), the probability of hopping from 2 → 2 is *μ*2(2) = *α* = 1/2, while the probability of hopping from 2 to its neighbors 1, 3, 4 is *μ*2(1) = *μ*2(3) = *μ*2(4) = 1/6. These are shown in blue within the destination nodes. To complete the probability distribution *μ*2, we also need to specify *μ*2(5). Since it is not possible for the lazy random walk from node 2 to reach node 5 in a single hop, we set *μ*2(5) = 0, and show this probability in red within the destination node. In (**b**), we continue to have *μ*5(5) = 1/2, but since node 5 has only two neighbors, we find *μ*5(3) = *μ*5(4) = 1/4, and show these in blue within the destination nodes. We also show *μ*5(1) = *μ*5(2) = 0 in red within the destination nodes.

First of all, *μ*2(5) = 0. This cannot be broken down any further for redistribution to the various nodes in *μ*5, and so we can just drop the variables *ξ*5*j*, for 1 ≤ *j* ≤ 5. Additionally, *μ*5(1) = *μ*5(2) = 0, so none of the nodes in *μ*2 can make contributions to *j* = 1, 2. Hence, we can drop the variables *ξi*1 and *ξi*2, for 1 ≤ *i* ≤ 5. After eliminating these variables, we are then left with the 12 variables

$$
\xi_{13}, \xi_{14}, \xi_{15}, \xi_{23}, \xi_{24}, \xi_{25}, \xi_{33}, \xi_{34}, \xi_{35}, \xi_{43}, \xi_{44}, \xi_{45}. \tag{A6}
$$
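The two lazy-random-walk distributions of Figure A8, and hence the surviving variables, can be generated programmatically. The sketch below assumes the edge set {1–2, 2–3, 2–4, 3–5, 4–5}, which reproduces the neighborhoods described in the caption of Figure A8 as well as the shortest-path costs appearing in (A7):

```python
import networkx as nx

def lazy_walk_distribution(G, x, alpha=0.5):
    """Lazy random walk from x: stay at x with probability alpha,
    otherwise hop to a uniformly chosen neighbor of x."""
    mu = {node: 0.0 for node in G}
    mu[x] = alpha
    for y in G.neighbors(x):
        mu[y] = (1 - alpha) / G.degree(x)
    return mu

# Five-node network assumed to be consistent with Figure A8
G = nx.Graph([(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)])
mu2 = lazy_walk_distribution(G, 2)  # mu2(2) = 1/2, mu2(1) = mu2(3) = mu2(4) = 1/6, mu2(5) = 0
mu5 = lazy_walk_distribution(G, 5)  # mu5(5) = 1/2, mu5(3) = mu5(4) = 1/4, mu5(1) = mu5(2) = 0
```

The zero entries *μ*2(5) and *μ*5(1) = *μ*5(2) = 0 are exactly the ones that let us drop *ξ*5*j*, *ξi*1, and *ξi*2 above.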

Next, in terms of these remaining variables, we write the cost function as

$$\mathcal{C} = 2\xi_{13} + 2\xi_{14} + 3\xi_{15} + \xi_{23} + \xi_{24} + 2\xi_{25} + 2\xi_{34} + \xi_{35} + 2\xi_{43} + \xi_{45}.\tag{A7}$$

We also write the seven equality constraints out explicitly as

$$
\xi_{13} + \xi_{14} + \xi_{15} = \frac{1}{6}, \tag{A8}
$$

$$
\xi_{23} + \xi_{24} + \xi_{25} = \frac{1}{2}, \tag{A9}
$$

$$
\xi_{33} + \xi_{34} + \xi_{35} = \frac{1}{6}, \tag{A10}
$$

$$
\xi_{43} + \xi_{44} + \xi_{45} = \frac{1}{6}, \tag{A11}
$$

$$
\xi_{13} + \xi_{23} + \xi_{33} + \xi_{43} = \frac{1}{4}, \tag{A12}
$$

$$
\xi_{14} + \xi_{24} + \xi_{34} + \xi_{44} = \frac{1}{4}, \tag{A13}
$$

$$
\xi_{15} + \xi_{25} + \xi_{35} + \xi_{45} = \frac{1}{2}. \tag{A14}
$$

To solve the minimization problem iteratively, we need to start from an initial feasible solution. This is not easy to guess if we limit ourselves to the 12 variables. Therefore, we introduce one artificial variable for each of the seven equality constraints, such that

$$
\xi_{13} + \xi_{14} + \xi_{15} + a_1 = \frac{1}{6}, \tag{A15}
$$

$$
\xi_{23} + \xi_{24} + \xi_{25} + a_2 = \frac{1}{2}, \tag{A16}
$$

$$
\xi_{33} + \xi_{34} + \xi_{35} + a_3 = \frac{1}{6}, \tag{A17}
$$

$$
\xi_{43} + \xi_{44} + \xi_{45} + a_4 = \frac{1}{6}, \tag{A18}
$$

$$
\xi_{13} + \xi_{23} + \xi_{33} + \xi_{43} + a_5 = \frac{1}{4}, \tag{A19}
$$

$$
\xi_{14} + \xi_{24} + \xi_{34} + \xi_{44} + a_6 = \frac{1}{4}, \tag{A20}
$$

$$
\xi_{15} + \xi_{25} + \xi_{35} + \xi_{45} + a_7 = \frac{1}{2}. \tag{A21}
$$

This allows us to set *a*1 = 1/6, *a*2 = 1/2, *a*3 = 1/6, *a*4 = 1/6, *a*5 = 1/4, *a*6 = 1/4, *a*7 = 1/2, and all the *ξ* variables to zero as our initial solution. However, as their names imply, *a*1 to *a*7 are artificial variables, so we must be able to set their values to zero at the end of our minimization. To ensure that this will happen, let us also add *M*(*a*1 + *a*2 + ··· + *a*7) to the cost function

$$\mathcal{C} = 2\xi_{13} + 2\xi_{14} + 3\xi_{15} + \xi_{23} + \xi_{24} + 2\xi_{25} + 2\xi_{34} + \xi_{35} + 2\xi_{43} + \xi_{45} + M(a_1 + a_2 + \cdots + a_7), \tag{A22}$$

where *M* is a large number. For this reason, this variant of the simplex method is called the *big M method*.
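The augmented system (A15)–(A22) translates mechanically into a starting tableau. Below is a sketch in numpy of how such a tableau might be assembled, with columns ordered as in (A6), followed by *a*1 to *a*7 and the right-hand sides; replacing the symbolic *M* by the number 10^6, and omitting the bookkeeping columns *V* and *r* of Table A3, are simplifications of our own:

```python
import numpy as np

M = 1e6  # numerical stand-in for the symbolic "big M"

# Constraint coefficients for the 12 redistribution variables,
# ordered xi_13, xi_14, xi_15, xi_23, ..., xi_43, xi_44, xi_45
A = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # (A15): row sum for source node 1
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # (A16): row sum for source node 2
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],  # (A17): row sum for source node 3
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # (A18): row sum for source node 4
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0],  # (A19): column sum for destination 3
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],  # (A20): column sum for destination 4
    [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1],  # (A21): column sum for destination 5
], dtype=float)
b = np.array([1/6, 1/2, 1/6, 1/6, 1/4, 1/4, 1/2])
c = np.array([2, 2, 3, 1, 1, 2, 0, 2, 1, 2, 0, 1], dtype=float)  # costs in (A7)

# Tableau: 7 constraint rows + 1 cost row; 12 xi columns + 7 a columns + b column
T = np.zeros((8, 20))
T[:7, :12] = A
T[:7, 12:19] = np.eye(7)  # artificial variables a_1, ..., a_7
T[:7, 19] = b
T[7, :12] = c
T[7, 12:19] = M           # big-M penalty in the cost row
```

In this layout, each *ξ* variable appears in exactly one source-sum row and one destination-sum row, while the identity block makes *a*1 to *a*7 the initial basic variables.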

After years of teaching the simplex method to undergraduates, mathematicians have learned how to present the procedure in a simple matrix/tabular form. The starting point is shown in Table A3, where the coefficients of the unknowns in the equality constraints have been organized into the first seven rows of a matrix, and the coefficients of the unknowns in the cost function organized into the eighth row. Initially, the columns *V* and *r* are not populated. To populate column *V*, we inspect the coefficient matrix to identify the *basic variables*. A basic variable is one that appears in one and only one row. At the start of the optimization, it should be clear that the basic variables are the artificial variables *a*1 to *a*7.

Starting from *a*1 = 1/6, *a*2 = 1/2, *a*3 = 1/6, *a*4 = 1/6, *a*5 = 1/4, *a*6 = 1/4, *a*7 = 1/2, and all the *ξ* variables set to zero, we want to ultimately obtain a value of *C* that does not depend on the artificial variables *a*1 to *a*7. This can be achieved by subtracting *M* times the rows where the artificial variables appear from the last row where *C* appears, to produce the table shown in Table A4. In the last row of Table A4, all the coefficients associated with the *ξ* variables are negative, so this initial solution is not yet optimal. To improve the solution, we need one of the *ξ* variables to replace one of the *a*'s as a basic variable. Looking for the most negative entry in the last row of Table A4, we find that this is −2*M*, in the columns associated with *ξ*33 and *ξ*44. We can pick either one as the *entering variable* to replace one of the *a*'s, which will be the *leaving variable*. Since we will surely pick *ξ*44 in the next iteration, let us for concreteness pick *ξ*33 as the entering variable. The final solution will not depend on the order in which we pick our entering variables. With *ξ*33 as the entering variable, we determine the leaving variable by dividing the constants *b* by the coefficients in the *ξ*33 column, where this is possible, and populate the last column *r*. Since only two coefficients in the *ξ*33 column are non-zero, we compute *r* only along these two rows, to obtain the ratios 1/6 and 1/4. Since 1/6 is the smaller of the two, we choose *a*3 to be the leaving variable.

However, for *ξ*33 to replace *a*3 as a basic variable, it must appear in one and only one row. In Table A4 it also appears in the row associated with *a*5, as well as in the last row. Therefore, we must perform elementary row operations to eliminate *ξ*33 from the *a*5 row, as well as from the last row. After this is completed, we end up with the table shown in Table A5. Here we also see that the cost function has improved from −2*M* to −5*M*/3. Since there are still negative coefficients in the last row, we know that we can continue to improve the cost function. In Table A5, the most negative coefficient is −2*M*, which appears in the column associated with *ξ*44. We skipped *ξ*44 in favor of *ξ*33 in the last iteration, so this is the right time to choose *ξ*44 as the entering variable. Thereafter, we divide the constants *b* by the coefficients in the column associated with *ξ*44, where this is possible, and look for the smallest ratio. In this case, we find only two ratios, 1/6 and 1/4, and the smaller ratio is associated with *a*4, which will therefore be our new leaving variable.

Again, for *ξ*44 to become a basic variable, we must zero its coefficient in all other rows using elementary row operations. The matrix of coefficients then becomes the one shown in Table A6. We are making progress, because the cost function has improved to −4*M*/3, and many of the coefficients associated with the *ξ*'s in the last row have become positive. To find the next entering variable, we continue to look in the last row for the most negative coefficient. This would be 1 − 2*M*, which appears in the column associated with *ξ*23, as well as the column associated with *ξ*24. As explained earlier, choosing *ξ*23 or *ξ*24 as the next entering variable will not change the solution, so let us make *ξ*23 the next entering variable. Then, from the ratios of the constants divided by the coefficients in the column associated with *ξ*23, we see that *a*5 will become the leaving variable.
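Each of the iterations described above (choose the most negative cost-row coefficient, perform the ratio test, then clear the entering column with elementary row operations) is one Gauss–Jordan pivot. The following is a minimal sketch of such a step, not an industrial-strength implementation:

```python
import numpy as np

def simplex_iteration(T):
    """One simplex step on tableau T (last row = cost coefficients,
    last column = b). Returns the updated tableau and whether a pivot occurred."""
    cost = T[-1, :-1]
    col = int(np.argmin(cost))        # entering variable: most negative coefficient
    if cost[col] >= 0:
        return T, False               # no negative coefficients left: optimal
    ratios = np.full(T.shape[0] - 1, np.inf)
    positive = T[:-1, col] > 0
    ratios[positive] = T[:-1, -1][positive] / T[:-1, col][positive]
    row = int(np.argmin(ratios))      # leaving variable: smallest ratio r
    T = T.copy()
    T[row] /= T[row, col]             # scale the pivot row
    for r in range(T.shape[0]):
        if r != row:
            T[r] -= T[r, col] * T[row]  # clear the entering column elsewhere
    return T, True
```

Iterating this on the big-M tableau (after the initial step of subtracting *M* times the constraint rows from the cost row, as in Table A4, and with a numerical stand-in for *M*) reproduces the sequence of pivots described in the text, up to ties between equally negative columns being broken differently.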





Performing elementary row operations to make *ξ*23 a proper basic variable, we obtained the table shown in Table A7. Again, the cost was improved, and more coefficients in the last row became positive. At this stage in the iterations, the most negative coefficient is 1 − 2*M*, appearing in the column associated with *ξ*24, which we postponed handling in the previous iteration. Therefore, we choose the next entering variable to be *ξ*24. Thereafter, dividing the constants by the coefficients in the column associated with *ξ*24, we find the next leaving variable to be *a*6.

After another round of elementary row operations, we obtained the table shown in Table A8. In this table, the most negative coefficient in the last row is 2 − 2*M*, appearing in the columns associated with *ξ*25 and *ξ*45. Let us choose *ξ*25 to be the entering variable. For this variable, we find the ratio 1/3 for the row associated with *a*2, and 1/2 in the row associated with *a*7. Since 1/3 < 1/2, we identify *a*2 as the leaving variable.

After elementary row operations to make *ξ*25 a basic variable, we obtained the table shown in Table A9. The steady improvement to the cost function is obvious, and there are also fewer negative coefficients in the last row. To choose the next entering variable, let us inspect those columns with 3 − 2*M* in the last row. These are associated with *ξ*13, *ξ*14, and *ξ*15. Since node 5 is the most important destination node, let us choose *ξ*15 as the next entering variable. The ratios are 1/6 associated with *a*1, and 1/6 associated with *a*7. Since they are the same, let us choose *a*1 to be our next leaving variable.

After one last round of elementary row operations, making *ξ*15 a basic variable, we obtained the table shown in Table A10. Now, none of the coefficients in the last row are negative. This means that we have found our solution, and the iterations can end. From Table A10, we can read off the basic variables as

$$\xi_{15} = \frac{1}{6}, \quad \xi_{23} = \frac{1}{12}, \quad \xi_{24} = \frac{1}{12}, \quad \xi_{25} = \frac{1}{3}, \quad \xi_{33} = \frac{1}{6}, \quad \xi_{44} = \frac{1}{6}. \tag{A23}$$

We can check that *ξ*23 + *ξ*24 + *ξ*25 = 1/12 + 1/12 + 1/3 = 1/2, satisfying the second constraint. Additionally, since *ξ*33 = 1/6, this means that *ξ*34 = *ξ*35 = 0, and *ξ*33 + *ξ*34 + *ξ*35 = 1/6 + 0 + 0 = 1/6 satisfies the third constraint. Similarly, since *ξ*44 = 1/6, we also have *ξ*43 = *ξ*45 = 0, and *ξ*43 + *ξ*44 + *ξ*45 = 0 + 1/6 + 0 = 1/6 satisfies the fourth constraint. As in the cases of the third and fourth constraints, *ξ*15 = 1/6 satisfies the first constraint by itself, meaning that *ξ*13 = *ξ*14 = 0. More importantly, *ξ*15 + *ξ*25 = 1/6 + 1/3 = 1/2 already satisfies the seventh constraint, so we must also have *ξ*35 = *ξ*45 = 0. We also have *ξ*13 + *ξ*23 + *ξ*33 + *ξ*43 = 0 + 1/12 + 1/6 + 0 = 1/4, satisfying the fifth constraint, and *ξ*14 + *ξ*24 + *ξ*34 + *ξ*44 = 0 + 1/12 + 0 + 1/6 = 1/4, satisfying the sixth constraint.

Let us observe that the above solution implies that *a*1 = *a*2 = *a*3 = *a*4 = *a*5 = *a*6 = *a*7 = 0. Let us also observe that *ξ*13, *ξ*14, *ξ*34, *ξ*35, *ξ*43, *ξ*45 did not make it onto the list of basic variables, and are thus non-basic variables. In the simplex method, non-basic variables are automatically set to zero at the end of the optimization. We also observe that we started out with seven artificial variables as basic variables, and iteratively replaced them with the redistribution variables. This means that at most seven redistribution variables can become basic, i.e., take on non-zero values at the end of the optimization. In this example, the optimization stopped after six redistribution variables became basic. The remaining six redistribution variables remained non-basic, and were thus set to zero, as dictated by the simplex method.

Finally, let us observe that after we have prepared the coefficient matrix for the constraints and the cost function, including as many auxiliary variables as we need, the steps involved in each iteration are mechanical and easy to automate. In fact, most variants of the simplex method have been implemented in MATLAB, Python SciPy, and R as functions that users can call with minimal preparation.
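For instance, the 12-variable problem above can be handed directly to scipy.optimize.linprog. This is a sketch of the call, with the variables ordered as in (A6):

```python
from scipy.optimize import linprog

# Costs from (A7), ordered xi_13, xi_14, xi_15, xi_23, ..., xi_43, xi_44, xi_45
c = [2, 2, 3, 1, 1, 2, 0, 2, 1, 2, 0, 1]

# Equality constraints (A8)-(A14)
A_eq = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # (A8)
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # (A9)
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],  # (A10)
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # (A11)
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0],  # (A12)
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],  # (A13)
    [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1],  # (A14)
]
b_eq = [1/6, 1/2, 1/6, 1/6, 1/4, 1/4, 1/2]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
W1 = res.fun  # minimum cost, i.e., the first Wasserstein distance W1(mu_2, mu_5)
```

The reported minimum agrees with the cost of the basic solution (A23): 3 × 1/6 + 1 × 1/12 + 1 × 1/12 + 2 × 1/3 = 4/3, the zero-cost variables *ξ*33 and *ξ*44 contributing nothing.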




#### **F. Ollivier-Ricci Curvature Analysis of a Toy Model Sequence of Fusion**

In Section 4.3, we saw that the Betti numbers—being topological quantities—are not able to distinguish between parts of some models. There is, therefore, the need to go beyond TDA. Most notably, (A3) and (A4) have the same Betti numbers, but whereas the curvature is positive everywhere in (A4), the neck region in (A3) has negative curvature. In this subsection, we will use model (A) to illustrate the power of the ORC.

To compute the ORC for the different phases of model (A), we created wire meshes for the surfaces involved, from (A1) to (A4) (see Figure 9), as instances of the Graph class from networkx. We then installed GraphRicciCurvature and all the Python packages it depends on, before creating instances of the OllivierRicci class using the networkx graphs as inputs, with *α* = 0.5 for the self-transition parameter. Finally, we called the compute\_ricci\_curvature() method of the OllivierRicci class to compute the ORCs of all edges in the graphs.
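On small graphs, the library's output can be cross-checked against a from-scratch computation that chains together the lazy-walk distributions and the Wasserstein linear program developed earlier. The sketch below is our own illustration, not the GraphRicciCurvature implementation, and the helper name ollivier_ricci is hypothetical:

```python
import networkx as nx
from scipy.optimize import linprog

def ollivier_ricci(G, u, v, alpha=0.5):
    """ORC of the pair (u, v): kappa = 1 - W1(mu_u, mu_v) / d(u, v),
    where mu_x is the lazy random walk distribution started at x."""
    nodes = list(G)
    n = len(nodes)
    dist = dict(nx.all_pairs_shortest_path_length(G))

    def mu(x):
        m = {y: 0.0 for y in nodes}
        m[x] = alpha
        for y in G.neighbors(x):
            m[y] = (1 - alpha) / G.degree(x)
        return m

    mu_u, mu_v = mu(u), mu(v)
    # Ground costs: shortest-path distances, flattened row-major
    c = [dist[a][b] for a in nodes for b in nodes]
    A_eq, b_eq = [], []
    for i, a in enumerate(nodes):            # row sums: mass leaving node a
        row = [0.0] * (n * n)
        row[i * n:(i + 1) * n] = [1.0] * n
        A_eq.append(row)
        b_eq.append(mu_u[a])
    for j, b in enumerate(nodes):            # column sums: mass arriving at node b
        col = [0.0] * (n * n)
        col[j::n] = [1.0] * n
        A_eq.append(col)
        b_eq.append(mu_v[b])
    W1 = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun
    return 1.0 - W1 / dist[u][v]
```

On the five-node network of Figure A8, this gives κ(3, 5) = 1 − (1/2)/1 = 1/2 for the edge (3, 5), and κ(2, 5) = 1 − (4/3)/2 = 1/3 for the non-adjacent pair (2, 5), whose W1 = 4/3 is exactly the value obtained by the simplex method above.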

As expected, the curvature ranged from slightly negative to slightly positive to significantly positive in (A1). Negative curvatures started appearing in (A2), at the Dirac point, and intensified in (A3). Finally, we have the curvature again ranging from slightly negative to slightly positive to significantly positive in (A4), thereby allowing us to distinguish between (A2), (A3), and (A4). The ORC changed most rapidly between (A2) and (A3).

**Figure 9.** The sequence of topological changes going from (**a**) two disjoint spheres, to (**b**) them touching at a point to form a Dirac cone, to (**c**) them fusing to form a smoothed neck, to (**d**) eventually rounding off to an ellipsoid with no neck. In these figures, the links are colored according to their Ollivier-Ricci curvatures, standardized across the four scenarios from ORC = −0.5 (deep red) to ORC ≈ 0 (green (slightly negative) and yellow (slightly positive)) to ORC = +0.5 (deep blue).
