Algorithmic Perspectives of Network Transitive Reduction Problems and their Applications to Synthesis and Analysis of Biological Networks

Aditya, Satabdi; DasGupta, Bhaskar; Karpinski, Marek

doi:10.3390/biology3010001

by Satabdi Aditya ¹, Bhaskar DasGupta ¹,* and Marek Karpinski ²

¹ Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
² Department of Computer Science, University of Bonn, Bonn 53113, Germany
* Author to whom correspondence should be addressed.

Biology 2014, 3(1), 1-21; https://doi.org/10.3390/biology3010001

Submission received: 19 July 2013 / Revised: 11 November 2013 / Accepted: 9 December 2013 / Published: 19 December 2013

(This article belongs to the Special Issue Developments in Bioinformatic Algorithms)

Abstract

In this survey paper, we present a number of core algorithmic questions concerning several transitive reduction problems on networks that have applications in network synthesis and analysis involving cellular processes. Our starting point is the so-called minimum equivalent digraph problem, a classic computational problem in combinatorial algorithms. We subsequently consider a few non-trivial extensions or generalizations of this problem motivated by applications in systems biology. We then discuss the applications of these algorithmic methodologies in the context of three major biological research questions: synthesizing and simplifying signal transduction networks, analyzing disease networks, and measuring redundancy of biological networks.

Keywords: transitive reduction; minimum equivalent digraph; network synthesis; disease networks; combinatorial algorithms

1. Introduction

In this survey paper, we review several transitive reduction problems on networks that have applications in network synthesis and analysis involving cellular processes. Problems of this type, which involve formal frameworks of a very similar combinatorial nature, have been investigated by two independent communities of researchers: the theoretical computer science and computer networking community, and the biological network community. However, the published literature suggests that there is minimal cooperation between these groups. The purpose of this survey is to promote a constructive dialogue between these two research communities so that intrigued biologists may probe further and learn new techniques from the perspective of formal analysis of algorithms, and intrigued computer scientists may probe further to learn new terminologies and applications in biology. Following the general guidelines of this special issue, we first present the formal algorithmic ideas separately from their applications and subsequently discuss the applications that involve these formal frameworks.

Minimum equivalent digraph is a classical computational problem (cf. [1]) with several recent extensions motivated by applications in social sciences and systems biology. A formal definition of the basic equivalent digraph problem is as follows.

Problem name: Minimum equivalent digraph (Min-Ed)
Input: a directed graph (digraph) G = (V, E).
Definition: for a digraph (V, E) the transitive closure of E is the relation on V × V defined as E⁺ = {(u, v): ∃ path using edges in E from u to v}.
Valid solution: A ⊆ E such that A⁺ is equal to E⁺.
Objective: minimize |A|.
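
To make the definition concrete, the following minimal Python sketch computes the transitive closure by a graph search and then solves Min-Ed by exhaustive search over edge subsets. It is only an illustration for very small inputs (the search is exponential in |E|); the function names and the example graph are hypothetical and do not come from the cited literature.

```python
from itertools import combinations

def transitive_closure(nodes, edges):
    """Set of ordered pairs (u, v) such that a path of one or more edges leads from u to v."""
    succ = {u: set() for u in nodes}
    for u, v in edges:
        succ[u].add(v)
    closure = set()
    for source in nodes:
        stack, seen = list(succ[source]), set()
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(succ[x])
        closure.update((source, target) for target in seen)
    return closure

def brute_force_min_ed(nodes, edges):
    """Smallest A ⊆ E whose transitive closure equals that of E (exponential time)."""
    edges = list(edges)
    target = transitive_closure(nodes, edges)
    for size in range(len(edges) + 1):          # try smaller subsets first
        for subset in combinations(edges, size):
            if transitive_closure(nodes, subset) == target:
                return set(subset)

# Hypothetical example: a 3-cycle with a chord and two edges into node 4.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 1), (1, 3), (3, 4), (2, 4)]
solution = brute_force_min_ed(nodes, edges)
```

In this example the chord (1, 3) and one of the two edges entering node 4 are redundant, so an optimal solution keeps four of the six edges.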

A complementary problem is the Max-Ed problem whose objective is to maximize |E\A|. Even though the complexity of finding an exact solution is the same for both Min-Ed and Max-Ed, the same may not necessarily be true for their approximate solutions (in the same manner as for node cover and independent set problems for general graphs [2]). For example, suppose that we have a graph with 1,000 edges and an exact solution for Min-Ed and Max-Ed with 490 edges. Suppose that an approximation algorithm for Min-Ed guarantees that we will find a solution with at most 980 edges. Thus, this approximation algorithm provides an approximation ratio of 980/490 = 2 for Min-Ed. However, the same algorithm for Max-Ed can have an approximation ratio as large as

$$\frac{1000 - 490}{1000 - 980} = \frac{510}{20} = 25.5 \qquad (1)$$

Skipping the condition A ⊆ E in the definition of Min-Ed (or Max-Ed) yields the so-called transitive reduction (Tr) problem which was solved in polynomial-time by Aho, Garey and Ullman [3]. See Figure 1 for an illustration of valid solutions of Min-Ed.

1.1. Three Extensions of the Basic Version

In this subsection, we discuss three non-trivial extensions of the basic problem that have been formulated based on their applications. We will review the applications of the basic version and of these extensions in more detail in Section 4.

Figure 1. Illustrations of two valid solutions of Min-Ed on an input graph: (a) The original graph G = (V, E); (b, c) Two valid solutions (V, A₁) and (V, A₂) of Min-Ed for G. The solution in (c) is optimal since it has fewer edges.

1.1.1. Min-Ed and Max-Ed with Critical Edges

This extension is the same as Min-Ed or Max-Ed except that a given subset D of edges must be present in any valid solution. Formally, we are given D ⊆ E as part of input and the condition “A ⊆ E” is changed to D ⊆ A ⊆ E. Let us denote this version as critical-Min-Ed and critical-Max-Ed, as appropriate. As we will see subsequently, this extension is quite non-trivial if one desires a good approximate solution.

1.1.2. Weighted Version of Min-Ed or Max-Ed

In this version, each edge has a weight (positive real number) and an optimal valid solution must have the minimum possible value of total edge weights. Formally, we have a weight function w:E → ℜ⁺ and the goal is either to minimize Σ_{e∈A} w(e) or to maximize Σ_{e∈E} w(e) − Σ_{e∈A} w(e). Let us denote this version as weighted-Min-Ed or weighted-Max-Ed, as appropriate. Obviously, the basic version is a special case of this weighted version when every edge weight is 1.

1.1.3. Binary Transitive Reduction (Btr)

This extension is a generalization of the basic version with critical edges and is described as follows [4,5,6,7]. We have an edge-labeling function ℓ:E → {−1, 1}. The label or parity of a path P = (u₀, u₁, …, u_k) is derived from the labels of its edges and given by ℓ(P) = Π_{i=1}^{k} ℓ(u_{i−1}, u_i). The transitive closure relation is now generalized as E⁺ = {(u_i, u_j, q): ∃ path P using edges in E from u_i to u_j and ℓ(P) = q}. Then, A is a binary transitive reduction of E with a required subset D if D ⊆ A ⊆ E and A⁺ = E⁺. Obviously, the basic version with critical edges is a special case of Btr when every edge label is 1. There are two (minimization and maximization) objective functions corresponding to the two generalizations of the basic versions Min-Ed and Max-Ed; they will be denoted by Min-Btr and Max-Btr, respectively. We will use the notation u_i ⇒^p u_j to indicate a path from node u_i to node u_j of parity p ∈ {−1, 1}.
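
As an illustration of the generalized closure, the sketch below searches over (node, parity) states and returns all triples (u_i, u_j, q) such that some path from u_i to u_j has parity q. It is a minimal sketch under the assumption that paths may revisit nodes (i.e., walks are allowed); the function name and the small example network are hypothetical.

```python
def signed_closure(nodes, labeled_edges):
    """Triples (u, v, q): some path (nodes may repeat) from u to v has parity q in {-1, 1}.

    labeled_edges maps each edge (u, v) to its label, either 1 or -1.
    """
    succ = {u: [] for u in nodes}
    for (u, v), sign in labeled_edges.items():
        succ[u].append((v, sign))
    closure = set()
    for source in nodes:
        seen = set()
        stack = list(succ[source])            # states (node, parity of path so far)
        while stack:
            x, parity = stack.pop()
            if (x, parity) in seen:
                continue
            seen.add((x, parity))
            for y, sign in succ[x]:
                stack.append((y, parity * sign))
        closure.update((source, x, parity) for x, parity in seen)
    return closure

# Hypothetical example: a inhibits c directly, and also via a -> b -| c.
full = {("a", "b"): 1, ("b", "c"): -1, ("a", "c"): -1}
reduced = {("a", "b"): 1, ("b", "c"): -1}
nodes = ["a", "b", "c"]
redundant = signed_closure(nodes, full) == signed_closure(nodes, reduced)
```

Here the direct edge from a to c is redundant because the two-edge path has the same parity; if its label were 1 instead, removing it would change the closure.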

The relationships between various versions of the basic equivalent digraph problem are as follows:

Min-Ed < Weighted-Min-Ed

Max-Ed < Weighted-Max-Ed

Min-Ed < critical-Min-Ed < Min-Btr

Max-Ed < critical-Max-Ed < Max-Btr

where A < B means problem A is a special case of problem B. The relationships between the problem weighted-Min-Ed and the problems critical-Min-Ed and Min-Btr (and, similarly, between the problem weighted-Max-Ed and the problems critical-Max-Ed and Max-Btr) are not completely known, though it is possible to design approximation algorithms for critical-Min-Ed and Min-Btr based on approximation algorithms for weighted-Min-Ed.

We review the following standard definitions in approximation algorithms theory. An ε-approximate solution (or simply an ε-approximation) of a minimization (respectively, maximization) problem is a solution, computable in polynomial time, with an objective value no larger than ε times (respectively, no smaller than 1/ε times) the value of the optimum; an algorithm of performance or approximation ratio ε produces an ε-approximate solution. A problem is APX-hard if there exists an ε > 1 such that no polynomial-time algorithm has an approximation ratio of ε unless P = NP. The notation OPT(G) (or simply OPT when G is clear from the context) will always denote the objective value of an optimal solution for the problem under consideration. We assume that the reader is familiar with the basic concepts of design and analysis of algorithms found in graduate level algorithms textbooks such as [2,8], and basic concepts of computational biology found in standard textbooks such as [9,10].

2. Summary of Known Algorithmic and Inapproximability Results

In this section, we briefly review known algorithmic and inapproximability results for the various equivalent digraph and transitive reduction problems defined in the previous section, deferring a more detailed description of the algorithmic techniques used to obtain these results to the next section.

The algorithmic research work on Min-Ed was initiated by Moyles and Thompson [1] who described an efficient polynomial-time reduction of this problem for an arbitrary graph to that for a strongly connected graph, followed by an exact but exponential time algorithm for strongly connected graphs. Subsequently, an approximation algorithm for Min-Ed was detailed by Khuller, Raghavachari and Young [11] with an approximation ratio of approximately 1.617 + ε (for any constant ε > 0), which was improved to an approximation algorithm with an approximation ratio of 3/2 independently by Vetta [12] and by Berman, DasGupta and Karpinski [13]. Except for [13], none of these approximation algorithms generalizes directly to critical-Min-Ed with the same approximation ratio. The only non-trivial approximation algorithm known for either Max-Ed or critical-Max-Ed is a 2-approximation algorithm described in [13].

For weighted-Min-Ed, Frederickson and JáJá [14] designed a 2-approximation algorithm using an algorithm for minimum cost rooted arborescence due to Edmonds [15] and Karp [16]. Basically, it suffices to find a minimum cost in-arborescence and a minimum cost out-arborescence with respect to an arbitrary root node v ∈ V and take the union of all the edges in these two arborescences as the approximate solution.

Albert et al. [4] showed how to convert any algorithm for Min-Ed with an approximation ratio ρ to an algorithm for critical-Min-Ed with an approximation ratio of Biology 03 00001 i010

. They also provided a 2-approximation for Min-Btr, but in fact, minor modification of their method and analysis as outlined in [13] yields a Biology 03 00001 i011

-approximation. Other heuristics for these problems were investigated in [5,6] but none of these heuristics guarantee a better approximation ratio. Table 1 shows a theoretical comparison of running times and approximation ratios of some of the known algorithms for the transitive reduction problems. Unfortunately, a systematic comparative empirical evaluation of these algorithmic approaches is not available in the published literature. However, implementations of several individual algorithmic approaches are available. For example, Kachalo et al. [6] provided a software tool called NET-SYNTHESIS which uses some of the algorithmic approaches described in Section 3.2 and Section 3.4, and Milanović et al. [17] discussed two meta-heuristic approaches to solve a more general version of the Min-Btr problem.

On the inapproximability side, Papadimitriou [18] left it as an exercise to show that Min-Ed is NP-hard. Subsequently, Khuller, Raghavachari and Young [11] provided a formal proof of both NP-hardness and APX-hardness of Min-Ed for arbitrary graphs. Motivated by their cycle contraction method in [11], they were interested in the complexity of the problem when there is an upper bound γ on the length of any cycle in the input graph. In [19] the authors showed that Min-Ed can be solved in polynomial time if γ = 3, Min-Ed is NP-hard if γ = 5, and Min-Ed is APX-hard if γ ≥ 17. Reference [13] improved the APX-hardness result to show that both Min-Ed and Max-Ed are APX-hard even when γ ≥ 5. The exact complexity of both Min-Ed and Max-Ed when γ = 4 is still unresolved.

**Table 1.** Theoretical comparison of worst-case performance of some of the algorithms for the transitive reduction problems.

| Problem name | Algorithmic approach | Worst-case running time using straightforward implementation | Approximation ratio |
|---|---|---|---|
| Min-Ed | Khuller, Raghavachari and Young [11] | O(n^{1/ε}) | 1.617 + ε² |
| Min-Ed | Vetta [12]; Berman, DasGupta and Karpinski [13] | O(n log n) | 1.5 |
| Max-Ed | Berman, DasGupta and Karpinski [13] | O(n log n) | 2 |
| critical-Min-Ed | Khuller, Raghavachari and Young [11] | O(n^{1/ε}) | 2.617 + ε² |
| critical-Min-Ed | Berman, DasGupta and Karpinski [13] | O(n log n) | 1.5 |
| critical-Min-Ed | Frederickson and JáJá [14] | O(n) | 2 |
| critical-Min-Ed | Albert et al. [4] | O(n³) | |
| critical-Max-Ed | Berman, DasGupta and Karpinski [13] | O(n log n) | 2 |
| weighted-Min-Ed | Frederickson and JáJá [14] | O(n) | 2 |
| Min-Btr | Albert et al. [4] | O(n³) | 2 |
| Min-Btr | Berman, DasGupta and Karpinski [13] | O(n log n) | |
| Max-Btr | Berman, DasGupta and Karpinski [13] | O(n log n) | 2 |

3. Review of a Few Algorithmic Techniques Used for Transitive Reduction Problems

In this section, we review a few key algorithmic techniques that have been used in the literature to investigate the algorithmic complexity of various versions of the transitive reduction problem. Our goal is not to provide every technical detail of these methods, but rather to bring out the salient features of these techniques in a way that may be understood by practitioners as well.

3.1. From General Graphs to Strongly Connected Graphs

Recall that a digraph (V, E) is strongly connected if and only if, for every pair of nodes u_i and u_j, both the paths u_i ⇒ u_j and u_j ⇒ u_i exist. A reduction that was originally suggested in [1] and has been implicit in all subsequent works is that an ε-approximation algorithm for critical-Min-Ed and critical-Max-Ed on strongly connected graphs also implies an ε-approximation algorithm for the same problem on arbitrary digraphs. To understand why this is true, we first note that these problems can be solved easily in polynomial time using the following greedy approach if the input graph G = (V, E) is a directed acyclic graph (Dag) with D ⊆ E as the set of required edges (ϕ is the standard mathematical symbol of an empty set):

Compute a topological ordering u₁, u₂, …, u_n of the nodes of G (* thus, if (u_i, u_j) ∈ E then i < j *)

E’ = E ; A = ϕ
for i = n, n − 1, n − 2, …, 1 do
- for j = n, n − 1, n − 2, …, i + 1 do
  - if (u_i, u_j) ∈ E then
    if (u_i, u_j) ∈ D then add the edge (u_i, u_j) to A
    else if the path u_i ⇒ u_j does not exist in E’ \ {(u_i, u_j)} then add the edge (u_i, u_j) to A
Return (V, A) as the solution
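
The greedy idea for acyclic inputs can be sketched in Python as follows. This is a minimal sketch rather than a transcription of the pseudocode above: instead of walking the nodes in topological order, it tests each edge independently and keeps it if it is required or if no alternative path connects its endpoints, which yields the same edge set on a Dag. The function name and the example are hypothetical, and the input is assumed to be acyclic.

```python
def dag_critical_reduction(nodes, edges, required=frozenset()):
    """Keep an edge of an acyclic digraph iff it is required or has no detour."""
    succ = {u: set() for u in nodes}
    for u, v in edges:
        succ[u].add(v)

    def has_detour(u, v):
        # Is v reachable from u by a path whose first edge is not (u, v)?
        # In an acyclic graph such a path can never use the edge (u, v) later on.
        stack = [x for x in succ[u] if x != v]
        seen = set(stack)
        while stack:
            x = stack.pop()
            for y in succ[x]:
                if y == v:
                    return True
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return False

    return {(u, v) for u, v in edges
            if (u, v) in required or not has_detour(u, v)}

# Hypothetical example: a -> b -> c -> d with shortcut edges a -> c and a -> d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("a", "d")]
plain = dag_critical_reduction(nodes, edges)
with_required = dag_critical_reduction(nodes, edges, required={("a", "d")})
```

Without required edges both shortcuts are dropped; declaring (a, d) as required forces it back into the solution even though it is implied by the path through b and c.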

It is easy to implement the above algorithm to run in polynomial time. Now, suppose that the input graph G is not a DAG and consider the strong component graph G’ = (V’, E’) of G:

V’ = {C|C is a strongly connected component of G}

E’ = {(C, C’)|C, C’ ∈ V’, C ≠ C’ and (u_i, u_j) ∈ E for some u_i ∈ C and u_j ∈ C’}

It is easy to see that G’ is a DAG and can be found in O(|V| + |E|) time [8]. Let A’ be the solution of our problem on G’. Suppose that we have an ε-approximation algorithm for critical-Min-Ed or critical-Max-Ed on each strongly connected component of G. Then, the union of the edges in this ε-approximation for every strongly connected component of G together with the edges in A’ provides an ε-approximation for the entire graph G.
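
The strong component graph can be computed with any linear-time strongly connected components algorithm. The sketch below uses a two-pass (Kosaraju-style) depth-first search, chosen here only for brevity; it is recursive and therefore suitable for small examples only. The function name and the choice of one member node as the label of each component are assumptions made for illustration.

```python
def strong_component_graph(nodes, edges):
    """Return (comp, comp_nodes, comp_edges) for the strong component graph."""
    succ = {u: [] for u in nodes}
    pred = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    order, seen = [], set()

    def visit(u):                      # first pass: record finishing order
        seen.add(u)
        for v in succ[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in nodes:
        if u not in seen:
            visit(u)

    comp = {}

    def assign(u, label):              # second pass: search the reversed graph
        comp[u] = label
        for v in pred[u]:
            if v not in comp:
                assign(v, label)

    for u in reversed(order):
        if u not in comp:
            assign(u, u)               # a component is labeled by one of its nodes

    comp_nodes = set(comp.values())
    comp_edges = {(comp[u], comp[v]) for u, v in edges if comp[u] != comp[v]}
    return comp, comp_nodes, comp_edges

# Hypothetical example: a 3-cycle feeding into a 2-cycle.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)]
comp, comp_nodes, comp_edges = strong_component_graph(nodes, edges)
```

The resulting component graph has two nodes and a single edge, and it is acyclic by construction.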

For Min-Btr or Max-Btr, Albert et al. [4] provide a more complex reduction to show that an ε-approximation algorithm for strongly connected graphs also implies an ε-approximation algorithm for arbitrary digraphs. To achieve this, each strongly connected component is replaced by a graph with constantly many edges and nodes (called a “gadget” in [4]) and these graphs are then connected appropriately such that the resulting graph is a Dag and an ε-approximation for the entire graph can be recovered using an exact optimal solution of the Dag and ε-approximations of the strongly connected components.

Thus, for the remainder of this section, we assume without loss of generality that the input graph G is strongly connected.

3.2. The Cycle Contraction Method [11]

Consider an input graph G = (V, E) for the Min-Ed problem and suppose that G has a directed Hamiltonian cycle, i.e., a (directed) cycle that contains every node exactly once. Then clearly the edges in this cycle constitute an optimal solution of |V| edges. This intuition suggests a general strategy of repeatedly finding a longest cycle in the given graph, selecting the edges in this cycle and modifying the graph to reflect the selection of edges until we reach a valid solution.

However, finding a directed Hamiltonian cycle or the longest cycle is in general NP-hard [2]. To circumvent the NP-hardness issue, Khuller, Raghavachari and Young in [11] designed the following “cycle contraction” approach. Contraction of an edge (v_i, v_j) is nothing but the act of merging the two nodes v_i and v_j into a new single node v_ij and deleting any resulting self-loops or multi-edges. Similarly, contraction of a cycle is defined as the contraction of every edge of the cycle; see Figure 2 for an illustration. Note that if c is a constant then one can easily check in polynomial time if a graph has a cycle of at least c edges. The algorithm, parameterized by a constant c > 3 to be chosen by the user, now proceeds as follows:

for i = c, c − 1, … ,4 do
- while (the graph contains a cycle of at least i edges) do
  - Find a cycle C of at least i edges
  - Select the edges in C and contract C
- endwhile
endfor
(* now the graph contains no cycle of more than 3 edges *)

Solve Min-Ed on the reduced graph exactly using the algorithm in [19] and select the edges in this exact solution.
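
The contraction step itself is simple to express in code. The sketch below merges all nodes of a given cycle into one new node and drops the resulting self-loops and parallel edges; finding the cycle and the final exact step are not shown. Representing the merged node by the frozenset of its members is an assumption made here for illustration.

```python
def contract_cycle(nodes, edges, cycle):
    """Merge the nodes on `cycle` into one node; drop self-loops and multi-edges."""
    merged = frozenset(cycle)          # the new node is named by its member set

    def rename(u):
        return merged if u in merged else u

    new_nodes = {rename(u) for u in nodes}
    # Using a set removes parallel edges; the filter removes self-loops.
    new_edges = {(rename(u), rename(v)) for u, v in edges
                 if rename(u) != rename(v)}
    return new_nodes, new_edges

# Hypothetical example: a 6-cycle with a chord and one outside node 7.
nodes = [1, 2, 3, 4, 5, 6, 7]
cycle = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (2, 5), (7, 1), (7, 3), (4, 7)]
new_nodes, new_edges = contract_cycle(nodes, edges, cycle)
```

After contraction the example has two nodes joined by one edge in each direction: the two edges from node 7 into the cycle collapse into a single edge.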

Figure 2. Illustration of a cycle contraction: (a) shows the original graph and (b) shows the graph after the cycle u₁, u₂, u₃, u₄, u₅, u₆, u₁ has been contracted.

It was shown in [11] that the above algorithm for Min-Ed returns a valid solution containing y edges where y ≤ (1.617 + ε) OPT.

The above approach can also be applied to critical-Min-Ed by simply adding all the edges from the required set of edges D to the solution. The number of edges z in the resulting solution of critical-Min-Ed satisfies z ≤ |D| + y ≤ (2.617 + ε) OPT since obviously |D| ≤ OPT. Another possibility outlined in [4] is to replace every required edge (u_i, u_j) ∈ D by introducing a new node u_ij and adding two new edges (u_i, u_ij) and (u_ij, u_j), running the approximation algorithm for Min-Ed on this new graph, and then replacing the edges (u_i, u_ij) and (u_ij, u_j) in the solution by the original edge (u_i, u_j). If an optimal solution of critical-Min-Ed on G uses β edges from E\D then this approach returns a solution (V,A) with Biology 03 00001 i017

.

3.3. The Arborescence Approach [14]

A (rooted) spanning out-arborescence of a directed edge-weighted graph G = (V, E) is a directed acyclic spanning sub-graph (V, A) of G such that every node except one node (the root) has exactly one incoming edge; the weight of such an out-arborescence is the sum of the weights of its edges. A spanning in-arborescence is defined analogously except that every node except the root has exactly one outgoing edge. An exact polynomial-time algorithm for computing a spanning in-arborescence or spanning out-arborescence of minimum weight was provided in [15,16,20]. An overview of this algorithm for computing a minimum weight out-arborescence rooted at a node v (as formulated in [16]) is as follows. We first remove all incoming edges to the root v. Then, we select for each node, except the root v, an incoming edge of minimum weight. If these edges do not give a spanning arborescence, then there must be a (directed) cycle C formed by a subset of these edges. Let w(C) = min {w(e)|e ∈ C}. We contract the cycle C to a “mega”-node, and decrease the weight of every edge (x, y) from a node x ∉ C to a node y ∈ C by α − w(C), where α is the weight of the unique edge in C that is incoming to y. The process is then repeated on the reduced graph, and continued until we have a spanning arborescence on the remaining graph. The mega-nodes are then expanded in the reverse order. Each time a mega-node is expanded, exactly one of its edges that would produce two incoming edges to a node is discarded. A minimum weight in-arborescence can be computed by the same algorithm if we reverse the direction of all the edges of the input graph. See Figure 3 for an illustration.
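
The contract-and-reweight loop can be sketched compactly if only the total weight of a minimum out-arborescence is needed. The sketch below is a common textbook variant, not the exact formulation of [16]: it adds the weight of every selected incoming edge to a running total and lowers each edge weight by the weight α of the selected edge entering its head, rather than by α − w(C), and it does not recover the arborescence edges by expanding mega-nodes. Nodes are assumed to be numbered 0, …, n − 1 and edges are given as hypothetical (tail, head, weight) triples.

```python
def min_out_arborescence_weight(n, root, edges):
    """Total weight of a minimum spanning out-arborescence rooted at `root`.

    Returns None if some node cannot be reached from the root.
    """
    INF = float("inf")
    total = 0
    while True:
        # Select a cheapest incoming edge for every node except the root.
        best_in = [INF] * n
        parent = [-1] * n
        for u, v, w in edges:
            if u != v and v != root and w < best_in[v]:
                best_in[v], parent[v] = w, u
        if any(best_in[v] == INF for v in range(n) if v != root):
            return None
        best_in[root] = 0

        # Detect cycles among the selected edges and number them as mega-nodes.
        comp = [-1] * n
        mark = [-1] * n
        count = 0
        for v in range(n):
            total += best_in[v]
            x = v
            while x != root and mark[x] != v and comp[x] == -1:
                mark[x] = v
                x = parent[x]
            if x != root and comp[x] == -1:      # walked back onto this walk: a cycle
                y = parent[x]
                while y != x:
                    comp[y] = count
                    y = parent[y]
                comp[x] = count
                count += 1
        if count == 0:
            return total                         # selected edges form an arborescence

        # Contract: remaining nodes become singleton components, edges are reweighted.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [(comp[u], comp[v], w - best_in[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = count, comp[root]

# Hypothetical example in which the cheapest incoming edges form the cycle 1 <-> 2.
weight = min_out_arborescence_weight(
    3, 0, [(0, 1, 10), (1, 2, 1), (2, 1, 1), (0, 2, 10)])
```

In the example the cycle between nodes 1 and 2 must be broken by one of the expensive edges leaving the root, giving a total weight of 11.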

For weighted-Min-Ed, Frederickson and JáJá [14] proposed the following simple algorithm that gives a 2-approximation for an input graph G = (V, E):

Select an arbitrary node v of G
Find a minimum weight spanning in-arborescence (V, A₁) of G rooted at v
Find a minimum weight spanning out-arborescence (V, A₂) of G rooted at v
Return (V, A₁ ∪ A₂) as the solution
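
For the special case in which every edge has weight 1, every spanning arborescence has exactly |V| − 1 edges, so breadth-first search trees can stand in for minimum weight arborescences. The sketch below uses this simplification (it is not the weighted algorithm of [14], which needs a minimum weight arborescence routine) and assumes a strongly connected input; the function names and example are hypothetical.

```python
from collections import deque

def bfs_tree_edges(nodes, adjacency, root):
    """Edges (parent, child) of a breadth-first search tree that spans all nodes."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if len(parent) != len(nodes):
        raise ValueError("input graph is not strongly connected")
    return {(parent[v], v) for v in parent if parent[v] is not None}

def two_approx_unit_weight(nodes, edges, root):
    """Union of an out-arborescence and an in-arborescence rooted at `root`."""
    succ = {u: [] for u in nodes}
    pred = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    out_tree = bfs_tree_edges(nodes, succ, root)
    # A search tree in the reversed graph, flipped back, is an in-arborescence.
    in_tree = {(child, par) for par, child in bfs_tree_edges(nodes, pred, root)}
    return out_tree | in_tree

# Hypothetical example: a 4-cycle with a pair of opposite chords.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (2, 0)]
solution = two_approx_unit_weight(nodes, edges, root=0)
```

The returned edge set has at most 2(|V| − 1) edges, while any strongly connected spanning subgraph needs at least |V| edges, which gives the factor of 2 in this unit-weight setting.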

Figure 3. An illustration of the algorithm to compute a minimum weight spanning out-arborescence. The thick black edges at the final fourth step are the edges in the solution.

The above solution is a valid solution since we can reach any node v_j starting from any node v_i by taking a path from v_i to the root v followed by a path from v to the node v_j. The solution is a 2-approximation since any valid solution of weighted-Min-Ed contains both a spanning in-arborescence and a spanning out-arborescence rooted at v, and thus OPT(G) ≥ max {w(A₁), w(A₂)}, where w(A_i) denotes the total weight of the edges in A_i. A simple example of an input graph was also provided in [14] for which the above algorithm provides a solution of total weight 2OPT(G).

For critical-Min-Ed, a very similar approach as described below can be used to again provide a 2-approximation for an input graph G = (V, E):

Define the weight w(e) of an edge e ∈ E as w(e) = 0 if e ∈ D and w(e) = 1 otherwise
Select an arbitrary node v_r of G
Find a minimum weight spanning in-arborescence T = (V, A₁) of G rooted at node v_r
Redefine the weight w(e) of an edge e ∈ E as w(e) = 0 if e ∈ D ∪ A₁ and w(e) = 1 otherwise
Find a minimum weight spanning out-arborescence T = (V, A₂) of G rooted at node v_r
Return (V, A₁ ∪ A₂ ∪ D) as the solution

Albert et al. [4] showed how to modify the above algorithm and combine it with any ρ-approximation algorithm for Min-Ed to obtain an improved algorithm for critical-Min-Ed with an approximation ratio of Biology 03 00001 i010

. Currently, the best possible value of ρ is 1.5 which leads to a Biology 03 00001 i011

-approximation for critical-Min-Ed using this approach.

3.4. From Critical-Min-Ed And Critical-Max-Ed To Min-Btr And Max-Btr [4,13]

The results in [4,13] show how to transform a solution to critical-Min-Ed (respectively, critical-Max-Ed) into a solution to Min-Btr (respectively, Max-Btr) by adding a single edge that can be found in polynomial time (we remind the reader that we assume that the input graph is strongly connected). The idea behind this is as follows. We can distinguish our input (and strongly connected) graph G based on whether G = (V, E) has a cycle of parity −1 (double parity graph) or not (single parity graph). Whether G is a single or double parity graph can be easily checked in O(|V|³) time by using a simple modification of the well-known Floyd-Warshall transitive closure algorithm [8] as outlined in [4]. Now we can observe the following:

If G is a single parity graph then for every pair of nodes u_i, u_j ∈ V, exactly one of the two paths u_i ⇒^1 u_j and u_i ⇒^−1 u_j exists. Then, we can simply ignore the edge labels and compute a solution (V, A) of critical-Min-Ed (respectively, critical-Max-Ed) on G. It can be seen that (V, A) also provides a valid solution for Min-Btr (respectively, Max-Btr).
Otherwise, G is a double parity graph. We again first ignore the edge labels and compute a solution (V, A) of critical-Min-Ed (respectively, critical-Max-Ed) on G. Note that (V, A) contains a rooted arborescence, say (V, A₁) with A₁ ⊆ A, rooted at some node u_r. We label each node u_i ∈ V with ℓ(u_i) = ℓ(P_i) where P_i is the unique path in (V, A₁) from u_r to u_i. Since G is a double parity graph, there must exist an edge (u_i, u_j) ∈ E such that ℓ(u_i) ℓ(u_j) ≠ ℓ(u_i, u_j), and adding this edge (if not already present) to A produces a valid solution of Min-Btr or Max-Btr for G.
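
The node-labeling argument also suggests a simple way to test the parity type of a strongly connected graph. The sketch below is an alternative to the Floyd-Warshall based check mentioned above, not the method of [4]: it labels nodes along a breadth-first search tree from an arbitrary root and then looks for an edge (u_i, u_j) with ℓ(u_i) ℓ(u_j) ≠ ℓ(u_i, u_j). Such an edge exists exactly when the graph has a cycle of parity −1. The function name and the examples are hypothetical, and strong connectivity of the input is assumed.

```python
from collections import deque

def find_parity_violation(nodes, labeled_edges, root):
    """Return an edge (u, v) with label(u) * label(v) != sign(u, v), or None.

    labeled_edges maps each edge (u, v) to 1 or -1; the graph is assumed to be
    strongly connected so that every node receives a label.
    """
    succ = {u: [] for u in nodes}
    for (u, v), sign in labeled_edges.items():
        succ[u].append((v, sign))
    label = {root: 1}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, sign in succ[u]:
            if v not in label:
                label[v] = label[u] * sign   # parity of the tree path from the root
                queue.append(v)
    for (u, v), sign in labeled_edges.items():
        if label[u] * label[v] != sign:
            return (u, v)
    return None                              # single parity graph

nodes = ["a", "b", "c"]
# The cycle a -> b -| c -| a has parity (+1)(-1)(-1) = +1.
single = {("a", "b"): 1, ("b", "c"): -1, ("c", "a"): -1}
# Flipping the last label gives a cycle of parity -1.
double = {("a", "b"): 1, ("b", "c"): -1, ("c", "a"): 1}
no_violation = find_parity_violation(nodes, single, "a")
violation = find_parity_violation(nodes, double, "a")
```

For a double parity graph the returned edge is a candidate for the single extra edge described above, provided the labels are computed along an arborescence contained in the chosen solution.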

3.5. Linear Programming Based Approach [13]

We refer the reader to a standard graduate level textbook such as [21] for basic concepts and definitions related to linear programming and its applications to designing approximation algorithms.

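An exponential-size linear programming (Lp) formulation for the minimum weight rooted (at node u_r) out-arborescence problem for an edge-weighted input graph G = (V, E) was provided by Edmonds [15] in the following manner. We use a binary indicator variable x_e for every edge e = (u_i, u_j) ∈ E which describes whether we select e (x_e = 1) or do not select e (x_e = 0) in our solution. For U ⊂ V, define ι(U) = {(u_i, u_j) ∈ E:u_i ∉ U and u_j ∈ U}. Then, the Lp formulation is:

$$
\begin{array}{lll}
\text{minimize} & \sum_{e \in E} w(e)\, x_e & \\
\text{subject to} & \sum_{e \in \iota(U)} x_e \ge 1 & \text{for all } U \text{ such that } \emptyset \subset U \subset V \text{ and } u_r \notin U \\
& x_e \ge 0 & \text{for all } e \in E
\end{array}
\qquad (2)
$$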

Edmonds [15] showed that the above Lp always has an integral optimal solution (i.e., an optimal solution with x_e ∈ {0, 1} for all e ∈ E) which provides an optimal solution for our minimum weight rooted out-arborescence problem. Note that the above Lp has O(2^|V|) constraints in the worst case. However, the advantage of such a linear programming is that we can now make use of powerful mathematical tools, such as the duality theorem, from the theory of linear programming.

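We can modify the above Lp formulation to a primal Lp formulation P₁ for Min-Ed provided we set w(e) = 1 for all e ∈ E and we remove “and u_r ∉ U” from the condition in the first constraint of (2). The dual program D₁ of this Lp can be constructed by having a variable y_U for every ∅ ⊂ U ⊂ V. Both the primal and the dual Lp are written down below for clarity.

$$
\begin{array}{ll}
\text{(primal Lp } P_1\text{)} & \text{minimize } \sum_{e \in E} x_e \ \text{ subject to } \ \sum_{e \in \iota(U)} x_e \ge 1 \ \ \forall\, \emptyset \subset U \subset V, \quad x_e \ge 0 \ \ \forall\, e \in E \\[4pt]
\text{(dual Lp } D_1\text{)} & \text{maximize } \sum_{\emptyset \subset U \subset V} y_U \ \text{ subject to } \ \sum_{U :\, e \in \iota(U)} y_U \le 1 \ \ \forall\, e \in E, \quad y_U \ge 0 \ \ \forall\, \emptyset \subset U \subset V
\end{array}
$$

We can change P₁ into an Lp formulation for Max-Ed if we replace the objective “minimize Σ_{e ∈ E} x_e” by “maximize |E| − Σ_{e ∈ E} x_e”, and change the dual D₁ accordingly to reflect this change. We can further change this formulation for Min-Ed and Max-Ed to critical-Min-Ed and critical-Max-Ed, respectively, by adding a constraint x_e ≥ 1 for every edge e ∈ D.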

Note that P₁ does not provide a valid solution of the Min-Ed problem unless the constraint x_e ≥ 0 for every edge e ∈ E is replaced by the constraint x_e ∈ {0, 1}, resulting in an integer linear program (Ilp) whose exact solution is in general NP-hard to compute. We will denote this Ilp corresponding to P₁ by IP₁.
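
To see what the constraints of P₁ express, the following sketch enumerates every proper non-empty subset U of V and checks whether the total value of the edges entering U is at least 1. For a 0/1 assignment, all constraints hold exactly when the selected edges form a strongly connected spanning subgraph. The enumeration is exponential in |V| and is meant only for tiny examples; the function name and example data are hypothetical.

```python
from itertools import combinations

def violated_cut(nodes, edges, x):
    """Return a set U whose entering edges have total x-value below 1, or None.

    x maps edges to values (missing edges count as 0).
    """
    nodes = list(nodes)
    for size in range(1, len(nodes)):            # proper, non-empty subsets only
        for subset in combinations(nodes, size):
            U = set(subset)
            entering = sum(x.get((u, v), 0) for u, v in edges
                           if u not in U and v in U)
            if entering < 1:
                return U
    return None

# Hypothetical example: a 3-cycle 0 -> 1 -> 2 -> 0 with an extra edge 0 -> 2.
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
cycle_only = {(0, 1): 1, (1, 2): 1, (2, 0): 1, (0, 2): 0}
missing_exit = {(0, 1): 1, (1, 2): 0, (2, 0): 1, (0, 2): 1}
feasible = violated_cut(nodes, edges, cycle_only)
infeasible = violated_cut(nodes, edges, missing_exit)
```

In the second assignment node 1 has no selected outgoing edge, so the set U = {0, 2} has no selected entering edge and its constraint is violated.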

3.5.1. Applying Lp-Based Approach to Critical-Min-Ed

We provide a high-level overview of the primal-dual approach used in [13] for critical-Min-Ed on an input graph G = (V, E).

1. We start with an initial assignment of values to variables in IP₁ in the following manner. We keep only a subset of constraints of IP₁ such that the resulting Ilp can be solved exactly in polynomial time, giving an optimal solution A₁ ⊆ E. Then, it follows that OPT(G) ≥ |A₁|.
2. However, (V, A₁) may not be a valid solution for critical-Min-Ed on G (i.e., IP₁). Then, we try to make A₁ a valid solution by adding and/or removing edges so that we use a total of at most (3/2)η edges where OPT(G) ≥ η ≥ |A₁|, giving a 3/2-approximation for critical-Min-Ed. The edge alteration procedure was carried out in [13] using the DFS (depth-first-search) algorithm as originally outlined in a seminal paper by Tarjan (e.g., see the textbook [22]).

The initial solution A₁ referred to above in Step 1 is obtained in the following manner. For U ⊂ V, define o(U) = {(u_i, u_j) ∈ E:u_i ∈ U and u_j ∉ U}. Call a constraint of type Σ_{e ∈ ι(U)} x_e ≥ 1 in IP₁ “tractable” if for some node u_i either ι(U) ⊆ ι({u_i}) or ι(U) ⊆ o({u_i}). It was shown in [13] that the set of tractable constraints of IP₁ can be found easily and the resulting Ilp can be solved exactly using any algorithm that finds a maximum matching in a bipartite graph. Figure 4 shows an example of the initial solution A₁ found by this approach.
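
The connection to bipartite matching can be illustrated on a simplified version of this step. The sketch below keeps only the single-node constraints (every node needs at least one selected incoming and one selected outgoing edge), which is a subset of the tractable constraints and not the full construction of [13]. Viewing each edge (u_i, u_j) as joining an "out" copy of u_i to an "in" copy of u_j, the fewest edges satisfying these constraints form a minimum edge cover of a bipartite graph, whose size is 2|V| minus the size of a maximum matching when every node has at least one incoming and one outgoing edge. The function name is hypothetical, and required edges D are ignored.

```python
def degree_lower_bound(nodes, edges):
    """Fewest edges giving every node in-degree >= 1 and out-degree >= 1.

    Assumes every node has at least one incoming and one outgoing edge
    (true for strongly connected graphs with two or more nodes, no self-loops).
    """
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_tail = {}                    # in-copy of v -> out-copy of u matched to it

    def try_augment(u, visited):
        # Standard augmenting-path search for bipartite matching.
        for v in succ[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in matched_tail or try_augment(matched_tail[v], visited):
                matched_tail[v] = u
                return True
        return False

    matching = sum(1 for u in nodes if try_augment(u, set()))
    return 2 * len(nodes) - matching

# Hypothetical examples: a directed 4-cycle, and a "star" of three 2-cycles through node 0.
cycle_bound = degree_lower_bound([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
star_bound = degree_lower_bound(
    [0, 1, 2, 3], [(0, 1), (1, 0), (0, 2), (2, 0), (0, 3), (3, 0)])
```

For the 4-cycle the bound equals 4, and for the star it equals 6; in both cases this matches the size of an optimal Min-Ed solution, although in general it is only a lower bound on OPT(G).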

Figure 4. An illustration of the initial solution A₁ discussed in the algorithm that applies an Lp-based approach to critical-Min-Ed in Section 3.5.1: (a) The input graph G; (b) The edges in the initial solution A₁. As one can see, the initial solution does not provide a valid solution of critical-Min-Ed since the graph in Figure 4b is not strongly connected, but the final solution is obtained by adding and deleting edges from this initial solution.

The DFS-based edge addition/removal method referred to in Step 2 is highly technical with elaborate case analysis and is beyond the scope of this review paper. In a nutshell, difficulties may arise because in some cases the algorithm may be forced to use more than (3/2)|A₁| edges. Then, we look at the “non-tractable” constraints of the primal P₁ or dual D₁ to get an improved lower bound η for OPT(G) (i.e., OPT(G) ≥ η > |A₁|) to ensure that we use at most (3/2)η edges. In the proof we need to crucially use the weak-duality theorem of linear programming which states that if OPT(P₁) and OPT(D₁) are the objective values of an optimal solution of P₁ and D₁, respectively, then OPT(P₁) ≥ OPT(D₁).

3.5.2. Applying Lp-Based Approach to Critical-Max-Ed

We provide an overview of the 2-approximation algorithm for critical-Max-Ed on an input graph G = (V, E) using an Lp-based approach as described in [13]. Call an edge e ∈ E a necessary edge if either e ∈ D or ι(U) = {e} for some U ⊂ V, and let F be the set of necessary edges. If the edges in F provide a valid solution of critical-Max-Ed on G then (V, F) provides us with an optimal solution; thus assume below that this is not the case. In this case, ι(U) ∩ F = ∅ for some ∅ ⊂ U ⊂ V, so there must be a node u_r such that no edges in F enter u_r. As a pre-processing step, we repeatedly contract a cycle of necessary edges until no such cycles remain. Let OPT_in-arb(G) be the total weight of a minimum-weight in-arborescence of G rooted at u_r. Consider the Lp formulation for the minimum weight rooted out-arborescence problem as defined before:

and let

. Now, suppose that we set Biology 03 00001 i029

. This assignment of variables is a valid solution of the above Lp.

Now, compute a minimum weight out-arborescence T_out = (V,A_out) rooted at u_r. If there are z + 1 edges in E that are not in A_out, then OPT(G) ≤ z. Suppose now that we change w(e) for every e ∈ A_out to zero and keep the other weights unchanged. Our previous fractional solution, namely Biology 03 00001 i029

, is still a valid solution of the Lp, and thus the total value of the objective function of this fractional solution is at most Biology 03 00001 i030

, which together with the result of Edmonds [15] that showed that “the Lp always has an integral optimal solution” implies that OPT_in-arb(G) ≤ Biology 03 00001 i030

, which implies that we delete at least z + 1 − Biology 03 00001 i030

=

edges from the in-arborescence and take the remaining edges of the in-arborescence together with all the edges in A_out to get a valid solution of critical-Max-Ed on G. The total number of edges we have deleted in at least Biology 03 00001 i031

. A slight modification in the argument shows that in fact we can delete at least Biology 03 00001 i032

edges.

3.5.3. Limitations of Lp-Based Approaches

A standard way of understanding the limitations of any Lp-based approach for designing approximation algorithms is to measure the integrality gap, i.e., the ratio of the objective value of an optimal integral solution to that of an optimal fractional solution for a minimization problem, and the ratio of the objective value of an optimal fractional solution to that of an optimal integral solution for a maximization problem [21]. In [13] it was shown that the integrality gap for P₁ is at least [equation i033] by giving an explicit construction of an input graph for which this ratio is achieved. The same input graph also shows that the integrality gap for the modification of P₁ corresponding to Max-Ed is at least [equation i009].
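For reference, the two ratios in the definition of the integrality gap can be written out as follows. The symbols OPT_int(I) and OPT_frac(I) are introduced here only for illustration and denote the optimal integral and fractional objective values of the Lp on an instance I.

```latex
% I ranges over all instances of the problem.
% Both ratios are at least 1 because every integral solution is also fractional.
\[
  \mathrm{gap}_{\min} \;=\; \sup_{I}\,
    \frac{\mathrm{OPT}_{\mathrm{int}}(I)}{\mathrm{OPT}_{\mathrm{frac}}(I)},
  \qquad
  \mathrm{gap}_{\max} \;=\; \sup_{I}\,
    \frac{\mathrm{OPT}_{\mathrm{frac}}(I)}{\mathrm{OPT}_{\mathrm{int}}(I)} .
\]
```

An analysis that bounds the cost of the produced solution only against the fractional optimum cannot establish an approximation ratio better than the corresponding gap, which is why a lower bound on the gap is read as a limitation of the approach.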

4. Biological Applications

In this section, we discuss three applications of transitive reduction problems in computational biology and bioinformatics. For other non-biology applications of transitive reduction problems, such as in visualization of Enron email networks or in connectivity issues of computer networks, the reader may consult appropriate references such as [11,23].

We briefly review the standard regulatory network model that was mentioned in Section 1.1.3 in connection with the Min-Btr and Max-Btr problems. A regulatory network is described by an edge-labelled directed graph G = (V, E) in which nodes represent individual components of the biological system and a (directed) edge of the form (u_i, u_j) indicates that node u_i has an influence on node u_j. The edge labelling function ℓ:E → {−1, 1} indicates the nature of the causal relationship, with ℓ(u_i, u_j) = 1 and ℓ(u_i, u_j) = −1 indicating that u_i has an excitatory (positive) and inhibitory (negative) influence on u_j, respectively; pictorially, it is quite common to denote an excitatory and an inhibitory edge by → and —|, respectively. This representation applies to both gene regulatory networks (describing the regulation of gene transcription and related processes) and signal transduction networks (describing the information flow from external signals to within-cell components). Some examples of large biological networks include:

Mammalian network of signaling pathways and cellular machines in the hippocampal CA1 neuron having 512 nodes and 1,047 edges [24].
S. cerevisiae transcriptional regulatory network of interactions between transcription factor proteins and genes having 690 nodes and 1,082 edges [25].
C. elegans metabolic network having 651 nodes and 2,040 edges [26].
Oriented version of an unweighted PPI network constructed from S. cerevisiae interactions in the BioGRID database having 786 nodes and 2,453 edges [27].

Existence of such large networks rules out exact brute-force calculations of optimal solutions of transitive reduction problems and provides motivations to explore approximation algorithms for these problems.
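The edge-labelled model reviewed above can be made concrete with a short sketch. It stores each edge as a triple (u, v, sign) with sign ∈ {−1, 1} and computes, for a given source, which nodes are reachable by an excitatory path, an inhibitory path, or both. It assumes that the parity of a path is the product of its edge labels and that paths may revisit nodes; both are our reading of the model of Section 1.1.3, and the function name is hypothetical.

```python
from collections import deque

def path_parities(edges, source):
    """Map each node reachable from `source` to the set of achievable path parities.

    edges: iterable of (u, v, sign) with sign in {+1, -1}.
    Assumption: the parity of a path is the product of its edge signs,
    and paths may repeat nodes (i.e., they are walks).
    """
    adj = {}
    for u, v, sign in edges:
        adj.setdefault(u, []).append((v, sign))

    seen = set()      # visited (node, parity) states
    queue = deque()
    for v, sign in adj.get(source, []):
        if (v, sign) not in seen:
            seen.add((v, sign))
            queue.append((v, sign))
    while queue:
        u, parity = queue.popleft()
        for v, sign in adj.get(u, []):
            state = (v, parity * sign)
            if state not in seen:
                seen.add(state)
                queue.append(state)

    result = {}
    for v, parity in seen:
        result.setdefault(v, set()).add(parity)
    return result

# A promotes B, B inhibits C, and A also inhibits C directly.
edges = [("A", "B", 1), ("B", "C", -1), ("A", "C", -1)]
print(path_parities(edges, "A") == {"B": {1}, "C": {-1}})  # True
```

The search visits at most 2|V| (node, parity) states, so a single query runs in O(|V| + |E|) time and remains practical for networks of the sizes listed above.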

4.1. Network Construction and Simplification from Direct and Double-Causal Data

Signal transduction and gene regulatory networks are crucial to the maintenance of cellular homeostasis and for cell behavior such as growth, survival, apoptosis, and movement. Deregulation of these networks is a key contributor to many disease processes such as developmental disorders, diabetes, vascular diseases, and cancer. In a signal transduction network (pathway), there is typically an input, perceived by a receptor, followed by a series of elements through which the signal percolates to the output node, which represents the final outcome of the signal transduction process. For a cellular signal transduction pathway not involving alterations in gene expression, elements often consist of proteinaceous receptors, intermediary signaling proteins and metabolites, effector proteins, and a final output, which represents the ultimate combined effect of the effector proteins. If the signal transduction process includes regulation of the transcript level of a particular gene, the intermediate signaling elements will also include the gene itself and the transcription factors that regulate it, as well as any small RNAs that regulate the transcript’s abundance, with the final output being presence or absence of transcripts. Genome-wide experimental methods now identify interactions among thousands of proteins [28,29,30,31,32,33,34]. However, the state-of-the-art understanding of many signaling processes is often limited to the knowledge of key mediators and of their positive or negative effects on the whole process. The experimental evidence about the involvement of specific components in a given signal transduction network frequently belongs to one of these two categories:

(i): “Direct” interactions corresponding to biochemical evidence that provides information on enzymatic activity or protein-protein interactions and represents direct physical interactions. An interaction of this type is of the form “A promotes B” or “A inhibits B”, and is represented in the usual manner by a directed edge A → B or A —| B, respectively. Edges corresponding to known (documented) direct interactions are marked as “critical” and belong to the set D of required edges.
(ii): “Putative” interaction patterns that arise, for example, from differential responses to a stimulus in a wild-type organism versus a mutant organism, which implicate the product of the mutated gene in the signal transduction process. This type of interaction pattern is not a direct interaction but rather corresponds to an indirect (double-causal) relationship most likely resulting from a chain of direct interactions and reactions, and is a 3-component inference represented by a small-size sub-graph among three or four nodes.

As noted above, inference of type (ii) may not give direct interactions but indirect causal relationships that correspond to reachability relationships in the unknown interaction network for which the Min-Btr and Max-Btr problems become directly applicable. More precisely, inferences of type (ii) typically lead to double-causal inferences of the type “C promotes the process through which A promotes B”, and may correspond to an intersection of two paths (one path from A to B and another path from C to B) in the interaction network (i.e., C is assumed to activate an unknown intermediary node of the A to B path).

The research works in [5,6,7] led to the development of an efficient and accurate method incorporating all relevant biological knowledge for synthesizing path-level information into a consistent network by constructing a minimal graph that maintains all reachability relationships without requiring expression information (unlike, say, many reverse-engineering approaches). Methods prior to [5,6,7] for synthesizing signal transduction networks, such as [28], only included direct biochemical interactions and were therefore restricted by the incompleteness of the experimental knowledge on pairwise interactions. Key steps in the network synthesis method developed in [5,6,7] are schematically shown in Figure 5. The first step is a distillation of experimental conclusions into qualitative regulatory relations between cellular components. (This is a complex process by itself. It is important to note that human intervention will inevitably be an important component of the literature curation process even though automated text search engines such as GENIES [32] become more and more popular.) Direct biochemical and pharmacological evidence, such as “A promotes B”, is incorporated as a directed edge (A, B). Other kinds of double-causal evidence (such as genetic evidence of differential responses to a stimulus) are handled in the third step in the schematic diagram. For the sake of concreteness, assume that such a double-causal interaction is of the form “C promotes the process through which A promotes B”. The only way such a double-causal interaction may correspond to a direct interaction is if C is an enzyme catalyzing a reaction in which A is transformed into B, and for this case the interaction can be represented as both A (the substrate) and C (the enzyme) activating B (the product), i.e., by two edges A → B and C → B. If the interaction between A and B is direct and C is not a catalyst of the interaction between A and B, we can assume that C activates A.
In all other cases, this type of interaction corresponds to an intersection of two paths (A to B and C to B) in the interaction network and is represented by introducing new nodes (called “pseudo-nodes” in [5] and elsewhere since they are added only to satisfy the pathway properties). One important algorithmic idea in this network synthesis method is that of finding a minimal network, in terms of the number of non-critical edges (i.e., edges not in D), that is consistent with all (directed) reachability relationships between nodes; this is captured by the Min-Btr and Max-Btr problems discussed earlier. (Intuitively, by computing a minimal graph we want to be as close as possible to a “tree-like topology” while supporting all experimental observations. An implicit assumption of chain-like or tree-like topologies permeates the traditional molecular biology literature, e.g., signal transduction and metabolic pathways are assumed to be close to linear chains and genes are assumed to be regulated by one or two transcription factors [33].) For further details, see [5,6,7]. A software tool named NET-SYNTHESIS, incorporating the method shown in Figure 5 and using some of the algorithmic ideas described for Min-Btr and Max-Btr in Section 3, was first reported in [5,6] and is freely available for download. The input to NET-SYNTHESIS is a list of relationships among biological components (direct and double-causal), and its output is a network diagram and a text file with the edges of the signal transduction network.
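The pseudo-node idea described above can be illustrated with a minimal sketch. It shows one plausible way to encode a double-causal statement as signed edges through a fresh intermediary node. The exact edge labelling used in [5,6,7] is not reproduced in this section, so the encoding below, the function name and the parameter names are all assumptions made for illustration.

```python
def encode_double_causal(a, b, c, pseudo, ab_sign=1, c_sign=1):
    """Encode "C affects the process through which A affects B" via a pseudo-node.

    Hypothetical encoding (not taken from NET-SYNTHESIS):
      a -> pseudo      with sign +1
      pseudo -> b      with sign ab_sign  (so the a-to-b path has parity ab_sign)
      c -> pseudo      with sign c_sign   (c acts on the unknown intermediary)
    Signs are +1 for "promotes" and -1 for "inhibits".
    """
    return [
        (a, pseudo, 1),
        (pseudo, b, ab_sign),
        (c, pseudo, c_sign),
    ]

# "C promotes the process through which A promotes B", with pseudo-node P1.
for edge in encode_double_causal("A", "B", "C", pseudo="P1"):
    print(edge)
```

With this encoding both the A to B path and the C to B path pass through the pseudo-node, which matches the description of two intersecting paths. A later reduction step may collapse such a node when it turns out to be unnecessary.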

Figure 5. A schematic diagram of the network synthesis method in [5,6,7]. Human interaction is necessary since some choices may have to be made in distilling the component relationships, e.g., when there are conflicting reports in the literature.

4.1.1. Applications in Agronomic Research

Guard cells are central components in control of plant water status [34] and better understanding of their regulation is imperative for the goal of engineering of crops with improved drought tolerance. Plants both lose water and take in carbon dioxide through microscopic stomatal pores, each of which is regulated by a surrounding pair of guard cells. During drought, the plant hormone abscisic acid (ABA) inhibits stomatal opening and promotes stomatal closure, thereby promoting water conservation. ABA signal transduction in guard cells is one of the best characterized signaling systems in plants with many signal transduction proteins, secondary metabolites and ion channels having been identified to participate in the process [35,36,37].

The research works in [5,6] used the NET-SYNTHESIS software to generate a network for ABA-induced closure from a list of about 140 interactions and causal inferences for ABA-induced closure published in Table S1 and Text S1 in [38]. A detailed comparison of this computer-generated network with a manually curated network for ABA-induced closure published in [38] validated the accuracy of the algorithms for Min-Btr used in the software.

4.2. Analyzing Disease Networks (Biomedical Application)

Large Granular Lymphocytes (LGL) are medium to large size cells with eccentric nuclei and abundant cytoplasm. In normal adults, LGL comprise 10%–15% of the total peripheral blood mononuclear cells. The disease LGL leukemia is a disordered clonal expansion of LGL and their invasion of the marrow, spleen and liver. Ras is a small GTPase that is essential for controlling multiple signaling pathways, and its deregulation is frequently seen in human cancers. Activation of H-Ras requires its farnesylation, which can be blocked by farnesyltransferase inhibitors (FTIs). FTIs are therefore envisioned as future drugs for anti-cancer therapies. One of these FTIs is tipifarnib, which induces apoptosis in leukemic LGL in vitro. This observation, together with the finding that Ras is constitutively activated in leukemic LGL cells, leads to the hypothesis that Ras plays an important role in LGL leukemia, and may function by influencing the Fas/FasL pathway.

Kachalo et al. in [6] used the NET-SYNTHESIS software together with its specific transitive reduction algorithms to synthesize a cell-survival/cell-death regulation related signaling network from the Transpath 6.0 database, with additional information manually curated from literature search. The network has 359 nodes representing proteins/protein families and mRNAs participating in pro-survival and Fas-induced apoptosis pathways and 1,295 edges representing regulatory relationships between nodes, including protein interactions, catalytic reactions, transcriptional regulation and known double-causal regulations. Using Min-Btr and other algorithms, they were able to reduce the size of the original network to 267 nodes and 751 edges, focusing on the effect of Ras on the apoptosis response through the Fas/FasL pathway involving the 33 known T-LGL deregulated proteins. Further work in this direction was done by Zhang et al. in [39] in building and analyzing a network model of signaling components of survival of cytotoxic T lymphocytes in LGL leukemia using the NET-SYNTHESIS software.

For further applications of transitive reduction problems to drug target identification, see [40].

4.3. Measuring Topological Redundancy of Biological Networks

The concept of redundancy is well known in information theory. Informally, redundancy refers to identical elements performing the same function. (There are also other definitions of the redundancy concept in the context of other biological applications that are completely different from ours. For example, in some contexts redundancy refers to paralogous genes that provide functional backup for one another [41].) In computer networks and electronic systems, such measures are useful in analyzing properties such as fault-tolerance. It is an accepted fact that biological networks do not necessarily have the lowest possible degeneracy or redundancy. For example, the connectivity of neurons in brains suggests a high degree of degeneracy [42]. As Tononi, Sporns and Edelman observed in [43], a specific and useful notion of redundancy has yet to be firmly incorporated into biological thinking, often because of the lack of a suitable formal theoretical framework. A further reason for the lack of incorporation of these notions in biological thinking is the lack of computationally efficient procedures for computing these measures for large-scale networks even when formal definitions are available. Therefore, such studies are often done in a somewhat ad-hoc fashion, such as in [44]. There are notions of redundancy available in the field of analysis of undirected graphs based on clustering coefficients [45] or betweenness centrality measures [46]. However, such notions are not appropriate for the analysis of biological networks where we must distinguish positive from negative regulatory interactions or where we wish to study possible relationships of the dynamics of the network with its redundancy.

Based on the Min-Btr and Max-Btr problems, Albert et al. in [47] proposed a new combinatorial measure of redundancy that is amenable to efficient algorithmic analysis. Note that binary transitive reduction of a graph (V, E) does not change pathway-level information of the network and removes an edge u_i → u_j or u_i —| u_j only when a similar alternate pathway, namely u_i [symbol i034] u_j or u_i [symbol i035] u_j respectively, exists, thus truly removing redundant connections. Thus, if (V, E₁) is an optimal solution of Min-Btr or Max-Btr on the input graph G = (V, E), then [equation i036] provides a measure of global compressibility of the network. Based on this intuition, Albert et al. in [47] proposed a new redundancy measure [equation i037], where the |E| term in the denominator is simply a “min-max normalization” of the measure to ensure that 0 < R < 1. Note that the higher the value of R is, the more redundant the network is. Since Min-Btr or Max-Btr can be computed efficiently, Albert et al. were able to evaluate R on a variety of large biological and directed social networks to derive interesting conclusions: transcriptional networks are less redundant than signaling networks; directed social networks are more redundant than biological networks; the topological redundancy of the C. elegans metabolic network is largely due to its inclusion of currency metabolites; and the redundancy of signaling networks is highly (negatively) correlated with the monotonicity of their dynamics.
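The idea behind the redundancy measure can be sketched on a toy network. The code below removes an edge only when an alternate path of the same parity exists, and then reports the fraction of removed edges. Two things are assumptions rather than statements from [47]: the greedy pass only yields a minimal (not necessarily minimum) reduction and is not the algorithm used there, and the measure is taken to be R = 1 − |E₁|/|E|, which is our reading of the normalization by |E| described above. All function names are hypothetical.

```python
from collections import deque

def has_parity_path(edges, src, dst, parity):
    """True if some walk with at least one edge from src to dst has sign product `parity`."""
    adj = {}
    for u, v, s in edges:
        adj.setdefault(u, []).append((v, s))
    start = (src, 1)
    seen = {start}
    queue = deque([start])
    while queue:
        u, p = queue.popleft()
        for v, s in adj.get(u, []):
            state = (v, p * s)
            if state == (dst, parity):
                return True
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False

def greedy_reduction(edges, critical=frozenset()):
    """Drop non-critical edges that have a same-parity detour (greedy, order-dependent)."""
    kept = list(edges)
    for e in edges:
        if e in critical:
            continue
        rest = [f for f in kept if f != e]
        u, v, s = e
        if has_parity_path(rest, u, v, s):
            kept = rest
    return kept

def redundancy(edges, reduced):
    """Assumed form of the measure: fraction of edges removed by the reduction."""
    return 1 - len(reduced) / len(edges)

# A -> C is implied by A -> B -> C; the inhibitory edge A -| D has no detour.
edges = [("A", "B", 1), ("B", "C", 1), ("A", "C", 1), ("A", "D", -1)]
reduced = greedy_reduction(edges)
print(len(reduced), redundancy(edges, reduced))  # 3 0.25
```

Because the result of the greedy pass depends on the order in which edges are examined, the value it yields is only a lower bound on the redundancy that an optimal Min-Btr solution would give under the same formula.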

5. Conclusions

In this review paper, we have elaborated on a few graph-theoretic problems that involve finding an “equivalent” sparser graph, explained several key mathematical and algorithmic tools that may be used to design efficient computational methods to solve these problems, and then provided details of three biological applications of these problems. The idea of transitive reductions, in a more simplistic setting or in a different form, has also been used to identify the structure of gene regulatory networks [48,49,50,51,52]. Of particular interest is a network “deconvolution” problem, considered by Feizi et al. [52], that is in some sense an inverse of the transitive reduction problems studied in this paper: their goal was to infer the original network given a set of direct (edge-level) and indirect (pathway-level) information about the graph. Feizi et al. showed that an exact closed-form solution of this problem can be found using an infinite-series summation. We hope that our review will lead to further interest in transitive reduction type problems and will promote further collaboration between the computational biology and the graph algorithms communities.

Acknowledgments

S. Aditya and B. DasGupta were partially supported by NSF grants IIS-1160995 and DBI-1062328. B. DasGupta also thanks his collaborators R. Albert, P. Berman, R. Dondi, A. Gitter, G. Gürsoy, R. Hegde, S. Kachalo, P. Pal, G. S. Sivanathan, E. D. Sontag, A. Zelikovsky, K. Westbrooks and R. Zhang for their collaboration in the research projects reviewed in this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Moyles, D.M.; Thompson, G.L. Finding a minimum equivalent of a digraph. J. ACM 1969, 16, 455–460.
2. Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W. H. Freeman & Co.: New York, NY, USA, 1979.
3. Aho, A.; Garey, M.R.; Ullman, J.D. The transitive reduction of a directed graph. SIAM J. Comput. 1972, 1, 131–137.
4. Albert, R.; DasGupta, B.; Dondi, R.; Sontag, E.D. Inferring (Biological) signal transduction networks via transitive reductions of directed graphs. Algorithmica 2008, 51, 129–159.
5. Albert, R.; DasGupta, B.; Dondi, R.; Kachalo, S.; Sontag, E.D.; Zelikovsky, A.; Westbrooks, K. A novel method for signal transduction network inference from indirect experimental evidence. J. Comput. Biol. 2007, 14, 927–949.
6. Kachalo, S.; Zhang, R.; Sontag, E.D.; Albert, R.; DasGupta, B. NET-SYNTHESIS: A software for synthesis, inference and simplification of signal transduction networks. Bioinformatics 2008, 24, 293–295.
7. Albert, R.; DasGupta, B.; Sontag, E.D. Inference of Signal Transduction Networks from Double Causal Evidence. In Methods in Molecular Biology: Topics in Computational Biology; Fenyo, D., Ed.; Springer Science + Business Media, LLC: New York, NY, USA, 2010; Volume 673.
8. Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms; MIT Press: Cambridge, MA, USA, 2001.
9. Gusfield, D. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology; Cambridge University Press: New York, NY, USA, 1997.
10. Pevzner, P.A. Computational Molecular Biology: An Algorithmic Approach; MIT Press: Cambridge, MA, USA, 2000.
11. Khuller, S.; Raghavachari, B.; Young, N. Approximating the minimum equivalent digraph. SIAM J. Comput. 1995, 24, 859–872.
12. Vetta, A. Approximating the minimum strongly connected subgraph via a matching lower bound. In Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms, Washington, DC, USA, 7–9 January 2001; pp. 417–426.
13. Berman, P.; DasGupta, B.; Karpinski, M. Approximating transitive reduction problems for directed networks. In Proceedings of the 11th Algorithms and Data Structures Symposium, Banff, AB, Canada, 21–23 August 2009; pp. 74–85.
14. Frederickson, G.N.; JáJá, J. Approximation algorithms for several graph augmentation problems. SIAM J. Comput. 1981, 10, 270–283.
15. Edmonds, J. Optimum branchings. In Mathematics and the Decision Sciences, Part 1; Dantzig, G.B., Veinott, A.F., Jr., Eds.; American Mathematical Society Lectures on Applied Mathematics: Providence, RI, USA, 1968; pp. 335–345.
16. Karp, R.M. A simple derivation of Edmonds’ algorithm for optimum branching. Networks 1972, 1, 265–272.
17. Milanović, M.; Matić, D.; Savić, A.; Kratica, J. Two metaheuristic approaches to solving the p-ary transitive reduction problem. Appl. Comput. Math. 2011, 10, 294–308.
18. Papadimitriou, C. Computational Complexity; Addison-Wesley: New York, NY, USA, 1994; p. 212.
19. Khuller, S.; Raghavachari, B.; Young, N. On strongly connected digraphs with bounded cycle length. Discret. Appl. Math. 1996, 69, 281–289.
20. Chu, Y.; Liu, T. On the shortest arborescence of a directed graph. Sci. Sin. 1965, 4, 1396–1400.
21. Vazirani, V. Approximation Algorithms; Springer: New York, NY, USA, 2001.
22. Aho, A.; Hopcroft, J.E.; Ullman, J.D. The Design and Analysis of Computer Algorithms; Addison-Wesley: Reading, MA, USA, 1974.
23. Dubois, V.; Bothorel, C. Transitive reduction for social network analysis and visualization. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, Compiègne, France, 19–22 September 2005; pp. 128–131.
24. Ma’ayan, A.; Jenkins, S.L.; Neves, S.; Hasseldine, A.; Grace, E.; Dubin-Thaler, B.; Eungdamrong, N.J.; Weng, G.; Ram, P.T.; Rice, J.J.; et al. Formation of regulatory patterns during signal propagation in a mammalian cellular network. Science 2005, 309, 1078–1083.
25. Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U. Network motifs: Simple building blocks of complex networks. Science 2002, 298, 824–827.
26. Jeong, H.; Tombor, B.; Albert, R.; Oltvai, Z.N.; Barabasi, A.L. The large-scale organization of metabolic networks. Nature 2000, 407, 651–654.
27. Gitter, A.; Klein-Seetharaman, J.; Gupta, A.; Bar-Joseph, Z. Discovering pathways by orienting edges in protein interaction networks. Nucleic Acids Res. 2011, 39, e22.
28. Lee, T.I.; Rinaldi, N.J.; Robert, F.; Odom, D.T.; Bar-Joseph, Z.; Gerber, G.K.; Hannett, N.M.; Harbison, C.T.; Thompson, C.M.; Simon, I.; et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 2002, 298, 799–804.
29. Giot, L.; Bader, J.S.; Brouwer, C.; Chaudhuri, A.; Kuang, B.; Li, Y.; Hao, Y.L.; Ooi, C.E.; Godwin, B.; Vitols, E.; et al. A protein interaction map of Drosophila melanogaster. Science 2003, 302, 1727–1736.
30. Han, J.-D.J.; Bertin, N.; Hao, T.; Goldberg, D.S.; Berriz, G.F.; Zhang, L.V.; Dupuy, D.; Walhout, A.J.M.; Cusick, M.E.; Roth, F.P.; et al. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 2004, 430, 88–93.
31. Li, S.; Armstrong, C.M.; Bertin, N.; Ge, H.; Milstein, S.; Boxem, M.; Vidalain, P.-O.; Han, J.-D.J.; Chesneau, A.; Hao, T.; et al. A map of the interactome network of the metazoan C. elegans. Science 2004, 303, 540–543.
32. Friedman, C.; Kra, P.; Yu, H.; Krauthammer, M.; Rzhetsky, A. GENIES: A natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics 2001, 17, S74–S82.
33. Alberts, B. Molecular Biology of the Cell; Garland Publishing: New York, NY, USA, 1994.
34. Hetherington, A.M.; Woodward, F.I. The role of stomata in sensing and driving environmental change. Nature 2003, 424, 901–908.
35. Fan, L.M.; Zhao, Z.; Assmann, S.M. Guard cells: A dynamic signaling model. Curr. Opin. Plant Biol. 2004, 7, 537–546.
36. Blatt, M.R.; Grabov, A. Signal redundancy, gates and integration in the control of ion channels for stomatal movement. J. Exp. Bot. 1997, 48, 529–537.
37. MacRobbie, E.A. Signal transduction and ion channels in guard cells. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1998, 353, 1475–1488.
38. Li, S.; Assmann, S.M.; Albert, R. Predicting essential components of signal transduction networks: A dynamic model of guard cell abscisic acid signaling. PLoS Biol. 2006.
39. Zhang, R.; Shah, M.V.; Yang, J.; Nyland, S.B.; Liu, X.; Yun, J.K.; Albert, R.; Loughran, T.P. Network model of survival signaling in LGL leukemia. Proc. Natl. Acad. Sci. USA 2008, 105, 16308–16313.
40. Albert, R.; DasGupta, B.; Mobasheri, N. Some perspectives on network modeling in therapeutic target prediction. Biomed. Eng. Comput. Biol. 2013, 5, 17–24.
41. Kafri, R.; Bar-Even, A.; Pilpel, Y. Transcription control reprogramming in genetic backup circuits. Nat. Genet. 2005, 37, 295–299.
42. Kolb, B.; Whishaw, I.Q. Fundamentals of Human Neuropsychology; Freeman: New York, NY, USA, 1996.
43. Tononi, G.; Sporns, O.; Edelman, G.M. Measures of degeneracy and redundancy in biological networks. Proc. Natl. Acad. Sci. USA 1999, 96, 3257–3262.
44. Papin, J.A.; Palsson, B.O. Topological analysis of mass-balanced signaling networks: A framework to obtain network properties including crosstalk. J. Theor. Biol. 2004, 227, 283–297.
45. Beckage, N.; Smith, L.; Hills, T. Semantic network connectivity is related to vocabulary growth rate in children. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society, Portland, OR, USA, 11–14 August 2010; pp. 2769–2774.
46. Dall’Asta, L.; Alvarez-Hamelin, I.; Barrat, A.; Vázquez, A.; Vespignani, A. Exploring networks with traceroute-like probes: Theory and simulations. Theor. Comput. Sci. 2006, 355, 6–24.
47. Albert, R.; DasGupta, B.; Gitter, A.; Gürsoy, G.; Hegde, R.; Pal, P.; Sivanathan, G.S.; Sontag, E.D. A new computationally efficient measure of topological redundancy of biological and social networks. Phys. Rev. E 2011, 84, 036117.
48. Wagner, A. Estimating coarse gene network structure from large-scale gene perturbation data. Genome Res. 2002, 12, 309–315.
49. Chen, T.; Filkov, V.; Skiena, S. Identifying gene regulatory networks from experimental data. In Proceedings of the 3rd Annual International Conference on Computational Molecular Biology, Lyon, France, 11–14 April 1999; pp. 94–103.
50. Klamt, S.; Flassig, R.J.; Sundmacher, K. Transwesd: Inferring cellular networks with transitive reduction. Bioinformatics 2010, 26, 2160–2168.
51. Bosnacki, D.; Odenbrett, M.R.; Wijs, A.; Ligtenberg, W.; Hilbers, P. Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors. BMC Bioinform. 2012, 13, 281.
52. Feizi, S.; Marbach, D.; Médard, M.; Kellis, M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nat. Biotechnol. 2013, 31, 726–733.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
