Path Problems in Networks with Vector-Valued Edge Weights Giri K. Tayi,1 Daniel J. Rosenkrantz,2 S. S. Ravi2 1 School of Business Administration, University at Albany-SUNY, Albany, New York, 12222 2 Department of Computer Science, University at Albany-SUNY, Albany, New York, 12222 Received 30 June 1997; accepted 5 November 1998 Abstract: We consider path problems in networks where each edge is associated with a vector of weights. One application where such path problems arise is in transporting hazardous materials. In that context, the network is embedded in a cluster of communities (or zones), and it is important to consider the impact of an accident along an edge on the surrounding zones. This impact is modeled as a cost vector for each edge, where each component represents the impact of an accident on a zone. Under this model, we formulate two kinds of path problems, namely, routing and feasibility problems. These formulations utilize various definitions of equity with respect to cost impact on the zones. We present complexity results and pseudopolynomial algorithms for general versions as well as efficient algorithms for special cases. We also carry out a comparative analysis of different routing problems. © 1999 John Wiley & Sons, Inc. Networks 34: 19 –35, 1999 1. INTRODUCTION Path problems in networks where each edge is associated with a vector of weights have been studied by several researchers (see, e.g., [1, 6 – 8, 9, 13]). The vector of weights associated with an edge may include length, transit time, energy consumption, travel costs, etc. In such problems, which arise in transportation and communication networks, the goal is to find paths which simultaneously optimize these weights. Typically, such problems have been formulated as optimizing one weight subject to constraints on other weights. For example, many of the above references Correspondence to: S. S. Ravi; e-mail: ravi@cs.albany.edu Contract grant sponsor: NSF; contract grant numbers: CCR-90-06396; CCR-97-34936 © 1999 John Wiley & Sons, Inc. consider the restricted shortest path problem where each edge is associated with a two-component vector (l, t), where l and t represent the length of and the transit time along the edge, respectively. The goal is to find a path of shortest length subject to a constraint on the total transit time. In this paper, we consider a different class of path problems that arise in applications such as transporting hazardous materials. In that context, the network is embedded in a region, and it is important to consider the impact of an accident along an edge on the surrounding region. It is common to use monetary costs to quantify this impact. These costs account for the consequences of an accident and may include costs related to health care, property damage, insurance, etc. We assume that the region in which the network is embedded consists of a number of communities, CCC 0028-3045/99/010019-17 19 20 TAYI, ROSENKRANTZ, AND RAVI each of which we refer to as a zone. Thus, the concept of a zone as considered in this paper is independent of the nodes or edges of the network. This is in contrast to [2], where a zone is defined as the l-neighborhood of a node. In practice, zones may typically be based on political, economic, and historic considerations. When an accident occurs along an edge of the network, several zones could be impacted. For instance, an edge may straddle several zones or wind-blown material from the site of an accident may affect several zones. Each zone is primarily concerned with the impact on that zone of an accident on each edge. The size of a zone does not affect the problems considered here. The vector associated with an edge represents the cost incurred by different zones due to an accident along that edge. Thus, the number of components of each vector is equal to the number of zones. The abstract model of zones also accommodates situations where each zone represents a different dimension of the impact of an accident, such as personal injuries, clean-up costs, and long-term pollution. The focus of this paper is on path problems given the cost vectors for all the edges. In these path problems, it is important to choose paths that are equitable so that paths that overcharge some zones relative to other zones are avoided. Thus, such a context requires formulations of path problems where the goal is equitable cost distribution across the zones. The notion of cost equity can be modeled in a number of ways: For example, given a source v s and a destination v d , one may be interested in finding a path from v s to v d that minimizes the maximum cost to any zone. In this paper, we formulate path problems under different definitions of cost equity, with a focus on the algorithmic aspects of the resulting problems. We present complexity results and pseudopolynomial algorithms as well as efficient algorithms for special cases. We also carry out a comparative analysis of path problems under different definitions of cost equity. For convenience in exposition, we use the terms “component” of a vector and a “zone” synonymously throughout the paper. Several approaches for solving path problems where each edge is associated with a vector of weights have been proposed in the literature. Warburton [13] considered multiobjective shortest path (MOSP) problems. He proposed pseudopolynomial dynamic programming algorithms for generating all the nondominated* paths between a given pair of nodes. He also showed how these pseudopolynomial algorithms can be transformed into fully polynomial approximation schemes. Hassin [8] also considered MOSP problems and presented approximation schemes which are more efficient than those given in [13]. Some of the path problems formulated in this paper are, indeed, MOSP problems, and, hence, the approaches given in [8, 13] are useful * For a definition of domination, see Section 2. in solving these problems. Comparisons of our results with those in [13] and [8] are presented in Sections 4.1 and 5.2.2, respectively. Wijeratne et al. [14] considered MOSP problems when the edges have stochastic attributes. They transformed the stochastic MOSP problem into a deterministic MOSP problem and provided comparison rules to determine the set of nondominated paths. The remainder of the paper is organized as follows: Section 2 presents several routing and feasibility problem formulations. A comparative analysis of the routing problems is performed in Section 3. This is followed (Section 4) by an analysis of the individual routing problems when the number of zones is fixed (i.e., held constant) and when the number of zones is a variable (i.e., part of the problem instance). A similar analysis is carried out for the individual feasibility problems in Section 5. This section also presents pseudopolynomial algorithms which use the concept of efficient sets (introduced in [13]), when the number of zones is held constant. Conclusions are provided in Section 6. 2. ANALYTICAL MODELS AND PROBLEM FORMULATIONS 2.1. Preliminaries In this paper, we represent a network embedded in a region as an undirected connected graph G(V, E). The sources, destinations, road intersections, etc., form the node set V of the graph. An edge e [ E represents a link between its end points. Let k denote the number of zones, and let z 1 , z 2 , . . . , z k denote the zones themselves. Accordingly, each edge e j has a k-component vector CV(e j ) 5 (c j1 , c j2 , . . . , c jk ) associated with it. Component c ji is, in general, a nonnegative rational number and represents the cost impact on zone z i when an accident occurs along that edge (1 # i # k, 1 # j # uEu). The k-component vector associated with an edge will be referred to as the cost vector for that edge. Given a path P consisting of edges e 1 , e 2 , . . . , e t , the cost vector of P, denoted by CV(P) 5 (CV P1 , CV P2 , . . . , CV Pk ), is defined as the vector sum of the cost vectors of the edges, that is, O c , 1 # i # k. t CV Pi 5 li (2.1) l51 The value of CV Pi represents the cost impact on zone z i (1 # i # k) due to transportation along path P. In practice, it is unlikely that the given graph G(V, E) would contain a cycle with a cost vector whose components are all zero. Hence, throughout this paper, we assume that there is no such cycle. However, if a graph contains such a cycle, the nodes and edges of the cycle can be coalesced into PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS a single node such that all the edges incident on the nodes of the cycle are now incident on this single node. This process may introduce multiedges into the graph. These multiedges can be converted into simple edges by introducing a node on each such edge. Given two k-component cost vectors 9 5 (c 1 , c 2 , . . . , c k ) and 99 5 (c91 , c92 , . . . , c9k ), we say that 9 dominates 99 (denoted by 9 d 99) if c i # c9i , for 1 # i # k. Further, 9 strictly dominates 99 (denoted by 9 a 99) if 9 d 99 and there is an i (1 # i # k) such that c i , c9i . This notion of dominance can be extended to paths using their respective cost vectors. Given two paths P and Q between the same pair of nodes u and v and given their respective cost vectors CV(P) and CV(Q), we say that P dominates Q (denoted by P d Q) if CV(P) d CV(Q). Path P strictly dominates path Q (denoted by P a Q) if CV(P) a CV(Q). According to the above definitions, path P “dominates” path Q means that P is at least as good a solution as Q, that is, no component of the cost vector associated with P is larger than the corresponding component of the cost vector associated with Q. Further, “strict domination” implies that P is a better solution than Q, that is, P dominates Q and at least one component of the cost vector associated with P is smaller than the corresponding component of the cost vector associated with Q. This notion of strict domination is similar to the concept of Pareto optima used in multiobjective optimization [12]. In the context of the problems considered in this paper, any path which is not included in the Pareto optimal set is strictly dominated by some other path. Using the above notion of dominance, we define feasible paths, edges, nodes, and zones. A feasible path between nodes u and v is one which is not strictly dominated by another path between the same pair of nodes. Throughout this paper, we use ^ uv to denote the set of all feasible paths between nodes u and v. The feasibility definitions given below are with respect to a particular pair of nodes. An edge e is feasible with respect to a pair of nodes u, v if e is part of a feasible path between u and v. An edge e 5 {v i , v j } is inherently feasible if e itself is a feasible path between its end points v i and v j . Edges that are not inherently feasible can be deleted since they cannot be part of any feasible path. A node w is feasible with respect to a pair of nodes u, v if w occurs in some feasible path between u and v. A zone z i is feasible with respect to a pair of nodes u, v if there is some feasible path P between u and v such that CV Pi . 0. If a zone z i is not feasible with respect to a pair of nodes u, v, then z i need not be considered when generating nondominated paths between u and v. The following example illustrates these definitions: Example 1. Consider the network shown in Figure 1 which has two zones and five nodes. Each edge is labeled with its two-component cost vector. With respect to the pair of nodes v1 and v5, it can be seen from the figure that path P 5 ^v 1 , v 4 , v 5 & strictly dominates the path Q 5 ^v 1 , v 2 , v 3 , 21 Fig. 1. Example to illustrate the notion of feasibility. v 5 & since CV(P) 5 (1, 1) while CV(Q) 5 (6, 6). The reader can verify that the path P is not strictly dominated by any other path between v1 and v5. Therefore, P is a feasible path. Again, with respect to the pair of nodes v 1 and v 5 , edge {v 1 , v 4 } is feasible, node v 4 is feasible, and z 1 and z 2 are both feasible zones. Note that edge {v1, v4} is also inherently feasible. In contrast, edge {v2, v3} is not feasible with respect to the pair v1, v5, since there is no feasible path between v1 and v5 containing that edge. Indeed, because of path ^v2, v4, v3&, edge {v2, v3} is inherently infeasible. The notion of feasibility as defined in this paper differs from the classical optimization definition in the sense that here a feasible path is one that is not strictly dominated by another path. As will be shown in Section 3, for some routing problems, there always exist optimal solutions that are feasible, while for some other problems, an optimal solution need not necessarily be feasible. 2.2. Routing Problems Given a network and a pair of nodes u (source) and v (destination), the goal of routing problems is to find paths between u and v under a variety of objectives all of which deal with equitable distribution of cost. The following formulations reflect these objectives using an optimization format: Min-Average Cost Path (MACP) Instance: Undirected graph G(V, E); the number k of zones; for each e i in E, a cost vector c i 1, c i 2, . . . , c i k); two nodes vs (the source) and vd (the destination). Requirement: Find a path P from vs to vd in G such that m(P) 5 (¥ ki51 CVPi )/k is minimized. Instead of using average cost as the minimization objective in the above formulation, the sum of the components of the cost vector associated with path P could also be used. Mathematically, these objectives yield equivalent optimization problems. We have chosen the above MACP formulation as it assists in defining other problems involving variance and absolute deviation. 22 TAYI, ROSENKRANTZ, AND RAVI Min–Max Cost Path (MMCP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that max{CV P1 , CV P2 , . . . , CV Pk } is minimized. Balanced Cost Path (BCP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that CV P1 5 CV P2 5 . . . 5 CV Pk , if such a path exists. Obtaining a perfectly balanced cost path is an ideal which may be impossible to achieve. Hence, it is of interest to consider formulations which focus on paths that minimize the cost imbalance across the zones. Accordingly, we present several such formulations below. The first formulation quantifies the imbalance in a simple manner, namely, the maximum difference between the costs associated with any pair of zones. The remaining formulations, however, measure the imbalance with respect to the average cost per zone ( m (P)). Minimum Imbalance Cost Path (MICP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that max{uCV Pi 2 CV Pj u : 1 # i , j # k} is minimized. Min-Variance Cost Path (MVCP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that [¥ ki51 (CV Pi 2 m (P)) 2 ]/k is minimized. Min-Total Absolute Deviation Cost Path (MTDCP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that ¥ ki51 uCV Pi 2 m (P)u is minimized. Min–Max Absolute Deviation Cost Path (MMDCP) Instance: As in MACP on previous page. Requirement: Find a path P from v s to v d in G such that max{uCV P1 2 m (P)u, uCV P2 2 m (P)u, . . . , uCV Pk 2 m (P)u} is minimized. It can be seen that problem MACP defined on previous page is simply the problem of computing a shortest path from vs to vd, where each edge is assigned a weight equal to the sum of the k components of the cost vector associated with that edge. As a consequence, this version of the path problem is efficiently solvable [3]. Therefore, the other problems defined above will be the primary focus of Sections 3 and 4 of this paper. 2.3. Feasibility Problems We formulate the path, edge, node, and zone feasibility problems as decision problems using the format suggested in [5]. Path Feasibility (PF) Instance: Undirected graph G(V, E); the number k of zones; for each e i in E, a cost vector (c i 1, c i 2, . . . , c i k); a path P 5 ^v i 1, v i 2, . . . , v i t&. Question: Is P a feasible path from v i 1 to v i t? Edge Feasibility (EF) Instance: Undirected graph G(V, E); the number k of zones; for each e i in E, a cost vector (c i 1, c i 2, . . . , c i k); an edge e 5 {v i 1, v i 2}; a pair of nodes u, v [ V. Question: Is e a feasible edge with respect to u, v? When u and v are the end points of the edge e, the resulting version of EF is referred to as the Inherent Edge Feasibility (IEF) problem. Node Feasibility (NF) Instance: Undirected graph G(V, E); the number k of zones; for each e i in E, a cost vector (c i 1, c i 2, . . . , c i k); a pair of nodes u, v [ V; a node w distinct from u and v. Question: Is w a feasible node with respect to u, v? Zone Feasibility (ZF) Instance: Undirected graph G(V, E); the number k of zones; for each e i in E, a cost vector (c i 1, c i 2, . . . , c i k); a zone ]; a pair of nodes u, v [ V. Question: Is ] a feasible zone with respect to u, v? Detailed analyses of these feasibility problems are provided in Section 5. 3. COMPARATIVE ANALYSES OF ROUTING PROBLEMS 3.1. Overview In this section, we carry out comparative analyses of the routing problems formulated in Section 2.2. This section consists of two parts: In the first part, we consider the effect of feasibility on optimal solutions. In the second part, we study the extent to which an optimal solution to one routing problem can be used as an approximate solution to another routing problem. 3.2. Optimal Solutions and Feasibility (a) MACP Problem: For any instance of the MACP problem, all optimal solutions are feasible, as shown by the following proposition: Proposition 3.1. For any instance of the MACP problem, every optimal solution is feasible. Proof. Let P be an optimal but infeasible solution to the MACP problem instance. Since P is infeasible, there must PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS be a path P9 which strictly dominates P. Then, the sum of the components of the cost vector associated with P9 must be strictly less than the corresponding sum for P. This contradicts the optimality of P. ■ (b) MMCP Problem: For the MMCP problem, optimal solutions may be either feasible or infeasible. For example, consider a graph with two zones: Let u and v be two vertices in the graph with exactly two paths P 1 and P 2 between them. Let the cost vectors associated with P 1 and P 2 be (c, 1) and (c, 0), respectively, where c $ 1. Note that both P 1 and P 2 are optimal solutions; however, P 2 strictly dominates P 1 , so P 1 is infeasible. The following proposition points out a feature of the MMCP problem: Proposition 3.2. For any instance of the MMCP problem, there is always an optimal solution which is feasible. Proof. Let 3 5 {P 1 , P 2 , . . . , P t } be the set of all optimal solutions to the MMCP instance. Define the following binary relation R on 3: ^P i , P j & [ R if and only if P j strictly dominates P i . Since strict domination is an antisymmetric and transitive binary relation, there must exist at least one maximal element in 3 that is not strictly dominated by any element in 3 [11]. Let path P i [ 3 be one such maximal element. Thus, P i is an optimal and feasible solution. ■ (c) BCP Problem: In the case of the BCP problem, it is easy to construct instances for which there is no solution (i.e., there is no balanced path). However, when balanced paths do exist, it can be seen that there is always one balanced path that is not strictly dominated by any other balanced path. Note, however, that a BCP solution may not necessarily be a feasible path. For example, consider a graph with two zones. Let u and v be two vertices in the graph with exactly two paths P 1 and P 2 between them. Let CV(P 1 ) 5 (2, 1) and CV(P 2 ) 5 (2, 2). Note that P 2 is the solution to the BCP problem instance; however, it is infeasible since it is strictly dominated by P 1 . (d) MICP, MVCP, MTDCP, and MMDCP Problems: In the case of the MICP problem, all optimal solutions may be infeasible. For example, consider a graph with two zones. Let u and v be two vertices in the graph with exactly two paths P 1 and P 2 between them. Let CV(P 1 ) 5 (c, 2) and CV(P 2 ) 5 (c, 1), where c . 2. Note that P 1 is the unique optimal solution to the MICP instance. However, P 2 is the only feasible path. This example also shows that for instances of the MVCP, MTDCP, and MMDCP problems all optimal solutions may be infeasible. 23 3.3. Comparison of Optimal Solutions In practice, it is often important to compare the optimal solutions to alternate problem formulations so as to gain an understanding of the inherent trade-offs. To keep the presentation brief, we carry out such a comparison using the MMCP, MICP, and MACP problems. Specifically, we carry out pairwise comparisons of these problems to study the suitability of an optimal solution to one problem as an approximate solution to the other problem. (a) Comparison of MMCP and MICP Problems: Let G(V, E) be a network and u and v be a pair of nodes in G. Recall that ^ uv is the set of all feasible paths between u and v. Note that a path between u and v which is optimal for the MICP problem need not be in ^ uv . However, when paths are restricted to be in ^ uv , a path that is optimal for the MICP problem is near-optimal for the MMCP problem, irrespective of the number of zones. Our next theorem establishes this result. In the following, we assume that there is no path from u to v with cost vector (0, 0, . . . , 0). This trivial case can be easily detected, and such a path is an optimal solution to all the above routing problems. Theorem 3.1. Given a graph G(V, E) and a pair of nodes, u, v [ V, let P and Q be paths in ^uv which have minimum values for the MICP and MMCP objective functions, respectively, among the paths in ^uv. Let Pmax and Qmax be the maximum values in the cost vectors associated with P and Q, respectively. Then, Pmax , 2Qmax. Further, a ratio Pmax/Qmax arbitrarily close to 2 is achievable. Proof. Let the number of zones be k and the cost vectors associated with paths P and Q be ( p 1 , p 2 , . . . , p k ) and (q 1 , q 2 , . . . , q k ), respectively. Further, let P min and Q min denote the minimum values in the cost vector associated with P and Q, respectively. Since P is an optimal solution to the MICP instance among the paths in ^uv, we have, Pmax 2 Pmin # Qmax 2 Qmin. In other words, Pmax # (Qmax 2 Qmin) 1 Pmin. Since Qmin $ 0, we have Pmax # Qmax 1 Pmin. Now, there are three cases to consider: (i) Suppose that P min # Q max . Then, from the preceding inequality, P max # 2Q max . (ii) Suppose that P min . Q max . Then, p i . q j for all 1 # i, j # k, and so Q strictly dominates P. This contradicts the assumption that P is in ^ uv , and, hence, this case cannot arise. (iii) Suppose that P min 5 Q max . Then, p i $ q j for all 1 # i, j # k. In particular, p i $ q i for all 1 # i # k. If there is an i such that p i . q i , then Q strictly dominates P, a contradiction. So, p i 5 q i for all 1 # i # k. Consequently, P max 5 Q max . Since Q max . 0, we have P max , 2Q max . 24 TAYI, ROSENKRANTZ, AND RAVI This completes the proof of the upper bound. To show that the bound of 2 is achievable, consider a graph G(V, E) with k 5 2 zones. Let u and v be two nodes in G such that there are only two paths P and Q between them. Let CV(P) 5 (c 2 e , 2c 2 2 e ) and CV(Q) 5 (c, 0), where c is an integer . 1 and e is an arbitrarily small positive quantity. Note that paths P and Q are in ^ uv . Further, P is an optimal path for the MICP problem with P max 5 2c 2 2 e . Q is an optimal path for the MMCP problem with Q max 5 c. By choosing an appropriate value for e, the ratio P max /Q max 5 2(c 2 e )/c can be made arbitrarily close to 2. ■ In contrast to Theorem 3.1, the following example points out that even when paths are restricted to be in ^ uv an optimal solution to the MMCP problem need not be a near-optimal solution to the MICP problem, thus highlighting the asymmetric nature of these two problems. Example 2. Consider a graph G(V, E) with two zones. Let u and v be two nodes in G such that there are only two paths P and Q between them. Let CV(P) 5 (c 2 e, c 1 e) and CV(Q) 5 (c, 0), where c is an integer . 1 and e is an arbitrarily small positive quantity. Note that paths P and Q are in ^uv. Q is the unique optimal solution to the MMCP problem with maximum cost equal to c and maximum imbalance also equal to c. On the other hand, P is the unique optimal solution to the MICP problem with maximum imbalance equal to 2e and maximum cost equal to c 1 e. By making c arbitrarily large, Q becomes arbitrarily suboptimal with respect to the minimum imbalance objective. (b) Comparison of MICP and MACP Problems: In the preceding comparison, the results were independent of the number of zones. However, the number of zones plays a role in the comparison between MICP and MACP problems as indicated in the following theorem: (i) Suppose that P min 5 Q max . Then, for all i, 1 # i # k, p i $ P min 5 Q max $ q i . Since P is in ^ uv , p i 5 q i for all i, 1 # i # k. In other words, the cost vectors associated with P and Q are identical and, therefore, P avg 5 Q avg . Therefore, P avg /Q avg 5 1 , 2k 2 1, since k $ 2. (ii) Suppose that Pmin , Qmax. Since P is an optimal solution to the MICP problem, Pmax 2 Pmin # Qmax 2 Qmin # Qmax. So, Pmax # Qmax 1 Pmin , 2Qmax. Note that ¥ki51 qi $ Qmax. Therefore, Qavg $ Qmax/k. Also, ¥ki51 pi # Pmin 1 (k 2 1) Pmax. Using the facts Pmin , Qmax and Pmax , 2Qmax, we have ¥ki51 pi , (2k 2 1) Qmax. In other words, Pavg , (2k 2 1)/kQmax. Therefore, Pavg/Qavg , 2k 2 1. This completes the proof of the upper bound. To show that the bound of 2k 2 1 is asymptotically achievable, consider a graph G(V, E) with k $ 2 zones. Let u and v be two nodes in G such that there are only two paths P and Q between them. Let CV(P) 5 (c 2 e , 2c 2 2 e , 2c 2 2 e , . . . , 2c 2 2 e ) and CV(Q) 5 (c, 0, 0, . . . , 0), where c is an integer . 1 and e is an arbitrarily small positive quantity. Note that paths P and Q are in ^ uv . Further, P is an optimal path for the MICP problem with P avg 5 (2k 2 1)/k (c 2 e ). Q is an optimal path for the MACP problem with Q avg 5 c/k. Thus, by choosing an appropriate value for e, the ratio P avg /Q avg 5 (2k 2 1)(c 2 e )/c can be made arbitrarily close to 2k 2 1. ■ In contrast to Theorem 3.2, the graph considered in Example 2 points out that even when paths are restricted to be in ^uv an optimal solution to the MACP problem (namely, path Q with Qavg 5 c/2 and imbalance 5 c) need not be a nearoptimal solution to the MICP problem, since path P (with Pavg 5 c) has an imbalance of only 2e. (c) Comparison of MMCP and MACP Problems: The comparison between MMCP and MACP problems also depends on the number of zones, as shown in the following theorem: Theorem 3.2. Let G(V, E) be a graph with k $ 2 zones and let u and v be a pair of nodes in V. Let P and Q be paths in ^uv that have minimum values for the MICP and MACP objective functions, respectively, among the paths in ^uv. Let CV(P) 5 (p1, p2, . . . , pk) and CV(Q) 5 (q1, q2, . . . , qk). Let Pavg 5 (¥ki51 pi)/k and Qavg 5 (¥ki51 qi)/k be the average values of the cost vectors associated with P and Q, respectively. Then, Pavg , (2k 2 1) Qavg. Further, a ratio Pavg/ Qavg arbitrarily close to 2k 2 1 is achievable. Theorem 3.3. Let G(V, E) be a graph with k $ 2 zones and let u and v be a pair of nodes in V. Let P and Q be paths in ^uv that have minimum values for the MMCP and MACP objective functions, respectively, among the paths in ^uv. Let Pavg and Qavg (Pmax and Qmax) be the average (maximum) values of the cost vectors associated with P and Q, respectively. Then, Proof. Let Pmin and Qmin (Pmax and Qmax) denote the minimum (maximum) values in the cost vector associated with P and Q, respectively. Since P is in ^uv, Pmin # Qmax. Now, there are two cases to consider: 1. Pavg # kQavg. Further, a ratio Pavg/Qavg arbitrarily close to k is achievable. 2. Qmax # kPmax. Further, a ratio of Qmax/Pmax arbitrarily close to k is achievable. PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS 25 TABLE I. Comparison of MMCP, MICP, and MACP problems Objective Value Problem Path and Risk Vector MMCP MICP MACP MMCP MICP MACP P: (7c/8, c/4) Q: (13c/16, 15c/16) R: (c, 0) 7c/8 15c/16 c 5c/8 c/8 c 9c/16 7c/8 c/2 Proof. For both parts of the theorem, let the cost vectors associated with P and Q be ( p 1 , p 2 , . . . p k ) and (q 1 , q 2 , . . . , q k ), respectively. PART 1. Obviously, P avg # P max . Since P is an optimal solution to the MMCP problem, P max # Q max . Therefore, P avg # Q max # ¥ ki51 q i 5 kQ avg . To show that the bound k is asymptotically achievable, let the cost vectors associated with P and Q be (c 2 e , c 2 e , . . . , c 2 e ) and (c, 0, . . . , 0), respectively, where c . e and e is an arbitrarily small positive quantity. Then, P avg 5 c 2 e and Q avg 5 c/k. Thus, the ratio P avg /Q avg 5 k(c 2 e )/c can be made arbitrarily close to k. PART 2. Obviously, Q max # ¥ ki51 q i 5 kQ avg . Since Q is an optimal solution to the MACP problem, Q avg # P avg . Therefore, Q max # kQ avg # kP avg # kP max . To show that the bound k is asymptotically achievable, let the cost vectors associated with P and Q be ((c/k) 1 e , (c/k) 1 e , . . . , (c/k) 1 e ) and (c, 0, . . . , 0), respectively, where c . 0 and e is an arbitrarily small positive quantity. Then, P max 5 (c/k) 1 e and Q max 5 c. Thus, the ratio Q max /P max 5 c/((c/k) 1 e ), which can be made arbitrarily close to k. ■ Remarks. We note that, in general, even when the paths are required to be feasible (i.e., in ^ uv ), it is possible that no single path is an optimal solution to all the three problems. We present an example to illustrate this behavior: Example 3. Consider a graph G(V, E) with two zones. Let u and v be two nodes in G such that there are only three paths P, Q, and R between them. Let the cost vectors associated with P, Q, and R be (7c/8, c/4), (13c/16, 15c/16), and (c, 0), respectively, where c . 0. Note that paths P, Q and R are in ^ uv . From Table I, it can be verified that P, Q, and R are the unique optimal solutions to the MMCP, MICP, and MACP problems, respectively. We end this section by pointing out a specific behavior of the MICP problem which is not exhibited by the other two problems. This behavior arises when paths containing cycles are considered. For both the MMCP and MACP prob- Fig. 2. Cyclic paths and MICP problem. lems, there is always an acyclic path which is optimal. This is because a path containing cycles can be transformed into a simple path (i.e., a path in which no node is visited more than once) by removing the cycles, and this causes no increase in either the maximum or the total cost value. On the other hand, the maximum imbalance value can be improved by considering cyclic paths, as shown by the following example: Example 4. Consider the graph G(V, E) with two zones shown in Figure 2. Let u and v be the pair of nodes which are specified in the problem instances for MICP and MMCP. Note that the path Q 5 ^u, a, v& is the optimal solution to both the MMCP and MACP problems with maximum cost 5 2 and average cost 5 1. This is a unique optimal solution to the MMCP problem irrespective of the types of paths considered. It can be seen that Q is the only path in ^ uv and, hence, Q is also the optimal solution to the MICP problem when paths are restricted to be in ^ uv . The maximum imbalance value associated with Q is 2. However, when the restriction on paths is relaxed to permit all acyclic paths, the optimal solution to the MICP problem is P 5 ^u, a, b, v& with the maximum imbalance value being 1. When there is no constraint on the type of paths, an optimal solution to the MICP problem is P9 5 ^u, a, b, a, v& with a maximum imbalance value of zero. Note that P9 includes the cycle ^a, b, a&. 4. ROUTING PROBLEMS 4.1. Overview We first present complexity results for the routing problems defined in Section 2.2. Among these problems, the MMCP problem is known to be NP-complete [5] even when the number of zones is fixed at 2. Hassin [8] considered a variation of the MMCP problem for two zones and presented both pseudopolynomial algorithms and polynomial 26 TAYI, ROSENKRANTZ, AND RAVI Fig. 3. Reduction from PARTITION to routing problems. time approximation schemes (PTASs). However, when the number of zones is a variable (i.e., part of the problem instance as in the problem formulations presented in Section 2.2), we show that the MMCP, BCP, MICP, MVCP, MTDCP, and MMDCP problems are strongly NP-hard, that is, there are no pseudopolynomial algorithms for any of these problems unless P 5 NP. We also observe that for the MICP, MVCP, MTDCP, and MMDCP problems there are no polynomial time algorithms that approximate the optimal value to within any constant factor. For the MMCP problem, we show that there is no polynomial algorithm that approximates the optimal value to within any factor less than 2. The formulations in Section 2.2 correspond to the optimization versions of the routing problems. In developing complexity results, we assume that the routing problems are appropriately transformed into their respective decision versions. It can be seen that all these decision problems are in NP since we can guess a path from the source to the destination and verify that the cost vector associated with the path satisfies the requirement in each case. So, the complexity results presented in Sections 4.2 and 4.3 address only the NP-hardness aspect of these problems. 4.2. Fixed Number of Zones When there are only two zones, the MMCP problem is NP-hard since it corresponds to the SHORTEST WEIGHT-CONSTRAINED PATH problem as defined in [5, p. 214]. We first derive the NP-completeness of the other routing problems when the number of zones is fixed. These results use a reduction from the PARTITION problem, which is known to be NP-complete [5] and is formally defined below: Proof. We present the details of the reduction for the BCP problem and indicate how the same construction also establishes the NP-hardness of the other problems. The reduction is from the PARTITION problem. Given an instance of PARTITION, we construct the network with two zones shown in Figure 3. In that figure, nodes v 1 , v 2 , . . . , v n11 are referred to as the main nodes while the others are referred to as intermediate nodes. Let s 5 ¥ ni51 x i . Without loss of generality, we assume that s is even; otherwise, the PARTITION instance will not have a solution. Note that any path from v1 to v n11 must traverse all the main nodes in the order v 1 , v 2 , . . . , v n11 , and for any such path, the total of the two components of the cost vector is equal to s. Using this fact, it can be verified that there is a balanced path from v1 to v n11 if and only if the PARTITION instance has a solution. As can be observed from the definitions in Section 2.2, the BCP problem is a special case of the MICP problem where the maximum imbalance is zero. Therefore, the NPhardness of BCP implies the NP-hardness of MICP as well. The preceding construction can also be used to verify that there is a path from v1 to v n11 with variance zero if and only if the PARTITION problem has a solution. Therefore, the MVCP problem is also NP-hard. The NP-hardness results for the MTDCP and MMDCP problems can be established in the same manner as for the MVCP problem. ■ 4.3. Variable Number of Zones We now consider the routing problems with a variable number of zones and show that they are strongly NPcomplete. To do so, we utilize reductions from the EXACT COVER BY 3-SETS (X3C) problem [5] defined below: PARTITION Instance: A set S 5 { x 1 , x 2 , . . . , x n } of nonnegative integers. Question: Can S be partitioned into two subsets S 1 and S 2 such that ¥ x[S 1 x 5 ¥ x[S 2 x? EXACT COVER BY 3-SETS (X3C) Instance: A finite set X with uXu 5 3q and a collection T of 3-element subsets of X. Question: Does T contain a subcollection T9 such that the sets in T9 are pairwise disjoint and the union of all the sets in T9 is equal to X? Proposition 4.1. The problems BCP, MICP, MVCP, MTDCP, and MMDCP are NP-complete even when there are only two zones. Proposition 4.2. The problems of MMCP, BCP, MICP, MVCP, MTDCP, and MMDCP are strongly NP-complete when the number of zones is variable. PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS 27 Fig. 4. Reduction from X3C to routing problems. Proof. We sketch the reduction from the X3C problem to the MMCP problem. Minor modifications of this reduction establish the NP-hardness of the other problems. Consider an instance of X3C defined by the set X 5 {t 1 , t 2 , . . . , t 3 q} and the collection of sets T 5 {T 1 , T 2 , . . . , T n }. We create an instance of MMCP with 3q zones as shown in Figure 4. Nodes v1 through v q11 are referred to as the main nodes while the others are referred to as intermediate nodes. The intermediate nodes between main nodes v1 and v i11 are denoted by v i1 , v i2 , . . . , v in , 1 # i # q. The cost vectors associated with the edges are as follows: For each edge of the form {v ij , v i11 }, all 3q components of the cost vector are zero (1 # i # q, 1 # j # n). Let T j 5 {t j1 , t j2 , t j3 }. Then, the cost vector associated with {v i , v ij } will have the value 1 for zones j 1 , j 2 , and j 3 , and the value zero for the remaining 3q 2 3 zones. It can be verified that there is a path P from v1 to v q11 with max{CV P1 , CV P2 , . . . , CV P3q } 5 1 if and only if the X3C instance has a solution. ■ It can be seen from Figures 3 and 4 that the networks constructed in the NP-hardness proofs above are two terminal series-parallel [4]. Thus, the routing problems remain NP-complete even for such simple networks. 4.4. Remarks on Obtaining Approximate Solutions All the routing problems, except the BCP problem, are optimization problems. A natural question that arises at this point is whether there exist efficient approximation algorithms (heuristics) that produce near-optimal solutions. By examining the proof of Proposition 4.1, it can be seen that for the MICP, MVCP, MTDCP, and MMDCP problems determining whether the optimal solution value is equal to zero is NP-hard. So, for these problems, obtaining a nearoptimal solution that is within a multiplicative factor of the optimal value is also NP-hard. Unlike the MVCP, MTDCP, and MMDCP problems, we can efficiently determine whether the optimal solution value is zero for any MMCP problem instance, irrespective of the number of zones. This can be accomplished as follows: Given the network G(V, E) and a pair of nodes u, v, we construct a network G9 by retaining from G only those edges whose cost vectors are zero in every component. It can be verified that the optimal solution value to the given MMCP problem instance is zero if and only if there is a path in G9 between u and v. In view of this, establishing nonapproximability results for the MMCP problem requires a different approach. The following theorem shows that approximating the solution to the MMCP problem within a factor less than 2 is also strongly NP-hard. Theorem 4.1. Unless P 5 NP, there is no polynomial time approximation algorithm with a performance guarantee of 2 2 e, for any e . 0, for the MMCP problem with variable number of zones. Proof. Suppose there is an approximation algorithm ! with a performance guarantee of 2 2 e, for some e . 0, for the MMCP problem. We show that ! can be used to solve arbitrary instances of the X3C problem in polynomial time, contradicting the hypothesis that P Þ NP. We use the reduction from X3C to MMCP described in the proof of Proposition 4.2 (Fig. 4). In that construction, for any path P9 from v1 to v q11 with cost vector (c 1 , c 2 , . . . , c 3q ), ¥ 3q i51 c i 5 3q. Therefore, for any path from v1 to v q11 , the cost vector will have at least one component whose value is $1. Using this fact, it can be verified that when ! is executed on the resulting instance of the MMCP problem it returns a solution value strictly less than 2 if and only if the X3C instance has a solution. ■ 5. ANALYSIS OF FEASIBILITY PROBLEMS 5.1. Overview In this section, we address the feasibility problems for both fixed and variable number of zones. For the fixed case, we provide weak NP-hardness results (Section 5.2.1) and then develop pseudo-polynomial algorithms (Section 5.2.2). For 28 TAYI, ROSENKRANTZ, AND RAVI Fig. 5. Details of the Zone Feasibility algorithm for two zones. the case of variable number of zones, we present strong NP-hardness results (Section 5.3). 5.2. Fixed Number of Zones 5.2.1. Complexity Results Proposition 5.1. Problems PF, IEF, EF, and NF are NPhard when there are two or more zones. Problem ZF is NP-hard when there are three or more zones. Proof. We first consider the PF problem. The reduction from PARTITION to PF is identical to that given in the proof of Proposition 4.1, except that the edge {v 1 , v n11 }, with a cost vector ((s 1 1)/2, (s 1 1)/2), is also added. Now, it can be verified that path P consisting of the single edge {v 1 , v n11 } is infeasible if and only if the PARTITION instance has a solution. This establishes the NP-hardness of PF. The path P constructed above is a single edge {v 1 , v n11 }. This edge is inherently infeasible if and only if the given instance of PARTITION has a solution, thus establishing the NP-hardness of IEF. The NP-hardness of EF follows by restriction. To prove the NP-hardness of NF, a node w is added to the edge {v 1 , v n11 }. The cost vectors associated with edges {v 1 , w} and {w, v n11 } are ((s 1 1)/2, (s 1 1)/2) and (0, 0), respectively. Now, node w is infeasible with respect to the pair of nodes v 1 , v n11 if and only if the given instance of PARTITION has a solution. To show the NP-hardness of ZF for three zones, we again use the reduction for PF, except that each edge will have a cost vector with three components corresponding to the three zones z 1 , z 2 , and z 3 . For each edge other than {v 1 , v n11 }, the first two components of the cost vector remain the same while the third component is zero. For {v 1 , v n11 }, the cost vector is ((s 1 1)/2, (s 1 1)/2, 1). It can be verified that zone z 3 is infeasible with respect to the pair of nodes v 1 , v n11 if and only if the PARTITION instance has a solution. ■ Zone Feasibility Algorithm for Two Zones: In contrast to the other feasibility problems discussed earlier, ZF has a polynomial time algorithm when there are only two zones. This algorithm is shown in Figure 5. Given the network G, the goal is to determine the feasibility of zone z 1 with respect to the pair of nodes u and v. The algorithm essentially involves the computation of two shortest path lengths, namely, L 0 in Step 2 and L t in Step 4. L 0 is the minimum value such that there exists a feasible path P between u and v in G9 (and, therefore, also in G) with cost vector (0, L 0 ). L t is the minimum value such that there exists a feasible path P9 between u and v in G0 (and also in G) with cost vector of the form ( x, L t ). We now formally prove the correctness of the algorithm: Theorem 5.1. Given a network G(V, E) with two zones z1, z2 and a pair of nodes u, v, Algorithm ZF:2-Zones correctly determines if z1 is feasible with respect to u, v. Proof. Consider the lengths L 0 and L t computed by the algorithm. We observe that L t # L 0 , since all the edges in G9 are also in G0. If L t , L 0 , then there exists a feasible path between u and v with a cost vector of the form ( x, L t ), where x is strictly positive. In other words, z 1 is feasible. If L t 5 L 0 , the first component of the cost vector associated with any feasible path between u and v is zero, and, hence, z 1 is infeasible. ■ It is well known that shortest path computations with scalar edge weights can be carried out efficiently [3]. Therefore, Algorithm ZF:2-Zones runs in polynomial time. PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS 5.2.2. Pseudo-Polynomial Algorithms for Efficient Set Construction The results obtained in the preceding section indicate the computational intractability of the feasibility problems even for a fixed number of zones. To cope with this intractability, practitioners explore alternative solution approaches. One such approach is the design of pseudo-polynomial algorithms. These algorithms entail running times that are polynomial functions of the size of the problem instance and the maximum value that occurs in the input. Therefore, they are viable only when the maximum input value is relatively small. For ease of exposition, the pseudo-polynomial algorithms for the feasibility problems are presented assuming that the underlying network is a directed graph (digraph) and that all components of any cost vector are integers. These two requirements can be met by transforming an undirected graph into a digraph (by replacing each edge with a pair of oppositely directed edges whose cost vectors are the same as that of the undirected edge) and by appropriate scaling. Recall from Section 2 that the set of all feasible paths between the pair of nodes u and v is denoted by ^ uv . In general, the number of feasible paths between u and v (i.e., u^ uv u) may be exponential in the size of the problem instance. Further, many of these paths may have the same cost vector. Hence, the number of distinct cost vectors associated with all feasible paths may be considerably smaller. So, such a set of distinct cost vectors, denoted by % uv , has been referred to as an efficient set [13]. The feasibility algorithms presented in this section are based on the computation of these efficient sets. These algorithms run in pseudopolynomial time because for any pair of nodes u and v the cardinality of the efficient set (i.e., u% uv u) can be bounded by a pseudo-polynomial function. The following lemma can be used to establish this bound: Lemma 5.2. Let G(V, A) be a directed graph where each edge has an associated k-component cost vector. For zone zi and node x, let si,x denote the maximum value of the ith component of the cost vectors associated with the outgoing edges from x. If there are no outgoing edges from x, then let si,x 5 0. Let 5i 5 ¥x[V si,x. For any pair of nodes u and v in G and for any cost vector 9 5 (c1, c2, . . . , ck) in %uv , ci # 5i, 1 # i # k. Proof. The proof is by contradiction. Suppose there is a vector 9 5 (c 1 , c 2 , . . . , c k ) in % uv with c i . 5 i for some i (1 # i # k). Let P be a path whose cost vector is 9. By the definition of 5 i , some node must appear more than once in P, that is, P must contain one or more cycles. Let P9 be the path obtained from P by eliminating all the cycles and let 99 5 (c91, c92, . . . , c9k) be the cost vector associated with P9. Since each node occurs at most once in P9, c9j # 5j, 1 # j # k. 29 Further, the nonnegativity of the component values implies that c9j # cj for 1 # j # k, and c9i is strictly less than ci. In other words, 99 strictly dominates 9. This is a contradiction, since %uv contains only cost vectors corresponding to feasible paths. ■ Let 5 5 ) k21 i51 (5 i 1 1). Since the value of the ith component of any vector in the efficient set % uv can vary from 0 to 5 i , the number of distinct combinations of the first k 2 1 component values is 5. Now, for each combination of the cost values for the first k 2 1 components, the last (kth) component can take on only one value, because the vectors in % uv correspond to feasible paths. Therefore, the cardinality of % uv is at most 5, that is, the size of any efficient set is only pseudopolynomial. Warburton [13] presented pseudopolynomial algorithms for computing efficient sets when the underlying directed graph is acyclic. For such graphs, his algorithm permits both negative and positive edge weights. The running time of his algorithm is O(n 2 V max log n) for an acyclic graph with n nodes, when the number of zones k 5 2 and the maximum cardinality of the efficient set for any node is V max . When the number of zones k is $3, Warburton used an algorithm due to Kung et al. [10] to achieve a running time of O(kn 2 V max (log V max ) k22 ). Our algorithm described below is applicable to graphs with cycles, provided there are no negative edge weights. We also discuss an implementation of the algorithm with a running time that matches Warburton’s algorithm. For expository reasons, we first describe our algorithm for two zones and present a numerical example. The algorithm uses an approach similar to Dijkstra’s shortest path algorithm [3]. Next, we discuss the general version of the algorithm along with the implementation. Algorithm for Efficient Set Construction with Two Zones: Let the nodes of the digraph be numbered from 1 to n, where, for convenience, 1 represents nodes u and n represents node v. Thus, the goal of the algorithm is to construct the efficient set of vectors corresponding to all the paths from node 1 to node n. Without loss of generality, assume that 51 # 52, where 51 and 52 are as defined in Lemma 5.2. The algorithm maintains a two-dimensional array D with n rows and (51 1 1) columns. Each row represents a node of the digraph, and each column represents a cost value in the range 0 to 51. Each cell D[i, r ] of the array contains a cost value as defined below: D@i, r # 5 min$cu There is a path from node 1 to node i with cost vector ~r, c!}. Given this definition, the computed cost values in the ith row are D[i, r ], where 0 # r # 51. These D[i, r ] values enable construction of the efficient set of vectors corre- 30 TAYI, ROSENKRANTZ, AND RAVI Fig. 6. Example for efficient set construction. sponding to all the paths from node 1 to node i. More precisely, the efficient set % 1i is the maximal† subset of C 1i 5 {(0, D[i, 0]), (1, D[i, 1]), . . . , (5 1 , D[i, 5 1 ])} such that no vector in % 1i is strictly dominated by a vector in C 1 . Note that some of the D[i, r ] values may be ` because there may not exist a path from node 1 to node i with a cost vector whose first component is r. Pairs whose second components are ` will not be included in the efficient set. The D[i, r ] values can be computed using dynamic programming. For convenience, the computations are performed in three phases. Initially, all the entries in the array D are set to `: Phase 1 (Initialization of Column 0): Here, values for D[i, 0], 1 # i # n, are computed. A digraph G9 is constructed from the given digraph G by retaining only those directed edges whose cost vectors are of the form (0, y). The length of each directed edge e in G9 is set to the value of the second component of the cost vector associated with e in G. Then, the value of D[i, 0] is simply the length of a shortest path from node 1 to node i in G9. These values can be computed using any standard algorithm for the single-source shortest path problem [3]. Phase 2 (Preliminary Values for Column r 1 1): Assuming that values D[i, s], 1 # i # n and 0 # s # r , are available, preliminary values of D[i, r 1 1], 1 # i # n, are computed as follows: Consider node i (1 # i # n). Let i 1 , i 2 , . . . , i t be the nodes such that G contains a directed edge (i , , i) with cost vector (a,, b,), where a , . 0, 1 # , # t. Now, D@i, r 1 1# 5 min $D@i ,, r 1 1 2 a ,# 1 b ,%. (5.2) 1#,#t The condition a, . 0 is necessary in computing the pre† By “maximal” we mean that no proper superset of % 1i has the stated property. liminary value of D[i, r 1 1] since only cost values D[i, s] where 0 # s # r are available. The case when a, 5 0 is considered in Phase 3. Also, in computing the preliminary value of D[i, r 1 1] from Eq. (5.2), the quantity D[i , , r 1 1 2 a , ] is considered only if r 1 1 2 a, $ 0. Phase 3 (Final Values for Column r 1 1): At the beginning of this phase, all nodes are deemed unconfirmed. The following steps are repeated until every node is confirmed or the values in column r 1 1 corresponding to unconfirmed nodes are all `: Find a node q such that D[q, r 1 1] is the minimum value among the entries in column r 1 1 corresponding to unconfirmed nodes. (Ties are broken arbitrarily.) Node q is labeled confirmed since it can be verified that among all paths from node 1 to node q with cost vectors of the form ( r 1 1, y) the smallest value for y is D[q, r 1 1]. Having confirmed q, for any node v to which there is a directed edge (q, v) with cost vector of the form (0, y), the corresponding D[v, r 1 1] is changed to min{D[v, r 1 1], D[q, r 1 1] 1 y}. Given the final values for D[i, s], 1 # i # n and 0 # s # r , Phases 2 and 3 together compute the final values for D[i, r 1 1], 1 # i # n. Therefore, by repeating these two phases for each value of r in the range 0 to 51 2 1, all the values in the D array can be computed. We now present a numerical example to illustrate the computations in each phase of the two-zone algorithm. Example 5. Consider the five-node digraph G in Figure 6(a). The goal is to construct the efficient set %1,5. Using the notation of Lemma 5.2, 51 5 4 in this example. Therefore, the D array has five columns (corresponding to cost values 0, 1, 2, 3, and 4). Since there are five nodes in the graph, the D array has five rows. Phase 1: The digraph G9 produced by Phase 1 of the algorithm is shown in Figure 6(b). From an examination of G9, it can be seen that D[1, 0] 5 0, D[2, 0] 5 16, D[3, 0] 5 1, D[4, 0] 5 1, and D[5, 0] 5 15. PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS Fig. 7. Entries of the D array. In the following, we give the details of the computations for Phases 2 and 3 for r 5 1. Recall that in these two phases the computations use the digraph G rather than G9. Phase 2: For node 1, the only incoming edge is from node 2. However, the first component of the cost vector associated with that edge is zero, and D[1, 1] remains `. For node 2, the only incoming edge such that the first component of the cost vector associated with that edge is nonzero is from node 3. Therefore, D[2, 1] 5 D[3, 0] 1 1 5 2. Proceeding in a similar fashion, it can be verified that the preliminary values for the column associated with r 5 1 are D[3, 1] 5 `, D[4, 1] 5 `, and D[5, 1] 5 D[3, 1 2 1] 1 2 5 D[3, 0] 1 2 5 3. Phase 3: In examining the column associated with r 5 1, the minimum value of 2 is achieved by node 2, and, hence, node 2 is confirmed. For node 2, the only outgoing edge with a cost vector of the form (0, y) is to node 1. Therefore, D[1, 1] is updated to min{D[1, 1] 5 `, D[2, 1] 1 1} 5 3. Next, node 5 is confirmed. The only outgoing edge from node 5 with a cost vector of the form (0, y) is to node 2, which has already been confirmed. Hence, no update occurs. Proceeding similarly, we confirm node 1, and this updates both D[3, 1] and D[4, 1] to 4. Finally, nodes 3 and 4 are confirmed, requiring no further updates since all other nodes have already been confirmed. The final values in the column associated with r 5 1 are D[1, 1] 5 3, D[2, 1] 5 2, D[3, 1] 5 4, D[4, 1] 5 4, and D[5, 1] 5 3. When Phases 2 and 3 are repeated for the remaining r values, we obtain the entries for the D array as shown in Figure 7. Examining the fifth row of this array, the efficient set %1,5 5 {(0, 15), (1, 3), (3, 2)}. Generalization and Efficient Implementation: It can be seen that the above algorithm runs in pseudopolynomial time since the number of entries in the D array is pseudopolynomial and only polynomial time is used to compute 31 each entry. When each cost vector has k $ 3 components, a straightforward generalization of the algorithm can be obtained using an array of size n ) k21 i51 (5 i 1 1), where n is the number of nodes in the graph G(V, A) and the values of 5 i , 1 # i # k 2 1, are as defined in Lemma 5.2. Such an array-based implementation provides for all possible cost vectors. However, many of these vectors may not be part of any efficient set, especially when the graph is sparse. We now discuss an alternative implementation that stores a k-component cost vector only when it is confirmed to be part of an efficient set. The running time of this implementation matches the running time of Warburton’s algorithm for acyclic graphs [13]. To make the general version of the algorithm and its implementation easier to follow, we divide the discussion into three parts: The first part provides a static description of some quantities that characterize the efficient sets of all the nodes. The second part, which provides an operational view of the algorithm, explains how several data structures can be used to efficiently compute the quantities defined in the first part. The third part establishes the running time of our implementation. (a) A Static View of the Computation: For a k-component vector 9 5 ( a 1 , a 2 , . . . , a k21 , a k ), the projection of 9, denoted by P(9), is defined to be the (k 2 1)component vector ( a 1 , a 2 , . . . , a k21 ). Recall that for an edge e of the given graph G(V, A), CV(e) represents the cost vector associated with e. An edge e is strong if P(CV(e)) is not all zero and weak otherwise. Given two r-component vectors - 5 ( x 1 , x 2 , . . . , x r ), and = 5 ( y 1 , y 2 , . . . , y r ), the lexicographic order, denoted by ,,, is as follows: - ,, = if and only if there is an i, 1 # i # r, such that x i , y i and for all j, 1 # j # i 2 1, x j 5 y j . Note that ,, is a total order on r-component vectors. For two r-component vectors - and =, we use #, = to mean that - ,, = or - 5 =. For a given (k 2 1)-component vector 0 and a node v, define E-list0(v) as the lexicographically sorted (according to ,,) list of k-component vectors = such that = [ %1v (i.e., = is in the efficient set for v) and P(=) #, 0. For a strong edge e 5 (u, v) and a (k 2 1)-component vector 0 5 ( a 1 , a 2 , . . . , a k21 ), define Critical0 (e) as the lexicographically first entry 9 in E-list0(v) such that 0 , , P(9 1 CV(e)), and NULL if no such 9 exists. For a (k 2 1)-component vector 0, define M-set0 as follows: M-set0 5 ø $Critical0~e! e[G~V, A! 1 CV~e!ue is a strong edge and Critical0~e! Þ NULL%. From the above definitions, it can be seen that when M-set0 is empty, then for each node v, E-list0(v) gives the 32 TAYI, ROSENKRANTZ, AND RAVI vectors in %1v in lexicographic order. The array-based implementation given earlier in this section can be envisioned as finding efficient vectors by considering all possible (k 2 1)-component projections, in the lexicographic order given by ,,. The implementation to be described next also searches for efficient vectors in the lexicographic order of projections, but only considers possibilities that can be obtained by extending a known efficient vector by adding one edge. At any stage of the algorithm, all efficient vectors whose projections are #, some particular 0 have been found, and the lexicographically next extension is to be explored. (b) Data Structures and an Operational View of the Algorithm: We now discuss suitable data structures for the items E-list0(v), Critical0 (e), and M-set0 defined above and how these items can be computed efficiently. Suppose that the algorithm has reached the stage where it has found all efficient vectors whose projections are #, 0, for some (k 2 1)-component vector 0. First, consider E-list0(v). For any (k 2 1)-component vector 09 with 0 ,, 09, it can be seen that for any node v, E-list0(v) is a prefix of E-list09(v). Therefore, we use a linked list, denoted by E-list(v), for each node v to implement E-list0(v). We use a heap, called M-heap, to implement M-set0. Each entry stored in this heap is a record, called M-record, with two fields: The key field stores a (k 2 1)-component vector and the data field stores the identifier of an edge (an ordered pair of integers representing the head and tail nodes of the edge). For each strong edge e 5 (u, v) such that Critical0 (e) is not NULL, the M-heap contains an M-record with II(Critical0 (e) 1 CV(e)) as the key and (u, v) as the data. The operations performed on the M-heap are insertion of a record, and extraction of a record with the lowest key value under the total order ,,. At any time during the execution of the algorithm, the M-heap contains at most one record per strong edge. Thus, the number of records in the M-heap is O(n 2 ). As a consequence, each operation on the M-heap can be performed in O(k log n) time using standard techniques [3]. For any strong edge e 5 (u, v), the quantity Critical0 (e), called the current critical entry for e, is implemented as a pointer to an appropriate vector in Elist(u). During the execution of the algorithm, the critical entries of strong edges get updated. At certain steps in the algorithm, the critical entry for a given edge may be set to NULL. Whenever the critical entry for an edge is updated to a non-NULL value, the algorithm creates an M-record and inserts it into the M-heap. Recall that for each node v the vectors in E-list(v) are all nondominated. When a new k-component vector is generated, it is appended to the appropriate E-list, provided that it is not dominated by any of the current vectors in that E-list. The task of efficiently checking whether or not a new k-component vector is dominated by any of the current vectors can be carried out using a treelike data structure described by Willard and Lueker [15]. We refer to this data structure as a DC-tree (for Dominance Check tree). For each node v, a separate DC-tree, denoted by DC-tree(v), is used. The total order ,, used on the keys of M-records ensures that when a candidate vector for some E-list is generated the first component of the new vector is greater than or equal to the first components of all the vectors in that E-list. Therefore, the dominance check need not consider the first component. Thus, each element stored in a DC-tree is a (k 2 1)-component vector. For a DC-tree containing N such (k 2 1)-component vectors, each domination check followed by the insertion of the new vector (when it is not dominated) can be carried out in O((log N) k22 ) time [15]. The confirmation phase of the algorithm uses another heap, called C-heap, that stores at most one record (called C-record) for each node of the graph. Thus, the C-heap contains at most n records at any time. Each C-record contains three fields: The key field stores an integer value, the data field stores a node identifier, and a Boolean field (flag) that indicates whether or not the corresponding node has been processed (during the current confirmation phase). The C-heap supports insert and find-min operations, whose semantics are as follows: The insert operation, which is in effect an insert/modify operation, is specified using two integer parameters, namely, a node identifier and a cost. If the C-heap does not currently contain a record for the specified node, the insert operation adds a new C-record to the C-heap. This new record has the node identifier parameter as the data value, the cost parameter as the key value, and false as the flag value. If the C-heap currently contains a C-record for the specified node, then the cost parameter is used to update the key stored in that C-record provided that the cost parameter is less than the stored key value and the value of the flag is false. The find-min operation on the C-heap returns a C-record with the minimum key value among the C-records whose flag value is false and changes the flag value for this record to true. If there is no C-record with false as the flag value, the find-min operation returns NULL. Using standard techniques, the C-heap data structure can be implemented to perform each insert and find-min operation in O(log n) time. We now present the steps of the algorithm using the above data structures. The initialization phase of the algorithm is completed by considering the subgraph G9 containing only the weak edges of G. As discussed in the case of two zones, this corresponds to computing the weight of a shortest path in G9 from node 1 to the other nodes, where the weight of each edge is the last component of its cost vector in G. For each node v reachable from node 1 in G9, this computation provides the first vector in E-list(v). This vector is of the form (0, 0, . . . , 0, a), where a is the length of a shortest path from node 1 to node v in G9. The E-lists of nodes that are unreachable from node 1 in G9 remain empty. For each strong edge (u, v) in G, if E-list(u) is PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS nonempty, then the critical entry for that edge is set to the vector in E-list(u), and a corresponding M-record is inserted into the M-heap. For other edges, the critical entry is set to NULL. Further, for each node u, if E-list(u) is nonempty, a vector is inserted into DC-tree(u). This vector contains the last k 2 1 components of the vector in Elist(u). Next, the algorithm repeats the following steps until the M-heap becomes empty: (i) Extract all the M-records with the lowest key value from the M-heap, and initialize the C-heap to empty. (ii) For each extracted M-record, carry out a preliminary computation phase. (iii) Carry out a confirmation phase, updating E-lists and DC-trees appropriately when new efficient vectors are found. Details of Steps (ii) and (iii) are presented below. Note that the key values for all the extracted M-records in Step (i) are equal. Let ( g 1 , g 2 , . . . , g k21 ) denote this key value. For each extracted M-record, say -, the preliminary computation phase [Step (ii)] consists of the following: Suppose that - has (u, v) as the data value. Further, let the cost vector and the current critical entry of (u, v) be (c 1 , c 2 , . . . , c k21 , c k ) and ( a 1 , a 2 , . . . , a k21 , a k ), respectively. (Note that g i 5 a i 1 c i , 1 # i # k 2 1.) The critical entry for (u, v) is updated to the successor (if any) of the current critical entry in E-list(u). If the resulting critical entry is non-NULL, a new M-record corresponding to (u, v) is created and inserted into the M-heap. In any case, a C-record with v as the node identifier and a k 1 c k as the key value is inserted into the C-heap. After completing the preliminary computation phase for all the extracted M-records, the confirmation phase [Step (iii)] begins. In this phase, the following steps are repeated until a find-min operation on the C-heap returns NULL. Perform a find-min operation on the C-heap. Suppose that the node corresponding to the C-record returned by the find-min operation is x and the key value is b. This implies that there is a path with cost vector 99 5 ( g 1 , g 2 , . . . , g k21 , b ) from node 1 to node x. Vector 99 is a candidate efficient vector for node x. Using DC-tree( x), we check whether 99 is dominated by any other vector in % 1x . If it is not dominated, 99 is appended to E-list( x) and inserted into DC-tree( x). The strong edges leaving x are examined. For each strong edge ( x, w) for which the critical entry is NULL, the critical entry is updated to 99 and a corresponding M-record is inserted into the M-heap. Next, for each weak edge ( x, w) with cost vector (0, 0, . . . , 0, d), an insert operation with node identifier w and key value b 1 d is performed on the C-heap. This completes the description of the general version of the algorithm and its implementation. We now estimate the running time of this implementation: (c) Running Time Analysis: Following Warburton [13], let V max denote the cardinality of the largest efficient set, that is, the number of entries in any E-list is at most V max . 33 The running time of the implementation is made up of three principal components, namely, the times spent in performing the operations on the M-heap, C-heap, and DC-tree. For each entry in an E-list, the implementation inserts at most nV max M-records, since the outdegree of each node is at most n. Since the graph has n nodes, the total number of M-records created during the course of the algorithm is at most n 2 V max . Therefore, the time spent in performing insert operations on the M-heap is at most O(n 2 V max k log n). Clearly, the time spent in performing deletion operations on the M-heap is also O(n 2 V max k log n). In the case of the C-heap, the dominant contribution of the running time is due to the insert operations. At most, n 2 V max M-records are extracted from the M-heap, and each such record causes one insert operation on the C-heap. Similarly, the confirmation phases perform at most n 2 V max C-heap insert operations. Thus, the time spent on C-heap operations is O(n 2 V max log n). Each DC-tree contains at most V max entries and so the time needed to perform a dominance check followed by an insertion (if necessary) is O((log V max ) k22 ). The total number of dominance checks performed is O(n 2 V max ), since n 2 V max is an upper bound on the number of kcomponent vectors generated. So, the time spent in performing the DC-tree operations is O(n 2 V max (log V max ) k22 ). Therefore, the overall running time of the algorithm is O(max{kn 2 V max log n, n 2 V max (log V max ) k22 }). For k 5 2, the running time of our implementation is O(n 2 V max log n). For this special case, the running time of our implementation matches that of Warburton’s algorithm. For k $ 3, our running time of O(n 2 V max (log V max ) k22 ) is asymptotically faster than Warburton’s algorithm by the factor k. This improvement is due to the fact that our implementation uses the more efficient dominance check operation described in [15]. For both analyses, the n 2 term in the time bound can be replaced by the number of edges in the graph, yielding a better bound for sparse graphs. 5.2.3. Algorithms for Feasibility Problems In this section, we utilize the preceding algorithm for efficient set construction in developing pseudo polynomial algorithms for the feasibility problems defined in Section 2.3. (a) Path feasibility (PF): Let P be the path to be tested for feasibility. We can easily compute the cost vector CV(P) associated with P. Now, P is feasible if and only if CV(P) is a member of the efficient set % uv . The special case where P is the edge {u, v} itself is the Inherent Edge Feasibility (IEF) problem. In the ensuing edge and node feasibility algorithms, we utilize a special binary operation, denoted by ], on sets of vectors. Recall from Eq. (2.1) that the sum of two kcomponent vectors is another vector obtained by adding the corresponding components. Given two sets #1 and #2 of k-component cost vectors, the set #1 ] #2 is constructed in 34 TAYI, ROSENKRANTZ, AND RAVI two steps: First, a set of vectors # is obtained by adding each pair of vectors 9 and 99, where 9 [ #1 and 99 [ #2. Then, the set #1 ] #2 is the maximal subset of # such that no vector in #1 ] #2 is strictly dominated by a vector in #. for feasibility. Zone z i is feasible if and only if the efficient set % uv contains a vector whose ith component is nonzero. (b) Edge Feasibility (EF): Let e 5 {a, b}, with cost vector CV(e), be the edge to be tested for feasibility. In the following description, we assume that nodes u, v, a, and b are all distinct. (Minor modifications can be made for cases when this is not so.) The steps of the algorithm are as follows: We now establish strong NP-hardness for feasibility problems for the case of variable number of zones. These results use reductions from the EXACT COVER BY 3-SETS (X3C) problem defined in Section 4. 1. Compute the efficient set % uv for G. 2. Let G9 be the graph obtained from G by deleting edge e 5 {a, b}. Compute efficient sets % ua , % bv , % ub , and % av for G9. 1 2 1 3. Let %9uv 5 %uv ø %uv , where %uv 5 ((%ua ] {CV(e)}) ] 2 %bv) and %uv 5 ((%ub ] {CV(e)}) ] %av). 4. Edge e is infeasible if and only if % uv and %9uv are disjoint. Proof. We sketch the reduction from X3C to PF. Minor modifications of this reduction establish the strong NPhardness of the other problems. The reduction from X3C to PF is identical to that described in the proof of Proposition 4.2, except that the edge {v 1 , v q11 }, with cost vector (3/2, 1, 1, . . . , 1), is added. It can be verified that the path P consisting of the single edge {v 1 , v q11 } is infeasible if and only if the X3C instance has a solution. This establishes the strong NP-hardness of PF as well as EF, since the path P consists of a single edge. To prove the strong NP-hardness of NF, we add a node w to the edge {v 1 , v q11 }. The cost vectors associated with {v 1 , w} is (3/2, 1, 1, . . . , 1) and with {w, v q11 } is (0, 0, . . . , 0). Now, it can be verified that node w is infeasible if and only if the X3C instance has a solution. To prove the strong NP-hardness of ZF, we introduce a new zone z 3q11 . For each edge except {v 1 , v q11 }, the first 3q components of the cost vector remain the same while the (3q 1 1)st component is zero. For the edge {v 1 , v q11 }, the cost vector is (3/2, 1, 1, . . . , 1, 1). Now, it can be verified that z 3q11 is infeasible if and only if the X3C instance has a solution. ■ The validity of the above algorithm can be seen from the following: By definition, the set % uv contains all vectors associated with feasible paths from u to v. Efficient sets % ua , % ub , etc., are interpreted similarly. Since G is undirected, some of the paths in % uv may traverse edge e in either direction while others may not traverse the edge at all. 1 From Step 3 above, % uv is the efficient set containing vectors corresponding to paths from u to v which contain 2 edge e and traverse it in the direction a to b. Similarly, % uv is the efficient set containing vectors corresponding to feasible paths from u to v which contain edge e but traverse it in the direction b to a. Therefore, %9uv is the efficient set containing vectors corresponding to feasible paths from u to v which contain edge e, traversing it in either direction. Suppose that edge e is feasible. Then, it must be part of a feasible path between u and v. The cost vector for this path is in both % uv and %9uv , so these sets have a vector in common. On the other hand, if there is at least one vector common to both % uv and %9uv , then there is a feasible path corresponding to that vector which traverses edge e. In other words, edge e is feasible. (c) Node Feasibility (NF): Let a be the node to be tested for feasibility. Initially, efficient set % uv is computed for G. After computing the efficient sets % ua and % va for G, the set %9uv is computed as %9uv 5 % ua ] % va . Note that sets % ua and % va are interpreted as in the algorithm for edge feasibility. Now, node a is infeasible if and only if % uv and %9uv are disjoint. This can be proven using an argument similar to that of edge feasibility. (d) Zone Feasibility (ZF): Let z i be the zone to be tested 5.3. Variable Number of Zones Proposition 5.2. The problems PF, EF, NF, and ZF are strongly NP-hard for variable number of zones. It can be seen from the preceding reductions that all the feasibility problems are NP-hard even for two terminal series-parallel networks. Of the various feasibility problems studied in the preceding section, the complement of PF (i.e., path infeasibility) can be shown to be in NP. This is because the infeasibility of a given path P between a pair of nodes can be checked by guessing another path P9 between the same pair of nodes and verifying that P9 strictly dominates P. Similarly, the complement of IEF (i.e., inherent-edge infeasibility) is in NP. However, for the other problems, this “guess-and-verify” process breaks down because the number of paths to be considered may not be polynomial in the size of the problems. Thus, it is not clear whether the other infeasibility problems are in NP. 6. CONCLUSIONS In this paper, we considered path problems in networks where each edge is associated with a cost vector. Specifi- PATH PROBLEMS IN VECTOR-VALUED EDGE-WEIGHT NETWORKS cally, we considered routing and feasibility problems for transportation networks in which an accident along an edge can impact multiple zones. The results obtained include the complexities of various problems with fixed and variable number of zones, polynomial and pseudo-polynomial algorithms, and lower bounds on performance guarantees obtainable in polynomial time. Further, we presented comparisons between solutions to routing problems under different definitions of equity of cost. These comparisons bring out the asymmetric behavior of these problems. REFERENCES [1] [2] [3] [4] [5] Y.P. Aneja, V. Aggarwal, and K.P.K. Nair, Shortest chain subject to side constraints, Networks 13 (1983), 295–302. R. Batta and S.S. Chiu, Optimal obnoxious paths on a network: Transportation of hazardous materials, Oper Res 36 (1988), 84 –92. T. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to algorithms, MIT Press, Cambridge, MA, 1991. R.J. Duffin, Topology of series-parallel networks, J Math Appl 10 (1965), 303–318. M.R. Garey and D.S. Johnson, Computers and intractability: A guide to the theory of NP-completeness, W.H. Freeman, San Francisco, CA, 1979. 35 [6] R. Gopalan, R. Batta, and M.H. Karwan, The equity constrained shortest path problem, Comput OR 17 (1990), 297–307. [7] G.Y. Handler and I. Zang, A dual algorithm for the constrained shortest path problem, Networks 10 (1980), 293– 310. [8] R. Hassin, Approximation schemes for the restricted shortest path problem, Math OR 17 (1992), 36 – 42. [9] M.I. Henig, The shortest path problem with two objective functions, Eur J Oper Res 25 (1985), 281–291. [10] H.T. Kung, F. Luccio, and F.P. Preparata, On finding the maxima of a set of vectors, J ACM 22 (1975), 469 – 477. [11] K.H. Rosen, Discrete mathematics and its applications, 2nd ed., McGraw-Hill, New York, 1991. [12] R. Steuer, Multiple criteria optimization: Theory and applications, Wiley, New York, 1986. [13] A. Warburton, Approximation of pareto optima in multipleobjective, shortest-path problems, Oper Res 35 (1987), 70 – 79. [14] A.B. Wijeratne, M.A. Turnquist, and P.B. Mirchandani, Multiobjective routing of hazardous materials in stochastic networks, Eur J Oper Res 65 (1993), 33– 43. [15] D.E. Willard and G.S. Lueker, Adding range restriction capability to dynamic data structures, J ACM 32 (1985), 597– 617.

1/--страниц