Deferred-Query: An Efficient Approach for Some Problems on Interval Graphs Maw-Shang Chang, Sheng-Lung Peng, Jenn-Liang Liaw Department of Computer Science and Information Engineering, National Chung Cheng University, Min Hsiung, Chiayi 621, Taiwan, Republic of China Received 14 April 1992; accepted 25 April 1996 Abstract: This paper introduces the idea of a deferred-query approach to design O(n) algorithms for the domatic partition, optimal path cover, Hamiltonian path, Hamiltonian circuit, and maximum matching problems on interval graphs given n endpoint-sorted intervals. The previous best-known algorithms run in O(n log log n) or O(n 1 m) time, where m is the number of edges in the corresponding interval graphs. © 1999 John Wiley & Sons, Inc. Networks 34: 1–10, 1999 Keywords: interval graphs; graph algorithms; domatic partition; Hamiltonian path; Hamiltonian circuit; optimal path cover; maximum matching 1. INTRODUCTION Interval graphs have been used in many practical applications. In particular, they have applications in genetics, archaeology, biology, psychology, sociology, engineering, computer scheduling, traffic control, and storage information retrieval [8, 15, 30]. A graph G 5 (V, E) is an interval graph if its vertices can be put into a one-to-one correspondence with a family I of intervals on the real line such that two vertices are adjacent in G if and only if their corresponding intervals have a nonempty intersection. The family I of intervals is called an interval model for G. If no interval in I is properly contained in some other interval in I, then G is called a proper interval graph. For an interval family I, G(I) denotes the graph constructed from I. Booth Correspondence to: M.-S. Chang; e-mail: mschang@cs.ccu.edu.tw Contract grant sponsor: National Science Council of the Republic of China; contract grant number: NSC 81-0408-E-194-04 AMS(MOS) subject classifications: 05C85, 68Q25, 68Q20, 68R10, 90C27 © 1999 John Wiley & Sons, Inc. and Lueker [6] gave a linear algorithm for deciding whether a given graph is an interval graph and constructing, in the affirmative case, the required interval family. An interval model of an interval graph can be constructed in O(m 1 n) time [6], where n and m are the number of vertices and edges, respectively, of the input graph. Besides, the endpoints of constructed intervals can be restricted to be distinct integers between 1 and 2n. On the other hand, if we are given an interval model where endpoints are not all integers, we can sort the endpoints in a nondecreasing order and use the indices of endpoints in the sorted list to construct a new interval model. The endpoints of the new interval model will be restricted to distinct integers between 1 and 2n. Hence, most researchers on interval graphs are interested in the complexity of problems on a graph G(I) where the endpoints of I are restricted to distinct integers between 1 and 2n [24]. We refer to such a set of intervals (i.e., satisfying the restriction that the endpoints are distinct integers between 1 and 2n) as a set of sorted intervals. For simplicity, we refer to a problem on G(I) given a set I of (sorted) intervals as a problem on a set I of (sorted) intervals. CCC 0028-3045/99/010001-10 1 2 CHANG, PENG, AND LIAW A set of vertices D is a dominating set of a graph G 5 (V, E) if every vertex in V 2 D is adjacent to a vertex in D. The domatic number of G, denoted d(G), is the maximum number of pairwise disjoint dominating sets in G. The domatic number was defined and studied in [7]. In particular, the authors showed that d(G) # d 1 1 for any graph G, where d is the minimum degree of vertices in G. G is domatically full if d(G) 5 d 1 1. Farber [10] showed that strongly chordal graphs are domatically full. Interval graphs are domatically full since they are special strongly chordal graphs. The domatic partition problem is to partition V into d(G) disjoint dominating sets. The domatic partition problem of a graph is NP-hard on general graphs [13]. Bertossi [3] solved this problem in O(n 2.5 ) time for interval graphs and O(n log n) for proper interval graphs. Independently, O(m 1 n) time algorithms were given in [22] and [29] for the domatic partition problem on interval graphs. It can be solved in linear time for strongly chordal graphs [18, 28]. Bonuccelli [5] showed that it remained NP-hard for circular-arc graphs and gave an O(n 2 log n) time algorithm for proper circular-arc graphs. Recently, Kaplan and Shamir [18] showed it remained NP-hard for split graphs and bipartite graphs. We say a path P of a graph G covers a vertex v of G if P visits v. A path cover for a graph G is a set of vertex disjoint paths that cover all the vertices of G. An optimal path cover of G is a path cover of minimum cardinality. The optimal path cover problem involves finding an optimal path cover for the given graph. Like many other graph problems, this problem is NP-hard for arbitrary graphs [13]. Arikati and Rangan [2] showed it can be solved in O(m 1 n) time on interval graphs. Hsu et al. [17] gave an O(n log n) algorithm for the problem on a set of intervals. A path (circuit) in G is simple if no vertex occurs more than once. A Hamiltonian path in a graph G is a simple path including all vertices. A Hamiltonian circuit in a graph G is a simple circuit including all vertices. The problems of finding a Hamiltonian path and a Hamiltonian circuit are NP-hard for some classes of graphs, such as bipartite graphs [20], edge graphs [4], planar cubic 3-connected graphs [14], and some other classes. Keil [19] showed that the Hamiltonian circuit problem can be solved in O(m 1 n) time on interval graphs. Manacher et al. [24] showed an O(n log n) algorithm for solving the Hamiltonian path and Hamiltonian circuit problems on a set of intervals. A set of pairwise disjoint edges is called a matching. A matching of maximum cardinality is called a maximum matching. The maximum matching problem is defined to be the problem of finding a maximum matching for the given graph. There are algorithms which solve this problem on interval graphs in O(m 1 n) time [26], on bipartite graphs in O(n 2.5 ) time [16], on circular-arc graphs given arc models in O(n log n) time [21], and on general graphs in O(n 2.5 ) or O( =nm) time [9, 25]. In this paper, we present a new approach, called the deferred-query approach, to design efficient algorithms for the domatic partition, optimal path cover, Hamiltonian path, Hamiltonian circuit, and maximum matching problems on a set of sorted intervals. Using this approach, all of the above problems can be solved in O(n) time. Note that all the previous results for these problems remain O(m 1 n) or O(n log log n) on a set of sorted intervals. Manacher et al. [24] posted an open question: Can the Hamiltonian circuit and Hamiltonian path problems be solved in O(n) time on a set of sorted intervals? By the results of this paper, the answer is positive. Throughout the paper, we assume that I 5 {v 1 , v 2 , . . . , v n } is a set of n sorted intervals. For i 5 1, 2, . . . , n, let a i and b i denote the coordinates of the left and right endpoints, respectively, of interval v i . The rest of this paper is organized as follows: The algorithms for the domatic partition problem are presented in Section 2. Section 3 presents the algorithms for the optimal path cover problem. The Hamiltonian path and circuit problems are discussed in Section 4. The maximum matching problem is solved in Section 5. Concluding remarks are given in Section 6. 2. THE DOMATIC PARTITION PROBLEM O(n) algorithms for the domatic partition problem on a set of sorted intervals were independently developed in [23] and [27] by different approaches. The deferred-query approach originated from the algorithm for the domatic partition problem on a set of sorted intervals by Peng and Chang [27]. In this section, we will briefly describe the algorithm presented in [27] as an example to show how to solve problems on a set of sorted intervals by a deferredquery approach. Since interval graphs are domatically full [10], the algorithm first finds the minimum degree of the input graph. It is easy to find the minimum degree of a graph in O(m 1 n) time. But this problem can be solved in O(n) time on a set of sorted intervals [23, 27]. In Section 2.1, we describe a simple greedy algorithm with O(n log log n) running time [23, 27]. In Section 2.2, we show how a deferred-query approach is used to design an O(n) algorithm, which obtains the same output as that of the algorithm described in Section 2.1. 2.1. A Greedy Algorithm In this section, intervals of I are numbered from 1 to n in the increasing order of their left endpoints. Hence, for any two intervals v i and v j , if i , j, then a i , a j . The algorithm maintains d 1 1 disjoint sets of intervals, where d is the minimum degree of G(I). Initially, these sets are empty. Then, intervals are visited one by one in the increasing order to determine which set it should be placed into. At the end of the algorithm, every interval is placed into a set and every set is a dominating set. We first define some variables which DEFERRED-QUERY APPROACH are important for the algorithm: DP 5 {P i : P i is a subset of I, 1 # i # d 1 1} is a family of d 1 1 disjoint sets of intervals. Each set P i is associated with a variable pr i which is the maximum b j of v j in P i for 1 # i # d 1 1. pr i is called the maximum right endpoint of P i . Initially, all P i are empty. Intervals are placed into a set of DP one by one. PQ is a priority queue which keeps the maximum right endpoints of all of the sets of DP. PQ supports the operations of finding the minimum element of PQ and inserting an element into and deleting an element from PQ. In fact, this algorithm is a greedy one. When it visits an interval, it places the interval into a set of DP. Based upon a greedy rule, it selects the set whose current maximum right endpoint is the minimum among the sets of DP. Algorithm GDP. Find a domatic partition of G(I) given a set I of sorted intervals. Input. A set of sorted intervals I 5 {v 1 , v 2 , . . . , v n }. Output. A domatic partition DP of size d 1 1. Method. 1. Find the minimum degree d of G(I); 2. for i 5 1 to d 1 1 do 3. Pi 4 f; 4. pr i 4 i 2 ( d 1 1); 5. Insert pr i into PQ; end for 6. for i 5 1 to n do 7. Find the minimum pr k from PQ; 8. P k 4 P k ø {v i }; 9. if b i . pr k then do 10. Delete pr k from PQ; 11. pr k 4 b i ; 12. Insert pr k into PQ; end if end for; 13. Output DP. The correctness of this algorithm has been proven in [23, 27]. The time complexity of Algorithm GDP is dominated by the operations of finding the minimum element of PQ and inserting an element into and deleting an element from PQ. Thus, if d . log log n, we can use a two-level priority queue [33] to implement PQ. This data structure uses O(n) space and it takes O( log log n) time to complete each of the above-mentioned operations. Therefore, the time complexity of Algorithm GDP is O(n log log n). 2.2. A Linear-time Algorithm Using a Deferred-query Approach The most time-consuming steps of Algorithm GDP involves finding the minimum element of PQ, inserting an element 3 into PQ, and deleting an element from PQ. The two-level priority queue [33] is the most efficient data structure to support these three operations as far as we know. We want to improve the time complexity of this algorithm from O(n log log n) to O(n). The new algorithm visits the endpoints of I one by one in the increasing order. Each time the left endpoint of an interval v i is visited, it determines which set v i should be placed into it based upon the same greedy rule as that used by Algorithm GDP. It keeps the set number of those sets whose maximum right endpoints have been visited in a queue NQ in the increasing order of their maximum right endpoints. Initially, we assume that NQ contains d 1 1 sets whose maximum right endpoints have been visited. Since the algorithm visits the endpoints in increasing order and a set number is placed into NQ when the algorithm visits its maximum right endpoint, the set numbers in NQ are easily kept in the increasing order of their maximum right endpoints. When the left endpoint of interval v i is visited, the set number of the set into which v i should be placed will be the first element of NQ if NQ is not empty. Thus, the set into which v i should be placed can be determined in constant time in this case. It is possible that NQ 5 A when the left endpoint of interval v i is being visited. In this case, the algorithm then defers the decision until it visits the maximum right endpoint of a set of DP. This is the spirit of the deferred-query approach. A queue, called VQ, is then used to keep those interval numbers of intervals whose left endpoints have been visited and which have not been placed into any set of DP. Since the algorithm visits the endpoints in the increasing order and an interval number is placed into VQ when the algorithm visits its left endpoint, interval numbers in VQ are easily kept in the increasing order by their left endpoints. In summary, if the endpoint currently visited is the left endpoint of interval v i , we place v i into the set whose set number is the first element of NQ if NQ Þ A and we append i to VQ otherwise. If we place v i into the set whose set number is the first element of NQ, the maximum right endpoint of this set becomes b i . Of course, it has not been visited. Thus, we delete the first element of NQ from NQ in this case. On the other hand, suppose that the endpoint currently visited is the right endpoint of an interval v i . There are two cases: In Case 1, v i has been placed into a set of DP, and in Case 2, v i has not been placed into any set of DP. In Case 1, b i is the maximum right endpoint of set DN(i), which is the set containing v i , since the algorithm always places an interval into a set after both the left endpoint of this interval and the maximum right endpoint of this set have been visited. If VQ is empty, DN(i) is appended to the end of NQ. Note that VQ 5 A implies that all intervals whose left endpoints have been visited have been placed into some sets of DP. On the other hand, suppose that VQ is not empty. Then, NQ must be empty because interval numbers are appended to VQ only if NQ 5 A and a set 4 CHANG, PENG, AND LIAW number is appended to NQ only if VQ 5 A. Let j be the first element of VQ. Apparently, v j should be placed into the set containing v i by the greedy rule. But we have to be cautious of the new maximum right endpoint of this set. If b j , b i , then b i is still the maximum right endpoint of the set containing v i after v j is placed into the set. Hence, the algorithm executes the following while loop: 18. 19. while (VQ Þ A and b j , b i , where j is the first element of VQ) do 1. Place v j into the set containing v i , that is, DN(i); 2. Delete j from VQ; end while 21. 22. Then, if VQ 5 A, the set number of the set containing v i is appended to the end of NQ. Otherwise, place v j , where j is the first element of VQ, into the set containing v i and delete j from VQ. In Case 2, the algorithm does nothing. In summary, the new algorithm always determines the set into which an interval should be placed based upon the same greedy rule as the old algorithm uses. If the minimum of the maximum right endpoints of sets in DP is unknown when the left endpoint of an interval is visited, then the algorithm defers the decision until the minimum of the maximum right endpoints of sets in DP is ready. The details of this new algorithm are presented in the following: Algorithm DDP. To find a domatic partition of G(I) given a set I of sorted intervals. Input. A set of sorted intervals I 5 {v 1 , v 2 , . . . , v n }. Output. A domatic partition DP of size d (G(I)) 1 1. Method. 1. 2. 3. 4. 5. 6. 7. 9. 10. 11. 12. 13. 14. 15. 16. Find the minimum degree d of G(I); NQ 4 {1, 2, . . . , d 1 1}; VQ 4 f ; for i 5 1 to n do DN(i) 4 0; for h 5 1 to 2n do Case 1. h 5 a i of some interval v i : if NQ 5 f then append i to the end of VQ; else do Let j be the first element of NQ; DN(i) 4 j and delete j from NQ; end else Case 2. h 5 b i of some interval v i : if DN(i) Þ 0 then do while (VQ Þ A and b j , b i where j is the first element of VQ) do Place v j into the set containing v i ; Delete j from VQ; end while if VQ 5 A then 17. 20. Append DN(i) to the end of NQ; else Let j be the first element of VQ; DN( j) 4 DN(i), i.e. place v j to the set containing v i ; Delete j from VQ; end if else end if end for P i 4 {v j : DN( j) 5 i}; Output DP 5 {P i : 1 # i # d 1 1}. Theorem 1. Algorithm DDP partitions a set of sorted intervals into d 1 1 dominating sets in O(n) time and O(n) space. Proof. It is straightforward to verify that the output of Algorithm DDP is the same as that of Algorithm GDP. Thus, the correctness follows Algorithm GDP. Space and time complexity are easily seen to be linear. ■ Instead of maintaining a sophisticated data structure to store enough information to be used at any time, Algorithm DDP uses a simple data structure to keep information and makes decisions at a time when enough information has been gathered. We refer to this algorithm design approach as deferred-query. We will give some other examples that use a deferred-query approach in the following sections. 3. THE OPTIMAL PATH COVER PROBLEM We first present an O(n log n) time greedy algorithm for the optimal path cover problem on a set of sorted intervals [17] in Section 3.1. Then, we devise an O(n) algorithm based upon this greedy algorithm by a deferred-query approach in Section 3.2. In this section, we number the intervals of I from 1 to n in increasing order according to their right endpoints. Hence, for any two intervals v i and v j , b i , b j if i , j. This ordering will be used in the rest of this paper. 3.1. A Greedy Algorithm In this subsection, we present a greedy algorithm for the optimal path cover problem on a set of sorted intervals [17]. This algorithm first starts with a single-vertex path P that starts from v1 and ends at v1. Then, based on a greedy principle, it extends the path P. By the induction hypothesis, suppose that P 5 x 1 3 x 2 3 . . . 3 x s , s $ 1. The smallest uncovered neighbor of x s is chosen and P is extended to cover it. If no such neighbor exists, then path P is stopped and, if possible, a new path is started in the remaining graph from the smallest-numbered uncovered vertex. Let S be a set of numbers and min S denote a minimum DEFERRED-QUERY APPROACH number in S. The algorithm is formally presented in the following: Algorithm GOPC. Find an optimal path cover of G(I) of a set I of sorted intervals. Input. A set of sorted intervals I 5 {v 1 , v 2 , . . . , v n }. Output. An optimal path cover of G(I). Method. 1. U 4 I; r 4 1; 2. P r 4 v 1 ; U 4 U 2 {v 1 }; 3. while U Þ f do 4. Let x s be the path-end of P r ; 5. S 4 {v j : ( x s , v j ) [ E, v j [ U}; 6. if S Þ f then 7. k 4 min{ j: v j [ S}; P r 4 P r 3 v k ; else 8. k 4 min{ j: v j [ U}; r 4 r 1 1; P r 4 v k ; 9. U 4 U 2 {v k }; end while 10. Output P 1 , . . . , P r ; Careful implementation of this algorithm runs in O(n log log n) time on a set of sorted intervals [17]. 3.2. A Linear-time Algorithm Using a Deferred-Query Approach Based upon the greedy Algorithm GOPC, presented in the previous subsection, we devise a new algorithm by using a deferred-query approach. Similarly, we start from a single vertex path P 1 5 v 1 . Algorithm GOPC tries to extend path P 1 to cover another interval. Based upon the greedy principle, it selects the smallest uncovered neighbor of the path-end of P 1 , that is, v1 for the present. This is a timeconsuming step. By a deferred-query approach, we defer the job of selecting an uncovered interval to extend path P 1 . We continue to visit the next interval v2. If v2 is adjacent to v1, then, obviously, v2 is the smallest uncovered neighbor of the path-end of path P 1 . We then extend path P 1 to v2. Hence, path P 1 now has two intervals, that is, P 1 5 v 1 3 v 2 , and v2 becomes the new path-end of P 1 . On the other hand, suppose that v2 is not adjacent to v1. We start another new single vertex path P 2 5 v 2 . Now, we have a path P 1 5 v 1 3 v 2 or two paths P 1 5 v 1 and P 2 5 v 2 depending on whether v2 is adjacent to v1 or not. Without loss of generality, suppose that we have P 1 5 v 1 3 v 2 3 . . . 3 v i21 before we visit v i . We do not exactly find the smallest uncovered neighbor of the path-end of P 1 , which is v i21 at this time. We simply go on visiting v i . Of course, v i is the smallest uncovered neighbor of v i21 if v i is adjacent to v i21 and we extend P 1 to cover v i based upon the greedy rule in this case. But if v i is not adjacent to v i21 , 5 assume that we know interval v c 1, which is the smallest uncovered neighbor of v i21 . We start P 2 from v i , that is, P 2 5 v i . Of course, P 1 and P 2 should be linked into one path by v c 1, that is, P 1 3 v c 1 3 P 2 , when we visit v c 1 later. It is possible that v c 1 does not exist. In that case, P 1 will be a path output by the new algorithm and it is also the first path output by Algorithm GOPC. So we have to determine whether the interval currently visited is v c 1 when we visit other intervals later. In fact, the first neighbor of v i21 that we visit later must be v c 1. Now, suppose that we have two paths P 1 5 v 1 3 v 2 3 . . . 3 v i21 and P 2 5 v i immediately after we visit v i . Next, we visit v i11 . There are three cases: In Case 1, v i11 is adjacent to v i21 . Obviously v i11 is the smallest uncovered neighbor of v i21 when path P 1 reaches v i21 , that is, v c 1. Hence, P 1 and P 2 are connected into one path, that is, P 1 3 v c 1 [ v i11 3 P 2 . The new path will be the same as the path obtained by Algorithm GOPC when P 1 reaches v i . In Case 2, v i11 is adjacent to the path-end of P 2 , v i , but is not adjacent to the path-end of P 1 , v i21 . We extend P 2 to cover v i11 . In other words, we assume that we know v c 1 and continue to extend the path the way Algorithm GOPC does in this case. In Case 3, v i11 is not adjacent to the path-ends of either P 2 or P 1 . We will start a new path P 3 assuming that we know v c 2, which is the smallest uncovered neighbor of the path-end of P 2 . Of course, v c 2 will be used to link P 2 and P 3 into one path later when we visit it. But from now on, we should determine whether the interval currently visited is v c 1 or v c 2 or the smallest uncovered neighbor of the path-end of P 3 . Assume that we have three paths immediately before we visit v i12 . We say a path P 1 is to the left of path P 2 if the path-end of P 1 is less than the path-end of P 2 . To determine whether the interval currently visited is v c 1 or v c 2 or the smallest uncovered neighbor of the path-end of P 3 , we simply find the leftmost path-ends of paths P 1 , P 2 , and P 3 that are adjacent to v i12 . If the currently visited interval is adjacent to the path-end of P 1 , then it is v c 1. If the currently visited interval is not adjacent to the path-end of P 1 but is adjacent to the path-end of P 2 , then it is v c 2. If it is not adjacent to the path-ends of either P 1 or P 2 but is adjacent to the path-end of P 3 , then it is the smallest uncovered neighbor of the path-end of P 3 . If it is not adjacent to any of the path-ends of P 1 , P 2 , and P 3 , then we start P 4 assuming that we know v c 3, which will be used to link P 3 and P 4 into a path. In summary, we visit intervals one by one in increasing order to extend paths to cover intervals. Initially, there are no paths. When we visit an interval v i , we find the two leftmost paths that can be connected into one path by v i and connect them. If v i is only adjacent to the path-end of the rightmost path, then we extend the rightmost path to cover v i . Suppose that v i is adjacent to none of the path-ends of any paths. We start a new path which has the only interval v i . The algorithm stops after all intervals have been visited. 6 CHANG, PENG, AND LIAW The paths obtained in this way form an optimal path cover. It is straightforward to verify that the output of this algorithm is the same as that of Algorithm GOPC. Hence, this algorithm is correct. But in the new algorithm, it seems difficult to implement a constant-time operation to find the two leftmost paths that can be connected into one path by v i . Our studies showed that this could be accomplished by using Gabow and Tarjan’s static disjoint set union algorithm where the union tree is a path [11]. Before discussing the algorithm in detail, we briefly introduce the disjoint set union-find problem [1, pp. 124 –125; 11]. The problem is to carry out three kinds of operations on disjoint sets: makeset(x): Create a new singleton set { x} whose name is x for an element x which is in no existing set. find(x): Return the name of the set containing element x. union( x, y): Form a new set that is the union of the sets containing x and y, destroying the two old sets. The name of the new set is the name of the old set containing x. The fastest known algorithm for this problem runs in O( p a ( p 1 q, q) 1 q) time and O(q) space, where a is an inverse of Ackermann’s function [31, 32], q is the number of elements, and p is the number of the find operations performed. A special case of the disjoint set union-find problem can be formalized as follows: We are given a tree T of q nodes that correspond to the initial q singleton sets. Let the parent of the node v in T be denoted by parent(v). The problem involves performing a sequence of union and find operations such that each union can be only of the form union( parent(v), v). T is called the static union tree and the problem is referred to as the static tree set union. For this special case, each union and find operation can be supported in O(1) amortized time on a random access machine, and the total space required is O(q) [11]. Now, we briefly describe the data structures used by the deferred-query algorithm for the optimal path cover problem on a set of intervals. We call the sets of left endpoints {a j : b i21 , a j , b i } for 1 # i # n consecutive left endpoint sets, where b 0 5 0. It is possible that a consecutive left endpoint set is empty. The algorithm first scans endpoints to find all consecutive left endpoint sets. The algorithm forms a set for each consecutive left endpoint set by using makeset and union operations. Initially, a consecutive left endpoint set is associated with the right endpoint that is closest to the largest left endpoint in the set. Hence, the consecutive left endpoint set {a j : b i21 , a j , b i } is associated with right endpoint b i . What we mean by saying that a set is associated with a right endpoint is that both of the operations of finding the set from the right endpoint and finding the right endpoint from the set can be done in constant time. This can be done easily by pointers. Besides the left endpoint sets, the algorithm also maintains a list L. The elements stored in L are right endpoints. Initially, L 5 {b 1 , b 2 , . . . , b n }. Elements of L are stored in a list in increasing order. We use next(i) to keep the interval number of the next interval of v i in the path if v i is in a path. If v i is currently the path-end of a path, then start(i) stores the interval number of the path-start of this path. Thus, if next(i) 5 0, then v i is the path-end of a path starting from interval v start(i) . When we start a new path that starts from and ends at v i , we simply let start(i) 5 i and next(i) 5 0. If start(i) 5 i, then v i is the path-start of a path. We present the algorithm formally in the following and then explain it below. Algorithm DOPC. Find an optimal path cover of G(I) of a set I of sorted intervals. Input. A set of sorted intervals, I 5 {v 1 , v 2 , . . . , v n }. Output. An optimal path cover of G(I). Method. 1. Scan the endpoints to find all consecutive left point sets. Each right endpoint b i , 1 # i # n, is associated with the consecutive left endpoint set {a j : b i21 , a j , b i }. 2. L 4 {b 1 , b 2 , . . . , b n }; 3. for i 5 1 to n do 4. Find the left endpoint set S containing a i ; 5. Find the right endpoint b k associated with S; 6. if k 5 i then {next(i) 4 0; start(i) 4 i}; 7. if k Þ i then 8. Let b h be the successor of b k in L; 9. if h 5 i then 10. next(k) 4 i; start(i) 4 start(k); 11. Unite the set associated with b k to the set associated with b i ; 12. Let the new set be associated with b i ; 13. Delete b k from L; end if 14. if h Þ i, then 15. next(k) 4 i; next(i) 4 start(h); start(h) 4 start(k); 16. start(i) 4 start(k); start(start(h)) 4 start(k); 17. Unite the set associated with b k to the set associated with b h ; 18. Let the new set be associated with b h ; 19. if i Þ n then 20. Unite the set associated with b i to the set associated with b i11 ; 21. Let the new set be associated with b i11 ; end if 22. Delete both b k and b i from L; end if DEFERRED-QUERY APPROACH end if end for 23. for each v i [ L do 24. Output a path starting from v start(v i) through next pointer to v i ; end for The algorithm visits intervals one by one in increasing order according to the for loop of line 3. It maintains two invariants: Invariant 1: For each right endpoint b i in L, b i is associated with the left endpoint set {a j : b p , a j , b i }, where b p is the predecessor of b i in L. Invariant 2: The right endpoints in L can be partitioned into two parts immediately before interval v i is visited. One part is the set of right endpoints which are less than b i . The other part is the set of right endpoints which are equal to or greater than b i . The former are all path-ends of paths constructed so far. The latter are all right endpoints of unvisited intervals. It is easy to verify that both invariants are true initially. The algorithm first finds the left endpoint set S that contains a i when visiting v i (line 4). Suppose that S is associated with right endpoint of v k (line 5). By Invariant 1, if k 5 i, than v i is not adjacent to any path-end of paths obtained so far. Otherwise, v k is the leftmost path-end that is adjacent to v i . Thus, if k 5 i, we start a new path that has only one vertex v i , by setting next(i) 5 0 and start(i) 5 i (line 6). Obviously, both invariants are still true after we start a new path. Suppose that k Þ i and b h is the successor of b k in L (lines 7 and 8). By Invariant 1, if h 5 i, then v i is adjacent to only the rightmost path-end. In this case, we extend the path whose path-end is v k by setting next(k) 5 i and start(i) 5 start(k) (line 10). To maintain both invariants, we unite the set associated with b k to the set associated with b i and let the new set be associated with b i (lines 11 and 12). Then, we delete b k from L (line 13). Thus, both Invariants 1 and 2 are maintained to be true after we extend the rightmost path to v i in this case. On the other hand, suppose that h Þ i (line 14). The two paths ending at v k and v h , respectively, are connected into one path by v i . Thus, we set next(k) 5 i, next(i) 5 start(h), start(h) 5 start(k) (line 15). To indicate that both v i and v start(h) are not path-starts any more, we set start(i) 5 start(k) and start(start(h)) 5 start(k) (line 16). To maintain both invariants, we unite the set associated with b k to the set associated with b h and let the new set be associated with b h (lines 17 and 18). Since v i is not a path-end, we unite the set associated with b i to the set associated with b i11 if i Þ n (line 20). Also, let this new set be associated with b i11 (line 21). Then, we delete both b k 7 and vi from L since they are not path-ends any more (line 22). Let the left endpoints be sorted in increasing order where a i 1 , a i 2 , . . . , a i n. We called this the increasing left endpoint sequence. By Invariant 1, it is easy to see that every left endpoint set formed by the algorithm is an interval of the increasing left endpoint sequence. Besides, only two sets of adjacent intervals of the increasing left endpoint sequence will be united to form a new set. This is a special case of the static tree set union problem where the union tree is a path [11], also known as interval union-find problem [12]. Thus, we have the following theorem: Theorem 2. Given a set I of sorted intervals, Algorithm DOPC computes an optimal path cover of G(I) in O(n) time and space. 4. THE HAMILTONIAN PATH AND HAMILTONIAN CIRCUIT PROBLEM If Algorithm DOPC produces only a path, then it is a Hamiltonian path. Otherwise, there does not exist any Hamiltonian path in the interval graph G. Thus, the Hamiltonian path problem can be solved in O(n) time on a set of sorted intervals by using Algorithm DOPC. In [24], Mancher et al. gave an O(n log log n) algorithm for the Hamiltonian circuit problem on a set of sorted intervals. In the same paper, the authors posted a question to ask whether it can be reduced to O(n). Their algorithm consists of two steps: Their first step decides whether G(I) has a Hamiltonian path, and it constructs, in the affirmative case, a Hamiltonian path P in O(n log log n) time. Using the Hamiltonian path P constructed in the first step as input, the second step produces a Hamiltonian circuit in O(n) time if and only if G(I) has a Hamiltonian circuit. In fact, their algorithm for the first step is equivalent to Algorithm GOPC presented in Section 3.1 except that it halts whenever the first path cannot be extended to cover any interval. Hence, the first step of their algorithm can be done in O(n) time by using Algorithm DOPC. This leads to the following theorem: Theorem 3. The Hamiltonian path and Hamiltonian circuit problems can be solved in O(n) time and space on a set of sorted intervals. 5. THE MAXIMUM MATCHING PROBLEM The maximum matching problem on interval graphs was discussed in [26]. Moitra and Johnson [26] used a greedy approach to design a sequential algorithm for this problem in O(m 1 n) time. In fact, it can be easily implemented to run in O(n log n) time [21]. We first present this greedy 8 CHANG, PENG, AND LIAW algorithm. Then, by applying a deferred-query approach and using the static disjoint set union-find data structure where the union tree is a path [11], we solve this problem in O(n) time on a set of sorted intervals. Note that intervals are numbered from 1 to n in the increasing order of their right endpoints. Liang and Rhee [21] solved the maximum matching problem on a circular-arc graph in O(n log n) time by reducing the problem to two instances of the maximum matching problem on interval graphs. By our results, Liang and Rhee’s algorithm can be implemented to run in O(n) time on circular-arc graphs given endpoint-sorted arcs. 5.1. A Greedy Algorithm The main idea of the greedy algorithm given by Moitra and Johnson [26] is as follows: Initially, all intervals are unmatched. The algorithm visits intervals one by one in the increasing order. If the interval v i currently visited is unmatched and it has an unmatched neighbor, then match v i with v k , the smallest unmatched neighbor of v i . The algorithm is formally presented in the following: Algorithm GM. Finding a maximum matching of G(I) of a set I of sorted intervals. Input. A set of sorted intervals I 5 {v 1 , v 2 , . . . , v n }. Output. A maximum matching M of G(I). Method. 1. U 4 I; M 4 f ; 2. for i 5 1 to n do 3. if v i is unmatched then do 4. Let N u (v i ) 5 {v j : v j is adjacent to v i and v j [ U}; 5. if N u (v i ) Þ f then do 6. Let v k 5 min N u (v i ); 7. M 4 M ø {(v i , v k )}; U 4 U 2 {v i , v k }; end if end if end for 8. Output M. 5.2. A Linear-time Algorithm Using a Deferred-query Approach In this section, we use a deferred-query approach to design an O(n) algorithm to solve this problem on a set of sorted intervals. The basic ideas of the new approach are as follows: Assume that M is the maximum matching obtained from Algorithm GM. If (v i , v j ) [ M, where i , j, then (v i , v j ) is determined when v i is visited by Algorithm GM. In the new algorithm, we defer the decision of (v i , v j ) until v j is visited. In the new algorithm, we visit intervals one by one in increasing order. When interval v i is being visited, we match v i with its smallest unmatched neighbor which has been visited. It is straightforward to verify that the matching output by this approach is the same as that output by the greedy algorithm presented in the previous subsection. By using the same disjoint set union-find technique as that used by Algorithm DOPC for the optimal path cover problem, we can also solve the maximum matching problem in O(n) time and space. Similarly, the algorithm maintains a list L of right endpoints in the increasing order. Initially, L 5 {b 1 , b 2 , . . . , b n } and b i is associated with the consecutive left endpoint set {a j : b i21 , a j , b i } for 1 # i # n. The new algorithm is formally presented in the following: Algorithm DM. Find a maximum matching of G(I) of a set I of sorted intervals. Input. A family of sorted intervals, I 5 {v 1 , v 2 , . . . , v n }. Output. A maximum matching M. Method. 1. Scan the endpoints to find all consecutive left endpoint sets. Each right endpoint b i , 1 # i # n, is associated with the consecutive left endpoint set {a j : b i21 , a j , b i }. 2. M 4 f ; 3. L 4 {b 1 , b 2 , . . . , b n }; 4. for i 5 1 to n do 5. Find the left endpoint set S containing a i ; 6. Find the right endpoint b k that is associated with S; 7. if i Þ k then do 8. M 4 M ø {(v i , v k )}; 9. Find the successor b h of b k in L; 10. Unite the set associated with b k to the set associated with b h ; 11. Let b h be associated with this new set; 12. Delete b k from L; 13. if i Þ n then do 14. Unite the set associated with b i to the set associated with b i11 ; 15. Let b i11 be associated with this new set; end if 16. Delete b i from L; end if end for 17. Output M. Note that the algorithm maintains two invariants: Invariant 1: For each right endpoint b i in L, b i is associated with the DEFERRED-QUERY APPROACH left endpoint set {a j : b p , a j , b i }, where b p is the predecessor of b i in L. Invariant 2: The right endpoints in L can be partitioned into two parts immediately before interval v i is visited. One part is the set of right endpoints which are less than b i . The other part is the set of right endpoints which are equal to or greater than b i . The former are right endpoints of all unmatched intervals that have been visited. The latter are all right endpoints of unvisited intervals. It is straightforward to verify that these two invariants are true initially. When interval v i is being visited, we first find the set S containing a i and the right endpoint b k that is associated with S (lines 5 and 6). By the two invariants, v i is not adjacent to any visited and unmatched interval if k 5 i and v k is the smallest visited and unmatched interval that is adjacent to v i . Thus, v i is matched with v k (line 8). Since v i and v k are matched, their right endpoints should be deleted from L to maintain Invariant 2 (line 12 and 16). To maintain Invariant 1, we unite the set associated with b k to the set associated with b h , where b h is the successor of b k in L, and let b h be associated with this new set (lines 9 to 11). We also unite the set associated with b i to the set associated with b i11 and let b i11 be associated with this new set if i Þ n (lines 13 to 15). By the two invariants, it is easy to see that the union and find operations can be implemented by the static tree union-find algorithm in O(n) time [11]. Thus, we have the following theorem: Theorem 4. Given a set I of sorted intervals, Algorithm DM computes a maximum matching of G(I) in O(n) time and space. REFERENCES [1] A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The design and analysis of computer algorithms, Addison-Wesley, Reading, MA, 1974. [2] S.R. Arikati and C.P. Rangan, Linear algorithm for optimal path cover problem on interval graphs, Info Process Lett 35 (1990), 149 –153. [3] A.A. Bertossi, On the domatic number of interval graphs, Info Process Lett 28 (1988), 275–280. [4] A.A. Bertossi, The edge Hamiltonian path problem is NPcomplete, Info Process Lett 3 (1981), 157–159. [5] M.A. Bonuccelli, Dominating sets and domatic number of circular arc graphs, Discr Appl Math 12 (1985), 203–213. [6] K.S. Booth and G.S. Lueker, Testing for consecutive ones property, interval graphs and graph planarity using PQ-tree algorithms, J Comput Syst Sci 13 (1976), 335–379. [7] E.J. Cockayne and S.T. Hedetniemi, Towards a theory of domination in graphs, Networks 7 (1977), 247–261. [8] J.E. Cohen, J. Komlós, and T. Mueller, The probability of an interval graph, and why it matters, Proc Symp Pure Math 34 (1979), 97–115. [9] S. Even and O. Kariv, An O(n 2.5 ) algorithm for maximum matching in general graphs, Proc 16th IEEE Ann Symp Founda Comput Sci, 1975, pp. 100 –112. [10] M. Farber, Domination, independent domination, and duality in strongly chordal graphs, Discr Appl Math 7 (1984), 115–130. [11] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J Comput Syst Sci 30 (1985), 209 –221. [12] Z. Galil and G.F. Italiano, Data structures and algorithms for disjoint set union problems, ACM Comput Surv 23 (1991), 319 –344. [13] M.R. Garey and D.S. Johnson, Computers and intractability: A guide to the theory of NP-completeness, W.H. Freeman, San Francisco, CA, 1979. [14] M.R. Garey, D.S. Johnson, and R.E. Tarjan, The planar Hamiltonian circuit problem is NP-complete, SIAM J Comput 5 (1976), 704 –714. [15] M.C. Golumbic, Algorithmic graph theory and perfect graphs, Academic, New York, 1980. [16] J.E. Hopcroft and R.M. Karp, An n 2.5 algorithm for maximum matching in bipartite graphs, SIAM J Comput 2 (1973), 225–231. [17] W.L. Hsu, W.K. Shih, and T.C. Chern, An O(n 2 logn) time algorithm for the Hamiltonian cycle problem, SIAM J Comput 21 (1992), 1026 –1046. [18] H. Kaplan and R. Shamir, The domatic number problem on some perfect graph families, Info Process Lett 49 (1994), 51–56. [19] J.M. Keil, Finding Hamiltonian circuits in interval graphs, Info Process Lett 20 (1985), 201–206. [20] M.S. Krishnamoorthy, An NP-hard problem in a bipartite graphs, SIGACT News 7 (1976), 26. 6. CONCLUDING REMARKS In this paper, we introduce the idea of a deferred-query approach to design O(n) time algorithms for the domatic partition, optimal path cover, Hamiltonian path, Hamiltonian circuit, and maximum matching problems on a set of sorted intervals. This answers the open question, posted by Manacher et al. [24]: Can the Hamiltonian circuit problem be solved in O(n) time on a set of sorted intervals? It seems that the greedy algorithms on a set of sorted intervals with O(m 1 n) or O(n log log n) running time can be implemented to run in O(n) time by using a deferred-query approach. Some problems on circular-arc graphs can be reduced to problems on interval graphs [21]. Hence, this approach can also be used to design efficient algorithms for some problems on circular-arc graphs. The authors are grateful to an anonymous referee and Prof. S. L. Lee for their patient and many constructive suggestions on the revision of this paper. 9 10 [21] [22] [23] [24] [25] [26] CHANG, PENG, AND LIAW Y.D. Liang and C. Rhee, Finding a maximum matching in a circular-arc graphs, Info Process Lett 45 (1993), 185–190. T.L. Lu, P.H. Ho, and G.J. Chang, The domatic number problem in interval graphs, SIAM J Discr Math 3 (1990), 531–536. G.K. Manacher and T.A. Mankus, Determining the domatic number and a domatic partition of an interval graph in time O(n) given its sorted model, manuscript (1991). G.K. Manacher, T.A. Mankus, and C.J. Smith, An optimum O(n log n) algorithm for finding a canonical Hamiltonian path and a canonical Hamiltonian circuit in a set of intervals, Info Process Lett 35 (1990), 205–211. S. Micali and V.V. Vazirani, An O( =uVuuEu) algorithm for finding maximum matching in general graphs, Proc 21st IEEE Ann Sympos Founda Comput Sci, Long Beach, CA, 1980, pp. 17–27. A. Moitra and R.C. Johnson, A parallel algorithm for maximum matching on interval graphs, Int Conf Parallel Process vol. III, 1989, pp. 114 –120. [27] S.L. Peng and M.S. Chang, A new approach for domatic number problem on interval graphs, Proceed Nat Comput Sympos, Taiwan, R.O.C., 1991, pp. 236 –241. [28] S.L. Peng and M.S. Chang, A simple linear time algorithm for the domatic partition problem on strongly chordal graphs, Info Process Lett 43 (1992), 297–300. [29] A.S. Rao and C.P. Rangan, Linear algorithm for domatic number problem on interval graphs, Info Process Lett 33 (1989), 29 –33. [30] F.S. Roberts, Graph theory and its applications to problems of society, SIAM, Philadelphia, PA, 1978. [31] R.E. Tarjan, Efficiency of a good but not linear set union algorithm, J ACM 26 (1975), 215–225. [32] R.E. Tarjan and J. van Leeuwen, Worst-case analysis of set union algorithms, J ACM 31 (1984), 245–281. [33] P. Van Emde Boas, Preserving order in a forest in less than logarithmic time and linear space, Info Process Lett 6 (1977), 80 – 82.

1/--страниц