International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) Parallelization of Shortest Path Algorithm Using OpenMP and MPI Rajashri Awari Dept.of Info.Tech. MTech. Yeshwantrao Chavan College of Engineering Nagpur, India rajashreewawari@gmail.com Abstract—Graph problem solve by using the standard graph Algorithm. Example isLarge matrix problems, graphs solving problems, equations etc.There is a representation of two algorithms.The algorithm to find all pair shortest path is Floyd War shall algorithm, single source shortest path is Dijkstra’salgorithm. These two algorithm implement in serial formulation.Parallel algorithm is considerably effective for large graph size. By using these algorithm find the shortest distance, such as city to city, routing, networking, social media. Parallel algorithm used for calculating or finding shortest path of graph. With the help of graph algorithm these operations can be done in parallel and reduce the computation time and efficiency. All pair shortest path problem apply for directed and undirected graph whose finite nodes and edges in un-weighted and undirected graph. Keywords- Shortest Path Algorithms, OpenMP, MPI. 1. INTRODUCTION All pair shortest path problem find the shortest path in graph using the graph algorithm. Finding the shortest distance for all objects in a graph is a common task in solving many day to day and scientific problems. The algorithm for finding shortest path, find their application in many fields such as Google maps, routing protocol etc. There are two types for finding the shortest path, single source shortest path and all pair shortest path, using two algorithms Dijkstra’s algorithm and Floyd algorithm. Dijkstra’s algorithm requires non-negative edge weights, whereas Floyd's algorithm requires negative-weight edges. The shortest path algorithm is the basic algorithm for research hotspot. To improve the transportation system the best use of shortest path to implement internet protocol network. It’s use as in the parallel algorithm. Otherwise in serial implementation it is very difficult improve its performance. ( Dijkstra’s algorithm). [2] In serial implementation algorithm it takes long time to find the shortest distance if all pairs of vertices in graph. So it is difficult task to find the shortest distance from source to destination. Therefore use the parallel algorithm or parallel processing takes less time than the serial implementation. And It’s easy to calculate the shortest distance or path find in graph or many other applications. Fundamental problem in graph theory is single source shortest path. [1] The All pair shortest path problem and single source shortest path problem start initially consider source s to all other vertices in Graph. These two algorithm Floyd and Dijkstra’s algorithm is most important find the shortest path problems between all pair of vertices in the graph. The graph solving problems increase in size, efficient parallel shortest path processing becomes important as computational and memory requirements increase. Because of number of size input required for graph solving problem.[N*N] matrix problem. Retrieve information about the process use molecular dynamic simulation in order to atomistic system that cannot be accessed during experiment. [3] For large structure or system it takes long time to perform their operations and this is why the parallelization used to perform operation in less time. Performance for the operation in less time to reduce the efficiency and speedup factor using the OpenMP(Open multiprocessing) and the MPI (Message Passing Interface)and also use the parallel Floyd algorithm, parallel Dijkstr’s algorithm.[3] The All-Pairs Shortest Paths (APSP) problem and single source shortest path problem is one of the most important, and most studied, algorithmic graph problems. Graph shows a weighted directed graph G = (V,E) where V is the vertices and E is the edges of the graph.[5] The real-world applications including VLSI routing, bioinformatics, social network analysis, and communications. Certain a directed weighted graph, APSP is the problem of finding the shortest distance between every two nodes in the graph. This problem reduces to multiple breadth-first search traversals in the case of an unweighted directed graph. The following figure shows directed and undirected graph. Figure (1) shows undirected graph, Figure (2) shows directed graph as follows. 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 304 International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) path with shortest distance. Thus, running the breadth-ﬁrstsearch algorithm (BFS) from every node in the graph is an allpair shortest path algorithm that works on un-weighted graphs. Minimum Spanning Tree algorithms classified on one of two approaches, that of Floyd Algorithm, Dijkstra’s Algorithm. However, many algorithms developed to finding the shortest path problem. Processor consists at each node presented at the algorithm in graph.[11] The distance query can be done through the single source shortest path when x=s, where s is single source node. To find the shortest distance its require time to calculate the shortest distance, and total time require for the process of all edge deletion. The running time of the algorithm is n number of nodes in the graph and m is the number of edges. Constructing the graph problem by using the breath-first search tree from all nodes presented in the tree of all pair shortest path on un-weighted, undirected graph. This can be done the time complexity in the order of (m,n).i.e. o(m,n). Figure (1) Undirected Graph Figure (2) Directed Graph The above two figure shows the Directed and Undirected Graph. Which shows that Direction from one path to another or shows the distance between two path. 2. LITERATURE SURVEY Nicolae Goga[3] for large systems, the simulations can take a lot of time and this is why MPI parallelization is used in order to process data in less time. The molecular dynamics package, GROMACS, accelerates the computation [3] and passes messages through the communication channels of the MPI. In this article, a new algorithm for stochastic and dissipative element dynamics is presented and implement in GROMACS to be run using the parallelization given by MPI. Yulian Yu, Wenhong Wei, Qiang Peng [4] the parallel problem solving parallel speed-up ratio is a measure of the corresponding benefits. Considering the single source shortest path problem it’s good for parallel speedup ratio, computational efficiency to reduce the computation time. Using vector matrix multiplication operation single source multilevel graph may find the multiplication operations are replace by addition operation. And the addition operation replace by the minimum operation with the help of parallelization by reducing the time and cost. Osama G. Attia [5] All pair shortest path difficulty in unweighted directed graphs reduces the problem for ﬁnding the Danny Z. Chen, Every pair of vertices in the graph solving by the all pair shortest path algorithm. The large graph whose edges and node is finite number solve by the sequential manner. It finds the time complexity as well as space complexity. The time complexity takes O(N3) and space complexity takes O(N2 ) [6].An interval circular arc graph solving the sequential algorithm takes O(n1ogn) time for single source shortest path problem. The all pair shortest paths information for solving the graph with minimum spanning tree can be completely available very efficiency.The all pair shortest path finds the for every nodes or vertices takes o(n 2 logn) time. Rithesh G. Prabhu and Anjjan S Narayan, proposes a new method that the all pair shortest path on a GPU used to increasing the efficiency. The graph consists as node connected to each other. There are two types of graph, such as dense graph and sparse graph. Large number of edges consists of dense graph and the few or small edge consists of the sparse graph. The dense graph is fully suited for Floyd algorithm and required the large memory to store the graph or matrix of the graph. For large graph matrix solving problem can implemented on GPU, because memory restraints. [8] By reducing the calculation sparse graph improves their efficiency. GPU is used as the non-graphic application. . 3. OVERVIEW OF ALGORITHM Consider the directed graph G with the n number of vertices. The vertices considered as 1 to N. And E is the edges of the graph. So graph can be considered as G = (N, E) where N is the number of vertices in the graph. Floyd War shall algorithm consists as the dynamic programming algorithm, which calculates the shortest path or distance of matrix D k is the Kth matrix. And this is defined as the Dk(i,j). Where i, j represents the source and the destination. The shortest path from vertex i to the vertex j, means the shortest distance between the i to j. Consider i is the source and j is the destination, which is 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 305 International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) marked as 1 to k, where k is the iteration number. Dijkstra’s algorithm is calculated the singlesource shortest path problem. Floyd algorithm solves the problem as directed graph or undirected graph having weight[16]. Floyd algorithm consider the one condition that is D(i, j) > D(i, k)+ D(k,j). This condition must satisfy the Floyd algorithm to calculate the shortest distance between the all pair of vertices in the graph. [16] The Floyd Warshall algorithm compare all possible paths through the graph between each pair of vertices.[16] 3.1.1 EXAMPLE OF FLOYD ALGORITHM 3.1 FLOYD ALGORITHM The Floyd-Warshall algorithm use a dynamic programming approach to solve the all-pairs shortest-paths problem on a directed graph G = (V,E). The bellow equation shows weight in between ‘i’ and ‘j’. Where ‘i’ is the source and ‘j’ denote destination.[21] Figure (4) Weighted Graph Let d(k)ij be the weight of a shortest path from vertex. i, j represents as path from source to destination. Consider I is the source and j is the destination. wij represents the weight from i to j shown by the following equation. A recursivedefinition from the above formulation is given by: The above matrix shows the adjacency matrix of graph. (fig.4) Floyd's algorithm has a worst casewhere n is the number of vertices in the graph. Following steps shows how Floyd algorithm works as the shortest path using (fig 3) the flow of algorithm. The Floyd Algorithm flow is shown below serially such as, Steps of Floyd Algorithm The above steps show Floyd algorithm calculate minimum distance to all pair shortest path by using (fig.4) weighted graph. Figure(3). Floyd Algorithm 3.2 DIJKSTRA’S ALGORITHM The Floyd War shall algorithm is to find the shortest path between the pair of vertices. Above seven steps (fig 3) shows the flow of Floyd algorithm. Ans in a weighted graph with positive or negative edge weights (but with non-negative cycles) single implementation of the algorithm will find the lengths (summed weights) of the shortest paths among all pairs of vertices, though it does not return detail of the paths themselves. D[i,j] represents the distance of i to j. Dijkstra's algorithm is used to find the single source shortest path problem. Consider the single vertex to all other vertices in the graph. Dijkstra’s algorithm requires the non-negative weight edge cycle. .Applying the Problem on Graph to find the distance between source to destination. For a weighted graph G = (V, E, w), the single-source shortest weighted path sproblem is to find the nearest paths from a vertex v is subsetof V to every other vertices in V, where V is the finite 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 306 International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) set of vertices, E is the finite set of edge, w is weight between the nodes of graph. Dijkstra's algorithm solves the single source shortest-path problem on directed and undirected graphs with non-negative weights edgs. Dijkstra's algorithm, find the minimum path from single vertex. Dijkstra’s algorithm is similar to Prim's minimum spanning tree algorithm. Resembling Prim's algorithm, it incrementally find the shortest paths from s to the other vertices of graph (G). It is also greedy. i.e., it always selects an edge to a vertex that appeared closest.[13] The algorithm works as follows: 1. Assign to every node a tentative distance value: set it to zero for our initial node and to infinity for all other nodes. 2. Mark all nodes unvisited. Set the initial node as current. Create a set of the unvisited nodes called the unvisited set consisting of all the nodes except the initial node. 3. For the current node, consider all of its unvisited neighbors and calculate the tentative distances. For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6+2=8. If this distance isless than the previously recorded tentative distance of B, then overwrite that distance. Even though a neighbor has been examined, it is not marked as “visited” at this time, and it remains in the unvisited set. 4. When complete the neighbors of the current node and mark the current node as visited and remove it from the unvisited set. A visited node will never be checked again. 5. If the destination node has been marked visited (when planning a route between two specific nodes) or if the smallest tentative distance among the nodes in the unvisited set is infinity (when planning a complete traversal), then stop. The algorithm has finished. 6.Select the unvisited node that is marked with the smallest tentative distance, and set it as the new “current node” then go back to step 3. Dijkstra’s algorithm works the single vertex to all other vertex.It consists of the single source. But it can also be used to solve the all-pairs shortest paths problem by executing the single-source algorithm on each process, for eachvertex v. This algorithm as Dijkstra's all-pairs shortest paths algorithm, The complexity of the all-pairs algorithm is Q(n3).Dijkstra's shortest path algorithm is implement and existing, and the performances of parallel and serial execution are compared. The parallel time execution is better than the serial time execution. Because its take minimum time for the execution. The algorithm implementation was parallelized using OpenMP (Open Multi-Processing).[13] 3.2.1 EXAMPLE OF DIJKSTRA’S ALGORITHM Steps of Dijkstra’s Algorithm Above example shows the steps of Dijkstra’s algorithm using the flow of algorithm. (fig.5) 4. TECHNIQUES Figure (5) Dijkstra’s Algorithm The two algorithms used in this paper as Floyd Warshall and Dijkstra’s Algorithm. These Algorithms implemented in c, c++ for serial formulation. For large graph matrix problem solve these two algorithms as serially. By using dataset as large matrix got from search engine which consist node to node and its weight. Large matrix size consists as 500*500, 1000*1000, 2000*2000. Using large size matrix converted algorithm in serial form implanted in c, c++. And the serial form can be converted into the parallel form using OpenMP and MPI for reducing or consuming time and the efficiency. Requirement of software is c,c++. With the help of matrix size up to 2000 implemented serial form by using automatic program which consist or enter the size of matrix manually. 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 307 International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) And another program implanted as write matrix, it consist as enter the matrix for example 4*4 matrix etc. Create the file name as input which runs the main program as Floyd and Dijkstras algorithm. Following shows the serial implementation of Floyd and Dijkstra’s algorithm. Where N shows the input size of matrix and number of node, Ts is the serial time execution of the algorithm in second. It will be shows that the time execution of Dijkstra’s algorithm is smaller than the Floyd algorithm, Because Floyd algorithm solved as all pair shortest path where Dijkstra’s algorithm single source shortest path. Further implementation or work to be carried of these two algorithms is parallel implantation of these two algorithms. 5. STEPS 1. 2. 3. Serial Implementation of Floyd and Dijkstra’s algorithm. Parallel Implementation of Floyd and Dijkstra’s Algorithm. Comparison of serial and parallel. 6. PARALLEL PROGRAMMING ENVIRONMENT Parallel programming environment consist OpenMP. OpenMP stands for Open Multi-Processing. Following shows the theoretical idea of OpenMP and MPI. OpenMP is a paradigm and suitable application programming interface (API) for writing multiple threads programs on a shared memory computer.[17] There are three components of application programming interface such as compiler directive, runtime library routines and environment variables. OpenMP consist of the Fork-Join model. Fork-Join is the execution model for parallel programming execution. OpenMP programming performs operation on the single thread as the initial thread. The threads are executed as the sequentially or serial form, otherwise parallel construct create. Fig shows the fork-join model. Master thread is called as join. Derive thread and master thread is works simultaneously. At the starting of the program only one thread is active this thread is execute serially.[17] Ahead finishing point of the parallel section, those copied threads will stop or hang up, and only the master thread continue, which is called a join. The OpenMP API cover only user-directed parallelization, wherein the user evidently specify the performance to be taken by the compiler and runtime system in order to perform the program in parallel. OpenMP-compliant implementations are not required to check for dependency, conflict, deadlocks, race conditions, or other problems that effect from non-conforming programs. The user Create the compliant program for using OpenMP. OpenMP does not cover compiler produce continual parallelization and orders to the compiler to help parallelization. The Fig.6 shows that Fork- join model in OpenMP.[17](Fig.6) Figure (6) fork join model in OpenMP PCG algorithms parallelize using the OpenMP. The reason is that OpenMP supports incrementally parallelization. That means, the common of the serial code is not changed and the user only needs to identify and parallelize just the most timeconsuming parts of the code, which are usually apply in loop. 7. MESSAGE PASSING INTERFACE (MPI) Basically, it consists of the parallel programming. Message passing interface is a programming concept. Parallel computers basically use for the Network of Workstations. Various machines across the degree of portability, the major goal of message passing interface. Between the any pair of MPI provides message passing. Set of process of communication pattern used for graph. Graph consists of nodes, edges. Where node is represented as process and edges represented as connects the process. [10] Figure (7) MPI It is a process communicating through messages. Above figure (fig (7)) shows the communication of message passing. Public domain systems demonstrated – MPI can be efficiently and portably implemented. Each vendor has implemented its own variant. Molecular dynamics simulations are used in order to Retrieve information about the processes which take place inside an atomistic system that cannot be accessed through the experiments. For large systems, the simulations can take a lot of time and this is why MPI parallelization is used in order to process data in less time. When working with a large set of data which is subject to an algorithm, simulations can take a lot of time. A solution of this matter is the MPI parallelization makes the processing of data in a small amount of time.[10] 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 308 International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC 2017) 8. CONCLUSION In this paper studied the shortest path algorithm to solve the all pair shortest path and single source shortest path of graph. Finding minimum spanning tree of graph studied the two sequential algorithms. Graph calculates the single source shortest path and all pair shortest path using two serial algorithms, it can be converted into parallel which resulting the number of iterations reduce consisting less time. Based on two sequential algorithms Floyd and Dijkstra’s find the serial implementation. Then using OpenMP and MPI convert the serial code into parallel as less amount of time based on the speedup factor and efficiency and cost. Initial stage of above two algorithms is serial implementation. Further implement will parallelize the above two algorithm and its comparison. 9. FEATURE SCOPE By using OpenMP consuming or reducing the execution time for designing the parallel programs, so large problems can be solve in less amount of time. As well as compute the performance evaluation parameter speedup and efficiency based on the algorithms. [12] Qiang Peng, Yulian Yu and Wenhong Wei, “The Shortest Path Parallel Algorithm on Single Source Weighted Multi-level Graph”, 2009 Second International Workshop on Computer Science and Engineering, 2009. [13] Fang Zhou Lin, Nan Zhao, „Parallel Implementation of Dijkstra's Algorithm“ , pp COP5570, 1999. [14] Dandan Li, Xiaohui Ji, Qun Wang”A Parallel PCG Solver for Largescale Groundwater Flow Simulation Based on OpenMP ”, Fourth International Symposium on Parallel Architectures, Algorithms and Programming, p306-p309,2011. [15] P.Harish, P.J. Narayan. "Accelerating Large Graph Algorithms on the GPU Using CUDA". HiPC (2007). Volume 4873, pi 97-208,2007. [16] R. W. Floyd. Algorithm 97: Shortest path. Commun. ACM, 5(6):345, 1962. [17] Dandan Li, Xiaohui Ji, Qun Wang, “A Parallel PCG Solver for Largescale Groundwater Flow Simulation Based on OpenMP”, Fourth International Symposium on Parallel Architectures, Algorithms and Programming,pp306-309,2011. [18] C. Demetrescu and G. Italiano,“A new approach to dynamic all pairs shortest paths,” Journal of the ACM, vol. 51, no. 6, pp. 968–992, 2004. [19] R. Hassin and E. Zemel, “On shortest paths in graphs with random weights,” Mathematics of Operations Research, vol. 10, no. 4, pp. 557– 564, 1985. [20] A. Frieze and G. Grimmett, “The shortest-path problem for graphs with random arc-lengths, ”Discrete Applied Mathematics, vol. 10, pp. 57–77, 1985. [21] Da-chuan Wei, ”Implementation of Route Selection Function Based on Improved Floyd Algorithm”, WASE International Conference on Information Engineering,pp 223-227,2010. 10. REFERENCES [1] Guoqing Lei, Yong Dou, Rongchun Li, and Fei Xia,” An FPGA Implementation for Solving the Large Single Source-Shortest-Path Problem”, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS— II, VOL. 63, NO. 5, pp473-477, MAY 2016. [2] Quoc-Nam Tran,” Designing Efﬁcient Many-core Parallel Algorithms for All-Pairs Shortest-Paths using CUDA”, Seventh International Conference on Information Technology, p435p675, 2010. [3] Nicolae Goga, Oludele Awodele,”Improved GROMACS Algorithms using the MPI Parallelization”,The 5th IEEE International Conference on E-Health and Bioengineering – EHB, 2015.Desalegn Getachew Melese , Huagang Xiong ,Qiang Gao ,”Consumed Energy as factor for Cluster Head Selection”,IEEE Conference,pp.1-4,2010. [4] Yulian Yu, Wenhong Wei ”The Shortest Path Parallel Algorithm on Single Source Weighted Multi-level Graph”,Second International Workshop on Computer Science and Engineering ,2009 [5] Osama G. Attia, Alex Grieve, Kevin R. Townsend, Phillip Jones, and Joseph Zambreno ,”Accelerating All-Pairs Shortest Path Using a Message-Passing Reconﬁgurable Architecture”,2015. [6] Danny Z. Chen, “Solving the All-Pair Shortest Path Problem on Interval and Circular-Arc Graphs.(Preliminary Version)“,The National Science Foundation,1994 [7] Uri Zwick, Yuval Peres,”All-Pairs Shortest Paths in O(n2) time with high probability”51st Annual Symposium on Foundations of Computer Science,,2010 IEEE [8] B.Neelima , Rithesh G. Prabhu and Anjjan S Narayan,”Yet Another Proposal for All Pair Shortest Path on GPU”,2014 [9] D.S Banejee, S.Sharma, K. Kothapalli. Work Efficient Parallel Algorithms for Large Graph Exploration. High Performance Computing (HPC), 2013 20th International Conference. p433-442, 18 Dec. 2013. [10] Clay P. Breshears, ”From Shared to Distributed Memory: Converting Non-Numeric Parallel Algorithms to Message Passing Interface (MPI)”,MIT press,1997.Guoliang Xing,Min ming Li,Tian Wang,Weijia jia ,Jun Huang,”Efficient Rendezvous Algorithms for mobility-Enabled wireless Sensor Network ”,IEEE Transaction on mobile computing ,vol.11,no.1,jan 2012. [11] Y. Zhao, Y. Han, Z. Fan, F. Qiu, Y.-C. Kuo, A. E. Kaufman, and K. Mueller, “Visual simulation of heat shimmering and mirage,” IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 1, pp. 179–189, 2007. 978-1-5090-3243-3/17/$31.00 ©2017 IEEE 309

1/--страниц