close

Вход

Забыли?

вход по аккаунту

?

978-3-319-69462-7 3

код для вставкиСкачать
Network-Aware Stochastic Virtual Machine
Placement in Geo-Distributed Data Centers
(Short Paper)
Hana Teyeb1,2(B) , Nejib Ben Hadj-Alouane3 , and Samir Tata4
1
Faculty of Sciences of Tunis, University of Tunis El Manar, OASIS, Tunis, Tunisia
hana.teyeb@gmail.com
2
SAMOVAR, Telecom SudParis, CNRS, University of Paris-Saclay,
9 Rue Charles Fourier, 91011 Evry, France
3
National Engineering School of Tunis, OASIS, Tunis, Tunisia
nejib bha@yahoo.com
4
IBM Research - Almaden, 650 Harry Rd, San Jose, CA 95120, USA
stata@us.ibm.com
Abstract. In this work, we focus on the stochastic network-aware virtual machine placement (VM) problem in geodistributed data centers
(DCs). We consider the uncertainty of the inter-VMs traffic while making placement and migration decisions. First, we propose a stochastic
program with the objective of minimizing inter-DCs traffic. Then, we
propose an equivalent optimization model using sampling methods and
we present a two-step approach to solve the problem. Experiments show
the effectiveness of the proposed approach.
Keywords: Cloud computing · IaaS · VM placement · VM migration
Stochastic integer programming
1
·
Introduction
In order to achieve reliability and serve world wide users, large-scale cloud
providers are relying on a geo-distributed infrastructure where data centers
(DCs) are built in different locations and interconnected within a backbone network. In such a context, network congestion is a crucial issue. The increasing
traffic exchange of the applications hosted in the VMs may cause bottlenecks
in the network resulting in performance degradation of the cloud system. Thus,
it is important to minimize the traffic circulating within the backbone traffic.
Recent studies [1–3] have shown that the workload of VMs is highly dynamic
which may cause the existent placement and migration schemes to be inefficient.
Most of the existent works [4–6] make migration decisions based on deterministic
demand estimation and workload characterization without considering stochastic
properties. There are several optimization techniques to solve the VM placement
problem. Among these techniques, we cite deterministic and Stochastic Integer
c Springer International Publishing AG 2017
H. Panetto et al. (Eds.): OTM 2017 Conferences, Part I, LNCS 10573, pp. 37–44, 2017.
https://doi.org/10.1007/978-3-319-69462-7_3
38
H. Teyeb et al.
Programming (SIP) [7]. In contrast to deterministic approach, the SIP technique
considers uncertain parameters. Many existing works have used SIP to deal with
resource provision, load balancing and capacity planning problems [1,8,9]. Others have studied VM consolidation problem with stochastic demand [10,11]. In
[12], the authors have proposed a joint approach that combines VM placement
and bandwidth allocation. The main limitation of the aforementioned works is
the fact that they did not consider inter-VMs communication while making the
placement decisions. In this paper, we propose stochastic integer programming
(SIP) formulation that aims to solve the network-aware VM placement problem
in geo-distributed DCs while minimizing the inter-DCs traffic. The remainder
of this paper is organized as follows. In Sect. 2, it presents the problem formulations as well as the proposed heuristic for solving the network-aware VM
placement problem in geo-distributed DCs. As for Sect. 3, presents and discusses
the experiment results. Finally, we conclude in Sect. 4.
2
Problem Formulation
Table 1 presents the notations used in the presented formulations. DC edge
routers are responsible for connecting the DCs to Wide Area Network (WAN)
[13]. In the following, we use the decision variables defined below.
• ϕhi defines the amount of traffic originated from the VM i ∈ V and sent from
the DC h ∈ D (i.e. the traffic sent to the backbone network).
• xhi is equal to 1 if the VM i ∈ V is placed in the DC h ∈ D, 0 otherwise.
• zih is equal to 1 if the VM i is migrated from the DC h ∈ D.
i
which denotes the amount of traffic originated from the VM i ∈ V and
• fhk
circulating between the DCs h ∈ D and k ∈ D.
Table 1. Notations.
Symbol Description
D
The set of data centers
V
The set of virtual machines
R
The set of hardware resources (CPU, RAM, storage)
Mi
The migration cost of the VM i (i ∈ V ).
It is the amount of data transferred when the VM i is migrated
aki
Takes 1 if the VM i can be placed in the DC k, 0 otherwise (i ∈ V ,
k ∈ D)
chr
The capacity of the DC h in terms of resource r (h ∈ D, r ∈ R)
uir
The amount of resource r consumed by the VM i (r ∈ R, i ∈ V )
Eh
The bandwidth capacity of the edge router of the DC h ∈ D
xh0i
is equal to 1 if the VM i is already placed in the DC h.
(i ∈ V , h ∈ D)
Network-Aware Stochastic Virtual Machine Placement
2.1
39
Stochastic Optimization Model
In this formulation, we consider a random variable d
ij that describes the
amount of traffic exchanged between each pair of VMs (i.e. inter-VMs bandwidth
demand). The variable follows a probability distribution that can be estimated
from runtime measurement. We assume that the distribution can be obtained
using statistical process to analyze historical data. Many studies [1,10,11] have
shown that the resource demand of VMs follows a Normal distribution N (μ, σ 2 ).
Thus, we consider that the inter-VMs bandwidth demand follows also the Normal
distribution. Our aim is to minimize the amount of traffic circulating between
the different DCs.
Mi .zih + ϕhi
(1)
min
i∈V h∈D
Subject to the following constraints:
h
xh .d
d
ϕh ij .x −
ij
i
i
j∈V
k∈D
P r(
i
fhk
−
i
fkh
=
k∈D
Mi .zih +
i∈V
ahi
uir .xhi chr
i∈V
zih xh0i −
zih ∈ {0.1}
xki ∈ {0.1}
ϕhi 0
i
fhk
0
j∈V
h
d
ij xi −
h
d
ij .xj
∀i ∈ V, ∀h ∈ D
(2)
∀i ∈ V, ∀h ∈ D
(3)
∀h ∈ D
(4)
∀i ∈ V
(5)
∀i ∈ V, ∀h ∈ D
(6)
∀r ∈ R, ∀h ∈ D
(7)
∀i ∈ V, h ∈ D
(8)
j∈V
ϕhi Eh ) 1 − i∈V
xhi = 1
h∈D
xhi j
j∈V
xhi
∀i ∈ V, h, k ∈ D
∀i ∈ V, k ∈ D
∀i ∈ V, h, k ∈ D
∀i ∈ V, h, k ∈ D
The constraints (2) and (3) are both flow conservation constraints. They are
both stochastic due to the random variable d
ij which refers to the inter-VMs
communication traffic. The constraint (4) ensures that for each DC edge router,
the total traffic, which includes the inter-VMs communication traffic and the
migration traffic, does not exceed the bandwidth capacity of the edge router
with a high probability (1 − ). The constraint (4) ensures the service quality
guarantee with a predefined threshold ε. The constraint (5) is a demand satisfaction constraint. As for the constraint (6) it restricts the placement of VMs in a
particular number of DCs. The constraint (7) represents the capacity constraint
on the DCs. It ensures that the amount of resources consumed by different VMs
40
H. Teyeb et al.
placed in a given DC does not exceed the resource capacities of the DC. The
constraint (8), ensures that only already existing VMs can be considered as
candidates for the migration. Suppose that it has finite support, hence, we can
enumerate the set of all different scenarios. We can then, formulate an equivalent
deterministic optimization problem that can be solved as a Mixed Integer Linear
Program (MILP). However, the size of the problem space can grow very large
as the number of scenarios increases. Therefore, in the next section, we propose
an alternative solution using sampling-based methods.
2.2
Equivalent Optimization Formulation
In order to solve the stochastic problem, we propose an equivalent formulation
h
using sampling methods. Let us consider the function g(x) =
j∈V dij .xi −
h ∀i ∈ V, ∀h ∈ D It is clearly impossible to enumerate all the posj∈V xj .dij ,
sible outcomes. Hence, sampling techniques are a commonly used tool. In order
to discretize the stochastic function g(.), we apply Sample Average Approximation (SAA) method [14]. Sampling-based methods often approximate well, with
a small number of samples, problems that have a very large number of scenarios [15]. In this work, we use Monte Carlo methods to generate samples of
N = {1, . . . , n} replications of the random variable d
ij using the Normal distrib2
2
), where μij is the mean and σij
is the variance. Let us consider
ution N (μij , σij
of the stochastic function g(.) by applythe function gn (.), as the discretization
n
ing SAA methods, gn (x) = n1 i=1 g(x, ξi ) Where ξi is a random element such
n
1
k
that: d
ij = n
k=1 ξij and n is the number of iterations. Hence, the equivalent
deterministic constraint of (2) is obtained by replacing g(x) by gn (x).
gn (x) =
n
n
1
1
k
k
(
ξij
).xhi −
(
ξij
).xhj
n
n
j∈V
j∈V
k=1
∀i ∈ V, ∀h ∈ D
(9)
k=1
Thus, the constraint (2) becomes ϕhi gn (x) ∀i ∈ V, ∀h ∈ D. As mentioned above, we consider that the inter-VMs bandwidth demand d
ij follows
2
the Normal distribution N (μij , σij ). Hence, we can estimate dij by its mean
h
i
μij (d
∀h ∈
ij μij ). Let us consider the following equation, ϕi =
k∈D fhk ,
i
D, ∀i ∈ V . The value of fhk can be obtained from the flow conservation constraint
i
i
h
h ∀i ∈ V, ∀h ∈ D. If we
(3) k∈D fhk
= k∈D fkh
+ j∈V d
ij xi −
j∈V dij .xj ,
replace (Sect. 2.2) in (4), we obtain:
i
h
h
Mi .zih +
(
fkh
+
d
d
P r(
ij xi −
ij .xj ) Eh ) 1 − ∀h ∈ D
i∈V
i∈V k∈D
j∈V
j∈V
(10)
2
Since, d
ij follows the Normal distribution N (μij , σij ), then, because we assume
that the traffic of each pair of VM (i, j) ∈ V , is independent if i = j,
the aggregate traffic demand
i∈V
j∈V dij follows the Normal distribution
Network-Aware Stochastic Virtual Machine Placement
41
2
N ( i∈V j∈V μij , i∈V j∈V σij
) according to the property of normal distri
bution and Central Limit Theorem (CLT). Note that, the term i∈V Mi .zih is
deterministic, thus, it does not follow a probability distribution. Let us denote
h h
by αh =
i∈V
j∈V dij xi −
j∈V dij .xj . We need to estimate the Normal
distribution parameters μαh and σα2 h . Since, xhi ∈ {0, 1}, ∀i ∈ V, h ∈ D,
and because we assume that the traffic of each pair of VM (i, j) ∈ V , is
independent if i = j, then,by applying
SAA methods,
we can estimate μαh
and σα2 h as follows μαh = i∈V j∈V μij xhi − i∈V j∈V μij .xhj , ∀h ∈ D,
2 h
2
xi + i∈V j∈V σij
.xhj , ∀h ∈ D. Hence, it easy to show
σα2 h = i∈V j∈V σij
that the constraint (10) is equal to the overloading probability constraint presented in (11), where φ−1 (.) is the inverse of the cumulative distribution function
of the Standard Normal distribution. In this work, we consider that 0.5 and
φ−1 (1 − ) 0.
Eh −
i∈V
k∈D
i
fkh
+
i∈V
i∈V
Mi .zih +
2 h
j∈V σij xi
i∈V
+
i∈V
j∈V
μij xh
i −
2
h
j∈V σij .xj
i∈V
j∈V
μij .xh
j
−1
φ
(1 − )
(11)
The constraint can be written as follows:
i
Eh Mi .zih +
fkh
+ μαh + φ−1 (1 − ). σα2 h ∀h ∈ D
i∈V
(12)
i∈V k∈D
The objective function as well as the rest of constraints are the same. We replace
d
ij by μij . One of the challenges of this formulation is the non-linearity of the
constraint (12). The linearization of this constraint leads to a very large number
of variable which will enlarge the research space. To cope with this problem, we
propose, in the next section, an iterative two-step approach.
2.3
Network-Aware Stochastic VM Placement Algorithm
h
Let us denote by β h = φ−1 (1 − ). σα2 h and γ h =
i∈V Mi .zi +
i
i∈V
k∈D fkh + μαh . We denote by SIPγ+β , the optimization program presented in Sect. 2.2. We define SIPγ , the optimization program where the constraint (12) is replaced by Eh γ h ∀h ∈ D. Let us define the list P ref List
which contains the different values of ε. P ref List is sorted by increasing value of
ε. Smaller ε means that the risk of overloading must be very low. The algorithm
tries to solve the stochastic optimization problem within two iterations. In the
first iteration, it solves the SIPγ model without considering the non-linear term
β h . In fact, the optimization program SIPγ provides a deterministic VM placement scheme as it does not consider the variance of inter-VMs communication
traffic in the future. In addition, the term γ ensures that the overall traffic sent
via the DC edge router does not exceed its capacity. Afterword, the algorithm
evaluates the term β h by considering the solution provided by SIPγ . In the
second iteration, it tries to solve the SIPγ+β model by adding the term β h to
the overloading probability constraint. If the SIPγ+β model is feasible, then the
new placement scheme is the solution provided by SIPγ+β , otherwise, we relax
42
H. Teyeb et al.
Algorithm 1. Stochastic VM Placement Algorithm.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Input: Initial Placement scheme, stochastic traffic matrix, P ref List
Output: New VMs Placement Plan
ε ← P ref List[0]
Solve SIPγ
Record solutions
Calculate β h with the solutions provided by SIPγ
Solve SIPγ+β
if SIPγ+β is feasible then
New Placement Plan ← Solutions of SIPγ+β
else
i←i+1
end
while SIPγ+β is not feasible and i P ref List.size() do
ε ← P ref List[i]
Check feasibility of SIPγ+β
i←i+1
end
if SIPγ+β is feasible then
New Placement Plan ← Solutions of SIPγ+β
else
New Placement Plan ← Solutions of SIPγ
Add SLA violation
end
return New Placement Plan
the value of ε and try to solve it again. The new placement scheme is either the
solution provided by SIPγ+β , if it exists, or the solution provided by SIPγ with
an SLA violation due to the high risk of network overload.
3
Performance Evaluation
The different experiments were carried out on a machine that has an Intel Xeon
3; 3 GHz CPU and 8 GB of RAM. We have used the commercial solver CPLEX
12.5 [16] to solve the different formulations. In all experiments, we have fixed
the number of DCs to six. We generated for each pair of VMs traffic a sample of
10000 replications according to Normal distribution. Then, we applied SAA to
approximate the values of the traffic matrix. We randomly generated groups of
(mean, variance range) for the inter-VM traffic and set each pair of VMs traffic to
a value generated by a randomly chosen group. We assume that client’s demands
are independent. We have implemented the deterministic equivalent model in
Java with the above listed parameters. We studied first the performance of the
proposed algorithm in terms of computational time. The results depicted in
Fig. 1 show that the value of ε has no considerable impact on the execution time
of the model. We note that the execution time does not exceed 10 s for a total
Network-Aware Stochastic Virtual Machine Placement
43
Fig. 1. Execution Time versus the total number of VMs.
Fig. 2. Variantion of the number of migrations for ε ∈ {0.1, 0.01, 0.001}.
number of VM |V | = 1800. The value of the parameter ε affects the bandwidth
utilization. Smaller ε requires the system to reserve more bandwidth in order
to accommodate the possible variance of inter-VM traffic. To ensure the non
violation of the overloading probability with smaller ε, some VMs may have
to be migrated to another DC. We have varied the value of and we plotted
the number of migrations performed for each value. We have considered that
the bandwidth capacity of the DC edge router are large enough to satisfy the
bandwidth demand for all values of ε. The results depicted in Fig. 2 show that
smaller ε produces smaller number of migration. This can be explained by the
fact that smaller ε means that the risk of network overloading is very small. Since,
migration produces additional traffic, SIPγ+β tries to minimize the number of
migration in order to prevent from extensive migrations.
4
Conclusion
In this work, we proposed a SIP formulation that aims to solve the network-aware
VM placement problem in geo-distributed DCs. We considered the uncertainty
of inter-VMs traffic. Our objective was to minimize the overall traffic circulating
in the backbone network in order to prevent from congestion problems. In order
to solve the problem, we proposed an equivalent optimization model based on
sampling methods as well as an efficient algorithm to cope with non-linearity
problem.
44
H. Teyeb et al.
References
1. Yu, L., Chen, L., Cai, Z., Shen, H., Liang, Y., Pan, Y.: Stochastic load balancing for
virtual resource management in datacenters. IEEE Trans. Cloud Comput. PP(99),
1 (2016)
2. Benson, T., Akella, A., Maltz, D.A.: Network traffic characteristics of data centers
in the wild. In: Proceedings of the 10th ACM SIGCOMM Conference on Internet
Measurement, pp. 267–280 (2010)
3. Kandula, S., Sengupta, S., Greenberg, A., Patel, P., Chaiken, R.: The nature of
data center traffic: measurements & analysis. In: Proceedings of the 9th ACM
SIGCOMM Conference on Internet Measurement Conference, pp. 202–208 (2009)
4. Beloglazov, A., Buyya, R.: Managing overloaded hosts for dynamic consolidation
of virtual machines in cloud data centers under quality of service constraints. IEEE
Trans. Parallel Distrib. Syst. 24(7), 1366–1379 (2013)
5. Xiao, Z., Song, W., Chen, Q.: Dynamic resource allocation using virtual machines
for cloud computing environment. IEEE Trans. Parallel Distrib. Syst. 24(6), 1107–
1117 (2013)
6. Gong, Z., Gu, X., Wilkes, J.: Press: Predictive elastic resource scaling for cloud
systems. In: International Conference on Network and Service Management, pp.
9–16 (2010)
7. Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming
- Modeling and Theory, vol. 16, 2nd edn. SIAM, Philadelphia (2014)
8. Maguluri, S.T., Srikant, R., Ying, L.: Stochastic models of load balancing and
scheduling in cloud computing clusters. In: INFOCOM Proceedings IEEE, pp.
702–710 (2012)
9. Ghosh, R., Longo, F., Xia, R., Naik, V.K., Trivedi, K.S.: Stochastic model driven
capacity planning for an infrastructure-as-a-service cloud. IEEE Trans. Serv. Comput. 7(4), 667–680 (2014)
10. Wang, M., Meng, X., Zhang, L.: Consolidating virtual machines with dynamic
bandwidth demand in data centers. In: INFOCOM Proceedings IEEE, pp. 71–75
(2011)
11. Jin, H., Pan, D., Xu, J., Pissinou, N.: Efficient VM placement with multiple deterministic and stochastic resources in data centers. In: 2012 IEEE Global Communications Conference, pp. 2505–2510 (2012)
12. Chase, J., Niyato, D.: Joint optimization of resource provisioning in cloud computing. IEEE Trans. Services Comput. PP(99), 1 (2015)
13. Corporate Headquarters: Data center networking: enterprise distributed data centers solutions reference nework design. In: Solutions Reference Network Design,
Cisco Systems Inc (2003)
14. Kim, S., Pasupathy, R., Henderson, S.G.: A Guide to Sample Average Approximation. Handbook of Simulation Optimization. International Series in Operations
Research & Management Science, pp. 207–243. Springer, New York (2015). doi:10.
1007/978-1-4939-1384-8 8
15. Homem-de Mello, T., Bayraksan, G.: Monte carlo sampling-based methods for
stochastic optimization. Surveys Oper. Res. Manag. Sci. 19(1), 56–85 (2014)
16. IBM Corporation ILOG CPLEX: http://www.ilog.com/products/cplex/. Accessed
04 Feb 2013
Документ
Категория
Без категории
Просмотров
0
Размер файла
331 Кб
Теги
978, 69462, 319
1/--страниц
Пожаловаться на содержимое документа