ENERGY-AWARE LOAD BALANCING AND
APPLICATION SCALING FOR THE CLOUD ECOSYSTEM
ABSTRACT:
In this paper we introduce an energy-aware operation model used for load balancing and application scaling in a cloud. The basic philosophy of our approach is to define an energy-optimal operation regime and attempt to maximize the number of servers operating in this regime. Idle and lightly-loaded servers are switched to one of the sleep states to save energy. The load balancing and scaling algorithms also exploit some of the most desirable features of the server consolidation mechanisms discussed in the literature.
EXISTING SYSTEM:
In the last few years, packaging computing cycles and storage and offering them as a metered service has become a reality. Large farms of computing and storage platforms have been assembled, and a fair number of cloud service providers (CSPs) offer computing services based on three cloud delivery models: SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). Warehouse-scale computers (WSCs) are the building blocks of a cloud infrastructure. A hierarchy of networks connects 50,000 to 100,000 servers in a WSC. The servers are housed in racks; typically, the 48 servers in a rack are connected by a 48-port Gigabit Ethernet switch. The switch has two to eight uplinks which go to the higher-level switches in the network hierarchy. Cloud elasticity, the ability to use as many resources as needed at any given time, and low cost, since a user is charged only for the resources it consumes, represent solid incentives for many organizations to migrate their computational activities to a public cloud.
The number of CSPs, the spectrum of services offered by the CSPs, and the number of cloud users have increased drastically during the last few years. For example, in 2007 the EC2 (Elastic Compute Cloud) was the first service provided by AWS (Amazon Web Services); five years later, in 2012, AWS was used by businesses in 200 countries. Amazon's S3 (Simple Storage Service) has surpassed two trillion objects and routinely handles more than 1.1 million peak requests per second.
PROPOSED SYSTEM:
There are three primary contributions of this paper:
(1) a new model of cloud servers that is based on different operating regimes with various degrees of energy efficiency (processing power versus energy consumption);
(2) a novel algorithm that performs load balancing and application scaling to maximize the number of servers operating in the energy-optimal regime; and
(3) analysis and comparison of techniques for load balancing and application scaling using three differently-sized clusters and two different average load profiles.
Models for energy-aware resource management and application placement policies, and the mechanisms to enforce these policies such as the ones introduced in this paper, can be evaluated theoretically, experimentally, through simulation, based on published data, or through a combination of these techniques. Analytical models can be used to derive high-level insight on the behavior of the system in a very short time, but the biggest challenge is in determining the values of the parameters; while the results from an analytical model can give a good approximation of the relative trends to expect, there may be significant errors in the absolute predictions.
Experimental data is typically collected on small-scale systems; such experiments provide useful performance data for individual system components, but no insight into the interaction between the system and the applications, or into the scalability of the policies. Trace-based workload analyses are very useful, though they provide information only for a particular experimental setup, hardware configuration, and set of applications. Typically, trace-based simulations need more time to produce results.
Traces can also be very large, and it is hard to generate representative traces from one class of machines that will be valid for all the classes of simulated machines. To evaluate the energy-aware load balancing and application scaling policies and mechanisms introduced in this paper, we chose simulation using data published in the literature.
MODULE 1
LOAD BALANCING IN CLOUD COMPUTING
Cloud computing is a term which involves virtualization, distributed computing, networking, software, and web services. A cloud consists of several elements such as clients, datacenters, and distributed servers. It includes fault tolerance, high availability, scalability, flexibility, reduced overhead for users, reduced cost of ownership, on-demand services, etc. Central to these issues lies the establishment of an effective load balancing algorithm. The load can be CPU load, memory capacity, delay, or network load. Load balancing is the process of distributing the load among various nodes of a distributed system to improve both resource utilization and job response time, while also avoiding a situation where some of the nodes are heavily loaded while other nodes are idle or doing very little work. Load balancing ensures that every processor in the system or every node in the network does approximately the same amount of work at any instant of time. This technique can be sender-initiated, receiver-initiated, or symmetric (a combination of the sender-initiated and receiver-initiated types). Our objective is to develop an effective load balancing algorithm using divisible load scheduling theory to maximize or minimize different performance parameters (for example, throughput and latency) for clouds of different sizes (the virtual topology depending on the application requirements).
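As a minimal illustration of this kind of balancing decision (a sketch under simplifying assumptions, not the algorithm proposed in this paper), the following Python fragment assigns each incoming task to the least-utilized node; the Node class, the capacities, and the demands are all hypothetical:

```python
# A sketch of a simple "least-utilized node" balancing decision; Node,
# assign_task, and the capacities below are hypothetical, not from the paper.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    capacity: float    # maximum load the node can carry
    load: float = 0.0  # current load

    def utilization(self) -> float:
        return self.load / self.capacity

def assign_task(nodes: list[Node], demand: float) -> Node:
    """Send the task to the least-utilized node (a receiver-style policy)."""
    target = min(nodes, key=Node.utilization)
    target.load += demand
    return target

cluster = [Node("n1", 100), Node("n2", 100), Node("n3", 100)]
for demand in (10, 40, 25, 30, 15):
    n = assign_task(cluster, demand)
    print(f"task({demand}) -> {n.name}, utilization now {n.utilization():.2f}")
```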
MODULE 2
ENERGY EFFICIENCY OF A SYSTEM
The energy efficiency of a system is captured by the ratio of performance per Watt of power. During the last two decades the performance of computing systems has increased much faster than their energy efficiency.
Energy-proportional systems. In an ideal world, the energy consumed by an idle system should be near zero and should grow linearly with the system load. In real life, even systems whose energy requirements scale linearly use, when idle, more than half the energy they use at full load. Data collected over a long period of time shows that the typical operating regime for data center servers is far from an optimal energy consumption regime. An energy-proportional system consumes no energy when idle, very little energy under a light load, and gradually more energy as the load increases; an ideal energy-proportional system is always operating at 100% efficiency.
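To make the notion of energy proportionality concrete, here is a small Python sketch of the linear power model implied above, P(u) = P_idle + (P_peak - P_idle) * u; the 100 W idle and 200 W peak figures are illustrative assumptions, chosen so that an idle server draws half of its peak power, as described in the text:

```python
# A sketch of the linear power model implied above:
#   P(u) = P_idle + (P_peak - P_idle) * u,  u in [0, 1].
# The 100 W idle / 200 W peak figures are illustrative assumptions.

def power(u: float, p_idle: float = 100.0, p_peak: float = 200.0) -> float:
    """Power draw (watts) of a server whose power scales linearly with load u."""
    return p_idle + (p_peak - p_idle) * u

def relative_efficiency(u: float) -> float:
    """Performance per watt, normalized to 1.0 for a fully loaded server."""
    return u * power(1.0) / power(u) if u > 0 else 0.0

for u in (0.0, 0.1, 0.5, 1.0):
    print(f"u={u:.1f}  P={power(u):6.1f} W  efficiency={relative_efficiency(u):.2f}")
```

Even this linearly scaling server is far from energy-proportional: at 10% load it runs at under 20% of its peak efficiency, which is why consolidating work onto fewer, better-utilized servers saves energy.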
Energy efficiency of a data center; the dynamic range of subsystems. The energy efficiency of a data center is measured by the power usage effectiveness (PUE), the ratio of the total energy used to power a data center to the energy used to power computational servers, storage servers, and other IT equipment. The PUE has improved from around 1.93 in 2003 to 1.63 in 2005; recently, Google reported a PUE ratio as low as 1.15. The improvement in PUE forces us to concentrate on the energy efficiency of computational resources. The dynamic range is the difference between the upper and the lower limits of the energy consumption of a system as a function of the load placed on the system. A large dynamic range means that a system is able to operate at a lower fraction of its peak energy when its load is low.
Different subsystems of a computing system behave differently in terms of energy efficiency; while many processors have reasonably good energy-proportional profiles, significant improvements in memory and disk subsystems are necessary. The largest consumer of energy in a server is the processor, followed by memory and storage systems. The estimated distribution of the peak power among the hardware subsystems in one of Google's datacenters is: CPU 33%, DRAM 30%, disks 10%, network 5%, and others 22%. The power consumption can vary from 45 W to 200 W per multicore CPU. The power consumption of servers has increased over time; during the period 2001-2005 the estimated average power use has increased from 193 to 225 W for volume servers, from 457 to 675 W for mid-range servers, and from 5,832 to 8,163 W for high-end ones. Volume servers have a price less than $25 K, mid-range servers have a price between $25 K and $499 K, and high-end servers have a price tag larger than $500 K. Newer processors include power-saving technologies. The processors used in servers consume less than one third of their peak power at very low load and have a dynamic range of more than 70% of peak power; the processors used in mobile and/or embedded applications are better in this respect. The dynamic power range of other components of a system is much narrower: less than 50% for DRAM, 25% for disk drives, and 15% for networking switches. Larger servers often use 32 to 64 dual in-line memory modules (DIMMs); the power consumption of one DIMM is in the 5 to 21 W range.
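As a small worked example, the two metrics defined above can be computed directly; the inputs below simply re-use the illustrative figures quoted in the text (a 1.15 PUE, a processor-like dynamic range above 70%, and a DRAM-like range below 50%), so this is a sketch rather than a measurement:

```python
# A worked sketch of the two metrics; the inputs re-use figures quoted above
# and are illustrative, not new measurements.

def pue(total_facility_power: float, it_equipment_power: float) -> float:
    """Power usage effectiveness: total power over power used by IT equipment."""
    return total_facility_power / it_equipment_power

def dynamic_range(p_peak: float, p_min: float) -> float:
    """Fraction of peak power shed when the load drops to its minimum."""
    return (p_peak - p_min) / p_peak

print(f"PUE: {pue(1150.0, 1000.0):.2f}")                    # 1.15
print(f"CPU-like range: {dynamic_range(200.0, 55.0):.0%}")  # ~72%, above 70%
print(f"DRAM-like range: {dynamic_range(21.0, 11.0):.0%}")  # ~48%, below 50%
```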
MODULE 3
RESOURCE MANAGEMENT POLICIES FOR LARGE-SCALE DATA CENTERS
These policies can be loosely grouped into five classes:
· Admission control
· Capacity allocation
· Load balancing
· Energy optimization
· Quality of service (QoS) guarantees
The explicit goal of an admission control policy is to prevent the system from accepting workload in violation of high-level system policies; for example, a system should not accept additional workload that prevents it from completing work already in progress or contracted. Limiting the workload requires some knowledge of the global state of the system; in a dynamic system such knowledge, when available, is at best obsolete. Capacity allocation means allocating resources for individual instances; an instance is an activation of a service. Some of the mechanisms for capacity allocation are based on either static or dynamic thresholds. Economy of scale affects the energy efficiency of data processing.
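A minimal sketch of the static-threshold flavor of these mechanisms is shown below; the 80% threshold, the aggregate-utilization model, and the function name are illustrative assumptions, not details from the paper:

```python
# A minimal sketch of static-threshold admission control; the 80% threshold
# and the aggregate-utilization model are illustrative assumptions.

def admit(utilization: float, demand: float,
          capacity: float, threshold: float = 0.80) -> bool:
    """Accept new workload only if it keeps utilization under the threshold."""
    return utilization + demand / capacity <= threshold

print(admit(0.70, 5.0, 100.0))  # 0.75 <= 0.80 -> True, request accepted
print(admit(0.78, 5.0, 100.0))  # 0.83 >  0.80 -> False, request rejected
```

A dynamic-threshold variant would adjust the threshold from observed load trends rather than keeping it fixed.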
For example, Google reports that the annual energy consumption for an email service varies significantly depending on the business size and can be 15 times larger for a small business than for a large one. Cloud computing can be more energy efficient than on-premises computing for many organizations.
MODULE 4
SERVER CONSOLIDATION
The term server consolidation is used to describe:
· Switching idle and lightly loaded systems to a sleep state;
· Workload migration to prevent overloading of systems;
· Any optimization of cloud performance and energy efficiency by redistributing the workload.
Server consolidation policies. Several policies have been proposed to decide when to switch a server to a sleep state. The reactive policy responds to the current load; it switches the servers to a sleep state when the load decreases and switches them to the running state when the load increases. Generally, this policy leads to SLA violations and could work only for slowly varying, predictable loads. To reduce SLA violations one can envision a reactive with extra capacity policy, in which one attempts to maintain a safety margin and keeps running a fraction of the total number of servers, e.g., 20% above those needed for the current load. The autoscale policy is a very conservative reactive policy; it is cautious in switching servers to a sleep state, in order to avoid the power consumption and the delay of switching them back to the running state. This can be advantageous for unpredictable loads. The moving window policy estimates the workload by measuring the average request rate in a window of a given size in seconds, uses this average to predict the load for the next second, then slides the window forward and repeats. The predictive linear regression policy uses a linear regression to predict the future load.
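The two predictive policies just mentioned can be sketched in a few lines of Python; the window size, the sample request rates, and the function names below are illustrative assumptions rather than the paper's implementation:

```python
# A sketch of the two predictive policies named above: a moving-window average
# and a linear-regression forecast of the request rate. The window size and
# the sample data are illustrative assumptions.
from collections import deque

def moving_window_predict(samples) -> float:
    """Predict the next-second load as the mean request rate over the window."""
    return sum(samples) / len(samples)

def linear_regression_predict(samples) -> float:
    """Fit load = a*t + b over the window and extrapolate one step ahead."""
    n = len(samples)
    t_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return y_mean + slope * (n - t_mean)  # value at t = n, one step ahead

window = deque((100, 110, 125, 135, 150), maxlen=5)  # request rates per second
print(moving_window_predict(window))      # 124.0: ignores the rising trend
print(linear_regression_predict(window))  # 161.5: extrapolates the trend
```

On a steadily rising load the window average lags behind, while the regression extrapolates the trend; this is why the regression-based policy can wake servers earlier and reduce SLA violations.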
MODULE 5
ENERGY-AWARE SCALING ALGORITHMS
The objective of the algorithms is to ensure that the largest possible number of active servers operate within the boundaries of their respective optimal operating regimes. The actions implementing this policy are:
· Migrate VMs from a server operating in the undesirable low regime and then switch the server to a sleep state;
· Switch an idle server to a sleep state, and reactivate servers in a sleep state when the cluster load increases;
· Migrate the VMs from an overloaded server, i.e., a server operating in the undesirable high regime or with applications predicted to increase their demands for computing in the next reallocation cycles.
For example, when deciding to migrate some of the VMs running on a server or to switch a server to a sleep state, we can adopt a conservative policy similar to the one advocated by autoscaling to save energy. Predictive policies, such as the ones discussed above, will be used to allow a server to operate in a suboptimal regime when historical data regarding its workload indicates that it is likely to return to the optimal regime in the near future.
The cluster leader has relatively accurate information about the cluster load and its trends. The leader could use predictive algorithms to initiate a gradual wake-up process for servers in a deeper sleep state, C4-C6, when the workload is above a high water mark and the workload is continually increasing. We set the high water mark at 80% of the capacity of the active servers; a threshold of 85% is used for deciding that a server is overloaded, based on an analysis of workload traces. The leader could also choose to keep a number of servers in the C1 or C2 states because it takes less energy and time to return to the C0 state from these states. The energy management component of the hypervisor can use only local information to determine the regime of a server.
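A compact sketch of these decision rules is given below; the 80% high water mark and the 85% overload threshold come from the text, while the lower boundary of the optimal regime and the action labels are illustrative assumptions:

```python
# A sketch of the regime decisions described above. The 80% high water mark
# and the 85% overload threshold are quoted in the text; the lower boundary
# of the optimal regime and the action labels are illustrative assumptions.

HIGH_WATER_MARK = 0.80  # cluster level: start waking sleeping servers
OVERLOADED = 0.85       # server level: migrate VMs away
LOW_REGIME = 0.30       # assumed lower boundary of the optimal regime

def server_action(utilization: float) -> str:
    if utilization == 0.0:
        return "switch idle server to a sleep state"
    if utilization < LOW_REGIME:
        return "migrate VMs away, then switch server to a sleep state"
    if utilization > OVERLOADED:
        return "migrate some VMs to other servers"
    return "keep server in the optimal regime"

def cluster_action(avg_utilization: float, load_rising: bool) -> str:
    if avg_utilization > HIGH_WATER_MARK and load_rising:
        return "gradually wake servers in deep sleep states (C4-C6)"
    return "no cluster-level action"

print(server_action(0.10))                     # below the low regime
print(server_action(0.90))                     # above the overload threshold
print(cluster_action(0.82, load_rising=True))  # above the high water mark
```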
CONCLUSION:
The realization that the power consumption of cloud computing centers is significant, and is expected to increase substantially in the future, motivates the interest of the research community in energy-aware resource management and application placement policies and the mechanisms to enforce these policies. Low average server utilization and its impact on the environment make it imperative to devise new energy-aware policies which identify optimal regimes for the cloud servers and, at the same time, prevent SLA violations. A quantitative evaluation of an optimization algorithm or an architectural enhancement is a rather intricate and time-consuming process; several benchmarks and system configurations are used to gather the data necessary to guide future developments.
For example, to evaluate the effects of architectural enhancements supporting instruction-level or data-level parallelism on processor performance and power consumption, several benchmarks are used. The results show numerical outcomes for the individual applications in each benchmark.
Similarly, the effects of an energy-aware algorithm depend on the system configuration and on the application, and cannot be expressed by a single numerical value. Research on energy-aware resource management in large-scale systems therefore often uses simulation for a quasi-quantitative, and more often a qualitative, evaluation of optimization algorithms or procedures.
As stated in the literature, WSCs are a new class of large-scale machines driven by a new and rapidly evolving set of workloads; their size alone makes them difficult to experiment with or to simulate efficiently. It is rather difficult to experiment with the systems discussed in this paper, and this is precisely the reason why we chose simulation.
Author(s)
Paya, A. Ashkan Paya is with the Computer Science Division, EECS Department, University of Central Florida, Orlando, FL 32816, USA (email: ashkan paya@knights.ucf.edu).
Marinescu, D.