Workload Balance Migration Strategy

Synopsis

display name: Workload Balance Migration Strategy

goal: workload_balancing

Workload balance using live migration

Description

This is a migration strategy based on the VM workload of physical servers. It generates solutions that move a workload whenever a server’s CPU or RAM utilization percentage is higher than the specified threshold. The threshold triggers a migration, and it is also used to decide whether a destination host with low enough utilization is available to receive the instance. The VM chosen for migration should bring the source host’s workload close to the average workload of all compute nodes.
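The selection logic described above can be sketched as follows. This is a minimal illustration, not Watcher's implementation: the `Host` class, the `plan_single_migration` function, and the rule that a destination must stay under the threshold after receiving the instance are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Host:
    """Hypothetical host model: instance id -> utilization in percent."""
    name: str
    instances: Dict[str, float] = field(default_factory=dict)

    @property
    def utilization(self) -> float:
        # Host workload is taken as the sum of its instances' workloads.
        return sum(self.instances.values())


def plan_single_migration(
    hosts: List[Host], threshold: float
) -> Optional[Tuple[str, str, str]]:
    """Return (instance_id, source, destination), or None if nothing to do."""
    overloaded = [h for h in hosts if h.utilization > threshold]
    if not overloaded:
        return None
    average = sum(h.utilization for h in hosts) / len(hosts)
    # Assumption: handle the most loaded host first.
    source = max(overloaded, key=lambda h: h.utilization)
    # Pick the instance whose removal leaves the source closest to the average.
    instance_id, load = min(
        source.instances.items(),
        key=lambda item: abs(source.utilization - item[1] - average),
    )
    # Assumption: the destination must remain under the threshold afterwards.
    candidates = [
        h for h in hosts
        if h is not source and h.utilization + load < threshold
    ]
    if not candidates:
        return None
    destination = min(candidates, key=lambda h: h.utilization)
    return instance_id, source.name, destination.name
```

Only one migration is returned per call, which mirrors the single-migration behavior noted under Limitations.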

Requirements

Hardware: compute nodes should use the same physical CPUs and RAM

Software: the Ceilometer component ceilometer-agent-compute must be running on each compute node, and the Ceilometer API must be able to report the “instance_cpu_usage” and “instance_ram_usage” telemetry successfully.

You must have at least 2 physical compute nodes to run this strategy.

Limitations

We cannot forecast how many servers should be migrated, so only a single virtual machine migration is planned at a time. It is therefore better to use this algorithm with CONTINUOUS audits.

It assumes that live migrations are possible.

Metrics

The workload_balance strategy requires the following metrics:

metric	service name	plugins	unit	comment
`cpu`	ceilometer	none	percentage	CPU of the instance. Used to calculate the threshold
`memory.resident`	ceilometer	none	MB	RAM of the instance. Used to calculate the threshold

Note

The parameters above reference the instance CPU or RAM usage, but the threshold calculation is based on the CPU/RAM usage on the hypervisor.
The RAM usage can be calculated based on the RAM consumed by the instance, and the available RAM on the hypervisor.
The CPU percentage calculation relies on the CPU load, but also on the number of CPUs on the hypervisor.
The host memory metric is calculated by summing the RAM usage of each instance on the host. This measure is close to the real usage, but is not the exact usage on the host.
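As a rough illustration of the note above, the host-level percentages could be derived from instance metrics as sketched here. The function names and the weighting of each instance's CPU load by its vCPU count are assumptions for the example, not Watcher's exact formulas.

```python
from typing import Iterable, Tuple


def host_ram_usage_percent(
    instance_ram_mb: Iterable[float], host_total_ram_mb: float
) -> float:
    """Sum the RAM used by each instance and relate it to the host's RAM."""
    return 100.0 * sum(instance_ram_mb) / host_total_ram_mb


def host_cpu_usage_percent(
    instances: Iterable[Tuple[float, int]], host_cpus: int
) -> float:
    """Estimate host CPU usage from (cpu_percent, vcpus) pairs.

    Assumption: an instance at 50% on 2 vCPUs consumes one host CPU.
    """
    used_cpus = sum(cpu_percent / 100.0 * vcpus for cpu_percent, vcpus in instances)
    return 100.0 * used_cpus / host_cpus
```

Because the RAM figure is a sum over instances, it approximates the host's real usage and does not account for memory used by the hypervisor itself.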

Cluster data model

Default Watcher’s Compute cluster data model:

Nova cluster data model collector

The Nova cluster data model collector creates an in-memory representation of the resources exposed by the compute service.

Actions

Default Watcher’s actions:

action	description
`migration`	Migrates a server to a destination nova-compute host

This action allows you to migrate a server to another compute destination host. Migration type ‘live’ can only be used for migrating active VMs. Migration type ‘cold’ can be used for migrating non-active VMs as well as active VMs, which will be shut down while migrating.

The action schema is:
schema = Schema({
 'resource_id': str,  # should be a UUID
 'migration_type': str,  # choices -> "live", "cold"
 'destination_node': str,
 'source_node': str,
})
The resource_id is the UUID of the server to migrate. The source_node and destination_node parameters are respectively the source and the destination compute hostname.

Note

Nova API version must be 2.56 or above if destination_node parameter is given.
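To make the action schema above concrete, here is a small standard-library check of a migration parameter dictionary. It is a hand-rolled sketch rather than the validation Watcher performs; the function name `validate_migration_params` is hypothetical, and treating resource_id and migration_type as mandatory is an assumption.

```python
import uuid

MIGRATION_TYPES = ("live", "cold")
ALLOWED_KEYS = ("resource_id", "migration_type", "destination_node", "source_node")


def validate_migration_params(params: dict) -> dict:
    """Check a parameter dict against the fields listed in the schema."""
    unknown = set(params) - set(ALLOWED_KEYS)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    # Assumption: these two keys are required for a meaningful migration.
    for key in ("resource_id", "migration_type"):
        if key not in params:
            raise ValueError(f"missing key: {key}")
    for key, value in params.items():
        if not isinstance(value, str):
            raise ValueError(f"{key} must be a string")
    uuid.UUID(params["resource_id"])  # raises ValueError if not a UUID
    if params["migration_type"] not in MIGRATION_TYPES:
        raise ValueError("migration_type must be 'live' or 'cold'")
    return params
```

A call with hypothetical host names might look like `validate_migration_params({"resource_id": str(uuid.uuid4()), "migration_type": "live", "source_node": "compute1", "destination_node": "compute2"})`.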


Planner

Default Watcher’s planner:

Weight planner implementation

This implementation builds actions with parents in accordance with weights. Sets of actions with a higher weight are scheduled before the others. Two config options control this behavior: action_weights and parallelization.
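One way to picture weight-based scheduling is the sketch below, which sorts actions by weight and groups them into batches limited by a per-type parallelization value. The function `schedule_by_weight`, the batch representation, and the sample weights used in the usage note are illustrative assumptions, not the planner's actual code or defaults.

```python
from typing import Dict, List


def schedule_by_weight(
    actions: List[dict],
    action_weights: Dict[str, int],
    parallelization: Dict[str, int],
) -> List[List[dict]]:
    """Group actions into ordered batches; each batch runs after the previous one."""
    # Higher weight first; sorted() is stable, so input order breaks ties.
    ordered = sorted(
        actions, key=lambda a: action_weights[a["action_type"]], reverse=True
    )
    batches: List[List[dict]] = []
    for action in ordered:
        kind = action["action_type"]
        last = batches[-1] if batches else None
        # Extend the current batch only for the same type and under its limit.
        if last and last[0]["action_type"] == kind and len(last) < parallelization[kind]:
            last.append(action)
        else:
            batches.append([action])
    return batches
```

With hypothetical weights `{"nop": 70, "migrate": 40}` and parallelization `{"nop": 1, "migrate": 2}`, three migrate actions and one nop action would yield the nop first, then two migrations in parallel, then the remaining migration.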

Limitations

This planner requires the action_weights and parallelization config options to be well tuned.

Configuration

Strategy parameters are:

parameter	type	default value	description
`metrics`	String	instance_cpu_usage	Workload balance based on CPU or RAM utilization. Choices: [‘instance_cpu_usage’, ‘instance_ram_usage’]
`threshold`	Number	25.0	Workload threshold for migration. Used for both the source and the destination calculations. Threshold is always a percentage.
`period`	Number	300	Aggregate time period of Ceilometer
`granularity`	Number	300	The time between two measures in an aggregated timeseries of a metric. This parameter is only used with the Gnocchi data source, and it must match one of the valid archive policies for the metric.
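The table above can be read as a set of defaults that audit parameters override. The sketch below merges overrides onto those defaults and applies basic checks; the `resolve_parameters` helper and the 0 to 100 range check on the threshold are assumptions for illustration, not part of Watcher.

```python
DEFAULTS = {
    "metrics": "instance_cpu_usage",
    "threshold": 25.0,
    "period": 300,
    "granularity": 300,
}
METRIC_CHOICES = ("instance_cpu_usage", "instance_ram_usage")


def resolve_parameters(overrides=None):
    """Merge user overrides onto the documented defaults and sanity-check them."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if params["metrics"] not in METRIC_CHOICES:
        raise ValueError("metrics must be one of " + ", ".join(METRIC_CHOICES))
    # Assumption: a percentage threshold is only meaningful within 0..100.
    if not 0 <= params["threshold"] <= 100:
        raise ValueError("threshold must be a percentage between 0 and 100")
    if params["period"] <= 0 or params["granularity"] <= 0:
        raise ValueError("period and granularity must be positive")
    return params
```

For example, `resolve_parameters({"threshold": 26.0, "period": 310})` keeps the default metric and granularity while overriding the other two values.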

Efficacy Indicator

None

Algorithm

For more information on the Workload Balance Migration Strategy please refer to: https://specs.openstack.org/openstack/watcher-specs/specs/mitaka/implemented/workload-balance-migration-strategy.html

How to use it?

Create an audit template using the Workload Balancing strategy.

$ openstack optimize audittemplate create \
  at1 workload_balancing --strategy workload_balance

Run an audit using the Workload Balance strategy. The result of the audit should be an action plan to move VMs from any host where the CPU usage is over the threshold of 26% to a host where the CPU utilization is under the threshold. The CPU utilization measurements are taken from the configured data source plugin with an aggregate period of 310.

$ openstack optimize audit create -a at1 -p threshold=26.0 \
        -p period=310 -p metrics=instance_cpu_usage

Run an audit using the Workload Balance strategy to obtain a plan to balance VMs over hosts with a threshold of 20%. In this case, the CPU utilization measurement is determined by the combination of period and granularity.

$ openstack optimize audit create -a at1 \
       -p granularity=30 -p threshold=20 -p period=300 \
       -p metrics=instance_cpu_usage --auto-trigger

External Links

None.
