Zone migration

Synopsis

display name: Zone migration

goal: hardware_maintenance

Zone migration using instance and volume migration

This zone migration strategy migrates many instances and volumes efficiently, with minimum downtime, for hardware maintenance.

Note

The term Zone in the strategy name does not refer to OpenStack availability zones; it denotes a user-defined set of compute nodes and storage pools. Currently, migrations across actual availability zones are not fully tested and might not work in all cluster configurations.

Requirements

Metrics

None

Cluster data model

Default Watcher’s Compute cluster data model:

Nova cluster data model collector

The Nova cluster data model collector creates an in-memory representation of the resources exposed by the compute service.

Storage cluster data model is also required:

Cinder cluster data model collector

The Cinder cluster data model collector creates an in-memory representation of the resources exposed by the storage service.

Actions

Default Watcher’s actions:


migrate

Migrates a server to a destination nova-compute host

This action allows you to migrate a server to another compute host. Migration type ‘live’ can only be used for migrating active VMs. Migration type ‘cold’ can be used for migrating both non-active and active VMs; active VMs will be shut down while migrating.

The action schema is:

schema = Schema({
    'resource_id': str,  # should be a UUID
    'migration_type': str,  # choices -> "live", "cold"
    'destination_node': str,
    'source_node': str,
})

The resource_id is the UUID of the server to migrate. The source_node and destination_node parameters are the source and destination compute hostnames, respectively.

Note

Nova API version must be 2.56 or above if the destination_node parameter is given.
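For illustration, the following dictionaries would validate against the schema above. This is a hypothetical sketch: the UUID and hostnames are placeholders, not values from a real cloud.

live_params = {
    'resource_id': '8a9d8d94-7f96-4f0e-b76a-2f0c8f1f6d11',  # server UUID (placeholder)
    'migration_type': 'live',          # only valid for active VMs
    'source_node': 'compute-01',       # hypothetical hostnames
    'destination_node': 'compute-02',
}

cold_params = {
    'resource_id': '8a9d8d94-7f96-4f0e-b76a-2f0c8f1f6d11',
    'migration_type': 'cold',          # the VM is shut down while migrating
    'source_node': 'compute-01',
    'destination_node': 'compute-02',
}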

volume_migrate

Migrates a volume to a destination node or type

By using this action, you can migrate a Cinder volume. Migration type ‘swap’ can only be used for migrating an attached volume. Migration type ‘migrate’ can be used for migrating a detached volume to a pool of the same volume type. Migration type ‘retype’ can be used for changing the volume type of a detached volume.

The action schema is:

schema = Schema({
    'resource_id': str,  # should be a UUID
    'migration_type': str,  # choices -> "swap", "migrate", "retype"
    'destination_node': str,
    'destination_type': str,
})

The resource_id is the UUID of the Cinder volume to migrate. The destination_node is the destination block storage pool name (the list of available pools is returned by the command cinder get-pools); it is mandatory when migrating a detached volume to a pool of the same volume type. The destination_type is the destination block storage type name (the list of available types is returned by the command cinder type-list); it is mandatory when migrating a detached volume, or swapping an attached volume, to a different volume type.
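As a hedged sketch of the three migration types, the following inputs would validate against the schema above; the UUID, pool name, and type names are hypothetical placeholders.

swap_params = {                        # attached volume: swap to another type
    'resource_id': '2b7f1b0e-58d4-4c4b-9b0d-1f4e3c2a9a77',
    'migration_type': 'swap',
    'destination_type': 'ceph-ssd',    # hypothetical type, see `cinder type-list`
}

migrate_params = {                     # detached volume: move within the same type
    'resource_id': '2b7f1b0e-58d4-4c4b-9b0d-1f4e3c2a9a77',
    'migration_type': 'migrate',
    'destination_node': 'host1@backend1#pool1',  # hypothetical pool, see `cinder get-pools`
}

retype_params = {                      # detached volume: change the volume type
    'resource_id': '2b7f1b0e-58d4-4c4b-9b0d-1f4e3c2a9a77',
    'migration_type': 'retype',
    'destination_type': 'ceph-hdd',
}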

Planner

Default Watcher’s planner:

Weight planner implementation

This implementation builds actions with parents in accordance with weights: sets of actions with a higher weight are scheduled before the others. There are two config options to configure: action_weights and parallelization.
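As a rough illustration of the weighting behaviour, here is a simplified sketch; this is not Watcher's implementation, and the weight values are invented.

# Simplified sketch of weight-based ordering; not Watcher's code,
# and the weight values are invented for illustration.
action_weights = {'volume_migrate': 70, 'migrate': 60}

actions = ['migrate', 'volume_migrate', 'migrate']

# Actions with a higher weight are scheduled before the others.
ordered = sorted(actions, key=lambda name: action_weights[name], reverse=True)
print(ordered)  # ['volume_migrate', 'migrate', 'migrate']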

Limitations

  • This planner requires the action_weights and parallelization config options to be tuned well.

Configuration

Strategy parameters are:

parameter             type     default  required  description
--------------------  -------  -------  --------  -------------------------------------------------
compute_nodes         array    None     Optional  Compute nodes to migrate.
storage_pools         array    None     Optional  Storage pools to migrate.
parallel_total        integer  6        Optional  The total number of actions to be run in
                                                  parallel.
parallel_per_node     integer  2        Optional  The number of actions to be run in parallel per
                                                  compute node in one action plan.
parallel_per_pool     integer  2        Optional  The number of actions to be run in parallel per
                                                  storage pool.
priority              object   None     Optional  Priority lists for instances and volumes.
with_attached_volume  boolean  False    Optional  False: instances migrate only after all volumes
                                                  have migrated. True: each instance migrates
                                                  right after its attached volumes have migrated.

Note

  • All parameters in the table above have defaults, so the user can create an audit without specifying a value. However, if only default parameters are used, the audit will produce nothing actionable.

  • The parallel_* parameters do not control runtime concurrency; they limit the number of actions added to the action plan.

  • compute_nodes, storage_pools, and priority are optional parameters; however, if they are passed, they require the sub-parameters described in the tables below:

The elements of compute_nodes array are:

parameter  type    default  required  description
---------  ------  -------  --------  ------------------------------------------
src_node   string  None     Required  Compute node from which instances migrate.
dst_node   string  None     Optional  Compute node to which instances migrate.

The elements of storage_pools array are:

parameter  type    default  required  description
---------  ------  -------  --------  -----------------------------------------
src_pool   string  None     Required  Storage pool from which volumes migrate.
dst_pool   string  None     Optional  Storage pool to which volumes migrate.
src_type   string  None     Required  Source volume type.
dst_type   string  None     Required  Destination volume type.

The elements of priority object are:

parameter     type   default  required  description
------------  -----  -------  --------  ------------------------------------------------
project       array  None     Optional  Project names.
compute_node  array  None     Optional  Compute node names.
storage_pool  array  None     Optional  Storage pool names.
compute       enum   None     Optional  Instance attributes: ["vcpu_num", "mem_size",
                                        "disk_size", "created_at"]
storage       enum   None     Optional  Volume attributes: ["size", "created_at"]
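Putting the tables above together, a complete audit parameter set might look like the following sketch; every node, pool, type, and project name is a hypothetical placeholder.

# Hypothetical audit parameters combining the tables above;
# all names below are placeholders.
parameters = {
    'compute_nodes': [
        {'src_node': 's01', 'dst_node': 'd01'},
    ],
    'storage_pools': [
        {'src_pool': 'src@backend#pool1', 'dst_pool': 'dst@backend#pool1',
         'src_type': 'type1', 'dst_type': 'type1'},
    ],
    'parallel_total': 6,
    'parallel_per_node': 2,
    'parallel_per_pool': 2,
    'priority': {
        'project': ['project1'],
        'compute_node': ['compute1', 'compute2'],
        'storage_pool': ['pool1', 'pool2'],
        'compute': ['vcpu_num', 'mem_size', 'disk_size', 'created_at'],
        'storage': ['size', 'created_at'],
    },
    'with_attached_volume': False,
}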

Efficacy Indicator

The efficacy indicators for action plans built from the command line are:

[{'name': 'live_instance_migrate_ratio',
  'description': 'Ratio of actual live migrated instances to planned live migrate instances.',
  'unit': '%', 'value': 0},
 {'name': 'cold_instance_migrate_ratio',
  'description': 'Ratio of actual cold migrated instances to planned cold migrate instances.',
  'unit': '%', 'value': 0},
 {'name': 'volume_migrate_ratio',
  'description': 'Ratio of actual detached volumes migrated to planned detached volumes migrate.',
  'unit': '%', 'value': 0},
 {'name': 'volume_update_ratio',
  'description': 'Ratio of actual attached volumes migrated to planned attached volumes migrate.',
  'unit': '%', 'value': 0}]

In Horizon, these indicators are shown with alternative text.

  • live_migrate_instance_count is shown as The number of instances actually live migrated in Horizon. It tracks all the instances that could be migrated according to the audit input.

  • planned_live_migrate_instance_count is shown as The number of instances planned to live migrate in Horizon. It refers to the instances planned to live migrate in the action plan.
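The ratio indicators above relate these actual counts to the planned counts. A minimal sketch of the arithmetic, which is not Watcher's implementation:

# Minimal sketch of the ratio arithmetic behind the indicators;
# not Watcher's implementation.
def migrate_ratio(actual_count, planned_count):
    """Percentage of planned migrations actually carried out."""
    if planned_count == 0:
        return 0
    return actual_count / planned_count * 100

print(migrate_ratio(3, 4))  # 3 of 4 planned live migrations -> 75.0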

Algorithm

For more information on the zone migration strategy please refer to: http://specs.openstack.org/openstack/watcher-specs/specs/queens/implemented/zone-migration-strategy.html

How to use it?

$ openstack optimize audittemplate create \
  at1 hardware_maintenance --strategy zone_migration

$ openstack optimize audit create -a at1 \
  -p compute_nodes='[{"src_node": "s01", "dst_node": "d01"}]'

Note

  • Currently, the strategy will not generate both volume migrations and instance migrations in the same audit. If both are requested, only volume migrations will be included in the action plan.

  • The Cinder model collector is not enabled by default. If it is not enabled when deploying Watcher, the model will become outdated and will eventually cause errors. See the documentation on the configuration option that enables the storage collector.

Support caveats

This strategy offers the option to perform both Instance migrations and Volume migrations. Currently, Instance migrations are ready for production use while Volume migrations remain experimental.