Host Maintenance Strategy

Synopsis

display name: Host Maintenance Strategy

goal: cluster_maintaining

[PoC] Host Maintenance

Description

This migration strategy supports the maintenance of a single compute node without interrupting the user’s applications. If a backup node is given, the strategy first migrates all instances from the maintenance node to the backup node. If no backup node is provided, it still migrates all instances, relying on nova-scheduler to choose their destinations.
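
The behaviour described above can be sketched roughly as follows. This is only an illustration, not Watcher's implementation: the Instance class and the plan_migrations function are hypothetical names, and picking 'live' for active servers and 'cold' for the rest is an assumption based on the migration action described later on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for a server hosted on the maintenance node."""
    uuid: str
    active: bool  # True if the server is currently running


def plan_migrations(instances: List[Instance],
                    maintenance_node: str,
                    backup_node: Optional[str] = None) -> List[Dict[str, str]]:
    """Build one migration parameter dict per instance (illustrative only)."""
    actions = []
    for inst in instances:
        params = {
            "resource_id": inst.uuid,
            # Assumption: active servers are live-migrated, others cold-migrated.
            "migration_type": "live" if inst.active else "cold",
            "source_node": maintenance_node,
        }
        if backup_node is not None:
            # A backup node was given, so the destination is set explicitly.
            params["destination_node"] = backup_node
        # Otherwise no destination is set and nova-scheduler picks the host.
        actions.append(params)
    return actions
```

Calling plan_migrations(instances, "compute01") without a backup node yields parameter dicts with no destination_node key, which mirrors the "relying on nova-scheduler" case.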

Requirements

  • You must have at least 2 physical compute nodes to run this strategy.

Limitations

  • This is a proof of concept that is not meant to be used in production.

  • It migrates all instances from one host to other hosts. It is best to run this strategy when the load is not heavy, and to use it with a ONESHOT audit.

  • It assumes that cold and live migrations are possible.

Requirements

None.

Metrics

None

Cluster data model

Default Watcher’s Compute cluster data model:

Nova cluster data model collector

The Nova cluster data model collector creates an in-memory representation of the resources exposed by the compute service.

Actions

Default Watcher’s actions:

action

description

migration

Migrates a server to a destination nova-compute host

This action allows you to migrate a server to another compute destination host. Migration type ‘live’ can only be used for migrating active VMs. Migration type ‘cold’ can be used for migrating non-active VMs as well as active VMs, which will be shut down while migrating.

The action schema is:

schema = Schema({
 'resource_id': str,  # should be a UUID
 'migration_type': str,  # choices -> "live", "cold"
 'destination_node': str,
 'source_node': str,
})

The resource_id is the UUID of the server to migrate. The source_node and destination_node parameters are, respectively, the source and the destination compute hostnames (the list of available compute hosts is returned by the command nova service-list --binary nova-compute).

Note

Nova API version must be 2.56 or above if destination_node parameter is given.
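
As an illustration of the constraints stated above, the following sketch checks a migration parameter dict using only the Python standard library. The function name validate_migration_params is hypothetical, and treating resource_id and migration_type as mandatory is an assumption; this is not the validation code Watcher itself runs.

```python
import uuid
from typing import Dict

ALLOWED_KEYS = {"resource_id", "migration_type", "destination_node", "source_node"}
MIGRATION_TYPES = ("live", "cold")


def validate_migration_params(params: Dict[str, str]) -> Dict[str, str]:
    """Raise ValueError if params does not fit the schema described above."""
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    # Assumption: these two keys are always required.
    for key in ("resource_id", "migration_type"):
        if key not in params:
            raise ValueError(f"missing key: {key}")
    # resource_id should be a UUID; uuid.UUID raises ValueError otherwise.
    uuid.UUID(params["resource_id"])
    if params["migration_type"] not in MIGRATION_TYPES:
        raise ValueError("migration_type must be 'live' or 'cold'")
    return params
```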

change_nova_service_state

Disables or enables the nova-compute service, deployed on a host

With this action, you can update the state of a nova-compute service. A disabled nova-compute service cannot be selected by the nova scheduler for future server deployments.

The action schema is:

schema = Schema({
 'resource_id': str,
 'state': str,
 'disabled_reason': str,
})

The resource_id references a nova-compute service name (the list of available nova-compute services is returned by the command nova service-list --binary nova-compute). The state value should be either ONLINE or OFFLINE. The disabled_reason gives the reason why Watcher disables this nova-compute service. Its value should start with the watcher_ prefix, such as watcher_disabled or watcher_maintaining.
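
The rules above can be expressed as a small helper that builds the parameter dict for this action. The sketch below is an assumption-laden illustration: service_state_params is a hypothetical name, and the accepted state values are taken from the text on this page rather than from Watcher's source code.

```python
from typing import Dict

VALID_STATES = ("ONLINE", "OFFLINE")  # values as stated on this page
REASON_PREFIX = "watcher_"


def service_state_params(service_name: str, state: str,
                         disabled_reason: str = "watcher_disabled") -> Dict[str, str]:
    """Build change_nova_service_state parameters (illustrative only)."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}")
    if not disabled_reason.startswith(REASON_PREFIX):
        raise ValueError("disabled_reason must start with 'watcher_'")
    return {
        "resource_id": service_name,
        "state": state,
        "disabled_reason": disabled_reason,
    }
```

For a host going into maintenance, a call such as service_state_params("compute01", "OFFLINE", "watcher_maintaining") would produce the corresponding parameters.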

Planner

Default Watcher’s planner:

Weight planner implementation

This implementation builds actions with parents in accordance with weights. Sets of actions with a higher weight are scheduled before the others. There are two config options to set: action_weights and parallelization.

Limitations

  • This planner requires the action_weights and parallelization config options to be well tuned.
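
To make the role of the two options more concrete, here is a simplified sketch of weight-based ordering. It is not the planner's real code: the schedule function, the weight values, and the reading of parallelization as "how many actions of one type may run together" are assumptions made for illustration.

```python
from itertools import groupby
from typing import Dict, List


def schedule(actions: List[str],
             action_weights: Dict[str, int],
             parallelization: Dict[str, int]) -> List[List[str]]:
    """Return batches of action types; batches run one after another."""
    # Higher weight first; ties are broken by name so equal types stay adjacent.
    ordered = sorted(actions, key=lambda a: (-action_weights[a], a))
    batches: List[List[str]] = []
    for action_type, group in groupby(ordered):
        items = list(group)
        # Assumption: at most `size` actions of this type run in parallel.
        size = parallelization.get(action_type, 1)
        for start in range(0, len(items), size):
            batches.append(items[start:start + size])
    return batches


# Hypothetical values, chosen only for this example.
weights = {"change_nova_service_state": 50, "migration": 30}
parallel = {"change_nova_service_state": 1, "migration": 2}
plan = schedule(
    ["migration", "change_nova_service_state", "migration", "migration"],
    weights,
    parallel,
)
```

With these example values, the service state change lands in the first batch and the three migrations follow in batches of at most two.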

Configuration

Strategy parameters are:

parameter         type    description

maintenance_node  String  The name of the compute node that needs maintenance. Required.

backup_node       String  The name of the compute node that will back up the maintenance node. Optional.

Efficacy Indicator

None

Algorithm

For more information on the Host Maintenance Strategy, please refer to: https://specs.openstack.org/openstack/watcher-specs/specs/queens/approved/cluster-maintenance-strategy.html

How to use it?

$ openstack optimize audit create \
  -g cluster_maintaining -s host_maintenance \
  -p maintenance_node=compute01 \
  -p backup_node=compute02 \
  --auto-trigger