Proxmox Cluster Operations Fail with "WorkOrder failed"

When creating, scaling up, or scaling down a Digital Rebar Platform (DRP) managed Cluster that uses the Proxmox cloud-wrappers capability, the operation may fail if the Resource Broker is not configured with the correct Proxmox Hypervisor node name. The Terraform Apply step fails in the Resource Broker, and that failure is in turn reported as a failed Cluster operation.

The likely cause is that the Param proxmox/node in the Resource Broker configuration does not match the node name defined on the Proxmox Hypervisor.

Symptom

When attempting to create, expand, or contract (reduce/destroy) a Cluster that utilizes a Proxmox Resource Broker, the operation fails with a Cluster Job Log (Activity) error similar to:

ERROR: WorkOrder failed!

This results in the Cluster operation being unable to complete successfully.

Problem

While several causes are possible, one is that the Resource Broker configuration contains a proxmox/node value that prevents the underlying Terraform Provider from completing its work. To confirm this, use the Cluster Job Log (Activity) to locate the matching Resource Broker Job Log, then review the failure messages in that log.

The Cluster Job Log will contain a link to the Work_Order that was initiated on the Resource Broker, similar to:

# note this is a reference only, the Work_Order ID will be different
ux://work_orders/1ef5bfcd-bf90-623b-9670-bb6428abb326/activity

Clicking on this link will show the Job Log for the Work_Order that was executed on the Resource Broker. Alternatively, go directly to the Resource Broker and review its Activity Logs; the most recent failed log is the relevant one, provided no other operations have been attempted via the Cluster or Resource Broker since the original failure.
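
When working from a shell rather than the Portal, the Work_Order ID can be pulled out of such a link with plain parameter expansion. The following is a minimal sketch; work_order_id_from_link is a hypothetical helper name and is not part of drpcli.

```shell
# Hypothetical helper: extract the Work_Order ID from a ux:// activity link.
work_order_id_from_link() {
    link="$1"
    id="${link#ux://work_orders/}"   # drop the leading scheme and path
    id="${id%%/*}"                   # drop everything after the ID
    printf '%s\n' "$id"
}
```

For the reference link above this prints 1ef5bfcd-bf90-623b-9670-bb6428abb326. The ID can then be used to look up the Work_Order with whatever tooling is available; which drpcli subcommand applies depends on the installed version and is not covered here.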

The following errors may be observed in this case:

Error: error creating VM: 596 error:0A000086:SSL routines::certificate verify failed, error status:
...snip...
Error: Plugin did not respond
...snip...
The plugin encountered an error, and failed to respond to the
plugin.(*GRPCProvider).ApplyResourceChange call. The plugin logs may contain
more details.
...snip...
ERROR: Terraform did not succeed - fail

This error can occur regardless of the Param proxmox/tls-insecure configured value.
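
If the Resource Broker Job Log has been saved to a file, a quick search can show whether it contains the error signatures quoted above. This is only a sketch: log_shows_node_failure is a hypothetical helper, how the log is exported is left to the reader, and a match suggests this problem but does not prove it.

```shell
# Hypothetical helper: succeed when a saved Job Log contains both the
# certificate verification error and the final Terraform failure line.
log_shows_node_failure() {
    log_file="$1"
    grep -q 'certificate verify failed' "$log_file" &&
        grep -q 'Terraform did not succeed' "$log_file"
}
```

Called as log_shows_node_failure /path/to/saved-job.log, it returns success only when both messages are present.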

Ultimately, the backing Terraform Plugin Provider and the Proxmox API operations must use the Proxmox Hypervisor's configured node name. The DRP Endpoint must also be able to resolve that node name in DNS for the Resource Broker to be able to make the API calls successfully.
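
Name resolution can be checked from a shell on the DRP Endpoint. The sketch below assumes a Linux host where the standard getent utility is available; can_resolve is a hypothetical helper name. It confirms only that the name resolves, not that the resulting address is reachable.

```shell
# Hypothetical helper: succeed when the given name resolves on this host.
# Run it on the DRP Endpoint, passing the Proxmox node name.
can_resolve() {
    name="$1"
    [ -n "$name" ] || return 2          # reject an empty name
    getent hosts "$name" >/dev/null     # exit status 0 only if it resolves
}
```

For example, can_resolve sm-mach-03 && echo "resolves" prints the message only when the lookup succeeds.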

Proxmox defines the Hypervisor node name independently of the operating system hostname, DNS records, and IP Addresses. It is usually beneficial to keep the operating system hostname and the Proxmox node name the same. For this reason, DRP provisions Hypervisor node names and operating system hostnames to match the DRP Machine Name, which helps reduce confusion.

Solution

Modify the backing Resource Broker configuration to use the Proxmox Hypervisor's node name (which matches its hostname when DRP provisioned the Hypervisor), and ensure the DRP Endpoint can resolve that name to a reachable IP Address. The node name can be verified by logging in to the shell of the Proxmox Hypervisor and reviewing the directory name(s) found in the /etc/pve/nodes/ directory. A directory will exist with the node name that must be used.
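
The directory check described above can be sketched as a small shell function to run on the Proxmox Hypervisor. The helper name list_pve_nodes is hypothetical; the default path is the one given above.

```shell
# Hypothetical helper: print one node name per line by listing the
# directories under the nodes directory (default: /etc/pve/nodes).
list_pve_nodes() {
    nodes_dir="${1:-/etc/pve/nodes}"
    for entry in "$nodes_dir"/*/; do
        [ -d "$entry" ] || continue     # skip when nothing matches
        basename "$entry"               # directory name is the node name
    done
}
```

Running list_pve_nodes with no argument lists the names found on that Hypervisor. If more than one name is printed, choose the one belonging to the Hypervisor the Resource Broker targets.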

The configuration is corrected by modifying the Param proxmox/node, which is in the Profile that has the same name as the Resource Broker.

For example, if the Resource Broker name is proxmox-sm-mach-03, then the Profile with the same name contains the proxmox/node Param definition which needs to be modified. This can be found on the Machines detail tab named Profiles, or the Profile can be modified by finding it in the Portal's Profiles menu entry (on the left).

Assuming the Resource Broker name is proxmox-sm-mach-03 and the Proxmox Hypervisor hostname is set to sm-mach-03, the following CLI commands make the appropriate change.

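# Name of the Resource Broker (and of its matching Profile)
BROKER="proxmox-sm-mach-03"
# Node name as defined on the Proxmox Hypervisor
HYPERVISOR="sm-mach-03"
drpcli profiles set "$BROKER" param proxmox/node to "$HYPERVISOR"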
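
After the change, the stored value can be read back and compared with the expected node name. The following is a sketch under two assumptions: drpcli is on the PATH and already configured to reach the DRP Endpoint, and it prints the Param as a JSON quoted string. check_proxmox_node is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only when the Profile's proxmox/node Param
# equals the expected node name.
check_proxmox_node() {
    broker="$1"
    expected="$2"
    # Assumed output format: a JSON string; strip quotes and whitespace.
    actual="$(drpcli profiles get "$broker" param proxmox/node | tr -d '"[:space:]')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: proxmox/node is $actual"
    else
        echo "MISMATCH: proxmox/node is '$actual', expected '$expected'" >&2
        return 1
    fi
}
```

With the values from the example, check_proxmox_node proxmox-sm-mach-03 sm-mach-03 reports OK once the Profile has been updated.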

Additional Information

Additional resources and information related to this Knowledge Base article.

See Also

Versions

All versions of Cloud Wrappers, Proxmox, Clusters, Resource Brokers, Work Orders, and DRP

Keywords

proxmox, clusters, resource_brokers, work_orders, cloud-wrappers

Revision Information

KB Article     :  kb-00083
initial release:  Fri August 16 11:18:39 AM PDT 2024
updated release:  Fri August 16 11:18:39 AM PDT 2024