
HA Cluster Status

There are several ways to inspect the health and state of a DRP HA cluster. The primary tool is drpcli system ha, which provides sub-commands for each aspect of cluster state. All commands should target the cluster's virtual IP address (or the active node directly).

Key Status Commands

Bash
# Show the complete cluster state (nodes, settings, VIP)
drpcli system ha dump

# Show the Consensus ID of the currently active node
drpcli system ha active

# Show the current HA state of the node you are talking to
drpcli system ha state

# Return true if at least one passive node can take over
drpcli system ha failOverSafe

# List all known cluster members
drpcli system ha peers

# Show the current Raft leader's Consensus ID
drpcli system ha leader

# Show the Consensus ID of the node you are talking to
drpcli system ha id

For automated monitoring, drpcli system ha failOverSafe returns true when the cluster is safe to fail over and false when it is not. Pass an optional timeout (up to 5 seconds) to wait for the cluster to become failover-safe before returning.
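A monitoring job can poll this command instead of relying on a single call. The sketch below is one possible wrapper, not part of drpcli: wait_failover_safe is a hypothetical helper name, and it assumes the command prints the literal word true or false on standard output. It does not pass the optional timeout, because the exact argument format is not shown here.

```shell
# Poll `drpcli system ha failOverSafe` until it reports true or attempts run out.
# Assumption: the command prints the literal word true or false on stdout.
# wait_failover_safe is a hypothetical helper name, not a drpcli feature.
wait_failover_safe() {
  local attempts="${1:-10}"   # how many times to ask
  local delay="${2:-3}"       # seconds to sleep between attempts
  local i=1 answer
  while [ "$i" -le "$attempts" ]; do
    # A failed call (for example, an unreachable node) counts as "not safe".
    answer="$(drpcli system ha failOverSafe 2>/dev/null | tr -d '[:space:]"')"
    if [ "$answer" = "true" ]; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_failover_safe 20 3` asks up to 20 times, 3 seconds apart, and exits non-zero if the cluster never becomes failover-safe, which makes it usable as a gate before planned maintenance.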

The ha-state.json File

Every cluster member persists its HA configuration in /var/lib/dr-provision/ha-state.json. This file is the authoritative source for HA settings and takes precedence over command-line flags on restart. Do not edit this file manually except when following the documented VIP-change procedure.

Key fields in ha-state.json and what they indicate:

Field             Description
Enabled           true when HA is active on this node
Passive           true when this node is a passive (non-active) member
ConsensusEnabled  true when this node participates in Raft consensus
ConsensusID       Unique identifier for this node in the Raft cluster
HaID              Shared cluster identifier; must match on every member
ActiveUri         The virtual IP URL that external clients connect to
ApiUrl            This node's own API URL (unique per node)
ConsensusAddr     The address:port used for Raft replication traffic
LoadBalanced      true if an external load balancer manages the VIP
VirtAddr          The virtual IP address in CIDR notation
VirtInterface     The network interface where the VIP is assigned
Observer          Reserved for future use; nodes marked Observer cannot become active
Valid             false means the state file is being initialized from CLI flags
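
The fields can be read without modifying the file. The following is a minimal sketch under stated assumptions: ha_field and ha_role are hypothetical helper names, the file is assumed to be a flat JSON object whose values contain no commas or braces, and the labels standalone, passive, and active are illustrative only. A real JSON parser is more robust if one is available on the node.

```shell
# ha_field FILE FIELD: print the value of one top-level scalar field.
# Hypothetical helper; assumes a flat JSON object whose values contain
# no commas or braces. This only reads the file and never writes to it.
ha_field() {
  local file="$1" field="$2"
  grep -o "\"$field\"[[:space:]]*:[[:space:]]*[^,}]*" "$file" \
    | head -n 1 \
    | sed -e 's/^[^:]*:[[:space:]]*//' -e 's/[[:space:]]*$//' \
          -e 's/^"//' -e 's/"$//'
}

# ha_role [FILE]: summarize Enabled and Passive as a single word.
# The three labels are illustrative, not terms used by dr-provision.
ha_role() {
  local file="${1:-/var/lib/dr-provision/ha-state.json}"
  if [ "$(ha_field "$file" Enabled)" != "true" ]; then
    echo "standalone"
  elif [ "$(ha_field "$file" Passive)" = "true" ]; then
    echo "passive"
  else
    echo "active"
  fi
}
```

Running `ha_role` on each member and comparing `ha_field /var/lib/dr-provision/ha-state.json HaID` across members is a quick way to confirm that exactly one node is active and that the shared identifier matches everywhere.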

Interpreting Cluster Health

A healthy cluster shows:

  • drpcli system ha active returns a non-empty Consensus ID.
  • drpcli system ha failOverSafe returns true.
  • drpcli system ha peers lists all expected member IDs.
  • drpcli system ha dump shows Passive: false for the active node and Passive: true for every other node.

If drpcli system ha active returns an empty result, the cluster is in the process of electing a new leader. Wait a few seconds and retry. If the condition persists, check network connectivity between the nodes on the ConsensusAddr port and review the dr-provision service logs.
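
The first three checks in the list above can be combined into one script. This is a sketch under assumptions rather than a supported tool: ha_health_check is a hypothetical helper name, active is assumed to print the Consensus ID (possibly quoted) or nothing, failOverSafe is assumed to print true or false, and each expected member ID is assumed to appear verbatim in the peers output. The dump check is left out because the output layout is not described here.

```shell
# ha_health_check ID...: run the active, failOverSafe, and peers checks.
# Hypothetical helper. Pass the Consensus IDs you expect as arguments.
# Returns 0 only when every check passes.
ha_health_check() {
  local status=0 active safe peers id

  # Check 1: an active node exists (empty output suggests an election).
  active="$(drpcli system ha active 2>/dev/null | tr -d '[:space:]"')"
  if [ -z "$active" ]; then
    echo "FAIL: no active node reported"
    status=1
  else
    echo "OK: active node is $active"
  fi

  # Check 2: at least one passive node can take over.
  safe="$(drpcli system ha failOverSafe 2>/dev/null | tr -d '[:space:]"')"
  if [ "$safe" = "true" ]; then
    echo "OK: cluster is failover-safe"
  else
    echo "FAIL: cluster is not failover-safe"
    status=1
  fi

  # Check 3: every expected member ID shows up in the peer list.
  peers="$(drpcli system ha peers 2>/dev/null)"
  for id in "$@"; do
    case "$peers" in
      *"$id"*) echo "OK: peer $id present" ;;
      *) echo "FAIL: peer $id missing"; status=1 ;;
    esac
  done

  return "$status"
}
```

Because an empty active result can be transient during an election, a caller may want to rerun the check after a short pause before raising an alert.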

API Endpoint

The license and system status API also provides endpoint-level information:

Bash
# Show endpoint info including version and HA state
drpcli info get