High Availability

Digital Rebar Provision supports High Availability (HA) deployments that reduce outage windows and protect against single-node failures. HA uses the RAFT consensus protocol to maintain consistent state across a cluster of DRP endpoint nodes.

HA Architecture

DRP HA is an active-passive configuration:

  • One node is the leader and handles all write operations
  • Remaining nodes are followers that replicate state from the leader
  • If the leader becomes unavailable, RAFT elects a new leader from the followers
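The active-passive behavior above can be sketched as follows. This is an illustrative model only, not DRP internals; the `Node` and `Cluster` classes and the `drp-*` node names are hypothetical.

```python
# Sketch of active-passive HA: one leader takes all writes, replicates
# them to followers, and a follower takes over if the leader fails.
# (Hypothetical model for illustration; not a DRP API.)

class Node:
    def __init__(self, name: str):
        self.name = name
        self.data = {}  # replicated object state

class Cluster:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.leader = self.nodes[0]  # in real DRP, RAFT elects the leader

    def write(self, key, value):
        # All writes go through the leader, then replicate to every node.
        for node in self.nodes:
            node.data[key] = value

    def fail_leader(self):
        # On leader failure, RAFT elects a new leader from the followers.
        self.nodes.remove(self.leader)
        self.leader = self.nodes[0]

cluster = Cluster(["drp-1", "drp-2", "drp-3"])
cluster.write("machine/abc", {"state": "provisioned"})
cluster.fail_leader()  # a follower takes over; replicated state survives
```

Because followers already hold a full copy of the replicated state, failover does not lose committed writes; it only changes which node accepts new ones.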

Note

DRP HA cannot be used for read/write load sharing. All API write operations are handled by the leader. This simplifies consistency guarantees but means HA provides fault tolerance, not horizontal scaling.

For scaling guidance, see Scaling & Sizing.

RAFT Consensus

DRP uses the RAFT protocol on port 8093/tcp for:

  • Leader election when the current leader fails
  • Log replication — changes made to the leader are replicated to followers before being committed
  • Cluster membership — nodes join and leave the HA cluster through RAFT membership changes
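The "replicated before committed" rule can be expressed as a one-line majority check. This is general RAFT arithmetic, not DRP-specific code; the function name is illustrative.

```python
# RAFT-style commit rule: a log entry is committed once the leader's own
# copy plus follower acknowledgements form a majority of the cluster.
# (General RAFT math; the function name is illustrative, not a DRP API.)

def entry_committed(cluster_size: int, follower_acks: int) -> bool:
    votes = 1 + follower_acks  # the leader counts toward the majority
    return votes > cluster_size // 2
```

For example, in a 3-node cluster a write commits as soon as one follower acknowledges it; in a 5-node cluster, two acknowledgements are required.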

All DRP nodes in an HA cluster must be able to communicate bidirectionally on port 8093/tcp.
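A quick way to verify that requirement from each node is a plain TCP connect test against the other cluster members. This is a generic reachability sketch, not a DRP tool; run it from every node against every other node to confirm bidirectional connectivity.

```python
# Minimal TCP reachability check for the RAFT port (generic sketch,
# not a DRP utility). Run from each HA node against its peers.
import socket

RAFT_PORT = 8093  # DRP RAFT consensus port

def raft_port_reachable(host: str, port: int = RAFT_PORT,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note that a successful connect only proves the port is open; firewall rules that are asymmetric between zones still require the check in both directions.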

Availability Zones

HA nodes can span availability zones to protect against zone-level failures. Considerations:

  • Odd node counts — RAFT requires a majority quorum. Use 3 or 5 nodes; 2 or 4 nodes provide no additional fault tolerance over 1 or 3
  • Network latency — High latency between zones increases write commit time; keep cross-zone latency low
  • Quorum requirements — A 3-node cluster tolerates 1 failure; a 5-node cluster tolerates 2 failures
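The quorum and fault-tolerance figures above follow from simple majority arithmetic, which makes clear why even node counts add no resilience:

```python
# RAFT quorum arithmetic (general RAFT math, not DRP-specific).

def quorum_size(n: int) -> int:
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Nodes that can fail while a majority remains available."""
    return n - quorum_size(n)

# A 4-node cluster needs 3 nodes for quorum, so it tolerates the same
# single failure as a 3-node cluster -- the extra node buys nothing.
```

This is why the recommendation is 3 or 5 nodes: each step from an odd count to the next even count raises the quorum requirement without raising the failure tolerance.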

State Replication

All DRP object data (machines, workflows, content packs, plugins, files) is replicated across HA nodes via RAFT. The only exception is the DRP binary version itself — nodes may run different versions during a rolling upgrade. Once upgraded, RAFT replicates any remaining content differences.

Manager Integration

When a DRP HA cluster is registered with a Manager, the Manager treats the entire HA cluster as a single logical endpoint. Configuration pushed via VersionSets is applied to the cluster; HA then replicates it across all nodes.

See Also