21.16. Clusters v4.8+

Digital Rebar multi-machine clustering patterns were updated to leverage contexts for cluster management in v4.8.

See also: Universal Cluster Operations

21.16.1. Shared Data via Cluster Profile

The Cluster Profile is shared Profile that has been assigned to all Machines in the cluster including the Manager. The profile is self-referential: it must contain the name of the profile in a parameter so that machine action will be aware of the shared profile.

The Digital Rebar API has special behaviors allow machines to modify these templates including an extention for Golang template rendering (see Rendering Templates) to include .GenerateProfileToken. This special token must be used when updating the shared template.

For example, if we are using the Profile example to create a cluster, then we need to include the Param cluster/profile: example in the Profile. While this may appear redundant, it is essential for the machines to find the profile when they are operating against it.

Typically, all cluster scripts start with a “does my cluster profile exist” stanza from the cluster-utilities template.

Which is typically included in any cluster related task with the following:

{{ template "cluster-utilities.tmpl" .}}

The cluster-utilities template has the following code to initialize variables for cluster tasks.

{{ if .ParamExists "cluster/profile" }}
CLUSTER_PROFILE={{.Param "cluster/profile"}}
PROFILE_TOKEN={{.GenerateProfileToken (.Param "cluster/profile") 7200}}
echo "  Cluster Profile is $CLUSTER_PROFILE"
{{ else }}
echo "  WARNING: no cluster profile defined!  Run cluster-initialize task."
{{ end }}

21.16.2. Adding Data to the Cluster Profile

As data collects on the cluster profile from the manager or other members, it is common to update shared data as Params in the cluster profile. This makes the data available to all members in the cluster.

drpcli -T $PROFILE_TOKEN profiles add $CLUSTER_PROFILE param "myval" to "$OUTPUT"

Developers should pay attention to timing with Param data. Params that are injected during template rendering (e.g.: {{ .Param "myval" }}) are only evaluated when the job is created and will not change during a task run (aka a job).

If you are looking for data the could be added or changed inside a job then you should use the DRPCLI to retrieve the information from the shared profile with the -T $PROFILE_TOKEN pattern.

21.16.3. Resolving Potential Race Conditions via Atomic Updates with JSON PATCH

When cases where multiple machines write data into the Cluster Profile, there is a potential for race conditions. The following strategy is used to address these cases in a scalable way. No additional tooling is required.

The Digital Rebar CLI and UX use JSON PATCH (https://tools.ietf.org/html/rfc6902) instead of PUT extensively. PATCH allows atomic field-level updates by including tests in the update. This means that simulataneous upates do not create “last in” race conditions. Instead, the update will fail in a predictable way that can be used in scripts.

The DRPCLI facilitates use of PATCH for atomic operations by allowing scripts to pass in a reference (aka pre-modified) object. If the -r reference object does not match then the update will be rejected.

This allows machines take actions that require synchronization among the cluster when waiting on operations to finish on other machines. This requirement is mitigated by the manager pattern

The following example shows code that runs on all maachines but only succeeds for the cluster leader. It assumes the Param my/data is set to default to “none”.

{{template "setup.tmpl" .}}
cl=$(get_param "my/data")
while [[ $cl = "none" ]]; do
  drpcli -r "$cl" -T "$PROFILE_TOKEN" profiles set $CLUSTER_PROFILE param "my/data" to "foo" 2>/dev/null >/dev/null && break
  # sleep is is a hack but it allows for backoffs
  sleep 1
  # get the cluster info
  cl=$(get_param "my/data")

21.16.4. Cluster Filter to Collect Members

The cluster/filter Param plays a critical role in allowing the cluster manager to collect members of the cluster. The filter is a DRPCLI string that is applied to a DRPCLI machines list or DRPCLI machines count call to indentify the cluster membership.

This process is baked into the helper routines used for the cluster pattern and should be defined in the cluster profile if the default is not sufficient. By default, the cluster/filter is set to Profiles Eq $CLUSTER_PROFILE and will select all the machines attached to the cluster profile including the manager. Deveopers may choose to define clusters by other criteria such as Pool membership, machine attributes or Endpoint.

This shows how cluster/filter can be used in a task to collect the cluster members including the manager. --slim is used to reduce the return overhead.


CLUSTER_MEMBERS=”$(drpcli machines list {{.Param “cluster/filter”}} –slim Params,Meta)”

In practice, additional filters are applied to further select machines based on cluster role or capability (see below).

21.16.5. Starting Workflow on Cluster Members

During the multi-machine task(s), a simple loop can be used to start Workflows on the targeted members.

This example shows a loop that selects all members who are cluster leaders (cluster/leader Eq true) and omits the cluster manager as a safe guard (cluster/manager Eq false). Then it apples the target workflow and sets an icon on each leader.

CLUSTER_LEADERS="$(drpcli machines list cluster/manager Eq false cluster/leader Eq true {{.Param "cluster/filter"}} --slim Params,Meta)"
UUIDS=$(jq -rc ".[].Uuid" <<< "$CLUSTER_LEADERS")
for uuid in $UUIDS; do
  echo "  starting k3s leader install workflow on $uuid"
  drpcli machines meta set $uuid key icon to anchor > /dev/null
  drpcli machines workflow $uuid k3s-machine-install > /dev/null

Since these operations are made against another machine, multi-machine task(s) need to be called with an ExtraClaims definition that allows * actions for the scope: machines.

21.16.6. Working with Cluster Roles

As discussed above, the cluster pattern includes three built in roles: manager, leader and worker (assumed as not-leader and not-manager). The cluster/leaders are selected randomly during the cluster-initialize when run on the cluster manager. The default number of leaders is 1.

Developers can define additional roles by defining and assigning Params to members during the process. The three built in roles are used for reference.