Update control nodes in Cluster API clusters
This page explains what can trigger a control plane upgrade in k0smotron and how to perform one.
What triggers a control plane upgrade
k0smotron continuously reconciles each K0sControlPlane resource and detects that an upgrade is needed when any of the following change:
k0s version change
The most common trigger is changing spec.version in the K0sControlPlane resource. k0smotron compares this value against the version reported by each control plane machine and starts the upgrade process if they differ.
Version values are normalized before comparison, so, for example, v1.31.2 and v1.31.2+k0s.0 are treated as equivalent.
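As a rough illustration of this comparison, the following shell sketch normalizes two version strings before comparing them. It assumes that normalization means appending a default +k0s.0 suffix when no build suffix is present. That rule and the function names are assumptions made for illustration, not k0smotron's actual implementation.

```shell
#!/bin/sh
# Illustrative sketch only; not k0smotron's code.
# Assumption: a version without a "+..." build suffix is treated as "+k0s.0".
normalize_version() {
  case "$1" in
    *+*) printf '%s\n' "$1" ;;        # suffix already present: keep as-is
    *)   printf '%s+k0s.0\n' "$1" ;;  # no suffix: add the assumed default
  esac
}

# Succeeds (exit status 0) when both versions are equal after normalization.
versions_match() {
  [ "$(normalize_version "$1")" = "$(normalize_version "$2")" ]
}

if versions_match "v1.31.2" "v1.31.2+k0s.0"; then
  echo "no upgrade needed"
fi
if ! versions_match "v1.31.2+k0s.0" "v1.31.3+k0s.0"; then
  echo "upgrade needed"
fi
```

Under this assumed rule, the first pair compares equal and the second does not, so only the second would start an upgrade.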
Configuration change
k0smotron handles configuration changes differently depending on which part of spec.k0sConfigSpec is modified:
- Changes to spec.k0sConfigSpec.k0s (the k0s ClusterConfig object) are applied using k0s dynamic configuration. k0smotron patches the ClusterConfig resource directly in the workload cluster without replacing any machines. Note that some fields cannot be changed via dynamic configuration and are ignored. See the k0s documentation for the full list.
- Changes to any other field in spec.k0sConfigSpec (such as args, files, or preStartCommands) are detected by comparing a hash of the bootstrap config stored in each machine's annotations against the current spec. Machines whose config no longer matches are marked for replacement, using the Recreate workflow regardless of spec.updateStrategy.
Warning
k0smotron only detects configuration changes made directly in the K0sControlPlane spec. The spec.k0sConfigSpec.files field supports loading file content from external Secret or ConfigMap objects via contentFrom, but if only the content of those objects changes, k0smotron will not detect it and no upgrade will be triggered. To propagate updated file content, create a new Secret or ConfigMap and update contentFrom in the K0sControlPlane spec to reference the new object.
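The hash-based detection and the limitation in the warning can be illustrated with a small shell sketch. It hashes a stand-in for the bootstrap config and compares the result with a previously stored value. The helper names, the sample strings, and the choice of SHA-256 are assumptions for illustration; the input and algorithm that k0smotron actually uses may differ.

```shell
#!/bin/sh
# Illustrative sketch only; not k0smotron's code.
# Hash a config string (SHA-256 is chosen for the example).
config_hash() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Succeeds when the stored hash no longer matches the current spec.
needs_replacement() {  # $1 = hash stored on the machine, $2 = current spec
  [ "$1" != "$(config_hash "$2")" ]
}

# Hypothetical spec fragments: only the reference to the Secret is part of the spec.
old_spec='args=--enable-worker;files=/etc/app.conf<-secret:app-config-v1'
new_spec='args=--enable-worker;files=/etc/app.conf<-secret:app-config-v2'

stored="$(config_hash "$old_spec")"

# Secret content edited in place: the spec is unchanged, so nothing is detected.
needs_replacement "$stored" "$old_spec" || echo "unchanged: no replacement"

# contentFrom now references a new Secret: the spec, and thus its hash, changes.
needs_replacement "$stored" "$new_spec" && echo "changed: machine marked for replacement"
```

Because only the reference is hashed in this sketch, editing the referenced object in place leaves the hash untouched, which mirrors why a new Secret or ConfigMap name is needed to trigger a rollout.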
Machine template change
When spec.machineTemplate.infrastructureRef points to a new or changed infrastructure template, machines that were cloned from an older template revision are marked for replacement. k0smotron detects this by inspecting the cluster.x-k8s.io/cloned-from-name and cluster.x-k8s.io/cloned-from-groupkind annotations on each infrastructure machine.
Like configuration changes outside spec.k0sConfigSpec.k0s, template changes always trigger machine recreation.
Update strategies
k0smotron supports three update strategies, configured via spec.updateStrategy:
| Strategy | Behavior |
|---|---|
| InPlace (default) | Updates k0s on existing machines without replacing them, using k0s autopilot |
| Recreate | Creates new machines first, then removes old ones |
| RecreateDeleteFirst | Removes old machines first, then creates new ones |
Warning
The Recreate strategy is not supported for clusters running in --single mode.
Warning
The RecreateDeleteFirst strategy requires at least 3 control plane nodes.
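As a sketch of how the strategy might be switched on an existing resource, the helper below builds a JSON merge patch for spec.updateStrategy and rejects values outside the three listed above. The function name strategy_patch and the resource name docker-test-cp are assumptions for illustration. The kubectl command is shown as a comment because it needs access to the management cluster.

```shell
#!/bin/sh
# Build a JSON merge patch for spec.updateStrategy (hypothetical helper).
strategy_patch() {
  case "$1" in
    InPlace|Recreate|RecreateDeleteFirst)
      printf '{"spec":{"updateStrategy":"%s"}}\n' "$1" ;;
    *)
      echo "unknown update strategy: $1" >&2
      return 1 ;;
  esac
}

strategy_patch Recreate

# Applying it against the management cluster (resource name is an example):
#   kubectl patch k0scontrolplane docker-test-cp --type merge -p "$(strategy_patch Recreate)"
```

Validating the value locally avoids sending a misspelled strategy to the API server; editing the manifest and running kubectl apply achieves the same result.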
Monitoring upgrade status
The K0sControlPlane status fields give visibility into an in-progress upgrade:
```shell
kubectl get k0scontrolplane <name> -o yaml
```
Relevant status fields:
| Field | Description |
|---|---|
| status.replicas | Total number of non-terminated control plane machines |
| status.readyReplicas | Machines that are fully running and ready |
| status.upToDateReplicas | Machines running the desired k0s version |
| status.availableReplicas | Machines currently available to serve traffic |
| status.version | Minimum Kubernetes version across all machines |
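To read individual fields rather than the full YAML, a jsonpath query can be combined with a small check. In the sketch below, the function names and the completion rule (all replicas up to date and ready) are assumptions for illustration rather than k0smotron's own definition of a finished upgrade.

```shell
#!/bin/sh
# Succeeds when every replica is both up to date and ready.
# Assumed completion rule, for illustration only.
rollout_complete() {  # $1 = replicas, $2 = upToDateReplicas, $3 = readyReplicas
  [ "$1" -gt 0 ] && [ "$2" -eq "$1" ] && [ "$3" -eq "$1" ]
}

# Print the three counters separated by spaces.
# Requires access to the management cluster; not called here.
# A counter that is unset in the status prints as an empty string.
read_counters() {  # $1 = K0sControlPlane name
  kubectl get k0scontrolplane "$1" \
    -o jsonpath='{.status.replicas} {.status.upToDateReplicas} {.status.readyReplicas}'
}

# Example with hard-coded sample values (mid-upgrade, then finished):
rollout_complete 3 1 3 || echo "upgrade in progress"
rollout_complete 3 3 3 && echo "upgrade complete"
```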
For InPlace upgrades, you can also inspect the autopilot plan running inside the workload cluster:
```shell
kubectl --kubeconfig <workload-cluster-kubeconfig> get plan autopilot -o yaml
```
See the k0s autopilot documentation for a description of the plan states.
Updating the control plane using k0s autopilot (InPlace)
When spec.updateStrategy is InPlace (or omitted), k0smotron uses k0s autopilot to update k0s on each control plane node without replacing the machines. This is faster than recreating machines and keeps any local data on the node intact.
- Check the configuration of your deployed cluster. For example:

  ```yaml
  apiVersion: cluster.x-k8s.io/v1beta2
  kind: Cluster
  metadata:
    name: docker-test
    namespace: default
  spec:
    clusterNetwork:
      pods:
        cidrBlocks:
          - 192.168.0.0/16
      serviceDomain: cluster.local
      services:
        cidrBlocks:
          - 10.128.0.0/12
    controlPlaneRef:
      apiGroup: controlplane.cluster.x-k8s.io
      kind: K0sControlPlane
      name: docker-test-cp
    infrastructureRef:
      apiGroup: infrastructure.cluster.x-k8s.io
      kind: DockerCluster
      name: docker-test
  ---
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
  kind: DockerCluster
  metadata:
    name: docker-test
    namespace: default
  spec:
  ---
  apiVersion: controlplane.cluster.x-k8s.io/v1beta2
  kind: K0sControlPlane
  metadata:
    name: docker-test-cp
  spec:
    replicas: 3
    version: v1.31.2+k0s.0
    updateStrategy: InPlace
    k0sConfigSpec:
      args:
        - --enable-worker
      k0s:
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          api:
            extraArgs:
              anonymous-auth: "true" # anonymous-auth=true is needed for k0s to allow unauthorized health-checks on /healthz
          telemetry:
            enabled: true
    machineTemplate:
      infrastructureRef:
        apiGroup: infrastructure.cluster.x-k8s.io
        kind: DockerMachineTemplate
        name: docker-test-cp-template
  ---
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
  kind: DockerMachineTemplate
  metadata:
    name: docker-test-cp-template
    namespace: default
  spec:
    template:
      spec:
        customImage: kindest/node:v1.31.0
  ```

- Update spec.version to the target k0s release:

  ```yaml
  apiVersion: controlplane.cluster.x-k8s.io/v1beta2
  kind: K0sControlPlane
  metadata:
    name: docker-test-cp
  spec:
    replicas: 3
    version: v1.31.3+k0s.0 # updated version
    updateStrategy: InPlace
    k0sConfigSpec:
      args:
        - --enable-worker
      k0s:
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          api:
            extraArgs:
              anonymous-auth: "true"
          telemetry:
            enabled: true
    machineTemplate:
      infrastructureRef:
        apiGroup: infrastructure.cluster.x-k8s.io
        kind: DockerMachineTemplate
        name: docker-test-cp-template
  ```

- Apply the change:

  ```shell
  kubectl apply -f ./path-to-file.yaml
  ```

  k0smotron creates an autopilot Plan resource inside the workload cluster that orchestrates the rolling update across all control plane nodes.
Updating the control plane using the Cluster API workflow (Recreate)
When spec.updateStrategy is Recreate, k0smotron replaces control plane machines one at a time: it creates a new machine at the desired version, waits for it to become ready, and then removes an old one.
When spec.updateStrategy is RecreateDeleteFirst, k0smotron removes an old machine before creating its replacement. This is useful when resources are constrained, but it requires at least 3 control plane nodes to maintain quorum during the rollout.
Warning
The Recreate strategy is not supported for clusters running in --single mode.
- Check the configuration of your deployed cluster. For example:

  ```yaml
  apiVersion: cluster.x-k8s.io/v1beta2
  kind: Cluster
  metadata:
    name: docker-test
    namespace: default
  spec:
    clusterNetwork:
      pods:
        cidrBlocks:
          - 192.168.0.0/16
      serviceDomain: cluster.local
      services:
        cidrBlocks:
          - 10.128.0.0/12
    controlPlaneRef:
      apiGroup: controlplane.cluster.x-k8s.io
      kind: K0sControlPlane
      name: docker-test-cp
    infrastructureRef:
      apiGroup: infrastructure.cluster.x-k8s.io
      kind: DockerCluster
      name: docker-test
  ---
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
  kind: DockerCluster
  metadata:
    name: docker-test
    namespace: default
  spec:
  ---
  apiVersion: controlplane.cluster.x-k8s.io/v1beta2
  kind: K0sControlPlane
  metadata:
    name: docker-test-cp
  spec:
    replicas: 3
    version: v1.31.2+k0s.0
    updateStrategy: Recreate
    k0sConfigSpec:
      args:
        - --enable-worker
      k0s:
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          api:
            extraArgs:
              anonymous-auth: "true" # anonymous-auth=true is needed for k0s to allow unauthorized health-checks on /healthz
          telemetry:
            enabled: true
    machineTemplate:
      infrastructureRef:
        apiGroup: infrastructure.cluster.x-k8s.io
        kind: DockerMachineTemplate
        name: docker-test-cp-template
  ---
  apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
  kind: DockerMachineTemplate
  metadata:
    name: docker-test-cp-template
    namespace: default
  spec:
    template:
      spec:
        customImage: kindest/node:v1.31.0
  ```

- Update spec.version to the target k0s release:

  ```yaml
  apiVersion: controlplane.cluster.x-k8s.io/v1beta2
  kind: K0sControlPlane
  metadata:
    name: docker-test-cp
  spec:
    replicas: 3
    version: v1.31.3+k0s.0 # updated version
    updateStrategy: Recreate
    k0sConfigSpec:
      args:
        - --enable-worker
      k0s:
        apiVersion: k0s.k0sproject.io/v1beta1
        kind: ClusterConfig
        metadata:
          name: k0s
        spec:
          api:
            extraArgs:
              anonymous-auth: "true" # anonymous-auth=true is needed for k0s to allow unauthorized health-checks on /healthz
          telemetry:
            enabled: true
    machineTemplate:
      infrastructureRef:
        apiGroup: infrastructure.cluster.x-k8s.io
        kind: DockerMachineTemplate
        name: docker-test-cp-template
  ```

- Update the resources:

  ```shell
  kubectl apply -f ./path-to-file.yaml
  ```
Known issues
Due to a bug in older k0s autopilot versions, the control plane upgrade may get stuck at the Cordoning stage when control plane nodes also run workloads (for example, when the --enable-worker flag is used). This bug is fixed in the latest patch versions of k0s.
If the upgrade stalls, use the following steps to recover:
- Identify the node that is stuck:

  ```shell
  kubectl --kubeconfig <workload-cluster-kubeconfig> get plan autopilot -o yaml
  ```

- Manually drain the node.
- Patch the k0sproject.io/autopilot-signal-data annotation on the corresponding ControlNode object: change the status field in the JSON value from Cordoning to ApplyingUpdate.
- Repeat for any other nodes that are stuck.
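The recovery steps above can be sketched as a script. The function names, the sed expression, the drain flags, and the sample annotation value are assumptions for illustration. The real annotation carries more fields, so inspect it before overwriting it. The sketch also assumes that the ControlNode object has the same name as the node. The kubectl calls are wrapped in a function that is not invoked here.

```shell
#!/bin/sh
# Rewrite the status value in the signal-data JSON from Cordoning to ApplyingUpdate.
unstick_signal_data() {
  printf '%s' "$1" |
    sed -E 's/("status"[[:space:]]*:[[:space:]]*)"Cordoning"/\1"ApplyingUpdate"/'
}

# Drain a stuck node and patch its ControlNode annotation (not called here).
recover_node() {  # $1 = workload cluster kubeconfig path, $2 = node name
  kubectl --kubeconfig "$1" drain "$2" --ignore-daemonsets --delete-emptydir-data
  current="$(kubectl --kubeconfig "$1" get controlnode "$2" \
    -o jsonpath='{.metadata.annotations.k0sproject\.io/autopilot-signal-data}')"
  kubectl --kubeconfig "$1" annotate controlnode "$2" --overwrite \
    "k0sproject.io/autopilot-signal-data=$(unstick_signal_data "$current")"
}

# Simplified, hypothetical annotation value:
unstick_signal_data '{"planId":"example","status":{"status":"Cordoning"}}'
```

Rewriting only the status value keeps the rest of the annotation intact, which matters because autopilot reads the other fields in the same JSON document.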