Cluster Version Upgrade

Complete workflow for upgrading a Kubernetes cluster version using kubeadm, including control plane upgrade, node upgrade, and etcd upgrade considerations.

Overview

Cluster version upgrades are a key hands-on topic in the CKA exam. An upgrade must follow the Kubernetes version skew policy: upgrade the control plane first, then the worker nodes, and move only one minor version at a time.


1. Kubernetes Version Skew Policy

1.1 Version Support Rules

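kube-apiserver
      |
      |-- kube-controller-manager / kube-scheduler (must not be newer than the API Server; may be up to 1 minor version older)
      |-- kubelet (must not be newer than the API Server; may be up to 3 minor versions older)
      |-- kubectl (allowed to be ±1 minor version from API Server)
      |-- kube-proxy (must not be newer than the API Server; may be up to 3 minor versions older, and within 3 minor versions of the kubelet on the same node)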
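
The kubelet rule above can be checked mechanically before an upgrade. The following is a minimal sketch, not an official tool: the function names minor_of and kubelet_skew_ok are made up for illustration, and the check assumes both versions share the same major version.

```shell
# minor_of VERSION
# Print the minor number of a version such as v1.31.0 or 1.31.0.
minor_of() {
  local v="${1#v}"   # drop an optional leading "v"
  v="${v#*.}"        # drop the major component and its dot
  echo "${v%%.*}"    # keep everything up to the next dot
}

# kubelet_skew_ok APISERVER_VERSION KUBELET_VERSION
# Succeeds when the kubelet is not newer than the API server
# and at most three minor versions older.
kubelet_skew_ok() {
  local api kubelet
  api="$(minor_of "$1")"
  kubelet="$(minor_of "$2")"
  [ "$kubelet" -le "$api" ] && [ $((api - kubelet)) -le 3 ]
}
```

For example, kubelet_skew_ok v1.31.0 v1.29.4 succeeds, while kubelet_skew_ok v1.30.0 v1.31.0 fails because the kubelet would be newer than the API server.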

1.2 Upgrade Path Diagram

# Kubernetes version format: v<major>.<minor>.<patch>
# v1.29.x -> v1.30.x -> v1.31.x (cannot skip minor versions)

# Correct upgrade path
v1.29.x -> v1.30.x -> v1.31.x

# Invalid upgrade path (not allowed)
v1.29.x -> v1.31.x
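
The one-minor-version rule is simple enough to encode as a guard in an upgrade script. This is a sketch under the assumption that both versions have the same major version; the function name is_valid_upgrade is hypothetical.

```shell
# is_valid_upgrade CURRENT TARGET
# Succeeds when TARGET is in the same minor release as CURRENT (a patch
# upgrade) or exactly one minor release ahead. Downgrades and skipped
# minor versions fail.
is_valid_upgrade() {
  local cur tgt step
  cur="${1#v}"; cur="${cur#*.}"; cur="${cur%%.*}"   # minor of CURRENT
  tgt="${2#v}"; tgt="${tgt#*.}"; tgt="${tgt%%.*}"   # minor of TARGET
  step=$((tgt - cur))
  [ "$step" -eq 0 ] || [ "$step" -eq 1 ]
}
```

A script could call it as is_valid_upgrade v1.29.3 v1.30.0 || exit 1 before touching any packages.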

2. Pre-upgrade Preparation

2.1 Check Current Version and Upgrade Plan

# Check current cluster version (the --short flag was removed in kubectl v1.28)
kubectl version
kubeadm version
kubelet --version

# View node version details
kubectl get nodes -o wide

# Check upgrade plan (critical command)
sudo kubeadm upgrade plan

# Example output (abbreviated):
# Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
# kubelet | 1.30.0 -> 1.31.0
#
# Upgrade to the latest stable version:
# kube-apiserver | 1.30.0 -> 1.31.0
# kube-controller-manager | 1.30.0 -> 1.31.0
# kube-scheduler | 1.30.0 -> 1.31.0
# etcd | 3.5.12-0 -> 3.5.15-0
# ...

2.2 Backup and Checks

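# Backup etcd (run on a control plane node; certificate paths are the kubeadm defaults)
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-$(date +%Y%m%d).db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key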

# Backup important configuration
sudo cp -r /etc/kubernetes /etc/kubernetes-backup-$(date +%Y%m%d)

# Check cluster health
kubectl get nodes
kubectl get pods --all-namespaces
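kubectl get cs  # componentstatuses is deprecated since v1.19; treat its output as a hint only

# Check etcd health (etcdctl needs the etcd TLS credentials, which are mounted in the Pod)
kubectl exec -n kube-system etcd-<node> -- etcdctl \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    endpoint health --cluster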

# Confirm all nodes are Ready
kubectl wait --for=condition=Ready node --all --timeout=60s
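
When the wait above times out, it helps to see which nodes are holding things up. A small sketch of a filter for that; the function name not_ready_nodes is made up, and it assumes the default column layout of kubectl get nodes, where STATUS is the second column.

```shell
# not_ready_nodes
# Read `kubectl get nodes --no-headers` output on stdin and print the name
# of every node whose STATUS column is not exactly "Ready". A cordoned node
# shows "Ready,SchedulingDisabled" and is therefore reported as well.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Typical use would be kubectl get nodes --no-headers | not_ready_nodes; empty output means every node is Ready and schedulable.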

2.3 Upgrade Strategy Selection

# Strategy 1: Upgrade all nodes
# Control plane -> Worker nodes

# Strategy 2: Canary upgrade
# Upgrade one worker node first to validate -> Then upgrade all

# Strategy 3: Blue-green deployment (requires extra resources)
# Deploy new version cluster, switch traffic

3. Upgrading Control Plane Nodes

3.1 First Control Plane Node

# 1. Update kubeadm (Ubuntu/Debian)
# pkgs.k8s.io has one repository per minor release; point the apt source at v1.31 first
sudo apt-get update
sudo apt-cache madison kubeadm  # View available versions
sudo apt-mark unhold kubeadm && sudo apt-get install -y kubeadm=1.31.0-1.1 && sudo apt-mark hold kubeadm

# 1. Update kubeadm (CentOS/RHEL)
sudo yum install -y kubeadm-1.31.0 --disableexcludes=kubernetes

# 2. Verify the update
kubeadm version

# 3. Pre-upgrade check
sudo kubeadm upgrade plan v1.31.0

# 4. Drain the current node
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# 5. Apply the upgrade (important!)
sudo kubeadm upgrade apply v1.31.0

# 6. Update kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.31.0-1.1 kubectl=1.31.0-1.1
sudo apt-mark hold kubelet kubectl

# 7. Restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# 8. Restore node scheduling
kubectl uncordon <node-name>

# 9. Verify
kubectl get nodes
kubeadm version
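
The package install in step 1 only finds 1.31 packages if the apt source points at the v1.31 repository, because pkgs.k8s.io publishes a separate repository per minor release. Below is a minimal sketch of deriving that source line from a target version. The function name k8s_apt_source is made up for illustration, and the keyring path is the one used in the upstream install instructions, which may differ on your hosts.

```shell
# k8s_apt_source VERSION
# Print the apt source line for the minor release that VERSION belongs to.
# Assumption: the signing key lives at the keyring path shown below.
k8s_apt_source() {
  local v major rest minor
  v="${1#v}"            # v1.31.0 -> 1.31.0
  major="${v%%.*}"      # 1
  rest="${v#*.}"        # 31.0
  minor="${rest%%.*}"   # 31
  echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v${major}.${minor}/deb/ /"
}
```

One way to apply it is k8s_apt_source v1.31.0 | sudo tee /etc/apt/sources.list.d/kubernetes.list, followed by sudo apt-get update.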

3.2 Subsequent Control Plane Nodes

# Starting from the second control plane node, use upgrade node instead of upgrade apply
# 1. Update kubeadm
sudo apt-mark unhold kubeadm && sudo apt-get install -y kubeadm=1.31.0-1.1 && sudo apt-mark hold kubeadm

# 2. Drain the node
kubectl drain <cp-node-2> --ignore-daemonsets --delete-emptydir-data

# 3. Upgrade the node (different from the first node's upgrade apply)
sudo kubeadm upgrade node

# 4. Update kubelet and kubectl
sudo apt-mark unhold kubelet kubectl && sudo apt-get install -y kubelet=1.31.0-1.1 kubectl=1.31.0-1.1 && sudo apt-mark hold kubelet kubectl

# 5. Restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# 6. Restore node scheduling
kubectl uncordon <cp-node-2>

# 7. Verify
kubectl get nodes

4. Upgrading Worker Nodes

# Execute on each worker node (run drain from the control plane node)

# 1. Drain the node from the control plane
kubectl drain <worker-node> --ignore-daemonsets --delete-emptydir-data

# 2. Update kubeadm on the worker node
sudo apt-mark unhold kubeadm && sudo apt-get install -y kubeadm=1.31.0-1.1 && sudo apt-mark hold kubeadm

# 3. Upgrade kubelet configuration on the worker node
sudo kubeadm upgrade node

# 4. Update kubelet and kubectl on the worker node
sudo apt-mark unhold kubelet kubectl && sudo apt-get install -y kubelet=1.31.0-1.1 kubectl=1.31.0-1.1 && sudo apt-mark hold kubelet kubectl

# 5. Restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# 6. Restore node scheduling from the control plane
kubectl uncordon <worker-node>

# 7. Verify on the worker node (--client avoids needing a kubeconfig on the worker)
kubectl version --client
kubelet --version
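
With several workers, it can help to print the full command sequence for one node and review it before running anything. The sketch below only prints commands and executes nothing. The function name worker_upgrade_plan is hypothetical, and it assumes the worker is reachable over ssh under its node name.

```shell
# worker_upgrade_plan NODE PACKAGE_VERSION
# Print, in order, the commands for upgrading one worker node.
# Nothing is executed; the output is meant to be reviewed first.
worker_upgrade_plan() {
  local node="$1" pkg="$2"
  cat <<EOF
kubectl drain ${node} --ignore-daemonsets --delete-emptydir-data
ssh ${node} sudo apt-mark unhold kubeadm kubelet kubectl
ssh ${node} sudo apt-get update
ssh ${node} sudo apt-get install -y kubeadm=${pkg}
ssh ${node} sudo kubeadm upgrade node
ssh ${node} sudo apt-get install -y kubelet=${pkg} kubectl=${pkg}
ssh ${node} sudo apt-mark hold kubeadm kubelet kubectl
ssh ${node} sudo systemctl daemon-reload
ssh ${node} sudo systemctl restart kubelet
kubectl uncordon ${node}
EOF
}
```

Calling worker_upgrade_plan worker-1 1.31.0-1.1 prints the ten commands for worker-1. Running them one node at a time keeps the rest of the workers available for the drained Pods.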

5. Upgrade Verification

# Verify all node versions
kubectl get nodes

# NAME              STATUS   ROLES           VERSION
# control-plane-1   Ready    control-plane   v1.31.0
# control-plane-2   Ready    control-plane   v1.31.0
# worker-1          Ready    <none>          v1.31.0
# worker-2          Ready    <none>          v1.31.0

# Verify cluster functionality
kubectl run nginx --image=nginx --restart=Never
kubectl get pods
kubectl delete pod nginx

# View component versions
kubectl get pods -n kube-system -o yaml | grep "image:" | sort -u

# Verify system Pods are running normally
kubectl get pods -n kube-system
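
On a larger cluster, scanning the node list by eye is error-prone. A sketch of a filter that reports stragglers: the function name nodes_not_at is made up, and it assumes the real kubectl get nodes layout (NAME, STATUS, ROLES, AGE, VERSION), in which VERSION is the fifth column.

```shell
# nodes_not_at VERSION
# Read `kubectl get nodes --no-headers` output on stdin and print
# "<node> <version>" for every node whose kubelet version differs
# from VERSION.
nodes_not_at() {
  awk -v want="$1" '$5 != want { print $1 " " $5 }'
}
```

For instance, kubectl get nodes --no-headers | nodes_not_at v1.31.0 prints nothing once every node has been upgraded.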

6. etcd Upgrade Considerations

6.1 etcd Version Compatibility

# Check current etcd version
kubectl exec -n kube-system etcd-<node> -- etcdctl version
kubectl get pods -n kube-system etcd-<node> -o yaml | grep image:

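# etcd upgrade principles:
# - etcd can be upgraded across multiple patch versions within the same minor release (e.g., 3.5.12 -> 3.5.15)
# - Check Kubernetes version compatibility before upgrading etcd
# - kubeadm for Kubernetes v1.29 through v1.31 deploys etcd 3.5.x; check the release notes for other versions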
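
The grep on image: above returns the whole image reference. A tiny sketch for pulling out just the etcd version tag; the function name etcd_image_tag is hypothetical, and it assumes the image is referenced by tag rather than by digest.

```shell
# etcd_image_tag IMAGE
# Print the tag of an image reference, e.g. 3.5.15-0 for
# registry.k8s.io/etcd:3.5.15-0. Everything up to the last colon is removed,
# so a registry host with a port is handled too.
etcd_image_tag() {
  local image="$1"
  echo "${image##*:}"
}
```

It can be fed from kubectl get pod -n kube-system etcd-<node> -o jsonpath='{.spec.containers[0].image}' to compare the running version with the target shown by kubeadm upgrade plan.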

6.2 etcd Upgrade Steps

# If using kubeadm-managed etcd (stacked etcd):
# kubeadm upgrade will automatically handle etcd upgrade

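# Manual etcd upgrade (external etcd cluster):
# If the etcd cluster uses TLS, add --endpoints, --cacert, --cert, and --key to every etcdctl command below
# 1. Backup etcd data
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-pre-upgrade.db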

# 2. Upgrade etcd members one by one
# Stop etcd service
sudo systemctl stop etcd

# Replace etcd binary
sudo mv /usr/local/bin/etcd /usr/local/bin/etcd.old
sudo cp /tmp/etcd-v3.5.15-linux-amd64/etcd /usr/local/bin/etcd

# Start etcd service
sudo systemctl start etcd

# 3. Verify etcd cluster health
ETCDCTL_API=3 etcdctl endpoint health --cluster

# 4. Check member status
ETCDCTL_API=3 etcdctl member list

7. Upgrade Rollback

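# kubeadm upgrade apply rolls back automatically when it fails and is idempotent,
# so rerunning it is the first thing to try. kubeadm upgrade cannot downgrade a cluster;
# the steps below rebuild the node from backup and are a last resort.

# 1. Reset the node (destructive: removes the local etcd data and the control plane manifests),
#    then restore etcd from backup
sudo kubeadm reset -f
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot-<date>.db \
    --data-dir=/var/lib/etcd-restored
# kubeadm init uses /var/lib/etcd unless etcd.local.dataDir in the kubeadm config says otherwise,
# so make sure the restored directory is the one etcd will actually use

# 2. Downgrade kubeadm and kubelet to the old version (apt-get needs --allow-downgrades together with -y)
sudo apt-mark unhold kubeadm kubelet
sudo apt-get install -y --allow-downgrades kubeadm=1.30.x-1.1 kubelet=1.30.x-1.1
sudo apt-mark hold kubeadm kubelet

# 3. Reinitialize kubeadm (--ignore-preflight-errors=all also hides real problems; narrow it down if you can)
sudo kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all

# 4. Restore nodes
kubectl uncordon <node-name>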

8. Upgrade Workflow Summary

┌─────────────────────────────────────────────┐
│           Upgrade Preparation                │
│ 1. Backup etcd, config, check cluster health│
│ 2. Run kubeadm upgrade plan                 │
└─────────────────┬───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│           Upgrade Control Plane              │
│ 1. First control plane node:                │
│    kubeadm upgrade apply v1.31.0             │
│ 2. Subsequent control plane nodes:          │
│    kubeadm upgrade node                      │
│ 3. Each node: drain → upgrade → uncordon    │
└─────────────────┬───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│           Upgrade Worker Nodes              │
│ 1. drain node                                │
│ 2. kubeadm upgrade node                      │
│ 3. Upgrade kubelet & kubectl                │
│ 4. Restart kubelet                           │
│ 5. uncordon node                             │
└─────────────────┬───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│           Upgrade Verification               │
│ 1. kubectl get nodes (version check)         │
│ 2. Run a test Pod                            │
│ 3. Check system Pod health                   │
└─────────────────────────────────────────────┘

CKA Exam Key Points

  1. kubeadm upgrade plan first, then act -- Always check the plan before upgrading
  2. First control plane uses upgrade apply -- All other control planes and worker nodes use upgrade node
  3. drain must include --ignore-daemonsets -- Otherwise DaemonSet Pods will cause drain to fail
  4. apt-mark unhold/hold -- The exam may use apt-mark to lock versions; you need to unhold before upgrading
  5. Upgrade nodes one at a time -- Drain one node at a time, finish upgrading and uncordon before moving to the next
  6. Only upgrade to the next minor version -- Skipping minor versions is not supported (e.g., v1.29 -> v1.31 is invalid)

🧪 Complete Hands-on Example: Upgrade Cluster from v1.29 to v1.30

Scenario Description

Upgrade the cluster's control plane nodes and worker nodes from Kubernetes v1.29 to v1.30, following the official upgrade path, processing nodes one at a time.

Prerequisites

  • Cluster version is v1.29.x
  • etcd data and important configurations have been backed up
  • All nodes are in Ready state
  • Each node has the Kubernetes apt repository (pkgs.k8s.io) configured for the v1.30 minor release

Steps

Step 1: Pre-upgrade checks and backup

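# Check current version (the --short flag was removed in kubectl v1.28)
kubectl version
# Client Version: v1.29.0
# Server Version: v1.29.0

# Check upgrade plan
sudo kubeadm upgrade plan
# ...
# Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
# kubelet | 1.29.0 -> 1.30.0
# ...
# Upgrade to the latest stable version:
# kube-apiserver | 1.29.0 -> 1.30.0
# kube-controller-manager | 1.29.0 -> 1.30.0
# ...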

# Backup etcd and configuration
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-preupgrade.db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key
sudo cp -r /etc/kubernetes /etc/kubernetes-backup

Step 2: Upgrade the first control plane node

# Drain the control plane node
kubectl drain control-plane-1 --ignore-daemonsets --delete-emptydir-data

# Update kubeadm (the pkgs.k8s.io apt source must point at the v1.30 repository)
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1
sudo apt-mark hold kubeadm

# Verify new version
kubeadm version
# kubeadm version: &version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.0"}

# Apply upgrade
sudo kubeadm upgrade apply v1.30.0
# [upgrade/successful] SUCCESS! Your cluster was upgraded to v1.30.0. Enjoy!

# Update kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubelet kubectl

# Restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# Restore node scheduling
kubectl uncordon control-plane-1

Step 3: Upgrade worker nodes

# Drain the worker node from the control plane
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data

# SSH to the worker node and upgrade kubeadm
ssh worker-1
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.0-1.1
sudo apt-mark hold kubeadm

# Upgrade node configuration
sudo kubeadm upgrade node

# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.30.0-1.1 kubectl=1.30.0-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
exit

# Restore worker node scheduling from the control plane
kubectl uncordon worker-1

# Repeat the same steps to upgrade worker-2, worker-3...

Verification Results

# Verify all node versions
kubectl get nodes
# NAME               STATUS   ROLES           VERSION
# control-plane-1    Ready    control-plane   v1.30.0
# worker-1           Ready    <none>          v1.30.0
# worker-2           Ready    <none>          v1.30.0

# Verify cluster functionality
kubectl get pods --all-namespaces
kubectl run test-nginx --image=nginx --restart=Never
kubectl get pod test-nginx
kubectl delete pod test-nginx

Exam Tips

  • Run kubeadm upgrade plan before upgrading -- It confirms the available target version and surfaces problems before anything changes
  • First control plane uses kubeadm upgrade apply, remaining control planes and all worker nodes use kubeadm upgrade node
  • Each node upgrade must follow the drain -> upgrade -> uncordon sequence
  • Pay attention to apt-mark unhold to release version locks and apt-mark hold to relock
  • Only upgrade to the next minor version (v1.29 -> v1.30); cannot skip versions (v1.29 -> v1.31)
