
Kubernetes

This is my practical Kubernetes handbook. It is organised so you can either learn progressively or jump straight to troubleshooting and operations topics.

Beginner Track

Start with fundamentals, then move to config and networking.

1. Core Concepts

These are the fundamental building blocks that run your containers. Understanding these resources is essential before moving to more advanced topics.

Pod

The smallest deployable unit in Kubernetes. Think of it as a wrapper around one or more containers that share the same network and storage. Pods are designed to be temporary. They can be created, destroyed, and replaced at any time. This is why you rarely create Pods directly. Instead, you use controllers like Deployments that manage Pods for you.

Command Description
kubectl get pods List pods in current namespace
kubectl describe pod <name> Show detailed pod info and events
kubectl delete pod <name> Delete a pod (controller recreates it)
Example YAML
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    ports:
    - containerPort: 80

Pod Lifecycle CKA

Pods progress through distinct phases during their lifetime. Understanding these phases helps with debugging and designing resilient applications.

Phase Description Common Causes
Pending Pod accepted but containers not running Image pulling, PVC binding, resource constraints
Running At least one container is running Normal operation
Succeeded All containers terminated successfully Job or one-shot Pod completed
Failed All containers terminated, at least one failed Non-zero exit code, OOMKilled
Unknown Cannot determine pod state Node communication lost
Container States

Within each Pod, individual containers have their own state:

State Meaning
Waiting Container waiting to start (image pull, init container running)
Running Container executing
Terminated Container exited (success or failure)
# Check container states
kubectl get pod <name> -o jsonpath='{.status.containerStatuses[*].state}'

# Check init container status
kubectl get pod <name> -o jsonpath='{.status.initContainerStatuses[*].name}'

Deployment CKA

The most common way to run stateless applications like web servers or APIs. Deployments manage ReplicaSets, which manage Pods. This layered approach enables rolling updates. When you update the container image, the Deployment creates new Pods with the new version, waits for them to be ready, then gradually removes the old Pods. Your application stays online throughout the update.

Command Description
kubectl create deploy <name> --image=<img> Create deployment imperatively
kubectl scale deploy <name> --replicas=3 Scale to 3 replicas
kubectl rollout status deploy <name> Watch rollout progress
Example YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80

Deployment Patterns CKAD

Beyond rolling updates, Kubernetes supports advanced deployment strategies for zero-downtime releases and risk mitigation.

Pattern Description Use Case
Blue-Green Run both versions simultaneously, switch traffic instantly Zero-downtime deployments, easy rollback
Canary Route small % of traffic to new version, gradually increase Risk mitigation, A/B testing
Blue-Green Deployment Example

Deploy two versions simultaneously. Switch traffic by updating the Service selector. Instant rollback by switching back.

# 1. Create blue deployment (current version)
kubectl create deployment nginx-blue --image=nginx:1.14 --replicas=3

# 2. Create green deployment (new version)
kubectl create deployment nginx-green --image=nginx:1.15 --replicas=3

# 3. Create service pointing to blue (pods from kubectl create deployment get app=<name>)
kubectl expose deployment nginx-blue --name=nginx --port=80 --selector=app=nginx-blue

# 4. Switch traffic to green (instant) - edit the Service selector
kubectl edit service nginx  # Change selector to app=nginx-green

# 5. Rollback to blue if issues appear
kubectl edit service nginx  # Change selector back to app=nginx-blue

# 6. Cleanup old deployment
kubectl delete deployment nginx-blue
Canary Deployment Example

Start with 1 replica of the new version alongside the old. Gradually scale up canary while scaling down stable. Monitor for errors before full rollout.

# 1. Stable deployment (9 replicas)
kubectl create deployment nginx --image=nginx:1.14 --replicas=9

# 2. Canary deployment (1 replica)
kubectl create deployment nginx-canary --image=nginx:1.15 --replicas=1

# 3. One Service must select pods from both deployments via a shared pod label
#    (label the pod template - labelling the Deployment object alone has no effect)
kubectl patch deployment nginx -p '{"spec":{"template":{"metadata":{"labels":{"track":"web"}}}}}'
kubectl patch deployment nginx-canary -p '{"spec":{"template":{"metadata":{"labels":{"track":"web"}}}}}'
kubectl expose deployment nginx --name=web --port=80 --selector=track=web

# 4. Monitor canary - check logs, metrics
kubectl logs -l app=nginx-canary --tail=100

# 5. Gradually increase canary
kubectl scale deployment nginx-canary --replicas=3
kubectl scale deployment nginx --replicas=7

# 6. Continue until fully migrated
kubectl scale deployment nginx-canary --replicas=10
kubectl delete deployment nginx

Note: For traffic percentage control, use an Ingress controller with canary annotations (Nginx, Traefik) or a service mesh (Istio, Linkerd).

ReplicaSet CKA

The engine inside a Deployment that keeps your application running. ReplicaSets ensure the correct number of Pod copies are always running. If a Pod crashes or a node fails, the ReplicaSet automatically creates a new Pod to replace it. You typically do not work with ReplicaSets directly. Deployments manage them for you. Understanding ReplicaSets helps explain why applications remain available during failures.

Command Description
kubectl get rs List ReplicaSets
kubectl describe rs <name> Show ReplicaSet details and events
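You rarely write one directly, but a standalone ReplicaSet manifest (a minimal sketch) is essentially the inner portion of a Deployment. Note that editing the image here does not roll pods; rolling updates are what Deployments add on top.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx          # Must match the template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
```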

StatefulSet CKA

Use StatefulSets for applications that need stable identities, like databases or message queues. Unlike Deployments where Pods are interchangeable, StatefulSet Pods get predictable names (mysql-0, mysql-1) and are created in order. Each Pod maintains its own persistent storage that survives restarts. This makes StatefulSets ideal for clustered databases where each node needs to know its identity.

Command Description
kubectl get sts List StatefulSets
kubectl scale sts <name> --replicas=3 Scale StatefulSet
Example YAML
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

DaemonSet CKA

Runs one Pod on every eligible node in your cluster (taints and node selectors can exclude nodes). Perfect for system-level services that need to be present everywhere. Common examples include log collectors that gather container logs from each node, monitoring agents that report node metrics, or network plugins that set up networking on each machine. When you add a new node to the cluster, the DaemonSet automatically deploys its Pod there without any manual intervention.

Command Description
kubectl get ds List DaemonSets
kubectl describe ds <name> Show DaemonSet details
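A minimal DaemonSet sketch for a node-level log collector. The image, mount paths, and toleration are illustrative; the toleration lets the Pod also run on control plane nodes.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: collector
        image: fluentd:latest        # illustrative image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log             # node's log directory
```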

Job / CronJob CKA

Jobs run Pods that complete a task and then exit. Use a Job for one-time tasks like database migrations or batch data processing. Use a CronJob to run Jobs on a schedule, such as nightly backups or periodic cleanup tasks. Unlike Deployments which keep Pods running indefinitely, Jobs and CronJobs are designed for tasks that have a definite end point.

Command Description
kubectl create job <name> --image=<img> Create a one-time job
kubectl get jobs List jobs and completion status
kubectl get cronjobs List scheduled CronJobs
CronJob Example
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup
spec:
  schedule: "0 2 * * *"  # Daily at 2am
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/bin/sh", "-c", "backup.sh"]
          restartPolicy: OnFailure
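For one-time tasks, a standalone Job is essentially the jobTemplate above without the schedule. A minimal sketch (image and script are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3        # Retry failed pods up to 3 times
  completions: 1         # Run to successful completion once
  template:
    spec:
      containers:
      - name: migrate
        image: migrate-tool:latest   # illustrative image
        command: ["/bin/sh", "-c", "migrate.sh"]
      restartPolicy: Never
```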

Multi-container Pods CKAD

Pods can contain multiple containers that share network and storage. Common patterns:

Pattern Use Case
Init Container Runs before main container (wait for DB, download config)
Sidecar Runs alongside main container (log shipper, proxy)
Ambassador Proxy outbound connections (localhost to external)
Adapter Transform output for consumption (log formatting)
Init Container Example
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']
  containers:
  - name: app
    image: myapp:latest
Init Container Characteristics
Characteristic Behavior
Execution Order Run sequentially, not in parallel
Success Requirement Must complete successfully (exit 0) before next init or app containers start
Resource Limits Have separate resource limits from app containers
Features Support all container features (volumes, env vars, security contexts)
Restart Policy Retried on failure according to the Pod's restartPolicy (with Never, the whole Pod fails)

Common Use Cases:

  • Wait for services to be ready (database, message queue)
  • Initialize configuration files from templates
  • Run database migrations before app starts
  • Generate certificates or keys
  • Clone Git repositories or download assets
# Check init container status
kubectl get pod <name>
kubectl describe pod <name> | grep -A 20 "Init Containers"

# View init container logs
kubectl logs <pod> -c <init-container-name>
Complete Multi-Container Patterns

Sidecar Pattern (Log Shipping):

apiVersion: v1
kind: Pod
metadata:
  name: web-with-logger
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  - name: log-aggregator
    image: fluentd
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  volumes:
  - name: logs
    emptyDir: {}

Adapter Pattern (Data Transformation):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-adapter
spec:
  containers:
  - name: main-app
    image: myapp
    volumeMounts:
    - name: data
      mountPath: /data
  - name: adapter
    image: log-adapter
    volumeMounts:
    - name: data
      mountPath: /input
    command: ['sh', '-c', 'transform /input/logs.json']

Ambassador Pattern (Proxy):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  containers:
  - name: main-app
    image: myapp
    env:
    - name: DB_HOST
      value: localhost:5432
  - name: db-proxy
    image: pg-proxy
    ports:
    - containerPort: 5432

Static Pods CKA

Pods managed directly by kubelet on a node, not by the API server. Control plane components (kube-apiserver, etcd) run as static pods. Manifests live in /etc/kubernetes/manifests/.

Command Description
ls /etc/kubernetes/manifests/ List static pod manifests
kubectl get pods -n kube-system Static pods appear with node name suffix
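To run one, place a plain Pod manifest in the kubelet's manifest directory; the kubelet starts it directly, bypassing the scheduler. A minimal sketch (the path assumes a kubeadm-style cluster):

```yaml
# Save as /etc/kubernetes/manifests/static-web.yaml on the node
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
  - name: web
    image: nginx:alpine
```

Deleting the mirror pod via kubectl only removes it temporarily; the kubelet recreates it. To stop a static pod, remove the manifest file from the node.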

Workload Controller Comparison CKA

Choose the right controller based on your application's requirements. Each controller optimises for different use cases.

Controller Best For Pod Identity Storage
Deployment Stateless apps (web servers, APIs) Interchangeable None, or shared (ReadWriteMany)
StatefulSet Stateful apps (databases, message queues) Stable (pod-0, pod-1) Dedicated per Pod
DaemonSet Node agents (logging, monitoring) One per node hostPath or emptyDir
Job One-time tasks (migrations, batch) Completes and exits Ephemeral
CronJob Scheduled tasks (backups, cleanup) Creates Jobs on schedule Ephemeral
Decision Guide

Ask these questions:

  1. Does the app need stable network identity? → StatefulSet (databases need pod-0, pod-1 naming)
  2. Should it run on every node? → DaemonSet (log collectors, monitoring agents)
  3. Does it complete and exit? → Job (database migrations, batch processing)
  4. Should it run on a schedule? → CronJob (nightly backups, cleanup tasks)
  5. None of the above? → Deployment (most common, stateless apps)

2. Configuration

Applications need configuration. This section covers how to inject settings, store sensitive data securely, and control resource usage.

ConfigMap CKA

Store non-sensitive configuration data like application settings, feature flags, or configuration files. You can inject ConfigMap data into Pods as environment variables or mount it as files. Important note: changing a ConfigMap does not automatically restart running Pods. They will see the new values only after they restart.

Command Description
kubectl create cm <name> --from-literal=key=value Create from literal values
kubectl create cm <name> --from-file=config.txt Create from file
kubectl get cm <name> -o yaml View ConfigMap contents
Usage in Pod
# As environment variable
env:
- name: LOG_LEVEL
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: log_level

# As mounted file
volumes:
- name: config
  configMap:
    name: app-config
volumeMounts:
- name: config
  mountPath: /etc/config
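For reference, the app-config ConfigMap used above could be declared like this (the keys shown are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  log_level: info          # consumed via configMapKeyRef above
  app.properties: |        # appears as a file when mounted
    feature.flag=true
    cache.ttl=300
```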

Secret CKA

Store sensitive data like passwords, API keys, and TLS certificates. Secrets are base64-encoded but not encrypted by default. Anyone with access to the cluster can read them. For production, use tools like Sealed Secrets or External Secrets Operator to encrypt secrets before storing them in Git. Like ConfigMaps, Secrets can be mounted as files or injected as environment variables.

Command Description
kubectl create secret generic <name> --from-literal=pass=secret Create generic secret
kubectl create secret tls <name> --cert=tls.crt --key=tls.key Create TLS secret
kubectl get secret <name> -o jsonpath='{.data.pass}' | base64 -d Decode secret value
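Consumption mirrors the ConfigMap patterns, with secretKeyRef and secret instead. A sketch, assuming a Secret named db-credentials with a key pass:

```yaml
# As environment variable
env:
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: db-credentials    # illustrative Secret name
      key: pass

# As mounted files (one file per key under /etc/creds)
volumes:
- name: creds
  secret:
    secretName: db-credentials
volumeMounts:
- name: creds
  mountPath: /etc/creds
  readOnly: true
```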

Namespace CKA

A way to divide a cluster into multiple virtual clusters. Use namespaces to separate different environments like development, staging, and production, or to isolate team resources. Each namespace has its own DNS scope. Services in different namespaces can still communicate using the full DNS name like <service>.<namespace>.svc.cluster.local.

Command Description
kubectl create ns <name> Create namespace
kubectl get pods -n <namespace> List pods in specific namespace
kubectl config set-context --current --namespace=<ns> Set default namespace
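Namespaces can also be declared in YAML, which suits GitOps workflows (the label shown is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
  labels:
    team: platform    # illustrative label for selectors and policies
```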

Resource Requests & Limits CKA

Control how much CPU and memory your containers can use. Requests tell the scheduler the minimum resources a container needs. The scheduler uses this to decide which node can run the Pod. Limits set the maximum resources a container can consume. If a container exceeds its memory limit, Kubernetes kills it with an OOMKilled error. If it exceeds its CPU limit, the CPU gets throttled but the container keeps running.

Example YAML
containers:
- name: app
  image: myapp
  resources:
    requests:
      memory: "128Mi"
      cpu: "250m"      # 0.25 CPU cores
    limits:
      memory: "256Mi"
      cpu: "500m"      # 0.5 CPU cores

ResourceQuota CKA

Limits aggregate resource consumption per namespace. Prevents one team from consuming all cluster resources. Can limit CPU, memory, storage, and object counts (pods, services, secrets).

Command Description
kubectl get resourcequota -n <ns> Show quota usage
kubectl describe resourcequota <name> Show limits and current usage
Example YAML
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"

LimitRange CKA

Sets default resource requests/limits for containers in a namespace. Prevents pods from being created without resource specs. Can also enforce min/max constraints.

Example YAML
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:          # Default limits
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:   # Default requests
      cpu: "100m"
      memory: "128Mi"
    type: Container

SecurityContext CKAD

Security settings at Pod or container level: run as specific user, drop capabilities, read-only filesystem, prevent privilege escalation. Critical for hardening production workloads.

Example YAML
spec:
  securityContext:           # Pod level
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    securityContext:         # Container level
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]

3. Observability & Debugging

When things go wrong, you need to understand what is happening. This section covers health checks, logs, and troubleshooting techniques.

Probes (Liveness, Readiness, Startup) CKA

Kubernetes uses health checks to monitor containers. Each probe serves a different purpose. Use liveness probes to detect when an application has crashed or deadlocked. Kubernetes restarts the container if this probe fails. Use readiness probes to know when an application is ready to accept traffic. Kubernetes removes the Pod from Service endpoints until it passes. Use startup probes for slow-starting applications. This probe disables the other probes until the container is ready, preventing premature restarts.

Probe Purpose
livenessProbe Is the container alive? Failure = container restart
readinessProbe Is the container ready for traffic? Failure = removed from Service
startupProbe For slow-starting apps. Disables other probes until success
Example YAML
containers:
- name: app
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 5
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 3
Probe Configuration Options
Field Default Description
initialDelaySeconds 0 Wait time before first probe
periodSeconds 10 How often to probe
timeoutSeconds 1 Probe timeout
successThreshold 1 Consecutive successes to pass
failureThreshold 3 Consecutive failures to fail
Probe Types & Examples

HTTP GET (most common):

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
    - name: Custom-Header
      value: Awesome

TCP Socket:

livenessProbe:
  tcpSocket:
    port: 3306

Exec Command:

livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy

Startup Probe (for slow-starting apps):

startupProbe:
  httpGet:
    path: /startup
    port: 8080
  failureThreshold: 30
  periodSeconds: 10  # 30 * 10 = 5 minutes max

Logging CKA

View container output to debug issues. Kubernetes captures stdout and stderr from containers. You can view logs from running Pods or from previous instances after a crash. In production, set up a log aggregation system like Loki or Elasticsearch because logs disappear when Pods are deleted.

Command Description
kubectl logs <pod> View pod logs
kubectl logs <pod> -c <container> Logs from specific container
kubectl logs <pod> -f --tail=100 Follow last 100 lines

Events & Troubleshooting CKA

Events are the first place to look when something goes wrong. They show scheduling decisions, image pulls, crashes, and more. Events are kept for 1 hour by default.

Command Description
kubectl get events --sort-by='.lastTimestamp' Recent events sorted by time
kubectl describe pod <name> Pod details + events (most useful!)
kubectl get pods -o wide Show node placement and IPs

Troubleshooting Patterns CKA

Symptom Check
Pod stuck Pending Resources, node selectors, taints, PVC binding
Pod CrashLoopBackOff Logs, command/args, probes, permissions
Pod ImagePullBackOff Image name, tag, registry auth (imagePullSecrets)
Service not routing Selector labels match, endpoints exist, port numbers
Detailed Troubleshooting Workflows

Pod Pending - Diagnostic Steps:

# 1. Check events for scheduling issues
kubectl describe pod <name> | grep -A 10 Events

# 2. Check resource constraints
kubectl describe node <node> | grep -A 5 "Allocated resources"

# 3. Check PVC binding status
kubectl get pvc
kubectl describe pvc <name>

# 4. Check node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# 5. Verify node selectors match
kubectl get pod <name> -o jsonpath='{.spec.nodeSelector}'
kubectl get nodes --show-labels | grep <label>

CrashLoopBackOff - Diagnostic Steps:

# 1. Check logs from previous container
kubectl logs <pod> --previous

# 2. Check events for OOMKilled
kubectl describe pod <name> | grep -i "killed\|oom"

# 3. Verify command and args
kubectl get pod <name> -o jsonpath='{.spec.containers[0].command}{.spec.containers[0].args}'

# 4. Check probe configuration
kubectl describe pod <name> | grep -A 5 "Liveness\|Readiness"

# 5. Check resource limits
kubectl describe pod <name> | grep -A 3 "Limits\|Requests"

ImagePullBackOff - Diagnostic Steps:

# 1. Verify image name and tag
kubectl get pod <name> -o jsonpath='{.spec.containers[0].image}'

# 2. Check if image exists in registry
docker pull <image>  # or crictl pull on node

# 3. Check imagePullSecrets
kubectl get pod <name> -o jsonpath='{.spec.imagePullSecrets}'

# 4. Create registry secret if needed
kubectl create secret docker-registry regcred \
  --docker-server=<registry> \
  --docker-username=<user> \
  --docker-password=<pass>

Service Not Routing - Diagnostic Steps:

# 1. Check service selector matches pod labels
kubectl get svc <service> -o jsonpath='{.spec.selector}'
kubectl get pods -l <selector-key>=<selector-value> --show-labels

# 2. Verify endpoints exist
kubectl get endpoints <service>

# 3. Check port configuration
kubectl get svc <service> -o jsonpath='{.spec.ports}'
kubectl get pod <pod> -o jsonpath='{.spec.containers[0].ports}'

# 4. Test connectivity from within cluster
kubectl run debug --rm -it --image=busybox -- /bin/sh
wget -O- http://<service>.<namespace>.svc.cluster.local:<port>

4. Networking

Applications need to talk to each other. This section covers how Pods discover and communicate with other services, both inside and outside the cluster.

Service CKA

Pods come and go. Their IP addresses change constantly. Services solve this by providing a stable IP address and DNS name that always routes to healthy Pods. You define a Service with a label selector. Kubernetes automatically maintains a list of matching Pods and load balances traffic between them. If a Pod dies and a new one starts, the Service automatically updates to send traffic to the new Pod.

Type Description
ClusterIP Internal-only (default). Accessible within cluster only.
NodePort Exposes on each node's IP at a static port (30000-32767).
LoadBalancer Provisions external load balancer (cloud or MetalLB).
ExternalName Maps to external DNS name (CNAME record).
Command Description
kubectl expose deploy <name> --port=80 Create ClusterIP service
kubectl get svc List services
kubectl get endpoints <svc> Show which Pods back the service
Example YAML
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: ClusterIP
  selector:
    app: nginx        # Must match Pod labels
  ports:
  - port: 80          # Service port
    targetPort: 8080  # Container port

Ingress CKA

Layer 7 (HTTP/HTTPS) routing rules. Instead of exposing multiple NodePorts, you expose one Ingress controller (Traefik, Nginx) and route by hostname or path: grafana.example.com → Grafana Service.

Command Description
kubectl get ingress List ingress resources
kubectl describe ingress <name> Show routing rules and backend services
Example YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls

NetworkPolicy CKA

Firewall rules for Pods. By default, all Pods can talk to all Pods. NetworkPolicies restrict this: "Only backend Pods can reach database on port 5432." Requires a CNI plugin that supports policies (Calico, Cilium).

Command Description
kubectl get netpol List network policies
kubectl describe netpol <name> Show policy rules
Example: Deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}    # Empty selector = applies to all pods in the namespace
  policyTypes:
  - Ingress          # With no ingress rules listed, all incoming traffic is denied
Example: Allow from specific pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - port: 5432

DNS (CoreDNS) CKA

Every Service gets a DNS name automatically. CoreDNS runs as a Deployment in kube-system and resolves cluster DNS.

DNS Pattern Example
Same namespace <service>
Cross namespace <service>.<namespace>
Fully qualified <service>.<namespace>.svc.cluster.local
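These names are plain DNS, so they can be used anywhere a hostname fits, such as application configuration (the service and namespace names here are illustrative):

```yaml
env:
- name: DB_HOST
  value: mysql.prod.svc.cluster.local   # fully qualified, resolves from any namespace
- name: CACHE_HOST
  value: redis                          # short name, same namespace only
```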

5. Storage

Pods are temporary. When they restart, they lose all data. This section covers how to store data that survives Pod restarts.

PersistentVolume (PV) CKA

A PersistentVolume represents actual storage in the cluster (NFS share, cloud disk, or local SSD). Each PV has properties: capacity (size), accessModes (how it can be mounted), and storageClassName (type of storage). When created, a PV waits for a PVC that matches these properties to claim it. With dynamic provisioning (StorageClasses), you don't create PVs manually. They're created automatically when a PVC requests them.

Command Description
kubectl get pv List PersistentVolumes
kubectl describe pv <name> Show PV details and claim binding
Example YAML
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/pv

PersistentVolumeClaim (PVC) CKA

A request for storage. When you create a PVC, Kubernetes looks for a PV that matches three criteria: (1) accessModes must match, (2) requested storage must fit within PV capacity, (3) storageClassName must match. If no PV exists and you specified a StorageClass, a new PV is created automatically. The PVC then binds to the PV, making it unavailable to other claims.

Command Description
kubectl get pvc List PVCs and their status
kubectl describe pvc <name> Show bound PV and events
Example YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: longhorn
How PVC Binds to PV

Static Provisioning (Admin creates PV):

  • Admin creates PV with: capacity=10Gi, accessModes=ReadWriteOnce, storageClassName=fast-ssd
  • User creates PVC requesting: storage=5Gi, accessModes=ReadWriteOnce, storageClassName=fast-ssd
  • Kubernetes matches all three criteria and binds PVC to PV

Dynamic Provisioning (StorageClass):

  • User creates PVC with: storageClassName=longhorn
  • Longhorn provisioner creates a new PV automatically
  • PVC binds to the newly created PV

Key Point: The Pod only references the PVC by name, never the PV directly.

StorageClass CKA

Defines how to create storage automatically. A StorageClass specifies which provisioner to use (Longhorn, AWS EBS, GCP PD, etc.) and parameters like replication or disk type. When a PVC requests storageClassName: longhorn, the Longhorn provisioner creates a new PV automatically. No manual disk allocation needed.

Command Description
kubectl get sc List StorageClasses
kubectl describe sc <name> Show provisioner and parameters
StorageClass Examples

Longhorn (Distributed Storage):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
reclaimPolicy: Delete
volumeBindingMode: Immediate

AWS EBS (Cloud Provider):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Local SSD (Performance):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
# Requires pre-created PVs on nodes

Key Parameters:
reclaimPolicy: Delete (default) or Retain
volumeBindingMode: Immediate or WaitForFirstConsumer
allowVolumeExpansion: Enable PVC resizing

Volume Types CKA

Beyond PVCs, Kubernetes supports ephemeral and special volume types:

Type Use Case
emptyDir Scratch space shared between containers. Deleted when Pod dies.
hostPath Mounts directory from the node. Use sparingly (ties Pod to node).
configMap / secret Mount ConfigMap or Secret as files.
projected Combine multiple sources into one volume. Useful for service account tokens with additional metadata.

Using Volumes in Pods CKA

Volumes are defined at two levels in a Pod spec. Understanding how they connect is essential for working with storage.

Level Field Purpose
Pod spec volumes Defines the volume source (PVC, ConfigMap, Secret, etc.)
Container spec volumeMounts Mounts the volume into the container at a specific path

The name field is the link. The volumeMount references a volume by name, and that volume must be defined in the Pod's volumes array.

Complete Example: Multiple Volume Types
apiVersion: v1
kind: Pod
metadata:
  name: multi-volume-pod
spec:
  # Pod-level: Define the volumes
  volumes:
  - name: data                    # Reference name
    persistentVolumeClaim:
      claimName: my-pvc           # Actual PVC
  - name: config                  # Reference name
    configMap:
      name: app-config            # Actual ConfigMap
  - name: secrets                 # Reference name
    secret:
      secretName: app-secrets     # Actual Secret
  - name: cache                   # Reference name
    emptyDir: {}                # Ephemeral storage

  containers:
  - name: app
    image: myapp:latest
    # Container-level: Mount the volumes
    volumeMounts:
    - name: data                  # Matches volumes[0].name
      mountPath: /data            # Where it appears in container
    - name: config                # Matches volumes[1].name
      mountPath: /etc/config      # Config files here
      readOnly: true              # ConfigMaps are read-only
    - name: secrets               # Matches volumes[2].name
      mountPath: /etc/secrets     # Secrets here
      readOnly: true              # Secrets are read-only
    - name: cache                 # Matches volumes[3].name
      mountPath: /tmp/cache       # Temp storage here

Key points: The volume names (data, config, etc.) must match between volumes and volumeMounts. One volume can be mounted in multiple containers, and each container can choose its own mountPath.

Access Modes CKA

Defines how many nodes can mount a volume simultaneously and in what mode. Choose based on your application architecture.

Mode Meaning Use Case
ReadWriteOnce (RWO) Single node can mount read-write Most common. Databases, single-instance apps.
ReadOnlyMany (ROX) Multiple nodes can mount read-only Shared config files, static assets.
ReadWriteMany (RWX) Multiple nodes can mount read-write Shared file storage (NFS, CephFS).

6. Scheduling

Not every node is suitable for every workload. Sometimes you need Pods on machines with SSDs, or you want to spread replicas across different availability zones. This section covers how to influence where Pods run.

nodeSelector CKA

The simplest way to control which nodes run your Pods. Add labels to nodes, then specify those labels in your Pod spec. The scheduler only places the Pod on nodes that match all the labels you specify. For example, you might label certain nodes as having SSD storage, then use nodeSelector to ensure database Pods only run on those nodes.

Example YAML
spec:
  nodeSelector:
    disktype: ssd
    zone: us-west-1a
Command Description
kubectl label node <name> disktype=ssd Add label to node
kubectl get nodes --show-labels Show all node labels

Affinity & Anti-Affinity CKA

A more powerful alternative to nodeSelector, with three variants. Node affinity works like nodeSelector but supports match expressions and soft preferences. Pod affinity places Pods near other Pods, which is useful for co-locating related services. Pod anti-affinity spreads Pods across different nodes, ensuring high availability by preventing all replicas from landing on the same machine.

Type Behavior
requiredDuringSchedulingIgnoredDuringExecution Hard requirement - must match or Pod won't schedule
preferredDuringSchedulingIgnoredDuringExecution Soft preference - scheduler tries but not guaranteed
Node Affinity Example
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west-1a
            - us-west-1b
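Soft Preference Example

For the soft variant, each preference carries a weight (1-100) that the scheduler sums per matching node. A minimal sketch (the label key and value are illustrative):

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                  # Higher weight = stronger preference
        preference:
          matchExpressions:
          - key: disktype           # Illustrative node label
            operator: In
            values:
            - ssd
```

If no node matches, the Pod still schedules somewhere; the preference only biases the choice.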
Pod Anti-Affinity Example
# Spread replicas across nodes
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname

Taints & Tolerations CKA

Taints and tolerations work together to control Pod placement. Taints are applied to nodes and prevent Pods from scheduling there. Tolerations are applied to Pods and allow them to ignore specific taints. Control plane nodes use taints to prevent regular workloads from running on them. You can also use taints to reserve certain nodes for specific workloads, like GPU machines for machine learning tasks.

Effect Behavior
NoSchedule New Pods won't schedule without toleration
PreferNoSchedule Scheduler avoids but not guaranteed
NoExecute Evicts existing Pods without toleration
Command Description
kubectl taint node <node> key=value:NoSchedule Add taint to node
kubectl taint node <node> key:NoSchedule- Remove taint (note the trailing minus)
Toleration Example
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"

7. Security

Security in Kubernetes involves controlling who can access resources and how workloads run. This section covers authentication, authorisation, and workload security. These concepts appear in almost every real deployment, making them essential to understand early.

RBAC (Role-Based Access Control) CKA

RBAC controls what users and applications can do in the cluster. The system works with four resources that define and assign permissions. First, a Role or ClusterRole defines what actions are allowed on which resources. Then, a RoleBinding or ClusterRoleBinding connects those permissions to users, groups, or ServiceAccounts. Think of it like giving someone a key card. The Role defines which doors the card opens, and the Binding gives the card to a specific person.

Resource Scope Purpose
Role Namespace Defines permissions within a namespace
ClusterRole Cluster-wide Defines cluster-wide permissions
RoleBinding Namespace Binds Role to users/groups/ServiceAccounts
ClusterRoleBinding Cluster-wide Binds ClusterRole cluster-wide
Command Description
kubectl auth can-i create pods Check if you can create pods
kubectl auth can-i list secrets --as=system:serviceaccount:default:mysa Check permissions as ServiceAccount
Role + RoleBinding Example
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: ServiceAccount
  name: dev-sa
  namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
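ClusterRole + ClusterRoleBinding Example

The cluster-wide equivalent follows the same shape. A sketch granting read access to nodes (the resource and ServiceAccount names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader             # No namespace: cluster-scoped
rules:
- apiGroups: [""]
  resources: ["nodes"]          # Nodes are cluster-scoped, so a Role can't cover them
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes
subjects:
- kind: ServiceAccount
  name: monitoring-sa           # Illustrative ServiceAccount
  namespace: monitoring
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
```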

ServiceAccount CKA

Gives Pods an identity to authenticate with the Kubernetes API. Every namespace automatically gets a default ServiceAccount, but you should create custom ones for applications that need specific permissions. Combine ServiceAccounts with RBAC Roles to control what each application can access. For example, a backup tool might need permission to read Secrets, while a monitoring agent might only need to read Pod metrics.

Command Description
kubectl create sa <name> Create ServiceAccount
kubectl get sa List ServiceAccounts
Using ServiceAccount in Pod
spec:
  serviceAccountName: my-custom-sa
  automountServiceAccountToken: false  # Disable if not needed

RBAC Permissions Reference CKA

Every RBAC rule has three components: apiGroups (which API), resources (what objects), and verbs (what actions). Understanding these helps you create precise permissions without over-privileging.

apiGroups Common Resources
"" (core) pods, services, nodes, configmaps, secrets, namespaces
apps deployments, statefulsets, daemonsets, replicasets
networking.k8s.io ingresses, networkpolicies
rbac.authorization.k8s.io roles, clusterroles, rolebindings
metrics.k8s.io nodes, pods (CPU/memory metrics from metrics-server)
batch jobs, cronjobs
storage.k8s.io storageclasses, volumeattachments
Verb Action Use Case
get Read a single resource by name View specific pod details
list List all resources of a type Show all pods in namespace
watch Stream changes in real-time Controllers watching for changes
create Create new resources Deploy new applications
update Modify existing resources Change pod labels or annotations
patch Partially modify resources Update specific fields only
delete Delete a single resource Remove pods or services
Common RBAC Patterns

Read-only access to pods:

- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

Full admin on deployments:

- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["*"]

Metrics access (for monitoring tools):

- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list"]

ConfigMap reader only:

- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]
Testing Permissions

Always test permissions before deploying. Use kubectl auth can-i to verify:

# Check your own permissions
kubectl auth can-i get pods
kubectl auth can-i create deployments

# Check permissions for a ServiceAccount
kubectl auth can-i get pods \
  --as=system:serviceaccount:default:my-sa

# Check permissions in a specific namespace
kubectl auth can-i list secrets -n production \
  --as=system:serviceaccount:monitoring:prometheus

Tip: If you get "no" or a 403 error, check that the ServiceAccount has the right Role/ClusterRole bound to it.

8. Cluster Administration

These topics focus on managing the cluster infrastructure itself.

Control Plane Components CKA

The control plane is the brain of the cluster. The first four components below run on control plane nodes and manage the entire cluster; kubelet and kube-proxy are node components that run on every node. Understanding what each component does helps with troubleshooting cluster issues.

Component Purpose
kube-apiserver The front door. All kubectl commands and internal communication go through here.
etcd The database. Stores all cluster configuration and state.
kube-scheduler Decides which node should run each new Pod based on resources and constraints.
kube-controller-manager Runs background loops that ensure the actual state matches the desired state.
kubelet Runs on every node. Creates and manages containers, reports node health.
kube-proxy Runs on every node. Manages network rules for Services.

kubeadm CKA

The standard tool for creating and managing Kubernetes clusters. While you can use k3s for a homelab, the CKA exam tests kubeadm extensively. It handles initialising the control plane, joining worker nodes, and upgrading cluster versions.

Command Description
kubeadm init Initialize control plane node
kubeadm join Join worker node to cluster
kubeadm upgrade plan Check available upgrades

etcd Backup & Restore CKA

etcd stores everything about your cluster. All deployments, services, configurations, and secrets live there. Without backups, a corrupted etcd means rebuilding your entire cluster from scratch. Regular snapshots are essential for production environments.

Command Description
ETCDCTL_API=3 etcdctl snapshot save backup.db Create snapshot backup
ETCDCTL_API=3 etcdctl snapshot restore backup.db --data-dir <dir> Restore snapshot into a new data directory
Full Backup Command
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
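Full Restore Command

Restore writes the snapshot into a fresh data directory, after which etcd must be pointed at it. A sketch assuming a kubeadm layout (the paths are illustrative):

```shell
# Restore into a new data directory (never reuse the live one)
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restored

# Then point etcd at the new directory, e.g. by editing the hostPath
# in the static Pod manifest /etc/kubernetes/manifests/etcd.yaml;
# kubelet restarts etcd automatically when the manifest changes
```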

Node Maintenance CKA

Take nodes offline safely without disrupting running applications. First cordon the node to prevent new Pods from scheduling there. Then drain the node to gracefully evict all running Pods, giving them time to shut down cleanly. The Pods will be rescheduled on other nodes. Finally, uncordon the node when maintenance is complete to allow new Pods to run there again.

Command Description
kubectl cordon <node> Mark node unschedulable (no new Pods)
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data Evict Pods and cordon
kubectl uncordon <node> Mark node schedulable again

9. Application Development

Advanced topics for building and deploying applications on Kubernetes.

Custom Resource Definitions (CRDs) CKAD

Extend Kubernetes with your own resource types. This is how tools like ArgoCD add Application resources and Prometheus adds ServiceMonitor resources to the cluster. A CRD defines the schema, and a Controller watches for those resources and takes action. Together they form an Operator, which can automate complex application management.

Command Description
kubectl get crd List all Custom Resource Definitions
kubectl api-resources List all resources including CRDs
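Minimal CRD Example

To show the schema half, here is a minimal CRD sketch; the group, kind, and spec field are entirely hypothetical:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com       # Must be <plural>.<group>
spec:
  group: example.com              # Hypothetical API group
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              size:               # Hypothetical field
                type: integer
```

Once applied, kubectl get widgets works like any built-in resource; a controller gives the objects behaviour.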
Example: ArgoCD Application CRD

ArgoCD is a GitOps tool that uses CRDs to manage deployments. This example shows an Application resource that tells ArgoCD to sync a folder from Git to your cluster:

apiVersion: argoproj.io/v1alpha1  # Custom API group from CRD
kind: Application                  # Custom resource type
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myrepo.git
    targetRevision: HEAD            # Watch this Git branch
    path: apps/my-app               # Folder containing K8s manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: production           # Deploy to this namespace
  syncPolicy:
    automated:
      prune: true                   # Delete resources removed from Git
      selfHeal: true                # Fix drift from desired state

Key points: This is not a built-in Kubernetes resource. The argoproj.io/v1alpha1 API group comes from the ArgoCD CRD. The ArgoCD controller watches for these resources and automatically syncs your cluster to match the Git repository.

Helm CKAD

The package manager for Kubernetes. Complex applications like databases or monitoring stacks might need dozens of YAML files. Helm packages these into charts with configurable values. You install a chart with a single command and customise it through values files. Helm tracks versions and supports rollbacks if an upgrade goes wrong.

Command Description
helm install <name> <chart> Install a chart
helm upgrade <name> <chart> Upgrade release
helm list List installed releases
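Values File Example

Values files are how you customise a chart. A sketch assuming a chart that exposes replicaCount and image.tag (the keys depend entirely on the chart; run helm show values <chart> to see what it actually accepts):

```yaml
# values-prod.yaml (keys are chart-specific; these are illustrative)
replicaCount: 3
image:
  tag: "1.2.3"
resources:
  requests:
    cpu: 100m
    memory: 128Mi
```

Apply with helm install my-release <chart> -f values-prod.yaml. Later -f files override earlier ones, and all of them override the chart's built-in defaults.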

Deployment Strategies CKAD

Strategy Description
RollingUpdate Default. Gradually replaces old Pods with new ones.
Recreate Kill all old Pods, then create new ones (downtime).
Blue/Green Run both versions, switch traffic at once (via Service selector).
Canary Route small % of traffic to new version, gradually increase.
RollingUpdate Config
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max pods over desired during update
      maxUnavailable: 0  # Max pods unavailable during update
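Blue/Green via Service Selector

Blue/Green is not a built-in strategy field; it is typically done with two Deployments (labelled, say, version: blue and version: green) and a Service whose selector you flip. A sketch with illustrative labels:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: green     # Flip to "blue" to switch traffic back
  ports:
  - port: 80
    targetPort: 8080
```

Patching the selector switches all traffic at once, e.g. kubectl patch svc web -p '{"spec":{"selector":{"app":"web","version":"blue"}}}'.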

Horizontal Pod Autoscaler (HPA) CKA

Automatically scales the number of Pods in a Deployment, StatefulSet, or ReplicaSet based on observed metrics like CPU or memory utilisation. HPA is essential for production workloads to handle traffic spikes without manual intervention. Requires metrics-server to be installed in the cluster.

Command Description
kubectl get hpa List HPAs and current metrics
kubectl describe hpa <name> Show scaling events and conditions
kubectl top pod View CPU/memory usage (needs metrics-server)
HPA Example
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Quick Reference

Essential commands and shortcuts for daily Kubernetes operations.

Resource Shortcuts

Shortcut Full Name
po pods
svc services
cm configmaps
secret secrets
ing ingresses
pvc persistentvolumeclaims
pv persistentvolumes
deploy deployments
sts statefulsets
ds daemonsets
rs replicasets
ns namespaces
netpol networkpolicies
sa serviceaccounts

Output Formats

Flag Description
-o yaml Output as YAML
-o json Output as JSON
-o wide Extra columns (node, IP)
-o name Just resource names
--show-labels Include all labels

Common Flags

Flag Description
-n <namespace> Target specific namespace
--all-namespaces or -A All namespaces
-l <selector> Filter by label selector
--selector=<selector> Same as -l
--dry-run=client Preview changes without applying
--field-selector=<field> Filter by field (status.phase=Running)

Essential Commands

Command Purpose
kubectl create -f manifest.yaml Create resources from file
kubectl apply -f manifest.yaml Create or update resources
kubectl delete -f manifest.yaml Delete resources from file
kubectl get all -n <ns> List all resources in namespace
kubectl describe <resource> <name> Detailed info and events
kubectl logs <pod> --previous Logs from previous container instance
kubectl exec -it <pod> -- /bin/sh Shell into running container
kubectl cp <pod>:/path ./local Copy files from pod to local
kubectl port-forward svc/<name> 8080:80 Forward local port to service
kubectl top pod Show CPU/memory usage (needs metrics-server)
kubectl explain <resource> Show resource documentation
kubectl api-resources List all available resources

My Architecture

Production-grade k3s homelab running on a repurposed laptop (32GB RAM, 512GB SSD) with Ubuntu Server 24.04 LTS.

Stack Overview

Layer Technology
Kubernetes k3s (lightweight distribution)
Remote Access Tailscale (encrypted VPN)
GitOps ArgoCD (Bootstrap Pattern)
Monitoring kube-prometheus-stack (Prometheus, Grafana, Alertmanager)
Ingress Traefik + Tailscale Operator
Secret Management Sealed Secrets
Storage Longhorn (distributed block storage)

Resource Usage

Component K8s Resource Purpose
Grafana Deployment Stateless web dashboard
Prometheus StatefulSet Metrics storage with persistent data
Alertmanager StatefulSet Alert routing with state persistence
ArgoCD Deployment GitOps controller
All Services ClusterIP Internal communication
Web UIs Ingress HTTPS routing via Tailscale
Metrics/Logs PVC Longhorn-backed persistent storage
Apps Namespace monitoring, argocd, tailscale
Grafana Dashboards ConfigMap Version-controlled configuration
Credentials SealedSecret Git-safe encrypted secrets

The GitOps Workflow

With the Bootstrap Pattern, deployments follow a simple flow: push YAML to Git, ArgoCD syncs automatically. No more kubectl apply commands.

Directory Contents
apps/ Application manifests (Deployments, Services)
infrastructure/ System manifests (monitoring, ingress, storage)
argocd/ Application CRDs (pointers to the above folders)
The Bootstrap Pattern Explained

The magic happens through a recursive pattern. You apply one file manually. Everything else happens automatically.

Step 1: Apply bootstrap.yaml once

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bootstrap
  namespace: argocd
spec:
  source:
    repoURL: https://github.com/your/repo
    path: argocd/              # Watch this folder
  destination:
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Step 2: ArgoCD reads argocd/ folder

Inside argocd/ are more Application CRDs. For example, argocd/monitoring.yaml:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring
  namespace: argocd
spec:
  source:
    repoURL: https://github.com/your/repo
    path: infrastructure/monitoring/  # Points to manifests
  destination:
    namespace: monitoring
  syncPolicy:
    automated: {}                 # Empty map (not null) enables auto-sync

The result: Bootstrap creates Applications that create more Applications. You push to Git, ArgoCD syncs everything. No more manual kubectl apply commands.